A new dimensionality reduction technique based on the Wavelet Transform for cancer classification

Abstract

Problem: DNA methylation and hydroxymethylation have become important epigenetic markers for early detection of cancer. In recent years, there has been a significant increase in both the number of research works on this topic and the number and size of labeled databases with some type of cancer. Although the advent of methylation microarrays such as the HumanMethylation450 platform has greatly reduced the dimensionality of the problem from billions to 450K positions, this data size is still too large to be processed by machine learning algorithms for cancer prediction and classification.

Aim: In the particular case of methylation, an efficient dimensionality reduction technique should also preserve the spatial information of the original data in order to properly predict and classify cancer.

Method: This work proposes a new dimensionality reduction technique based on the Discrete Wavelet Transform (DWT), which preserves spatial information. We have evaluated the proposed technique with a dataset collected from the most important cancer databases according to their social impact, and we have compared our proposal to five well-known dimensionality reduction techniques: PCA, ReliefF, Isomap, LLE and UMAP.

Results: The performance evaluation results show that the proposed technique significantly reduces both the computational resources and the execution time required for dimensionality reduction. In addition, it significantly improves the accuracy achieved by a support vector machine classifier when the dataset produced by each technique is used as its input.

Conclusions: The proposed approach based on the DWT can be considered an efficient alternative for those cases where dimensionality reduction must preserve spatial information.
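
The abstract's point that a DWT-based reduction preserves spatial information can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the number of decomposition levels, the function name `haar_approximation`, and the sample values are all assumptions chosen for illustration. Each level keeps only the approximation coefficients, halving the number of features while leaving them in their original positional order.

```python
import math


def haar_approximation(signal, levels):
    """Return the Haar DWT approximation coefficients after `levels` steps.

    Hypothetical helper for illustration: each step replaces every pair of
    neighbouring values with their scaled sum, so the output is half as long
    and still ordered by position.
    """
    coeffs = list(signal)
    for _ in range(levels):
        if len(coeffs) < 2:
            break
        if len(coeffs) % 2:
            # Pad odd-length input by repeating the last value (an assumption).
            coeffs.append(coeffs[-1])
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs


# Made-up methylation levels ordered by genomic position:
# a low-methylation region followed by a high-methylation region.
beta_values = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]

reduced = haar_approximation(beta_values, levels=2)
print(len(beta_values), "->", len(reduced), "features")
print(reduced)
```

Here eight positions are reduced to two features, and the first remains lower than the second, so the low-then-high spatial pattern of the input survives the reduction. Techniques that reorder or mix features globally do not guarantee this property.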

Publication DOI: https://doi.org/10.1186/s40537-024-01039-9
Divisions: College of Engineering & Physical Sciences
College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
Aston University (General)
Funding Information: This work has been supported by MCIU/AEI/10.13039/501100011033/FEDER, UE under grant PID2023-146335NB-I00.
Additional Information: Copyright © The Author(s) 2024. This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Uncontrolled Keywords: dimensionality reduction, cancer classification, DNA methylation analysis, Wavelet Transform, machine learning classification
Publication ISSN: 2196-1115
Last Modified: 01 May 2025 08:43
Date Deposited: 24 Apr 2025 16:51
Related URLs: https://journal ... 537-024-01039-9 (Publisher URL)
http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Article
Published Date: 2025-01-21
Accepted Date: 2024-12-14
Authors: Fernández, Lisardo
Pérez, Mariano
Orduña, Juan M.
Alcaraz Calero, Jose M. (ORCID Profile 0000-0002-2654-7595)
