Date of Award
2024
Document Type
Open Access Dissertation
Degree Name
Doctor of Philosophy in Statistics (PhD)
Administrative Home Department
Department of Mathematical Sciences
Advisor 1
Benjamin Ong
Committee Member 1
Allan Struthers
Committee Member 2
Byung-Jun Kim
Committee Member 3
Laura Brown
Abstract
Analyzing high-dimensional data runs into the "curse of dimensionality", which makes the analysis computationally intensive. Dimension reduction techniques address this by simplifying high-dimensional data. These methods fall into two groups, linear and non-linear dimension reduction, with the latter accommodating complex data structures.
Our focus is on the development of an incremental non-linear dimension reduction method for streaming data, based on the Geometric Multi-Resolution Analysis (GMRA) framework. The primary goal is to assess the effectiveness of incremental GMRA compared to the batch GMRA approach and to overcome key challenges specific to streaming data scenarios.
Key challenges include: incrementally updating the existing cluster map so that it aligns with a cluster map generated from a bulk dataset; incrementally updating the PCA basis vectors instead of recomputing them entirely; determining whether the PCA basis vectors must be updated continuously or whether bulk updating suffices; and determining whether the wavelet coefficients must be updated and recomputed every time the GMRA structure undergoes an incremental update.
Numerical experiments show that the proposed incremental GMRA method is adaptable: it accurately represents non-linear manifolds even with small initial sample sizes, and its final approximation closely aligns with the batch GMRA results. A unique advantage of the incremental approach is its ability to maintain the multiscale structure with updated basis vectors, which results in efficient updates. Additionally, we observe a decay pattern in the wavelet coefficients, which aids in determining the required depth of approximation.
This research highlights the potential of the incremental GMRA approach for efficient dimension reduction in streaming data scenarios with evolving and complex manifold structures, and it addresses the key challenges encountered in this context.
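As a rough illustration of what updating PCA basis vectors from a stream, rather than recomputing them from all data, can look like, the sketch below keeps a running mean and scatter matrix with a Welford-style rank-one update and derives the leading basis vectors from them on demand. It is not the dissertation's algorithm: the class name `RunningPCA`, its methods, and the synthetic data are hypothetical and exist only for this example.

```python
import numpy as np


class RunningPCA:
    """Hypothetical helper: running mean and scatter matrix for streaming PCA."""

    def __init__(self, dim):
        self.n = 0                            # number of samples seen so far
        self.mean = np.zeros(dim)             # running sample mean
        self.scatter = np.zeros((dim, dim))   # sum of centered outer products

    def update(self, x):
        """Absorb one sample without revisiting earlier samples."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean                 # deviation from the old mean
        self.mean += delta / self.n
        # Welford-style rank-one update: old deviation times new deviation.
        self.scatter += np.outer(delta, x - self.mean)

    def basis(self, k):
        """Return the top-k principal directions and their variances (needs n >= 2)."""
        cov = self.scatter / (self.n - 1)
        eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
        return eigvecs[:, order], eigvals[order]


# Assumed synthetic stream: 200 correlated samples in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

pca = RunningPCA(dim=5)
for row in X:
    pca.update(row)

basis, variances = pca.basis(k=2)
```

Each call to `update` costs on the order of d² operations for d-dimensional data and never touches earlier samples; the eigendecomposition in `basis` is the expensive step and can be deferred. That deferral is one simple way to picture the trade-off between continuous and bulk updating of the basis vectors.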
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Hettige, Praveen T.W., "AN INCREMENTAL MULTISCALE NON–LINEAR MANIFOLD APPROXIMATION METHOD", Open Access Dissertation, Michigan Technological University, 2024.