Open Conference Systems, STATISTICS AND DATA SCIENCE: NEW CHALLENGES, NEW GENERATIONS

A Multiscale Approach to Manifold Estimation
Alessandro Lanteri, Mauro Maggioni

Last modified: 2017-05-22

Abstract


In recent years, due to the huge amount of data that new technologies provide, modern science has become dependent on reliable methods for handling high-dimensional data while avoiding the curse of dimensionality. When high-dimensional data are sampled from an unknown distribution on a high-dimensional space, it is common to assume that the support of this distribution is well approximated by a low-dimensional set, for example a Riemannian manifold. We introduce a novel technique to estimate the underlying structure of the data, using an algorithm that approximates the manifold with a collection of hyperplanes. This is done in a multiscale fashion, using a subspace clustering algorithm recursively. The proposed approach is data-adaptive and, by construction, provides a tree structure for the data. We evaluate the performance of the proposed method on synthetic data, showing that our algorithm is fast and provides good estimates. An application to the MNIST data set shows how the proposed algorithm succeeds in approximating and encoding the data with a collection of low-dimensional planes, with an accuracy comparable to that of a high-dimensional encoder.
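
The abstract does not say which subspace clustering algorithm is applied recursively, nor how the recursion is stopped. As a rough illustration of the general idea only, the sketch below assumes a two-cluster K-planes (k-subspaces) iteration as a stand-in, fits each plane by local PCA, and stops splitting once the mean squared residual falls below a tolerance. All function names, parameters (`tol`, `min_points`, `max_depth`) and the noisy-circle test data are hypothetical and are not taken from the authors' method.

```python
import numpy as np

def fit_plane(X, d):
    """Best-fit d-dimensional affine plane via PCA: returns (center, basis)."""
    center = X.mean(axis=0)
    # Rows of Vt are the principal directions; keep the top d.
    _, _, Vt = np.linalg.svd(X - center, full_matrices=False)
    return center, Vt[:d]

def residuals(X, center, basis):
    """Squared distance from each point to the affine plane."""
    Y = X - center
    proj = Y @ basis.T @ basis
    return np.sum((Y - proj) ** 2, axis=1)

def two_planes(X, d, n_iter=20):
    """Assumed stand-in for subspace clustering: K-planes with K = 2."""
    # Initialise by the sign of the first principal coordinate.
    center, basis = fit_plane(X, 1)
    labels = ((X - center) @ basis[0] > 0).astype(int)
    for _ in range(n_iter):
        if min(np.sum(labels == k) for k in (0, 1)) <= d:
            break  # degenerate cluster: stop refining
        planes = [fit_plane(X[labels == k], d) for k in (0, 1)]
        dists = np.stack([residuals(X, c, B) for c, B in planes], axis=1)
        new_labels = dists.argmin(axis=1)  # reassign to the nearest plane
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def build_tree(X, d, tol, min_points=10, depth=0, max_depth=8):
    """Recursively split the data until each plane fits within tol."""
    center, basis = fit_plane(X, d)
    err = residuals(X, center, basis).mean()
    node = {"center": center, "basis": basis, "error": err,
            "n": len(X), "children": []}
    if err <= tol or len(X) < 2 * min_points or depth >= max_depth:
        return node
    labels = two_planes(X, d)
    parts = [X[labels == k] for k in (0, 1)]
    if min(len(p) for p in parts) < min_points:
        return node  # split too unbalanced: keep this node as a leaf
    node["children"] = [
        build_tree(p, d, tol, min_points, depth + 1, max_depth) for p in parts
    ]
    return node

def leaves(node):
    """Collect the leaf nodes, i.e. the finest-scale planes."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]

# Hypothetical synthetic example: a noisy circle (a 1-D manifold in R^2).
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 2000)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.01 * rng.normal(size=(2000, 2))
tree = build_tree(X, d=1, tol=1e-3)
print(len(leaves(tree)), "leaf planes; root error", tree["error"])
```

In this sketch the tree arises by construction, as in the abstract: each node stores one plane, and a point can be encoded by the index of its leaf together with its `d` coordinates in that leaf's basis. The tolerance-based stopping rule is one possible way to make the partition data-adaptive and is an assumption here.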