Random Projections of Smooth Manifolds
Baraniuk, Richard G.
Manifolds; dimensionality reduction; random projections; Compressed Sensing; sparsity; manifold learning; Johnson-Lindenstrauss lemma
Many types of data and information can be described by concise models that suggest each data vector (or signal) actually has â few degrees of freedomâ relative to its size N. This is the motivation for a variety of dimensionality reduction techniques for data processing that attempt to reduce or eliminate the impact of the ambient dimension N on computational or storage requirements. As an example, many signals can be expressed as a sparse linear combination of elements from some dictionary. The sparsity of the representation directly reflects the conciseness of the model and permits efficient techniques such as Compressed Sensing (CS), an emerging theory for sparse signal recovery requiring only a small number of nonadaptive, random linear measurements. In other cases, the conciseness of the signal model may dictate that the signal class forms a low-dimensional manifold as a subset of the high-dimensional ambient space R^N. This type of geometric structure may not be neatly reflected in a sparse representation. Instead, dimensionality reduction techniques for manifold-modeled data typically involve â learningâ the manifold structure from a collection of data points, often by constructing nonlinear mappings from R^N to R^M for some M < N that are adapted to the training data and intended to preserve some characteristic property of the manifold. In this paper, we propose a new approach for nonadaptive dimensionality reduction of manifold-modeled data, demonstrating that a small number of random linear projections can preserve key information about a manifold-modeled signal. We center our analysis on the effect of a random linear projection operator Phi: R^N -> R^M on a smooth K-dimensional submanifold F of R^N. As our main theoretical contribution, we establish a sufficient number M of random projections to guarantee that, with high probability, all pairwise Euclidean and geodesic distances between points on F are well-preserved under the mapping Phi. Our results bear strong resemblance to CS. As in CS, the random measurements we propose can be used to recover the original data in R^N. Moreover, like the fundamental bound in CS, our requisite M is linear in the â information levelâ K and logarithmic in the ambient dimension N; we also identify a logarithmic dependence on the volume and curvature of the manifold. In addition to recovering faithful approximations to manifold-modeled signals, however, the random projections we propose can also be used to discern key properties about the manifold. We discuss connections with existing techniques in manifold learning.