The Sparse Hausdorff Moment Problem, with Application to Topic Models

Spencer Gordon, Bijan Mazaheri, Yuval Rabani, Leonard Schulman

April, 2019

Abstract

We consider the problem of identifying, from its first m noisy moments, a probability distribution on $[0, 1]$ of support $k < \infty$ . This is equivalent to the problem of learning a distribution on m observable binary random variables $X_{1}, X_{2}, \dots, X_{m}$ that are iid conditional on a hidden random variable $U$ taking values in $1, 2, \dots, k$ . Our focus is on accomplishing this with $m = 2 k$ , which is the minimum $m$ for which verifying that the source is a $k$ -mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models.

We give an algorithm for identifying a $k$ -mixture using samples of $m = 2 k$ iid binary random variables using a sample of size $(1 / w_{min}) 2 \cdot (1 / ζ) O (k)$ and post-sampling runtime of only $O (k^{2} + o (1))$ arithmetic operations. Here $w_{min}$ is the minimum probability of an outcome of $U$ , and $ζ$ is the minimum separation between the distinct success probabilities of the $X_{i}$ s. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $w_{min} \cdot ζ O (k)$ . It is known that the sample complexity of any solution to the identification problem must be at least exponential in $k$ . Previous results demonstrated either worse sample complexity and worse $O (k c)$ runtime for some $c$ substantially larger than 2, or similar sample complexity and much worse $k O (k^{2})$ runtime.

Type

Preprint

Discrete Mixture Models