Stochastic Tverberg-type theorems and their relevance in Machine Learning and Statistical Inference

Discrete geometry can play a role in the foundations of data science. Here I present concrete examples. In statistical inference we wish to recover the properties or parameters of a distribution or model from sufficiently many samples. A famous example is logistic regression, a popular non-linear model in multivariate statistics and supervised learning. Users often rely on maximum likelihood estimation, but how much training data do we need, as a function of the dimension of the covariates, before we expect a maximum likelihood estimator (MLE) to exist with high probability?
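
As a minimal numerical sketch of the failure mode (my own illustration with synthetic, linearly separable data; not code from the papers): when the two classes are separable, the logistic log-likelihood keeps increasing along any separating direction, so no finite maximizer exists.

```python
import numpy as np

# Two well-separated Gaussian classes in the plane (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (20, 2)),   # class 0
               rng.normal(+2.0, 0.5, (20, 2))])  # class 1
y = np.repeat([0, 1], 20)

def log_likelihood(beta):
    """Logistic regression log-likelihood, computed stably."""
    z = X @ beta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

# Scale a separating direction: the log-likelihood climbs toward its
# supremum (0) but never attains it, so no finite MLE exists.
w = np.array([1.0, 0.0])
for t in [1, 10, 100, 1000]:
    print(f"t = {t:4d}   log-likelihood = {log_likelihood(t * w):.6f}")
```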

Similarly, in unsupervised learning and non-parametric statistics, one wishes to uncover the shape and patterns of one or more measures from samples, using only the intrinsic geometry and topology of the sample. A famous example of this type of method is the $k$-means clustering algorithm. A fascinating challenge is to explain the variability in the behavior of $k$-means under distinct random initializations and its dependence on the shapes of the clusters.
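
A small sketch of the variability in question (assuming scikit-learn is available; the synthetic data and parameters are illustrative assumptions, not from the talk): running Lloyd's $k$-means once per random initialization can land in different local optima with different objective values.

```python
import numpy as np
from sklearn.cluster import KMeans

# Three Gaussian blobs in the plane (illustrative data only).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [2.5, 4.0]])
X = np.vstack([rng.normal(c, 0.8, (50, 2)) for c in centers])

# One Lloyd run per seed (n_init=1), so each run reports its own local optimum.
for seed in range(5):
    km = KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(X)
    print(f"seed {seed}: k-means objective (inertia) = {km.inertia_:.2f}")
```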

In this talk we present new stochastic combinatorial theorems, which are new variations of Tverberg's theorem. They give bounds on the probability that a maximum likelihood estimator exists in multinomial logistic regression, and they quantify the variability of clustering under random initializations. Along the way we will see fascinating connections to the coupon collector problem, to topological data analysis, and to the computation of Tukey centerpoints of data clouds (a high-dimensional generalization of the median).
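
For readers unfamiliar with Tukey centerpoints, here is a small, hedged sketch (my own illustration, not code from the papers): the Tukey depth of a point $p$ with respect to a data cloud is the smallest number of data points contained in any closed halfspace through $p$, and the centerpoint theorem guarantees a point of depth at least $n/(d+1)$. The brute-force direction scan below is a planar approximation for illustration, not an exact or efficient algorithm.

```python
import numpy as np

def tukey_depth_2d(p, points, n_directions=3600):
    """Approximate Tukey (halfspace) depth of p in the plane by scanning
    a fine grid of directions and counting points in each closed halfplane."""
    angles = np.linspace(0.0, np.pi, n_directions, endpoint=False)
    directions = np.column_stack([np.cos(angles), np.sin(angles)])
    proj = (points - p) @ directions.T            # shape (n_points, n_directions)
    counts_pos = np.sum(proj >= 0, axis=0)        # points in each closed halfplane
    counts_neg = np.sum(proj <= 0, axis=0)        # ... and in the opposite one
    return int(np.minimum(counts_pos, counts_neg).min())

# Illustrative data: 90 Gaussian points in the plane, so the centerpoint
# theorem guarantees some point of depth at least n/(d+1) = 30.
rng = np.random.default_rng(2)
cloud = rng.normal(size=(90, 2))
print("depth of coordinate-wise median:", tukey_depth_2d(np.median(cloud, axis=0), cloud))
print("depth of a far-away point:      ", tukey_depth_2d(np.array([10.0, 10.0]), cloud))
```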

This is joint work (in various papers) with T. Hogan, R. D. Oliveros, E. Jaramillo-Rodriguez, A. Torres-Hernandez, and D. Yang.