Ranks, centers, and dispersion measures for networked and metric data

seminar
Author

Giulia Bertagnolli

Published

March 1, 2024

Abstract
Non-parametric statistics rests on the concept of ranks, which presumes that data points can be ordered. A multivariate (Euclidean) space has no natural order relation, and neither, worse still, do the vertices of a graph or the elements of a metric space, so the median and quantiles of distributions on these spaces cannot be defined in the usual way. Statistical data depth functions, introduced from 1975 onward, address this problem: we now have the Tukey median and Tukey’s contours (Tukey 1975), along with a variety of other multivariate medians induced by different ways of ordering points, i.e. by different data depths. The simplicial volume depth (Oja 1983) is particularly well suited to networked and metric data: it allows us not only to define the median of a network, but also to quantify its dispersion through an unbiased and consistent estimator. Time permitting, we will see a sketch of the (unfinished) proof of a central limit theorem for this dispersion estimator. (Joint work with Claudio Agostinelli.)
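
To make the idea of a depth-induced ordering concrete, here is a minimal sketch of a planar simplicial volume depth in the spirit of Oja (1983). It assumes the simple form 1 / (1 + mean triangle area), leaves out the scale normalisation that is often applied, and does not attempt the network or metric-space version discussed in the talk. The names `triangle_area`, `oja_depth`, and `sample` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r in the plane."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def oja_depth(x, sample):
    """Illustrative planar simplicial volume depth (an assumed, unnormalised form).

    The depth is 1 / (1 + average area of the triangles spanned by x and
    every pair of sample points): central points span small triangles on
    average and therefore receive a depth closer to 1.
    """
    pairs = list(combinations(sample, 2))
    mean_area = sum(triangle_area(x, a, b) for a, b in pairs) / len(pairs)
    return 1.0 / (1.0 + mean_area)


# Toy sample: the four corners of a square and its centre.
sample = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]

# Ranking the points by decreasing depth gives a centre-outward ordering;
# the deepest sample point plays the role of a median.
ranked = sorted(sample, key=lambda p: oja_depth(p, sample), reverse=True)
deepest = ranked[0]
```

On this toy sample the centre of the square is the deepest point and the four corners tie behind it, which is the centre-outward ranking a depth function is meant to provide.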

References

Oja, Hannu. 1983. “Descriptive Statistics for Multivariate Distributions.” Statistics & Probability Letters 1 (6): 327–32. https://doi.org/10.1016/0167-7152(83)90054-8.
Tukey, John W. 1975. “Mathematics and the Picturing of Data.” In Proceedings of the International Congress of Mathematicians, Vancouver, 1975, 2:523–31.