PhD Seminar Series

"Seminari di Statistica del Dottorato"

Meta Fusion: A Unified Framework For Multimodal Fusion with Adaptive Mutual Learning

Babak Shahbaba (Department of Statistics, University of California, Irvine)
13th February 2026, 2:00 pm

Abstract: Developing effective multimodal data fusion strategies has become increasingly important for enhancing the predictive power of statistical machine learning methods across a wide range of applications, from autonomous driving to medical diagnosis. Traditional fusion methods, including early, intermediate, and late fusion, integrate data at different stages, each offering distinct advantages and limitations. In this talk, we discuss Meta Fusion, a flexible and principled framework that unifies these existing strategies as special cases. Motivated by deep mutual learning and ensemble learning, Meta Fusion constructs a cohort of models based on various combinations of latent representations across modalities, and further boosts predictive performance through “soft information sharing” within the cohort. Our approach is model-agnostic in learning the latent representations, allowing it to flexibly adapt to the unique characteristics of each modality. Theoretically, our soft information sharing mechanism reduces the generalization error. Empirically, Meta Fusion consistently outperforms conventional fusion strategies in extensive simulation studies. We further validate our approach on real-world applications, including Alzheimer’s disease detection and neural decoding.


Copula Tensor Count Autoregressions

Mirko Armillotta (Department of Economics and Finance, University of Rome Tor Vergata)
17th November 2025, 12:00 pm

Abstract: This paper presents a novel copula-based autoregressive framework for multi-layer arrays of integer-valued time series with tensor structure. Our framework generalizes recent advances in tensor time series models for real-valued data to a context that accounts for the unique properties of integer-valued data, such as discreteness and non-negativity. The model incorporates feedback effects for the counts' temporal dynamics and introduces new identification constraints. An asymptotic theory is developed for a Two-Stage Maximum Likelihood Estimator (2SMLE) for the model's parameters. The estimator balances the challenges of high-dimensionality, interdependence of the different count series, and computational stability. Together, this substantially pushes the frontier for modeling high-dimensional, structured tensor time series of counts. An application to tensor crime counts demonstrates the practical usefulness of the proposed methodology.


Pairwise Comparisons without Stochastic Transitivity: Model, Theory and Applications

Yunxiao Chen (London School of Economics)
12th September 2025, 10:00 am

Abstract: Most statistical models for pairwise comparisons, including the Bradley-Terry (BT) and Thurstone models and many extensions, make a relatively strong assumption of stochastic transitivity. This assumption imposes the existence of an unobserved global ranking among all the players/teams/items and monotone constraints on the comparison probabilities implied by the global ranking. However, the stochastic transitivity assumption does not hold in many real-world scenarios of pairwise comparisons, especially games involving multiple skills or strategies. As a result, models relying on this assumption can have suboptimal predictive performance. In this paper, we propose a general family of statistical models for pairwise comparison data without a stochastic transitivity assumption, substantially extending the BT and Thurstone models. In this model, the pairwise probabilities are determined by an (approximately) low-dimensional skew-symmetric matrix. Likelihood-based estimation methods and computational algorithms are developed, which allow for sparse data with only a small proportion of observed pairs. Theoretical analysis shows that the proposed estimator achieves minimax-rate optimality, which adapts effectively to the sparsity level of the data. The spectral theory for skew-symmetric matrices plays a crucial role in the implementation and theoretical analysis. The proposed method’s superiority over the BT model, along with its broad applicability across diverse scenarios, is further supported by simulations and real data analysis. This is joint work with Sze Ming Lee (PhD student at LSE).
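
As a rough illustration of how a skew-symmetric matrix can encode comparison probabilities that violate stochastic transitivity, here is a minimal sketch. The logistic link, the function names (`win_prob`, `bt_matrix`, `is_skew_symmetric`) and the rock-paper-scissors matrix `RPS` are assumptions made for this example; they are not taken from the paper, and no estimation is attempted.

```python
import math


def win_prob(M, i, j):
    """P(i beats j), assuming a logistic link applied to the entry M[i][j]."""
    return 1.0 / (1.0 + math.exp(-M[i][j]))


def bt_matrix(theta):
    """Bradley-Terry as a special case: M[i][j] = theta[i] - theta[j]."""
    return [[ti - tj for tj in theta] for ti in theta]


def is_skew_symmetric(M):
    """Check M[i][j] == -M[j][i] for all pairs (so the diagonal is zero)."""
    n = len(M)
    return all(M[i][j] == -M[j][i] for i in range(n) for j in range(n))


# Hypothetical rock-paper-scissors style matrix: 0 beats 1, 1 beats 2, 2 beats 0.
# No global ranking of the three items can reproduce this cycle.
RPS = [
    [0.0, 1.0, -1.0],
    [-1.0, 0.0, 1.0],
    [1.0, -1.0, 0.0],
]

if __name__ == "__main__":
    for i, j in [(0, 1), (1, 2), (2, 0)]:
        print(f"P({i} beats {j}) = {win_prob(RPS, i, j):.3f}")
```

Skew-symmetry guarantees that `win_prob(M, i, j) + win_prob(M, j, i)` equals 1 under this link, while a matrix built by `bt_matrix` always implies a global ranking by `theta`.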


Nearest neighbor matching

Fang Han (Department of Statistics, University of Washington, Seattle)
17th April 2025, 10:30 am

Abstract: In two landmark Econometrica papers, Abadie and Imbens proved that the nearest neighbor (NN) matching estimator of the average treatment effect, when using a fixed number of neighbors, is asymptotically normal but semiparametrically inefficient and bootstrap inconsistent. In this talk, I will show that the same NN matching estimator becomes asymptotically normal, doubly robust, semiparametrically efficient, and bootstrap consistent as long as we force the number of NNs to diverge with the sample size.
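
For readers unfamiliar with the estimator, the following is a minimal sketch of nearest neighbor matching for the average treatment effect. It assumes a single scalar covariate, absolute distance, ties broken by index order, and no bias correction; the function name `nn_matching_ate` and its arguments are hypothetical and chosen for this example only.

```python
def nn_matching_ate(x, d, y, m):
    """Matching estimate of the average treatment effect.

    x: scalar covariates, d: treatment indicators (0 or 1), y: outcomes,
    m: number of nearest neighbors taken from the opposite treatment arm.
    Assumes both arms are non-empty.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        # Units in the opposite arm, ordered by covariate distance to unit i.
        opposite = sorted(
            (j for j in range(n) if d[j] != d[i]),
            key=lambda j: abs(x[j] - x[i]),
        )
        neighbors = opposite[:m]
        # Impute the unobserved potential outcome by the neighbors' mean outcome.
        imputed = sum(y[j] for j in neighbors) / len(neighbors)
        if d[i] == 1:
            total += y[i] - imputed
        else:
            total += imputed - y[i]
    return total / n


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0, 3.0]
    d = [0, 1, 0, 1]
    y = [1.0, 6.0, 1.0, 6.0]  # toy data with a constant effect of 5
    print(nn_matching_ate(x, d, y, m=1))
```

The fixed-neighbor setting corresponds to holding `m` constant as the sample grows, whereas the regime discussed in the talk lets `m` increase with the sample size; the abstract does not state a specific rate.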


Using machine learning for confounding control in pharmacoepidemiology

Robert Platt (Departments of Pediatrics and of Epidemiology, Biostatistics and Occupational Health, McGill University - Montreal, Québec - Canada)
25th March 2025, 12:00 pm

Abstract: Machine learning tools are used extensively for prediction, but they are not typically designed for causal inference. However, tools such as targeted learning have been developed to exploit the strengths of machine learning in causal inference. In this presentation I will describe machine learning and its use to infer causation in pharmacoepidemiology. I will discuss settings in which machine learning can be used together with appropriate methods for confounding control, when it is useful, and when it may be unnecessary.

Last updated

03.02.2026
