Research

Machine learning is obsessed with accuracy, but traditional science embraces unexpected results as sources of new insight. My work investigates how errors in carefully engineered ML tasks can teach us about hidden structures in data. I’m generally interested in mixture models, high-level data fusion, and stability under distribution shift, usually through the lens of causality.

Topics

Distribution Shift and Transportability
Statistical prediction models are often trained on data drawn from probability distributions that differ from those of their eventual use cases. My (upcoming) work uses insights from causal inference to develop new methods for building machine learning models that are robust to environmental changes.
Mixture Models for Causal Inference
Interventional distributions acted on by a universal unobserved confounder can be thought of as a mixture model. Cardinality assumptions allow identification of the within-component probability distributions, giving access to interventional distributions that were previously considered unidentifiable.
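As a toy illustration of the mixture view (not the method from the papers listed here), the sketch below assumes a hidden binary confounder U and two binary variables X and Y that are independent within each value of U. All numbers and names are hypothetical. The observed joint distribution is then a mixture of product distributions, and X and Y look dependent even though they are independent within each component.

```python
from itertools import product

# Hypothetical numbers chosen only for illustration.
weights = [0.5, 0.5]  # P(U = u) for a hidden binary confounder U
p_x1 = [0.9, 0.1]     # P(X = 1 | U = u)
p_y1 = [0.9, 0.1]     # P(Y = 1 | U = u)


def bernoulli(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals value (0 or 1)."""
    return p_one if value == 1 else 1.0 - p_one


def observed_joint(weights, p_x1, p_y1):
    """Mixture of product distributions: P(x, y) = sum_u P(u) P(x|u) P(y|u)."""
    joint = {}
    for x, y in product([0, 1], repeat=2):
        joint[(x, y)] = sum(
            w * bernoulli(px, x) * bernoulli(py, y)
            for w, px, py in zip(weights, p_x1, p_y1)
        )
    return joint


joint = observed_joint(weights, p_x1, p_y1)
p_x = joint[(1, 0)] + joint[(1, 1)]  # marginal P(X = 1)
p_y = joint[(0, 1)] + joint[(1, 1)]  # marginal P(Y = 1)

# If X and Y were independent in the observed data, these two would match.
print("P(X=1, Y=1)      =", joint[(1, 1)])
print("P(X=1) * P(Y=1)  =", p_x * p_y)
```

Recovering `weights`, `p_x1`, and `p_y1` from `joint` alone is the identification problem; this sketch only runs the forward direction.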
High-Level Data Fusion
We have developed “Expert Graphs” to study consistencies and inconsistencies in partial but overlapping expertise. As it turns out, this problem is deeply related to issues in voting theory, such as the Condorcet Paradox.
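For readers unfamiliar with the Condorcet Paradox, here is a minimal sketch of its classic three-voter form. The ballots and function names are hypothetical and are not taken from the Expert Graphs work. Each voter holds a perfectly consistent ranking, yet the pairwise majority preferences form a cycle.

```python
from itertools import combinations

# Three hypothetical voters, each ranking candidates from most to least preferred.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(b.index(x) < b.index(y) for b in ballots)
    return wins > len(ballots) / 2


def pairwise_winners(ballots):
    """Map each unordered candidate pair to its majority winner.

    Assumes an odd number of strict rankings, so no pair can tie.
    """
    candidates = sorted(ballots[0])
    return {
        (x, y): x if majority_prefers(x, y, ballots) else y
        for x, y in combinations(candidates, 2)
    }


winners = pairwise_winners(ballots)
for (x, y), w in winners.items():
    print(f"{x} vs {y}: majority prefers {w}")
```

With these ballots the majority prefers A over B, B over C, and C over A, so no candidate beats every other one.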
Time-Dependent Genomic Signatures for Cancer Classification and Prediction
We process non-coding regions of the genome which contain duplication and mutation signatures. These mutation profiles have been shown to be predictive of various forms of cancer.

Recent Publications

(2023). Causal Inference Despite Limited Global Confounding via Mixture Models. In CLEAR 2023.


(2022). Combining Binary Classifiers Leads to Nontransitive Paradoxes.


(2021). Glioblastoma signature in the DNA of blood-derived cells. PLoS ONE 16(9).


(2021). Source Identification for Mixtures of Product Distributions. In COLT 2021.


(2021). Synthesizing New Expertise via Collaboration. In ISIT 2021.
