Research

I am interested in the synthesis and transportation of knowledge from and between multiple settings. My work spans the fields of causality, mixture models, data fusion, and distribution shift. I am especially interested in how errors in carefully engineered training tasks can teach us about hidden causal structures.

Topics

Distribution Shift and Transportability
Statistical prediction models are often trained on data drawn from a different probability distribution than the one they encounter in deployment. My work uses insights from causal inference to develop new methods for building machine learning models that are robust to environmental changes.
Mixture Models for Causal Inference
Combining multiple populations or contexts induces universal confounding on a structural causal model (SCM). Assuming a bound on the cardinality of a discrete universal confounder turns the problem into a mixture model, allowing identification of within-source probability distributions. This perspective expands the notion of causal identifiability: many relationships that are unidentifiable from the graph alone become identifiable under this assumption.
Decision Fusion
Many medical settings with privacy concerns deny direct access to data, requiring us to synthesize conclusions at a higher level. This setting is riddled with paradoxes, most notably that conclusions are not necessarily transitive. We use “expert graphs” to define a new notion of consistency in networks of conclusions from differing contexts.
Time-Dependent Genomic Signatures for Cancer Classification and Prediction
We process non-coding regions of the genome that contain duplication and mutation signatures. These mutation profiles have been shown to be predictive of various forms of cancer.

Recent Publications

(2023). Causal Information Splitting: Engineering Proxy Features for Robustness to Distribution Shifts. To appear in UAI 2023.

(2023). Causal Inference Despite Limited Global Confounding via Mixture Models. In CLeaR 2023.

(2022). Combining Binary Classifiers Leads to Nontransitive Paradoxes.

(2021). Glioblastoma signature in the DNA of blood-derived cells. PLoS ONE 16(9).

(2021). Source Identification for Mixtures of Product Distributions. In COLT 2021.
