I am an assistant professor in the Department of Statistics at the University of Florida. Previously, I completed my PhD in Biostatistics at Harvard University, working with professors Francesca Dominici and Corwin Zigler, followed by two years of postdoctoral training in the Department of Statistical Science at Duke University under the mentorship of professors David Dunson and Fan Li.
My research interests lie broadly in the fields of causal inference and Bayesian modeling.
I am particularly motivated by providing useful, applicable, and creative solutions to important scientific questions, and by developing tools that allow researchers to analyze data properly and flexibly.
When I’m not working on statistics, I enjoy running and rock climbing. I like to grow my own vegetables, though they don’t always survive. I love traveling and learning new languages. Within the next few years, I’m hoping to spend time hiking in Mongolia.
In cluster-randomized trials, individuals within a cluster are assigned the same treatment condition, but the treatment uptake status may vary across individuals due to noncompliance. We propose a semiparametric framework to evaluate the individual compliance effect and network assignment effect within principal strata exhibiting different patterns of noncompliance. The individual compliance effect captures the portion of the treatment effect attributable to changes in treatment receipt, while the network assignment effect reflects the pure impact of treatment assignment and spillover among individuals within the same cluster. We characterize new structural assumptions for nonparametric point identification, and we develop semiparametrically efficient estimators that combine data-adaptive machine learning methods with efficient influence functions.
In the presence of interference, IPW estimators often suffer from high variance. Under low-rank assumptions on the potential outcomes, we design optimal covariate balancing estimators for settings with interference. The framework encompasses commonly invoked assumptions such as stratified or additive interference.
We study design-based causal inference when there are two distinct sets of units, one on which interventions are applied, and one on which the outcome is measured. We introduce various causal estimands and propose weighting estimators from a design-based perspective. We derive the estimators’ variance and prove consistency for a growing bipartite graph. In our analysis, we illustrate complications that arise from the positivity assumption, the experimental design, and the structure of the graph.
When the treated units are spatial areas, their relationship with the control units is expected to exhibit spatial structure. Under the vertical regression framework, we propose a Bayesian approach for estimating causal effects with spatial panel data.
Estimating the parameters of a temporal, spatio-temporal, or mutually-exciting Hawkes process based on data that are available in aggregated form by time, space, or both.
Investigating treatment effect heterogeneity with spatio-temporal data, point pattern treatment and outcome, and spatial or spatio-temporal potential moderators.
In cluster randomized experiments with selection bias due to recruitment, data are often available only on those who were recruited. Using a principal stratification framework, we show that causal effects on the overall population are identifiable from the recruited sample alone.
We investigate the complications and opportunities that arise when drawing causal inference from spatial observational data. We introduce causal diagrams that allow us to investigate the impact of spatial confounders, interference, and the inherent spatial structure in the exposure variable, and we illustrate that causal inference with spatial data differs in crucial ways from its counterpart with independent observations. We then propose an approach that mitigates bias from unmeasured spatial confounding and incorporates interference within one framework.
We propose a latent factor interaction model for networks measured with error, and a variable importance metric for latent models. We use the model to address the geographic and taxonomic bias of ecological studies of species’ interactions, and identify the important bird and plant covariates for forming and detecting interactions.
Causal inference with interference for realistic treatment allocation programs. Evaluating the comparative effectiveness of power plant emission reduction strategies for reducing ambient ozone concentrations.