Integrating genome-wide association studies (GWAS) with functional genomics can aid biological interpretation of disease-associated variants. First, I will introduce the single-cell disease relevance score (scDRS), a method that identifies disease-critical cell populations by integrating GWAS with single-cell RNA-sequencing (scRNA-seq). Previous methods can identify disease associations for predefined cell types, whereas scDRS can detect heterogeneous disease associations within classical cell types without predefined cell type annotations. Applying scDRS to 74 diseases/traits and 1.3 million single-cell gene-expression profiles, we identified subpopulations of disease-associated cells not captured by existing cell-type labels, such as T cell subpopulations associated with inflammatory bowel disease, partially characterized by their effector-like states. Second, I will discuss optimizing parameters in scRNA-seq experiments to extract as much biological information as possible. An underlying question is how to allocate a limited sequencing budget between the number of cells and the sequencing depth. Instead of focusing on the results of a specific analysis, we used an information-theoretic approach to evaluate the fundamental limit in learning the underlying biological ground truth. We determined that, for estimating many important gene properties, the optimal allocation was to sequence at a depth of around one read per cell per gene, slightly deeper than most existing data sets. Interestingly, the corresponding optimal estimator was different from the commonly used plug-in estimator, which produced severely biased estimates.
Dr. Martin Jinye Zhang is a research associate at Harvard School of Public Health, advised by Prof. Alkes Price. He obtained a PhD in Electrical Engineering from Stanford University, advised by Prof. David Tse and Prof. James Zou. His honors include the 2021 ASHG Epstein postdoc semifinalist award, selection among the 2020 Top 50 Life and Biological Sciences Articles in Nature Communications, and the 2019 RECOMB best paper award. His research focuses on the development of statistical methods that integrate GWAS and functional genomics to uncover the genetic basis of human disease. Areas of interest include functional components of heritability, disease-critical cellular contexts, and causal inference approaches to identify disease genes and proteins.
Faculty Host: Mohammed El-Kebir
Meeting ID: 835 6347 3816; Password: csillinois