PILLAR 01

Computational Genomics and Pangenomics

We develop scalable methods for large biobanks and population cohorts in which genetic relatedness is pervasive. The lab has a sustained research program in identity-by-descent (IBD) detection, algorithms based on the positional Burrows-Wheeler transform (PBWT), haplotype sharing, genotype imputation, relatedness inference, local ancestry, and pangenomic indexing.

Representative directions include RaPID for biobank-scale IBD detection, RAFFI for relatedness inference, FiMAP for IBD mapping, runs-of-homozygosity (ROH) analysis, local ancestry inference, and pangenome graph indexing based on GBWT and RLBWT, in work connected to the Human Pangenome Reference Consortium.

PILLAR 02

Clinical EHR Deep Learning

We build deep learning systems for longitudinal electronic health records (EHRs), with an emphasis on generalizable clinical representation learning. Med-BERT established a pretraining and fine-tuning paradigm for structured EHR diagnosis-code sequences, while CovRNN and PK-RNN-style models address clinical trajectories and outcome prediction.

Current work focuses on large-scale clinical foundation models, multi-task fine-tuning, deployment-oriented validation, multimodal clinical representation learning, and links between EHR phenotypes and biobank-scale genetic discovery.

PILLAR 03

AI-Powered Imaging Genetics and GWAS

We use AI to derive quantitative phenotypes from medical images and connect them to genetic variation through genome-wide association studies (GWAS). The lab's UDIP framework applies unsupervised and self-supervised representation learning to imaging genetics, covering brain MRI, fractional anisotropy (FA) maps of white matter from diffusion MRI, and retinal imaging.

Current projects include multimodal imaging representation learning, imaging-derived phenotype GWAS, Alzheimer's disease neuroimaging biomarkers, retinal imaging genetics, and multi-omics integration.