
Research Seminar (19 Apr 2013)




Population Structure and Ancestry Estimation in
Large-scale Human Genetic Data


Dr Chaolong Wang

Biostatistics, Harvard University

19 April 2013 (Friday)
3:00 – 4:30pm

Seminar Room 7-03

7/F The HK Jockey Club Building for Interdisciplinary Research

5 Sassoon Road, Pokfulam, Hong Kong

Understanding spatial patterns of human genetic variation is important for both evolutionary biology and disease association studies. The recent expansion of genetic datasets in diverse populations has enabled investigation of population structure at unprecedented resolution. Many studies have reported qualitative similarity between geographic maps of population locations and statistical maps of human genetic variation in different regions of the world. To provide a quantitative and systematic evaluation of the similarity between genes and geography, I collected genotype data from over 100 populations worldwide and introduced a Procrustes analysis approach to quantify the similarity between geographic maps and different statistical maps of genetic variation. We showed that significant similarity between genes and geography generally exists at different geographic levels, supporting the view that geography plays a strong role in shaping human population structure. Next, I developed statistical methods to estimate individual ancestry from genotype and next-generation sequencing data. In particular, correcting for population structure is challenging in targeted sequencing experiments, because targeted regions include too few variants to accurately represent global ancestry and off-target regions are covered too poorly to allow accurate genotype calling. To address these challenges, I developed a method that skips genotype calling and directly analyzes sequence reads from off-target regions to estimate individual ancestry. Using simulations and real data, we showed that the method can accurately infer worldwide continental ancestry and fine-scale ancestry within Europe with a modest amount of sequence data. This method enabled us to add ancestry-matched controls from public resources to a targeted sequencing study of age-related macular degeneration, leading to the discovery of a rare variant that is significantly associated with increased risk of the disease.
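
For readers unfamiliar with Procrustes analysis, the idea of comparing a geographic map with a statistical map of genetic variation can be illustrated with a minimal sketch. This is not the speaker's implementation. It assumes that each individual or population has a two-column geographic coordinate (for example longitude and latitude) and a two-column genetic coordinate (for example the first two principal components), and it uses a simple row-permutation test as one plausible way to assess significance. The function names `procrustes_similarity` and `permutation_p_value` are hypothetical.

```python
import numpy as np


def procrustes_similarity(geo, gen):
    """Similarity in [0, 1] between two point configurations.

    Both inputs are (n_samples, n_dims) arrays with matching rows.
    Each configuration is centered and scaled to unit Frobenius norm;
    the best rotation/reflection and scaling of `gen` onto `geo` then
    leaves a residual D = 1 - (sum of singular values)^2, so
    sqrt(1 - D) equals the sum of singular values of geo^T gen.
    """
    geo = np.asarray(geo, dtype=float)
    gen = np.asarray(gen, dtype=float)
    geo_c = geo - geo.mean(axis=0)
    gen_c = gen - gen.mean(axis=0)
    geo_c = geo_c / np.linalg.norm(geo_c)
    gen_c = gen_c / np.linalg.norm(gen_c)
    singular_values = np.linalg.svd(geo_c.T @ gen_c, compute_uv=False)
    return float(singular_values.sum())


def permutation_p_value(geo, gen, n_perm=199, seed=0):
    """Assumed significance test: shuffle the rows of `gen` and count
    how often the permuted similarity reaches the observed one."""
    rng = np.random.default_rng(seed)
    gen = np.asarray(gen, dtype=float)
    observed = procrustes_similarity(geo, gen)
    hits = 0
    for _ in range(n_perm):
        shuffled = gen[rng.permutation(len(gen))]
        if procrustes_similarity(geo, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data only: a "genetic map" that is a rotated, rescaled,
    # shifted and slightly noisy copy of the "geographic map".
    rng = np.random.default_rng(1)
    geo = rng.normal(size=(30, 2))
    theta = np.deg2rad(35.0)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    gen = 0.5 * geo @ rotation + 3.0 + 0.05 * rng.normal(size=(30, 2))
    print("similarity:", procrustes_similarity(geo, gen))
    print("p-value:", permutation_p_value(geo, gen))
```

A similarity close to 1 means the genetic map is nearly a rotated and rescaled copy of the geographic map, while a value near 0 means the two configurations share little structure.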

About the Speaker: Please visit Dr Chaolong Wang's CV



For enquiries, please call 2831-5500.