Problems of Scale in Statistical Genetics

One of the central theoretical and practical problems of genome-wide association studies (GWAS) is the sheer volume of data. Standard genotyping arrays today start at 500,000 genetic markers. After imputation against publicly accessible reference population databases such as HapMap or the 1000 Genomes Project, this number often increases to around 9 million markers. Sequencing-based analyses yield even larger amounts of genetic data. This scale creates problems both in data preparation and in the subsequent association analysis. We work on the development of statistical methods for such high-dimensional genetic data.

Last updated March 2023