Lingzhao Fang

Research leader

 

Project title

Decipher Complex Phenotypes via Cross-species Analysis of the Regulatory Genome

What is your project about?

In genetics, we still have a poor understanding of how genetic variation affects gene expression and, ultimately, complex phenotypes within and across species. This gap prevents us from fully realizing the potential of genomic technologies in precision agriculture (in farm animals) and personalized medicine (in humans). My project will develop DNA language models to identify genetic variants that regulate gene expression in three farm animal species (cattle, pigs, and chickens), mice, and humans. I will then investigate how patterns of gene regulation are shared and have evolved across species, and how they shape the genetic landscape of complex traits in both farm animals and humans.

How did you become interested in your particular field of research?

I have been interested in genetics and biology since high school, when two of my favorite books were ‘On the Origin of Species’ and ‘The Selfish Gene’. During my PhD, I started to use conventional statistical methods to explore the association of DNA variation with complex traits in farm animals. Seven years ago, I initiated the Farm animal Genotype-Tissue Expression (FarmGTEx) project to generate large-scale omics data in farm animals. More recently, the rapid development of large language models has started to transform our understanding of genetics and biology. I believe this is a great time to train large language models on large-scale functional genomics data to understand the regulatory grammar of DNA sequences across diverse cells and tissues in farm animals and humans. The knowledge and insights gained can then be used to enhance precision agriculture and human biomedicine.

What are the scientific challenges and perspectives in your project?

Understanding the genetic and molecular mechanisms of complex traits is essential for the development of precision agriculture and medicine. Although conventional genome-wide association studies (GWAS) have discovered numerous trait-associated loci in both farm animals and humans, the mechanisms underpinning complex traits remain largely unknown. The main obstacles are that many associated variants are non-coding, that nearby variants are in extensive linkage disequilibrium, and that some variants are rare. To address these challenges and facilitate dual-purpose research in agriculture and human biomedicine using farm animals, I will integrate large language models (LLMs), evolutionary genomics, and GWAS to decipher the regulatory mechanisms of complex traits. In the future, as more functional multi-omics data become available across species, it will be important to train more advanced multimodal LLMs to systematically quantify the evolutionary constraints on the regulatory genome.

What is your estimate of the impact your project may have on society in the long term?

The computational approaches and prioritized regulatory variants delivered by this project will substantially advance our understanding of the genetic and molecular control of complex traits. The systematic comparison of human and farm animal genomes at a functional level will facilitate dual-purpose research in agriculture and human biomedicine using farm animals, and will become a foundational resource for both livestock improvement and comparative studies in human health. In the long term, the project will pave the way for innovative applications in both fields: advancing sustainable farming, and enhancing our understanding of human diseases through better animal models for studying them and for testing drug safety.

What impact do you expect the Sapere Aude programme will have on your career as a researcher?

I am very grateful to receive this highly prestigious Sapere Aude: DFF-Starting Grant. I have a clear intention of becoming a research leader in the field of computational biology. This project will allow me to develop a world-leading research group in this field at the Center for Quantitative Genetics and Genomics (QGG), Aarhus University, Denmark. It will also enable me to strengthen and extend the international collaborative networks that I have established through the FarmGTEx project. I believe that this project will bring me, my group, and QGG to the forefront of methodology development and application for AI-based computational biology in the life sciences.

Background and personal life

I grew up in China before moving to Denmark for my PhD in bioinformatics and statistical genetics at Aarhus University in 2014. Afterwards, I went to Maryland, USA, for my postdoctoral training in functional genomics and bioinformatics at the USDA and the University of Maryland. From the beginning of 2019 to 2022, I was a Marie Skłodowska-Curie Actions COFUND Train@ed fellow working on both livestock and human genetics at the University of Edinburgh, UK. I was very pleased to move back to Aarhus, Denmark, in 2022 to work on integrative genomics and genetics. Outside science, I like hiking, swimming, cycling, reading history books, and cooking.