A completely new, comprehensive human genome reference will be created in a National Institutes of Health-funded, multi-institutional initiative. The Human Pangenome Project will be based on the complete genome sequences of 350 individuals from a variety of the world’s populations.
The international Human Genome Project in 2000 produced the first working draft of a genome sequence from one person’s genome – a genetic blueprint for a human being. That sequence was annotated over the years and became the primary guide for understanding new genomic data. It also propelled the field of genomic medicine. The current reference sequence, however, remains incomplete and misses the diversity and genetic variation among human populations.
As part of a new Human Genome Reference Program, the National Human Genome Research Institute will fund two multi-site centers. Grants totaling about $29.5 million over the next five years, pending availability of funds, will support advances in DNA sequencing as well as new mapping and computational methods.
Researchers at the University of Washington School of Medicine will join this international effort. The Human Reference Genome Sequencing Center will be headed by principal investigator David Haussler at the University of California, Santa Cruz.
The UW team will be led by Evan Eichler, professor of genome sciences and a Howard Hughes Medical Institute investigator.
Eichler’s group is known for work on structural variation in the human genome and on genomic instability, as well as for contributions to comparative DNA sequencing and autism genetics. His team will focus on more difficult genome regions that vary among individuals and will use advances in highly accurate long-read sequencing to characterize those areas.
“We finally have the technology and methods to go after the parts of the human genome that were beyond our reach 20 years ago,” said Eichler. “It’s an exciting time for human genetics with implications for improved variant discovery associated with disease.”
Other lead investigators for the pangenome sequencing project include Ira Hall at Washington University in St. Louis and Erich Jarvis at Rockefeller University.
According to the NHGRI news announcement, almost all biomedical studies that use or analyze human genomic data rely on the established reference sequence of the human genome. For example, it has become the standard to follow in assembling genome sequences from other individuals. Higher-quality reference sequences are expected to help researchers pinpoint the genomic location of disease-related variants more precisely.
“It has grown more and more important to have a high-quality, highly usable human genome reference that represents the diversity of human populations,” said Adam Felsenfeld, program director, Division of Genome Sciences at the NHGRI. “The proposed improvements will serve the growing basic and clinical genomics research communities by helping them interpret both research and patient genome sequences.”
Over time, additional human genome sequences will be incorporated into the reference to enhance its usefulness for human DNA analysis.
The NHGRI has also awarded additional major funding to Washington University in St. Louis, UC Santa Cruz and the European Bioinformatics Institute to coordinate with the National Center for Biotechnology Information to form the WashU-UCSC-EBI Human Pangenome Reference Center. Together, the sequencing and the reference initiatives will carry out the Human Pangenome Project.
The second center will build the reference by weaving the collection of human genome sequences into a map of similarities and differences. This representation would be amenable to computational strategies for comparing new genome sequences to the reference map. The lead investigators for the Human Pangenome Reference Center include Ting Wang and Ira Hall at Washington University in St. Louis, Benedict Paten at UC Santa Cruz, and Paul Flicek at the European Bioinformatics Institute.
Additional institutions and leading scientists in the Human Pangenome Project include Richard Durbin at the University of Cambridge in England, Karen Miga at UC Santa Cruz, Gene Myers at the Max Planck Institute of Molecular Cell Biology and Genetics in Germany, Kerstin Howe at the Wellcome Sanger Institute in England, Adam Phillippy at the NHGRI, Heng Li at the Broad Institute in Boston, Paolo Carnevali at the Chan Zuckerberg Initiative in Palo Alto, California, and several others.
The Genome Reference Consortium, the international group that has developed and maintained the current human genome reference since the end of the Human Genome Project, will be essential to the new pangenome program.
Several DNA sequencing companies will support the project centers by bringing in technologies as contributing partners.