2023/12 The YAO genome: a more accurate genome sequence for the Han population



A team of geneticists has assembled a complete human genome reference for Han Chinese, the first of its kind, which could help advance precision medicine in China.

The telomere-to-telomere (T2T) gapless diploid genome sequence of a healthy male individual, named T2T-YAO, contains two complete sets of chromosomes, one from each parent, including the Y-chromosome, which passes only from father to son.

A similar effort, T2T-CHM13, published in 2022 by the U.S. National Institutes of Health, filled in the roughly 8 percent of the human genome, made up largely of highly repetitive regions, that had previously been unsequenced. However, it was of European origin and lacked a Y-chromosome, so it cannot adequately represent all individuals worldwide.

The scientists, from Peking University People’s Hospital and the Beijing Institute of Genomics (BIG) under the Chinese Academy of Sciences, collected samples from an ancient village in Hongtong County, Shanxi Province, in northern China, a place believed to be a starting point of the countrywide mass migration of the late 14th century. The YAO part of the name comes from the sampling site, which lies near the ruins of the capital of the legendary Chinese emperor Yao, while T2T stands for telomere-to-telomere, the end-to-end sequence of every chromosome in the genome.

A comparative analysis conducted by the Chinese team revealed that about 11 percent of YAO’s genome cannot be aligned to that of T2T-CHM13, with about 3,000 genes differing in each genome, a discrepancy much wider than previously estimated, said Gao Zhancheng of Peking University People’s Hospital, the paper’s corresponding author. The significant discrepancies between the two individual genomes in the study are mainly attributable to non-coding DNA, which accounts for nearly 99 percent of the genome, he added. Some of those non-coding sequences have recently been found to play important functional roles, such as regulating gene expression, while the functions of other non-coding DNA remain unknown.

Gao and his collaborators found that YAO is mostly of East Asian origin, admixed with sporadic predicted markers from South Asia, Europe, and America. The South Asian markers are slightly more numerous than those from Europe and America, indicating greater genetic exchange between East Asian and South Asian ethnic groups, according to the study.

In addition, the Y-chromosome haplotype found in YAO, a predominant type in China and Asia, has also been identified in ancient DNA samples from a Neolithic site in nearby Shaanxi Province dating back approximately 4,000 years, which suggests a potential genetic continuity in the region from the earliest days of human habitation in this part of China.

The human reference genome serves as a genetic “navigation map” widely used in human genetics and medical research, and the large genomic differences among ethnic groups suggest that YAO is a more appropriate reference genome for Han Chinese. The YAO genome can thus provide more accurate gene and mutation information for the Han Chinese population, supporting the establishment of a technical system and quality benchmark for clinical research such as genetic disease diagnosis, disease risk prediction, cancer studies, and precision medicine in China.
