SE Asia genome map of 3,000 individuals (SEA3K) provides new insights into human evolution

https://www.cas.cn/cm/202505/t20250519_5068891.shtml

https://www.nature.com/articles/s41586-025-08998-w

Southeast Asia is one of the most important regions in the world for the study of human evolution. Although it is home to the world's largest indigenous population, about one third of the global total, only 1.6% of the 670,000 human genome sequences reported worldwide come from Southeast Asian populations, and only 163 individuals from indigenous populations are included.
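To make the scale of this gap concrete, the short Python sketch below redoes the arithmetic with the figures quoted above. It assumes that the 163 indigenous individuals are counted within the 670,000 reported genomes; the variable names are illustrative only.

```python
# Figures quoted in the text above.
total_genomes = 670_000        # human genomes reported worldwide
sea_share = 0.016              # 1.6% come from Southeast Asian populations
indigenous_individuals = 163   # indigenous individuals included (assumed to be part of the total)

# Approximate number of Southeast Asian genomes implied by the 1.6% share.
sea_genomes = round(total_genomes * sea_share)

# Share of all reported genomes that come from indigenous individuals.
indigenous_share = indigenous_individuals / total_genomes

print(f"Southeast Asian genomes: about {sea_genomes:,}")
print(f"Indigenous individuals as a share of all genomes: {indigenous_share:.3%}")
```

On these numbers, a region holding about one third of the world's indigenous population contributes roughly 10,700 genomes, and its indigenous individuals make up well under 0.1% of the global total.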

The Kunming Institute of Zoology, Chinese Academy of Sciences, in collaboration with research institutions in several Southeast Asian countries, has now built a genetic variation dataset for Southeast Asian populations. After collecting samples from populations spanning five major language families, six countries, and more than 30 locations, the team completed 3,023 deep whole-genome sequences and assembled SEA3K, the most complete genetic variation dataset of Southeast Asian populations.

The researchers identified more than 20 million short sequence variants and nearly 25,000 structural variants that are newly discovered in Southeast Asian populations. They found that the genetic structure of most Southeast Asian populations follows their main geographical distribution rather than clustering by language family. This indicates that differentiation between populations arose mainly through geographical isolation, and it confirms the region's complex history of genetic admixture and language replacement.

Four key genetic components were detected in Southeast Asian populations. The most important is a unique ancient component that dominates in the populations of Cambodia and the Andaman Islands and may have originated from ancient populations. In addition, the study showed that Southeast Asian populations went through a severe bottleneck during the Last Glacial Maximum and then grew explosively, driven by agricultural expansion.

The unique and diverse climate and geography of Southwest China and Southeast Asia are among the main reasons for the diversity of populations in the region. The tropical rainforest environment in particular has had a profound impact on the genetic adaptation of local populations. The research team identified 44 genomic regions under strong positive selection, covering 89 genes, of which 72 are positive selection targets reported for the first time. These genes are involved in adaptive traits such as physique, immunity, and metabolism, revealing the distinctive evolutionary strategies by which Southeast Asian people adapted to tropical environments.

Many studies of ancient human genomes have shown that the genomes of modern humans retain sequences from at least two extinct archaic human groups: Neanderthals and Denisovans. The research team carried out a systematic analysis of archaic introgression and showed that Southeast Asian populations do carry multiple distinct patterns of Denisovan introgression. This finding indicates that Denisovans may have been widely distributed in eastern Asia during the Paleolithic, and may have interbred with modern humans on multiple occasions in mainland Southeast Asia.

Southeast Asia lies in the tropical and subtropical zone. Its hot, humid rainforest environment has shaped the rich and distinctive phenotypic characteristics of the local population, and the region also carries a heavy burden of tropical diseases, accounting for one third of their global incidence. These diseases seriously affect population health and economic development in the region.

Among 10 high-frequency pathogenic variants specific to Southeast Asia, the most notable was a newly identified pathogenic variant in HBA2, a gene associated with α-thalassemia. The frequency of this variant reaches as high as 28.6% in Southeast Asian populations, while it is almost zero in other populations. The researchers believe this is the evolutionary result of balancing selection between thalassemia risk and resistance to malaria.
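As a rough illustration of what such a frequency could mean at the genotype level, the sketch below applies the standard Hardy-Weinberg proportions. It rests on two assumptions not stated in the summary: that 28.6% is an allele frequency in a single population, and that the site is biallelic. Hardy-Weinberg also ignores selection and population structure, so under balancing selection the real proportions would differ; the function name `hardy_weinberg` is hypothetical and this is not the study's method.

```python
def hardy_weinberg(q: float) -> dict[str, float]:
    """Expected genotype frequencies at a biallelic site with variant allele frequency q.

    Assumes random mating and no selection, which is only an approximation here.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p = 1.0 - q
    return {
        "non_carrier": p * p,        # no copies of the variant
        "heterozygote": 2 * p * q,   # one copy (the genotype that balancing selection would favor)
        "homozygote": q * q,         # two copies
    }


# Assumption: 28.6% is treated as an allele frequency.
freqs = hardy_weinberg(0.286)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.1%}")
```

Under these assumptions about four in ten people would carry a single copy of the variant, which is the group expected to benefit if carrying one copy protects against malaria.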

The Kunming institute has officially launched the second phase of the “Southeast Asian Population Genome Project” (SEA10K) in conjunction with international partners, aiming to build a high-resolution genome map covering 10,000 people across Southeast Asia.
