GenBase, the latest gene sequence database developed by the National Genomics Data Center (NGDC) of the Beijing Institute of Genomics, Chinese Academy of Sciences (China National Center for Bioinformation), is now officially online, providing researchers with services for submitting, sharing, querying and downloading gene sequence data.
Gene sequence and annotation information (including DNA, RNA and protein sequence information) is among the core data underpinning research on gene function. With the rapid development of biology, scientists in China's life sciences have produced a huge volume of gene sequence data over the past decades. To safeguard the sovereignty and security of China's gene sequence data and to meet the practical needs of Chinese researchers in submitting, managing and sharing such data, NGDC has established the gene sequence database GenBase, benchmarked against the GenBank database of the U.S. National Center for Biotechnology Information (NCBI).
The core function of GenBase is to store, manage and share gene sequences, annotation information and protein sequences for all species, and to provide a series of web services for the submission, storage, publication and sharing of gene sequence data. The GenBase submission system guides users through detailed instructions for submitting key entity information and metadata, including submitter information, references, nucleotide sequences, data sources and data characteristics. GenBase applies strict quality control to ensure the accuracy, integrity and availability of gene sequence data. Based in China and serving the whole world, it accepts data submissions from researchers worldwide and shares data seamlessly with GenBank through a data exchange system. In addition, to make global gene sequence data available locally, GenBase integrates the gene sequence data published by the International Nucleotide Sequence Database Collaboration (INSDC), making it more efficient for domestic researchers to query and obtain data. At present, users can query or download through GenBase more than 420 million nucleic acid and encoded protein sequences that GenBank has made public.
In addition to GenBase, the Beijing Institute of Genomics, Chinese Academy of Sciences (China National Center for Bioinformation) has established 65 public databases for biomedical research in response to China's practical need to store and manage genomic data. These databases span 10 major categories: raw data, genomes and variation, gene expression, non-coding RNA, epigenomes, single-cell omics, biodiversity and biosynthesis, health and disease, literature and education, and tools. Together they form the initial framework of a data resource system for the secure collection, management, sharing and application of life science omics data in China, serving basic and translational research in biology and medicine.