The shrinking human genome
Researchers from the Spanish National Cancer Research Centre (CNIO) have revised the number of human protein-coding genes (those that can generate proteins) down to 19,000. This is 1700 fewer genes than in the most recent annotation and well below the initial estimates of 100,000 genes.
Published in the journal Human Molecular Genetics, the study was led by Alfonso Valencia, Vice-Director of Basic Research at CNIO and head of the Structural Computational Biology Group, and Michael Tress, a researcher in the group. Valencia described the repeated downward corrections to the number of protein-coding genes in the human genome over the years as “the shrinking human genome”.
The scientists began by analysing proteomics experiments, proteomics being the most powerful tool for detecting protein molecules. They then integrated data from seven large-scale mass spectrometry studies, covering more than 50 human tissues, “in order to verify which genes really do produce proteins”, said Valencia.
The results brought to light just over 12,000 proteins, which the researchers mapped to the corresponding regions of the genome. They then analysed the thousands of genes that were annotated in the human genome but did not appear in the proteomics analysis.
According to Tress, “1700 of the genes that are supposed to produce proteins almost certainly do not for various reasons, either because they do not exhibit any protein-coding features, or because the conservation of their reading frames does not support protein coding ability.” Valencia suggested that “1700 genes may have to be re-annotated … [and] we will have to re-do the calculations for all genomes, not only the human genome”.
One hypothesis derived from the study is that more than 90% of human genes produce proteins that originated in metazoans or multicellular organisms of the animal kingdom hundreds of millions of years ago; the figure is over 99% for those genes whose origin predates the emergence of primates 50 million years ago. The researchers stated that “the differences between humans and primates at the level of genes and proteins are very small”.
Team member David Juan added that “the number of new genes that separate humans from mice [those genes that have evolved since the split from primates] may even be fewer than 10”, contrasting with the more than 500 human genes with origins since primates that can be found in the current annotation. Juan theorised, “The physiological and developmental differences between primates are likely to be caused by gene regulation rather than by differences in the basic functions of the proteins in question.”
The research is part of GENCODE, a consortium that is integrated into the ENCODE Project and made up of research groups from around the world, whose task is to provide an annotation of all the gene-based elements in the human genome. Valencia claimed that once the team’s data is incorporated into the new annotations, “it will redefine the entire mapping of the human genome and how it is used in macro projects, such as those for cancer genome analysis”.