Consortium identifies overlooked protein-coding genes and maps 7,200 more
Updated · Chemical & Engineering News · May 6
In Nature, researchers report screening 7,264 candidate sequences and mining billions of mass spectra from more than 400 studies; the European Bioinformatics Institute has already added at least three microproteins.
Relaxing standard proteomics rules helped detect many tiny proteins missed before, while evolutionary analyses and CRISPR screens found strong evidence for some candidates but mixed support for others' functions.
Scientists now label sequences that are definitely translated but not yet fully validated as peptideins; preliminary analyses of proteins as small as 10 amino acids suggest the human proteome could approach 20,000 more entries.
The Dark Proteome Revealed: Comprehensive Identification and Functional Insights of 1,700 Peptideins
Overview
In 2026, the TransCODE Consortium revealed that many regions of the human genome once thought non-coding actually produce small proteins called peptideins. They confirmed 1,700 such proteins from non-canonical open reading frames and integrated this data into major databases, enabling global research access. Advances like ribosome profiling, mass spectrometry, and CRISPR screens were key to detecting and validating peptideins, including OLMALINC, which is essential for cancer cell survival. Many peptideins appear on immune cell surfaces, making them promising targets for cancer immunotherapy. Despite challenges in drug delivery and unknown functions for most peptideins, ongoing research and data sharing policies support their potential to transform disease understanding and treatment.