Big data annotation tool for RNA secondary structures


By Mansi Gandhi
Wednesday, 23 May, 2018

A big-data annotation tool, called bpRNA, makes it easier to understand links between disease and mutant RNA.

Developed by researchers at Oregon State University, the program is capable of parsing RNA structures, including complex pseudoknot-containing RNAs, “so you end up with an objective, precise, easily interpretable description of all loops, stems and pseudoknots”, said corresponding author and Assistant Professor David Hendrix.

“You also get the positions, sequence and flanking base pairs of each structural feature, which enables us to study RNA structure en masse at a large scale.”
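To give a feel for what such an annotation involves, here is a minimal Python sketch that reports the position, sequence and closing base pair of each hairpin loop. It assumes the structure is supplied in dot-bracket notation, a common text encoding in which matched parentheses mark paired bases and dots mark unpaired ones; the article does not state bpRNA's input format, and the function name `hairpin_loops` is hypothetical. It illustrates the idea only and is not bpRNA's own algorithm.

```python
def hairpin_loops(sequence, structure):
    """Annotate hairpin loops in a pseudoknot-free dot-bracket structure.

    Illustrative sketch only: assumes well-balanced "(" / ")" brackets and
    does not reproduce the bpRNA implementation.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    loops = []
    last_open = None  # index of the latest "(" followed only by dots so far
    for i, symbol in enumerate(structure):
        if symbol == "(":
            last_open = i
        elif symbol == ")":
            if last_open is not None:
                # "(" then only unpaired bases then ")" encloses a hairpin loop
                loops.append({
                    "start": last_open + 2,  # 1-based position of first loop base
                    "end": i,                # 1-based position of last loop base
                    "sequence": sequence[last_open + 1:i],
                    "closing_pair": sequence[last_open] + sequence[i],
                })
            last_open = None
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i + 1}")
    return loops


# A made-up 10-nucleotide stem-loop used purely as an example
for loop in hairpin_loops("GGGAAAUCCC", "(((....)))"):
    print(loop)
```

Run over a large collection of structures, records like these are the kind of raw material needed for the statistical analysis Hendrix describes.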

RNA works with DNA, the other nucleic acid, to produce the proteins needed throughout the body. (Nucleic acids are so named because they were first discovered in the cell nuclei of living things.) DNA contains a person’s hereditary information, and RNA delivers the information’s coded instructions to the protein-manufacturing sites within the cells. Many RNA molecules do not encode a protein, and these are known as noncoding RNAs.

“There are plenty of examples of disease-associated mutations in noncoding RNAs that probably affect their structure, and in order to statistically analyse why those mutations are linked to disease we have to automate the analysis of RNA structure,” said Hendrix.

“RNA is one of the fundamental, essential molecules for life, and we need to understand RNAs’ structure to understand how they function.”

Secondary structures are the base-pairing interactions within a single nucleic acid polymer or between two polymers. DNA mainly forms fully base-paired double helices, whereas RNA is single-stranded and can fold back on itself to form complicated interactions.
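As a rough illustration of how base pairing can be represented and how a pseudoknot shows up in the data, the Python sketch below reads a structure written in extended dot-bracket notation, where a second bracket type such as "[" and "]" marks pairs that cross the main ones. The notation, the function names `base_pairs` and `crossing_pairs`, and the example string are assumptions made for illustration; none of them comes from the article or from bpRNA itself.

```python
# Each closing bracket maps to its opening partner; extra bracket types
# are conventionally used for pseudoknotted pairs.
BRACKETS = {")": "(", "]": "[", "}": "{", ">": "<"}


def base_pairs(structure):
    """Return sorted 1-based (i, j) base pairs from a dot-bracket string."""
    stacks = {opener: [] for opener in BRACKETS.values()}
    pairs = []
    for i, symbol in enumerate(structure):
        if symbol in stacks:
            stacks[symbol].append(i)
        elif symbol in BRACKETS:
            stack = stacks[BRACKETS[symbol]]
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at position {i + 1}")
            pairs.append((stack.pop() + 1, i + 1))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i + 1}")
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {opener!r} at position {stack[-1] + 1}")
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return pairs of base pairs (i, j), (k, l) that cross: i < k < j < l."""
    ordered = sorted(pairs)
    crossings = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


nested = base_pairs("((..))")             # a simple stem, no crossings
knotted = base_pairs("((..[[..))..]]")    # made-up pseudoknot example
print(nested, crossing_pairs(nested))
print(knotted, crossing_pairs(knotted))
```

Nested pairs never cross, so a non-empty result from `crossing_pairs` is a simple signal that the structure contains a pseudoknot. The pairwise comparison is quadratic in the number of base pairs, which is acceptable for a sketch but would need a faster approach at the scale of 100,000 structures.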

Hendrix said bpRNA, presented in a paper in Nucleic Acids Research, features the largest and most detailed database to date of secondary RNA structures. “To be fair it’s a meta-database, but our special sauce is the tool to annotate everything,” said Hendrix, who is also an assistant professor in the OSU College of Engineering. “Before there was no way of saying where all the structural features were in an automated way. We provide a colour-coded map of where everything is. These annotations will enable us to identify statistical trends that may shed light on RNA structure formation and may open the door for machine learning algorithms to predict secondary RNA structure in ways that haven’t been possible.”

Researchers have successfully tested the tool on more than 100,000 structures, “many of which are very complex, with lots of complex pseudoknots”.

“Every day new RNAs are discovered and researchers are making huge progress in understanding their function,” Hendrix said. “We’re starting to appreciate that the genome is full of noncoding RNAs in addition to messenger RNAs, and they’re important biological molecules with big effects on human health and disease.”

Image caption: An example of the annotation provided by a new software tool for RNA secondary structure researchers (provided by David Hendrix, OSU College of Science).
