Feature: Epigenetics key to human evolution

By Graeme O'Neill
Tuesday, 18 May, 2010

Humans possess the most complex Rnome – the RNA equivalent of a genome – of any animal, allowing us to at least lay technical claim to being the most highly evolved species on Earth. Prune down the enormous assemblage of non-coding RNAs (ncRNAs) that are thought to coordinate the activity of our 20,000-odd protein-coding genes, and you might end up with something like a nematode worm, one of the simplest multicellular animals.

“The basic toolkit for multicellular development, such as the Hox body-patterning genes and the Wnt cell-polarity genes, are all there in worms,” says Professor John Mattick, of the Institute for Molecular Bioscience at the University of Queensland, and a pioneer in the still-new frontier of Rnomics.

Mattick says the genetic programming of complex organisms has been largely misunderstood for the past 50 years because of the assumption that proteins transact most genetic information. He says that even after more than half a billion years of evolutionary divergence, most genes are still recognisably common to all animal species. All animals share a basic complement of about 20,000 protein-coding genes. In humans, protein-coding genes account for only 1.2 per cent of genomic DNA.

“It is now clear that the majority of the mammalian genome is transcribed into non-protein coding RNA, and that there are tens, if not hundreds of thousands, of long and short RNAs in mammals that show specific expression patterns and sub-cellular locations,” says Mattick. “Our studies indicate that these RNAs form a massive, hidden network of regulation that regulates epigenetic processes, and directs the precise patterns of gene expression during growth and development.”

Thus, the differences between species and individuals emerge from the relative complexity of their RNA-encoded regulatory architectures. Human tissues teem with non-protein-coding RNAs.

“It is now obvious that the differences between us and other animals are not just embedded in the combinatorics of a similar complement of transcription factors,” says Mattick. “They stem from a massive expansion in humans of the genome’s RNA regulatory architecture.”

Digging through the Rnome

The field grows apace: a trickle of research papers in the mid-1990s has become a flood, with around 270 papers on non-coding RNAs published in 2008. Mattick says his IMB team and others around the world are finding more and more tangible evidence for the functionality of tens of thousands of non-coding RNA molecules in the human genome.

He thinks their most important function is to regulate epigenetic processes such as DNA methylation and histone modifications, which are involved in gene silencing and activation at thousands of positions around the genome during differentiation and development.

“Another surprising thing to emerge is that some ncRNAs are involved in formation and function of subcellular organelles, like paraspeckles.” Discovered in 2002 by Dr Archa Fox of the WA Institute for Medical Research, paraspeckles are tiny (0.2-1μm) complexes of RNA and protein found in the interstices in chromatin in the nucleus. They appear when mammalian cells differentiate, and they vary in size, shape and number in different cell types. Recent research indicates they serve as a localised reserve of long non-coding RNAs within the nucleus. These non-coding RNAs can be released quickly to activate transcription of stress-response genes, allowing the cell to respond rapidly to stress signals.

Mattick says there is evidence that some of these retained RNAs are mature messenger RNAs that can be mobilised rapidly to resume protein synthesis, minimising the lag time involved in reactivating the originating gene.

Mattick and his colleagues, in collaboration with the Allen Institute for Brain Science in Seattle, examined hundreds of specific long non-coding RNAs in 2008, and showed that almost half are differentially expressed in different compartments of the brain, including the hippocampus, where they may be involved in memory formation.

“We have a program to do medium-throughput subcellular localisation of non-coding RNAs in cells to identify novel nuclear compartments. We may find some associated with known compartments, which would yield new insights into how they function,” he says.

Mattick’s team has discovered a class of small RNAs (tiRNAs) associated with transcription initiation sites, and says it is “surprising” that two or three new classes of microRNA-like small RNAs are being produced from small nucleolar RNAs (snoRNAs).

He and his IMB colleagues also recently showed that exons are preferentially positioned within nucleosomes, spherical bodies consisting of around 140 base pairs of DNA, wound around a core of eight histone proteins. Nucleosomes are also the basic structural units of chromatin.

“So the genome already ‘knows’ the protein-coding regions of genes from the positioning of the nucleosomes. This establishes a mechanistic link between the observed architecture of the genome, and the exons and splice sites within genes. The positioning of nucleosomes informs the cell about the epigenetic status not just of genes but of individual exons within them, which suggests that this information might inform the splicing patterns, consistent with the known link between transcription and splicing,” says Mattick.

“It’s a big finding, as it was previously thought that nucleosomes were randomly positioned around transcription initiation sites, and polyadenylation sites. The epigenetic structure of the cell does not just involve opening and closing chromatin to make it accessible to the transcription machinery. In principle, the nucleosomes mark DNA in a way that facilitates the alternative splicing involved in differential expression of the protein-coding exons.”

Mattick says this prediction has just recently been confirmed by Luco and his colleagues in Science.

Polyepigenetics

Mattick says as genome-wide association studies identify regions of the genome associated with complex diseases, it has become increasingly apparent that many such diseases are linked to variations in non-coding regions of the genome, many of which may express regulatory RNAs.

“About 95 per cent of these variants occur in non-coding sequences, far away from any protein-coding gene, so the problems clearly do not involve protein-coding errors like those that cause classical, inherited, monogenic disorders. It’s forcing the medical genetics community to focus on the fine detail of genetic variation,” he says.

“In the past, when researchers were dealing with high-penetrance, monogenic disorders like cystic fibrosis and thalassaemia, it wasn’t hard to find coding errors like point mutations, deletions and frameshift mutations, that cause catastrophic component damage. As soon as geneticists found a missense mutation in a subset of the general population, they had their smoking gun.

“But when you map genetically complex disorders to non-coding regions of the genome, it is difficult to identify which particular variant is responsible unless you prove the mechanism.

“The effect has been to push large groups of medical geneticists interested in the genetic and biochemical basis of complex disease, into an assault on the challenge of understanding the nature of the genome’s regulatory architecture. I confidently predict many such disorders and other complex aspects of human variation will be found to involve non-coding RNAs.”

So why haven’t non-coding RNAs been showing up in genetic screens? “The answer is that they have,” says Mattick, “but they’ve either been overlooked, or interpreted as affecting conventional regulatory regions such as promoters.

“I’ve been saying for years that regulatory variation rarely originates in the protein component set, which is slowly evolving. It more likely involves the variation in the rapidly evolving set of regulatory non-coding RNAs, as well as variation in the interactions of these RNAs and regulatory regions of DNA with generic and state-specific ‘regulatory proteins’.”

Paramutation

At the conclusion of a review article published in BioEssays, Mattick asks whether such examples of the regulatory roles of non-coding RNAs in the genome are merely the tip of an iceberg of a vast layer of RNA regulatory networks that are interpreted by different types of relatively generic proteins, or whether they merely augment main protein-based regulatory mechanisms during development.

Other major questions, he says, include how extensively RNA editing modifies regulatory networks, and thus, how plastic the epigenome may be, especially in the brain but also in other tissues, and the extent to which such plasticity is transmitted in the germline.

Mattick says it is exciting that RNA now appears to play a central role in brain development, learning and memory. Animals – particularly primates – have developed sophisticated RNA-editing systems to modify hard-wired genetic information in response to experience. That editing can in turn modulate epigenetic memory, which is, in some measure, heritable.

“Thus, RNA may represent the computational engine of the cell and the major substrate for gene-environment interactions,” he says. “So what was once dismissed as ‘junk’ because it was not understood may hold the key to understanding human evolution, development and cognition, as well as our idiosyncrasies and susceptibility to complex diseases.”

Mattick notes the growing interest in the emerging field of RNA editing as the molecular basis of epigenome-environment interactions that drive phenotypic variation. “If RNA is controlling epigenetic memory, and RNA can be modified by editing in response to external cues – that’s where the action is,” he says.

In his BioEssays review, Mattick notes that many studies have indicated that epigenetic memory is heritable in both plants and animals and that the process is RNA-directed. “Perhaps the most exciting aspect of this field, and one with the capacity to change our view of inheritance and evolution, is the recently described phenomenon of ‘paramutation’,” he wrote.

Paramutation refers to the allele-specific transfer of epigenetic information to cause the heritable silencing of one allele by another. In maize and mice, paramutation appears to involve RNA signalling. Non-coding RNAs also appear to regulate imprinting – the phenomenon of sex-specific expression of alleles, according to whether the gene is paternally or maternally inherited. Mattick says researchers have shown that RNA-interference-mediated silencing of particular genes is heritable over several generations in Caenorhabditis elegans.

“Intriguingly, evidence is increasing that RNA-coupled DNA repair can also occur in eukaryotes, suggesting that RNA can direct both epigenetic and genetic modifications, and that points to a much more dynamic interplay between genomes and the environment than was previously envisioned.”

Self-aware epigenome

In a recent article for EMBO Reports, for which he is a columnist, Mattick posed the question: “Has evolution learnt how to learn?” Where others, like Stephen Jay Gould, have argued that sentience is not an inevitable outcome of the evolutionary process, Mattick contends that natural selection has led inexorably to the development of complex organisms, including species capable of learning, cognition and, ultimately, self-reflection.

“Indeed, cognition must be an eventual outcome of the evolutionary process, albeit with a highly contingent history. While it is an advantage to evolve particular metabolic capacities or physical characteristics, it is also an advantage to be able to collect information about the environment and to act on this information to increase the odds of survival and reproduction, especially for those species capable of movement and dexterity.

“Many species, notably birds and mammals, but also invertebrates such as the octopus, have progressed to varying extents along the pathway to more advanced cognitive capabilities. Importantly, cognitive function and memory are connected to epigenetic processes which, although the mechanisms are uncertain, increasingly appear to be RNA directed.

“Moreover, the brain is also the main site for RNA editing – the post-transcriptional alteration of RNA sequences by adenosine and cytosine deamination, which is far more widespread than generally appreciated, and which has expanded enormously in humans.

“RNA might not only be the computational engine of the cell, but a major conduit for gene-environment interactions which might, in turn, feed back into epigenetic memory. If the ability to learn and adapt in real time comprises a selective advantage, would it not also be a selective advantage [for organisms] to use experience to alter the behaviour or physical capabilities of their progeny?”
