Big genome


By Graeme O'Neill
Wednesday, 17 July, 2013


Debate continues about the claim that proteins interact with 80% of the human genome and whether this much regulation is required for a functional human being.

After labouring nine years to catalogue every functional element in the human genome, the ENCODE (ENCyclopaedia Of DNA Elements) research consortium announced in 2012 that it had linked more than 80% of the human genome sequence to a specific biological function.

The 440-odd researchers involved in the ENCODE project say they have mapped more than a million regulatory regions where proteins interact with DNA. That raises the question: why are so many functional elements required to regulate the expression and activity of so few genes?

Function or fallacy?

Some critics greeted the ENCODE consortium’s estimate with scepticism - four papers criticising the ENCODE consortium now exist in the scientific literature, not to mention blog postings.

Molecular evolutionist Dan Graur and several colleagues from the University of Houston and Johns Hopkins University went further, veering into open derision. Their acerbic critique in Genome Biology and Evolution threw an uppercut in its title: ‘On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE’.

Graur et al wrote that the ENCODE estimate flies in the face of current estimates that under 10% of the genome is evolutionarily conserved through purifying selection.

“Thus, according to the ENCODE Consortium, a biological function can be maintained indefinitely without selection, which implies that at least 80 – 10 = 70 per cent of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these ‘functional’ regions, or because no mutation in these regions can ever be deleterious.”

Graur and his colleagues maintain the rage for another 35 pages, before concluding, “… the ENCODE Consortium has, so far, failed to provide a compelling reason to abandon the prevailing understanding among evolutionary biologists according to which most of the human genome is devoid of function. The ENCODE results were predicted by one of its lead authors to necessitate the rewriting of textbooks (Pennisi 2012). We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.”

Modest versus majority function

So, does a surprisingly small complement (~22,000) of protein-coding genes need such a huge constellation of DNA regulatory regions and RNA-encoded DNA elements for the task of assembling and operating a human being? Or is the functional genome a masterpiece whose operational complexity is conjured from a compact kernel of stringently conserved functional elements?

ALS invited two leading molecular geneticists to argue the cases for a modestly functional versus a mostly functional genome. Professor Merlin Crossley, Dean of the Faculty of Science at the University of New South Wales, has specialised in the role of DNA-binding proteins - gene-transcription factors - in development and disease. He believes less than 20% of the genome is highly conserved and functional.

Professor John Mattick, Director of the Garvan Institute of Medical Research in Sydney, believes most of the genome is functional. Mattick has been influential in the developing field of ‘RNA-omics’, which emerged from the discovery of introns in 1977.

Mattick and others have shown that the ‘junk DNA’ of introns and vast tracts of intergenic DNA harbours a host of small, functional RNA elements that effectively form a complex, multilayered operating system for the genome.

Jumping genes have no function

Crossley says around two thirds of the human genome is now understood to consist of dead retroviruses, in the form of transposable DNA elements - jumping genes.

“I think transposable elements are essentially parasitic,” he said. “If two thirds of the genome is transposable elements, that leaves only a third that could have function. The estimates we are seeing is that only 10% is conserved.”

Crossley says transposable elements will occasionally suffer replication errors and may evolve functions that will be co-opted by the host, but the vast majority won’t have any function.

Alu elements are the most abundant transposable elements in primate genomes. Exclusive to primates, their numbers have expanded over the past 40-50 million years, peaking at 1.2 million in modern humans. Alu elements flank a significant number of duplicated segments in the human genome, suggesting they sometimes ‘capture’ and copy intervening DNA sequences as they replicate (see end of article).

“Gene duplication is the raw material for evolutionary change, so the fact that so large a part of our DNA consists of repeated elements like Alu transposons has certainly influenced human evolution - it can affect biological function, but it doesn’t have intrinsic biological function itself.

“Some of the duplicated genes end up as pseudogenes. A few may evolve separate roles, but most will remain pseudogenes with no function.”

Gene spacing

Crossley says there are many different ways that genetic accidents - point mutations, insertions, deletions and frame-shift mutations - can knock out a functioning gene, resulting in dominant negative changes in gene function. But these changes cannot be regarded as functional; only a very small number will result in new positive or dominant negative functions.

While Crossley finds these genetic accidents fascinating, and they exhibit regulatory effects, he says their influence is incidental and they are not so common as to constitute a significant fraction of the genome.

But he accepts that in some cases, the amount of spacer DNA in the genome can be critically important - changing the spacing between a remote enhancer and its target gene would probably affect the gene’s function.

Does that mean the intervening DNA is functional? “I don’t know - if the gene has come to depend on it, it might,” he said.

“But it would surprise me, when you look at related species with very different amounts of DNA, such as lungfish and puffer fish, or two Drosophila species. The fact that the genome sizes can be different, when the genes and the organism itself are similar, suggests the spacing between genes is not having a major effect on gene regulation.

“Think of a stack of boxes - the one at the top prevents dust accumulating on the boxes beneath and the one at the bottom keeps the stack off the wet floor, but these are not ‘functions’.”

Crossley believes the amount of conserved, functional DNA in the human genome is more than 10% but no higher than 20% - the rest is stitched into the stuff that matters and has been provided with a ticket to ride through time, courtesy of the phenomenon that accounts for linkage disequilibrium.

Permissive territory

John Mattick said he was surprised by the “almost vituperative” tone of the Graur commentary. He and Garvan Institute bioinformatician Marcel Dinger are seeking to publish a response in Genome Biology and Evolution.

He said the Graur paper’s focus on strong conservation ignored evidence for the presence of large amounts of less stringently conserved DNA in the human genome.

He focuses on “permissive territory” in the genome, where protein-coding DNA specifies active sites within the mature, folded protein. “The structure-function relation within these active sites is biochemically mediated - for example, a particular motif might form an oxygen-binding site.

“Such active sites are analogue devices, and the amino acid sequences within them may be quite loose. As long as the structure of the binding site is vaguely polar, it will bind oxygen.

“Within the active sites of a protein, the sequence can vary enormously across species, or even between individuals of the same species. Lack of conservation does not mean lack of function. A functional nucleotide sequence is not like a telephone number, where every digit has to be dialled in its correct order to make the connection.”

Mattick finds linguistic analogies useful in explaining the concept of relative conservation of genome sequences.

He says cognates - similar words in modern languages of common ancestry - have undergone phonetic shifts over time but retain their original function. Cognates like the German “Bruder” and Latin “frater” (brother), and the Latin “pater” and Sanskrit “pitar” (father) reflect common descent from Proto-Indo-European, the 6000-year-old mother tongue of the first farmers of Turkey’s Trans-Caucasus region.

According to Mattick, the relative conservation of genomic DNA is evident in the ability of different peptides and proteins to bind the same receptor. The receptor is strongly conserved, maintaining the reciprocity of its core for multiple ligands, but the complementary amino acid sequences of its ligands can vary considerably without ill effect.

Mattick says the Graur critique fails because it focuses on strict conservation and, as a result, makes circular assumptions and dubious comparisons.

He believes Graur has made the mistake of arguing that only 10% of the genome is functional because the rest does not exhibit strong sequence conservation, when transcription across most of the genome more reliably indicates genetic function.

Keeping an open mind

Mattick has spent much of his career identifying regulatory RNA sequences in the genome and says the sheer volume of regulatory RNA molecules challenges the traditional, protein-centric view of genetic programming.

He also questions the assumption that transposable elements and the ancient, highly repeated DNA sequences shared by all mammals are neutrally evolving and thus non-functional. “Evidence is growing that this assumption is incorrect,” Mattick said.

The conclusion that ancient repeats - many of them created by transposon activity - have no function ignores evidence that some protein-coding DNA sequences, and some micro-RNA sequences, have structure-function constraints.

Such sequences may exhibit unrecognised patterns of mutation, different to the familiar classes of mutation that occur in cis-regulatory sequences and other classes of trans-acting regulatory RNAs encoded in the genome.

Mattick says the major finding of the ENCODE project is that the majority of the mammalian genome is transcribed in highly cell-specific patterns. Such patterns indicate that the transcribed elements play a dynamic role in embryonal development, differentiation and disease - “Even supposed ‘gene deserts’ yield transcripts that are expressed in specific cells,” he said.

Two key functions inferred for these transcripts are guiding chromatin-modifying complexes to their sites of action and supervising the precise sequence of epigenetic changes required for normal embryonal development.

Mattick says it is unsurprising that science’s understanding of the processes underlying the molecular evolution of life is incomplete, and he predicts that developments in massively parallel computing will bring to light new and surprising mechanisms. Until then, he suggests that everyone should keep an open mind on the extent of functionality in the human genome.

******************************************************

Accidental evolution

An example of transposons playing an accidental, yet possibly pivotal, role in human evolution came to light in 2012 when Professor Evan Eichler’s team at the University of Washington, Seattle, announced in Cell that it had discovered an Alu-mediated triplication of a gene called SRGAP2 that happened some 3.4 million years ago.

SRGAP2 molecules pair up to form a duplex signalling protein that, in chimpanzees and other great apes, terminates the elongation of neural precursor cells and thread-like processes called filopodia that differentiate into dendrites.

In humans, one of the three SRGAP2 proteins is truncated. If the truncated molecule is present in a duplex molecule, differentiation is delayed, and the neural precursors continue to elongate without differentiating. The result is a population of neurons in the embryonic brain able to make elaborate, long-distance interconnections between the brain’s functional modules.

Eichler’s team believes the error may have laid the foundation for a phenomenally rapid increase in cognitive power in the hominin brain as its volume increased fourfold over the ensuing 2.5 million years. Although Alu elements caused the duplication, they were not functionally involved in the resulting gain-of-function mutation.

******************************************************

Image credit ©iStockphoto.com/gvinpin
