Frontiers in Genomics - a meeting report

Published April 2009

Take a mouth swab, mail it to deCODEme in Reykjavik, buy a complete scan and pay $985. After three weeks, more than a million genetic variants have been scanned and your risk for 38 conditions, including cancer, heart attacks, brain disorders and more, has been calculated. Furthermore, your ancestry has been traced in your male and female lines as well as your geographic roots. All of it is available to you via deCODEme's online web portal. This was just one example of the many new developments within genomics and sequencing presented at the Danish Biochemical Society's 37th annual meeting entitled "Frontiers in Genomics", which took place at Gl. Avernæs Conference Centre on Southern Fyn October 29-31, 2008.

A bit of background

In the last decade, nucleotide sequencing has revolutionized our understanding of genomes across all domains of life. In particular, the sequencing of the genomes of humans and closely related species has fundamentally changed the way research can be conducted. Whole-genome sequencing continues at a high pace, and the data are providing us with an increasingly clear picture of the evolutionary history of life forms on our planet. For many years, whole-genome sequencing was such a monumental undertaking that the work was organized in gigantic sequencing centers hosting hundreds of machines in a factory-like setting.

This picture is rapidly changing. Due to technological advances, sequencing is being democratized, and many new applications of the technology are emerging. You can now buy a desktop sequencer which can sequence an entire bacterial genome in a day and there is a prize offered for the first team to sequence a human genome for less than $1000!

The meeting featured presentations by some of the world's most prominent researchers in the field of genomics and provided the participants with a unique chance to tap into this exciting development. It is clear from the meeting that we are on the verge of a wave of innovative new applications of sequencing technology in fields as diverse as metagenomics, genetic variation, cancer genomics, ancient genomes, epigenetics and transcriptomics. (See the entire program here)

Genome sequencing

The meeting was kicked off with an overview lecture by Professor Niels Tommerup from the Wilhelm Johannsen Centre for Functional Genome Research at the University of Copenhagen. The lecture reviewed our present understanding of the human genome. Multicellular organisms have roughly the same number of protein-coding genes, but the size of their genomes differs widely in the non-coding regions. The human genome consists of 30% exons + introns and 70% intergenic space, and the exons constitute less than 1.5%. The term "junk DNA" was introduced in 1972 by Susumu Ohno for the 95% of the DNA sequence for which no function had been identified, but as of 2008 the term is somewhat outdated. The facts that some of this alleged junk DNA is conserved over many millions of years of evolution, and that about 80% of the bases in the human genome are transcribed, imply that the majority of the genome has essential functions and that the notion of junk DNA has to be revised. Furthermore, Niels Tommerup reviewed the complexities of genetic diseases, which have not been revealed by classical linkage analysis. He discussed comorbidity, comparative genomics, plasticity, copy number variants, long-range position effects, transcription of 90% of the genome, and epigenetics.

In the next talk, the latest developments within sequencing technology were reviewed by Michael Egholm, Vice President of Research & Development at 454 Life Sciences (now part of Roche), manufacturer of the 454 desktop sequencing machine. The underlying technology is based on pyrosequencing. Other next-generation DNA sequencing platforms include the Illumina (Solexa) Genome Analyzer, based on four-color sequencing-by-synthesis technology, and the Applied Biosystems SOLiD system, based on sequential ligation with dye-labeled oligonucleotides.

Michael Egholm discussed the complete 454 sequencing of the genome of Jim Watson (the co-discoverer of DNA's structure and 1962 Nobel Laureate). This is the first genome to be sequenced for less than $1 million. It was completed in only two months, but it took a year to analyze all the data. "Nothing learned" was the comment by geneticist Maynard Olson. 454 is also involved in the "1000 Genomes Project" together with several other companies, as well as in several ancient DNA sequencing projects. Furthermore, amplicon "ultra deep" sequencing is a new field, largely enabled by 454 sequencing technology, in which PCR-amplified regions are sequenced so deeply that mutations can be detected at extremely low levels. In the latest versions of the 454 sequencer, the company has managed to increase the read length substantially, making it easier to computationally assemble the pieces afterwards.
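To see why depth matters for detecting rare mutations, here is a minimal sketch under simplifying assumptions: every read samples the amplicon independently, and sequencing errors are ignored. The function name, the 1% allele fraction and the threshold of five supporting reads are illustrative choices, not figures from the talk.

```python
from math import comb


def prob_at_least(k: int, depth: int, freq: float) -> float:
    """Probability that at least k of `depth` reads carry a variant
    present at allele fraction `freq` (simple binomial model)."""
    p_fewer = sum(
        comb(depth, i) * freq**i * (1 - freq) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


# Hypothetical scenario: a mutation present in 1% of the molecules,
# called only if at least 5 reads support it.
for depth in (30, 1000, 10000):
    print(depth, prob_at_least(5, depth, 0.01))
```

At ordinary depths such a variant is almost never seen often enough to be called, whereas at amplicon depths of many thousands of reads it is seen practically every time. Real variant callers additionally have to separate true mutations from sequencing errors, which this sketch leaves out.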

Human genetic variation

Following the consensus sequence of the human genome, many groups are now using the technology to understand the small genetic differences that make us all different. In this area, Hákon Guðbjartsson from deCODE Genetics in Iceland presented their search for disease genes based on genealogy, health records and gene sequencing of 300,000 inhabitants of Iceland. This includes microsatellite analysis, SNP genotyping, deletion detection, as well as long-range multi-locus haplotype phasing. Apart from identifying disease-related genes, the results are also expected to be applicable to personalized genomics.

Lastly, Guðbjartsson presented the deCODEme initiative, where the company offers genetic profiling of individuals based on a saliva sample. As mentioned, you take a swab of your mouth, send it to deCODEme and receive the results back via a web portal that shows your risk profile and your ancestry. They are even discussing a Facebook-like feature, where you can meet people with a similar genetic profile online!

Jun Wang of the Beijing Genomics Institute in China presented their work on sequencing the diploid genome of the first Asian individual. The genome was sequenced to 36-fold coverage using massively parallel sequencing technology. A high-quality consensus sequence covering 92% of the individual's genome was assembled and used to identify 3 million SNPs, to perform heterozygote phasing and haplotype prediction against HapMap haplotypes, to compare the sequence with the two available individual genomes (Jim Watson and Craig Venter), and to identify structural variation. The sequence data and analysis demonstrate the usefulness of next-generation sequencing technologies for personal genomics.

Metagenomics

Metagenomics - or environmental sequencing, as it is also called - is another novel application of sequencing. Here, the researchers sequence all DNA in environmental samples from soil, water, gut, etc. Metagenomics provides insight into the genetic repertoire of an entire ecosystem or environment, rather than its individual species. Since only about 1% of all bacteria have been cultivated in the lab, the technique provides a long-sought opportunity to tap into the remaining 99%. This issue of BioZoom vol. 12, no. 2, 2009 contains an excellent review on metagenomics by science journalist Lone Frank.

Our annual EMBO lecture was presented by Peer Bork - one of the world's leading experts within genomics and metagenomics. Bork is group leader and associated coordinator of the Structural and Computational Biology unit at the European Molecular Biology Laboratory (EMBL), and author of more than 300 research papers, including papers from many of the high-profile genome projects (human, rat, chicken, malaria parasite, beetle, etc.). In 2007, he was ranked as the 4th most cited scientist worldwide within the area of molecular biology and genetics.

In his keynote lecture he presented a breath-taking overview of his research in multiple fields of biology such as whole-genome analysis, metagenomics, evolution, post-translational modifications, cell cycle regulation, transcriptomics, drug discovery and systems biology. Bork also demonstrated a handful of free-to-use online tools and resources developed in his lab, including identification of protein domains (SMART), protein-protein functional association networks (STRING), chemical-protein interactions (STITCH), drug-target prediction via side-effect profiles, and a drug-target database (Matador).

The dedicated metagenomics session also included a talk by Professor Roger A. Garrett from the University of Copenhagen, who walked us through the third domain of life, Archaea. He introduced the biology of the extremophile Archaea and discussed the enigma of the evolution and classification of Archaea in relation to Bacteria and Eukarya.

The last speaker on metagenomics was Thomas Sicheritz-Pontén from the Technical University of Denmark, who presented two ongoing metagenomics projects: sequencing of DNA in polar sea samples collected during the 3rd Danish Galathea expedition near Greenland and Antarctica, and analysis of the microbiome of the human gut.

Ancient DNA

One of the most spectacular new applications of sequencing - which until recently was considered science fiction - is the analysis of ancient genetic material. The meeting featured keynote lectures by two of the leading experts in the field: Svante Pääbo, Director of Evolutionary Genetics at the Max Planck Institute in Leipzig, and Eske Willerslev from the University of Copenhagen. Svante Pääbo was the first to sequence DNA from Egyptian mummies and is now heading a project to sequence the full genome of our close relative, the Neanderthal. The work is among the world's most anticipated scientific projects, as it may provide clues to the genetic changes that shaped modern humans, in particular the differences related to the development of our brain.

In his lecture he described the methodological difficulties regarding degradation and contamination of DNA and the efforts to control them. Using 454 sequencing, the DNA from a 38,000-year-old Neanderthal bone has been sequenced (3.2 billion bases) to one-fold coverage. The conservation of the FOXP2 gene suggests that the Neanderthal, like modern humans and unlike chimpanzees, might have developed language. Transgenic mice which express human FOXP2 show one distinct phenotypic difference from normal mice: they vocalize in a way that is similar to humans!
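What "one-fold coverage" means can be illustrated with a small calculation. The sketch below assumes an idealized model in which reads land uniformly at random on the genome (the classical Poisson approximation); it ignores the degradation and contamination problems described in the lecture. The read length is a hypothetical value chosen only to make the arithmetic concrete.

```python
from math import exp


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read, assuming
    reads are placed uniformly at random (Poisson approximation)."""
    return 1.0 - exp(-coverage)


GENOME_SIZE = 3_200_000_000  # 3.2 billion bases, as quoted in the text
READ_LENGTH = 100            # hypothetical read length, for illustration only

n_reads = GENOME_SIZE // READ_LENGTH  # number of reads giving one-fold coverage
c = mean_coverage(n_reads, READ_LENGTH, GENOME_SIZE)
print(c, expected_fraction_covered(c))
```

Under this model, one-fold coverage leaves roughly a third of the bases unsequenced (the expected covered fraction is about 63%), which is one reason why higher coverage is needed before a consensus sequence can be called with confidence.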

Eske Willerslev's truly original work includes sequencing of DNA samples from Siberian permafrost and from the surface of Greenland buried deep under the ice. The samples cast light on the climate and ecosystems hundreds of thousands of years back. His recent projects have provided new insight into human evolution and migration by sequencing DNA from 14,000-year-old human feces found in a cave in Oregon, USA, and from a frozen hair sample of an Eskimo who lived in western Greenland roughly 4,000 years ago.

Cancer genomes

Phil Stephens of the Wellcome Trust Sanger Institute, UK, presented the Cancer Genome Project with a focus on sequencing the genomes of breast cancer tumors using the new technology of massively parallel sequencing. The results suggest that a limited number of mutations (10-30) are driving the cancer, whereas a large number of mutations (100-1000) are not causal and can therefore be regarded as "passenger" mutations. Nearly 380 structurally mutated human cancer genes have been identified to date. The majority of these mutations are driven by large-scale rearrangements in the genome generating fusion genes.

Tobias Sjöblom from Uppsala University in Sweden described their work on sequencing the genomes of tumors from breast, colorectal, pancreatic and brain cancers. Initially, the analysis indicated that a larger number of genes than hitherto thought are mutated during tumor evolution. However, they could not distinguish between driver and passenger mutations. The data were used to prioritize candidate genes and select the top 20 mutated genes. These included previously known genes as well as new genes. The individual tumors are heterogeneous with respect to mutated genes, but the gene-level complexity can be reduced to a limited number of molecular pathways.

Epigenetics

Kristian Helin, Director of BRIC at the University of Copenhagen, addressed the important field of epigenetics, i.e. inherited changes in gene expression caused by mechanisms other than changes in the DNA sequence. The underlying factors include DNA methylation and histone modifications like acetylation, methylation, sumoylation, and ubiquitination. Epigenetic mechanisms are important in gene inactivation, embryogenesis, development and disease. Kristian Helin described the role of two families of proteins in controlling cell proliferation, differentiation and senescence: the Polycomb group proteins and the EZH2 histone methyltransferases. Furthermore, he presented new results on an exciting group of proteins that catalyze the demethylation of methylated lysines. Members of the Jumonji demethylase family are overexpressed in human cancer and neurological disorders and may be candidate drug targets.

Ronni Nielsen from the University of Southern Denmark talked about epigenetics in the regulation of transcription. He focused on the nuclear peroxisome proliferator-activated receptors (PPARs), which bind to DNA as heterodimers with members of the retinoid X receptor (RXR) family. PPARγ is an important regulator of adipocyte differentiation and function. Ronni Nielsen used a novel experimental approach where chromatin immunoprecipitation (ChIP) of transcription factors cross-linked to their native chromosomal sites of action is combined with deep sequencing of the DNA binding sites. With ChIP sequencing he generated a genome-wide map of PPARγ-RXR binding to chromatin and investigated the activation of associated target genes through RNA polymerase II binding during differentiation of murine 3T3-L1 preadipocytes. The data showed that PPARγ-RXR binding is specifically associated with induced genes involved in diverse processes including lipid and glucose metabolism. Ronni Nielsen's work is described further in an article in this issue of BioZoom vol. 12, no. 2, 2009.
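The basic idea behind turning ChIP sequencing reads into a binding map can be sketched in a few lines: count how many reads start in each fixed-size window along a chromosome and keep the windows with unusually many reads. This is a toy illustration, not the analysis used in the work presented; the window size, the threshold and the read positions are hypothetical, and real peak callers additionally model the background using a control sample.

```python
from collections import Counter


def enriched_windows(read_starts, window=200, min_reads=5):
    """Return (start, end, n_reads) for each fixed-size window that
    contains at least `min_reads` read start positions."""
    counts = Counter(pos // window for pos in read_starts)
    return sorted(
        (w * window, (w + 1) * window, n)
        for w, n in counts.items()
        if n >= min_reads
    )


# Hypothetical read start positions on one chromosome
reads = [105, 120, 130, 150, 160, 199, 950, 2010]
print(enriched_windows(reads))  # [(0, 200, 6)]
```

Only the window from position 0 to 200 collects enough reads to be reported as a candidate binding site; the isolated reads at 950 and 2010 are treated as background.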

Transcriptomics

The topic of the last session of the meeting was transcriptional regulation. Jürg Bähler of the Wellcome Trust Sanger Institute described their exciting project to sequence the entire transcriptome of a eukaryotic organism (fission yeast) under various environmental conditions. The work reveals that most of the genome is in fact transcribed and illustrates that sequencing may in the future make technologies such as expression microarrays obsolete! High-throughput sequencing proved to be a powerful and quantitative method to deeply sample transcriptomes at maximal resolution.

Finally, Albin Sandelin from BRIC at the University of Copenhagen talked about his computational studies of transcriptomes. Massive sequencing of the 5' ends of mRNAs has given new insights into how promoters work and how to analyze gene regulation computationally. Using the cap analysis of gene expression (CAGE) technology and 454 FLX sequencing, multiple transcription start sites have been identified in the exonic, intronic and intergenic regions of the genome. It is now clear that most genes have multiple alternative promoters with different regulatory inputs, and individual promoters typically have many transcription start sites. You can read more about Albin Sandelin's work in this issue of BioZoom vol. 12, no. 2, 2009.
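The observation that one promoter typically contains many transcription start sites suggests a simple way of grouping sequenced 5' ends: merge start positions that lie close together into one cluster and treat each cluster as a candidate promoter. The following is a minimal sketch of that idea, assuming all tags come from the same strand of one chromosome; the 20-base gap and the tag positions are hypothetical, and this is not necessarily the clustering procedure used in the work presented.

```python
def cluster_tss(positions, max_gap=20):
    """Group 5'-end positions into clusters: a position joins the current
    cluster if it lies within `max_gap` bases of the previous position.
    Returns (first_position, last_position, n_tags) per cluster."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return [(c[0], c[-1], len(c)) for c in clusters]


# Hypothetical 5'-end positions of sequenced tags on one strand
tags = [1000, 1003, 1003, 1010, 1500, 1512, 4000]
print(cluster_tss(tags))  # [(1000, 1010, 4), (1500, 1512, 2), (4000, 4000, 1)]
```

In this toy example, the first two clusters could correspond to alternative promoters of the same gene, each with several start sites, while the tag count per cluster gives a rough measure of promoter usage.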

Conclusion

After two and a half days with outstanding lectures, intense discussions, good food, social networking and walks in the scenic park at Gl. Avernæs, the 100 participants were full of inspiration upon returning home to continue their own research. The 37th Annual Meeting of the Danish Biochemical Society represented a milestone with its enlightened demarcation of the frontiers in genomics and look into the exciting future of genome research in the 21st century.