Functional Annotation of the Mammalian Genome in the FANTOM projects

Michiel de Hoon, PhD
Team Leader, Laboratory for Applied Computational Genomics, RIKEN Center for Integrative Medical Sciences, Japan
Thursday, November 15, 2018 - 10:00am
PMH, 610 University Avenue, 6th Floor Auditorium, Rm 6-604
Special Seminar
Abstract: 
Mammals are complex multicellular organisms composed of hundreds of cell types that vary widely in shape, function, development, mutual interactions, and localization. This extraordinary variety in cellular behavior is achieved by using the same genomic information encoded in the DNA in different ways, in particular by expressing coding and non-coding transcripts in a cell-type-specific manner under the control of transcription factors and regulatory RNAs. FANTOM (Functional ANnoTation Of the Mammalian genome) is an international research consortium that aims to comprehensively identify mammalian transcripts and annotate their functions. In the fifth edition of FANTOM (FANTOM5) [1,2], we used single-molecule sequencing across a broad panel of primary cells, cell lines, and tissues to produce a comprehensive atlas of gene expression in mammalian cells by mapping transcription start sites at single-nucleotide resolution using CAGE (Cap Analysis Gene Expression). Using this atlas, we identified cell-type-specific promoter usage, key transcription factors, novel transcripts, and enhancer activity profiles as signatures of cell states [3]. In addition, we performed short RNA sequencing to profile microRNAs in human and mouse cells, and systematically identified the transcription start sites of primary microRNA transcripts in human and mouse [4]. We also used CAGE to identify with high confidence the 5' end, and therefore the promoter region, of 27,919 long non-coding RNA (lncRNA) genes in human [5]. An analysis of expression quantitative trait loci (eQTL)- and disease-associated single nucleotide polymorphisms (SNPs) overlapping lncRNA loci suggested biological significance of lncRNAs in regulation and disease [5]. While the number of lncRNAs encoded in mammalian genomes exceeds that of protein-coding genes, no functional annotation is currently available for the vast majority of lncRNAs.
In the sixth edition of FANTOM, we build on our unique expression atlas and collection of lncRNA annotations to create the first broad functional annotation and categorization of lncRNAs.
[1] Forrest AR, et al. Nature 507: 462-470 (2014).
[2] Arner E, et al. Science 347: 1010-1014 (2015).
[3] Andersson R, et al. Nature 507: 455-461 (2014).
[4] De Rie D, et al. Nature Biotechnology 35: 872-878 (2017).
[5] Hon CC, et al. Nature 543: 199-204 (2017).
Host: 
Dr. Michael Hoffman