Understanding RNA splicing with interpretable models

Chris Burge, PhD
Department of Biology, MIT
Friday, January 26, 2024 - 11:00am
Ramsay Wright Building, Room 432
Invited Speaker Seminar
Abstract: 
Pre-mRNA splicing is a fundamental step in gene expression, conserved across eukaryotes, in which the spliceosome recognizes motifs at the 3’ and 5’ splice sites (SS), excises introns and ligates exons. SS recognition and pairing are often influenced by splicing regulatory factors (SRFs) that bind to splicing regulatory elements (SREs). The position-specific activity of SRFs is commonly described by “RNA maps” derived from crosslinking and knockdown/RNA-seq data. I will describe an alternative “splicing activity map” and the KATMAP algorithm used to derive these maps from in vitro binding data (from RBNS or RNAcompete) and knockdown/RNA-seq alone, and applications including prediction of direct and indirect splicing targets. I will also describe a fully interpretable model of pre-mRNA splicing, SMsplice, that combines new models of core SS motifs, SREs, and exonic and intronic length preferences. We learned models that predict SS locations with 83-86% accuracy in fish, insects and plants, and about 70% in mammals. Learned SRE motifs include both known SRF binding motifs and novel motifs, and both classes are supported by genetic analyses. Comparisons across species highlight similarities among non-mammals, a greater reliance on SREs in mammalian splicing, and increased reliance on intronic SREs in plant splicing.
Host: 
CSB Trainees
Dept of Cell and Systems Biology