TY - JOUR
T1 - RNA sequencing of transcriptomes in human brain regions
T2 - Protein-coding and non-coding RNAs, isoforms and alleles
AU - Webb, Amy
AU - Papp, Audrey C.
AU - Curtis, Amanda
AU - Newman, Leslie C.
AU - Pietrzak, Maciej
AU - Seweryn, Michal
AU - Handelman, Samuel K.
AU - Rempala, Grzegorz A.
AU - Wang, Daqing
AU - Graziosa, Erica
AU - Tyndale, Rachel F.
AU - Lerman, Caryn
AU - Kelsoe, John R.
AU - Mash, Deborah C.
AU - Sadee, Wolfgang
N1 - Funding Information:
This study is supported in part by the NIH National Institute of General Medical Sciences, Pharmacogenomics Research Network (PGRN) grant U01 GM092655 (WS), the RNA Sequencing Project (GM61390), and grant #DA06227 (DCM).
Publisher Copyright:
© 2015 Webb et al.
PY - 2015/11/23
Y1 - 2015/11/23
N2 - Background: We used RNA sequencing to analyze transcript profiles of ten autopsy brain regions from ten subjects. RNA sequencing techniques were designed to detect both coding and non-coding RNA, splice isoform composition, and allelic expression. Brain regions were selected from five subjects with a documented history of smoking and five non-smokers. Paired-end RNA sequencing was performed on SOLiD instruments to a depth of >40 million reads, using linearly amplified, ribosomally depleted RNA. Sequencing libraries were prepared with both poly-dT and random hexamer primers to detect all RNA classes, including long non-coding (lncRNA), intronic and intergenic transcripts, and transcripts lacking poly-A tails, providing additional data not previously available. The study was designed to generate a database of the complete transcriptomes in brain regions for gene network analyses and discovery of regulatory variants. Results: Of 20,318 protein-coding and 18,080 lncRNA genes annotated from GENCODE and lncipedia, 12 thousand protein-coding and 2 thousand lncRNA transcripts were detectable at a conservative threshold. Of the aligned reads, 52 % were exonic, 34 % intronic and 14 % intergenic. A majority of protein-coding genes (65 %) were expressed in all regions, whereas ncRNAs displayed a more restricted distribution. Profiles of RNA isoforms varied across brain regions and subjects at multiple gene loci, with neurexin 3 (NRXN3) a prominent example. Allelic RNA ratios deviating from unity were identified in > 400 genes, detectable in both protein-coding and non-coding genes, indicating the presence of cis-acting regulatory variants. Mathematical modeling was used to identify RNAs stably expressed in all brain regions (serving as potential markers for normalizing expression levels), linked to basic cellular functions.
An initial differential expression analysis between smokers and non-smokers implicated a number of genes, several previously associated with nicotine exposure. Conclusions: RNA sequencing identifies distinct and consistent differences in gene expression between brain regions, with non-coding RNA displaying greater diversity between brain regions than mRNAs. Numerous RNAs exhibit robust allele-selective expression, providing a means for discovery of cis-acting regulatory factors with potential clinical relevance.
AB - Background: We used RNA sequencing to analyze transcript profiles of ten autopsy brain regions from ten subjects. RNA sequencing techniques were designed to detect both coding and non-coding RNA, splice isoform composition, and allelic expression. Brain regions were selected from five subjects with a documented history of smoking and five non-smokers. Paired-end RNA sequencing was performed on SOLiD instruments to a depth of >40 million reads, using linearly amplified, ribosomally depleted RNA. Sequencing libraries were prepared with both poly-dT and random hexamer primers to detect all RNA classes, including long non-coding (lncRNA), intronic and intergenic transcripts, and transcripts lacking poly-A tails, providing additional data not previously available. The study was designed to generate a database of the complete transcriptomes in brain regions for gene network analyses and discovery of regulatory variants. Results: Of 20,318 protein-coding and 18,080 lncRNA genes annotated from GENCODE and lncipedia, 12 thousand protein-coding and 2 thousand lncRNA transcripts were detectable at a conservative threshold. Of the aligned reads, 52 % were exonic, 34 % intronic and 14 % intergenic. A majority of protein-coding genes (65 %) were expressed in all regions, whereas ncRNAs displayed a more restricted distribution. Profiles of RNA isoforms varied across brain regions and subjects at multiple gene loci, with neurexin 3 (NRXN3) a prominent example. Allelic RNA ratios deviating from unity were identified in > 400 genes, detectable in both protein-coding and non-coding genes, indicating the presence of cis-acting regulatory variants. Mathematical modeling was used to identify RNAs stably expressed in all brain regions (serving as potential markers for normalizing expression levels), linked to basic cellular functions.
An initial differential expression analysis between smokers and non-smokers implicated a number of genes, several previously associated with nicotine exposure. Conclusions: RNA sequencing identifies distinct and consistent differences in gene expression between brain regions, with non-coding RNA displaying greater diversity between brain regions than mRNAs. Numerous RNAs exhibit robust allele-selective expression, providing a means for discovery of cis-acting regulatory factors with potential clinical relevance.
KW - Allelic expression imbalance
KW - Brain regions
KW - Differential expression
KW - Isoform fraction
KW - Non-coding RNA
KW - RNA sequencing
UR - http://www.scopus.com/inward/record.url?scp=84947735453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947735453&partnerID=8YFLogxK
U2 - 10.1186/s12864-015-2207-8
DO - 10.1186/s12864-015-2207-8
M3 - Article
C2 - 26597164
AN - SCOPUS:84947735453
VL - 16
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
IS - 1
M1 - 990
ER -