NotI flanking sequences

A tool for gene discovery and verification of the human genome

Alexey S. Kutsenko, Rinat Z. Gizatullin, Ali N. Al-Amin, Fuli Wang, Sergei M. Kvasha, Raf M. Podowski, Yuri G. Matushkin, Anita Gyanchandani, Olga V. Muravenko, Viktor G. Levitsky, Nikolay A. Kolchanov, Alexei I. Protopopov, Vladimir I. Kashuba, Lev L. Kisselev, Wyeth Wasserman, Claes R Wahlestedt, Eugene R. Zabarovsky

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and that unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
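The quantities the abstract refers to (NotI sites, CG content, CpG dinucleotide frequency) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say which CpG island tools were evaluated, and the function names here are hypothetical. The sketch assumes only the NotI recognition sequence GCGGCCGC and the conventional observed/expected CpG ratio, for which island finders commonly use thresholds such as GC content above 50% and a ratio above 0.6 over at least 200 bp.

```python
import re

# NotI recognition sequence (an 8-bp palindrome containing two CpG dinucleotides).
NOTI_SITE = "GCGGCCGC"


def find_noti_sites(seq):
    """Return 0-based start positions of all NotI sites, overlapping ones included."""
    # A lookahead match consumes no characters, so overlapping sites are reported.
    return [m.start() for m in re.finditer(f"(?={NOTI_SITE})", seq.upper())]


def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def flank_stats(seq, site_pos, width):
    """GC fraction and CpG obs/exp for a window of `width` bp on each side of a site.

    `width` is an illustrative parameter, not a value taken from the paper.
    """
    start = max(0, site_pos - width)
    end = site_pos + len(NOTI_SITE) + width
    window = seq[start:end]
    return gc_fraction(window), cpg_obs_exp(window)


if __name__ == "__main__":
    demo = "ATATATGCGGCCGCATATAT"  # toy sequence, not real genomic data
    for pos in find_noti_sites(demo):
        print(pos, flank_stats(demo, pos, width=6))
```

A window can have a low overall GC fraction yet a high observed/expected CpG ratio, which is the kind of region a threshold on GC content alone would miss.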

Original language: English
Pages (from-to): 3163-3170
Number of pages: 8
Journal: Nucleic Acids Research
Volume: 30
Issue number: 14
State: Published - Jul 15 2002
Externally published: Yes

Fingerprint

Genetic Association Studies
Human Genome
CpG Islands
Databases
Chromosomes, Human, Pair 22
Chromosomes, Human, Pair 21
Nucleosomes
Firearms
Genetic Promoter Regions
Organism Cloning
Clone Cells
DNA
Proteins

ASJC Scopus subject areas

  • Genetics

Cite this

Kutsenko, A. S., Gizatullin, R. Z., Al-Amin, A. N., Wang, F., Kvasha, S. M., Podowski, R. M., ... Zabarovsky, E. R. (2002). NotI flanking sequences: A tool for gene discovery and verification of the human genome. Nucleic Acids Research, 30(14), 3163-3170.

@article{d91d418151aa4bb2b90ce4a43e5267fb,
title = "NotI flanking sequences: A tool for gene discovery and verification of the human genome",
abstract = "A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40{\%} of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7{\%} of the NotI flanking sequences, Celera's database contained matches to 57.2{\%} of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2{\%} of the NotI flanking sequences (identity ≥90{\%} over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and that unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.",
author = "Kutsenko, {Alexey S.} and Gizatullin, {Rinat Z.} and Al-Amin, {Ali N.} and Fuli Wang and Kvasha, {Sergei M.} and Podowski, {Raf M.} and Matushkin, {Yuri G.} and Anita Gyanchandani and Muravenko, {Olga V.} and Levitsky, {Viktor G.} and Kolchanov, {Nikolay A.} and Protopopov, {Alexei I.} and Kashuba, {Vladimir I.} and Kisselev, {Lev L.} and Wyeth Wasserman and Wahlestedt, {Claes R} and Zabarovsky, {Eugene R.}",
year = "2002",
month = "7",
day = "15",
language = "English",
volume = "30",
pages = "3163--3170",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "14",

}

TY - JOUR

T1 - NotI flanking sequences

T2 - A tool for gene discovery and verification of the human genome

AU - Kutsenko, Alexey S.

AU - Gizatullin, Rinat Z.

AU - Al-Amin, Ali N.

AU - Wang, Fuli

AU - Kvasha, Sergei M.

AU - Podowski, Raf M.

AU - Matushkin, Yuri G.

AU - Gyanchandani, Anita

AU - Muravenko, Olga V.

AU - Levitsky, Viktor G.

AU - Kolchanov, Nikolay A.

AU - Protopopov, Alexei I.

AU - Kashuba, Vladimir I.

AU - Kisselev, Lev L.

AU - Wasserman, Wyeth

AU - Wahlestedt, Claes R

AU - Zabarovsky, Eugene R.

PY - 2002/7/15

Y1 - 2002/7/15

AB - A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and that unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.

UR - http://www.scopus.com/inward/record.url?scp=18444387393&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18444387393&partnerID=8YFLogxK

M3 - Article

VL - 30

SP - 3163

EP - 3170

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 14

ER -