NotI clones in the analysis of the human genome

Eugene R. Zabarovsky, Rinat Gizatullin, Raf M. Podowski, Veronika V. Zabarovska, Li Xie, Olga V. Muravenko, Sergei Kozyrev, Lev Petrenko, Natalia Skobeleva, Jingfeng Li, Alexei Protopopov, Vladimir Kashuba, Ingemar Ernberg, Gösta Winberg, Claes R. Wahlestedt

Research output: Contribution to journal › Article

22 Citations (Scopus)

Abstract

NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50,000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3%) showed similarity to known proteins and 712 (21.8%) displayed an identity of > 75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. < 75% identical. The work also showed tight, specific association of NotI sites with the first exon and suggested that the so-called 3' ESTs can actually be generated from 5'-ends of genes that contain NotI sites in their first exon.
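
The counts quoted in the abstract can be cross-checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and it assumes that every percentage is taken relative to the 3265 unique sequences, which is what the reported figures imply.

```python
# Counts quoted in the abstract; the variable names are illustrative only.
total_generated = 3437      # sequences flanking the recognition sites
unique = 3265               # unique sequences analysed
new_sequences = 1868        # not present in EMBL/EST (similarity <= 90%)
protein_similar = 795       # new, with similarity to known proteins
nucleotide_similar = 712    # new, > 75% nucleotide identity to EMBL/EST
completely_new = 361        # new, < 75% identical


def pct(count, base=unique):
    """Percentage of `base`, rounded to one decimal place."""
    return round(100 * count / base, 1)


# Assumption: the three subgroups partition the new sequences.
subgroup_sum = protein_similar + nucleotide_similar + completely_new
# Sequences generated but not counted among the unique ones.
non_unique = total_generated - unique

print(subgroup_sum == new_sequences)
print(pct(new_sequences), pct(protein_similar),
      pct(nucleotide_similar), pct(completely_new))
print(non_unique)
```

Under that assumption the three subgroups add up to the 1868 new sequences, and each rounded percentage matches the value given in the abstract.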

Original language: English
Pages (from-to): 1635-1639
Number of pages: 5
Journal: Nucleic Acids Research
Volume: 28
Issue number: 7
State: Published - Apr 1 2000
Externally published: Yes

Fingerprint

Human Genome
Expressed Sequence Tags
Clone Cells
Libraries
Exons
Databases
Protein Databases
CpG Islands
Genes
Proteins
Nucleotides
Genome
Research

ASJC Scopus subject areas

  • Genetics

Cite this

Zabarovsky, E. R., Gizatullin, R., Podowski, R. M., Zabarovska, V. V., Xie, L., Muravenko, O. V., ... Wahlestedt, C. R. (2000). NotI clones in the analysis of the human genome. Nucleic Acids Research, 28(7), 1635-1639.

NotI clones in the analysis of the human genome. / Zabarovsky, Eugene R.; Gizatullin, Rinat; Podowski, Raf M.; Zabarovska, Veronika V.; Xie, Li; Muravenko, Olga V.; Kozyrev, Sergei; Petrenko, Lev; Skobeleva, Natalia; Li, Jingfeng; Protopopov, Alexei; Kashuba, Vladimir; Ernberg, Ingemar; Winberg, Gösta; Wahlestedt, Claes R.

In: Nucleic Acids Research, Vol. 28, No. 7, 01.04.2000, p. 1635-1639.

Zabarovsky, ER, Gizatullin, R, Podowski, RM, Zabarovska, VV, Xie, L, Muravenko, OV, Kozyrev, S, Petrenko, L, Skobeleva, N, Li, J, Protopopov, A, Kashuba, V, Ernberg, I, Winberg, G & Wahlestedt, CR 2000, 'NotI clones in the analysis of the human genome', Nucleic Acids Research, vol. 28, no. 7, pp. 1635-1639.
Zabarovsky ER, Gizatullin R, Podowski RM, Zabarovska VV, Xie L, Muravenko OV et al. NotI clones in the analysis of the human genome. Nucleic Acids Research. 2000 Apr 1;28(7):1635-1639.
Zabarovsky, Eugene R. ; Gizatullin, Rinat ; Podowski, Raf M. ; Zabarovska, Veronika V. ; Xie, Li ; Muravenko, Olga V. ; Kozyrev, Sergei ; Petrenko, Lev ; Skobeleva, Natalia ; Li, Jingfeng ; Protopopov, Alexei ; Kashuba, Vladimir ; Ernberg, Ingemar ; Winberg, Gösta ; Wahlestedt, Claes R. / NotI clones in the analysis of the human genome. In: Nucleic Acids Research. 2000 ; Vol. 28, No. 7. pp. 1635-1639.
@article{e62e3bb378dd44dfa43800771b61ed0b,
title = "{NotI} clones in the analysis of the human genome",
abstract = "NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50,000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51{\%} of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2{\%}) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90{\%}). Among these new sequences, 795 (24.3{\%}) showed similarity to known proteins and 712 (21.8{\%}) displayed an identity of > 75{\%} at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1{\%}) sequences were completely new, i.e. < 75{\%} identical. The work also showed tight, specific association of NotI sites with the first exon and suggested that the so-called 3' ESTs can actually be generated from 5'-ends of genes that contain NotI sites in their first exon.",
author = "Zabarovsky, {Eugene R.} and Rinat Gizatullin and Podowski, {Raf M.} and Zabarovska, {Veronika V.} and Li Xie and Muravenko, {Olga V.} and Sergei Kozyrev and Lev Petrenko and Natalia Skobeleva and Jingfeng Li and Alexei Protopopov and Vladimir Kashuba and Ingemar Ernberg and G{\"o}sta Winberg and Wahlestedt, {Claes R}",
year = "2000",
month = "4",
day = "1",
language = "English",
volume = "28",
pages = "1635--1639",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "7",

}

TY - JOUR

T1 - NotI clones in the analysis of the human genome

AU - Zabarovsky, Eugene R.

AU - Gizatullin, Rinat

AU - Podowski, Raf M.

AU - Zabarovska, Veronika V.

AU - Xie, Li

AU - Muravenko, Olga V.

AU - Kozyrev, Sergei

AU - Petrenko, Lev

AU - Skobeleva, Natalia

AU - Li, Jingfeng

AU - Protopopov, Alexei

AU - Kashuba, Vladimir

AU - Ernberg, Ingemar

AU - Winberg, Gösta

AU - Wahlestedt, Claes R

PY - 2000/4/1

Y1 - 2000/4/1

AB - NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50,000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3%) showed similarity to known proteins and 712 (21.8%) displayed an identity of > 75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. < 75% identical. The work also showed tight, specific association of NotI sites with the first exon and suggested that the so-called 3' ESTs can actually be generated from 5'-ends of genes that contain NotI sites in their first exon.

UR - http://www.scopus.com/inward/record.url?scp=0034164069&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034164069&partnerID=8YFLogxK

M3 - Article

C2 - 10710430

AN - SCOPUS:0034164069

VL - 28

SP - 1635

EP - 1639

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 7

ER -