TY - JOUR
T1 - NotI clones in the analysis of the human genome
AU - Zabarovsky, Eugene R.
AU - Gizatullin, Rinat
AU - Podowski, Raf M.
AU - Zabarovska, Veronika V.
AU - Xie, Li
AU - Muravenko, Olga V.
AU - Kozyrev, Sergei
AU - Petrenko, Lev
AU - Skobeleva, Natalia
AU - Li, Jingfeng
AU - Protopopov, Alexei
AU - Kashuba, Vladimir
AU - Ernberg, Ingemar
AU - Winberg, Gösta
AU - Wahlestedt, Claes
PY - 2000/4/1
Y1 - 2000/4/1
N2 - NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50,000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3%) showed similarity to known proteins and 712 (21.8%) displayed an identity of > 75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. < 75% identical. The work also showed tight, specific association of NotI sites with the first exon and suggests that the so-called 3' ESTs can actually be generated from 5'-ends of genes that contain NotI sites in their first exon.
AB - NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50,000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3%) showed similarity to known proteins and 712 (21.8%) displayed an identity of > 75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. < 75% identical. The work also showed tight, specific association of NotI sites with the first exon and suggests that the so-called 3' ESTs can actually be generated from 5'-ends of genes that contain NotI sites in their first exon.
UR - http://www.scopus.com/inward/record.url?scp=0034164069&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034164069&partnerID=8YFLogxK
M3 - Article
C2 - 10710430
AN - SCOPUS:0034164069
VL - 28
SP - 1635
EP - 1639
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - 7
ER -