Short interfering RNAs (siRNAs) are used in functional genomics studies to knock down a single gene in a reversible manner. The results of siRNA experiments are highly dependent on the choice of siRNA sequence. To evaluate siRNA design rules, we collected a database of 398 siRNAs of known efficacy from 92 genes. We used this database to evaluate rules previously proposed from smaller datasets, and to find a new set of rules that are optimal for the entire database. We also trained a regression tree with full cross-validation. It was, however, difficult to obtain the same precision as methods previously tested on small datasets from one or two genes. We show that those methods overfit, as they work poorly on independent validation datasets from multiple genes. Our new design rules can predict siRNAs with efficacy ≥50% in 91% of cases, and with efficacy ≥90% in 52% of cases, which is more than a twofold improvement over random selection. Software for designing siRNAs is available online via a web server at http://sisearch.cgb.ki.se/ or as a standalone version for high-throughput applications.
|Original language|English (US)|
|Number of pages|11|
|Journal|Biochemical and Biophysical Research Communications|
|State|Published - Jun 18 2004|
ASJC Scopus subject areas
- Molecular Biology
- Cell Biology