Predicting protein-protein interactions from protein domains using a set cover approach

Chengbang Huang, Faruck Morcos, Simon P. Kanaan, Stefan Wuchty, Danny Z. Chen, Jesús A. Izaguirre

Research output: Contribution to journal › Article › peer-review

51 Scopus citations


One goal of contemporary proteome research is the elucidation of cellular protein interactions. Based on currently available protein-protein interaction and domain data, we introduce a novel method, Maximum Specificity Set Cover (MSSC), for the prediction of protein-protein interactions. In our approach, we map the relationship between interactions of proteins and their corresponding domain architectures to a generalized weighted set cover problem. The application of a greedy algorithm provides sets of domain interactions that explain the presence of protein interactions with the highest degree of specificity. Utilizing domain and protein interaction data of S. cerevisiae, MSSC enables prediction of previously unknown protein interactions, links that are well supported by a high tendency of coexpression and functional homogeneity of the corresponding proteins. Focusing on concrete examples, we show that MSSC reliably predicts protein interactions in well-studied molecular systems, such as the 26S proteasome and RNA polymerase II of S. cerevisiae. We also show that the quality of the predictions is comparable to that of Maximum Likelihood Estimation, while MSSC is faster. This new algorithm and all data sets used are accessible through a Web portal at
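The abstract does not spell out the MSSC formulation, so the following is only a minimal sketch of the general idea it names: a textbook greedy heuristic for weighted set cover, applied to domain pairs. It rests on several assumptions not taken from the paper: each candidate domain pair "covers" every protein pair whose domain architectures contain it, its cost is the number of covered protein pairs that are not observed interactions (a rough stand-in for lost specificity), and self-interactions are ignored. The function names `candidate_covers`, `greedy_domain_cover`, and `predict_new`, and the toy data, are hypothetical.

```python
from itertools import combinations, product


def candidate_covers(domains):
    """Map each unordered domain pair to the protein pairs that contain it."""
    covers = {}
    for p, q in combinations(sorted(domains), 2):
        for a, b in product(domains[p], domains[q]):
            key = tuple(sorted((a, b)))
            covers.setdefault(key, set()).add((p, q))
    return covers


def greedy_domain_cover(domains, interactions):
    """Greedily pick domain pairs until every coverable interaction is explained.

    Assumed cost of a domain pair: the number of non-interacting protein
    pairs it would also imply. Each step takes the pair with the lowest
    cost per newly covered interaction (ties: more new coverage, then name).
    """
    observed = {tuple(sorted(pair)) for pair in interactions}
    covers = candidate_covers(domains)
    uncovered = set(observed)
    chosen = []
    while uncovered:
        best = None
        for key in sorted(covers):
            new = len(covers[key] & uncovered)
            if new == 0:
                continue
            false_pos = len(covers[key] - observed)
            score = (false_pos / new, -new)
            if best is None or score < best[0]:
                best = (score, key)
        if best is None:
            break  # the remaining interactions share no candidate domain pair
        chosen.append(best[1])
        uncovered -= covers[best[1]]
    return chosen


def predict_new(domains, interactions, chosen):
    """Protein pairs implied by the chosen domain pairs but not yet observed."""
    observed = {tuple(sorted(pair)) for pair in interactions}
    covers = candidate_covers(domains)
    predicted = set()
    for key in chosen:
        predicted |= covers[key]
    return predicted - observed


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"},
                   "P4": {"B"}, "P5": {"C"}}
    toy_interactions = [("P1", "P2"), ("P3", "P4")]
    picked = greedy_domain_cover(toy_domains, toy_interactions)
    print(picked)
    print(sorted(predict_new(toy_domains, toy_interactions, picked)))
```

In this toy case a single domain pair explains both observed interactions, and the other protein pairs carrying that domain pair become the predicted interactions. A real application would need the actual cost function and data described in the paper.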

Original language: English (US)
Pages (from-to): 78-87
Number of pages: 10
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue number: 1
State: Published - Jan 2007
Externally published: Yes


  • Bioinformatics (genome or protein) databases
  • Biology
  • Computations on discrete structures
  • Genetics
  • Graph algorithms

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

