TY - JOUR
T1 - A general method applicable to the search for similarities in the amino acid sequence of two proteins
AU - Needleman, Saul B.
AU - Wunsch, Christian D.
N1 - Funding Information: This work was supported in part by grants to one of us (S.B.N.) from the U.S. Public Health Service (1 501 FR 05370 02) and from Merck Sharp & Dohme.
PY - 1970/3/28
Y1 - 1970/3/28
AB - A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed. From these findings it is possible to determine whether significant homology exists between the proteins. This information is used to trace their possible evolutionary development. The maximum match is a number dependent upon the similarity of the sequences. One of its definitions is the largest number of amino acids of one protein that can be matched with those of a second protein allowing for all possible interruptions in either of the sequences. While the interruptions give rise to a very large number of comparisons, the method efficiently excludes from consideration those comparisons that cannot contribute to the maximum match. Comparisons are made from the smallest unit of significance, a pair of amino acids, one from each protein. All possible pairs are represented by a two-dimensional array, and all possible comparisons are represented by pathways through the array. For this maximum match only certain of the possible pathways must be evaluated. A numerical value, one in this case, is assigned to every cell in the array representing like amino acids. The maximum match is the largest number that would result from summing the cell values of every pathway.
UR - http://www.scopus.com/inward/record.url?scp=0014757386&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0014757386&partnerID=8YFLogxK
U2 - 10.1016/0022-2836(70)90057-4
DO - 10.1016/0022-2836(70)90057-4
M3 - Article
C2 - 5420325
AN - SCOPUS:0014757386
VL - 48
SP - 443
EP - 453
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
SN - 0022-2836
IS - 3
ER -