Abstract 


We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
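The shuffling idea behind the significance test can be illustrated with a minimal sketch. It is not the RDF2 implementation: the window length of 10, the `toy_score` function (best count of identical residues on any ungapped diagonal), and all function names are assumptions chosen for illustration. The sketch shuffles residues only within fixed-length windows, so each window keeps its composition, and then compares the observed score with the distribution of scores against shuffled sequences.

```python
import random
from statistics import mean, pstdev


def window_shuffle(seq, window=10, rng=None):
    """Shuffle residues within consecutive windows (window size is an assumption).

    Each window keeps exactly the same residues, so local composition
    is preserved while residue order inside the window is randomized.
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def toy_score(a, b):
    """Stand-in similarity score (hypothetical, not the FASTA score):
    the largest number of identical residues on any ungapped diagonal."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i, ch in enumerate(b)
            if 0 <= i + offset < len(a) and a[i + offset] == ch
        )
        best = max(best, matches)
    return best


def shuffle_z_score(query, target, score=toy_score, trials=100, window=10, seed=0):
    """Compare the observed score with scores against window-shuffled targets.

    Returns (observed, mean, standard deviation, z); z is None when the
    shuffled scores show no variation.
    """
    rng = random.Random(seed)
    observed = score(query, target)
    shuffled = [
        score(query, window_shuffle(target, window, rng)) for _ in range(trials)
    ]
    mu = mean(shuffled)
    sigma = pstdev(shuffled)
    z = (observed - mu) / sigma if sigma > 0 else None
    return observed, mu, sigma, z
```

A call such as `shuffle_z_score(query, target)` returns the observed score together with the mean and standard deviation of the shuffled scores. An observed score many standard deviations above the shuffled mean suggests that the similarity is not explained by local composition alone.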

Proc Natl Acad Sci U S A. 1988 Apr; 85(8): 2444–2448.
PMCID: PMC280013
PMID: 3162770

Improved tools for biological sequence comparison.




