Improved tools for biological sequence comparison.
Abstract
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
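The shuffling approach described for RDF2 can be illustrated with a minimal Python sketch. It is not the RDF2 implementation: the window size, the toy similarity score (longest run of identical residues), and every function name below are assumptions chosen for illustration. The idea shown is to shuffle one sequence within short consecutive windows, so that each window keeps its residue composition, and then to compare the observed score against the distribution of scores obtained from the shuffled sequences.

```python
import random
import statistics


def local_shuffle(seq, window=10, rng=None):
    """Shuffle residues inside consecutive windows of `window` residues.

    Each window keeps exactly the residues it started with, so local
    composition is preserved. (Hypothetical helper, not RDF2 code.)
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def identity_run_score(a, b):
    """Toy score: length of the longest run of identical residues shared
    by `a` and `b`. A stand-in for a real similarity score."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def shuffled_z_score(query, target, n_shuffles=100, window=10, seed=0):
    """Compare the observed score with scores against locally shuffled
    versions of `target`; return (observed, mean, sd, z)."""
    rng = random.Random(seed)
    observed = identity_run_score(query, target)
    scores = [
        identity_run_score(query, local_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    # z is undefined when every shuffle gives the same score; report 0.0 then.
    z = (observed - mean) / sd if sd > 0 else 0.0
    return observed, mean, sd, z
```

A call such as `shuffled_z_score(seq_a, seq_b, window=10)` returns the observed score together with the mean and standard deviation of the shuffled scores. A large z value suggests that the similarity is not explained by local composition alone, at least under this toy score.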
Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
Full text links
Read article at publisher's site: https://doi.org/10.1073/pnas.85.8.2444
Read article for free, from open access legal sources, via Unpaywall: https://europepmc.org/articles/pmc280013?pdf=render