FASTA format

In bioinformatics, FASTA format is a file format used to exchange information between genetic sequence databases. A file looks like this:

>SEQUENCE_1
;comment line 1 (optional)
MTEITAAMVKELRESTGAGMMDCKNALSETNGDFDKAVQLLREKGLGKAAKKADRLAAEG
LVSVKVSDDFTIAAMRPSYLSYEDLDMTFVENEYKALVAELEKENEERRRLKDPNKPEHK
IPQFASRKQLSDAILKEAEEKIKEELKAQGKPEKIWDNIIPGKMNSFIADNSQLDSKLTL
MGQFYVMDDKKTVEQVIAEKEKEFGGKIKIVEFICFEVGEGLEKKTEDFAAEVAAQL
>SEQUENCE_2
;comment line 1 (optional)
;comment line 2 (optional)
SATVSEINSETDFVAKNDQFIALTKDTTAHIQSNSLQSVEELHSSTINGVKFEEYLKSQI
ATIGENLVVRRFATLKAGANGVVNGYIHTNGRVGVVIAAACDSAEVASKSRDLLRQICMH

Each record consists of a header line (beginning with '>') that gives a name and/or a unique identifier for the sequence, and often a good deal of other information as well. Many sequence databases use standardized headers, which helps when extracting information from the header automatically.

After the header line, one or more comment lines, each beginning with a semicolon, may occur. Most databases and bioinformatics applications do not recognize these comments, so their use is discouraged, but they are part of the official format.

After the header line and any comments, one or more sequence lines may follow. Sequences may be protein sequences or DNA sequences; they can be of any length and can contain gaps or alignment characters (see sequence alignment).

FASTA files often have extensions such as .fa, .mpfa or .fsa, among others. Because the format is so simple, the files are easy to manipulate with text-processing tools and scripting languages such as Perl.

The NCBI has gone so far as to define a standard for its FASTA headers, although in practice it is somewhat messy:

Database                        Header syntax
GenBank                         gi|gi-number|gb|accession|locus
EMBL Data Library               gi|gi-number|emb|accession|locus
DDBJ, DNA Database of Japan     gi|gi-number|dbj|accession|locus
NBRF PIR                        pir||entry
Protein Research Foundation     prf||name
SWISS-PROT                      sp|accession|entry name
Brookhaven Protein Data Bank    pdb|entry|chain
Patents                         pat|country|number
GenInfo Backbone Id             bbs|number
General database identifier     gnl|database|identifier
NCBI Reference Sequence         ref|accession|locus
Local Sequence identifier       lcl|identifier
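The record layout described above (a '>' header line, optional ';' comment lines, then one or more sequence lines) can be read with a few lines of code. The following is a minimal sketch in Python rather than a reference implementation. The names FastaRecord, parse_fasta and split_header are hypothetical and chosen only for this illustration. The sketch assumes that blank lines can be ignored, that any text before the first header can be skipped, and that the '|'-delimited identifier is the first whitespace-separated token of the header.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastaRecord(NamedTuple):
    """One FASTA record (hypothetical container for this sketch)."""
    header: str          # text after '>' on the header line
    comments: List[str]  # text after ';' on each comment line
    sequence: str        # sequence lines joined, whitespace removed


def parse_fasta(lines: Iterable[str]) -> Iterator[FastaRecord]:
    """Yield one FastaRecord for each '>' header line found in lines."""
    header = None
    comments: List[str] = []
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield FastaRecord(header, comments, "".join(chunks))
            header, comments, chunks = line[1:].strip(), [], []
        elif header is None:
            continue  # assumption: skip anything before the first header
        elif line.startswith(";"):
            comments.append(line[1:].strip())
        else:
            # Drop internal whitespace so wrapped lines join cleanly.
            chunks.append("".join(line.split()))
    if header is not None:
        yield FastaRecord(header, comments, "".join(chunks))


def split_header(header: str) -> List[str]:
    """Split a '|'-delimited identifier such as 'pdb|entry|chain' into fields.

    Only the first whitespace-separated token is split; empty fields
    (as in 'pir||entry') are kept as empty strings.
    """
    if not header.strip():
        return []
    identifier = header.split(None, 1)[0]
    return identifier.split("|")
```

Because parse_fasta accepts any iterable of lines, it can be given an open file object directly, for example `for record in parse_fasta(open(path)):` where `path` names a FASTA file. Gap and alignment characters are left untouched in the returned sequence, since the format allows them.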