How mor Works - Flow Chart Guide

This guide is divided into sections corresponding to the major modules of mor.

1. Acquisition

Summary: mor queries GenBank for any sequences that have been added since the last run and downloads them.

   1. Called by the main mor script.
   2. Calls tryQueryGB(), which connects to GenBank and attempts the download if there are any new sequences to fetch.
   3. Saves the downloaded data to a FASTA-format file for the next module.

2. Screening

Summary: Runs tests on the downloaded sequences to decide whether each one is fit for inclusion or should be rejected.

   1. Called by the main mor script.
   2. Receives the FASTA-format file that the previous module downloaded from GenBank.
   3. Retrieves all currently accepted accession numbers (accnos) from the database and uses stripAlreadyAligned() to remove from the downloaded file any sequence whose accno is already present. Without this step, a sequence that happens to be downloaded again by mistake would break ClustalW. Writes the resulting "safe" FASTA data to a new file.
   4. The new sequences from the safe FASTA file are sent to sortSequences(), where they are tested with HMMER to make sure that each sequence is a homobasidiomycete, and then to sequenceSimilarity, where a Bio::Perl routine is used to make sure that the sequence isn't too similar to one already in the database. Sequences are also checked with multistagePercentage(), which makes sure that the sequence doesn't have too high a proportion of unknown base pair indicators.
   5. Any rejected sequence is sent through reasonForRejection() to determine which error code to display.
   6. Writes a new data file containing just the sequence data for the next module.

3. Alignment

Summary: Uses ClustalW to align the sequences.

   1. Called by the main mor script.
   2. Sends the sequence data output by the previous module to ClustalW. Depending on how the system is configured, this is either a complete alignment (all sequences aligned from scratch) or a profile alignment (new sequences aligned against the previously aligned data).
   3. Outputs a NEXUS-format file for the next module.

4. Analysis

Summary: Uses PAUP to generate phylogenetic trees based on the aligned data.

   1. Called by the main mor script.
   2. Outputs a consensus tree file for later use by CladeSys.
   3. Outputs the backbone constraint tree from the database to a file for PAUP to use.
   4. Generates a neighbor-joining tree and a parsimony tree using PAUP.
   5. In each stage, PAUP outputs a log file, which is parsed by htmlFormatter(). htmlFormatter() outputs the final HTML files.

5. Archiver

Summary: Backs up the ALN, DND, NXS, and other such files generated during a mor run.

   1. Called by the main mor script.
   2. Creates a .zip archive file containing some of the files, e.g., those generated by ClustalW.
   3. archiveForWeb() calls archiveFileToPath() for each file to be archived, which copies the file to the appropriate directory; this includes the archive file and the HTML files generated by PAUP.
   4. The HTML files generated by PAUP are copied over the corresponding files in the web directory, so that the links on the website point to the new tree files.
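
The de-duplication and unknown-base checks described under Screening can be illustrated with a short sketch. This is not mor's own code: mor is described above as using Bio::Perl routines, while the sketch below uses plain Python for illustration. All function names (parse_fasta, accession_of, strip_already_aligned, unknown_fraction, screen), the 5% default threshold, and the assumption that the accession number is the first token of the FASTA header are hypothetical. The single-threshold check is only a stand-in for multistagePercentage(), whose actual stages are not documented here.

```python
from typing import List, Set, Tuple

Record = Tuple[str, str]  # (FASTA header without ">", sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def accession_of(header: str) -> str:
    """Assumption: the accession is the first header token, minus any version suffix."""
    tokens = header.split()
    return tokens[0].split(".")[0] if tokens else ""


def strip_already_aligned(records: List[Record], accepted: Set[str]) -> List[Record]:
    """Drop records whose accession is already in the accepted set."""
    return [(h, s) for h, s in records if accession_of(h) not in accepted]


def unknown_fraction(seq: str) -> float:
    """Fraction of characters that are not an unambiguous base (A, C, G, T, U)."""
    if not seq:
        return 1.0  # treat an empty sequence as entirely unknown
    known = set("ACGTU")
    unknown = sum(1 for base in seq.upper() if base not in known)
    return unknown / len(seq)


def screen(records: List[Record], accepted: Set[str], max_unknown: float = 0.05):
    """Return (kept records, rejected (accession, reason) pairs)."""
    kept: List[Record] = []
    rejected: List[Tuple[str, str]] = []
    for header, seq in strip_already_aligned(records, accepted):
        if unknown_fraction(seq) > max_unknown:
            rejected.append((accession_of(header), "too many unknown bases"))
        else:
            kept.append((header, seq))
    return kept, rejected
```

In this sketch, a caller would load the accepted accnos from the database into a set, pass the parsed download through screen(), and write only the kept records to the "safe" FASTA file. The HMMER and similarity tests are deliberately left out.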
mor is being built by Hibbett D, Nilsson RH, Shonfeld M, Snyder M, Costanzo J, Fonseca M, Twomey R, Gaytan B, Stein P, Burke JP, Heider T and Ha C.