MacVector Assembler

MacVector Assembler is an add-on module for MacVector that lets you assemble sequences using the phred/phrap/cross_match algorithms from the University of Washington. It integrates tightly with MacVector so that the two appear as a single application. This means that once you have created assembled contigs, you can immediately analyze the sequences using any of MacVector's DNA analysis or manipulation functions.

Creating a Sequencing Project

The first step in any assembly is to create a project and import the sequence and trace files that are to be assembled. MacVector Assembler lets you directly import files in the following formats:

Chromatogram Files
• ABI (from 373, 377, 3700 and 3730 models)
• SCF version 2
• SCF version 3
• ALF

Sequence Files
• MacVector
• FastA

Sequence files in other formats should first be converted into a compatible format using MacVector. The imported sequences are maintained in a project window, where you can perform tasks such as sorting and renaming.
||
![]() |
||
A populated sequence assembly project window.

Base Calling Using phred

phred is an algorithm that takes the chromatogram information from an automated sequencing run and re-evaluates the peaks to produce a "base call" that is usually significantly more accurate than the original call. In addition to recalculating the residues, phred assigns a quality score to each residue. This is a logarithmic value from 0 to 99: a score of 10 indicates a 1 in 10 chance that the call is in error, a score of 20 indicates a 1 in 100 chance, a score of 30 indicates a 1 in 1,000 chance, and so on.

MacVector Assembler takes advantage of multi-CPU machines (such as the Intel Core Duo or multiple-processor PowerPC machines) by splitting the phred jobs between the processors, so that you see a speed-up directly proportional to the number of processors. In addition, the phred executable is a Universal Binary, so it runs at full native speed on Intel or PowerPC processors.

Once you have base called the imported sequences, you can view the phred base calls and the quality values by double-clicking on one of the sequences.
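The relationship between a quality score and its error probability described above can be sketched in a few lines of Python. This is only an illustration of the logarithmic scale, not code taken from phred or MacVector; the function names are hypothetical.

```python
import math


def phred_to_error_probability(quality: int) -> float:
    """Estimated probability that a base call with this quality score is wrong."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability: float) -> int:
    """Quality score (rounded to the nearest integer) for an error probability."""
    return round(-10 * math.log10(probability))


# Print the error probability for the scores mentioned in the text.
for quality in (10, 20, 30):
    print(quality, phred_to_error_probability(quality))
```

Each step of 10 in the score lowers the estimated error rate by a factor of ten, which is why a small difference in score can represent a large difference in reliability.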
||
![]() |
||
A trace editor window after base calling by phred, showing quality values and the original base call. |
||
Trimming Vector Sequences Using cross_match

Many raw sequences from automated sequencing machines contain vector sequences at the beginning and/or end. MacVector Assembler lets you mask these out using the cross_match algorithm. To use it, you just need to supply the sequences of the vector(s) you used for the cloning. There is no need to indicate the cloning site you used, as cross_match can identify the exact position where the vector sequences terminate.

Like phred, the cross_match executable is a Universal Binary, and MacVector Assembler splits up jobs between multiple CPUs if they are available. After processing by cross_match, you can view the masked vector sequences in the trace editor window, where they appear in grayed-out italics.
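To make the idea of masking concrete, here is a minimal Python sketch that locates masked regions in a read and extracts the remaining insert. It assumes that masked bases are represented by the character X, as in the screened output that cross_match writes; how MacVector Assembler stores the mask internally is not stated here, and the function names and the example read are hypothetical.

```python
import re


def masked_spans(read: str, mask_char: str = "X") -> list[tuple[int, int]]:
    """Return half-open (start, end) spans of consecutive masked characters."""
    pattern = re.escape(mask_char) + "+"
    return [match.span() for match in re.finditer(pattern, read)]


def longest_unmasked(read: str, mask_char: str = "X") -> str:
    """Return the longest run of unmasked bases, taken here as the insert."""
    runs = [run for run in read.split(mask_char) if run]
    return max(runs, key=len, default="")


# Hypothetical read with vector masked at both ends.
read = "XXXXXXACGTTGCAAGGCTXXXX"
print(masked_spans(read))
print(longest_unmasked(read))
```

Masking rather than deleting the vector keeps the base positions aligned with the chromatogram, so the trace can still be displayed in full.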
||
![]() |
||
A trace editor window after vector sequences have been masked, indicated by gray italic text. |
||
Assembling Sequences Using phrap

MacVector Assembler assembles sequences using the phrap algorithm. phrap does not require the sequences to have been base called by phred, or to have had any vector sequences masked. However, using phred and cross_match will improve the accuracy of phrap assemblies. phrap assembles sequences into contigs and creates a consensus sequence with its own set of quality values, based on the quality and strandedness of the overlapping sequences.

Contigs can be viewed and edited in a contig editor that shows the aligned sequences, with the chromatograms in a lower pane. Clicking on a residue in the consensus sequence repositions the chromatograms so that they are all aligned to that base. This makes it easy to compare the traces at a single position and resolve ambiguities in the consensus sequence.
||
![]() |
||
Clicking on a consensus residue in the contig editor aligns all of the chromatograms to that position. |
||
![]() |
||
You can select individual sequences in the graphical overview of a contig.
||
Editing and Analysis

You can edit the sequences in a contig and the consensus will be updated automatically. MacVector follows the phred/phrap quality value rule in which edited residues are given a quality value of 99 to indicate that they were assigned by a user. These are shown in blue in the quality display. You can view a variety of statistics on the composition of the contig in the annotations window.
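The quality-99 convention for edited residues can be illustrated with a short Python sketch. The `Read` class and its methods are hypothetical and do not reflect MacVector's internal data model; only the rule that a user edit sets the quality to 99 comes from the text above.

```python
from dataclasses import dataclass

EDITED_QUALITY = 99  # value reserved for residues assigned by a user


@dataclass
class Read:
    bases: list[str]
    qualities: list[int]

    def edit_base(self, index: int, new_base: str) -> None:
        """Replace a base and flag it as user-assigned via its quality value."""
        self.bases[index] = new_base
        self.qualities[index] = EDITED_QUALITY

    def edited_positions(self) -> list[int]:
        """Positions whose quality value marks them as user-assigned."""
        return [i for i, q in enumerate(self.qualities) if q == EDITED_QUALITY]


# Hypothetical read: correct the low-quality call at position 1.
read = Read(bases=list("ACGT"), qualities=[30, 12, 45, 50])
read.edit_base(1, "G")
print("".join(read.bases), read.qualities, read.edited_positions())
```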
||
![]() |
||
The annotations window contains a summary of the contig composition statistics.
||
You can invoke any MacVector DNA analysis algorithm from the contig editor window, including online NCBI BLAST searches. Any gaps in the consensus are removed before the analysis is performed. This lets you scan the contig for restriction enzyme sites, then edit the consensus and rescan, without having to export the consensus sequence or switch to a different module.

The consensus sequence, without gaps, can be saved to disk as a single sequence at any time. It can be saved in any format supported by MacVector and retains a list of the individual reads used to generate the consensus.
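The reason gaps are removed before analysis can be shown with a small Python sketch: a recognition site that spans a gap column is only found once the gap is stripped. The gap characters (`*` and `-`), the function names and the example consensus are assumptions made for illustration, and the search covers the forward strand only, which is sufficient for a palindromic site such as EcoRI (GAATTC).

```python
def ungap(consensus: str, gap_chars: str = "*-") -> str:
    """Remove gap characters from a consensus sequence."""
    return "".join(base for base in consensus if base not in gap_chars)


def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of a site."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also reported.
        start = sequence.find(site, start + 1)
    return positions


# Hypothetical gapped consensus: the first EcoRI site is split by a gap.
consensus = "ACGGA*ATTCTTGAATTC-GG"
print(find_sites(consensus, "GAATTC"))         # scan with gaps left in
print(find_sites(ungap(consensus), "GAATTC"))  # scan after removing gaps
```

Scanning the gapped string misses the site interrupted by `*`, while scanning the ungapped string finds both.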
||
![]() |
||
You can directly scan contigs for restriction enzyme sequences without needing to export the consensus sequence. |
||