Searching for Coding Regions

Using Coding Preference Plots to find coding regions

After you determine the sequence of a piece of DNA, one of the first things you will want to know is whether it codes for a protein. If you are lucky, your sequence will contain at least one long open reading frame, that is, a reading frame of at least 50 to 100 codons that contains no stop codons. You can translate the open reading frame and search the NBRF Protein Identification Resource database or the GenBank nucleic acid database to see if there is a match with a known protein. If you find a match, you will have answered your question.
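The open reading frame definition above can be illustrated with a short Python sketch. This is not MacVector's implementation. The function name `find_open_reading_frames`, the `min_codons` parameter and the use of the standard stop codons (TAA, TAG, TGA) are assumptions made for illustration. It scans only the three forward frames; a fuller search would repeat the scan on the reverse complement.

```python
# Illustrative sketch only: names and defaults are assumptions, not MacVector's API.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def find_open_reading_frames(seq, min_codons=50):
    """Return (frame, start, end, n_codons) for each stop-free run of codons.

    `start` and `end` are 0-based positions in `seq`; `end` is exclusive
    and does not include the terminating stop codon.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons >= min_codons:
                    results.append((frame, run_start, pos, n_codons))
                run_start = pos + 3  # next run begins after the stop codon
            pos += 3
        # A run may extend to the end of the sequence without meeting a stop.
        n_codons = (pos - run_start) // 3
        if n_codons >= min_codons:
            results.append((frame, run_start, pos, n_codons))
    return results
```

Calling `find_open_reading_frames(sequence, min_codons=100)` applies the stricter end of the 50 to 100 codon range mentioned above. Note that this sketch follows the definition given here (no stop codons) and does not require a start codon.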

If there is no match, you need some other method of determining the biological significance of the open reading frame. It may code for a previously unsequenced protein, or it may have no biological significance whatsoever - after all, not all open reading frames are protein coding regions.

MacVector provides a range of analyses to help you make this decision. In addition to open reading frame analysis, the program provides various methods that use base or codon composition to help you determine if your open reading frame has the characteristics of a protein coding region. In conjunction with these methods, you can use nucleic acid subsequence analysis to look for motifs in your sequence, such as ribosome binding sites or intron-exon splice sites, that may help define the exact boundaries of coding regions.
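As a rough illustration of what a base composition measure can look like, the Python sketch below computes the G+C fraction at each of the three codon positions for a chosen reading frame. It is a simplified stand-in, not one of MacVector's coding preference methods. The function name `gc_by_codon_position` and its `frame` parameter are hypothetical.

```python
# Illustrative sketch only: a simple position-specific base composition measure.
def gc_by_codon_position(seq, frame=0):
    """Return the G+C fraction at codon positions 1, 2 and 3 in the given frame."""
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    fractions = []
    for offset in range(3):
        bases = seq[offset:usable:3]  # every base at this codon position
        if not bases:
            fractions.append(0.0)
            continue
        gc_count = sum(1 for base in bases if base in "GC")
        fractions.append(gc_count / len(bases))
    return tuple(fractions)
```

A marked difference between the three values in one frame, compared with the other frames, is the kind of signal such methods look for. How strong that signal is depends on the organism, so treat the numbers as a hint rather than proof of coding potential.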
