Wednesday, 8th December 2010
Here's a quick code snippet to generate a codon table in Python. The 'table' is actually a dictionary that takes a three-letter, lowercase codon as a key, and returns a single uppercase letter corresponding to the encoded amino acid (or '*' if it's a stop codon).
bases = ['t', 'c', 'a', 'g']
codons = [a+b+c for a in bases for b in bases for c in bases]
amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
codon_table = dict(zip(codons, amino_acids))
So if you type codon_table['atg'], you'll get 'M' for methionine. If you prefer to use 'u' rather than 't', simply change the base in the first line.
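As a quick sanity check, here is a minimal sketch that rebuilds the table and looks at a few entries. The names stop_codons and rna_table are my own additions for illustration, and rna_table shows one alternative way to get a 'u'-based table without editing the first line.

```python
bases = ['t', 'c', 'a', 'g']
codons = [a + b + c for a in bases for b in bases for c in bases]
amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
codon_table = dict(zip(codons, amino_acids))

# 4 bases in each of 3 positions give 64 codons, one per letter of amino_acids
print(len(codon_table))       # 64
print(codon_table['atg'])     # M

# Collect the codons that map to '*' (illustrative helper name)
stop_codons = sorted(c for c, aa in codon_table.items() if aa == '*')
print(stop_codons)            # ['taa', 'tag', 'tga']

# Derive an RNA table by swapping 't' for 'u' in every key (illustrative name)
rna_table = {codon.replace('t', 'u'): aa for codon, aa in codon_table.items()}
print(rna_table['aug'])       # M
```

The order of the letters in amino_acids only works because the codons are generated in t, c, a, g order, so reordering bases would scramble the table.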
It's now quite easy to make a function to translate a gene into an amino acid sequence.
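def translate(seq):
    # Normalise the input: lowercase, no line breaks or spaces
    seq = seq.lower().replace('\n', '').replace(' ', '')
    peptide = ''

    # Step through the sequence one codon (3 bases) at a time;
    # range works in both Python 2 and 3, unlike xrange
    for i in range(0, len(seq), 3):
        codon = seq[i: i+3]
        # Unknown or incomplete codons are treated like a stop codon
        amino_acid = codon_table.get(codon, '*')
        if amino_acid != '*':
            peptide += amino_acid
        else:
            break

    return peptide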
This function takes a DNA sequence, converts it to lowercase and removes any line breaks or spaces. Then it loops through it in chunks of 3, i.e. codons, translating them until it hits a stop codon or a codon not in the dictionary. It returns the amino acid sequence of the resulting peptide.
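To see the behaviour described above, here is a small self-contained sketch, assuming Python 3. The example sequences are made up for illustration and are not real genes.

```python
bases = ['t', 'c', 'a', 'g']
codons = [a + b + c for a in bases for b in bases for c in bases]
amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
codon_table = dict(zip(codons, amino_acids))


def translate(seq):
    """Translate a DNA sequence up to the first stop or unknown codon."""
    seq = seq.lower().replace('\n', '').replace(' ', '')
    peptide = ''
    for i in range(0, len(seq), 3):
        codon = seq[i: i + 3]
        amino_acid = codon_table.get(codon, '*')
        if amino_acid == '*':
            break
        peptide += amino_acid
    return peptide


# Mixed case, spaces and line breaks are cleaned up before translating;
# translation stops at the 'taa' stop codon, so 'ggg' is never read
print(translate('ATG GCC\nAAG TAA GGG'))   # MAK

# 'gcn' is not in the table, so translation stops after the first codon
print(translate('atggcnaaa'))              # M

# A trailing incomplete codon ('gc') is not in the table either
print(translate('atggc'))                  # M
```

Note that the function always starts at the first base, so it will not search for a start codon or try other reading frames.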