Genetic code
The genetic code describes the relationship between DNA sequences and the protein sequences encoded by them. A DNA sequence is transcribed into an mRNA sequence, which is then translated into a protein sequence by assembling an amino acid chain, where each amino acid corresponds to a triplet of nucleotides – a so-called codon.
Since proteins are translated from RNA templates, the codons are typically specified with the RNA nucleotides A, C, G, and U. However, many resources also accept or provide codons with the DNA nucleotides A, C, G, and T.
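The relation between the two notations can be illustrated with a minimal Python sketch. It assumes the DNA sequence is given as the coding strand, so that switching alphabets only means replacing T with U; the function names `dna_to_rna` and `split_codons` are hypothetical and chosen for illustration only.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a coding-strand DNA sequence in the RNA alphabet (T -> U)."""
    return dna.upper().replace("T", "U")


def split_codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


rna = dna_to_rna("ATGTTTTAA")
print(rna)                # AUGUUUUAA
print(split_codons(rna))  # ['AUG', 'UUU', 'UAA']
```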
There is a standard genetic code that is used by almost all organisms:
| Second base U | Second base C | Second base A | Second base G |
| --- | --- | --- | --- |
| UUU – Phe | UCU – Ser | UAU – Tyr | UGU – Cys |
| UUC – Phe | UCC – Ser | UAC – Tyr | UGC – Cys |
| UUA – Leu | UCA – Ser | UAA – Stop | UGA – Stop |
| UUG – Leu | UCG – Ser | UAG – Stop | UGG – Trp |
| CUU – Leu | CCU – Pro | CAU – His | CGU – Arg |
| CUC – Leu | CCC – Pro | CAC – His | CGC – Arg |
| CUA – Leu | CCA – Pro | CAA – Gln | CGA – Arg |
| CUG – Leu | CCG – Pro | CAG – Gln | CGG – Arg |
| AUU – Ile | ACU – Thr | AAU – Asn | AGU – Ser |
| AUC – Ile | ACC – Thr | AAC – Asn | AGC – Ser |
| AUA – Ile | ACA – Thr | AAA – Lys | AGA – Arg |
| AUG – Met | ACG – Thr | AAG – Lys | AGG – Arg |
| GUU – Val | GCU – Ala | GAU – Asp | GGU – Gly |
| GUC – Val | GCC – Ala | GAC – Asp | GGC – Gly |
| GUA – Val | GCA – Ala | GAA – Glu | GGA – Gly |
| GUG – Val | GCG – Ala | GAG – Glu | GGG – Gly |
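As a rough illustration, the table above can be turned into a lookup dictionary and used to translate an mRNA sequence. The sketch below is not a reference implementation: it uses one-letter amino acid codes with `*` for stop instead of the three-letter names in the table, reads the sequence from its first nucleotide, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the order UUU, UUC, UUA, UUG, UCU, ...
# (first base varies slowest, third base fastest); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGUUUGGCUAA"))  # MFG
```

A real translation tool would also locate the start codon and handle ambiguous nucleotides; both are left out here to keep the example short.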
The genetic code has several interesting properties:
- It is degenerate, that is, for most amino acids, several codons map onto the same amino acid. This means there is a certain level of redundancy, and any given protein sequence can be encoded by a large number of alternative DNA sequences, using different codons for the same amino acids.
- In many cases, all codons for a given amino acid start with the same first nucleotide, while the third nucleotide often does not determine the encoded amino acid.
- The code contains three codons that lead to a translation stop.
- The code contains one codon that initiates translation, the methionine-encoding AUG. Note that this is the only codon for methionine, therefore additional methionines within a sequence must be encoded by AUG as well.
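These properties can be checked directly against the standard code. The following sketch rebuilds the table with one-letter amino acid codes (`*` for stop) and counts codons per amino acid, the number of alternative encodings of a short peptide, and the number of two-nucleotide prefixes for which the third nucleotide is irrelevant. All names, including `count_encodings`, are hypothetical and used only for illustration.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: how many codons map onto each amino acid (or onto stop).
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Number of distinct codon sequences that encode the given protein."""
    return prod(DEGENERACY[aa] for aa in protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

# Two-nucleotide prefixes whose four codons all encode the same amino acid,
# i.e. cases where the third nucleotide does not matter at all.
fourfold_prefixes = [
    a + b
    for a, b in product(BASES, repeat=2)
    if len({CODON_TABLE[a + b + c] for c in BASES}) == 1
]

print(DEGENERACY["L"], DEGENERACY["M"])  # 6 1
print(count_encodings("MFG"))            # 8
print(stop_codons)                       # ['UAA', 'UAG', 'UGA']
print(len(fourfold_prefixes))            # 8
```

Even the three-residue peptide Met-Phe-Gly has eight possible encodings, and the count grows multiplicatively with protein length.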
The genetic code is said to be universal, in the sense that the same code applies to almost all organisms. However, there are instances where the genetic code is slightly modified:
- Mitochondria use non-standard stop codons and redefine some codon/amino acid relations.
- Mycoplasma – a bacterium with an extremely small genome – has a different code, in which UGA encodes tryptophan instead of a stop.
- Bacteria often use the additional start codons GUG and UUG.
Note that the differences are small and often apply to relatively small genomes, where a change of the genetic code affects a relatively small number of genes and can thus be compensated for in rare evolutionary events. Changing the code of a large genome in a random evolutionary event, where tens of thousands of vital genes would be affected, is virtually impossible.
A list of alternative genetic codes can be found at NCBI.
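Because the variants differ from the standard code in only a few codons, they can be sketched as small overrides of the standard table. The example below makes assumptions that go beyond the text above: the reassignments shown are those of the vertebrate mitochondrial code (UGA to Trp, AUA to Met, AGA and AGG to stop) and of the Mycoplasma code (UGA to Trp), and the names `STANDARD`, `VERTEBRATE_MITO`, `MYCOPLASMA`, and `translate` are hypothetical. Other mitochondrial lineages use different reassignments.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant codes expressed as overrides of the standard table (assumed values,
# see the paragraph above); "*" marks a stop codon.
VERTEBRATE_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}
MYCOPLASMA = {**STANDARD, "UGA": "W"}


def translate(mrna: str, table: dict[str, str]) -> str:
    """Translate an mRNA string with the given codon table until the first stop."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = "AUGUGAAUAAGAGGC"  # codons: AUG UGA AUA AGA GGC
print(translate(mrna, STANDARD))         # M
print(translate(mrna, VERTEBRATE_MITO))  # MWM
print(translate(mrna, MYCOPLASMA))       # MWIRG
```

The same sequence yields three different products, which shows why a sequence must always be translated with the code of the organism or organelle it comes from.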


