Genetic code

The genetic code describes the relationship between DNA sequences and the protein sequences they encode. A DNA sequence is transcribed into an mRNA sequence, which is then translated into a protein by assembling a chain of amino acids. Each amino acid corresponds to a triplet of nucleotides, a so-called codon.

Since proteins are translated from RNA templates, the codons are typically specified with the RNA nucleotides A, C, G, and U. However, many resources also accept or provide codons with the DNA nucleotides A, C, G, and T.
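The two notations can be converted into each other mechanically. The following Python snippet is a minimal sketch: it rewrites a DNA sequence in RNA notation and splits it into codons. The function names `dna_to_rna` and `split_codons` and the example sequence are illustrative assumptions, and the input is assumed to be the coding (sense) strand, read 5' to 3' and already in the correct reading frame.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a coding-strand DNA sequence in RNA notation (T -> U)."""
    return dna.upper().replace("T", "U")


def split_codons(rna: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical example sequence, chosen only for illustration.
mrna = dna_to_rna("ATGTTTTGGTAA")
print(mrna)                # AUGUUUUGGUAA
print(split_codons(mrna))  # ['AUG', 'UUU', 'UGG', 'UAA']
```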

There is a standard genetic code that is used by almost all organisms:

UUU – Phe     UCU – Ser     UAU – Tyr     UGU – Cys
UUC – Phe     UCC – Ser     UAC – Tyr     UGC – Cys
UUA – Leu     UCA – Ser     UAA – Stop    UGA – Stop
UUG – Leu     UCG – Ser     UAG – Stop    UGG – Trp

CUU – Leu     CCU – Pro     CAU – His     CGU – Arg
CUC – Leu     CCC – Pro     CAC – His     CGC – Arg
CUA – Leu     CCA – Pro     CAA – Gln     CGA – Arg
CUG – Leu     CCG – Pro     CAG – Gln     CGG – Arg

AUU – Ile     ACU – Thr     AAU – Asn     AGU – Ser
AUC – Ile     ACC – Thr     AAC – Asn     AGC – Ser
AUA – Ile     ACA – Thr     AAA – Lys     AGA – Arg
AUG – Met     ACG – Thr     AAG – Lys     AGG – Arg

GUU – Val     GCU – Ala     GAU – Asp     GGU – Gly
GUC – Val     GCC – Ala     GAC – Asp     GGC – Gly
GUA – Val     GCA – Ala     GAA – Glu     GGA – Gly
GUG – Val     GCG – Ala     GAG – Glu     GGG – Gly
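The table above can be turned into a lookup structure for translating sequences. The Python sketch below builds a dictionary from the 64 entries and translates an mRNA up to the first stop codon. The names `CODON_TABLE` and `translate` and the example sequence are illustrative assumptions. The function assumes the sequence is already in frame and contains only the letters A, C, G, and U (or T), and it does not search for a start codon.

```python
from itertools import product

BASES = "UCAG"  # nucleotide order used in the table above

# Amino acids in table order: first base, then second, then third.
AMINO_ACIDS = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()

# Map each codon to its amino acid, e.g. "UUU" -> "Phe".
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> list[str]:
    """Translate an in-frame mRNA sequence until the first stop codon."""
    mrna = mrna.upper().replace("T", "U")  # also accept DNA notation
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


print(translate("AUGUUUUGGUAA"))  # ['Met', 'Phe', 'Trp']
```

An unknown character in the input raises a `KeyError` in this sketch. Real tools usually also handle ambiguity codes such as N.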

The genetic code has several interesting properties:

  • It is degenerate: for most amino acids, several codons map onto the same amino acid. This redundancy means that any given protein sequence can be encoded by a large number of alternative DNA sequences that use different codons for the same amino acids.
  • In many cases, all codons for a given amino acid start with the same nucleotide, and often with the same two nucleotides. The third nucleotide frequently does not determine the encoded amino acid.
  • The code contains three codons that stop translation: UAA, UAG, and UGA.
  • The code contains one codon that initiates translation, the methionine-encoding AUG. Because AUG is the only codon for methionine, methionines within a sequence are encoded by AUG as well.
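The properties listed above can be checked directly against the codon table. The Python sketch below counts how many codons encode each amino acid, computes the number of alternative codon sequences for a short peptide, and counts the codon families in which the third nucleotide is irrelevant. The names `codons_per_aa`, `count_encodings`, and `fourfold`, and the example peptide, are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "UCAG"
AMINO_ACIDS = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: how many codons encode each amino acid (or Stop).
codons_per_aa = Counter(CODON_TABLE.values())
print(codons_per_aa["Leu"], codons_per_aa["Met"], codons_per_aa["Stop"])  # 6 1 3


def count_encodings(peptide: list[str]) -> int:
    """Number of distinct codon sequences that encode the given peptide."""
    return prod(codons_per_aa[amino_acid] for amino_acid in peptide)


# Hypothetical four-residue peptide: 1 * 6 * 6 * 4 alternatives.
print(count_encodings(["Met", "Leu", "Ser", "Gly"]))  # 144

# Codon families (fixed first two nucleotides) in which the third
# nucleotide does not change the encoded amino acid.
fourfold = [
    first + second
    for first, second in product(BASES, repeat=2)
    if len({CODON_TABLE[first + second + third] for third in BASES}) == 1
]
print(len(fourfold), "of 16")  # 8 of 16
```

The number of alternative encodings grows multiplicatively with peptide length, which is why even short proteins have an enormous number of synonymous DNA sequences.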

The genetic code is said to be universal, in the sense that the same code applies to almost all organisms. However, there are instances where the genetic code is slightly modified:

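  • Mitochondria use non-standard stop codons and redefine some codon/amino acid relations. In vertebrate mitochondria, for example, UGA encodes Trp, AUA encodes Met, and AGA and AGG are stop codons.
  • Mycoplasma, a bacterial genus with extremely small genomes, reads UGA as Trp instead of as a stop codon.
  • Bacteria often use the additional start codons GUG and UUG.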
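Because the variant codes differ from the standard code in only a few codons, they can be represented as small sets of overrides. The Python sketch below illustrates this for two variants. The reassignments shown follow NCBI translation table 2 (vertebrate mitochondrial) and table 4 (which covers Mycoplasma); the names `STANDARD`, `VERTEBRATE_MITO`, `MYCOPLASMA`, and `translate`, and the example sequence, are illustrative assumptions.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
STANDARD = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant codes expressed as overrides of the standard code.
VERTEBRATE_MITO = {
    **STANDARD,
    "UGA": "Trp",   # stop codon reassigned to tryptophan
    "AUA": "Met",   # isoleucine codon reassigned to methionine
    "AGA": "Stop",  # arginine codons reassigned to stop
    "AGG": "Stop",
}
MYCOPLASMA = {**STANDARD, "UGA": "Trp"}


def translate(mrna: str, table: dict[str, str]) -> list[str]:
    """Translate an in-frame mRNA with the given code until the first stop."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


# Hypothetical sequence that hits every reassigned codon family.
mrna = "AUGUGAAUAAGA"
print(translate(mrna, STANDARD))         # ['Met']
print(translate(mrna, VERTEBRATE_MITO))  # ['Met', 'Trp', 'Met']
print(translate(mrna, MYCOPLASMA))       # ['Met', 'Trp', 'Ile', 'Arg']
```

The same sequence yields three different peptides, which is why the correct translation table has to be chosen before translating mitochondrial or Mycoplasma genes.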

Note that the differences are small and mostly occur in relatively small genomes, where a change in the genetic code affects few genes and can therefore be compensated for in rare evolutionary events. Changing the code of a large genome, where tens of thousands of vital genes would be affected at once, is virtually impossible.

A list of alternative genetic codes can be found at NCBI.