Blosum62
Blosum62 is a substitution matrix for scoring protein sequence alignments. You will encounter it in many bioinformatics applications that align protein sequences or analyze homology between sequences. Most notably, BLAST uses Blosum62 by default for protein homology detection.
When comparing two sequences and assessing their homology, it is important to have a metric for how closely related two sequence symbols (amino acids, in the case of proteins) are. Among the 20 standard amino acids, some are more closely related than others in terms of their physicochemical properties. For example, there are hydrophobic and hydrophilic amino acids, and the hydrophobic ones are more closely related to each other than to the hydrophilic ones.
If two sequences are evolutionarily related, it is plausible to assume that any amino acid change has a high probability of being conservative, i.e. of replacing one amino acid with a closely related, similar one.
Therefore, when assigning a match or mismatch score to two aligned amino acids, we need a table in which we can look up the score for any pair of amino acids. That score should reflect how similar the two amino acids are.
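This is what Blosum62 does: it contains similarity scores for all pairs of amino acids, including each amino acid paired with itself, and assigns higher (better) scores to similar amino acids. The matrix is symmetric, so the order of the two amino acids does not matter. You can find the Blosum62 matrix, for example, at Expasy.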
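To make the table lookup concrete, here is a minimal Python sketch. It holds only a small excerpt of Blosum62 (six of the 20 amino acids) and scores two equal-length sequences position by position, without gaps. The names BLOSUM62_SUBSET, pair_score and ungapped_score are made up for this illustration and do not come from any library.

```python
# Excerpt of Blosum62 for six amino acids: I, L, V (hydrophobic) and D, E, K (charged).
# Each unordered pair is stored once, with the two letters in alphabetical order.
BLOSUM62_SUBSET = {
    ("I", "I"): 4, ("L", "L"): 4, ("V", "V"): 4,
    ("D", "D"): 6, ("E", "E"): 5, ("K", "K"): 5,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1,
    ("D", "E"): 2, ("D", "K"): -1, ("E", "K"): 1,
    ("D", "I"): -3, ("E", "I"): -3, ("I", "K"): -3,
    ("D", "L"): -4, ("E", "L"): -3, ("K", "L"): -2,
    ("D", "V"): -3, ("E", "V"): -2, ("K", "V"): -2,
}


def pair_score(a, b):
    """Look up the score for two amino acids; the matrix is symmetric."""
    key = tuple(sorted((a, b)))
    return BLOSUM62_SUBSET[key]


def ungapped_score(seq1, seq2):
    """Sum the pair scores of two equal-length sequences aligned without gaps."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


print(pair_score("I", "L"))            # 2: both hydrophobic, a conservative change
print(pair_score("L", "D"))            # -4: hydrophobic vs. charged
print(ungapped_score("LIVE", "IVLD"))  # 2 + 3 + 1 + 2 = 8
```

A real alignment program would use the full 20 x 20 matrix and also handle gaps; this excerpt only shows how similar pairs receive higher scores than dissimilar ones.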
Blosum62 was derived empirically: a large sample set of homologous protein sequences was aligned and the observed substitutions were analyzed. Only blocks, i.e. ungapped regions that aligned well, were used to count substitutions and calculate the scores. The name 'Blosum' stands for 'BLOcks SUbstitution Matrix'. The '62' means that, within each block, sequences sharing at least 62% identity were clustered and weighted as a single sequence, so that groups of very similar sequences do not dominate the counts.
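BLOSUM scores are commonly described as log-odds scores: for each pair of amino acids, the frequency observed in the blocks is compared with the frequency expected by chance, and the logarithm of that ratio is scaled and rounded to an integer (Blosum62 uses half-bit units, i.e. 2 x log2). The following Python sketch shows that calculation on invented counts for a two-letter alphabet. The function name log_odds_scores and the toy counts are assumptions for illustration only, and the clustering at 62% identity is left out.

```python
import math
from collections import Counter


def log_odds_scores(pair_counts):
    """Turn counts of unordered aligned pairs into half-bit log-odds scores.

    pair_counts maps (a, b) with a <= b to the number of times that pair
    was observed in alignment columns.
    """
    total = sum(pair_counts.values())
    # Observed frequency of each unordered pair.
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Background frequency of each symbol: an identical pair counts fully,
    # a mixed pair contributes half to each of its two symbols.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for (a, b), freq in q.items():
        # Frequency expected by chance; mixed pairs can occur in two orders.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Invented counts, not real data: identical pairs are over-represented.
toy_counts = {("A", "A"): 40, ("A", "B"): 20, ("B", "B"): 40}
toy_scores = log_odds_scores(toy_counts)
print(toy_scores)
```

With these toy counts, identical pairs occur more often than chance predicts and receive a positive score, while the mixed pair occurs less often and receives a negative one.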