BLAST output

In addition to alignments of the query sequence with hits from the database, BLAST provides a number of statistical values that help to evaluate the quality of a match.

Score

The score is the result of the alignment process. An optimal alignment of two sequences is the one that maximizes the score. Each pair of identical nucleotides or amino acids adds a positive ‘reward’ to the score, and each mismatch adds a negative ‘penalty’. Likewise, opening a gap in one of the two sequences and extending it each incur a penalty. The sum of all rewards and penalties is the score.

The values for matches and mismatches are determined by the substitution matrix, which holds a score for every possible pair of matching or mismatching nucleotides or amino acids. The penalties for opening and extending gaps are set as separate parameters alongside the matrix. In the simplest case, the scoring scheme may assign a positive score of, say, +1 to each match, and a negative score of, say, -3 to each mismatch and gap. However, the matrix may also reflect the fact that some nucleotide or amino acid exchanges are more likely than others; for instance, a hydrophobic amino acid is more likely to be replaced by another hydrophobic amino acid than by a hydrophilic one.
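
The simple scheme mentioned above can be sketched in a few lines of Python. This is only an illustration of how rewards and penalties add up for an alignment that already exists; it does not search for the optimal alignment the way BLAST does. The function name and the constants are hypothetical, and the flat -3 per gap position is an assumption taken from the example values in the text (BLAST itself distinguishes between opening and extending a gap).

```python
# Illustrative values from the text: +1 per match, -3 per mismatch or gap position.
MATCH = 1
MISMATCH = -3
GAP = -3


def alignment_score(aligned_query: str, aligned_subject: str) -> int:
    """Sum rewards and penalties over the columns of a finished alignment.

    Both strings must have the same length; '-' marks a gap position.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")

    score = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            score += GAP        # gap in either sequence
        elif q == s:
            score += MATCH      # identical residues
        else:
            score += MISMATCH   # different residues
    return score


if __name__ == "__main__":
    # Seven matches and one mismatch.
    print(alignment_score("ACGTACGT", "ACGTTCGT"))
    # Seven matches and one gap position.
    print(alignment_score("ACGT-CGT", "ACGTACGT"))
```

Replacing the three constants with a lookup into a matrix such as BLOSUM62 would turn this into a score that weights likely and unlikely exchanges differently.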

For protein searches, BLAST uses BLOSUM62 as its default matrix. Note that the total score of an alignment depends on the matrix used; scores based on different matrices therefore cannot be compared directly.

Bit score

In order to account for these differences in matrices, an additional score is computed, the so-called bit score. The bit score takes into consideration the specific properties of a scoring matrix and is normalized – bit scores from alignments based on different matrices can be compared to each other. Higher bit scores mean better alignments. The formula for transforming a score to a bit score is:

[Bit score] = (lambda * [Score] - ln(K)) / ln(2)

where lambda and K are matrix-specific parameters that reflect the properties of the substitution matrix and the gap penalties used.
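
As a minimal sketch, the formula can be written as a small Python function. The function name is hypothetical. The values used for lambda and K in the example are placeholders of the size typically reported for gapped BLOSUM62 searches and should be treated as an assumption; the values that apply to a real search are printed in the statistics section of the BLAST report.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score into a bit score.

    Implements [Bit score] = (lambda * [Score] - ln(K)) / ln(2).
    """
    if k <= 0:
        raise ValueError("K must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # Placeholder parameters (assumption); read the real ones from your report.
    example_lambda = 0.267
    example_k = 0.041
    print(round(bit_score(100, example_lambda, example_k), 1))
```

Because lambda and K absorb the scale of the scoring scheme, two raw scores from different matrices can be brought onto the same scale this way before they are compared.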

Identities

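This is the number of alignment positions at which the two sequences carry the same nucleotide or amino acid. BLAST reports it relative to the alignment length, together with the corresponding percentage. It gives a quick impression of the quality of the alignment, but should be read together with the alignment length, since short alignments can show a high percentage by chance.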
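
The following sketch shows how such a count could be derived from two aligned sequences. The function name is hypothetical, and the use of '-' as the gap character is an assumption about how the alignment is stored.

```python
def identities(aligned_query: str, aligned_subject: str):
    """Return (matches, alignment_length, percent_identity).

    A position counts as an identity when both sequences carry the
    same residue and neither of them has a gap ('-') there.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    length = len(aligned_query)
    if length == 0:
        raise ValueError("alignment is empty")

    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return matches, length, 100.0 * matches / length


if __name__ == "__main__":
    matches, length, percent = identities("ACGT-CGT", "ACGTACGA")
    print(f"Identities = {matches}/{length} ({percent:.0f}%)")
```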

Expect value

The Expect value (E-value) is very important for the assessment of BLAST results: it is a rough estimate of the significance of the match. It indicates the number of hits with an equal or better score that the query sequence would be expected to yield by chance in a random database of the same size. The lower the Expect value, the more significant the match. See the article on the Expect value for details.
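
The relation between bit score and Expect value is not spelled out in this article. A commonly used form is E = m * n * 2^(-[Bit score]), where m is the length of the query and n the total length of the database. The sketch below assumes this relation; the function and parameter names are hypothetical, and BLAST applies length corrections that are not modelled here, so the result will only approximate the value in a real report.

```python
def expect_value(bit_score: float, query_length: int, database_length: int) -> float:
    """Approximate the Expect value from a bit score.

    Assumes E = m * n * 2 ** (-bit_score), with m the query length and
    n the database length, and ignores BLAST's length corrections.
    """
    if query_length <= 0 or database_length <= 0:
        raise ValueError("lengths must be positive")
    search_space = query_length * database_length
    return search_space * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Each additional bit halves the expected number of chance hits.
    for bits in (30, 40, 50):
        print(bits, expect_value(bits, query_length=250, database_length=10_000_000))
```

The same bit score therefore yields a larger Expect value in a larger database, which is why Expect values from searches against different databases should not be compared directly.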

Gaps

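This is the total number of gap positions in an alignment, reported relative to the alignment length. Because opening a new gap is usually penalized more heavily than extending an existing one, the number of separate gaps – rather than their lengths – has the stronger effect on the quality of an alignment.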
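
To make the difference between the number of gaps and their lengths concrete, the following sketch counts both for a pair of aligned sequences. The function name is hypothetical, and '-' as the gap character is an assumption about how the alignment is stored.

```python
import re


def gap_stats(aligned_query: str, aligned_subject: str):
    """Return (gap_positions, gap_openings) for an alignment.

    gap_positions counts every '-' character in either sequence;
    gap_openings counts each uninterrupted run of '-' once.
    """
    sequences = (aligned_query, aligned_subject)
    gap_positions = sum(seq.count("-") for seq in sequences)
    gap_openings = sum(len(re.findall(r"-+", seq)) for seq in sequences)
    return gap_positions, gap_openings


if __name__ == "__main__":
    # One gap of length two plus one gap of length one in the query.
    positions, openings = gap_stats("AC--GT-A", "ACTTGTCA")
    print(f"gap positions: {positions}, separate gaps: {openings}")
```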
