Gene synthesis
Gene synthesis is the fastest and most efficient way to get hold of long fragments of DNA, i.e. DNA molecules with a length of 100 base pairs or more. Today, gene synthesis is virtually sequence independent; for instance, even sequences with a very low or very high GC content can be readily assembled.
Due to the high efficiency of the process, gene synthesis is the method of choice for the assembly of complex operons or enzyme cascades, for protein engineering, for the production of antibodies, and for all experiments where a long DNA fragment is required that is not readily available from natural sources. Today it is much easier to synthesize a gene based on a database sequence than to isolate wild-type DNA, unless that DNA is already available as a cassette in a cloning vector.
The term ‘gene synthesis’ is not very precise: It applies to the assembly of any long DNA fragment, independent of whether it encodes a gene, is a non-coding piece of DNA or contains several genes at once.
A number of gene synthesis strategies are available, both as the basis of commercial services and as methods that a molecular biologist can implement ad hoc. Please note that some of these methods may be patent protected, so IP restrictions may apply that you have to take into consideration. Popular methods are:
Solid-phase synthesis: In this strategy, an anchor molecule is attached to a solid phase, and the desired DNA strand is assembled on the end of the anchor molecule. In a cyclic process, small building blocks are appended to the anchor and the nascent DNA strand. In the simplest form, the building blocks are individual nucleotides; this is precisely the well-known and established oligonucleotide synthesis based on phosphoramidites. In a more complex setup, which is more efficient for long DNA fragments, the building blocks are pre-assembled oligonucleotides. After the coupling of each building block, the remaining, non-coupled building blocks are washed away, and a new cycle begins. Since solid-phase synthesis works in cycles, it is time-consuming, and because no coupling step is perfectly efficient, the losses accumulate with every cycle and limit the overall yield. Commercially, it currently plays only a secondary role.
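The effect of the cyclic process on yield can be illustrated with a small calculation. The following is a minimal sketch, assuming that every coupling cycle succeeds independently with the same probability; the function name and the 99% coupling efficiency are illustrative assumptions, not values from a specific synthesis platform.

```python
def full_length_yield(n_cycles: int, coupling_efficiency: float) -> float:
    """Fraction of strands that coupled successfully in every single cycle.

    Assumes independent cycles with a constant efficiency (a simplification).
    """
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    if n_cycles < 0:
        raise ValueError("n_cycles must not be negative")
    return coupling_efficiency ** n_cycles


if __name__ == "__main__":
    assumed_efficiency = 0.99  # hypothetical per-cycle coupling efficiency
    for length in (20, 100, 200):
        # A strand built from `length` blocks needs length - 1 coupling cycles.
        share = full_length_yield(length - 1, assumed_efficiency)
        print(f"{length} building blocks: {share:.1%} full-length product")
```

Under these assumptions the share of full-length product drops geometrically with the number of cycles, which is why larger pre-assembled building blocks (fewer cycles) help for long fragments.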
PCR: During the polymerisation phase of a PCR, the polymerase can fill ‘gaps’ in one of the two DNA strands. If the precursor oligonucleotides are designed so that they partially overlap, the polymerase fills the remaining gaps and thereby creates a contiguous double strand. PCR-based gene synthesis is convenient, since it works with a partial set of precursor oligos: the oligos do not have to cover both strands completely. The disadvantage of a PCR-based method is that the gaps between oligos can lead to the formation of DNA secondary structures, which cause faulty byproducts. In addition, every DNA polymerase has a limited fidelity, so the polymerisation itself may introduce mismatch mutations.
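The overlap design for a PCR-based assembly can be sketched as follows. This is an illustration only: the function names, the oligo length of 40 and the minimum overlap of 15 are assumptions, and real designs additionally balance melting temperatures and avoid secondary structures. The sketch alternates between the two strands and uses an even number of oligos, so that the first and last oligo point inwards and the polymerase can extend across the whole fragment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_pca_oligos(seq: str, oligo_len: int = 40, min_overlap: int = 15):
    """Split `seq` into alternating forward/reverse oligos that overlap.

    Returns a list of (start, strand, oligo) tuples, where `start` is the
    position on the forward strand and `oligo` is written 5' to 3'.
    """
    seq = seq.upper()
    if not 0 < min_overlap < oligo_len:
        raise ValueError("min_overlap must be between 0 and oligo_len")
    if len(seq) <= oligo_len:
        raise ValueError("sequence is short enough to be ordered as one oligo")

    # Smallest even number of oligos whose combined reach covers the sequence.
    n = 2
    while n * oligo_len - (n - 1) * min_overlap < len(seq):
        n += 2

    span = len(seq) - oligo_len  # start position of the last oligo
    oligos = []
    for i in range(n):
        start = i * span // (n - 1)  # spread the starts evenly
        fragment = seq[start:start + oligo_len]
        if i % 2 == 0:
            oligos.append((start, "+", fragment))
        else:
            oligos.append((start, "-", reverse_complement(fragment)))
    return oligos


if __name__ == "__main__":
    example = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGG"
    for start, strand, oligo in design_pca_oligos(example):
        print(start, strand, oligo)
```

Spreading the start positions evenly means the actual overlap is usually larger than the requested minimum; a design that deliberately leaves gaps on each strand would use fewer or shorter oligos.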
Ligase-based: If oligonucleotides are synthesized so that they cover the complete forward and reverse strand of a DNA double helix, and so that every pair of adjacent oligos on one strand is bridged by one oligo of the opposite strand, then they will self-assemble into the correct full-length DNA sequence under suitable temperature conditions. It is then possible to create a contiguous double strand by adding a DNA ligase. Provided that each oligo is phosphorylated at the 5′ end, the ligase will form the standard phosphodiester bond between adjacent oligonucleotides, thus leading to complete DNA strands.
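The staggered tiling described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the function names and the oligo length of 40 are hypothetical, the reverse-strand oligos are simply shifted by half an oligo length, and the short terminal oligos that result may need to be merged with their neighbours in a real design.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile(length: int, oligo_len: int, first_boundary: int):
    """Return (start, end) intervals that cover 0..length without gaps."""
    bounds = sorted({0, length, *range(first_boundary, length, oligo_len)})
    return list(zip(bounds, bounds[1:]))


def design_ligation_oligos(seq: str, oligo_len: int = 40):
    """Tile both strands completely, with the reverse strand offset by half
    an oligo so that each reverse oligo bridges a forward-strand junction.

    Returns (forward, reverse); each entry is (start, end, oligo 5'->3'),
    with start/end given as forward-strand coordinates.
    """
    seq = seq.upper()
    if oligo_len < 2:
        raise ValueError("oligo_len must be at least 2")
    forward = [(a, b, seq[a:b])
               for a, b in tile(len(seq), oligo_len, oligo_len)]
    reverse = [(a, b, reverse_complement(seq[a:b]))
               for a, b in tile(len(seq), oligo_len, oligo_len // 2)]
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = design_ligation_oligos("ACGTTGCA" * 15, oligo_len=40)
    for start, end, oligo in fwd:
        print("+", start, end, oligo)
    for start, end, oligo in rev:
        print("-", start, end, oligo)
```

Because every junction on one strand lies in the middle of an oligo on the other strand, the annealed complex contains only nicks, which is the substrate the ligase needs.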
Chip-based: All of the above methods can be combined with a chip-based synthesis of oligonucleotide precursors. This can potentially increase the efficiency of the whole process and allows for a certain degree of error correction. However, it also adds another layer of complexity to the process and the increased efficiency can be leveraged only for very long DNA fragments – it is difficult to perform multiple short gene assemblies in parallel on the same chip.
Note that regardless of how the primary DNA product is assembled, all methods require postprocessing, in which the DNA fragment is ligated into a vector and a 100% correct clone is identified by screening, usually by full-length DNA sequencing.
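How much screening is needed depends on how many clones are error-free. The following back-of-the-envelope sketch assumes that errors occur independently at a constant rate per base; the function names, the error rate of one in a thousand bases and the 95% confidence level are illustrative assumptions, not measured values.

```python
import math


def fraction_correct(length_bp: int, error_rate_per_base: float) -> float:
    """Expected fraction of clones without any error, assuming independent
    errors at a constant per-base rate."""
    return (1.0 - error_rate_per_base) ** length_bp


def clones_to_screen(length_bp: int, error_rate_per_base: float,
                     confidence: float = 0.95) -> int:
    """Number of clones to sequence so that at least one is correct with
    the given probability."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    p = fraction_correct(length_bp, error_rate_per_base)
    if p <= 0.0:
        raise ValueError("no correct clone can be expected")
    if p >= 1.0:
        return 1
    # Solve 1 - (1 - p)**n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


if __name__ == "__main__":
    # Hypothetical example: 1000 bp fragment, one error per 1000 bases.
    print(clones_to_screen(1000, 0.001))
```

The required number of clones grows quickly with fragment length, which is one reason why error correction before cloning can pay off for long constructs.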
One of the most important aspects of gene synthesis is the design of the sequence itself: Since the sequence of a synthetic gene is not limited by any natural constraints, it is possible to choose the sequence so that it is optimal with regard to a number of objectives. This can include high expression yield, easy handling in subsequent molecular biological processes, easy combination with other DNA fragments or further modification. Therefore, it is important to use an effective and well-designed gene optimization process.
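As a small example of such a design objective, the sketch below removes an unwanted restriction site from a coding sequence by exchanging a single codon for a synonymous one, so that the encoded protein stays the same. It is a deliberately simple illustration, not a complete optimization process: the function names are hypothetical, the sequence is assumed to be in frame from its first base and to consist of A, C, G and T only, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon: str):
    """All other codons that encode the same amino acid as `codon`."""
    aa = CODON_TO_AA[codon]
    return [c for c, a in CODON_TO_AA.items() if a == aa and c != codon]


def _swap_one_codon(dna: str, site: str, pos: int):
    """Try to reduce the number of `site` occurrences with one codon swap."""
    for start in range(pos // 3 * 3, min(pos + len(site), len(dna) - 2), 3):
        for alt in synonymous_codons(dna[start:start + 3]):
            candidate = dna[:start] + alt + dna[start + 3:]
            if candidate.count(site) < dna.count(site):
                return candidate
    return None


def remove_site(dna: str, site: str) -> str:
    """Remove every occurrence of `site` using synonymous codon swaps."""
    dna, site = dna.upper(), site.upper()
    while site in dna:
        fixed = _swap_one_codon(dna, site, dna.find(site))
        if fixed is None:
            raise ValueError("no single synonymous swap removes the site")
        dna = fixed
    return dna


if __name__ == "__main__":
    original = "ATGGAATTCTAA"  # contains the EcoRI site GAATTC
    cleaned = remove_site(original, "GAATTC")
    print(cleaned, translate(cleaned) == translate(original))
```

A realistic optimization would weigh several objectives at once, for example codon usage of the expression host, GC content and repeats, rather than fixing one motif at a time.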
Criteria for the selection of a gene assembly strategy or gene synthesis service are:
- Reliability of the process in terms of predictable time frame
- Robustness with regard to GC content, repetitive motifs and secondary structure
- Availability and quality of the gene optimization process
- Quality of the postprocessing step, in terms of turnaround time and reliability of the result
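To judge how demanding a given sequence is with regard to the robustness criterion, a quick pre-check of GC content and repeated motifs can help. The following is a minimal sketch; the function names, the window size of 50 and the motif length of 12 are arbitrary assumptions, and it does not predict secondary structure.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 50):
    """GC content of every sliding window; one value if `seq` is shorter."""
    n_windows = max(len(seq) - window + 1, 1)
    return [gc_content(seq[i:i + window]) for i in range(n_windows)]


def repeated_kmers(seq: str, k: int = 12):
    """All motifs of length `k` that occur more than once, with counts."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


if __name__ == "__main__":
    example = "ATGGCGCGCGCCGGCGGCGCGCGCCATTATTATTATTATTATTAAATGA"
    profile = windowed_gc(example, window=20)
    print(f"overall GC: {gc_content(example):.2f}")
    print(f"window GC range: {min(profile):.2f} to {max(profile):.2f}")
    print("repeated 9-mers:", repeated_kmers(example, k=9))
```

Sequences with extreme local GC content or many repeated motifs are the ones for which the robustness of the chosen method or service matters most.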
Since the direct costs, i.e. the price of the required reagents and oligos or of the gene synthesis service, are usually small compared to the indirect costs of a failed or delayed synthesis, it is important to put a strong emphasis on the quality, robustness and reliability of the overall process. This is especially true for experiments that require the repeated synthesis of very similar or related sequences. In these cases, the overall efficiency can be greatly increased by careful planning, by bioinformatics preprocessing and analysis, and by adaptive gene synthesis protocols that re-use parts of already synthesized fragments.


