Biotech Blog

Online tool for the calculation of randomized mutant libraries

2011-11-18

Entelechon has published a free online tool for calculating the complexity of randomized mutant libraries. You enter the number of randomized amino acid positions and the number of codon variants at each position, and the tool estimates the total number of possible permutations. Based on this complexity, it tests whether a synthetic library will contain enough molecules to cover each variant at least once.

In addition, the tool compares the performance of codon-precision libraries with conventional libraries based on single nucleotide randomizations. This allows the user to choose the most economic and efficient approach for the problem at hand.
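To give a feeling for the arithmetic behind such a comparison, here is a minimal sketch in Python. It is not the code of the online tool: the function name `library_complexity` and the example of four positions are our own assumptions. The figures of 32 codons for a fully randomized NNS position and 20 for a codon-precision position are the usual values.

```python
from math import prod

def library_complexity(variants_per_position):
    """Number of distinct sequences: the product of the variant counts."""
    return prod(variants_per_position)

positions = 4  # number of fully randomized amino acid positions (example value)
nns = library_complexity([32] * positions)      # NNS: 4 * 4 * 2 codons per position
precise = library_complexity([20] * positions)  # one codon per amino acid

print(f"NNS library:             {nns:,} variants")
print(f"Codon-precision library: {precise:,} variants")
print(f"Ratio:                   {nns / precise:.1f}x")
```

For four positions this gives 1,048,576 variants for the NNS design against 160,000 for the codon-precision design, and the gap widens with every additional position.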

The tool can be found here.

Gene optimization helps to protect your intellectual property

2011-09-12

There is a very nice side effect of gene synthesis which is rarely considered: When you have a gene made synthetically, based on a gene sequence that is custom optimized, you get a unique DNA sequence which does not exist anywhere else in the world. Even if the encoded protein is a run-of-the-mill standard, the underlying gene belongs to you only. Therefore, synthetic genes make it easy to identify IP theft: If, for example, you intend to produce a kit containing a DNA template, or a transgenic strain or cell line, the custom-made gene it contains will give away where the product comes from.

With synthetic genes, you can prove beyond any doubt that the gene was made for you, and that the product originates from your lab. As simple as this concept seems, it can become an invaluable tool in defending your IP rights and your position in the market. Therefore, not only do synthetic genes lead to optimized expression yields and improved ease of use in the lab, they can even help you to brand your DNA-containing cell lines, strains, or kits.

If you want to know more about gene optimization and custom gene synthesis, please contact us or send an inquiry.

Synthetic DNA controls as process standards

2011-09-09

It’s called gene synthesis by virtually everybody, but of course it is not restricted to actual genes. The custom synthesis of long DNA fragments can produce all sorts of very interesting and economically relevant non-coding DNA sequences.

One application that is becoming increasingly popular among our customers is the synthesis of designed DNA standards. For applications as varied as (q)PCR, FISH, or transcriptomics, well-defined control sequences can be extremely helpful. Gene synthesis can provide DNA controls which are well characterized and tuned to the job at hand. Parameters such as melting temperature, homology, or secondary structure can be adjusted in silico, and the control can have additional features, such as binding sites for fluorescent primers, or restriction sites.

No two DNA control standards are identical – each customer and each project bring their own set of requirements. Nevertheless, design and synthesis are usually very straightforward. Entelechon’s bioinformatics unit can help with the design (or do it completely autonomously), and synthesis is as simple and efficient as for other synthetic genes. More often than not, we have written customized software for the design of DNA controls that match the special requirements of new processes at a customer’s.

Compared to existing, “natural” DNA – such as a plasmid lying around in your lab – synthetic DNA standards have significant advantages: They are tailor-made for the job, and therefore avoid problems such as undesired secondary structures. They are well-characterized. And they are yours entirely: Since you designed them (or we did for you), you own the sequence – no patent and licensing problems. And even better: If someone should get the bad idea of stealing or duplicating your controls, you can easily identify what is yours simply by sequencing: The customized DNA sequence of your standards is a unique signature that will belong to you forever.

If you are interested in discussing the construction of DNA standards from scratch, please contact us.

From protein to DNA

2011-08-31

If you want to produce a given protein sequence by recombinant expression, in some cases you do not have a corresponding DNA sequence at hand. And even if you do, its codon bias may not match that of your expression host. In such cases, you need to backtranslate the protein sequence into a DNA sequence. This can be done in a number of ways, and there are many tools out there which do the job.

We have created two software tools specifically for the purpose of protein expression: The backtranslation tool is an online tool which lets you adjust the codon frequency for each amino acid in detail. It can download codon frequency tables for a wide range of potential expression hosts.

However, the backtranslation tool is rather simplistic. It accepts a list of DNA motifs which should be avoided (such as specific restriction sites), but apart from that it doesn’t control other parameters. Therefore, we have created a dedicated software package called Leto. Leto uses a genetic algorithm to iteratively optimize a given DNA or protein sequence. It takes into consideration a wide range of optimization parameters which have an impact on the expression yield.
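To illustrate what backtranslation with motif avoidance means, here is a minimal sketch in Python. It is neither the algorithm of the backtranslation tool nor of Leto: the function names are made up for this example, codon frequencies are ignored entirely, and the codon choice is a simple greedy rule (take the first synonymous codon that does not create a forbidden motif).

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codon -> amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def translate(dna):
    """Translate a DNA sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def backtranslate(protein, avoid=()):
    """Greedy backtranslation: for each residue, take the first synonymous
    codon that does not create any motif listed in `avoid`."""
    dna = ""
    for aa in protein:
        for codon in SYNONYMS[aa]:
            candidate = dna + codon
            if not any(motif in candidate for motif in avoid):
                dna = candidate
                break
        else:
            # The greedy rule never revisits earlier codons, so it can get stuck.
            raise ValueError(f"no codon for {aa!r} avoids all motifs")
    return dna

print(backtranslate("MNS"))                    # contains an EcoRI site (GAATTC)
print(backtranslate("MNS", avoid=["GAATTC"]))  # same peptide, site removed
```

A greedy rule like this can run into a dead end, and it says nothing about expression yield. That is exactly the gap an iterative optimizer is meant to close.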

For instance, Leto can remove potential splice sites, avoid mRNA destabilizing motifs, adjust the GC ratio, reduce the mRNA secondary structure, and – of course – adapt the codon frequencies. Therefore, an optimization using Leto is very likely to improve the expression yield significantly.

Leto is a standalone application running under Windows, Linux and Mac OS. Also, we offer custom Leto optimizations of your gene sequences as part of the gene synthesis service.

Your expensive project depends on high expression yield?

2011-08-26

Get a free ticket for the downstream processing by optimizing the gene. In many projects where expression yields are critical, a lot of work goes into the fine-tuning of expression conditions – vectors, host strains, media, and so on.

However, there is an often overlooked parameter which hugely affects expression yields: The gene sequence itself. With today’s affordable gene synthesis, it is trivial to optimize a gene sequence with regard to all features which are relevant for expression.

However, the story doesn’t stop here. Since the cost for gene synthesis is extremely low compared to the downstream costs of failed or weak expression, it is worthwhile to leverage gene optimization as much as possible. Consider this: Due to the degenerate nature of the genetic code, any given protein sequence has a huge set of possible encoding genes. With 61 sense codons for 20 amino acids, each residue has about three synonymous codons on average. For a 300aa protein, the number of permutations of encoding genes is therefore roughly

3^300 = 1.4 x 10^143

Gene optimization tries to explore this “solution space” as deeply as possible, but still manages to cover only a tiny fraction of it. And even then the assumptions underlying the optimization parameters are somewhat theoretical. They are a good educated guess, but they are far from an exact picture of the actual transcription/translation process in the host cells.
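The size of this solution space can also be computed for a concrete protein instead of using the average of three codons per residue. The following Python sketch multiplies the number of synonymous codons of each residue, working in log10 to keep the numbers readable. The function name and the peptide `MKLSR` are made up for illustration.

```python
from itertools import product
from math import log10

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codon -> amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = {}
for aa in CODON_TABLE.values():
    if aa != "*":
        DEGENERACY[aa] = DEGENERACY.get(aa, 0) + 1

def log10_encodings(protein):
    """log10 of the number of DNA sequences that encode `protein`."""
    return sum(log10(DEGENERACY[aa]) for aa in protein)

# The rule-of-thumb estimate for a 300 aa protein.
print(f"3^300 ~ 10^{300 * log10(3):.1f}")
# The same idea for a concrete (hypothetical) 50 aa sequence.
print(f"50 aa example ~ 10^{log10_encodings('MKLSR' * 10):.1f}")
```

The exact figure depends on the amino acid composition: methionine and tryptophan contribute a factor of one, while leucine, serine, and arginine contribute a factor of six each.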

Therefore, using a single gene – even if well optimized – is very unlikely to result in the best possible expression yield. Chances are good that it will perform satisfactorily – after all, a lot of thought and work went into the optimization – but still, it’s very likely that there are better solutions “out there”.

Now, when you consider the significant downstream costs of less than optimal gene expression, it may be prudent to invest in a set of 2, 3 or even 10 genes. All of them will be optimized, but they will be placed in entirely different regions of the solution space, thus increasing your chances drastically that you come close to the absolute optimum.

If you are interested in such optimization sets, please contact us and ask about our special gene set discount.

Gene synthesis

2011-08-22

Gene synthesis has come a long way. What started in the 1990s as a modest (and quite costly) service field has turned into one of the major supporting platform technologies for biotech and pharma. Today, many research projects would be impossible without affordable and rapid gene assembly at hand.

If you haven’t shopped for gene synthesis services recently, you will be surprised how far we have come: Synthetic genes are very affordable, and the order process is as simple as it gets.

With the logistics out of the way, the design of synthetic gene sequences becomes more and more important. However, you are not on your own when designing a gene: We can help, and in fact we have put considerable effort into the question of gene optimization in the past. Whether a gene shows a satisfactory expression yield depends not just on the codon frequency, but on a range of other factors as well. With today’s turnover in gene synthesis, it is paramount that gene optimization is both efficient and cost-effective. Thus, any viable gene optimization approach must be automated as much as possible, while at the same time being sufficiently flexible to adapt to the specific requirements of the project at hand.

Over the past ten years, we have developed a software package called Leto which provides the automated optimization of synthetic genes. If you have questions concerning the optimization of a particular gene sequence, please do not hesitate to contact us.

Apart from the question of optimization, what other factors are important when starting a gene synthesis? Obviously, turn-around time is of the essence for most projects, as is a reliable delivery date. Since the cost for gene synthesis is so low these days, the external costs of something going wrong become a much more important concern. Therefore, make sure that you discuss questions of gene design, cloning strategy, the scenario in which the gene will be used, etc. with your gene synthesis provider. Look for a provider where you get direct access to the molecular biology experts, not just the sales department.

Make sure to mention all “unusual” factors in your project, such as non-standard expression systems, “interesting” features of the protein in question, or relevant downstream processes such as the creation and screening of DNA libraries. A good service provider will be able to adapt the synthesis process to your needs and will help you to work around problems.

 

Coverage and complexity of DNA libraries

2011-08-19

Designing a DNA library is not a trivial task. As simple as it may seem to incorporate a couple of randomized nucleotides into a synthetic gene, it requires some careful planning to end up with a useful library.

In particular, it is important to determine the complexity of the library – how many different variants of the sequence are to be expected. This is very easy: Just multiply the combinations of 2, 3, or 4 nucleotide variants at all randomized positions, like so:

n = 2^a * 3^b * 4^c

where n is the number of permutations, a is the number of wobble positions with two alternative nucleotides, b is the number of wobble positions with three alternative nucleotides, and c is the number of wobble positions with four.
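If the library is written with the standard IUPAC nucleotide codes, the counting can be automated. The Python sketch below derives a, b, c and n from such a sequence. The function name `library_size` and the example sequence are made up for illustration.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def library_size(sequence):
    """Return (a, b, c, n) for a sequence written with IUPAC codes."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for base in sequence.upper():
        counts[len(IUPAC[base])] += 1
    a, b, c = counts[2], counts[3], counts[4]
    return a, b, c, 2**a * 3**b * 4**c

# Two randomized codons (NNS and VNS) embedded in fixed sequence.
print(library_size("GCTNNSGGTVNSTGG"))
```

For this example the result is a = 2, b = 1, c = 3 and therefore n = 768.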

Now, this is a very theoretical value. If you use a significant number of fully randomized codon positions, n will quickly become very large.

You now have to make an important decision: Should the library cover each possible permutation at least once, or is it ok if it covers a subset only?

This is primarily a question of practicability: Covering all possible permutations can be very difficult for large values of n, and the effort may not be warranted if the library is simply used for the exploration of the optimization potential of a protein. In that case, it may be worthwhile to perform multiple iterations of a library design, carefully approaching the optimum.

If you want to cover the complete permutation space, for instance when you are expecting complex interactions between multiple amino acids, then think about ways to reduce the absolute number of randomized nucleotide positions. For instance, it may be useful to allow a limited number of prototypic amino acid residues per position instead of all twenty.

The question of whether a library covers all possible permutations can be answered by looking at the number of full-length double stranded DNA molecules in the product. There is an in-depth discussion of how to do that in the technical appendix of the Entelechon catalog.
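As a rough back-of-the-envelope version of this question, the Python sketch below estimates the number of molecules from the DNA mass and the expected share of variants that are present at least once. This is not the method from the catalog appendix. It assumes an average mass of about 650 g/mol per base pair (a common rule of thumb), and it assumes that every molecule is an independent, uniform draw from all variants, which a real library will only approximate. Both function names are made up.

```python
from math import exp

AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650   # assumed average mass of one base pair of dsDNA

def molecule_count(mass_ng, length_bp):
    """Approximate number of double-stranded molecules in `mass_ng` of DNA."""
    moles = (mass_ng * 1e-9) / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO

def expected_coverage(molecules, variants):
    """Expected fraction of variants present at least once (Poisson
    approximation, assuming uniform and independent sampling)."""
    return 1.0 - exp(-molecules / variants)

n_molecules = molecule_count(mass_ng=100, length_bp=1000)
n_variants = 32 ** 6  # six NNS codons, as an example
print(f"{n_molecules:.2e} molecules for {n_variants:.2e} variants")
print(f"expected coverage: {expected_coverage(n_molecules, n_variants):.4f}")
```

Under these assumptions, a library needs several times more molecules than variants before the expected coverage gets close to 100 percent. Only full-length, correct molecules count, so the usable number is lower than the mass alone suggests.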

Since the number of possible permutations increases rapidly with the number of randomized positions, steps to reduce the complexity of the library can save you time and money: Using a proprietary technology, we can synthesize codon-precision mutant libraries which contain only the codons you want – no undesired codons, no stop codons. For instance, for a fully randomized amino acid position, this reduces the number of variants from 32 for a conventional NNS “codon” to 20.
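The numbers for the NNS scheme are easy to verify by enumerating its codons against the standard genetic code. The following Python sketch does that. The helper name `expand` is made up, and only the degenerate letters N, S and K are handled.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codon -> amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def expand(scheme):
    """All codons matched by a three-letter scheme such as 'NNS'."""
    choices = {"N": "ACGT", "S": "CG", "K": "GT"}
    return ["".join(c)
            for c in product(*(choices.get(ch, ch) for ch in scheme))]

codons = expand("NNS")
amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons,", len(amino_acids), "amino acids, stops:", stops)
```

For NNS this reports 32 codons which encode all 20 amino acids, plus the TAG stop codon. Some amino acids are represented by up to three codons and others by only one, so the variants are not evenly distributed either.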

Directed evolution

2011-08-18

Directed evolution experiments can be extremely helpful when aiming to improve biophysical properties of proteins. For instance, stabilizing a protein at high temperatures or at extreme pH, or improving an enzyme’s catalytic efficiency may turn out to be very difficult using a rational design approach, whereas directed evolution can accomplish these goals effortlessly.

Oh sorry, did I say effortlessly? I meant, “with potentially less effort”. Yes, directed evolution is neither a silver bullet, nor is it necessarily particularly easy to implement. Nevertheless, it may make solutions accessible which are otherwise impossible to achieve.

In its most basic form, a directed evolution experiment works like this:

  1. Create a DNA library with a gene partially randomized at strategic positions. The goal is to get a mixture of protein variants, with different amino acid residues at positions where you expect them to have an impact on the properties that should be optimized. For instance, for the improvement of enzymatic activity, amino acids near the catalytic center of the enzyme should be targeted, or residues that are suspected to be involved in substrate binding.
  2. Subclone the library into an expression vector, express the gene variants. Make sure that you can refer back to the underlying gene sequence. There are a number of systems which are designed to keep the gene and gene product connected, for instance phage display systems.
  3. Test the desired properties of the protein, for example binding to a substrate, stability at harsh conditions or enzymatic activity. If possible, conceive a selection mechanism that works automatically. For example, a phage display can bind only those proteins which interact strongly enough with a partner protein, so all “unfit” variants can be washed away.
  4. Sequence the resulting gene variants, and use them as the starting point for another iteration of the process. Repeat until you are satisfied with the result.

The first step – creating a randomized gene library – can be significantly improved if you don’t use individual wobble nucleotides, but codon-precision libraries which contain the desired codons only, but no other variants or stop codons. Entelechon has a proprietary technology to construct such codon-precision libraries.

As you can see, none of these steps is trivial. Therefore, careful planning and proper design are required for a successful directed evolution approach. If the problem at hand can be simulated on a computer, it may be worth starting with an in silico approach, where an evolutionary algorithm does basically the same. This will very likely not result in a useful solution, but you may learn a lot about the underlying system, its dynamics and limitations, and you may even get a better starting position for the wet lab work.
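To show what such an in silico loop looks like in its simplest form, here is a toy evolutionary algorithm in Python. Everything specific in it is an assumption made for the example: the fitness function merely counts matches to a made-up target peptide, and the library size, number of survivors and number of generations are arbitrary. A real simulation would replace `fitness` with a model of the property you want to improve.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard amino acids
TARGET = "MKTAYIAK"                # made-up optimum, used only by the toy fitness

def fitness(variant):
    """Toy score: number of positions that match TARGET."""
    return sum(a == b for a, b in zip(variant, TARGET))

def mutate(variant, rng):
    """Replace one randomly chosen residue."""
    pos = rng.randrange(len(variant))
    return variant[:pos] + rng.choice(ALPHABET) + variant[pos + 1:]

def evolve(start, generations=50, library_size=100, survivors=10, seed=0):
    rng = random.Random(seed)
    population = [start]
    for _ in range(generations):
        # "Library": the current survivors plus mutated copies of them.
        library = population + [mutate(rng.choice(population), rng)
                                for _ in range(library_size)]
        # "Screening": keep the best-scoring variants for the next round.
        population = sorted(library, key=fitness, reverse=True)[:survivors]
    return population[0]

start = "A" * len(TARGET)
best = evolve(start)
print(start, fitness(start), "->", best, fitness(best))
```

Because the survivors are carried over into each new library, the best score can never drop from one round to the next. This mirrors steps 1 to 4 above: randomize, screen, pick the winners, repeat.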

Getting things right at the first attempt is difficult. If you haven’t done a directed evolution experiment before, we highly recommend consulting someone who has. Entelechon has a long track record of gene library design, protein expression, and screening. Feel free to contact us for a discussion of your requirements.

 

 

Gene optimization

2011-08-18

If you want to obtain maximum expression yields, it is a good idea to adapt the gene in question to the target host. There are a number of parameters which have an impact on the expression yield.

Today, gene synthesis is a very efficient and affordable process which makes it possible to tweak a given gene sequence strategically in order to improve expression yields. By applying rational design principles to a gene sequence underlying a target protein, money and time can be saved for the downstream process.

Parameters which affect expression yields are:

  • GC content
  • Codon bias (preference of a particular subset of codons for a given expression host)
  • mRNA destabilizing motifs
  • mRNA secondary structure
  • secondary ORFs (extended reading frames in the second or third frame or on the opposing strand)

Taking all these parameters (and other constraints such as specific restriction sites) into account simultaneously is a time-consuming and difficult process. Therefore, I suggest that you use suitable design software which automates the process, such as Leto.
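Some of these parameters are easy to inspect on their own, even if balancing all of them at once is not. As a small illustration, the Python sketch below computes the overall GC content and the GC content in sliding windows, which helps to spot local extremes. The function names and the window size are arbitrary choices for this example. It is not part of Leto.

```python
def gc_content(seq):
    """Fraction of G and C bases in `seq` (which must not be empty)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def gc_windows(seq, window=50):
    """GC content of every sliding window of the given size."""
    return [gc_content(seq[i:i + window])
            for i in range(len(seq) - window + 1)]

# Hypothetical sequence: an AT-rich stretch followed by a GC-rich one.
seq = "ATATATTTAAATATTA" * 4 + "GCGGCCGCGGGCCGCG" * 4
windows = gc_windows(seq, window=32)
print(f"overall GC: {gc_content(seq):.2f}")
print(f"window GC range: {min(windows):.2f} to {max(windows):.2f}")
```

The example shows why the overall value alone is not enough: a gene can have a perfectly average GC content and still contain long stretches that are extremely AT-rich or GC-rich.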

Or, even better, outsource the process altogether. For instance, Entelechon provides the optimization of a gene as part of the fee-for-service gene synthesis.

Why does my gene expression fail?

2011-08-17

Sometimes, when a gene refuses to be expressed, it seems like an impenetrable problem that sits on your lab bench like a black box. However, Nature’s ways may be complicated but they are not random meanderings. Therefore, all failed expressions can be “debugged” in a rational way.

As a guideline, ask yourself these questions – and try to answer them:

  • Is the gene sequence correct? As trivial as it may seem, double check this. Is everything where it should be, start codon, promoter, stop codon? Is the amino acid sequence in frame? Make sure you spend the little money it takes for complete sequencing of the gene – more often than not, a seemingly perfect plasmid turns out to be corrupted somewhere in the middle, where MCS primers don’t reach.
  • Is there anything peculiar about the gene? Extended mRNA secondary structure? Long stretches of high or low GC content? Long stretches of uninterrupted reading frames in the second or third frame, or on the opposite strand?
  • What is known about the gene, its interaction with its natural host organism, toxic effects? Are there existing expression protocols? Don’t be fooled by related genes – we have seen cases where very closely related genes of the same family behaved entirely differently in recombinant expression. Don’t jump to conclusions based on protocols designed for ‘similar’ genes.
  • What part of the protein synthesis process is failing? Transcription (do you see the mRNA – this can be verified by Northern blot or PCR)? Translation?
  • Check the protein content after various induction times and at different expression temperatures.
  • The protein product may be “invisible” due to a number of factors: Hiding in the insoluble fraction, as aggregates, being exported into the medium in a system which isn’t supposed to leak the protein outside the cells.
  • The protein may be produced in high amounts but may be rapidly degraded. Try different expression strains.
  • The DNA may be methylated – check the documentation for your expression system.
  • The mRNA may be unstable. For instance, it could have an extended secondary structure which kinetically hinders the movement of the ribosome, thus leading to premature termination of the translation process. Or translation may be slowed down by rare codons. Both problems could be fixed by gene optimization.
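The first question on the list, whether the sequence itself is intact, is the easiest one to automate once you have the sequencing result. Here is a minimal Python sketch of such a check. The function name is made up, and it assumes that the coding sequence starts with ATG and ends with one of the three standard stop codons, which may not hold for every expression system.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_orf(seq):
    """Return a list of problems found in a coding sequence (empty if none)."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons; trailing one or two bases are ignored here.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not start with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(c in STOP_CODONS for c in codons[:-1]):
        problems.append("contains a premature stop codon")
    return problems

print(check_orf("ATGAAATAA"))     # a clean three-codon reading frame
print(check_orf("ATGTAAAAATAA"))  # stop codon in the middle
print(check_orf("ATGAAATA"))      # one base missing at the end
```

A check like this only catches the obvious damage, such as frameshifts and premature stops. It does not replace comparing the full sequencing read against the designed sequence.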

It is tempting to cut corners and tweak multiple parameters at once. However, this usually complicates the process rather than simplifying it. Be prudent and change one parameter at a time. Debugging an expression problem can be time-consuming and expensive. In many cases, there are no alternatives, though, and it’s still the best option.

Questions? Get in touch with us.