ProteinGPS™ Engineering Technology

Quickly and efficiently design proteins with improved characteristics to find the protein activity you need.

  • Optimize directly for function in the final application
  • Save years of time and millions of dollars
  • No high-throughput (HTP) screens
  • Screen small numbers of variants (50-200) directly for the desired function, ideal for Enzyme Engineering
  • Don’t waste time pursuing false positives: variants identified by HTP screens that do not retain activity in the ‘real’ assay
  • No false negatives lost to screening error or to poor correlation between the HTP screen and the ‘real’ assay
  • No biodiversity collections required, everything is synthesized as needed
  • Sequence-function relationships provide the basis for strong composition-of-matter patent claims

Typical protein engineering methods rely on screening a large number (10^6-10^12 or more) of gene variants with a surrogate high-throughput (HTP) screen to identify initial hits.  Unfortunately, you get what you screen for: the “hit” from the HTP screen often has very little real activity in a lower-throughput assay that is more indicative of the improved functionality for which the protein is being developed.

ProteinGPS™ instead relies on identifying key amino acid substitutions through bioinformatics-based mining of available sequence space and combining those substitutions in an information-maximized variant dataset (usually fewer than 100 unique gene variants).  At that scale, the activity for the commercially relevant function can readily be determined in an indicative assay. DNA2.0 then uses advanced machine learning algorithms to deconvolute the relative contribution of each substitution and to map the megadimensional sequence space contributing to the desired protein activity. We routinely see orders-of-magnitude functional improvement by measuring no more than 100-300 samples.
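To make the idea of deconvoluting substitution contributions concrete, here is a minimal sketch. It is not DNA2.0's algorithm: it assumes a purely additive model fitted by ridge regression, and the design, the activity values, and the names `solve` and `fit_additive_model` are all hypothetical. The toy design is a textbook fractional factorial that packs 7 substitutions into 8 variants.

```python
from itertools import product

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_additive_model(design, activity, ridge=1e-6):
    """Estimate a baseline plus one additive effect per substitution.

    design: one row per variant, 1 if the variant carries the substitution, else 0.
    activity: measured activity per variant. Assumption: effects simply add up.
    """
    rows = [[1.0] + [float(v) for v in row] for row in design]  # leading 1 = baseline
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(1, p):
        xtx[i][i] += ridge  # small penalty on the effects, not on the baseline
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(p)]
    coef = solve(xtx, xty)
    return coef[0], coef[1:]

# Hypothetical toy data: 7 substitutions spread over 8 variants.
design = [[a, b, c, a ^ b, a ^ c, b ^ c, a ^ b ^ c]
          for a, b, c in product((0, 1), repeat=3)]
true_baseline = 1.0
true_effects = [0.5, -0.2, 0.0, 1.5, 0.3, -0.7, 0.1]
activity = [true_baseline + sum(e * v for e, v in zip(true_effects, row))
            for row in design]

baseline, effects = fit_additive_model(design, activity)
ranked = sorted(range(len(effects)), key=lambda i: effects[i], reverse=True)
print("estimated baseline:", round(baseline, 3))
print("substitutions ranked by estimated effect:", ranked)
```

Real data are noisy and substitutions can interact, so a production model would need more variants than unknowns and some form of validation. The sketch only shows why a small, well-spread variant set can be enough to rank substitutions.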

ProteinGPS Engineering Overview

Applying modern engineering principles to protein engineering

The bioengineering technology developed by DNA2.0 is based on mathematical nonlinear systems modeling and optimization algorithms routinely used in such diverse areas as small molecule QSAR, process control design for manufacturing, website optimization, and logistics. These problems all require methods that can analyze systems with high complexity and large numbers of independent impactful variables. Over the past seventy years, mathematicians and engineers have developed algorithms for identifying optimal solutions from data sets that are very small relative to the total potential information space being interrogated. Today, these principles are used in the development of numerous products, from the design of jet engines to the optimization of gasoline formulations to credit card fraud detection. Methods for multidimensional optimization that are now routinely employed in other engineering disciplines contrast starkly with both structure-based protein design and directed evolution, which have no real parallels in other engineering areas.
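A quick calculation shows how small such data sets are relative to the space being searched. The sketch below borrows the 60 substitutions and 96 measured variants from the wheat GST example on this page; treating each substitution as simply present or absent is a simplifying assumption.

```python
from math import comb

n_substitutions = 60  # substitutions studied in the wheat GST example
n_measured = 96       # variants actually synthesized and measured

# Every substitution is either present or absent in a given variant.
total_combinations = 2 ** n_substitutions

# Restricting each variant to at most five substitutions still leaves millions.
up_to_five = sum(comb(n_substitutions, k) for k in range(6))

print(f"all on/off combinations: {total_combinations:.3e}")
print(f"variants with at most 5 substitutions: {up_to_five:,}")
print(f"fraction of the full space measured: {n_measured / total_combinations:.1e}")
```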

I. Transforming Enzyme Engineering with Infologs™

Using independently designed synthetic genes in which substitutions are systematically incorporated (Infologs™) leads to uniform sampling, systematic variance, and unrestricted, information-rich results.  Wheat GST with the ability to detoxify a panel of common herbicides was designed using this patented DNA2.0 bioengineering method. The relative functional contribution of 60 amino acid substitutions against 14 herbicides was quantified using only 96 Infologs, and activity was then dramatically improved by a small set (16) of second-generation Infologs. Check out the full “Using Infologs to Engineer Biological Systems” presentation.
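One simple way to get uniform sampling of substitutions across a variant set can be sketched as follows. This is an illustration, not the patented Infolog design method: the 60 substitutions and 96 variants match the example above, while five substitutions per variant, the shuffled-rounds scheme, and the function name `design_balanced_variants` are assumptions.

```python
import random
from collections import Counter
from itertools import combinations

def design_balanced_variants(n_substitutions, per_variant, n_rounds, seed=0):
    """Return variants (lists of substitution indices) with uniform usage.

    Each round shuffles all substitutions and cuts the shuffled list into
    variants, so every substitution is used exactly once per round.
    """
    if n_substitutions % per_variant != 0:
        raise ValueError("per_variant must divide n_substitutions")
    rng = random.Random(seed)
    variants = []
    for _ in range(n_rounds):
        order = list(range(n_substitutions))
        rng.shuffle(order)
        for start in range(0, n_substitutions, per_variant):
            variants.append(sorted(order[start:start + per_variant]))
    return variants

variants = design_balanced_variants(n_substitutions=60, per_variant=5, n_rounds=8)

# How often is each substitution used, and how often does each pair co-occur?
usage = Counter(s for v in variants for s in v)
pair_usage = Counter(p for v in variants for p in combinations(v, 2))

print(len(variants), "variants; uses per substitution:", set(usage.values()))
print("most frequent pair co-occurs", max(pair_usage.values()), "times")
```

Reshuffling every round keeps any two substitutions from always travelling together, which is what lets their individual contributions be separated later.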

II. Successful Aminotransferase Enzyme Engineering with ProteinGPS™

Researchers at Pfizer and DNA2.0 have published the engineering of an aminotransferase for the biocatalysis of a key chiral intermediate in the synthesis of imagabalin, an advanced anxiolytic drug candidate. The starting wild-type protein, Vfat, is an ω-amino acid:pyruvate transaminase with very weak but detectable catalytic activity toward aliphatic amines. Designing and testing fewer than 450 Vfat variants synthesized by DNA2.0 resulted in an aminotransferase optimized for substrate selectivity and with reaction velocity sufficient for the commercial biocatalysis goal.

Developing algorithms appropriate for engineering proteins

At DNA2.0 we have modified the standard algorithms for engineering complex systems to work with biological systems. The resulting process enables us to deconvolute how substitutions within a protein sequence modify its function. We have combined these algorithms with an integrated query and ranking mechanism to identify appropriate sequence substitutions.
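As a rough illustration of what ranking candidate substitutions can look like, the sketch below counts how often each substitution appears in a set of aligned homologs and ranks by that count. The frequency criterion, the toy sequences, and the function name `rank_candidate_substitutions` are assumptions; the actual query and ranking mechanism is not described here.

```python
from collections import Counter

def rank_candidate_substitutions(parent, homologs, top_n=5):
    """Rank substitutions by how often they occur in pre-aligned homologs.

    Returns (label, count) pairs such as ("I6L", 3), meaning I at position 6
    of the parent is replaced by L in three homologs. Gaps ("-") are skipped.
    """
    counts = Counter()
    for seq in homologs:
        if len(seq) != len(parent):
            raise ValueError("sequences must be pre-aligned to the parent length")
        for pos, (wt, aa) in enumerate(zip(parent, seq), start=1):
            if aa != wt and aa != "-":
                counts[(pos, wt, aa)] += 1
    # Most frequent first; ties broken by position for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(f"{wt}{pos}{aa}", n) for (pos, wt, aa), n in ranked[:top_n]]

# Hypothetical parent sequence and aligned homologs.
parent = "MKTAYIAK"
homologs = ["MKTAYIAR", "MRTAYLAR", "MKSAYLAK", "MKTA-LAR"]

candidates = rank_candidate_substitutions(parent, homologs)
for label, count in candidates:
    print(label, count)
```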

From predicted sequences to testable genes

The conversion of computationally predicted DNA sequences to physically testable genes is powered by our gene synthesis pipeline.  Until recently, the synthesis of individually designed genes was prohibitively expensive.  As a result, the only practical way to obtain combinatorially modified proteins was to make recombinant libraries, which in turn necessitated high-throughput screens.  By instead synthesizing individually designed gene variants, DNA2.0 ensures that amino acid changes are distributed to achieve maximum information content.  This in turn obviates the need for high-throughput screening, allowing us instead to focus on measuring protein properties that are important for the final application.

Webinar: Using ProteinGPS and Infologs to Engineer Biological Systems

ProteinGPS™ relies on identifying key amino acid substitutions through bioinformatics-based mining of available sequence space and combining those substitutions in information-maximized Infologs: synthetic gene variants designed to vary systematically across the searched space. The presentation includes recent case studies. April 19, 2012
View a pdf of the webinar slides.

GenomeGPS and PathwayGPS

The bioengineering technology developed by DNA2.0 can also be used to develop completely novel genomes or to optimize pathways.

PathwayGPS™ and GenomeGPS™ build on DNA2.0’s other GPS systems to explore higher order combinations of multiple genes into functionally improved metabolic pathways.  Our capability for low-cost, high-capacity gene synthesis enables synthesis of multi-component multi-gene pathways up to several hundred kilobases in size.

Systematic, non-correlated variation of control elements such as operators, promoters, and terminators across complex metabolic pathways, combined with simultaneous variation of individual genes to cover a range of expression levels, specificities, and activities, allows sampling of vast areas of metabolic space.  Application of advanced machine learning algorithms then determines each element’s contribution to pathway efficacy within a multitude of complex, interacting enzyme activities.  The elements are then engineered to drive the system to its optimal performance using a minimum number of assays.
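The value of non-correlated variation can be shown with a small sketch. It uses a textbook orthogonal array (nine constructs, four elements, three levels each) rather than any DNA2.0 design; the element names and the reading of levels 0, 1, 2 as weak, medium, strong are hypothetical.

```python
from itertools import combinations, product

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical control elements, each available at three levels (0, 1, 2).
elements = ["operator", "promoter_gene1", "promoter_gene2", "terminator"]

# Orthogonal array: every pair of elements sees each level pair exactly once.
design = [(i, j, (i + j) % 3, (i + 2 * j) % 3)
          for i, j in product(range(3), repeat=2)]

columns = list(zip(*design))
max_abs_corr = max(abs(pearson(columns[a], columns[b]))
                   for a, b in combinations(range(len(elements)), 2))

print(f"{len(design)} constructs instead of {3 ** len(elements)}")
print(f"largest correlation between any two elements: {max_abs_corr:.3f}")
```

Because no two elements vary together, a change in pathway output can be attributed to one element without being confounded by another.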

Modular Libraries allow researchers to build new genetic modules or pathways from basic DNA parts including regulatory elements (promoters, untranslated regions, signal peptides, terminators) and/or coding sequences. Multiple genetic elements can be combined in various predefined arrangements to create new transcriptional units, biochemical pathways, genetic circuits, or develop strains displaying new traits.
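Combining parts in a predefined arrangement is easy to sketch. The part names, the slot order, and the function name `build_transcriptional_units` below are hypothetical and stand in for whatever parts a real library would hold.

```python
from itertools import product

# Hypothetical parts catalogue: each slot lists interchangeable DNA parts.
parts = {
    "promoter": ["P_weak", "P_strong"],
    "utr": ["UTR_a", "UTR_b"],
    "signal_peptide": ["SP_1", "SP_2"],
    "cds": ["geneX"],
    "terminator": ["T_1", "T_2"],
}

# Predefined arrangement: the order in which slots appear in one unit.
arrangement = ["promoter", "utr", "signal_peptide", "cds", "terminator"]

def build_transcriptional_units(parts, arrangement):
    """Yield every combination of parts, one part per slot, in slot order."""
    for combo in product(*(parts[slot] for slot in arrangement)):
        yield dict(zip(arrangement, combo))

units = list(build_transcriptional_units(parts, arrangement))
print(len(units), "possible units; first:", "-".join(units[0].values()))
```

Full enumeration grows multiplicatively with every added part, which is why a designed subset of combinations is usually built and tested rather than the whole library.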

VectorGPS

The bioengineering technology developed by DNA2.0 can also be used to develop completely novel, optimized vectors. DNA2.0 will design and create a new vector that works for your research and experimental needs. No longer will you need to work around vectors that just happen to be in your lab.

VectorGPS™ builds on DNA2.0’s other GPS systems to explore higher order combinations of multiple genes into functionally improved vectors. Our capability for low-cost, high-capacity gene synthesis enables synthesis of multi-component multi-gene pathways up to almost any size.

View Publications using Protein Engineering

Search the DNA2.0 Literature Database, which contains over 800 scientific publications using DNA2.0 technology, for references relevant to your research.

Highlight References

J Am Chem Soc 2013. Improved biocatalysts from a synthetic circular permutation library of the flavin-dependent oxidoreductase Old Yellow Enzyme. Daugherty et al.

Protein Eng Des Sel 2013 26(1):25-33. Redesigning and characterizing the substrate specificity and activity of Vibrio fluvialis aminotransferase for the synthesis of imagabalin. Midelfort, KS. et al.

IBC Antibody Engineering and Therapeutics Poster December, 2011: Strategies for Maximizing Information Content in Protein Libraries

PNAS 2010 107(5):1948-53. Reconstructed evolutionary adaptive paths give polymerases accepting reversible terminators for sequencing and SNP detection. Chen, F. et al.

J Biol Chem 2009 284(39):26229-33. SCHEMA recombination of a fungal cellulase uncovers a single mutation that contributes markedly to stability. Heinzelman, P. et al.

PNAS 2009 106(14):5610-5. A family of thermostable fungal cellulases created by structure-guided recombination. Heinzelman, P. et al.

Protein Eng Des Sel 2008 21:699-707. Protein engineering of improved prolyl endopeptidases for celiac sprue therapy. Ehren, Govindarajan, Morón, Minshull, Khosla.

BMC Biotechnol 2007 7:16. Engineering proteinase K using machine learning and synthetic genes. Liao, Warmuth, Govindarajan, Ness, Wang, Gustafsson, Minshull.

Curr Opin Chem Biol 2005 9:202-9. Predicting enzyme function from protein sequence. Minshull, Ness, Gustafsson, Govindarajan.

Curr Opin Biotechnol 2003 14:366-70. Putting engineering back into protein engineering. Bioinformatic approaches to catalyst design. Gustafsson, Govindarajan, Minshull.

This technology is partially covered by US patents 8005620, 8412461, and 8635029 issued to DNA2.0.

Discover Your Protein Engineering Solution

Consult with a Protein Engineering Specialist today at +1 877 362 8646 or info@DNA20.com to discuss how to move your project forward quickly and affordably.