GeneGPS™ Optimizes Expression in Your System.

DNA2.0 has employed a variety of multivariate analysis strategies to identify relationships between gene design variables and protein expression. We have ongoing studies in bacteria, yeasts, mammalian cells and in vivo, plants, fungi, insects, and cell-free systems. DNA2.0 is using the results of this research to create patented gene design algorithms for numerous host systems. Genes designed with these new GeneGPS™ algorithms routinely produce 10-100 times more protein than genes designed with competing methods. Results from these investigations can be seen in the DNA2.0 PepTalk 2014 presentation and below.

An example from our work published in PLoS ONE shows improved expression in a bacterial host system.

GeneGPS E. coli host system data

Variants expressed in E. coli: Expression of polymerase variants (red squares) and scFv antibody variants (blue diamonds) is shown. Each point shows data from a gene with a different codon bias. Genes designed using DNA2.0's advanced algorithms are shown in green. Black symbols show the two major algorithms used by our competitors: matching the E. coli genome bias (filled black symbols) or matching the bias found in highly expressed genes (open black symbols).

PLoS ONE 2009, 4:e7002. Design parameters to control synthetic gene expression in Escherichia coli. Welch et al.

DNA2.0 synthetic genes are available in vectors that are immediately usable for bacterial protein expression.
DNA2.0 synthetic genes are available in vectors that are immediately usable for S. cerevisiae protein expression.

An example from our ongoing partnership with Dr. Robert Stroud at the University of California, San Francisco, shows improved expression in a yeast host system.

GeneGPS S. cerevisiae data

PLS Model of S. cerevisiae Expression: With these data, we can correlate expression with codon usage using a predictive PLS model that explains the expression variation of hybrid genes as well as that of the initial variants.
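To make the model inputs concrete, here is a minimal Python sketch of how per-gene codon usage frequencies might be computed before being passed to a PLS model. The helper name `codon_usage` and the example sequence are illustrative assumptions, not DNA2.0's actual pipeline. Frequencies are expressed relative to the synonymous codons for each amino acid, which is one common convention among several.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_usage(cds):
    """Return {codon: fraction among synonymous codons} for one coding sequence.

    Hypothetical helper: assumes `cds` is an in-frame DNA string containing
    only A, C, G and T. Trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(codons)
    # Total number of codons in this gene that encode each amino acid.
    per_amino_acid = Counter()
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {
        codon: n / per_amino_acid[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


# Made-up example: Met once, then Leu three times (CTG twice, TTA once).
usage = codon_usage("ATGCTGCTGTTA")
print(usage)
```

Each gene variant then becomes one row of such frequencies, and the measured expression level is the value the model tries to predict.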

GeneGPS S. cerevisiae host system gel

Human Membrane Protein expressed in S. cerevisiae

Total protein in the membrane fraction was analyzed. The WT gene shows no detectable expression; the top expression level is ~1 mg/L.

GeneGPS model for K. lactis

PLS Model for K. lactis
Initial Design Set, ~10 kDa secreted protein

DNA2.0 synthetic genes are available in vectors that are immediately usable for Pichia protein expression.

GeneGPS Pichia PLS Model

PLS Model for TrCBH2 in P. pastoris

  • Model based on total secreted enzyme activity
  • Strong MeOH-induced AOX1 derived promoter
  • Similar models obtained for each promoter
  • No significant correlation with CAI or GC percent
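For reference, GC percent and the codon adaptation index (CAI) named in the last bullet are simple per-gene summary statistics. The Python sketch below shows how they are commonly computed; the function names and the example weights are illustrative assumptions, and real CAI weights would come from a reference set of genes for the host in question.

```python
import math


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def cai(cds, weights):
    """Codon adaptation index: geometric mean of per-codon relative adaptiveness.

    `weights` maps codon -> w, where w is the codon's frequency in a reference
    gene set divided by that of the most frequent synonymous codon.
    Codons missing from `weights` (here: anything not listed) are skipped.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights for two leucine codons only (illustrative numbers).
example_weights = {"CTG": 1.0, "TTA": 0.25}
print(gc_percent("ATGCTGCTGTTA"))
print(cai("ATGCTGCTGTTA", example_weights))
```

Both statistics collapse a gene to a single number, whereas a model built on individual codon frequencies keeps one variable per codon.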
DNA2.0 synthetic genes are available in vectors that are immediately usable for mammalian protein expression.

GeneGPS Mammalian Cell Preferences Model

Gene design preferences for HEK293 and CHO cell lines. Performance of the PLS model for gene sets covering 6 different proteins (267 genes) developed on this grant. Variables are restricted to the codon usage frequencies of the genes. Correlation coefficients and significance are indicated for fits to the full data and for cross-validation (prediction of variants left out of the training set). Measured values plotted for each protein are normalized to the highest-expressing gene variant for that protein.
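The normalization and cross-validation steps described in this caption can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the model behind the figure: it uses a single-component PLS1 fit (the number of components actually used is not given here), leave-one-out as the cross-validation scheme, and a made-up five-variant data set with two codon frequency variables. All function names are hypothetical.

```python
import math


def normalize_to_best(values):
    """Scale one protein's measurements so the top variant equals 1.0."""
    top = max(values)
    return [v / top for v in values]


def fit_pls1(X, y):
    """Fit a one-component PLS1 model and return a predict(x) function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores along that direction, then regress y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(x):
        score = sum((x[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict


def leave_one_out(X, y):
    """Predict each variant from a model trained on all other variants."""
    preds = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(model(X[i]))
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up data: usage of two synonymous codons in five variants of one protein.
X = [[0.1, 0.9], [0.3, 0.7], [0.5, 0.5], [0.7, 0.3], [0.9, 0.1]]
y = normalize_to_best([10.0, 30.0, 50.0, 70.0, 90.0])
cv_predictions = leave_one_out(X, y)
print(pearson_r(y, cv_predictions))
```

Comparing the correlation on the full fit with the correlation on left-out variants is what separates a model that merely describes its training set from one that predicts new designs.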

GeneGPS Fungus Model

PLS Model for Fungal Host: Antibody expressed in a fungal host.
The model has already proven useful for optimizing a second gene in this host.

GeneGPS Plant model gel

Expression of Variants in Plant Host: The DNA2.0 naive variant set yielded several variants that far outperformed both our partner’s previous best and a competitor's patented method.