Creating an entire bacterial genome with a compressed genetic code

Computational design of an entire bacterial genome with a compressed genetic code. © Larissa Ulisko

Jason Chin’s group in the LMB’s PNAC Division have, for the first time, synthesised the entire genome of a commonly used model organism, the bacterium E. coli. Only one entire genome had been synthesised before: that of the bacterium Mycoplasma, which consists of approximately 1 million bases. Over the last 5 years, Jason’s group have developed a robust method for assembling large pieces of synthetic DNA. This has enabled them to synthesise the entire E. coli genome, which is approximately four times the size of the Mycoplasma genome. The new synthetic genome was computationally designed to contain just 61 codons, rather than the 64 codons that are used in all known terrestrial life. The success of this project shows for the first time that living organisms can be created with compressed genetic codes.

When the genetic code is translated to produce proteins, the series of bases is read in sets of three, known as codons. There are 64 possible combinations of the four bases in sets of three, so there are 64 codons, all of which are used in biology. Of these, 61 code for the 20 different amino acids used to build proteins and three function as full-stops to terminate production of a protein. This means that most of the amino acids are coded for by more than one codon; different codons that are used for the same amino acid are known as synonymous codons.
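The codon arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and the identity of the three stop codons (TAA, TAG and TGA in the standard genetic code) is supplied here as background knowledge, since the article names only TAG and TAA.

```python
from itertools import product

BASES = "TCAG"  # the four DNA bases

# Assumption: stop codons of the standard genetic code.
# The article itself names only TAG and TAA.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Every ordered set of three bases: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that code for an amino acid (everything that is not a stop).
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons), len(sense_codons), len(STOP_CODONS))  # 64 61 3
```

With 61 sense codons shared between 20 amino acids, most amino acids must be served by more than one codon, which is what makes synonymous replacement possible.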

The researchers computationally designed an E. coli genome in which every instance of two particular codons for the amino acid serine (TCG and TCA) was replaced with a synonymous codon (AGC and AGT) and every instance of the stop codon TAG was replaced with its synonym TAA. In total, these recoding rules required 18,214 codons to be recoded. While there is an astronomically large number of theoretical ways to compress the genetic code in the genome, the researchers had previously identified this as a promising rule for recoding serine codons, following their investigation of different recoding rules on a small region of the genome.
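As a rough illustration of what such a recoding rule does, the sketch below applies the three substitutions to a single in-frame coding sequence. It is not the group's design software: the function name and the toy sequence are hypothetical, the pairing TCG to AGC and TCA to AGT is assumed from the order in which the codons are listed above, and the complications of a real genome (for example, genes that overlap one another) are ignored.

```python
# Assumed pairing, following the order the codons are listed in the text.
RECODING_RULES = {
    "TCG": "AGC",  # serine -> synonymous serine codon
    "TCA": "AGT",  # serine -> synonymous serine codon
    "TAG": "TAA",  # stop   -> synonymous stop codon
}


def recode(coding_sequence):
    """Recode one in-frame coding sequence, codon by codon.

    Returns the recoded sequence and the number of codons changed.
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    recoded = []
    changed = 0
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        new_codon = RECODING_RULES.get(codon, codon)
        if new_codon != codon:
            changed += 1
        recoded.append(new_codon)
    return "".join(recoded), changed


# Toy example (not a real gene): ATG TCG TCA TAG
print(recode("ATGTCGTCATAG"))  # ('ATGAGCAGTTAA', 3)
```

Reading the sequence strictly in steps of three matters: a TCG that straddles two neighbouring codons is not a serine codon and is left untouched. Once every target codon has been replaced, three of the 64 codons no longer appear in any gene, leaving the 61 used by the synthetic genome.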

Julius Fredens, Kaihang Wang, Daniel de la Torre, Louise Funke, and Wesley Robertson, researchers from Jason’s group, implemented this design using REXER (replicon excision enhanced recombination), a technique that Kaihang Wang had previously developed in the group. This process involved generating sequences of synthetic DNA, each about 1/40th of the total genome size, and driving replacement of the bacterium’s natural DNA with the corresponding synthetic version. The step could then be repeated to gradually replace the natural E. coli genome with the synthetic version.

The researchers started this process from 8 different points in the E. coli genome in parallel to produce 8 different strains, each with approximately 1/8th of its genome replaced by the synthetic version. These were then combined by taking advantage of an engineered version of a natural gene transfer process in bacteria, called conjugation. This resulted in an E. coli bacterium that contained a fully synthetic genome corresponding to the team’s computational design.
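The scale of this assembly strategy can be estimated from the approximate figures given in the article. The numbers below are round values for illustration only, not the exact fragment counts or sizes used in the study, and all names are hypothetical.

```python
# Round figures taken from the article; all are approximations.
MYCOPLASMA_BASES = 1_000_000         # "approximately 1 million bases"
ECOLI_BASES = 4 * MYCOPLASMA_BASES   # "approximately four times" as large
N_FRAGMENTS = 40                     # each fragment is "about 1/40th" of the genome
N_SECTIONS = 8                       # 8 strains built in parallel

fragment_bases = ECOLI_BASES // N_FRAGMENTS        # synthetic DNA per replacement step
section_bases = ECOLI_BASES // N_SECTIONS          # synthetic DNA per partial strain
steps_per_section = N_FRAGMENTS // N_SECTIONS      # replacement steps per strain

print(fragment_bases)     # 100000
print(section_bases)      # 500000
print(steps_per_section)  # 5
```

On these rough numbers, each replacement step swaps in about 100,000 bases, and working on 8 sections in parallel means each strain needs only around 5 consecutive steps rather than 40.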

A fundamental aim of synthetic biology is to reprogram the genetic code of cells so that non-canonical amino acids, with new and diverse properties, can be incorporated into proteins. This would allow production of a wide range of new molecules that could benefit medicine and biotechnology. Previous work to reassign codons for non-canonical amino acids has commonly relied on outcompeting the natural decoding system used by the cells. Jason’s group have now shown that it is possible to remove the tRNAs and release factors that decode the three codons that have been removed from the genome. This may enable these codons to be cleanly reassigned and facilitate the incorporation of multiple non-canonical amino acids, greatly expanding the scope for using non-canonical amino acids as unique tools for biological research. The group plan to combine these advances with other recent advances, including their work on a “stapled” ribosome, to enable the encoded cellular biosynthesis of non-canonical biopolymers.

Interestingly, this codon reassignment also has the consequence that the new synthetic E. coli should not be able to decode DNA from any other organism, and therefore it should not be possible to infect it with a virus. E. coli is already an important workhorse of biotechnology and biological research. This study is the first time any commonly used model organism has had its genome designed and fully synthesised, and the synthetic version could become an important resource for future development of new types of molecules. This monumental paper, which Jason’s group have been working towards for several years, was made possible by the hard work of many members of the group, as well as assistance from the LMB’s Mass Spectrometry and Light Microscopy facilities.

The work was funded by the MRC, ERC, the Medical Research Foundation, and the Lundbeck Foundation.

Further references

Total synthesis of Escherichia coli with a recoded genome. Fredens, J., Wang, K., de la Torre, D., Funke, LFH., Robertson, WE., Christova, Y., Chia, T., Schmied, WH., Dunkelmann, D., Beránek, V., Uttamapinant, C., Gonzalez Llamazares, A., Elliott, TS., Chin, JW. Nature [Epub ahead of print]
Jason’s group page
Previous Insight on Research: New technologies enable systematic recoding of genomes
Previous Insight on Research: Making a cell-based factory for polymer synthesis
Nature News and Views: Construction of an Escherichia coli genome with fewer codons sets records