A fundamental concept taught in every basic biology course is that proteins are made up of sequences of 20 different amino acids, strung together much like letters in words. However, researchers aiming to create biological molecules with novel functions have often felt constrained by these 20 standard building blocks, prompting them to seek ways to include new components, known as non-canonical amino acids, in their proteins.
Scientists at Scripps Research have now developed a simpler method for incorporating non-canonical amino acids into proteins. The strategy, published in Nature Biotechnology on September 11, 2024, uses four RNA nucleotides instead of the conventional three to encode each new amino acid.
“Our objective is to create proteins with customized functionalities for uses in a variety of fields from bioengineering to drug discovery,” explains senior author Ahmed Badran, PhD, an assistant professor of chemistry at Scripps Research. “Incorporating non-canonical amino acids into proteins using this novel technique brings us closer to achieving that objective.”
For a cell to synthesize a specific protein, it must convert a strand of RNA into a sequence of amino acids. Every set of three RNA nucleotides, termed a codon, corresponds directly to one amino acid. However, many amino acids can be represented by more than one codon; for example, both the RNA sequences UAU and UAC correspond to the amino acid tyrosine. Transfer RNAs (tRNAs), which are small molecules, are responsible for connecting each amino acid to its matching codons.
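The codon-reading step described above can be illustrated with a short Python sketch. It is a toy model rather than anything from the study: the table lists only a handful of codons from the standard genetic code, and the input sequence is made up for illustration.

```python
# A small excerpt of the standard genetic code (RNA codon -> amino acid).
# Only the codons needed for this illustration are listed.
CODON_TABLE = {
    "AUG": "Met",
    "UAU": "Tyr",
    "UAC": "Tyr",   # a second codon for tyrosine
    "GGC": "Gly",
    "UAA": "Stop",
}

def translate(rna):
    """Read the RNA three nucleotides at a time and return the amino acids."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein

# Made-up sequence: AUG UAU GGC UAC UAA
print(translate("AUGUAUGGCUACUAA"))  # ['Met', 'Tyr', 'Gly', 'Tyr']
```

Both UAU and UAC yield tyrosine here, which is the redundancy that codon-reassignment strategies exploit.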
In recent years, scientists striving to introduce entirely new amino acids into proteins have devised methods to reassign codons. For instance, by modifying the tRNA for UAU, researchers could link this codon to a new amino acid, causing the cell to interpret UAU as something other than tyrosine. However, every instance of UAU in the cell’s genome would then have to be changed to UAC to stop the new amino acid from being mistakenly incorporated into countless other proteins where it does not belong.
“Establishing free codons via complete genome recoding can be an effective strategy, yet it poses significant challenges, as it demands substantial resources to engineer new genomes,” notes Badran. “Additionally, it can be tough to predict how such codon modifications affect genome stability and the production of host proteins.”
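To give a sense of what freeing up a codon involves, here is a minimal Python sketch of the UAU-to-UAC swap on a single made-up coding sequence. The function name `recode` and the example sequences are hypothetical; a real recoding effort would have to apply a change like this across every gene in a genome and deal with overlapping genes and regulatory sequences, none of which is modeled here.

```python
def recode(rna, old="UAU", new="UAC"):
    """Swap one codon for a synonymous one, respecting the reading frame.

    The sequence is split into in-frame triplets first, so an out-of-frame
    occurrence of `old` (one that spans two codons) is left untouched.
    """
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return "".join(new if codon == old else codon for codon in codons)

# In-frame UAU codons are replaced: AUG UAU GGC UAU UAA
print(recode("AUGUAUGGCUAUUAA"))  # AUGUACGGCUACUAA

# Here "UAU" appears only across a codon boundary (AUG GUA UGC),
# so the sequence is returned unchanged.
print(recode("AUGGUAUGC"))  # AUGGUAUGC
```

Even in this toy form, the swap has to track the reading frame of each gene, which hints at why doing it genome-wide is resource-intensive.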
Badran and his team aimed to devise a straightforward plug-and-play approach that would incorporate the desired non-canonical amino acid(s) only at specific locations within a target protein, without interfering with the cell’s normal functions or requiring the whole genome to be edited. That meant using a tRNA that was not already assigned to any amino acid. Their solution was a four-nucleotide codon.
The team recognized that in certain instances, such as bacteria rapidly evolving to resist antibiotics, four-nucleotide codons had emerged naturally. They therefore investigated what prompts cells to read a four-nucleotide codon instead of the typical three. They found that the sequences surrounding the four-base codon were crucial: when the neighboring codons were frequently used ones, the cell was better able to read the four-nucleotide codon and incorporate a non-canonical amino acid.
Badran’s group then tested whether they could alter the sequence of a single gene so that it contained a new four-nucleotide codon the cell would read correctly. The approach worked: when the team flanked a target site with commonly used three-letter codons and maintained sufficient levels of the four-nucleotide tRNA, the cell incorporated whichever new amino acid was attached to the corresponding four-letter tRNA. The research team repeated the experiment with 12 different four-nucleotide codons, then used the method to design more than 100 new cyclic peptides, known as macrocycles, each containing up to three non-canonical amino acids.
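The idea of reading a four-letter codon at one designed site can be sketched as a toy decoder in Python. Everything specific here is an assumption made for illustration: the quadruplet AGGA, the label `ncAA1`, the flanking codons and the example gene are placeholders, not the codons or sequences used in the study. The sketch also ignores the competition between three-letter and four-letter reading that happens in a real cell.

```python
# Ordinary three-letter codons (excerpt of the standard genetic code).
TRIPLETS = {"AUG": "Met", "CUG": "Leu", "GGC": "Gly", "UAA": "Stop"}

# Hypothetical four-letter codon paired with a hypothetical
# non-canonical amino acid, standing in for an engineered tRNA.
QUADRUPLETS = {"AGGA": "ncAA1"}

def translate(rna):
    """Read triplets by default; read four letters at a known quadruplet."""
    protein, i = [], 0
    while i + 3 <= len(rna):
        quad = rna[i:i + 4]
        if quad in QUADRUPLETS:
            protein.append(QUADRUPLETS[quad])
            i += 4  # the reading frame advances by four at this site
            continue
        residue = TRIPLETS[rna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
        i += 3
    return protein

# Made-up gene: AUG CUG AGGA CUG GGC UAA
print(translate("AUGCUGAGGACUGGGCUAA"))
# ['Met', 'Leu', 'ncAA1', 'Leu', 'Gly']
```

Only the one designed site is read four letters at a time; the rest of the gene is decoded normally, which is the sense in which the change stays confined to a single gene.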
“These cyclic peptides resemble bioactive small molecules that can be found in nature,” says Badran. “By leveraging the programmability of protein synthesis and the variety of available building blocks through this approach, we can create novel small molecules that have promising applications in drug discovery.”
He adds that, compared with earlier methods for incorporating non-canonical amino acids, the new strategy is easier to use because it requires modifying just one gene rather than the cell’s entire genome. Furthermore, more non-canonical amino acids could be used in a single protein because there are more possible four-nucleotide codons than three-nucleotide codons.
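The difference in codon numbers follows from simple counting: with four RNA nucleotides, there are 4 × 4 × 4 = 64 possible three-letter codons and 4 × 4 × 4 × 4 = 256 possible four-letter ones. The short Python sketch below enumerates them. It counts raw sequences only and says nothing about how many four-letter codons can actually be decoded efficiently in a cell.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

def count_codons(length):
    """Count every possible RNA codon of the given length."""
    return sum(1 for _ in product(NUCLEOTIDES, repeat=length))

print(count_codons(3))  # 64
print(count_codons(4))  # 256
```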
“Our findings indicate that it is now feasible to efficiently and effectively integrate non-canonical amino acids at various sites in a wide range of proteins,” notes Badran. “We are enthusiastic about the potential this holds for our continuing research and for sharing this ability with the broader scientific community.”
He highlights that this technique might serve to re-engineer existing proteins or generate entirely new ones, which could be beneficial across various sectors, including healthcare, manufacturing, and chemical detection.
This research was supported by funding from the National Institutes of Health (DP5-OD024590), the Research Corporation for Science Advancement, the Sloan Foundation (G-2023-19625), the Thomas Daniel Innovation Fund (627163_1), an Abdul Latif Jameel Water and Food Systems Lab Grand Challenge Award (GR000141-S6241), a Breakthrough Energy Explorer Grant (GR000056), the Foundation for Food & Agriculture Research (28-000578), a Homeworld Collective Garden Grant (GR000129), the Army Research Office (81341-BB-ECP), the Hope Funds for Cancer Research (HFCR-23-03-01), a Skaggs-Oxford Scholarship, and a Fletcher Jones Foundation Fellowship.