Redesigning proteins for tighter control over binding affinity and specificity can enable customized therapeutics with fewer side effects, highly sensitive diagnostic devices, efficient biocatalysts, targeted drug delivery systems, and eco-friendly bioremediation strategies. Existing protein redesign methods, however, are slow and laborious. Researchers have now introduced a streamlined technique, ProteinReDiff, that uses artificial intelligence to speed up the redesign of ligand-binding proteins.
Proteins in cells interact with molecules known as ligands, and these interactions support many functions vital to life, such as cell signaling and enzyme activity. In biotechnology and medicine, modifying proteins to improve their binding affinity and specificity can lead to more personalized therapies with fewer side effects, highly responsive diagnostic tools, efficient biocatalysis, targeted drug delivery systems, and environmentally sustainable pollution cleanup.
Existing methods for protein redesign have limitations. Traditional approaches often rely on slow trial-and-error techniques, while many computational design frameworks require detailed knowledge of the protein structure and the ligand-binding site.
A research team led by Truong Son Hy, Ph.D., at the University of Alabama at Birmingham, introduces a more straightforward technique known as ProteinReDiff, which applies artificial intelligence to expedite the redesign process of ligand-binding proteins.
“Our framework facilitates the creation of high-affinity ligand-binding proteins without the need for detailed structural data,” explained Hy, who is an assistant professor in UAB’s Department of Computer Science. “We depend solely on the original protein sequences and ligand SMILES strings.”
SMILES, or Simplified Molecular Input Line Entry System, is a long-established notation that describes the structure of a molecule using only ASCII characters, which makes it easy for computers to read.
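As a brief illustration of the notation, the sketch below lists a few familiar molecules with commonly cited SMILES strings and runs a shallow plausibility check on each (ASCII only, balanced parentheses, paired ring-closure digits). The helper name `looks_like_smiles` is hypothetical, and the check is not a real SMILES parser; genuine validation would need a cheminformatics toolkit.

```python
from collections import Counter

# Familiar molecules with commonly cited SMILES strings.
EXAMPLES = {
    "ethanol": "CCO",
    "benzene": "c1ccccc1",
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
}


def looks_like_smiles(s: str) -> bool:
    """Shallow plausibility check (hypothetical helper, NOT a SMILES parser).

    Ignores bracket atoms such as [13C] and two-digit %nn ring closures,
    so it is only suitable for simple strings like the examples above.
    """
    if not s or not s.isascii():
        return False

    # Branches are written in parentheses, which must balance.
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    if depth != 0:
        return False

    # A single-digit label opens a ring and the same digit closes it,
    # so every digit should appear an even number of times.
    digit_counts = Counter(ch for ch in s if ch.isdigit())
    return all(n % 2 == 0 for n in digit_counts.values())


if __name__ == "__main__":
    for name, smiles in EXAMPLES.items():
        print(f"{name:10s} {smiles:32s} plausible={looks_like_smiles(smiles)}")
```

The point of the example is only that an entire molecule, rings and branches included, fits in one short line of plain text, which is what lets a model take a ligand as a simple string input.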
“An essential aspect of our technique is blind docking, which estimates how the redesigned protein interacts with its ligand without requiring pre-set information about binding sites,” Hy stated. “This efficient method greatly diminishes dependence on detailed structural knowledge, broadening the potential for exploring protein-ligand interactions based on sequences.”
The research team, including Viet Thanh Duy Nguyen from FPT Software AI Center in Ho Chi Minh City, Vietnam, and Nhan D. Nguyen from the University of Chicago, trained the ProteinReDiff artificial intelligence framework on many known protein structures and their corresponding ligands. The framework redesigns a given protein-ligand pair by randomly altering amino acids in the protein sequence, then using a denoising diffusion model to capture the joint distribution of the ligand and protein-complex structures.
Hy and his collaborators compared ProteinReDiff with eight other computational protein design models on their input and output features, and on how well they improved ligand binding for selected ligand-protein pairs.
For inputs, six of the eight comparison models used protein structural data, while only ProteinReDiff and another model known as DPL relied purely on protein sequences and ligand SMILES strings. For outputs, only ProteinReDiff produced new designs that include protein sequences, protein structures, and ligand structures.
For performance, the team evaluated proteins redesigned by ProteinReDiff from the selected ligand-protein pairs against those from the other models on three measures: ligand-binding affinity, amino acid sequence diversity, and structural integrity. ProteinReDiff showed better ligand-binding affinity than the other models.
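The "randomly altering amino acids" step can be pictured with a minimal sketch like the one below, which assumes the alteration amounts to replacing a random fraction of residues with a placeholder. The function name `mask_residues`, the `mask_ratio` parameter, the `X` token, and the toy sequence are all illustrative assumptions, not ProteinReDiff's actual interface, and the diffusion model itself is not shown.

```python
import random


def mask_residues(sequence: str, mask_ratio: float, seed: int = 0,
                  mask_token: str = "X"):
    """Replace a random fraction of residues with a placeholder.

    Illustrative only: names and parameters are assumptions made for this
    sketch, not taken from the ProteinReDiff code.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be between 0 and 1")

    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = round(mask_ratio * len(sequence))
    positions = set(rng.sample(range(len(sequence)), n_mask))

    masked = "".join(
        mask_token if i in positions else residue
        for i, residue in enumerate(sequence)
    )
    return masked, sorted(positions)


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQRQISFVKSHFS"  # arbitrary 20-residue toy sequence
    masked, positions = mask_residues(toy_sequence, mask_ratio=0.25, seed=1)
    print(toy_sequence)
    print(masked)
    print("masked positions:", positions)
```

In a setup like this, a generative model would then be asked to propose residues for the masked positions while taking the ligand into account, which is one way to obtain varied redesigns from a single starting sequence.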
“Our model excels in enhancing ligand binding affinity using only initial sequences of proteins and ligand SMILES strings, avoiding the necessity for detailed structural data,” Hy stated. “These results unveil new opportunities for modeling protein-ligand complexes, showcasing the considerable potential of ProteinReDiff across various biotechnology and pharmaceutical sectors.”
ProteinReDiff stands for Protein Redesign based on Diffusion Models. It incorporates improvements inspired by the representation learning modules of AlphaFold2, the architecture used for computational protein structure prediction. These improvements enable the framework to capture complex protein-ligand interactions, improve the accuracy of binding affinity predictions, and redesign ligand-binding proteins more precisely.