
Recent Advances In The Protein Folding Problem

Protein folding in the cell often relies on chaperonins, naturally occurring cellular nanomachines that help fold many critical proteins in all human and animal cells. Understanding protein folding is important because a protein must assume the correct three-dimensional structure to function properly.

Different classes of chaperones work together in elaborate cooperative networks and also ensure that potentially damaging misfolded polypeptides are cleared from the cell. Such misfolded proteins would otherwise cause a cascade of cellular damage and ultimately lead to widespread cell death. The term “proteomics” refers to the large-scale, comprehensive study of the proteome encoded by a given genome, including the abundances of its proteins, their variations and modifications, and their interacting partners and networks, with the aim of understanding the cellular processes involved.

Hundreds of enzymes depend on fully functioning chaperones. As our population grows older, an increasing socio-economic burden stems from a class of diseases caused by protein misfolding and protein aggregation. Millions of Americans suffer from the most common of these: Alzheimer’s disease, Parkinson’s disease, and Huntington’s disease. In addition to such neurodegenerative diseases, folding defects play important roles in stroke, various types of cancer, and cataract formation. In all these diseases, protein chains that are required for healthy cell and organ function can misfold and ultimately aggregate into toxic fibers and large complexes.

Despite recent technological advances in proteomics, comprehensively characterizing an entire proteome remains a challenge. The difficulty lies in a proteome’s greater complexity compared with its genome, and it argues for the continued development of technologies and platforms. For example:

AlphaFold

Figuring out what shapes proteins fold into is known as the “protein folding problem.”

A project called AlphaFold entered CASP13 (the 13th Critical Assessment of protein Structure Prediction) in 2018 and achieved the highest accuracy among participants. Afterwards, the team published a paper on its CASP13 methods in Nature, with associated code, which has gone on to inspire other work and community-developed open source implementations. New deep learning architectures have since driven changes in the team’s methods for CASP14, enabling it to achieve unparalleled levels of accuracy. These methods draw inspiration from the fields of biology, physics, and machine learning, as well as the work of many scientists in the protein folding field over the past half-century.

A folded protein can be thought of as a “spatial graph”, where residues are the nodes and edges connect the residues in close proximity. This graph is important for understanding the physical interactions within proteins, as well as their evolutionary history. For the latest version of AlphaFold, used at CASP14, the team created an attention-based neural network system, trained end-to-end, that attempts to interpret the structure of this graph, while reasoning over the implicit graph that it’s building. It uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph.
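To make the “spatial graph” idea concrete, here is a minimal sketch in Python. It is not AlphaFold’s code. It only illustrates how residues can be treated as nodes, with an edge between any two residues whose representative atoms lie within a distance cutoff. The function name `contact_graph`, the 8 Å cutoff (a commonly used contact threshold, assumed here), and the toy coordinates are all illustrative assumptions rather than anything taken from the AlphaFold system.

```python
import math
from itertools import combinations


def contact_graph(coords, cutoff=8.0):
    """Build a residue contact graph from 3-D coordinates.

    coords: list of (x, y, z) tuples, one representative atom per residue,
            in angstroms.
    cutoff: distance threshold for drawing an edge. The 8.0 default is an
            assumption made for this sketch, not an AlphaFold setting.
    Returns a dict mapping each residue index to the set of its neighbours.
    """
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        # Residues are connected when they are close together in space,
        # no matter how far apart they are along the chain.
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Hypothetical toy coordinates for a five-residue chain that folds back
# on itself. These are not taken from a real structure.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

graph = contact_graph(toy_coords)
for residue, neighbours in graph.items():
    print(residue, sorted(neighbours))
```

In this toy chain, residue 0 ends up connected to residue 4 even though the two are far apart in sequence, because the chain bends back and brings them close in space. A structure prediction system has to infer this kind of long-range relationship from the sequence alone.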

“It’s a very substantial advance,” says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. “It’s something I simply didn’t expect to happen nearly this rapidly. It’s shocking, in a way.” Quote source: Technology Review.

Recent Advances In The Protein Folding Problem was last modified: December 2nd, 2020 by Staff