Google's DeepMind has published an accurate picture of the human proteome

In what has been described as "the most significant contribution that artificial intelligence (AI) has so far made to the advancement of humanity's scientific knowledge", DeepMind, in collaboration with the European Molecular Biology Laboratory (EMBL), has published the most complete database to date of predicted three-dimensional structures of human proteins.

[Translation by Dr. Moshe Nachmani]

Structures of different proteins. Illustration: depositphotos.com

DeepMind (a British artificial intelligence company founded in September 2010 under the name DeepMind Technologies and acquired by Google in 2014; in 2015 it was transferred from Google to Alphabet) announced today a collaboration with the European Molecular Biology Laboratory (EMBL) to produce the most complete and accurate database to date of predicted structural models of the proteins in the human proteome. The database will cover all of the roughly twenty thousand proteins expressed by the human genome, and the data will be openly and freely available to the scientific community. Together with the artificial intelligence system, the database gives researchers in structural biology powerful new tools for examining the three-dimensional structure of any protein, ushering in a new era of biology based on artificial intelligence.

The AlphaFold artificial intelligence system was recognized by the scientific community in 2020 as a solution to the fifty-year-old grand challenge of predicting protein structure. The database built on this system also draws on the discoveries of generations of scientists, from the early pioneers of protein imaging and crystallography to the thousands of prediction specialists and structural biologists who have since devoted years to experimenting with proteins. The database significantly expands the accumulated knowledge of protein structures, more than doubling the number of high-accuracy human protein structures available to researchers. A better understanding of these building blocks of life, which are involved in every biological process in every living thing, will help researchers in a variety of fields to accelerate their work. Last week, the methodology behind the system was published in the prestigious scientific journal Nature, together with open-source code accessible to everyone. Now a second article has been published that provides the most complete picture of the proteins that make up the human proteome, along with data for twenty other organisms that are important in biological research.

Artificial intelligence to accelerate the pace of scientific discoveries

"Our goal at DeepMind has always been to develop artificial intelligence and then use it as a tool to help accelerate the pace of scientific discovery itself, thus advancing our understanding of the world around us," said Demis Hassabis, the company's founder and CEO. "We used our AlphaFold system to create the most complete and accurate picture of the human proteome. We believe that this is the most significant contribution that artificial intelligence (AI) has made so far to the advancement of humanity's scientific knowledge, and a wonderful demonstration of the kinds of benefits that this field can bring to humanity."

The ability to predict a protein's structure computationally from its amino acid sequence, rather than determining it experimentally over years with expensive and arduous methods, is already helping researchers achieve within a few months results that previously took many years. "The AlphaFold database is a great example of the virtuous circle of open science," said Edith Heard, EMBL's director general. "The system was trained on data from public resources built by the scientific community, so it makes sense for its results to be published freely and accessibly. Sharing the predictions openly and accessibly will allow researchers to gain new insights and make important discoveries in the field. I truly believe that this system is a breakthrough in the life sciences, and I am proud to enable open access to this great resource."

The system is already helping researchers in the field: one research center is using it to develop enzymes for recycling some of the most polluting plastics; researchers from the University of Colorado are using it to study antibiotic resistance in bacteria; and researchers from the University of California are using it to deepen their understanding of the biology of the coronavirus. In addition to the human proteome, the database includes protein structures from other important organisms commonly used in biological research, such as Escherichia coli, the fruit fly, the mouse, the zebrafish, the malaria parasite, and the tuberculosis bacterium, for a total of some 350,000 structures.

The database and the system will be updated periodically as the systems are improved and upgraded, with plans to expand coverage significantly to almost every sequenced protein known to science, more than a hundred million structures.

2 comments

  1. There is a young researcher at the Weizmann Institute who does this. Won the Blavatnik Award this year. I don't remember his name.
