Researchers have developed an algorithm that reads our DNA like a human language

Scientists from Ben-Gurion University have developed an algorithm that locates the DNA sequences that allow the hereditary material to be replicated in the cell. In the future, this multidisciplinary method could enable the design of innovative, personalized drugs tailored to need

The findings were published in the journal Nucleic Acids Research.

The operating instructions of every living cell are stored in our hereditary material, the DNA. A text is made up of letters that come together to form words and sentences that tell a meaningful story. In the same way, the chemical composition of DNA, which is itself written in letters, makes it possible to store coded operating instructions for the living cell. In effect, DNA is a text in a non-human language that holds the instructions necessary to sustain life.

Admittedly, the language encoded in DNA is not human, but it meets the conditions of semantics and syntax, so the tools we use today to analyze texts can also be used to analyze biochemically significant DNA sequences inside the cell. Just as one can find words with a similar context in a long text, or a group of words that share the same root, it is possible to find meaningful sequences in DNA. This information enables a deeper understanding of the biochemical processes in cells that depend on the meaning of the DNA sequence.
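To make the text analogy concrete, here is a minimal Python sketch that treats a DNA string as text by cutting it into overlapping "words" of a fixed length (often called k-mers) and then looks for words that two sequences have in common. The article does not describe how the researchers' algorithm tokenizes DNA, so the word length, the function names `kmers` and `shared_words`, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k=3):
    """Split a DNA string into overlapping 'words' of length k."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def shared_words(seq_a, seq_b, k=3):
    """Return the set of k-mers that occur in both sequences."""
    return set(kmers(seq_a, k)) & set(kmers(seq_b, k))


# Toy sequences invented for illustration; they are not from the study.
text = "ATGGTCATGGTC"
counts = Counter(kmers(text))          # how often each 'word' appears
print(counts.most_common(3))
print(sorted(shared_words("ATGGTCA", "CCGGTCT")))
```

Counting words and comparing vocabularies is one of the simplest text-analysis techniques; it is shown here only to illustrate the idea of reading DNA as text, not as the method used in the study.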

Properties inherent in the DNA sequence influence its interactions with proteins. DNA can recruit a key protein, primase, to replicate the genetic code. Primase binds specifically to a particular sequence thanks to the binding properties of that sequence. These properties give protein-specific recognition sequences semantic and syntactic meaning, much like words that share a common root, which makes it possible to identify sequences with similar features.

To understand the language of DNA, researchers in the biochemical-computational laboratory of Dr. Barak Akabiov, from the chemistry department at Ben-Gurion University of the Negev, developed an algorithm that deciphers the code in the DNA sequence as if it were a human language and even predicts the protein's ability to perform a measurable activity. The algorithm learns the set of rules that are encoded in DNA and are important for binding the protein.

The goal of the method was to locate hidden features in DNA. First, DNA sequences were classified into groups with common features that turned out to be important for binding. For example, the groups that emerged shared markers, properties and structures that determined a high binding value of DNA to primase, a low binding value, or, in one group, mixed binding values.
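The grouping step can be pictured with a small Python sketch. The article does not say how the sequences were grouped, so this example swaps in a hand-written rule (a hypothetical motif, otherwise AT-rich versus GC-rich) in place of whatever the researchers used, and then labels each group as high, low or mixed according to its binding values. The data, the thresholds and the names `feature` and `label_group` are invented for illustration.

```python
from collections import defaultdict

# Invented toy data: (sequence, measured binding value). Not from the study.
measurements = [
    ("ATGGTCAA", 0.90),
    ("CCGGTCTT", 0.85),
    ("ATATATAT", 0.10),
    ("TTATAAAT", 0.15),
    ("GCGCGCGC", 0.20),
    ("CGCGGCGC", 0.80),
]


def feature(seq):
    """Hypothetical grouping rule: a motif first, else base composition."""
    if "GTC" in seq:
        return "has_GTC"
    at_fraction = sum(base in "AT" for base in seq) / len(seq)
    return "AT_rich" if at_fraction > 0.5 else "GC_rich"


def label_group(values, high=0.7, low=0.3):
    """Name a group by where all of its binding values fall."""
    if min(values) >= high:
        return "high"
    if max(values) <= low:
        return "low"
    return "mixed"


# Collect the binding values of every sequence that shares a feature.
groups = defaultdict(list)
for seq, value in measurements:
    groups[feature(seq)].append(value)

summary = {name: label_group(values) for name, values in groups.items()}
print(summary)
```

In this toy data set the three outcomes mentioned above all appear: one group binds strongly, one weakly, and one contains both strong and weak binders.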

These insights made it possible to build a prediction model that takes into account the binding values measured for all sequences. The model accurately determined binding values for each DNA sequence and was even used to design a new sequence. The researchers did not stop at assessing the model's quality on a test sample: they also tested it experimentally and found a complete match between primase activity and the binding values predicted for different DNA sequences.
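As a rough illustration of predicting a binding value from sequence alone, the Python sketch below uses a nearest-neighbour approach: a new sequence receives the average binding value of the training sequences whose three-letter word profiles are most similar to its own. This is a stand-in technique chosen for brevity, not the model described in the study, and the training data and the names `profile`, `cosine` and `predict` are assumptions.

```python
import math
from collections import Counter


def profile(seq, k=3):
    """Count the overlapping k-letter 'words' in a DNA string."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(seq, training, neighbours=2):
    """Average the binding values of the most similar training sequences."""
    query = profile(seq)
    ranked = sorted(
        training,
        key=lambda item: cosine(query, profile(item[0])),
        reverse=True,
    )
    nearest = ranked[:neighbours]
    return sum(value for _, value in nearest) / len(nearest)


# Invented toy data: (sequence, measured binding value). Not from the study.
training = [
    ("ATGGTCAA", 0.90),
    ("CCGGTCTT", 0.85),
    ("ATATATAT", 0.10),
    ("TTATAAAT", 0.15),
]

print(predict("AAGGTCAT", training))   # query resembles the strong binders
print(predict("ATATAAAT", training))   # query resembles the weak binders
```

A real model would be trained on many measured sequences and evaluated on held-out data before any experimental check, as the researchers describe.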

The computational model will make it possible to identify sequence motifs in DNA that are also important in other cellular mechanisms that depend on the specific binding of a protein to DNA, beyond DNA replication. Understanding how DNA binds specifically to a protein will make it possible to control gene expression, for example by coordinating proteins that recognize specific sequences and recruit the gene transcription machinery.

"We were able to develop a multidisciplinary method that brought with it several levels of innovation," explains Dr. Barak Akabiov. "With the help of the tool we developed, it is possible to control a specific interaction between DNA and protein, to identify sequences in the DNA that are important for the proper replication of the genetic material before the cell divides, and in my estimation it will be possible to use this method to design innovative drugs in the future as needed."

The research group included Dr. Dan Vilanchik from the Department of Communication Systems Engineering at Ben-Gurion University of the Negev, and Adam Sofer, Sarah Eisdorf, Moriah Yafarah, Stefan Ilitch and Dr. Barak Akabiov, all from the chemistry department at Ben-Gurion University of the Negev.

This study (No. 1023/18) was supported by the National Science Foundation.
