
The programming language of life: what is included in the RNA vaccine that a million Israelis have already received

If one had to choose the outstanding scientific development of 2020, it would undoubtedly be the mRNA-based vaccine. Dr. Roey Tsezana explains.

This is how it looks: DNA inside the nucleus. The commands are copied to RNA which goes outside the nucleus and then undergoes translation into protein. Source: Wikipedia

This post is about the new vaccines, and more than that: it is about a thing of beauty, a creation that is art, computing and biology at the same time. If you have ever programmed, or are a biology enthusiast, you owe it to yourself to read it.

Everyone talks about RNA vaccines, but few pay attention to their complexity. On the surface, they are very simple: the researchers inject tiny submarines into the body, each containing a strand of RNA. The cells in the area absorb the RNA, which gives them instructions to produce one of the virus's proteins and secrete it into the bloodstream. The immune system learns to recognize that protein, and when the virus itself arrives and tries to spread in the body, the experienced immune system stops it immediately.

Simple, right?

In fact, this is an achievement that relies on decades of research in biology, which all came together in one simple molecule: the same strand of RNA that is inserted into cells.

To understand the complexity of the achievement, we need to talk a little about RNA first.

Who are you, RNA?

I have previously compared DNA to a book. Don't get confused: DNA and RNA are two different molecules in terms of their function. The DNA contains the operating instructions of the entire cell, and is hidden and well protected inside the cell nucleus. This is the cell's recipe book, which contains thousands of pages. Each page contains one recipe for creating a protein that will do a certain action in the cell.

When the cell needs to carry out one of these recipes, it does not take all the DNA out of the nucleus to do so. That would be dangerous! Instead, it copies just one of the pages into a molecule called messenger RNA, which leaves the nucleus and is then read by tiny machines that translate its instructions into a complete protein. If you want an analogy closer to the world of computing, DNA is the cell's hard disk. The messenger RNA is the RAM, which handles the small, quick operations and breaks down soon after it has done its work.

Now, let's talk about RNA for a moment. Each RNA molecule is made up of a chain of small molecules that are connected to each other. But if we continue with the previous analogy, it is better to think of the RNA as one page full of letters. There are four types of letters in RNA, each of which represents a different molecule. These are A, C, U and G. That's it. Using these four letters, the RNA strand conveys the instructions for creating a complex protein. It is a programming language, in fact, with only four letters.

Consider for a moment what the researchers did with this vaccine: they programmed in the language of the cell itself. In the language of life that developed over hundreds of millions of years of evolution. They treated our cells as a computer, and injected into them an instruction that did not exist in them before.

In this post I want to tell you a little about the commands they used to make the RNA do exactly what they want in the cell. Remember that I wrote that this post is also intended for programmers? Well, here you go. You have here a biological programming language that is just as complex as any programming language of human origin.

Let's dive into the RNA strand that the researchers developed and look at the different commands it contains.

The commands in Pfizer's new vaccine

At this point I will give due credit: the rest of the post is based on the brilliant analysis by Bert Hubert, as published on his website last week. If you want to read more about the topic, you can find the link to the website HERE.

And now, to the vaccine: the new Pfizer vaccine consists of a single strand of RNA with 4,284 letters. Yes, that's a lot, but each of them is necessary to produce the right protein.

The first two letters in that RNA are GA. This sounds like a minor piece of information, but it is important. When an RNA molecule begins with these two letters, the cell reads them as a message, a kind of identity card. GA makes it clear to the cell that this RNA came from the nucleus, and that it can be trusted. The researchers simply used a hack to trick the cell.

The next part of the RNA strand is particularly important. It's called (and I'm simplifying things here) the "untranslated region", and it looks like this:

GAAΨAAACΨAGΨAΨΨCΨΨCΨGGΨCCCCACAGACΨCAGAGAGAACCCGCCACC

Wait a moment, you might say now. What is the Greek letter psi (Ψ) doing here? Didn't you tell us there are only four letters - ACUG?

Yes, but as we said: the researchers are hacking the cell's operating system. When the cells are exposed to RNA from a foreign source, they are not ready to surrender to it just like that. They have some kind of internal anti-virus system, which destroys any suspicious RNA.

Our friend the ribosome, which translates the RNA strand into protein. Source

To deal with this particular problem, the researchers replaced the letter U in the RNA with a similar molecule, written as psi (Ψ). When the intracellular antivirus takes a look at the RNA and sees this molecule, it decides the RNA is okay after all, and lets it get on with its job. Why? Just because. Don't argue with the cell's antivirus. But I can already reassure you: viruses are not able to replace the U with a Ψ, so they cannot deceive the cells in the same way. This is a new development that only humans could produce.
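
For the programmers among us, the swap can be pictured as a plain search-and-replace over the strand. The few lines of Python below are only an illustration: the function name is made up, the sample is the opening letters of the sequence shown above written back with an ordinary U, and in reality the swap is done with chemistry when the RNA is produced, not with string editing.

```python
def replace_u_with_psi(rna: str) -> str:
    """Return the strand with every U swapped for Ψ (illustration only)."""
    return rna.replace("U", "Ψ")


# Opening letters of the strand, written with the ordinary letter U.
sample = "GAAUAAACUAG"
modified = replace_u_with_psi(sample)
print(modified)  # GAAΨAAACΨAG
```

The other three letters are left untouched, which matches the description above: only U gets replaced.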

Let's go back to the untranslated region. It contains 51 letters, and you can think of it as the title on the page. You know that the most important part of every story is the title. It is what draws the reader in. A good title will attract many readers. A bad title will drive them away.

The title should attract the sophisticated machine called the ribosome. The ribosome is what will read our page, understand the instructions in it and translate them into a protein. But for it to even agree to begin the work of translation, it has to be attracted to the RNA we introduced into the cell. It needs to find a docking place on the strand, from where it can start going over all the other letters to translate them into protein. This is what the title does: it provides the ribosome with a landing spot. We need a good title to attract ribosomes quickly to our page.

How do you write a good title? The answer is known to every writer: you copy. The researchers simply copied the title of another mRNA that we know from experience attracts ribosomes like flies to honey. Thanks to this copied title, which is 51 letters long, the RNA will be translated into protein at a rapid rate.

And now that our translation machine - the ribosome - has attached itself to the title at the beginning of the RNA, it can begin the work of translation.

But what is it even going to translate?

From mRNA to protein

The ribosome sits on the RNA. It starts reading the page and translating it into protein.

What does "translation" mean?

We said that RNA consists of thousands of letters. Every three such letters form one word. The ribosome reads three letters, then the next three letters, then the next and so on. Each combination of three letters tells the ribosome to grab another Lego block and attach it to the growing protein. In other words, if the ribosome goes through 3,000 such letters, it will assemble a protein made of a thousand Lego blocks. These Lego blocks are called amino acids, by the way, but we'll just call them Lego blocks so as not to complicate the matter.
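
For readers who think in code, the reading process can be sketched in a few lines of Python. This is only an illustration: the function names and the short example strand are invented for the sketch, and the lookup table holds just four words of the standard genetic code rather than all 64.

```python
def split_into_codons(rna: str) -> list[str]:
    """Cut the strand into consecutive three-letter words."""
    usable = len(rna) - len(rna) % 3  # ignore leftover letters at the end
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# A tiny excerpt of the standard genetic code: word -> Lego block (amino acid).
CODON_TO_BLOCK = {"AUG": "Met", "UUU": "Phe", "GUG": "Val", "CCU": "Pro"}


def translate(rna: str) -> list[str]:
    """Look up each word; words missing from the excerpt become '?'."""
    return [CODON_TO_BLOCK.get(codon, "?") for codon in split_into_codons(rna)]


strand = "AUGUUUGUGCCU"  # made-up example strand
print(split_into_codons(strand))  # ['AUG', 'UUU', 'GUG', 'CCU']
print(translate(strand))          # ['Met', 'Phe', 'Val', 'Pro']

# 3,000 letters give 1,000 words, and therefore 1,000 Lego blocks.
print(len(split_into_codons("A" * 3000)))  # 1000
```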

We want the protein that is produced to be the one that the coronavirus uses to stick to cells: the spike protein, and this is indeed the protein that the ribosome will produce. But wait! We want the protein to be secreted out of the cell, right? The immune system cells that roam the body looking for invaders will not find that protein easily if it remains inside the cell.

So how do you instruct a cell to secrete the protein out?

The first letters in the RNA that the ribosome reads are translated into a group of Lego blocks that serves as a kind of note - a tag - on the protein. That tag makes it clear to the cell that it needs to secrete the protein out. And that's exactly what the cell will do, once the ribosome finishes assembling the protein.

The spike proteins on the virus. If the virus is not there to support them - they collapse. Source

Now that the ribosome has created the tag, it continues to read the remaining 3,777 letters in the RNA strand. These direct the ribosome to assemble a protein that is identical to the spike of the virus... but with a critical change.

The change involves five letters in the RNA that were replaced by other letters. It causes the ribosome to attach an unusual Lego block to the protein: proline. The reason for this new block is that without it, our spike protein would collapse in on itself. Why? Because it is not meant to be secreted out of a cell. It is not meant to stand on its own. Originally it was attached to a virus, which supported it and prevented it from collapsing. But when it doesn't get that support from the virus, it simply collapses.

This is the reason why the researchers replaced one of the normal Lego blocks in the protein with proline, which is an especially strong Lego block. It provides the protein with a new backbone, which prevents it from collapsing even when it is not attached to a virus. Thanks to the proline, the effectiveness of the vaccine improves remarkably.

After going over those 3,777 letters, we finally have the protein we needed. But now the ribosome needs to be told that it is time to stop translating. Again, this is like a programming language in which we tell the computer to stop running. Luckily for us, there is a three-letter combination that conveys exactly this message to the ribosome. The ribosome reaches it - and stops there.
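
In programming terms this is a loop with a break statement. The sketch below is an illustration only: the function name and the example strand are invented, and the three stop words listed are those of the standard genetic code, without any claim about which of them the vaccine itself uses.

```python
# The three "stop" words of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_until_stop(rna: str) -> list[str]:
    """Collect three-letter words until the first stop word is reached."""
    codons = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break  # the ribosome stops here and lets go of the strand
        codons.append(codon)
    return codons


strand = "AUGUUUCCUUGAGGGAAA"  # made-up strand with UGA as its fourth word
print(read_until_stop(strand))  # ['AUG', 'UUU', 'CCU']
```

Everything after the stop word, here GGG and AAA, is never read.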

That's it?

Just one more thing.

The ribosome detaches from the RNA, and the new protein begins its journey out of the cell. But the RNA itself remains in the cell, and we want more ribosomes to attach to it and produce more proteins. But there is one problem: every time a ribosome translates the same RNA strand, the back end of the strand is cut off and thrown away. Why? It is not clear. Maybe it's a mechanism of the cell designed to limit the number of times a new protein will be created. As soon as too many letters fall off the back end, the cell recognizes that this strand of RNA has reached the end of its life, and breaks it down. And it really doesn't suit us, as mentioned, because we want each strand of RNA to be translated into many proteins.

So what do we do? We cheat the cell, again. The researchers added about a hundred identical letters to our RNA strand: A, A, A and another 97 repetitions of A. This is a kind of long tail on the RNA, made entirely of the same letter. Every time the RNA undergoes translation, part of this tail is cut off. And that is perfectly fine and does not prevent the RNA from continuing to connect to additional ribosomes. Only after dozens of translations into proteins will this tail disappear completely, and our hero, the RNA strand, will be disassembled by the cell. This is the end of every strand.
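
The tail behaves like a countdown counter in a loop. Here is a toy model in Python, under loudly stated assumptions: the function name is invented, and the number of letters lost per round (3) is an arbitrary value chosen for illustration, since the text above only says that part of the tail is cut off each time.

```python
def count_translations(tail_length: int, letters_lost_per_round: int) -> int:
    """Count how many proteins are made before the tail runs out (toy model)."""
    if letters_lost_per_round <= 0:
        raise ValueError("letters_lost_per_round must be positive")
    rounds = 0
    while tail_length > 0:
        rounds += 1                            # one ribosome translates the strand
        tail_length -= letters_lost_per_round  # part of the tail is cut off
    return rounds


# 100 letters of A, losing an assumed 3 letters per round.
print(count_translations(tail_length=100, letters_lost_per_round=3))  # 34
```

With these made-up numbers the strand is translated a few dozen times before the cell breaks it down. A longer tail, or a smaller loss per round, would give more proteins per strand.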

Summary

So what did we have here? A real act of programming, all in one strand of RNA:

  • An instruction to the cell to treat the RNA as if it came from the nucleus (GA)
  • A particularly enticing title with a length of 51 letters, which attracts the ribosomes to translate the protein
  • An instruction to print a tag that will cause the cell to secrete the protein out
  • Instructions for making the protein, 3,777 letters long, with a small change that stabilizes it
  • A tail that makes sure the protein translation 'loop' continues over and over again, for a specified number of times

Each of these instructions is based on many years of research and endless scientific articles. Some of them were even perfected by the pharmaceutical companies, and have copyrights on them - just like libraries of functions and commands that companies release to the market for application developers.
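
To stay with the library analogy, the five parts listed in the summary can be pictured as pieces of text glued together into one long string. The Python sketch below is purely illustrative: the function and parameter names are invented, and apart from the opening GA and the tail of 100 A's, every piece is a short made-up stand-in rather than the real sequence. The stop word from the previous section is included as a piece of its own.

```python
def build_strand(cap: str, title: str, tag: str, protein_code: str,
                 stop: str, tail_length: int) -> str:
    """Glue the parts together in the order the ribosome meets them."""
    return cap + title + tag + protein_code + stop + "A" * tail_length


strand = build_strand(
    cap="GA",               # the 'identity card' at the start
    title="AAUAAACUAG",     # stand-in for the 51-letter title
    tag="AUGUUU",           # stand-in for the 'secrete me' tag
    protein_code="GUGCCU",  # stand-in for the 3,777 protein letters
    stop="UGA",             # one of the standard stop words
    tail_length=100,        # the tail of repeated A's
)
print(strand[:2])   # GA
print(len(strand))  # 127
```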

Now you know how to program the vaccine, and also know how to appreciate the complexity of the final product.

In the coming years we will use this programming language for many more purposes. We will use it to program instructions that will cause the cell to produce new and different proteins: proteins that can fight diseases, help the cells deal with the invasion of viruses, and even give instructions to the cells - for example, to return to the stage where they can divide again, and thus regrow organs and even whole limbs.

This is the full meaning of the fact that we know how to program in the language of life.

And if you are also excited by these developments, well - maybe it's time for you to go get a bachelor's degree in biology.

Good luck!
