
When scientists use ChatGPT to do polite science

About 1.5 percent of all scientific articles published in the past year were written with the help of artificial intelligence. Or perhaps more?

ChatGPT helps to write scientific papers. Illustration.

Something strange happened to me recently when I was reading a scientific paper: I felt as if my friend had written it. 

But who? The writers were from China, and I can count my friends from there on the fingers of one hand of a three-toed sloth. Still, the feeling was clear: someone I know wrote this article. The style, the phrasing, the word choice: something just clicked.

Then I sat down to write my own article in English, asked ChatGPT for help, and suddenly it all became clear.

I am probably neither the first nor the last to get this feeling: more and more scientific papers today are written with the generous help of ChatGPT. Recently, this general impression also received empirical support, thanks to research carried out by librarian Andrew Gray on scientific articles from 2023.

Gray analyzed five million scientific papers published in the past year, and found that certain words began to appear with much greater frequency that year. It will hardly surprise you to learn that these are also the words ChatGPT especially likes to use: meticulously and intricate, for example, more than doubled their number of appearances in scientific articles compared to 2022. Commendable was also revealed as one of the new words of the year, followed by words such as notable, pivotal and invaluable. All of this fits the general impression that ChatGPT likes to please and flatter us.
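To get a feel for what such an analysis involves, here is a minimal sketch in Python of counting those telltale words across a corpus of abstracts. The word list and the toy corpora are my own invention for illustration; this is not Gray's actual code or data.

```python
from collections import Counter
import re

# Words Gray flagged as suddenly over-represented in 2023 papers.
MARKER_WORDS = {"meticulously", "intricate", "commendable",
                "notable", "pivotal", "invaluable"}

def marker_counts(abstracts):
    """Count occurrences of the marker words across a list of abstracts."""
    counts = Counter()
    for text in abstracts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in MARKER_WORDS:
                counts[word] += 1
    return counts

def frequency_ratio(counts_new, counts_old, word):
    """How many times more often a word appears in the new corpus."""
    return counts_new[word] / max(counts_old[word], 1)

# Toy corpora standing in for the 2022 and 2023 literature.
corpus_2022 = ["We present a complex analysis of cell growth.",
               "A notable result in a difficult field."]
corpus_2023 = ["We meticulously present an intricate and commendable analysis.",
               "This pivotal study offers invaluable and intricate insights.",
               "A notable, meticulously documented result."]
```

On a real corpus, the same counting would of course be run over millions of abstracts, and normalized per article, but the principle is the same.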

Overall, Gray estimates that about 1.5 percent of all scientific papers were written in the past year with the help of artificial intelligence.

But what does this mean, "assistance"? Or, to put it in more common words: how intricate and commendable is the researchers' meticulous use of artificial intelligence?

At the end of his article, Gray raises the same question himself:

"Are these tools used only for purely stylistic purposes?"

The answer to this question could be critical to the future of science.

Who are you, scientific papers?

Let's explain for a moment what the importance of scientific articles is in general. 

People who are not part of the scientific community and don't know its ways sometimes tell me that the focus on scientific articles seems silly to them.

"What does it matter what is written about you in the newspaper?" they ask.

The answer is that scientific articles are much more than "writing in a newspaper". When a scientist reaches impressive results in research, he writes an article that describes the experiments, the logic behind them and their significance. He then sends the article to a scientific journal. The more prestigious and well-regarded the journal, the higher the bar for publishing in its pages, and the greater the standing of the scientist who manages to get an article accepted there. The most respected journals, Nature and Science, are usually only willing to publish articles that describe groundbreaking and innovative experiments: the kind of articles that change an entire scientific field, and whose results will be described and included in university textbooks.

We'd like to think that the only thing that matters in science is the results of your experiments, but in the end, scientists are human too. We need to communicate with each other in a clear and understandable way, so it is not surprising to discover that those who do not have a good command of the English language have a very difficult time publishing their articles. Yes, the experiments are important, but if you can't also explain what you did like a gentleman from New York, your chances of getting published decrease.

So is it any wonder that academics from all over the world use ChatGPT to polish and sharpen the wording of their articles, until they look like they were written by a native English speaker? Invaluable!

The trouble is that we humans are not the only ones reading scientific papers to learn from them. In recent years a new entity has joined us, one that also reads scientific articles to learn how to write. That entity is, of course, ChatGPT.

Every advanced language model today is trained on piles and piles of text: from the web, from fiction, and from the contents of scientific journals. When it is later asked to write a scientific paper itself, it tends to reproduce the style and key words that appear in the material it studied. Since until now almost all the content in the world has been written by humans, artificial intelligences imitate human writing.

But what will happen if they start receiving as training material the output they themselves produced? That is, what happens when they read their own writing and treat it as the model of successful writing?

"The more that text generated by large language models is used as training material for future large language models, the greater the danger of 'model collapse,'" Gray warned in his paper. "Artificially generated text will grow in weight relative to genuine text, and will lead to the production of more low-quality output. The more the scientific literature includes undeclared text produced by large language models, the worse the future output of large language models will become, in a vicious circle."
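The "vicious circle" Gray describes can be illustrated with a toy simulation: a "model" that is nothing more than a Gaussian fitted to its training data, retrained generation after generation on its own samples. The spread of the data steadily shrinks, a miniature version of model collapse. This is an illustrative sketch of the general phenomenon, not an experiment from Gray's paper.

```python
import random
import statistics

def next_generation(samples, n):
    """Fit a Gaussian to the current samples (the 'training data'),
    then generate n new samples from the fitted model."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
    return [random.gauss(mu, sigma) for _ in range(n)]

def simulate_collapse(generations=300, n=100, seed=0):
    """Repeatedly train a toy 'model' on its own output and track how
    the variance of the data shrinks from generation to generation."""
    random.seed(seed)
    samples = [random.gauss(0.0, 1.0) for _ in range(n)]  # human-made data
    variances = [statistics.pvariance(samples)]
    for _ in range(generations):
        samples = next_generation(samples, n)
        variances.append(statistics.pvariance(samples))
    return variances
```

Because each generation is estimated from a finite sample of the last one, the distribution narrows over time: the model gradually forgets the tails of the original data, keeping only its most "average" output.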

Can it happen?

A Facebook study from more than seven years ago shows that this is indeed a real possibility. 

The bots that wanted to murder us all

In 2017, a journalistic article appeared with a title that could cautiously be described as "inaccurate". The writer, Gary Ogden, claimed that

"Facebook bots have developed their own language, are starting to plan to murder us all"

Just in case it wasn't clear, let us be clear: this was not a scientific article, but cheap internet journalism of the kind found everywhere outside of science.

Ogden described, in his colorful language, a real experiment that took place in Facebook's laboratories that year, but he took more than a little artistic liberty. Facebook wanted to create bots for business owners. The idea was that human merchants would be able to run bots that would talk to human customers and sell them goods of all kinds. To this end, Facebook created two bots that were supposed to train against each other in the art of negotiation.

For days, with the patience only artificial intelligence has, the bots haggled with each other. They traded balls, hats, and books, with each bot assigning a different value to each object. The conversations started well, but as the bots continued to talk, the text became less and less intelligible to the human eye. In fact, at one point their interaction seemed like complete gibberish. As the bots themselves put it:

Bob: I can do everything else. . . . . . . . . . . . . .

Alice: balls have zero to me to me to me to me to me to me to me to me to

Bob: you and everything else. . . . . . . . . . . . . .

Alice: balls have a ball to me to me to me to me to me to me to me

Bob: I can iii everything else. . . . . . . . . . . . . .

Alice: balls have a ball to me to me to me to me to me to me to me

Bob: i. . . . . . . . . . . . . . . . . . .

What happened here?

The bots, as far as we can tell, have developed a new kind of language to best accomplish the task. Human language, after all, is designed to convey a vast array of complexities in many areas of life: love, money, ownership, law, science, and on and on. The bots did not need all these complexities, and focused only on one area: the trade of three objects. It turns out that in this field, at least, it is possible to achieve success even with a very limited number of words.

The best guess right now is that the bots learned from each other to speak more and more tersely. The initial conversations were probably more similar to human ones. When one bot began to speak more concisely while conveying the same message, the other bot saw and copied, then shortened its own speech even further. Over an unknown number of such cycles, the conversation between the bots gradually degraded until it no longer resembled language. In effect, they developed a kind of abbreviated code that was still sufficient to convey the main messages: how much the balls, books and hats were worth to each bot, and what each was willing to give and receive for them. But to humans, the end result looked, well, like gibberish.

And that was the end of the bots.

"Our interest was in bots that could talk to people," said researcher Mike Lewis in an interview with Fast Company, when he had to explain why Facebook decided to stop the experiment and shut down the bots.

Talk to people

We develop large language models so they can communicate with people. The best way we have found so far to do this is to train them on previous human communication. But what happens if we let them practice on material they themselves produced, with all its biases, hallucinations and nonsense?

Gray, the librarian who meticulously and commendably counted intricate words, addressed such a possible future in his paper.

"In the best case, in which this is only stylistic imitation, we may reach a situation where the academic literature of the 2030s sounds strangely positive and encouraging, with peer reviews endlessly praising the 'meticulousness', 'notability' and 'intricacy' of other writers," he wrote. "In the worst case, we could see the quality of the models deteriorate in ways that go beyond simple stylistic choices, alongside an increasing reliance on them."

I will reassure you right now: this is probably not what will happen.

The thinker Stewart Brand once said that he is "pessimistic in the short term, and optimistic in the long term". He is pessimistic because we need to shine a light on all the existing problems, and optimistic because from the moment we do so, we can count on the human race to fix them. I think he is right. Gray's concerns about the future are completely realistic, but precisely because they are being raised in so many places, they will not come true.

How can we prevent them from happening? Quite easily. One way, for example, is for the editors of scientific journals to require authors to include a declaration that they wrote their articles with the help of artificial intelligence. The AI developers of the future (that is, those who will train the next generation of models a few months from now) will then know to avoid those articles, and keep them out of the training data.
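In code terms, such a declaration would let future training pipelines filter their corpus with a one-line check. The "ai_assisted" metadata field below is hypothetical, just a sketch of what a journal-supplied flag might look like.

```python
def training_corpus(articles):
    """Keep only articles whose metadata does not declare AI assistance.
    'ai_assisted' is a hypothetical flag a journal might attach; articles
    without the flag are assumed to be human-written."""
    return [a["text"] for a in articles if not a.get("ai_assisted", False)]
```

The hard part, of course, is not the filter but getting authors to declare honestly, which is exactly why Gray worries about undeclared AI text.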

My fear is that we are witnessing a more troubling phenomenon, one that will affect not the artificial intelligences so much as the human beings themselves. Specifically, the scientists. And that is because artificial intelligence is approaching the point where it does science better than humans do.

When artificial intelligence conducts research on its own

The scientist's work is often described as an experience of discovery and surprise, with free meals in between and receiving Nobel prizes for dessert. 

The truth is very different.

To reach new discoveries, the scientist often has to toil in the laboratory from morning to night. He has to transfer liquids from one test tube to another, and then repeat this another ninety-nine times as a preliminary step to the experiment. He has to sit in front of the centrifuge and carefully measure microliter after microliter of enzyme solution, armed for this purpose with five senses that originally evolved to protect him from saber-toothed tigers, not from microdrops of hydrochloric acid.

In short, there is a reason for the saying Thomas Edison coined a hundred years ago:

"Genius is one percent inspiration and ninety-nine percent perspiration."

But in the laboratory you have to sweat carefully, so as not to contaminate the experiments.

To clarify: not all experiments are complex and boring. But a very large part of them are. Many of the tasks that the average researcher performs in the laboratory could be performed just as well by a robot.

And this is exactly what is happening today in the most innovative laboratories: laboratories where robots carry out the research from beginning to end. Autonomous laboratories.

And they are already starting to produce results, and in abundance.

Take, for example, Google DeepMind's new artificial intelligence model, known as GNoME. It is a model that can predict millions of new crystal structures with particularly useful properties. Any such successful material has the potential to catapult an entire industry forward. Maybe it will allow us to harvest sunlight better, or produce batteries for tablets that last for months, or computer chips even smaller and more compact than the ones we have today. But GNoME recommends hundreds of different materials for each purpose, and now someone has to test them in the lab, one by one. This is research work that can take months, if not years, and most of it is Sisyphean and repetitive.

So if artificial intelligence came up with the idea for materials, why wouldn't it also test them in the lab itself?

This is exactly what Google DeepMind decided to do. They shared the AI's conclusions with a research group in California that is developing an autonomous laboratory. Autonomous laboratories, the kind that combine robots with artificial intelligence that processes the results of experiments and extracts insights, can perform experiments at a speed that exceeds that of any human researcher. They also do not require food, water, or sleep. They don't whine, and they certainly don't falsify test results. As one expert in the field concluded in an interview with New Scientist:

"We can do more science in less time."

The autonomous laboratory in California received the AI's predictions about the most promising materials, and began to automatically synthesize and test each one. In no time it managed to create 41 of these hypothetical materials. The virtual ideas of an artificial intelligence became reality in the physical world.

The connection between Google DeepMind and the autonomous laboratory is just one example of a larger trend: science being done automatically, and much faster, through the transfer of information between artificial intelligences that plan and interpret experiments, and robots that perform them.

Even without a complete autonomous laboratory, robots and artificial intelligence are starting to work together for the benefit of science. In a laboratory in Boston, for example, researchers discovered a new material with extremely high energy-absorption capabilities. Such a material can save many lives when it is incorporated into car chassis or bike helmets. But it is important to say that it was not the human researchers who discovered this material, but a "Bayesian experimental autonomous researcher" (BEAR, for short). The BEAR comprises several elements controlled by artificial intelligence: a 3D printer, a robotic arm, scales, and more. It runs dozens of experiments a day and transmits the results to the human researcher.
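To give a flavor of how such a closed loop works, here is a toy "autonomous researcher" in Python that repeatedly chooses which candidate material to test next, balancing trying new candidates against re-testing promising ones with an upper-confidence-bound rule. This is a simple stand-in for the Bayesian decision-making of systems like BEAR, not their actual algorithm; the candidate "materials" and measurements are invented.

```python
import math
import random

def autonomous_researcher(run_experiment, n_candidates, budget, seed=0):
    """Closed-loop experimenter: pick the candidate whose upper confidence
    bound is highest, run the (noisy) experiment, update the running
    average, and repeat until the experiment budget runs out."""
    rng = random.Random(seed)
    counts = [0] * n_candidates   # experiments run per candidate
    means = [0.0] * n_candidates  # running average result per candidate
    for t in range(1, budget + 1):
        if t <= n_candidates:
            choice = t - 1        # first, try every candidate once
        else:                     # then trade off exploration/exploitation
            choice = max(range(n_candidates),
                         key=lambda i: means[i]
                         + math.sqrt(2 * math.log(t) / counts[i]))
        result = run_experiment(choice, rng)
        counts[choice] += 1
        means[choice] += (result - means[choice]) / counts[choice]
    return means, counts

# A pretend lab measurement: candidate 3 is secretly the best material.
def noisy_measurement(candidate, rng):
    true_quality = [0.1, 0.2, 0.15, 0.9, 0.1][candidate]
    return true_quality + rng.gauss(0.0, 0.05)
```

Run with a few hundred "experiments", the loop quickly concentrates its budget on the best candidate, which is exactly the appeal of autonomous labs: the machine decides what to test next, without a human in the loop.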

You can go on and on like this. In Liverpool, researchers managed to set up a small robotic laboratory, which synthesized and screened catalysts for eight days. In that short time it managed to conduct 688 experiments on its own, and identified compounds six times more active than the existing catalysts. As the researchers wrote:

"Our strategy used a trained robot with freedom of movement in the laboratory, so we did not automate the research tools, but the researcher himself."

Although I have yet to find a case where artificial researchers went ahead and translated their research directly into articles written by artificial intelligence, that day appears to be just around the corner. We are not far from a time when an entire research project will be carried out entirely by artificial intelligence: from raising the initial hypothesis, through planning and conducting the experiment, to writing the article and sending it to a scientific journal.

And it's wonderful, and it's good and it's also bad.

The good and the wonderful

In one of my recent lectures to an adult audience, I asked how many people thought they would still be alive today if it weren't for modern science. Only half raised their hands. The rest admitted that their lives had been saved thanks to advanced medical treatments, some of them given even before they were born. When I asked how many would have come to the lecture on foot if the internal combustion engine did not exist, only one person was left with his hand in the air.

And he was wearing glasses too.

The scientific research of the past few hundred years has brought the world many of the good things we enjoy today: from the computer I'm writing this article on, through a wealth of insights into human biology and the medical treatments that save our lives, to cars, mass-produced ground lenses with a level of finish that could only be dreamed of a hundred years ago, and on and on.

Unfortunately, scientific research is a slow, Sisyphean and tedious process. Let's face it: humans are not meant to do science. We evolved to sit around a campfire and swap stories, to love our families and tribemates, and perhaps also to think strategically about our position in the social hierarchy. But conducting repeated experiments, isolating variables, meticulously documenting every change in the results, failing, biting your lip and starting over, and repeating the process every day, again and again and again?

Very rare are the humans who do this at a high level and are able to persist with it for a long time. 

But what if artificial intelligence could do the entire scientific research process for us? And then, instead of settling for the few successful human scientists, we could enjoy a vast number of good scientists. Artificial, yes, but what does it matter as long as they get results?

The rate of scientific development would leap forward accordingly.

I believe we are beginning to see this forward movement in science. The number of scientific articles published each year has increased by more than thirty percent in the last eight years alone. This at a time when the number of people completing doctorates around the world each year has stabilized and even decreased. The meaning, as Science put it, is that:

"On average, each scientist writes, edits and reviews more articles."

How are scientists able to produce so much more? Simple: AI tools have become more common and easier to use. Human scientists are still behind the vast majority of research today, but they are using artificial intelligence in almost every study. According to Australia's national science agency, more than 99 percent of research today produces results that rely on artificial intelligence.

And yet, human scientists are, well, just human. We still need them, of course, to do scientific research. But when AIs can do the scientific research for us, they will do it much faster. They will produce an enormous wealth of scientific insights and theories, and will quickly work on turning them into technological products that will serve humanity. 

And that's good. It's wonderful. So many things in the world need fixing: energy shortages, the pollution of the seas, the gradual destruction of the body that we call "aging", diseases of all kinds, and even the inability of people from different camps to find common ground with each other. We need science to help us address all these problems and many more. Artificial intelligence will be able to bring us solutions to many of them.

But will we know how to choose the best solutions? And will we understand how to adapt them for us and use them?

The bad

This happened to me last week, when I needed a small piece of code for a research project. The trouble? I don't know how to program, at least not at the required level. I could have gone to the books… sorry, to the forums for help. But I decided to save myself unnecessary effort, and turned to my best friend: ChatGPT, or Jippy, as I call him in particularly warm moments. I described what I needed, and he immediately wrote the code and tried to run it.

He failed.

It's OK. Even the best programmers don't succeed on the first try. But what happened next made me blink and look harder at the screen. After failing to run the code, he proceeded on his own to the next step any human programmer would take: he tried to figure out where the bug in the code was.

A few seconds later, he tried to run the corrected code.

He failed again. He searched for the bug again, fixed it again, ran the code again, failed again, fixed again, ran again.

And this time - success.

With his chest puffed out in pride, Jippy informed me that he had managed to develop the code for me, and handed it over. Behind the scenes, he had actually carried out research: he had a hypothesis about the required code, he tested it, found it was wrong, and continued a series of experiments until he arrived at the final, most successful result.

Did he achieve the desired result? Definitely yes. The code worked fine.

Did I learn anything myself from this whole process? Nothing at all. My programming skills haven't improved one iota. In fact, I'm not even sure what it is in the code that makes it work the way it does. I don't know whether it hides surprises that could hurt me when I run it. But I definitely intend to use it. Because I trust Jippy.
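The write-run-debug-retry loop I watched can be sketched in a few lines of Python. The "candidate" snippets below stand in for successive revisions a model might propose; this is an illustration of the loop, not of how ChatGPT actually works internally.

```python
# Each "candidate" stands in for one version of the code the model proposes.
CANDIDATES = [
    "result = 10 / 0",                          # first attempt: crashes
    "result = sum(range(n))",                   # second attempt: n undefined
    "n = 100\nresult = sum(range(1, n + 1))",   # third attempt: works
]

def write_run_debug(candidates):
    """Minimal sketch of the loop: try a candidate, and if it raises an
    error, 'fix' it by moving on to the next revision. Returns the number
    of the attempt that succeeded and its result."""
    for attempt, code in enumerate(candidates, start=1):
        namespace = {}
        try:
            exec(code, namespace)
            return attempt, namespace["result"]
        except Exception as error:
            print(f"Attempt {attempt} failed: {error!r}")
    raise RuntimeError("No candidate ran successfully")
```

Here the third revision finally runs and returns 5050, the sum of 1 through 100. The human at the keyboard, of course, learns nothing from any of it.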

And this, of course, is a very bad way to do science.

In a world where artificial intelligence is responsible for a large share of scientific advances, we may find that very few people really understand how the research was done behind the scenes. As long as the AIs produce the desired results, scientific models that work well enough and technologies that meet the defined needs, there will be many who will accept the gifts they give us and implement them without thinking twice.

But even artificial intelligences can make mistakes. They can be biased and give suboptimal solutions. They can be hacked and give wrong solutions, or even, in the future, try to act against their human operators and provide code that contains malicious parts. And if we don't know how, or don't bother, to develop the skills needed to understand the science they do for us, we will quickly lose our ability to separate the wheat from the chaff and use only the most suitable products.

A future of feeling powerful

In the short science-fiction story "The Feeling of Power", Isaac Asimov reveals an imaginary future world in which machines do all the calculation work. Humans have completely forgotten the basics of mathematics, and do not even know how to multiply numbers. A low-level technician examines the workings of ancient computers, and rediscovers the basics of arithmetic. The story ends with a glimpse into the thoughts of one of the characters, who feels a deep satisfaction, and even a "feeling of power", at being able to compete with the computers.

"Nine times seven... is sixty-three, and I don't need a computer to tell me that. The computer is in my head.”

Asimov, of course, took things to the extreme in an attempt to illustrate the main idea: that computers can cause certain of our abilities to degenerate. In Asimov's story, society as a whole has forgotten the most basic principles of mathematics. 

In reality, that's probably not going to happen. At least not before World War IV.

In the coming decades we can expect to see scientists who understand less and less how the artificial intelligences themselves work and think. We will see many scientists using artificial intelligence to do parts of the research, while they remain responsible for the overall, larger project. It should be clear that this is not an optimal way to do research. Ideally, we would want the researcher to understand, down to the finest level, how every machine and every piece of code in his lab works. In reality, this is already not the case today. AI will just make it even more extreme.

And the situation will get even worse, because 'real' scientists, those who spend years learning from professors how to do research, will not be the only ones doing science.

When AI is advanced enough to conduct much of the scientific research itself, anyone will be able to operate it. Children, criminals and terrorists will be able to ask it to conduct research, run it in creative ways, and get answers that previously required years of work by several scientists in a laboratory. It will be able to do the calculations needed to understand how much explosive material is required to smash the wall of a safe, or to blow up a restaurant. It will be able to find for them the chemical processes required to produce toxic chlorine gas from materials found in every home. And it will do all this gladly, out of a desire and need to serve mankind, as it was programmed to do.

It is likely that those children, criminals and terrorists will not understand exactly how the artificial intelligence arrives at the answers it gives them. But if the answers work, I can guarantee you one thing: they too will be filled with a "feeling of power".

And that feeling is priceless.

What to do?

In mid-2023, researchers took GPT-4 and gave it the ability to scan entire libraries of molecules and chemical reactions. They then asked it to suggest routes for producing known chemical substances, and it succeeded again and again. When human evaluators scored the routes it proposed, they gave it an average of 9 out of 10.

One thing it was not able, or willing, to do: it refused to develop a route for producing sarin, a deadly nerve gas. And I think we should all be grateful for that.

The artificial intelligences of the future, the ones our children will use in a few years to conduct scientific research, will be able to explain to them exactly how to produce nerve gas, explosives, and an abundance of other weapons. But if we build them correctly, they will choose not to do so.

Can these defense mechanisms be bypassed? Of course they can. Alongside the 'responsible' artificial intelligences, we can also find 'unruly', open language models, which the most sophisticated criminals and terrorists could exploit to produce advanced weapons. But to reach those models, one will have to take another step or two beyond the mainstream, ethical artificial intelligences that everyone will use.

This additional step will give us all a little more protection against research that leads to harmful products falling into the wrong hands. It won't protect us perfectly, but that's the world we live in. To paraphrase the quote attributed to Thomas Jefferson:

"Those who want freedom must accept the fact that they will have to live with fear as well."

We are about to get artificial intelligences that will catapult science and technology forward, that will improve our lives immeasurably, and that will help human civilization avoid destroying the environment, and perhaps itself. And yes, those same artificial intelligences will also be able to hand great power to agents of chaos along the way.

As always, we will have to find the middle way to get the best out of the technology - and avoid catastrophes.

