Why artificial intelligence isn't really intelligent

Knowledge comes from establishing reliable cause-and-effect relationships, summarized in postulates, laws, statements, theorems, equations and so on, but (for now) artificial intelligence is unable to understand or use this form of knowledge. An article by Daniele Paganelli, scientific manager of a micro-enterprise specializing in high-temperature measuring instruments, for Stefano Feltri's newsletter Appunti

Rereading my first article for Appunti – "The digital impostor" – on the prospects of artificial intelligence, also in light of subsequent developments, I realized that my tone was perhaps too pessimistic.

My area of expertise, and the reference values that come from my training and my professional field, naturally place me at a very critical vantage point. This is not true for everyone – in fact, perhaps only for a minority.

The human brain is predisposed to anthropomorphize everything it sees: two dots and a curved line are enough to make a face; the soulful gaze of an animal is enough to make us think it is "only missing the power of speech"; an incomprehensible phenomenon is enough to make us imagine sentient spirits and deities.

It is therefore very easy to give in to the temptation of considering AI truly intelligent.

ELIZA, the first chatbot in history, had already managed to enchant the secretary of its developer, Joseph Weizenbaum, with 1960s technology.

Those who follow a technical-scientific path are trained in deterministic laws and principles, and in the mantra "correlation is not causation".

Knowledge comes from establishing reliable cause-effect relationships, summarized in postulates, laws, statements, theorems, equations, etc.

One, ten, a thousand observations may be worth nothing in the face of a single contrary observation.

Large language model (LLM) AI is (currently) unable to understand or use this form of knowledge. And I will go further: the problem lies in the very concept of statistical regression, which is the basis of how neural networks learn, and therefore of LLMs as well.

To learn a law, a neural network must convert all of its (infinite) manifestations into statistical weights – building an extremely complicated probabilistic map even for the most trivial deterministic law.

In the world of AI, "causation is correlation", because there is no knowledge other than pure statistical correlation!

If you want a practical demonstration, try posing a high school physics problem to an LLM.

The model was asked to calculate the energy needed to pump water from 20 meters below ground level to the top of a tower 5 meters above it.

ChatGPT gets the height difference wrong and then completely botches the units of water density, ending up wrong by almost three orders of magnitude.
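For reference, the correct bookkeeping is straightforward: the total lift is 20 + 5 = 25 meters, and the density of water is 1,000 kg per cubic meter. A minimal sketch in Python, assuming (since the text does not give the original quantities) that one cubic meter of water is pumped:

```python
# Correct bookkeeping for the pumping example; the 1 m^3 volume is an assumption,
# since the article does not state the exact quantities of the original problem.
g = 9.81            # gravitational acceleration [m/s^2]
rho = 1000.0        # density of water [kg/m^3] (1000 kg per m^3, not 1 kg)
volume = 1.0        # assumed volume of pumped water [m^3]
dh = 20.0 + 5.0     # lift from 20 m below ground to 5 m above it = 25 m

energy = rho * volume * g * dh      # E = m * g * delta_h
print(energy / 1000, "kJ")          # ~245 kJ per cubic meter
```

Getting either the 25 m or the 1,000 kg/m³ wrong is exactly what produces an error of a few orders of magnitude.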

Although the AI shows that it remembers what the laws of motion and potential energy say, it does not understand them. It does not even understand the problem.

It plays with numbers as it plays with words, arriving at a totally incorrect conclusion delivered in a very authoritative tone.

Statistically, it has encountered those words in that order and seen similar questions answered in a certain way, so it tries to imitate humans. But it is a fake, because it cannot really understand anything.

From probabilities to certainties

With the advent of statistical thermodynamics in the late nineteenth century and quantum mechanics in the early twentieth century, it was discovered that many fully deterministic principles in the macroscopic world were generated by microscopic phenomena governed by pure chance.

The random behavior of billions upon billions upon billions of atoms and other elementary particles determines the apparently predictable properties of the world we experience.

In the microscopic world almost nothing is impossible, but the combination of countless microscopic probabilities divides the macroscopic world between the practically certain, which we observe, and the absurdly improbable, which has never been observed and never will be.

The reason hot water does not spontaneously separate from cold water, handing us endless free energy to keep warm, is not that some god decreed the second law of thermodynamics to frustrate humankind, but that it is very, very unlikely for an enormous number of particles with different speeds (temperatures) to sort themselves apart by chance.

The second law thus derives from these probabilistic calculations. Can it, in theory, be violated? Yes. Has it ever happened, or will it ever happen? The entire history of the universe will not be long enough for it to happen.
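To make "absurdly improbable" concrete, here is a standard back-of-the-envelope illustration (not taken from the article): the probability that N freely moving molecules all gather by chance in one half of a container is roughly (1/2)^N.

```python
# (1/2)**N shrinks so quickly that ordinary floating point soon gives up;
# for the ~10**23 molecules in a glass of water it is beyond astronomically small.
for n in (10, 100, 1000, 2000):
    print(f"N = {n:5d}  ->  probability ~ {0.5 ** n:.3e}")
# N =    10  ->  probability ~ 9.766e-04
# N =   100  ->  probability ~ 7.889e-31
# N =  1000  ->  probability ~ 9.333e-302
# N =  2000  ->  probability ~ 0.000e+00   (underflows: smaller than any double)
```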

Another striking example: perhaps not everyone knows that teleportation already exists and is routinely exploited in scanning tunneling microscopes.

In theory, all matter can teleport anywhere in the universe (while respecting the speed of light).

However, the probability of this happening drops dramatically as the mass increases. Electrons do it quite often, over sub-nanometer distances: this is the tunneling effect that these microscopes exploit.

But even for a whole atom it is so unlikely that it has never been observed; for an entire spaceship crew, you really do need a science fiction movie.

It would therefore seem that the AI approach is correct: where there is sufficient correlation, there is also causation.

Where "sufficient", however, can mean a number immensely larger than any computational capacity we will ever have.

How do you teach science to an AI?

Let's imagine we want to teach the second law of thermodynamics, from the previous example, to a neural network.

We would have to make it observe the random motion of a great many particles, in a great many different situations. A huge model might be needed just to respect thermodynamics.

Let me give an example from my professional experience: I have studied various methods of applying AI to process control.

In my case these are thermal systems, but suppose we want to keep a car's speed constant by implementing a cruise control.

If we wanted to calculate exactly how much power is needed ab initio, that is, from first principles, we would have to know the road slope, the wind direction and strength, the car's load, the outside temperature, pressure and humidity, the fuel, the engine, tire wear, and perhaps more. We would also need an analytical model that ties all these parameters together.

We would travel in a kind of mobile laboratory, but we could always determine exactly how much to “step on the accelerator”.
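To give an idea of what such a first-principles calculation looks like, here is a deliberately simplified sketch; every value below (mass, slope, drag coefficient, rolling resistance, headwind) is an illustrative assumption, not something taken from the article:

```python
# Back-of-the-envelope, first-principles estimate of the power a cruise control
# would need to hold a constant speed. All parameter values are illustrative.
import math

def required_power(v, slope_deg=2.0, m=1500.0, c_rr=0.012,
                   rho_air=1.2, c_d=0.30, area=2.2, headwind=3.0):
    """Power [W] to hold speed v [m/s] on a constant grade with a steady headwind."""
    g = 9.81
    slope = math.radians(slope_deg)
    f_grade = m * g * math.sin(slope)                 # climbing force
    f_roll = c_rr * m * g * math.cos(slope)           # rolling resistance
    v_air = v + headwind                              # airspeed seen by the car
    f_aero = 0.5 * rho_air * c_d * area * v_air ** 2  # aerodynamic drag
    return (f_grade + f_roll + f_aero) * v            # P = F_total * v

print(round(required_power(25.0) / 1000, 1), "kW")    # 25 m/s = 90 km/h -> ~25 kW
```

Even this toy version already needs eight parameters, and it still ignores temperature, humidity, fuel, engine state and tire wear.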

Impractical? Then let's train an AI instead. We have three choices. First, we could give it a real car and let it drive in all possible conditions, so that the AI learns the underlying physical laws. But it would destroy billions of cars in the process.

Or we could record the driving conditions of billions of motorists, but we would lack much of the background information needed to define the problem (cars, after all, are not mobile laboratories…).

Finally, given a complete model, we could create a virtual car and teach the AI to drive with it. But we would still spend immense computational resources exploring countless physically meaningless answers.

In any case, the learning would need to be validated in many different situations, and, remaining probabilistic in essence, we could never be sure it does the right thing.

The frontier approach is to harness the AI inside an approximate physical model, preventing it, during training, from wasting time on endless impossible solutions and, during execution, from returning implausible answers.

This approach is called a Physics-Informed Neural Network (PINN): a neural network informed by a physical model.
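As a rough illustration of the idea (and not the author's actual process-control code), a PINN for the cruise-control example could combine a data-fitting loss with a physics residual enforcing an assumed first-order model m·dv/dt = F − c·v; a minimal sketch using PyTorch:

```python
# Minimal PINN sketch (an assumption-laden illustration, not the author's code).
# The network v(t) predicts a car's speed over time; the loss combines a data term
# (a few measured points) with a physics residual enforcing m*dv/dt = F - c*v.
import torch

torch.manual_seed(0)
m, F, c = 1200.0, 2400.0, 60.0             # mass [kg], drive force [N], drag [N*s/m]

net = torch.nn.Sequential(                 # tiny fully connected net: t -> v(t)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# A few synthetic "measurements" (exact solution of the ODE with v(0) = 0)
t_data = torch.tensor([[0.0], [5.0], [10.0]])
v_data = (F / c) * (1 - torch.exp(-c * t_data / m))

# Collocation points where only the physics, not data, is enforced
t_phys = torch.linspace(0.0, 30.0, 100).reshape(-1, 1).requires_grad_(True)

for step in range(3000):
    opt.zero_grad()
    loss_data = torch.mean((net(t_data) - v_data) ** 2)        # fit the measurements

    v_pred = net(t_phys)                                       # physics residual:
    dv_dt = torch.autograd.grad(v_pred, t_phys,                # m*dv/dt - (F - c*v)
                                grad_outputs=torch.ones_like(v_pred),
                                create_graph=True)[0]
    loss_phys = torch.mean((m * dv_dt - (F - c * v_pred)) ** 2)

    (loss_data + loss_phys).backward()
    opt.step()
```

The physics term is what forbids the network from "wasting time on impossible solutions": any candidate that violates the assumed model is penalized during training.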

Since training a PINN acceptably often requires a fairly complete model, in my personal experience it was ultimately more cost-effective to use the model itself and leave the AI aside.

Cars rely on a family of algorithms called PID (Proportional, Integral, Derivative), which is very widespread in automatic control.

With just three parameters and a very simple formula, it is possible to keep even rather chaotic processes under acceptable control.
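For comparison, a complete discrete PID loop fits in a few lines. The toy plant model and the three gains below are assumptions for illustration, not values from the article:

```python
# A minimal discrete PID cruise controller, purely illustrative: the plant model
# (first-order: m*dv/dt = u - c*v) and the three gains are assumptions.

def cruise_control(target=25.0, seconds=60.0, dt=0.1):
    m, c = 1200.0, 60.0                 # car mass [kg], linear drag [N*s/m]
    kp, ki, kd = 800.0, 40.0, 10.0      # the three PID parameters
    v, integral, prev_err = 0.0, 0.0, target
    for _ in range(int(seconds / dt)):
        err = target - v                      # distance from the set point
        integral += err * dt                  # accumulated error (I term)
        derivative = (err - prev_err) / dt    # error trend (D term)
        u = kp * err + ki * integral + kd * derivative   # drive force [N]
        u = max(0.0, min(u, 6000.0))          # crude actuator limits
        v += (u - c * v) / m * dt             # very simple car model
        prev_err = err
    return v

print(round(cruise_control(), 2), "m/s")      # settles near the 25 m/s set point
```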

It is illuminating how preferable an equation with three parameters can be to a neural network that has thousands or millions of them.

Andrea Vestrucci's recent article here on Appunti explained the underlying problem in theoretical terms: current large language models (LLMs) are based on purely probabilistic neural networks, which struggle to learn "hard" laws, such as those of logic, mathematics and physics, in the infinite space of reality.

To understand rules, we would need an AI that understands what a rule is at a fundamental level, just as today's neural networks are built entirely on the concept of the probability that a signal passes from one neuron to another.

Earlier attempts to build rule-based systems were abandoned in the 1980s; but neural networks, too, were once left in the dustbin of computing history.

More powerful hardware and innovative approaches have brought them to the heart of the current digital revolution: it is always worth carefully following the efforts made in previously abandoned areas.

(Excerpt from Stefano Feltri's newsletter Appunti)


This is a machine translation from Italian of a post published on Start Magazine at the URL https://www.startmag.it/innovazione/perche-lintelligenza-artificiale-non-e-davvero-intelligente/ on Sun, 31 Dec 2023 06:14:38 +0000.