Traditional software is made to appear clever by human programmers. AI apps are much cleverer, and they do it all by themselves. So, can we ever trust them?
Software can be infuriating, challenging to learn, and sometimes unpredictable, buggy and stubborn. You could be forgiven for sometimes thinking that it has a mind of its own.
Despite that, it’s pretty safe to say that it doesn’t. Unexpected behaviour is more about the observer than the software itself, which is strictly deterministic. Just because it doesn’t work the way you think it does doesn’t mean it is making things up as it goes along or that it is trying to outsmart you. It is simply a category mistake to think that a traditional software program can “want” to do anything - still less rule the world.
Give a software program the same set of inputs, and you will get the same outputs. That determinism is something we should be extremely grateful for, even if we rarely notice it, especially when it comes to software in aircraft, lifts and medical ventilators. Seemingly random and unexpected behaviour would not be welcome in any of those.
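To make that concrete, here is a minimal sketch in Python (my illustration, not from the article): a deterministic function, called with the same input, agrees with itself every single time.

```python
# Determinism: the same input always produces the same output.
def checksum(data: bytes) -> int:
    """A toy deterministic function: sum of byte values mod 256."""
    return sum(data) % 256

# Call it a thousand times and collect the distinct answers.
results = {checksum(b"autopilot sensor frame") for _ in range(1000)}
assert len(results) == 1  # every call agrees
```

A thousand runs, one answer. That repeatability is exactly what certification of safety-critical software leans on.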
What's in the box?
But AI is different. You can take a conventional software program and unpack it, stepping through each piece of code and figuring out how it works, line by line, decision by decision. AI doesn’t have that level of transparency. You could say that AI is the definitive black box. And that’s a problem if you want to put AI in charge of your nuclear power station, or anything, for that matter, where human lives are concerned.
Take autonomous cars, for example. Tesla’s latest self-driving software is based end-to-end on a massive collection of neural networks, learning constantly from millions of drivers’ inputs. In some ways, it’s brilliant, and it may only be a matter of time before it is provably better than any human driver. If you’re going to say that you can’t trust AI because you don’t know intimately how it works, then you have to say the same about human airline pilots and neurosurgeons. At some point, to get things done, you simply have to trust the agent doing the job.
Maybe the problem with AI is that it hasn’t earned our trust yet. Human experts spend years perfecting their knowledge with education and experience.
When we meet a doctor, we don’t know anything about their path to qualification, but the fact that they’re called “doctor”, that they work in a hospital or clinic, and that they have all the trappings of medical doctor-hood, adds up to a powerful reason to trust them. Layered on top of this is their manner. Medical professionals sound confident and reassuring. They tick all the boxes needed for us to trust them implicitly. We don’t demand to see our doctors’ qualifications before they write a prescription, and most of the time, that’s perfectly OK.
But that kind of trust doesn’t seem adequate for AI. Maybe it’s because AI lacks any sort of “embodiment” that it seems so strange to us. How can we read its body language when it doesn’t have a body? A smiley face on a goofy low-resolution screen instead of an actual face doesn’t cut it for most of us. We know a “Speak and Spell” is not really speaking - and if we don’t, just try holding a conversation with this 1980s relic.
Perhaps what’s worse is that Large Language Models (LLMs) are rather good at sounding like they know things. That’s because they’re trained on a wide range of knowledge, as well as the language that humans typically use to discuss it. They have almost no alternative but to sound confident, even when they’re not.
Removal of doubt
These are likely to be transitional issues as the technology improves, but there will always be doubts until an AI model becomes so provably adept at a given set of tasks that it is effectively beyond criticism.
But will that ever happen? It might, for some tasks like cooking and ironing. But what about flying a plane with five hundred people on board? In principle, if an AI model works well and does so repeatedly, then we should no more doubt its capabilities than we would a human’s. But we still don’t know what’s going on inside. The AI remains a black box, in the sense that if it makes a mistake, we won’t know where to look to fix it.
I think it’s likely that we will be able to build self-reporting into AI. In some ways, it’s a necessary step. It may slow things down a bit (or a lot), but the benefits would far outweigh that, and with progress already so fast, slightly slower rapid progress is not going to hurt anyone.
But there are other ways to impose control over AI-driven activity. Professor Max Tegmark recently suggested that, as it develops, AI goes through a “phase change” analogous to the transition from a gas to a liquid or a liquid to a solid. In each case, the atoms or molecules in a substance undergo a massive simplification of their motion: particles of matter are far less mobile in solid form than in a liquid or a gas.
Tegmark makes an analogy with a moon shot. In the very early days, a rocket company might have done things by trial and error: fire a rocket at the moon, look at the results, adjust to correct the trajectory, and try again. They would have got there in the end, but they would have lost a lot of rockets and several people’s fortunes.
But it never happened like that, because most of that trial-and-error testing became unnecessary once someone worked out the equations governing gravity, mass and velocity. Knowing these, all you needed to do was populate the equations with the figures for a given rocket - velocity, direction, position of the moon, and so on - and the result would come out correct. The remarkable thing is that once the dynamics of mass and gravity were understood, the whole exercise could be computed on a processor that would be considered outdated in a washing machine.
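The point about equations replacing test launches can be made concrete with a short Python sketch (my illustration, assuming the standard Newtonian constants, not anything from the article): once you know Newton’s law of gravitation, the speed needed for a circular orbit falls out of a one-line formula rather than repeated trial flights.

```python
import math

# Newtonian constants (SI units)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m

def circular_orbit_speed(altitude_m: float) -> float:
    """Speed for a circular orbit at a given altitude: v = sqrt(G*M / r)."""
    r = R_EARTH + altitude_m
    return math.sqrt(G * M_EARTH / r)

# An ISS-style orbit at ~400 km works out to roughly 7.7 km/s -
# no lost rockets required.
v = circular_orbit_speed(400e3)
print(f"{v / 1000:.2f} km/s")
```

A few lines of arithmetic that any washing-machine processor could manage, exactly as the article suggests.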
One way to look at this is to talk about dimensions. Stay with me - I’m not going to invoke string theory’s extra dimensions. It’s really quite simple.
When an AI model enters training, it is genuinely clueless. It’s like a black box covered in controls (like a synthesiser) with no labels. Tweak one knob - or a hundred - and you’ll probably get a meaningless or useless result. But as you do this repeatedly, you’ll find settings that do make a difference. So you make a note and put a label on the machine: “If you see this pattern of behaviour, it’s statistically likely that this (an image, or a shape, or something) is what we’re looking at”. Do that often enough, and you’ll start to build a recognition machine.
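The knob-tweaking process above can be sketched in a few lines of Python. This is a deliberately tiny caricature of training (my illustration, not a real learning algorithm): one “knob” - a decision threshold - is twiddled at random, and any setting that scores better is kept.

```python
import random

random.seed(0)

# Toy labelled data: values below ~5 are class 0, above are class 1.
data = [(x, int(x > 5)) for x in [1, 2, 3, 4, 6, 7, 8, 9]]

def accuracy(threshold: float) -> float:
    """How often does 'x > threshold' match the true label?"""
    return sum(int(x > threshold) == label for x, label in data) / len(data)

# "Tweak the knob" at random; keep any setting that scores better.
best_t, best_acc = 0.0, accuracy(0.0)
for _ in range(200):
    t = random.uniform(0, 10)
    if accuracy(t) > best_acc:
        best_t, best_acc = t, accuracy(t)

print(best_t, best_acc)  # a threshold between 4 and 6 scores perfectly
```

Real training replaces random twiddling with gradient descent across billions of knobs, but the loop - adjust, score, keep what works - is the same shape.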
So, eventually, your AI model will “learn” about putting rockets accurately on the moon. But it might take a supercomputer to do it. That’s because it doesn’t really “know” what it’s doing. It’s just based on a few good results pointing it in the right direction.
But what if another layer in your AI model could detect and isolate mathematical trends? It’s just possible that it would pick up the regularities that hint at an underlying formula. These AI-generated equations might not look exactly like ours, but they would be functionally the same. In a sense, the “dimensionality” of the problem has been reduced: potentially thousands of spurious or coincidental data points can safely be put in the bin because they don’t affect anything.
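Here’s a toy version of that formula-finding step (my sketch, assuming clean measurements - not a real symbolic-regression system): given a handful of fall times and distances, a one-coefficient least-squares fit recovers the law d = ½gt², collapsing the whole dataset into a single number.

```python
# Recover the law d = 0.5 * g * t^2 from "observed" fall data.
g_true = 9.81
times = [0.5, 1.0, 1.5, 2.0, 2.5]
dists = [0.5 * g_true * t**2 for t in times]  # noiseless observations

# Closed-form least-squares fit of the one free coefficient a in d = a * t^2:
#   a = sum(t^2 * d) / sum(t^4)
a = sum(t**2 * d for t, d in zip(times, dists)) / sum(t**4 for t in times)
g_est = 2 * a

print(f"g \u2248 {g_est:.2f} m/s^2")  # the dataset reduces to one number
```

Five data points in, one coefficient out: that is the dimensionality reduction in miniature, and the recovered formula can then run on hardware far smaller than whatever found it.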
This rapid and drastic reduction in dimensionality is what allows us to let AI find a result and then put the essence of the process onto a much smaller computer than was needed in the early stages. It also means we’re much closer to having a verifiable AI. This is what Tegmark means by a “phase change”. It’s a loose analogy, but a helpful one.
It might mean that we can be much closer to a testable, verifiable, and deterministic AI future, where we can see into the black box, see what’s coming up next, and take corrective measures if we have to.