Bits of Entropy

On Limits of Superintelligence


It's hard to pin down a rigorous definition of artificial superintelligence (ASI). Everyone seems to agree that it is something better than artificial general intelligence (AGI), which is human-level, but in what way? Here are some thoughts by well-known and respected thinkers who have opined on the matter:

With the exception of Bostrom, the speakers above seem to imply that superintelligence will be beyond our reasoning. They emphasize a qualitative difference in intelligence as opposed to merely a quantitative one.

Better memory and recall, faster logical inference, mastery of abstractions -- these are examples of quantitative superiority, like a computer with a faster CPU and more RAM. Qualitatively higher intelligence, I'd argue, must necessarily wield a different (more powerful) kind of mathematics, the kind we have no hope of getting a handle on. If ASI relies on conventional math then we will be able to comprehend it, even if it takes hundreds of years of collective human effort to do so. In contrast, Urban's bumblebee and Tegmark's ladybug will never ever understand even the simplest of human mathematics.

As luck would have it, theoretical computer science may just have the right framework to reason about this stuff -- expressive power or expressivity, which describes the complexity of ideas a system can represent (or express). Based on their expressivity, computational systems form a hierarchy, the Chomsky hierarchy.

The simplest such system is a finite automaton, a state machine with no memory. Its expressive power is equivalent to that of regular expressions, which we use for all sorts of string searching.
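To make this concrete, here's a minimal Python sketch of a finite automaton. The three-state DFA below is my own illustrative example; it recognizes the same language as the regular expression (a|b)*ab, and we can check the two agree:

```python
import re

# Hand-built DFA over the alphabet {a, b} accepting strings that end in "ab".
# State 0: no progress; state 1: last char was 'a'; state 2: last two were "ab".
dfa = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}

def accepts(s, start=0, accepting=(2,)):
    state = start
    for ch in s:
        state = dfa[(state, ch)]
    return state in accepting

# The DFA and the equivalent regular expression agree on every input.
for s in ["ab", "aab", "abb", "ba", ""]:
    assert accepts(s) == bool(re.fullmatch("(a|b)*ab", s))
```

Note the automaton carries nothing between steps except which of its three states it is in; that single number is its entire "memory".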

But you know how modern text editors automatically track whether parentheses are balanced? Well, that idea can't be expressed with a finite automaton -- its expressivity is too low. It needs some kind of memory to keep track of all the open parentheses "(" so that it can match them with the corresponding close parentheses ")". We need a pushdown automaton (PDA) for this task, a state machine equipped with stack memory (a.k.a. a first-in/last-out queue).
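The stack idea fits in a few lines of Python. This is a sketch, not a formal PDA, but the list named stack plays exactly the role of the PDA's stack memory:

```python
def balanced(s):
    """Check whether parentheses in s are balanced, PDA-style."""
    stack = []  # the PDA's stack memory
    for ch in s:
        if ch == "(":
            stack.append(ch)          # remember an open paren
        elif ch == ")":
            if not stack:             # close paren with nothing to match
                return False
            stack.pop()               # match it with the latest open paren
    return not stack                  # every open paren must be matched

assert balanced("(()())")
assert not balanced("(()")
assert not balanced(")(")
```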

And of course there are limits to pushdown automata too. Suppose we're given a text and asked to find all substrings of the form ww -- some string w immediately followed by the same w again (e.g. bonbon, tintin, abcdabcd). A single-stack PDA can't crack it; it would need at least two stacks (or two FILO queues).
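With random-access memory, though, the task is easy. Here's a brute-force Python sketch (the function name doubled_substrings is my own) that finds every substring of the form ww by comparing the two halves of each even-length substring:

```python
def doubled_substrings(text):
    """Return the set of substrings of text that have the form ww."""
    found = set()
    n = len(text)
    for i in range(n):
        for j in range(i + 2, n + 1, 2):   # even-length substrings only
            half = (i + j) // 2
            if text[i:half] == text[half:j]:
                found.add(text[i:j])
    return found

assert doubled_substrings("abonbonc") == {"bonbon"}
```

The comparison text[i:half] == text[half:j] jumps freely between the two halves -- precisely the kind of access a single stack cannot provide.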

At the top of the Chomsky hierarchy we find the Turing machine (TM), a mathematical model of the modern general-purpose computer (except it has an unbounded tape, i.e. infinite RAM). Incidentally, Alan Turing invented said machine as a means to reason about the decision problem in mathematical logic. What he discovered along the way, though, was the Halting Problem -- a problem that is famously unsolvable by Turing machines (the highest expressive power!) and marks the limit of computation itself.
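A Turing machine itself is simple enough to simulate in a dozen lines of Python. Here's a sketch of a simulator, run on a tiny machine of my own devising that increments a binary number (starting with the head on the rightmost digit):

```python
def run_tm(rules, tape, state, head):
    """Run a Turing machine until it reaches the 'halt' state.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R" and "_" is the blank symbol.
    """
    tape = dict(enumerate(tape))          # sparse tape: position -> symbol
    while state != "halt":
        sym = tape.get(head, "_")
        write, move, state = rules[(state, sym)]
        tape[head] = write
        head += {"L": -1, "R": 1}[move]
    lo, hi = min(tape), max(tape)
    return "".join(tape.get(i, "_") for i in range(lo, hi + 1)).strip("_")

# Binary increment: walk left turning 1s into 0s until a 0 (or blank)
# absorbs the carry.
inc = {
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "L", "halt"),
    ("carry", "_"): ("1", "L", "halt"),
}
assert run_tm(inc, "1011", "carry", 3) == "1100"   # 11 + 1 = 12
assert run_tm(inc, "111", "carry", 2) == "1000"    # 7 + 1 = 8
```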

At this point we might want to ponder the expressive power of the human mind. Since the human mind is the benchmark for AGI, finding its ranking in the Chomsky hierarchy may shed some light on the limits (if any) of superintelligence. For instance, if human intelligence is the best our corner of the Universe can physically accommodate, then there's no literal ASI -- any physical system will be at most as expressive as the human mind, and no higher.

Before we can tackle this question we need to review how to compare the expressivity of two systems. How do we know, for example, that a pushdown automaton is more expressive than a finite automaton? Here's how:

  1. A pushdown automaton can simulate a finite automaton. That's trivial: just ignore the stack and there you go.
  2. A finite automaton cannot simulate a pushdown automaton, which can be proven using the pumping lemma.
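The second point can even be illustrated by brute force. The sketch below (my own toy experiment, not a proof technique from the literature) enumerates every DFA with up to three states over the alphabet {(, )} and confirms that none of them agrees with the balanced-parentheses language; the pigeonhole argument behind the pumping lemma guarantees this failure for any fixed number of states, and three is just small enough to enumerate:

```python
from itertools import product

def is_balanced(s):
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

alphabet = "()"
states = range(3)
# All strings up to length 5 -- long enough: by pigeonhole, any 3-state
# DFA already confuses two of the prefixes "", "(", "((", "(((".
strings = [""] + ["".join(p) for n in range(1, 6)
                  for p in product(alphabet, repeat=n)]
labeled = [(s, is_balanced(s)) for s in strings]

matched = 0
# Every 3-state DFA = a transition table plus a set of accepting states.
for table in product(states, repeat=6):
    delta = {(q, a): table[q * 2 + i]
             for q in states for i, a in enumerate(alphabet)}
    for accepting in product([False, True], repeat=3):
        ok = True
        for s, want in labeled:
            q = 0
            for ch in s:
                q = delta[(q, ch)]
            if accepting[q] != want:
                ok = False
                break
        matched += ok

assert matched == 0  # no 3-state DFA recognizes balanced parentheses
```

The same experiment fails for any state count you pick, which is exactly what the pumping lemma formalizes: a fixed number of states can only count so deep.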

Similarly, when we hear something like "Conway's Game of Life (GoL) is Turing-complete", it means the following:

  1. A Game of Life board can simulate a Turing machine.
  2. A Turing machine can simulate a Game of Life board.

Thus the Game of Life and the Turing machine have the same expressive power, i.e. GoL is Turing-complete.
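To get a feel for the Game of Life as a computational system, here's a minimal Python sketch of its update rule (my own toy implementation, unrelated to any particular GoL-to-TM construction), verified on the glider -- the moving pattern that serves as a signal carrier in Turing-machine constructions inside GoL:

```python
from collections import Counter

def step(cells):
    """One Game of Life generation; cells is a set of live (x, y) pairs."""
    # Count how many live neighbors every coordinate has.
    counts = Counter((x + dx, y + dy)
                     for x, y in cells
                     for dx in (-1, 0, 1)
                     for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # A cell is alive next if it has 3 neighbors, or 2 and is already alive.
    return {c for c, n in counts.items()
            if n == 3 or (n == 2 and c in cells)}

# The glider: a five-cell pattern that crawls diagonally across the board.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
g = glider
for _ in range(4):
    g = step(g)
assert g == {(x + 1, y + 1) for x, y in glider}  # moved one cell diagonally
```

Despite the rule fitting in a few lines, gliders and their collisions are enough to build logic gates, which is how the simulation of a Turing machine is assembled.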

So what about the human mind? Can it simulate a Turing machine? Of course! Given the description of a Turing machine, a human can easily simulate it with pen and paper. Using paper is not cheating in any way -- it's just a tool; it doesn't qualitatively improve our cognitive abilities, it merely gives us "better RAM", if you will. Just like a computer wouldn't acquire a higher expressive power if we attached a fancy new GPU to it.

Can a Turing machine simulate a human mind? From what we currently know about the human mind, it is in principle possible, unless we invoke some kind of dualism (a soul, an extra-physical nature of mind, etc.). But from a dualist point of view even AGI is likely unachievable, and the discussion is moot.

Assuming a Turing machine can simulate a mind, it stands to reason that our minds are Turing-complete (to be pedantic, they're equivalent to a linear bounded automaton, as we don't possess infinite memory). And so, to be truly beyond our reasoning, superintelligence must have expressive power above that of a Turing machine. Such systems are called super-Turing machines or hypercomputers.

So let's assume that it is possible to have an AGI running on some big computer. It is also possible to give it the goal of building a better artificial intelligence. However, if ASI is super-Turing, then it can't possibly run on a Turing-machine-inspired system -- it's just not powerful enough.

Could an AGI build a super-Turing machine (or provide a blueprint thereof) and then program it to be ASI? To do that it would need to be able to conceive of such a machine. Can a low expressivity system conceive of a higher expressivity system? Wouldn't conceiving be equivalent to simulating it? I don't know a good framework that could help us answer these questions.

More importantly -- is it physically possible to build a super-Turing system? Is our corner of the Universe powerful enough to run a hypercomputer? To tackle this question we need to turn to the Church-Turing thesis, which has many formulations, but the gist is that any computation can be carried out by a Turing machine. It spawned a further inquiry: can any physical system be simulated by a Turing machine? A lot of research has been done on this subject. The contributors include such legends as Robin Gandy (a computer scientist and Turing's student) and David Deutsch (one of the founders of quantum computation). What they've arrived at is called the Total thesis:

Every physical aspect of the behavior of any physical system can be calculated (to any specified degree of accuracy) by a universal Turing machine.

This may need a bit of unpacking. The remark about any specified degree of accuracy implies that every discrete physical system is exactly calculable by a Turing machine. So our hypothetical super-Turing device has to be continuous; it has to be an analog machine. But not just that -- it must function such that no discrete approximation can replicate its functionality without loss of utility.

For example, consider a simple audio speaker -- it is an analog device, with a knob that lets you adjust the volume to the minutest precision. But it can be approximated by a discrete equivalent -- instead of an analog knob it could have, say, 50 discrete levels of volume controlled by two buttons (+/-). In this scenario you can't set the volume between level 49 and level 50, but chances are the human ear can't perceive such a tiny difference anyway. So we have a discrete system approximating the functionality of an analog system without loss of utility. The ASI must be different; it must be non-approximable in this way. Are such non-approximable continuous systems even possible? At this point I don't know; it's a subject for further learning and research.
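The 50-level knob fits in a couple of lines of Python (quantize_volume is a hypothetical name for this sketch):

```python
def quantize_volume(level, steps=50):
    """Snap a continuous volume in [0.0, 1.0] to one of `steps` discrete
    levels -- an analog knob replaced by +/- buttons."""
    level = min(max(level, 0.0), 1.0)          # clamp to the knob's range
    return round(level * (steps - 1)) / (steps - 1)

# Sweeping the knob through 1001 positions still yields only 50 outputs.
distinct = {quantize_volume(i / 1000) for i in range(1001)}
assert len(distinct) == 50
```

However finely you turn the input, the output collapses onto 50 values -- and for a speaker that loses nothing of practical use, which is exactly the property a hypercomputer would have to lack.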

Finally, I'd like to point out one other problem with ASI. Suppose it is physically possible and it has been built. How do we know it is ASI? If it is beyond our reasoning, its actions are as incomprehensible to us as our actions are to squirrels. Did you know, for example, that trees in a forest can exchange information via mycelial networks (underground networks of fungal filaments)? What if a forest is an ASI and it takes 100 years to produce one thought? If ASI is beyond our reasoning, there's no way for us to detect such an intelligence, and for all we know it may be all around us.

So, what does it all mean? For one, there is a chance that Turing-level expressivity is the best our corner of the Universe can accommodate and super-Turing systems are impossible, which in turn precludes an ASI that's "beyond our reasoning". In practice, though, it doesn't change much -- an AGI can be so much faster at reasoning and recall that it would take us humans ages to catch up with its findings, effectively making it "beyond our reasoning", at least on the scale of a human lifespan. I'd just stay away from bombastic claims about ASI being to us as we are to bumblebees.