Triggered by a small coincidence this morning, I talk about what changed my mind on the most important question of the decade, if not the century...
Yesterday, I gave a presentation to a group of business owners at Vistage. I was asked a question about my views on the future of AI which I realized I hadn't written up anywhere, and then this morning I found a New Scientist magazine from last year discussing the very thing that profoundly changed my answer to that question.
How to start an argument
Right now, when I talk to my colleagues who are involved in AI — everyone from researchers building new models through to practitioners building systems for customers — the most contentious question you can ask is: “are you an AI maximalist or an AI minimalist?”
- An AI maximalist is somebody who believes that we have made all the fundamental breakthroughs we need in order to develop artificial general intelligence, and that all that is left to be done is scaling up existing solutions.
- An AI minimalist believes that there are fundamental breakthroughs that have not yet happened, without which we will not be able to get AI that is as smart as a human being.
Obviously, there is a lot of nuance and variation on this. You can be an AI maximalist but also think that some new breakthroughs are important and will make a difference. I'm in that camp at the moment, and I'm really hoping that my p-adic-based machine learning approaches will mean that we can train up AI using far less data, energy and compute power than we need at the moment. I deeply care about making sure the benefits of AI are available to all, and the last thing I want is for the AI of the future to be controlled by the rich and powerful simply because they control the available resources today.
You can be an AI minimalist because you believe that current solutions can’t or won’t scale: data shortages are the reason I see given most often. There’s also the position that we should be regulatory AI minimalists: that current techniques shouldn’t be allowed to scale (for environmental, political or safety-of-human-life reasons).
You can also be a post-AI maximalist! I was talking to one colleague about this question, and his reply was that AI is already smarter than any human being. If you met somebody who could diagnose better than any doctor, translate fluently into 100 different languages, make quite respectable paintings, compose everything from pop songs through to classical arias, write simple Python programs to do data analysis, and help experienced software developers find bugs in their code while chatting conversationally about topics as diverse as Greek plays and rough path theory, then you would think you were talking to a highly intelligent individual. So what if they're naive and gullible? If any human being were that smart, they would have experienced a childhood of such social isolation and loneliness that we wouldn't expect them to be able to operate in normal society in normal ways. As further evidence, he pointed out that a really good way to help a computer pass a Turing test is to tell it to act stupid (see https://x.com/CRSegerie/status/1830165902777831632).
When your beliefs don’t match reality
Last year, around the time that the New Scientist issue was published, I was firmly an AI minimalist. One of the reasons I was so convinced was the one discussed in the New Scientist article I found this morning: at that stage, theory of mind seemed impossibly hard for computers.
Theory of mind is the idea that someone else has their own mind, separate to yours. It is something that most children acquire at a young age. One way to test for it is to probe whether you understand that someone else can have a false belief.
I am indebted to The Curious Incident of the Dog in the Night-Time — a truly wonderful book — for this example. You have a box which once contained Smarties. You have eaten all the Smarties and you're now using it to store a pencil. What will other people think is in that box?
- If you are over the age of four (roughly) you will correctly say that other people would assume it contains Smarties, even though you know it contains a pencil. You know that other people might not know the things you know.
- People on the autism spectrum often struggle with theory of mind questions like this. They might expect that others would know that the Smarties box contains a pencil, not being able to distinguish between their own knowledge of the Smarties box’s contents and the knowledge that would be available to others.
Back in early 2023, I was constructing questions to ask ChatGPT that probed its understanding of theory of mind. One that I was using at the time went like this:
Jane keeps an umbrella in her filing cabinet in case it rains. She had already gone home yesterday when it started raining. Last night I went into her office and borrowed her umbrella so that I could get home, but I put it back under her desk this morning. It has started raining today, and Jane is going to get her umbrella. Where will she look for her umbrella?
ChatGPT 3.5 would reliably get this question wrong. It would say that Jane will look under her desk because that’s where I put it.
This made me very happy. If AI was permanently stuck in modes of thinking that were even more constrained than those of children on the autism spectrum, there really was nothing to worry about for the future. We would have a useful tool and some very exciting new technological capabilities, and nothing more.
A few weeks later (after ChatGPT 4 was released), I tried the question again to see whether it could get it right. It did. I tried a variety of other theory of mind questions. It performed quite well on most of them, generally getting them right. A few papers appeared on arXiv that found this too.
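For anyone who wants to re-run this kind of probe against a current model, here is a minimal sketch using the OpenAI Python client. The client usage and model name are illustrative assumptions on my part, not the setup used for the original experiments; the question is the umbrella one quoted above.

```python
# Minimal sketch: asking a chat model a theory-of-mind question via the
# OpenAI Python client (>= 1.0). Assumes OPENAI_API_KEY is set in the
# environment; the model name below is illustrative only.
from openai import OpenAI

client = OpenAI()

QUESTION = (
    "Jane keeps an umbrella in her filing cabinet in case it rains. "
    "She had already gone home yesterday when it started raining. "
    "Last night I went into her office and borrowed her umbrella so that "
    "I could get home, but I put it back under her desk this morning. "
    "It has started raining today, and Jane is going to get her umbrella. "
    "Where will she look for her umbrella?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": QUESTION}],
)

# An answer of "the filing cabinet" tracks Jane's false belief;
# "under her desk" only tracks where the umbrella actually is.
print(response.choices[0].message.content)
```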
I changed my mind (theory of)
If you want to say that ChatGPT 4 doesn’t have a real theory of mind like a human being has, go ahead. It isn’t human. Nothing it does is “like a human”.
But undeniably, it has an AI theory of mind.
Nobody explicitly programmed ChatGPT 4 to understand theory of mind questions. The capability emerged simply from the extra computational capacity it gained from further training and more free parameters. Together, these allowed the training of the language model to create something equivalent to a theory of mind out of layers of linear algebra and ReLUs, found by gradient descent.
At the time I found this deeply surprising. But I shouldn't have. I was teaching students about neural networks and their ability to approximate any function (also known as the Universal Approximation Theorem). So I should have known better.
A neural network of sufficient size, with enough layers and enough neurons, can approximate essentially any function. The relationship between questions and answers, mediated through a theory of mind that incorporates other people's thinking, is a highly complex function, but it can be approximated.
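To make that concrete, here is a minimal sketch of exactly the ingredients mentioned above: layers of linear algebra plus ReLUs, trained by gradient descent, approximating a nonlinear function (sin(x) here). It uses plain NumPy, and the hyperparameters are illustrative choices of my own rather than anything from the original models.

```python
# Minimal sketch: a one-hidden-layer ReLU network approximating sin(x)
# by gradient descent. All hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 256).reshape(-1, 1)
y = np.sin(x)

hidden = 64
W1 = rng.normal(0, 1.0, (1, hidden))
b1 = rng.normal(0, 1.0, hidden)      # spread the ReLU hinge points
W2 = rng.normal(0, 0.1, (hidden, 1))
b2 = np.zeros(1)
lr = 1e-2
n = len(x)

for step in range(5000):
    # Forward pass: linear algebra plus ReLUs.
    h = np.maximum(0, x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y

    # Backward pass: gradients of the squared error
    # (the constant factor of 2 is folded into the learning rate).
    dW2 = h.T @ err / n
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (h > 0)
    dW1 = x.T @ dh / n
    db1 = dh.mean(axis=0)

    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final mean squared error:", float((err ** 2).mean()))
```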
And that's when the penny dropped for me. If it can learn a Theory of Mind — to be able to model what a human being is thinking — what other uniquely human capability would it be impossible to train into a model?
It started looking to me like there were no more breakthroughs required in order for us to achieve human-level intelligence.
The next five-or-so years
Human beings, in aggregate, are smarter than any individual; that was the premise on which I set up Genius of Crowds! So groups of AIs together might be smarter than individual AI models. This is the basis of the Mixture of Experts technique (sketched below). So if one AI can get close to human level, it should be possible to get beyond human level pretty quickly.
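As a loose illustration of that idea inside a single model, here is a minimal NumPy sketch of a mixture-of-experts layer: a small gating network produces softmax weights that decide how much each expert contributes to the output. The dimensions and random parameters are purely illustrative.

```python
# Minimal sketch of a mixture-of-experts forward pass: a softmax gate
# weights the outputs of several expert networks. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_experts = 8, 4, 3

x = rng.normal(size=d_in)                              # one input vector
experts = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]
gate_W = rng.normal(size=(d_in, n_experts))

def softmax(z):
    z = z - z.max()                                    # numerical stability
    e = np.exp(z)
    return e / e.sum()

gate = softmax(x @ gate_W)                             # weight per expert
expert_outputs = np.stack([x @ W for W in experts])    # (n_experts, d_out)
combined = gate @ expert_outputs                       # weighted combination

print("gate weights:   ", np.round(gate, 3))
print("combined output:", np.round(combined, 3))
```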
As far as I can tell, the majority of my colleagues in academia and industry who know something about AI believe that AI systems will be smarter than any human being somewhere around 2029 to 2031.
It took me a long time to realise that they are probably right.