Tim sits down with Max Bennett to explore how our brains evolved over 600 million years—and what that means for understanding both human intelligence and AI. Max isn't a neuroscientist by training....
What's really interesting about this book, Max, is, you know, obviously, I've read loads and loads of books in the space, and there's, you know, people like Hinton and Hawkins and Damasio and Friston and, I mean, god, you know, even Sutton. And what's interesting is it's a bit like the blind men and the elephant. They've all got a completely different story to tell, and I think the magic that you have pulled off with this book is somehow you've woven it together into a coherent story. What do you think about that?
Well, first, I'm very appreciative of the kind words. Yeah. I think I came from a very unique perspective just because I was a complete outsider. And I didn't come to it with the objective of writing an academic book at all. I came to it from the objective of I was just learning on my own.
And I just started building this corpus of notes because I was so independently curious. And I kind of stumbled on this idea, really for myself, of how do I make sense of all of these disparate opinions and really this complete lack of information about how the brain actually works. And I had my own set of, I think, biases coming from the technology entrepreneurial world, where we tend to think about things as ordered modifications. When you think about product strategies or how to roll things out, we like to think about what's step one, then what's step two, then what's step three. So I think I did have a cognitive bias, when presented with an incredible amount of complexity, to try and make sense of it in a similar type of way.
But yeah, I think as an outsider, I felt very free to explore and cross the boundaries between fields. I look at the book as a merging of three fields. One is comparative psychology, so trying to understand the different intellectual capacities of different species. Another is evolutionary neuroscience, so what do we know about the past brains of humans and the ordered set of modifications by which brains came to be. And then AI, which is how we ground the highfalutin conceptual discussions about how the brain works in what works in practice. That's a really important grounding principle because it holds us accountable to the principles we think work: if we can't implement them in AI systems, it should make us question whether we actually have the ideas right.
So yeah, I think being an outsider in some ways comes with disadvantages, but there are some advantages also, which is you're free to borrow from a variety of different fields and think freshly about things.
Yes. If you can point to any particular ideas that you found really, really difficult to reconcile, what would those be?
One thing that's really challenging is, if we were to actually lay out the data richness of comparative psychology studies across species, if you put that on a whiteboard and looked at it, you would realize we have so little data on what intellectual capacities different animals in fact have. For example, the lamprey, which is the canonical animal used as a model organism for the first vertebrates because, of all vertebrates alive today, it's one of our most distant vertebrate cousins. To my knowledge, there are absolutely no studies examining map-based navigation in the lamprey. So we have no idea if it's in fact capable of recognizing things in three-dimensional space. Now, if we look at other vertebrates like teleost fish, it seems like they're eminently capable of doing that.
We look at lizards, they're eminently capable of doing that. So we infer that it seems likely the first vertebrates were able to do this. We know the brain structures from which it emerges in reptiles and teleost fish are present in the lamprey. So we back into an inference that, okay, probably the lamprey can do that. But this is all, in some sense, guessing and trying to put the pieces together from very little information.
So I think that's one challenging aspect to reconcile. The other one that's really hard is that in neuroscience, there are a lot of really interesting ideas about how the brain might work that have not really been tested in the wild from an AI perspective. And then there are a lot of AI systems that work really well that have diverged substantially from, at least, what the evidence suggests is how brains work. So how you bridge the gap between these two things is a really fascinating space to operate in: what can we learn about the brain, if anything, from the success of transformers, as an example? What can we learn, if anything, from the success of generative models in general?
What can we learn from the successes and failures of modern reinforcement learning? I mean, in some ways, reinforcement learning has been a success. In other ways, it's really fallen short of what a lot of people hoped it would be. So I think that the gap between neuroscience and AI is still a challenging one to bridge in a lot of ways. For example, Karl Friston has all these incredible ideas in active inference.
In one hundred years, will we look back on this and say Karl Friston was onto something? If you look at the AI systems today, there's very little usage of active inference principles working in practice. So that could mean that the ideas don't have legs, or it could mean that there's a breakthrough around the corner and we're actually missing some of the key principles that he's devising. And these are, you know, questions we don't have the answers to.
I think there might possibly be some breakthroughs around the corner. I don't know if you know, but I'm Karl Friston's personal publicist. I do all of his stuff. I've probably interviewed him more than anyone else, but I
I love Friston. He's an amazing guy. Yeah. He's an amazing man.
Honestly, he is the man. He is the man. And
So kind.
I know. But for me as well, he has so much time to explain things. But, yeah, then you could cynically argue that, you know, the effective active inference agent is just a reinforcement learning agent of a particular variety. I think it's equivalent to, like, an inverse reinforcement learning maximum entropy agent or something. But there's so much more than that.
There's so much richness and explanatory power in modeling this thing as a generative model that can generate policies and plans of actions and so on. And also, we wanna have agents that, you know, we understand what they're doing, with steerability and being able to do the simulations that you talk about so eloquently in your paper. But I love... for instance, we'll do him properly later. But to come back to what you were saying, you were saying there might be a parallel between transformers and the brain. In your book, actually, on this page, you sort of analogize model-based reinforcement learning and the neocortex.
And, of course, I interviewed Hawkins back in the day, and the main criticism of his book is, you know, the triune brain type argument. So he's kind of giving the explanation that the brain developed a bit like geological strata, with one layer and then another layer, rather than coevolving together. And it's so hard not to think like that, because you give so many beautiful examples in your book, not only morphologically but in terms of capability, of how the neocortex... I mean, you said, like, with stroke victims, for example, it's not like the brain recovers those dead cells. It learns to repurpose those functions in other parts of the brain. So it seems like Mountcastle was correct that the neocortex is just this magic general-purpose learning system.
I mean, what do you think?
So I think, well, there's different ways to look at the neocortex enabling things like mental simulation and model-based reinforcement learning. One is that that function and algorithm is being implemented in the neocortex, but another, which is a slightly less strong claim, and the one I would make, is that the addition of the neocortex enables the overall system to engage in this process. So it is not saying that the entire process is implemented in the neocortex. I think it seems very clear that the thalamus and basal ganglia are essential aspects of enabling the pausing, the mental simulation, the modeling of one's own intentions, the evaluation of the results, etcetera. But it is possible to say, which is what I'm arguing in the book, that in the absence of the neocortex that process does not happen.
And so I think where my ideas would synergize with what Hawkins is saying is that the neocortex builds a very rich model of the world, a model of sufficient richness that you can explore it in the absence of sensory input. And that's a really essential aspect of model-based reinforcement learning, because if I have a model of the world rich enough that I can mentally simulate actions that I'm not actually taking, and it at least somewhat accurately predicts the real consequences of those actions, then that model is really useful, because I can now imagine outcomes before having them and flexibly adjust to new situations. And of course there are so many really deep, interesting questions that are yet to be answered about that. For example, just because you can render a simulation of the world doesn't answer the question, okay, what do you simulate?
I mean, one of the hardest problems of model-based reinforcement learning is: fine, you can have a model of the world, but how do you prune the search space of which aspects of that model you explore before evaluating outcomes? And so that's another really hard challenge that I think there's a lot of good evidence is actually a partnership between the neocortex and basal ganglia, which is a much older structure. So yeah, I'm not really of the... you know, I think the triune brain has been, amongst evolutionary neuroscientists, largely discredited. And I think in part somewhat unfairly, because if you actually read MacLean's writings, he actually is very open about the fact that, you know, this is an approximation. It's not exactly accurate.
He couches his claims much more than the popular culture, which just converted it into a dogma. But I think the popular interpretation of the triune brain is not accurate, which is to say it is clearly not the case that the brain evolved in three discrete layers. It's not the case that a reptile brain doesn't have anything limbic-like. The reptile brain absolutely has a cortex that does a lot of what our limbic structures do, etcetera, etcetera. So, yeah, those would be my thoughts on that.
Yeah. It is fascinating, because we as humans need to have models to explain and understand the thing itself, just like active inference, for example. I mean, I'll get to planning and agency and goals in a little while, but a lot of these things are instrumental fictions. I'm not saying that our brains don't plan, but, you know, the abstract mathematical way that we understand planning is probably not how the brain works. It's much more complicated than that.
But why don't we just rewind to the beginning? So, you know, we're gonna be talking about this chapter on, you know, simulation, if you like. And you kind of lead by saying that what the neocortex does is learning by imagining. And Hawkins spoke about this as well. He said we've got the matrix inside our brains.
Right? We're always just doing all of these simulations of future things, and we're using that to help us understand the world. And you give this really interesting example of some of the features of the brain that lead you to believe that we are basically living in a simulation. And it's almost like rather than perceiving things, we're testing if our simulation is correct. But that means that we can only simulate one thing at a time.
So we can't... Mhmm. We can't see two things. We can only see one thing. So can you talk through that?
Sure. So one of the first sort of introspections and explorations into how perception works in the human mind happened in the late nineteenth century with all of these explorations of visual illusions that you see in pretty much every neuroscience textbook or book that you open. So listeners will be familiar with this; you've probably seen examples of triangles where you actually perceive a triangle in a picture when there is in fact no triangle, or, yes, that's it, yeah, you can find that picture.
Yeah. So I hope I'm not distracting you.
No. No. No. So that's, you know, a standard finding that was observed in the nineteenth century, which is this idea that the brain clearly perceives the presence of things even though they're not actually there. So we perceive a triangle there, we perceive a sphere, we perceive sort of a bar, we perceive the word editor when in fact, if you actually examine it, the letter E is not there.
There's evidence that suggests the E is there, by virtue of the shadows, but we did not actually write the letter E there. Yet the brain regularly perceives it. So that kind of finding led the scientist Hermann von Helmholtz to come up with this concept that what you actually consciously perceive is not your sensory stimuli. You are not receiving sensory input and experiencing the sensory input. What's happening is your brain is making an inference as to what is true in the world, what's actually there.
The sensory input is just giving evidence to your brain as to what's there. And so you start from this prior, and that prior maintains itself until you get sufficient evidence to the contrary, and then you change your mind. And so it's not hard to imagine why this would be extremely useful in any sort of environment that an animal might evolve in. You know, suppose you have a mouse running across a tree branch at night. First I see the tree branch in the moonlight.
So I build a mental model of the tree branch. As I move forward, I lose the moonlight, and I no longer see the tree branch. As long as I'm stepping forward and the evidence is consistent with my prior of the tree branch, it makes way more sense for me to maintain the mental model of that tree branch than for the tree branch to suddenly disappear just because I no longer see the sensory stimuli of it. Because sensory stimuli are very noisy, it makes a lot of sense that we integrate them over time, build a prior, and then, until something gives us evidence to the contrary, we maintain our prior about the world. So that was sort of the first idea: that there's some form of inference, that there's a difference between sensory input and some model of the world that we infer and then thus perceive.
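As an aside, here is a minimal sketch of that prior-maintenance idea in code (my illustration, not anything from the book): a recursive Bayesian update over the single hypothesis "the branch is there," where absent or ambiguous evidence leaves the prior in place and only sustained contrary evidence pulls it down. The likelihood numbers are invented purely for illustration.

```python
def update(prior, lik_if_present, lik_if_absent):
    """One Bayes step: P(branch is there | evidence), given the prior and the two likelihoods."""
    num = lik_if_present * prior
    return num / (num + lik_if_absent * (1.0 - prior))

belief = 0.5                                   # start uncertain about the branch
observations = ["see", "see", "dark", "dark", "dark", "gap", "gap", "gap", "gap"]
for obs in observations:
    if obs == "see":                           # moonlight: branch clearly visible
        belief = update(belief, 0.9, 0.1)
    elif obs == "dark":                        # no input at all: likelihoods equal, prior simply maintained
        belief = update(belief, 0.5, 0.5)
    else:                                      # "gap": weak evidence the branch has ended
        belief = update(belief, 0.2, 0.8)
    print(f"{obs:5s} -> P(branch) = {belief:.2f}")
```

Running it shows the point of the toy: darkness changes the belief not at all, and it takes several contrary observations in a row before the prior is actually overturned.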
What's interesting, and not as discussed, but also present in the discussions amongst scientists in the late nineteenth century about this, is this idea that you can't actually render a simulation of two things at once. So there are lots of really interesting visual illusions around this where you can see something; the famous one is it's either a duck or a rabbit. Yep, exactly. And it's interesting, yeah, and then you can see that staircase as either moving up to the left, or you're under the staircase looking upwards and it's actually a ceiling that's jagged. And why can't the brain perceive both of those things at the same time?
Well, it would make sense if you have a model that there are such things as ducks, there are such things as rabbits, there are such things as three-dimensional shapes that operate under certain assumptions; and if that's true, then you cannot see a duck and a rabbit at the same time, because there's no such thing. It cannot be the case that the staircase is seen from above and from below at the same time. So what your brain is not doing is just perceiving the sensory stimuli. It's trying to infer what real three-dimensional thing in the world this sensory stimulus is suggesting is true, and that is the thing that it's gonna render in your mind. And I think one way this parallels nicely to some of Hawkins' ideas, actually, is that if you hold the thousand brains theory to be true, which is that the neocortex has all of these redundant, overlapping models of objects, then it would make a lot of sense that we wanna synergize these models to render one thing at a time.
You don't wanna have 15 different things rendered, because then it's really hard to evaluate them and vote between these different columns. So it makes sense that what the brain does is say, let me integrate all the input across sensory stimuli and render one sort of symphony of models in my mind, so I see one thing at a time. So that's this idea of perception by inference. At the time no one really connected that to the idea of planning. So this was just this idea that what we perceive is different from the sensory stimuli we get.
But later on, as the world of AI started thinking about things from the perspective of perception by inference, what we end up realizing is that if you're gonna train a model to do perception by inference, it comes with this notion of generation. Because the way it self-supervises is it takes the prior, tries to make predictions, and compares the predictions to the sensory stimuli coming in from the world. And as long as those prediction errors are below a threshold, I maintain my prior. So a famous version of this is the Helmholtz machine which Hinton and colleagues devised, I think that was in the '90s. I forget the exact year.
Yep, and so this is this basic idea that you can build a model. A lot of people use the term latent representation; some people don't like that term for a variety of philosophical reasons, but it builds a representation of things by virtue of building a model, in other words perception by inference, and the way you build that model is you're constantly comparing predictions generated from that model to what actually occurs. And this also has synergies with a lot of Hawkins' ideas where we think about intelligence as prediction. And so what that means is, if you build a model of perception by inference by virtue of generation, then it's relatively easy to say, okay, well, what happens if I just turn off sensory stimuli and start exploring the latent representation?
Well, now we're exploring a simulated world. I'm able to cut off sensory stimuli, close my eyes and imagine a chair, rotate the chair, change the color of the chair. And because this model is relatively good, has relatively rich features about how the world actually works, I can model things without ever having experienced them, without ever having done them, and relatively reasonably predict what would actually happen if I were to do those things. So what I think is interesting, and perhaps somewhat of a novel proposal in the book, is that I think a lot of people think about the neocortex as having adaptive value because of how good it is at recognizing things in the world, right? You know, if you read a standard textbook, a lot of what people will talk about with the neocortex is how good it is at perceiving things, object recognition; some of the best-studied parts of the brain are the visual neocortex, so we understand reasonably well how we're building models of visual objects, etc.
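To make that recognition/generation duality concrete, here is a minimal sketch of my own (a toy, not the Helmholtz machine and not anything from the book): a tiny linear autoencoder in which encoding a stimulus plays the role of perception-by-inference, and decoding a latent we picked ourselves, with the senses switched off, plays the role of imagination. The data, dimensions, and learning rate are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1)) @ np.array([[2.0, -1.0]])   # "world": points on a line in 2-D
X += 0.05 * rng.normal(size=X.shape)                      # plus sensory noise

W_enc = rng.normal(scale=0.1, size=(2, 1))                # recognition weights (inference)
W_dec = rng.normal(scale=0.1, size=(1, 2))                # generation weights (prediction)
lr = 0.01
for _ in range(2000):                                     # self-supervised: compare predictions to input
    z = X @ W_enc                                         # perception: sensory input -> latent
    err = z @ W_dec - X                                   # prediction error against the actual stimuli
    grad_dec = z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

z_imagined = np.array([[1.5]])                            # senses off: choose a latent ourselves
print("imagined observation:", z_imagined @ W_dec)        # a point lying (roughly) on the learned line
```

The only point of the toy is that the same generative weights used to check predictions against the senses can be run "offline" to produce a plausible, never-experienced observation.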
But from an evolutionary perspective, this is a little bit hard to find convincing, because if you actually examine the object recognition of vertebrates, it's incredibly accurate. I mean, a fish can recognize human faces. A fish can one-shot recognize an object when rotated in three-dimensional space. So it's hard to find a dividing line between object recognition in animals with a neocortex and object recognition in animals without a neocortex, with brain structures that seem more similar to early vertebrates. Did you have a question?
Yeah.
Well, yeah, I just wanted to touch on a couple of things. So Hawkins said, first of all, that we overcome the binding problem by having this kind of profusion of individual sensorimotor models rather than this kind of, you know, feed-forward enrichment of representations. And that was really interesting. But he also said that, let's say we've got 150,000 mini cortical columns that are just wired to different sensorimotor signals, it's the kind of diversity and sparsity that gives the robustness of recognition. And then you can kind of think, okay.
Well, what do the representations look like? So if our brain builds some kind of a model of the world, some kind of topological model, it must be a representation. It's not like a homunculus; it's not like there's a stapler inside the brain if I'm modeling a stapler. It's actually some weird structure of the stapler, as seen by every way I can touch, feel, hear, and, you know, lick a stapler or whatever. So it's difficult for us to imagine what that is. But the reason I'm going down this road is because that's a bit weird, isn't it?
So we have this very weird representation of things. And then I come to Hinton's Helmholtzian generative model. And you can get it to generate, let's say, a number eight if it's trained on digits. Right? And Hinton would argue, and I would disagree with him, that the model understands what an eight is.
Now this is weird, isn't it? Because we understand what a mouse is, but intuitively, we feel that a neural network doesn't understand what an eight is. And I would argue that we're getting into semantics here. So I think the reason we understand things is because there's a relational component to understanding. So semantics is about the ontology, you know, the way we feel about things, where the thing came from, what was the intention guiding the creation of the thing, what was the provenance of the thing.
It's almost like the interconnectedness of the thing tells you more about the meaning of the thing than the actual thing itself in a weird way. What do you think?
Well, I do think this is where the word understanding can mean different things to different people; people could understand the word understanding in different ways. I think there is absolutely something to the idea that just being able to recognize something, so that would be a feed-forward network that can observe a stapler, is alone insufficient for what most people would mean when they use the word understanding. And just to talk about interrelatedness: if I have a feed-forward network that, let's say, is just a binary classifier, is this thing a stapler or not, I can't ask lots of things of that feed-forward network that I would expect of an agent or a model that understands what a stapler is. I couldn't ask it, for example, what would happen if I burned the stapler?
What would happen if I opened it? What would you see inside of it? I can't ask what does a stapler do. I can't show it a human holding a stapler with a set of objects in front of them and ask what the person is going to do next with this stapler. And so clearly, most of our intuitive understanding of the word understanding contains some richness that's not included in just classifying or recognizing the presence of objects.
I think that's absolutely the case. My intuitions fall in the direction that what we typically mean when we use the word understanding comes down to having something that can be mentally explored. And so I think that requires what you're describing, which is the interrelatedness of things. Because when I see someone holding a stapler and you ask me the question, what are the things they would likely do, well, I start imagining what they might do with that stapler and then I can evaluate which options seem plausible to me.
And so in the imagination and evaluation of which things seem plausible, there's interconnectedness between the thing and the world around it. But yes, I absolutely agree with you that just recognizing objects clearly lacks something that we mean when we say understanding.
Yeah. Do you take into consideration the memesphere? So, you know, we've got ideas that are quite collective as well. I guess I was gonna explore this with you later, but maybe let me frame it like this. I have this intuition that a lot of our intelligence is outside of the brain.
So it's almost like, if I was in the wilderness and, you know, disconnected from society, I would be, you know, a much lesser human being. In a weird way, I might have more agency, but, you know, I wouldn't have access to all of these rich cultural tools and knowledge and patterns and so on. So it's almost like that's where a lot of our intelligence comes from, and also meaning as well, rather than just being able to plan in the brain and so on. And there must be some kind of interplay, so culture must shape the development of our brain and vice versa, but culture seems to be more dynamic. How do you wrestle with that?
Well, I mean, that's undeniably true. I mean, the first example that comes to mind is just writing. I mean, what would humans be if you removed the technology of writing? We would all realize we're not that smart. Writing is a technology that externalizes a feature of the brain, which is memory, which brains are not that great at.
We do a good job of condensing aspects of a memory; episodic-like things and procedural memory we're relatively good at. But for semantic memories, we're terrible. And so externalizing that with writing is one of the key technologies that enables us to be way smarter than we in fact are, because we now have this external device that enables us to store a largely infinite number of memories and transmit them across generations. So that alone, I think, proves your point, which is that what humans are capable of is clearly some relationship between brains and external things, and that can be writing tools, that can be other brains, you know, sharing ideas and getting challenged.
So, yeah, absolutely, I agree with you.
Intelligence is like... it's the dynamics, isn't it? It's all of the low-level, you know, dynamics of things interacting with each other. And we can take a snapshot of language, and we might say, oh, well, yeah, you know, language isn't intelligence, but language itself is a form of intelligence. So I'm not just talking about the words and language models and so on. We're talking about the actual language in our culture.
And I guess I think of that as almost a distinct form of intelligence.
Yeah. I mean, I think it almost gets into philosophical territory as to where you draw the bounding boxes around the things that are imbued with intelligence and the things that are supportive mechanisms that allow those things to have intelligence. So through one lens, you could think about brains as the physical entity in which intelligence is instantiated, and language as a supportive tool. You could take a very perhaps odd view, which is that language is the thing that's evolving and it's just instantiated in these brains that produce the language and consume the language, but it's the language that's evolving. The same way that, you know, we don't think about intelligence on the level of an individual neuron.
We don't imbue a neuron with intelligence, but on the scale of 86 billion neurons, something has emergently appeared that we do deem intelligent. You know, there are some great sci-fi books where intelligence gets instantiated in colonies of ants, where each individual ant isn't intelligent, but somehow the colony itself is capable of doing incredible abstractions and whatever. So yes, I think there are very interesting ways to think about this, and it's not obvious how one divides the lines between what are the physical entities in which these are instantiated. That said, I have a particular interest in brains, because if we're looking for the physical manifestation we can learn from, to try, one, to understand ourselves, and I think we as a species have an interest in understanding ourselves, but then also, if we wanna try and borrow some ideas from how biological intelligence works into AI, then of all the physical things to examine, the brain to me seems clearly the one that is probably the most rich with insight. But no, I agree.
I think your point is well taken.
Interesting. Okay. I wanna close the loop on what you said about the brain being an imagination and filling-in machine. So you said that it does filling in. It's one at a time.
It can't unsee visual illusions, and the evidence is seen in the wiring of the neocortex itself, you say. So it's shown to have many properties consistent with a generative model. The evidence is seen in the surprising symmetry, the ironclad inseparability between perception and imagination, that is found in generative models and in the neocortex. And you give examples like illusions and how humans succumb to hallucinations, why we dream and sleep, and even the inner workings of imagination itself. So it really seems plausible when you think of it in that way.
Yeah, I think there is a reason why so much of the neuroscience community, and what I'm saying there is not really novel, has rallied around this idea of predictive coding, which is very related to active inference and generative models, because there's just so much evidence that this is what's going on in the neocortex: the imagination of things, episodic memories. I mean, there's been some good evidence that episodic memory, in other words thinking about the past, and imagining the future are in fact the same underlying process happening in the neocortex, which is again consistent with this idea that there's a generative model. If we look at the connectivity patterns, and I didn't talk about this too deeply in the book because it's a little technical, what you would expect from a generative model is that backwards connections would be much richer than forward connections, because you're modulating downstream. Of course, the neocortex is not perfectly hierarchical, but things that are in general lower in the hierarchy have lots of inputs from parts of the neocortex that are higher in the hierarchy. That's absolutely what we see. So yeah, there's a lot of evidence that these are two sides of the same coin, which is that there's some form of generative model being implemented.
I do think in AI, one way in which this manifests is the very clear success of self-supervision. Although the actual predictive coding algorithms that people have devised as accounts of what the neocortex is implementing haven't, when we've actually modeled them, outperformed the stuff going on in the AI world, the principle of self-supervision is this: can a system end up having really interesting emergent properties and generalize well when you only train it on predicting the sensory input it receives? And that clearly has become the case. The transformer, I think, is a great example: if you just give it a bunch of data and you train it through self-supervision, i.e. masking, so you hide certain data inputs, it becomes remarkably accurate and good at generalizing across data that it hasn't seen before, which is in principle what people are claiming the neocortex is doing as a generative model.
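A minimal sketch of that masking idea (a toy of mine, not any real transformer pipeline): hide a token, have a model predict it from context, and score the prediction against the hidden token itself, so the training signal comes entirely from the data. Here the "model" is just bigram counts, which is enough to show the principle of self-supervision.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# "Train": count which word tends to follow each word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_masked(tokens, i):
    """Predict the masked token at position i from its left neighbour."""
    left = tokens[i - 1]
    return follows[left].most_common(1)[0][0] if follows[left] else "<unk>"

# "Evaluate": mask random positions; the label is simply the datum we hid.
random.seed(0)
trials = [random.randrange(1, len(corpus)) for _ in range(20)]
correct = 0
for i in trials:
    target = corpus[i]
    masked = corpus[:i] + ["<mask>"] + corpus[i + 1:]
    correct += predict_masked(masked, i) == target
print(f"masked-token accuracy: {correct}/{len(trials)}")
```

The counting model is deliberately trivial; the thing being illustrated is only the training recipe, where no human ever supplies a label.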
Yeah, it feels to me that there is a bright-line difference between, let's say, a transformer and the neocortex. And I think the difference is, maybe agency is not the right word, but you can think of the neurons, I think, as having some kind of autonomy. So they're sending messages to each other, and then, you know, it's eventually consistent: the other neuron will get the message and it will decide itself what it's gonna do. And in a transformer, just because of the way they're connected and the backprop algorithm and so on, they all kind of ride a Mexican wave together, to use an analogy.
So it feels like a difference in kind to me.
Well, I mean, clearly, as I argue in the book, I don't think that the brain is just one big transformer. So I would agree with you. In the human brain, unless you think there's something nondeterministic and sort of magical happening, I think you would still say that there are base firing rates of neurons, and then there's sensory input that flows up and goes through the brain until eventually it's affecting muscles and you're responding. There is a sort of deterministic flow happening. It might not be as feed-forward as what's happening in a transformer, which is definitely the case.
But they might both be, you know, deterministic in a similar fashion. I've had this exact argument with a lot of people, and one counterargument that people have towards this idea, and I don't know if I fully agree with it, but it's interesting, is that attention heads really are doing something more magical than we give them credit for, which is that they are kind of dynamically rerouting and effectively resetting the network based on the context that the prompt is giving. And so although technically it's just a series of matrix multiplications, etcetera, in principle what's happening is these attention heads are doing something really clever, where they're looking at the context of a prompt and then effectively dynamically reweighting the network to decide what it cares about and what it doesn't. So there are people who think there is something really interesting happening in the transformer that might be analogous to certain things that are happening in the brain. But yeah, clearly, these feed-forward networks are not capturing everything that's going on in the brain.
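For reference, here is the standard scaled dot-product attention computation in plain numpy, which is the "dynamic reweighting" being referred to: the token-mixing weights are recomputed from the prompt itself rather than being fixed parameters. The toy prompts and dimensions below are made up.

```python
import numpy as np

def attention(Q, K, V):
    """Standard scaled dot-product attention over one sequence."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: context-dependent routing
    return weights @ V, weights

rng = np.random.default_rng(0)
x_prompt_a = rng.normal(size=(4, 8))                 # two different 4-token "prompts"
x_prompt_b = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))  # fixed, "learned" weights

for name, x in [("prompt A", x_prompt_a), ("prompt B", x_prompt_b)]:
    _, w = attention(x @ W_q, x @ W_k, x @ W_v)
    print(name, "token-mixing weights:\n", np.round(w, 2))
```

The learned matrices never change between the two prompts, yet the printed mixing weights do, which is the sense in which the routing is decided by the input.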
Yeah, it's really interesting what you said, because the way I read that is that things like ChatGPT and language models are entropy smuggling or agency smuggling. So what that means is they kind of just do what you tell them to do, and all of the agency, the directedness, comes from me. So I give it a prompt. It does the thing that I wanted it to do. And then the kind of mapping that you were talking about, I interpret a bit like a database query.
So, you know, depending on the prompt you give it, it'll activate a certain part of the representation space and give you a certain result back. But the brain has this thing where all of the neurons have their own directedness. And the weird thing is, at the cosmic scale, agency seems to emerge. So even transformer models that were acting autonomously could presumably, at large enough scale, give rise to something that we think of as directedness or goals or purpose or whatever. But it's almost like, in the natural world, there are so many levels and scales of independent autonomous things just kind of mingling with each other independently and then, like, downstream, mixing their information together and rinsing and repeating over many, many different scales.
That seems to be the thing that gives rise to all of these amazing things like agency and creativity and so on.
Yeah. Yeah, I mean, I think the notion of agency is an interesting one, and one where I really am amenable. I mean, there is sort of a, I don't know if I would call it a schism in the field, but there is a debate between sort of the reinforcement learning world and the active inference world over how much of intelligence can be conceived as optimizing a reward function. And the hardcore reinforcement learning world is like, everything is just a reward. And then the active inference world would argue that not all behavior is driven just by optimizing a single reward function.
There is some uncertainty minimization. There is trying to satisfy your own model of yourself, fulfill your own predictions, these sorts of things that seem very well aligned with behavior we see. But it's unclear which of these is right. It's probably some balance of the two. But people would conceive of agency differently in these two worlds, right? So I think some people in the RL world would say agency is just: you give something a reward function and then it just learns over time, trying to optimize that reward.
In the more active inference world, which I do think has legs and which I'm obviously amenable to, the idea of agency is a little bit more. It's building a model of yourself and trying to infer what your goals are based on observing yourself, and then trying to make predictions to fulfill those end goals. In other words, goals are constructed. And one of my favorite Friston papers is Predictions Not Commands. I don't know if you've read that paper, but I think it's a brilliant paper about how you could reconceive motor cortex not as sending motor commands to your body, but actually as building a model of yourself and predicting what will happen.
And the way the spinal cord is wired, it just fulfills those predictions. And I think that's a really interesting reframe of how you could get agency and really interesting, smart behavior in the absence of just a strict reward function, right? So the way that would learn is it's trying to model the behaviors it observes, and then it's trying to predict those and fulfill them. But yeah, I think agency is a really interesting concept, because it manifests itself in these different paradigms in different ways.
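Here is a minimal sketch of how I read that predictions-not-commands loop (a toy rendering of mine, not Friston's actual model): cortex emits a proprioceptive prediction of where the limb will be, and a reflex loop simply acts to cancel the prediction error, so the prediction ends up fulfilled without any explicit motor command. Gains and positions are arbitrary.

```python
predicted_position = 1.0   # "cortex": my hand will be at 1.0 (a prediction, not a command)
actual_position = 0.0      # limb starts at rest
gain = 0.3                 # reflex gain: how strongly the error drives the muscle

for step in range(20):
    error = predicted_position - actual_position   # proprioceptive prediction error
    actual_position += gain * error                # "spinal reflex": act so as to reduce the error
    if step % 5 == 0:
        print(f"step {step:2d}: position = {actual_position:.3f}")
# The limb converges on the predicted position purely by cancelling prediction error.
```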
Yeah. I find it fascinating. I mean, the way I read it in the active inference literature, it's a very principled definition of an agent. And there's still a bit of a gap, because I think Friston would argue, well, in the natural world, because of the laws of physics and particles and whatnot, you get the emergence of things, and things become agents when they have a certain, you know, depth of planning, shall we say. But, yeah, it's really, really interesting, and I guess he would argue that you get all of these phenomena that give rise to agency, like biotic self-organization and so on.
But, yeah, maybe we'll slowly go in that direction. So you give the example of mice doing planning. Can you sketch that out?
Yeah. So this is another area of neuroscience research that I just absolutely love. So, I think it was the 40s or 50s, I forget the exact decade, Tolman observed that rats, when they would reach choice points in mazes, where he was training them to navigate around mazes, would pause, and then they would sniff back and forth, and then they would choose an action. And so he hypothesized that they must be engaging in vicarious trial and error. They must be imagining possible outcomes before deciding.
And of course, this was hugely controversial because there's no evidence. He had no evidence that they were in fact imagining anything. And of course, most people like to, in the absence of evidence, assume animals are as dumb as possible. Only when there's irrefutable evidence will we imbue them with any intellectual capacities, which I think is an interesting human bias, but that's fine. And then David Redish, who is also a close friend and mentor of mine, he did some amazing research with one of his PhD students where they were recording hippocampal place cells.
So very quick background for viewers. You can go into the hippocampus of really any mammal, but this is best studied in rats. And part of the hippocampus, a region called CA1, has these things called place cells. And so if you record it as a rat is moving around a maze, what you find is this incredible thing where there are neurons that activate only in specific locations in that two-dimensional plane. And it's not based on how they got there.
It is not egocentric; it is independent of their egocentric path. It is allocentric, meaning in the plane of external space. So they can come back to the same place from any route and that same place cell will activate. And as an aside for the evolutionary story, we find similar types of cells, not exactly as accurate, in fish, in the region of their cortex homologous to the hippocampus, where they have place-like cells, not as accurate, but cells that activate in certain locations in a maze. What he found is that when rats engage in this act of vicarious trial and error, meaning when they pause and look back and forth, the place cells in the hippocampus cease to only activate in the location they are.
They actually start activating down the paths of each route the rat might take. In other words, you can literally watch rats imagining the future. I think it's one of the most incredible neuroscience findings. And so he then took this and did a bunch of other experiments which I think reveal even further the power of imagination in rats. One of my favorites is his counterfactual learning studies, where he puts rats in this thing called restaurant row, where it's a sort of square-like, yeah.
Exactly. It's just like a square-like maze. And as the rat is going counterclockwise, at each sort of door a sound is played. And that sound signals to the rat whether they can go right through the door and get food in, I think it's like three seconds, or they're gonna have to wait forty-five seconds before they get the food. They're given a bunch of time to try and get as much food as they want.
Rats have clear preferences. So some rats will really prefer bananas, and they don't really like the bland food. So what happens? Well, this presents a set of irreversible choices to a rat. So let's say they come up and they can either get a treat right now that they don't really like that much, let's say it's the bland treat, or they can go to the next one and hope that they're gonna get the banana really fast. If they go to the next one and the banana sound is long, meaning forty-five seconds, then they regret their choice, because it would have been better if they had just gone in and quickly gotten the food. How do we know they're regretting the choice? We can literally watch them imagining eating the foregone choice.
We can go into a part of their brain called the orbitofrontal cortex, which activates for certain types of tastants, and we can see them imagining the foregone choice, and they end up making different choices the next time around. They end up being less likely to forego that choice in the future. So I think this is just such an incredible finding that what we mean when we say model-based reinforcement learning is clearly happening in the brains of very, very simple mammals.
There's a real challenge knowing which simulations to run. Because if you think about it, we've got a search problem, right? There's an intractable number of simulations. So how do we fix that in AI, and how do humans fix it?
So this is, I think, one of the big outstanding questions in AI, which is how do you effectively prune the search space? We do not know how mammal brains do this so well. So I'll give you some high-level ideas or theories, but we just don't know. And this is one of the big possible breakthroughs: figuring out how mammal brains do such a good job of this. So the thing that AlphaGo does, which I think perhaps is a clue and is clever, is that the selection of the search space actually is bootstrapped on the temporal difference learning model under the hood.
So this is actually very clever: let's say you train something to learn without a model of the world. All it's doing is it gets sensory stimuli, it gets a model of a Go board, and then it just predicts the right next action. It just has a policy and a value function that bootstrap on each other, etcetera. So, no planning. If you wanna add planning to that, what they did, which is quite brilliant, is you say, okay, well, you know what, instead of building some other system to try and choose good trajectories, why don't we just use the policy network, and we just don't only pick its first choice.
We pick its favorite move, but then we also look at what its second favorite move is, its third favorite move, maybe its fourth favorite move. And then let's literally play the games out. Let's play a bunch of games against ourselves and then see the ratio in which we win with each of them. And then what we might learn is that your second best guess was actually better than your first best guess. But we're not starting from every possibility.
We're saying let's bootstrap on our best guesses of good moves, but then check them by playing out the possible futures. So if we were to analogize that to the brain, what that would suggest is that perhaps the basal ganglia, which a lot of evidence suggests is engaging in this type of model-free reinforcement learning, actually is the thing that chooses the moves, but there's some other system that lets us choose the second best move, the third best move, etcetera. One way this might happen, and there's some evidence for this, far from conclusive, is that there's some notion of uncertainty that frontal cortex or basal ganglia is measuring, and when the level of uncertainty between the next actions passes a threshold, pausing occurs. Because when we see animals do this vicarious trial and error, it almost always occurs in moments of high uncertainty, when contingencies have changed, when the right answer is not obvious. And so you could conceive of this as a policy network where you're evaluating its best choice, second best choice, third best choice, and when there's uncertainty about it, in other words they're close together, or there's some other measure of uncertainty, perhaps you have parallel policy models and you're comparing the similarity between them, there are a lot of ways to do this, that triggers a process of playing forward.
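A minimal sketch of that planning pattern (a toy stand-in, not AlphaGo's actual algorithm): a fast, habitual policy proposes a handful of candidate moves, and rollouts through a world model are used only to rank those candidates, rather than to search every possible move. The move names, priors, and win probabilities below are all invented.

```python
import random
random.seed(0)

moves = ["A", "B", "C", "D", "E"]
policy_prior  = {"A": 0.40, "B": 0.30, "C": 0.15, "D": 0.10, "E": 0.05}  # habitual preferences
true_win_prob = {"A": 0.45, "B": 0.70, "C": 0.50, "D": 0.20, "E": 0.10}  # hidden ground truth

def rollout(move, n=200):
    """Play n imagined games after `move`; here the 'world model' is just a coin flip."""
    return sum(random.random() < true_win_prob[move] for _ in range(n)) / n

top_k = sorted(moves, key=policy_prior.get, reverse=True)[:3]   # prune search to the policy's favourites
scores = {m: rollout(m) for m in top_k}
best = max(scores, key=scores.get)
print("candidates:", scores)
print("policy's first guess:", top_k[0], "| chosen after rollouts:", best)
```

With these made-up numbers the rollouts reveal that the policy's second guess is actually the better move, which is exactly the kind of correction the playing-out step is for.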
This is another key thing that mammal brains do that AlphaGo does not do. AlphaGo engages in planning on every move. So there was never the question of when do we pause to plan. In a game of Go, it doesn't matter; just engage in planning on every move, because we can do it so fast.
In the real world, there's so much uncertainty and noise, and we need to be so energy efficient with human brains, that we can't engage in planning every instant. So we need some mechanism that tells us when to stop and think about what I'm gonna do next and when I can just continuously go with model-free choice. This is also something we don't know how mammal brains do, but I think a reasonable speculation is that there's some uncertainty measurement occurring. One last point I'll make, which I think parallels in an interesting way some of Hawkins' ideas and Friston's ideas, is that if you take the thousand brains model and you apply it to frontal cortex, in other words we have multiple parallel models of ourselves, you could imagine that there's an uncertainty measurement the same way we do uncertainty measurement in a lot of deep learning models, where you create parallel models and you just measure how similar the predictions of the models are.
And if multiple parallel models predict similar things, we just measure that as low uncertainty. When they diverge substantially, all of a sudden we measure high uncertainty. So again, speculation, but you could imagine, if it is the case that we have redundant models in the neocortex, might it be the case that somewhere, perhaps in the thalamus and the basal ganglia, the similarity or difference between these predictions is a measure of uncertainty that triggers pausing. Stephen Grossberg has similar ideas. He calls this, like, match and mismatch.
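And a minimal sketch of that ensemble-disagreement idea (again my own toy, not a model of real cortical columns): several redundant "policy models" each score the available actions; when their favourites agree we act habitually, and when they diverge we pause and simulate. The score matrices are invented.

```python
import numpy as np

def should_pause(ensemble_scores, threshold=0.5):
    """Pause when the ensemble members disagree about which action is best."""
    favourites = np.argmax(ensemble_scores, axis=1)            # each model's preferred action
    majority = np.bincount(favourites).argmax()
    agreement = np.mean(favourites == majority)                # fraction agreeing with the majority
    return agreement < threshold

familiar = np.array([[0.9, 0.1, 0.0],                          # three models, three actions:
                     [0.8, 0.1, 0.1],                          # everyone prefers action 0
                     [0.7, 0.2, 0.1]])
novel    = np.array([[0.5, 0.4, 0.1],                          # contingencies changed:
                     [0.1, 0.6, 0.3],                          # the models now disagree
                     [0.2, 0.2, 0.6]])

for name, scores in [("familiar situation", familiar), ("novel situation", novel)]:
    print(name, "-> pause and simulate?", should_pause(scores))
```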
Yeah. Yeah. Yeah. Like, almost like, because our ability to do abduction is something that fascinates me, there is some kind of model selection or matching step that goes on there. But, anyway, so we've got the neocortex.
It's an absolute beast when it comes to predicting sensory signals, and then we see the emergence of planning, and also smart planning, as we've just spoken about. So now we're actually talking about traversing these sensory networks over space and time. And then something else really interesting happens. So the next two moves you make are bringing in selfhood. So, like, you know, bringing yourself in as an explicit actor and including that in the planning.
And then that naturally leads to this idea of, I would call it teleology, but, you know, why, or intentionality. So let's go on that journey. So maybe the self-modeling first. How does that come into the picture?
So I think there are two notions of self-modeling. One notion of self-modeling is the kind that I think we see in early mammals. And this is an idea where the frontal cortex of mammals, a region in general called agranular prefrontal cortex, which is present in all mammals and largely believed to have existed in the very first mammal brains, gets sensory input from an animal's own interoceptive signals. So it gets input from the hypothalamus, which measures things like hunger. It gets input from the amygdala, which measures things like valence in the world, fear, danger, etcetera.
And when does the agranular prefrontal cortex get most excited? It's in these moments of uncertainty and in these moments where they're engaging in planning and episodic memory, and if you damage agranular prefrontal cortex in rats, they seem to have dramatically impaired, if not completely lost, the ability to engage in mental simulation and episodic memory, etcetera. Okay. So I think what one might speculate is happening here, and this is not a novel suggestion by me, is that it's modeling the self. In other words, what are the activations of the amygdala and the hypothalamus that are happening, and why am I doing what I'm doing?
In other words, if I wake up and I see I have these hypothalamic activations and then I go down to get water, then it builds a model of: in the presence of these hypothalamic activations, the next action is I'm going to go over and get water. It constructs an explanation. And as odd and philosophical as that sounds, that is in principle computationally exactly the same thing as when we showed a picture of that triangle and your posterior sensory cortex constructed an explanation of what it saw: I perceive the triangle. So this idea of constructing an explanation of one's own behavior is the first idea of self, which is constructing intent.
And there's lots of evidence that even in rats, if you record neurons in their agranular prefrontal cortex, they seem to be very sensitive to the tasks that they're in and to measuring progress towards goals. There's lots of evidence that it's doing something akin to that. So that's one notion of self. When you get to primates, you see a whole new region of frontal cortex emerge, what's called granular prefrontal cortex, which is only seen in the primate lineage. There are no other mammals that have this region of prefrontal cortex.
As a quick aside, the reason it's called granular versus agranular is that most neocortex has six layers. The fourth layer is called the granule layer because it contains granule cells, a certain type of neuron. And for a variety of really unknown reasons, though there are some interesting speculations, agranular prefrontal cortex is missing layer four. That's why it's called agranular. Same thing with motor cortex; it's missing layer four.
So in most mammals, the whole frontal cortex is missing a fourth layer. It only has five layers. But in primates, you get this granular prefrontal cortex, this huge region of neocortex that does contain a layer four. And the best explanation for this I've actually seen Friston talk about. I'm happy to go into it if you think your folks would be interested.
But the point is there's a new region there. Do you want me to go into that?
Oh, yeah. Yeah. I mean, we love Friston. And, I mean, active inference is actually about preferences. You know, because an agent expresses agency by... Right.
Kind of like adapting the environment to suit its preferences, or to kinda make the environment like its preferences. So, basically, this is a theory of volition. Right? And, you know, what Karl Friston's talking about is where do these goals come from? Where does the volition come from?
And you spoke with Karl about that, I think over several years, didn't you?
Yep. Yep. Yeah. Karl's been a wonderful mentor of mine. Honestly, it's a funny story.
I didn't know this was a term in academia, but there's the reviewer lottery, where you just get lucky with which reviewer you get. So the first paper I submitted, Karl Friston was a reviewer on, which was just lucky, and then he became a mentor and reviewed my book and gave me lots of good feedback. So yeah, he's an amazing person and has been a wonderful mentor of mine. So okay, let's talk about granular versus agranular, because I think the best theory I've seen on this is Friston's.
So what does layer four do in neocortex? Across the entire neocortex, layer four is where the primary sensory input is received into the neocortical column. This comes from the thalamus, so the canonical model is that sensory input from the sensors, eyes, ears, skin, flows up through the brainstem to the thalamus, and then from the thalamus propagates to layer four, and then from layer four it goes to a variety of other layers of neocortex. Then other layers of neocortex project back to the thalamus and the rest of the brain.
So why would it be the case that regions of neocortex would not have a layer four? Well, if you actually watch an animal's development, what's interesting is that the agranular prefrontal cortex is not always agranular. It actually starts out having a layer four, and the layer four atrophies over development. And so I think this mirrors well, for instance, the idea of active inference, where what's happening is the neocortical column can kind of be in two phases. It can either be trying to match its model of the world to its sensory input.
In other words, I see sensory input and I'm trying to infer what's there, and I'm going to construct the idea of a triangle. But there's another state of a neocortical column, which is generation: I'm gonna start from the latent representation of a triangle and I'm gonna imagine and explore it. And so one idea is that what frontal cortex does is primarily try to fit the world to its model. In other words, it spends the vast majority of its time constructing intents, and not trying to modify those intents to fit what it observes, but in fact trying to change what an animal does to satisfy its intents.
Layer four atrophies; it doesn't actually go all the way away. If you go deep into the brain you see some basic layer four, so it's not completely gone, but it atrophies because frontal cortex spends very little time trying to change what it perceives its intent to be to match what the animal's doing; in fact, what it does is try to change what the animal does to match its intent. And what I think is so interesting and brilliant about this idea is that it explains exactly why layer four doesn't start off not existing. At first an animal needs to build a model of itself, thus layer four is present, but over time it shifts: once I have a model of what I want and who I am and the things I would do, I don't need to spend as much time changing my model of self. I'm gonna spend most of my time trying to change my behavior.
So this is a very speculative idea, but it makes a lot of sense in the context of active inference, and personally, in all of my reading of explanations of why agranularity exists, it's the best explanation I've seen.
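A minimal sketch of the two modes being described, under my own toy framing rather than any formal active inference scheme: the prediction error between a belief and what is sensed can be cancelled either by revising the belief (the "sensory" mode, layer four doing its usual job) or by acting on the world until the input matches a belief held fixed as an intent (the "frontal" mode). All rates and values are illustrative only.

```python
def minimise_error(belief, world, update_belief_rate, act_rate, steps=30):
    """Drive belief and world together by whichever channel has a nonzero rate."""
    for _ in range(steps):
        error = world - belief
        belief += update_belief_rate * error     # perception: revise the model to fit the input
        world  -= act_rate * error               # action: change what is sensed to fit the model
    return round(belief, 3), round(world, 3)

# "Sensory" mode: the belief moves to match the world.
print("sensory-style :", minimise_error(belief=0.0, world=1.0, update_belief_rate=0.3, act_rate=0.0))
# "Frontal/intent" mode: the world (behaviour) moves to match the belief.
print("frontal-style :", minimise_error(belief=1.0, world=0.0, update_belief_rate=0.0, act_rate=0.3))
```

Same error signal in both cases; the only difference is which side is allowed to move, which is the contrast being drawn between perception and intent fulfillment.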
Yeah. So this is absolutely fascinating. And in my mind, it bridges the gap between internalism and externalism, because it is describing this kind of dyadic exchange, to use... you know, there are certain high-entropy words that only Friston uses. So if I say dyadic exchange, you know... and by the way, just for the folks at home, if you read Friston's papers, there are certain words.
So he says, this licenses something. So, you know, if you see the word licenses, then Friston wrote the paper. But anyway, yeah. So you have this kind of setup where the agents have models of the world, but they're kind of exchanging information with the other agents. And you get this because what an agent does is it has this generative model over policies, you know, where a policy is just a sequence of actions.
And here's where I wanna get into the nitty gritty a tiny bit. So you could think of those plans as being goals. So you could think of a goal as just being an end state in one of your plans. But that doesn't really satisfy me, because I think of goals like eating food as being a kind of category, not a kind of pointillistic traversal of, you know, a specific state in the future, because it kind of feels like, well, if there's an infinite number of goals, then in some sense there are no goals at all. So what is a goal to you?
So, okay. Great question. So I think this is where semantics matters a lot. I think we can think about goals in several ways. If we think in strict RL land, they would just think about goals as: you're just optimizing a reward function.
The goal is simple. There's only one goal, which is to maximize reward. And in a changing, complex world, your reward function might fluctuate over time, etcetera, etcetera, but the goal is singular. In the active inference world, what I find compelling is it introduces a different component of what we mean by a goal, which is not just of intellectual interest because it's cool, but has very real AI implications, because it contains the notion of explainability. So for example, if I wake up and I'm hungry and I start imagining ways to satiate my hunger, and then I decide I'm gonna get in a car and I'm gonna go to this restaurant and then I'm gonna go eat this specific food.
When I get into the car and someone calls me and asks, why'd you get in the car? The reason I can explain that so easily is because there was a rendered simulation of a plan that terminated in an end state that I deemed that I wanted and selected. So it's very easy to explain why I did that. In the absence of that, it's actually very hard to explain why you're doing things. If you're walking down the street and I ask you to explain any one of your model free behaviors, why did you move your foot there instead of there?
You have no explanation. And so I think you can assign the word goals to multiple levels here. You could say the goal is the terminating end state of the plan. You could say the goal is some more abstract representation of the satiation of thirst in general. And you could think about that as distinct from just the reward function.
Some might challenge that and just say, well, all that you're talking about then is just an optimized reward function, but I think you could make an argument that there is a distinction happening there. But what I think is so critical and unique that happens within mammal brains, which is why I was really honing in on that, is the ability to plan a series of actions that terminates in an end result and then execute that plan. And I think that has very clear implications for explainability, which model free actions do not. But yeah, so that would be how I would think about goals. But I think it is a little bit of a semantic question, where we can think about the concept of goals in multiple ways.
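To make the contrast concrete, here is a toy Python sketch of the two senses of "goal" discussed above: a model-free pick that maximizes a reward table, versus a searched plan whose simulated end state is the goal and whose steps double as the explanation. The states, rewards, and transitions are hypothetical and purely illustrative.

```python
# Toy contrast between two notions of "goal" (illustrative only; all values invented).

import itertools

REWARD = {"stay_home": 0.0, "snack_at_home": 0.4, "drive_to_restaurant_and_eat": 1.0}

def model_free_choice():
    # "Strict RL" sense: the only goal is maximizing reward; no plan to point to.
    return max(REWARD, key=REWARD.get)

def plan_to_goal(start, goal, transitions, max_len=3):
    # "Planning" sense: search for an action sequence whose simulated end state
    # matches a desired end state. The plan itself is the explanation
    # ("why did you get in the car?").
    for n in range(1, max_len + 1):
        for seq in itertools.product(transitions.keys(), repeat=n):
            state = start
            for action in seq:
                state = transitions[action].get(state, state)
            if state == goal:
                return list(seq)
    return None

transitions = {
    "get_in_car": {"home": "in_car"},
    "drive": {"in_car": "restaurant"},
    "eat": {"restaurant": "satiated"},
}
print(model_free_choice())                            # no explanation attached
print(plan_to_goal("home", "satiated", transitions))  # ['get_in_car', 'drive', 'eat']
```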
How do we actually know what the cognitive abilities were of early animals, and why should we care?
Great question. So I think there are two reasons why we should care about the evolution of our brains and intelligence. The first is to understand who we are. The scope of what it means to be a human is not constrained to what it means to be Homo sapiens. So much ink has been spilled on the last seventy thousand years of us being Homo sapiens.
But if aliens were to come down and engage with us and analyze us as a species, most of the things they would observe about us don't come from our legacy as homo sapiens. They come from our legacy of being a primate and our legacy of being a mammal and our legacy of being a vertebrate and our legacy of being an animal in general. And so I think if we wanna understand what it means to be a human being, I don't think we can skip the full six hundred million year story of how we came to be. And I think in there is so much rich history and insight about what it means to be us. So I think that's one really key reason.
It's our legacy, it's our history, how we came to be. The second perhaps more practical reason is I think understanding the evolution of the human brain and the evolution of human intelligence is a key tool in our toolbox to understanding how the brain works and how human intelligence works. It's by no means the only method. It might not even be the main method, but it's a very useful method to add to the toolbox. So the problem with going into the human brain and trying to directly reverse engineer it is that evolution doesn't work in clean ways.
It doesn't work the way a human designer would. It doesn't work from first principles. It tinkers. And so when we go into the brain, we see all of this messiness. There's redundant systems, there's vestigial systems, new things evolve that make old things redundant, but they're still there.
Lots of processing is duplicated in different regions. And so one way to understand the brain is to continuously probe the human brain as it is today, which is fine. But another method that's also useful and can impose constraints for us is to actually track the history of how it came to be. And that can provide insights: when a brain modification occurred, such as when the neocortex evolved or when the basal ganglia evolved, what were the new abilities that this enabled?
And how did it affect the prior brain regions that were already there? And so this can give us insight into how the brain works today. So in the toolbox that we have of ways to reverse engineer the brain, I think this is just an underappreciated one that is worthy of being included. And of course, understanding how the human brain works has so many different applications. It helps us with mental health, it helps us with understanding why we do what we do.
It helps us with building AI systems. I think there are lots of insights to garner from the brain. So that's why to do it. Now, how to reverse engineer what behavioral abilities existed in our ancestors is a really interesting question. Of course, we can't go back in time.
What we can do, though, is this: there are mechanisms to reverse engineer what their brains looked like, and there are mechanisms to reverse engineer what abilities they had. For what their brains looked like, we can do that just by looking at other animals in the animal kingdom. So for example, we can look at all of the existing primates and all of the existing non primate mammals, and we can see what common brain structures exist between them. We can look at genetic analysis, meaning what things seem to derive from similar roots.
And we can back into what seems to be common and shared amongst them, and thus what we think actually existed in the brains of the first mammals. We can do the same thing with fish and reptiles to infer what existed in early vertebrates. And we can do that with invertebrates to try and infer what existed in the first bilaterians, in other words, the first animals with brains.
So we can compare different brains to try and back into what ancestral brains looked like. For behavioral abilities, there are sort of three conditions you use, and this is my approach to trying to infer behavioral abilities. I call them the in group condition, the out group condition, and the stem group condition. These are what you need in order to make the argument that a behavioral ability emerged at a certain point in our evolutionary history.
So for example, a behavioral ability like episodic memory evolving with the first mammals. You need to satisfy these three criteria. The in group condition stipulates that most ancestors, or sorry, most descendants of this species, in other words, most mammals should show this ability. Doesn't mean all of them, abilities get lost all the time, but most of them should show this ability. And the neural mechanisms by which the ability emerges should come from homologous regions.
What that means is a shared neural underpinning. So if, for example, mammals show episodic memory, but it comes from neurological regions that independently evolved along different mammal lineages, that suggests it wasn't present in the first mammals. But if they all emerge from regions that emerged with early mammals, that's good evidence that this ability also emerged with early mammals. The out group condition says that most, doesn't have to be all, but at least many non mammal vertebrates, so the out group, the group right above, should not show this ability.
And if they do show the ability, it should emerge from non homologous regions. In other words, parts of their brain that evolved independently. So for example, birds definitely show episodic memory. But when we look into the brain regions from which episodic memory emerges, it's clearly non homologous. It's a part of the brain that early vertebrates did not have.
And the stem group condition is that, in the ecological dynamics in which early mammals, or whichever ancestor, existed, we should be able to devise an argument for why this ability would have been adaptive. So why would episodic memory have evolved? With these three things, we can start to infer the story of when behavioral abilities emerged. Is this perfect? Absolutely not, because we do not have a ton of data on behavioral abilities across species.
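As an illustration of how these three conditions function as a checklist, here is a minimal Python sketch; the data structure, field names, threshold, and the example adaptive argument are all hypothetical, and real comparative evidence is far messier than a boolean per species.

```python
# Minimal sketch of the three conditions as a checklist (illustrative only).

def trait_likely_ancestral(in_group, out_group, adaptive_argument, threshold=0.5):
    """in_group / out_group: lists of dicts like
    {"shows_trait": bool, "homologous_region": bool}."""
    # In-group condition: most descendants show the trait, via homologous regions.
    in_ok = (
        sum(s["shows_trait"] and s["homologous_region"] for s in in_group)
        / len(in_group) > threshold
    )
    # Out-group condition: outside the group the trait is mostly absent, or, where
    # present (e.g. episodic memory in birds), it arises from non-homologous regions.
    out_ok = all(
        (not s["shows_trait"]) or (not s["homologous_region"]) for s in out_group
    )
    # Stem group condition: there is a plausible adaptive story for the ancestor.
    return in_ok and out_ok and bool(adaptive_argument)

# Hypothetical example: episodic memory and the first mammals.
mammals = [{"shows_trait": True, "homologous_region": True}] * 4 + \
          [{"shows_trait": False, "homologous_region": True}]
non_mammal_vertebrates = [{"shows_trait": False, "homologous_region": False},
                          {"shows_trait": True, "homologous_region": False}]  # e.g. birds
print(trait_likely_ancestral(mammals, non_mammal_vertebrates,
                             "a plausible adaptive story for early mammals"))
```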
So as new evidence emerges, the story might change. But with these three conditions, we can do a reasonable job inferring what abilities emerged. And the main finding of the book, or the research which led me to be so excited about the book, is that when you do this, what's kind of crazy is you find, as a first approximation, a really coherent story, which is that a lot of the behavioral abilities that emerge at each milestone in brain evolution don't seem to be a haphazard array of different skills, but often emerge from one thing, what I call a breakthrough: one underlying intellectual capacity applied in different ways. So that's the idea of the five breakthroughs. One thing to note actually about the basal ganglia, and this is a little bit of a sidebar, but I think it's fun.
The basal ganglia, I think, is one of the most underappreciated parts of the brain, in the sense that so much work has gone into understanding how the neocortex functions. So much work has gone into the neocortical column, and that's all wonderful work to be done. But the basal ganglia is not only evolutionarily much older. If you look at a lamprey fish, as we talked about, I think, last time, a lamprey fish has a common ancestor with us 500 million years ago. It's one of the most distant vertebrate cousins that still exists today.
They have a basal ganglia that looks exactly like our basal ganglia, same inner structure. And the basal ganglia has perhaps one of the most beautiful internal structures that can be computationally reverse engineered. There's not good consensus on the actual inner wiring and the computations performed by a neocortical column. But there is much broader consensus as to what's being executed by the basal ganglia. And without getting overly technical and perhaps boring people, I would encourage anyone who's computationally interested in this to dive into the literature here, because it is almost beautiful that evolution came up with this.
For example, the input structure of the basal ganglia has this mosaic of neurons that each express one of two different types of dopamine receptors. This would be my own little technical diatribe. One type is called D1 receptors and the other is D2 receptors. D1 neurons, when they receive dopamine, strengthen their connections. D2 neurons, when they lose dopamine, strengthen their connections.
And now you track these different neurons and they actually split their paths. The D1 neurons project to a nucleus that, when activated, disinhibits behaviors, and the D2 neurons project through a different set of nuclei that, when activated, inhibit behaviors. And so we can literally watch how dopamine spikes drive repeated behavior and dopamine drops inhibit behavior. We can literally look at the mosaic of connectivity here and say, oh, when you spike dopamine, it weakens the stop pathway, which runs through D2 neurons, and it disinhibits the go pathway through D1 neurons, and makes you more likely to repeat the behavior, and vice versa if something bad happens and you lose dopamine. And I think that is just so beautiful, that evolution stumbled on something that clean in its macro structure.
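Here is a deliberately cartoonish Python sketch of the go/stop logic just described; the weights, learning rate, and update rule are invented for illustration and are not a biophysical model of the striatum.

```python
# Toy sketch of the go/stop logic (a cartoon, not a biophysical model).

def update_pathways(go_weight, stop_weight, dopamine_change, lr=0.1):
    """Dopamine spike (positive change): the D1 'go' pathway strengthens and the
    D2 'stop' pathway weakens, so the behavior is more likely to repeat.
    Dopamine dip (negative change): the reverse."""
    go_weight += lr * dopamine_change      # D1: strengthened by dopamine
    stop_weight -= lr * dopamine_change    # D2: strengthened when dopamine drops
    return go_weight, stop_weight

go, stop = 1.0, 1.0
go, stop = update_pathways(go, stop, dopamine_change=+1.0)  # better than expected
print(go > stop)   # True: behavior disinhibited, more likely to be repeated
go, stop = update_pathways(go, stop, dopamine_change=-2.0)  # worse than expected
print(go < stop)   # True: behavior inhibited
```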
So that's just my diatribe on why I think the basal ganglia is cool.
Yeah. It is quite interesting as well. You know, people get addicted to drugs; a lot of that is about wireheading in the basal ganglia. And... Yep.
And, of course, habitual learning is something where, you know, when something becomes so habituated, it kind of just moves down the stack into the basal ganglia, and drug taking would be an example of that. And you actually cited, I think, an experiment in China where they removed parts of the basal ganglia and there was a 40% recidivism from addiction. Yeah, a very controversial study that probably violates many ethical codes in
the US. But yes, they did this study for intractable heroin addiction, and they lesioned a part of the basal ganglia called the nucleus accumbens, which is sort of where goals are habitually selected. And it showed a dramatic reduction in heroin addiction. It also had other side effects that maybe doctors would deem unreasonable, but it worked. And it absolutely reduced the addictive cravings that are triggered by stimuli.
Yeah. The story of this chapter, an incredible chapter, I've just been studying it today in great detail, is the story of mentalizing, but I would call it, you know, social complexification. Actually, that's a hypothesis of why our brains expanded so dramatically. So there was this extinction event, I think it was the Cretaceous extinction event, and only birds and, you know, not many things survived.
And then we got the chance to kind of evolve after that, and our brains rapidly exploded, and there are different theories about why that happened. You know, maybe it's because we had preferential access to loads of calories in the form of fruit, and it gave us an incredible amount of time and energy, the excess of which might have led to social complexification. But can you just sort of give us a little bit of background about that first piece first?
Absolutely. So we don't know, there's lots of speculation, but we don't know why this occurred with primates. But we do have some really good evidence that at least part of what drove the explosion in primate brains was social dynamics. Robin Dunbar did the seminal work here, where what he showed is that in primates, the encephalization quotient, which is just a brain size ratio, and especially the neocortical ratio, so the ratio of the size of the neocortex to the rest of the brain, is extremely correlated with social group size.
So the bigger the social group size you find in a group of primates, the bigger their neocortex seems to be relative to their body size. And what's so interesting is you don't see this in most other mammals. So this is not a standard correlation that applies across the animal kingdom; it seems to be very specific to primates. There might be other mammals that show it, but for most mammals, you don't see this correlation.
And so Robin Dunbar's famous social brain hypothesis is what drove the explosion in primate brains is some form of social dynamic between people. And the more social relationships you're managing, the bigger your brain has to be. Now it doesn't explain what specifically happened in the brain, which is what we can get to with mentalizing and why that applies. But it does suggest that whatever drove this explosion in brain size seems to be something correlated to social grouping. And what's interesting about primate social groups relative to not all mammals, but many other mammals is they're very, very political.
So many mammals live sort of solitary social lives, where males mostly live alone and females will rear a child and then usually they'll go off on their own. There are animals that live in herds where they socially group together, but there aren't really very rigorous hierarchies amongst them. But primates, especially apes, have these really complicated social structures with rigid hierarchies. So there truly is someone at the top of the hierarchy, and we can measure this; primatologists have gone to painstaking lengths to verify it. For example, there's transitivity.
So if you show that one primate tends to show a submissive signal to another primate, and that other primate shows a submissive signal to a third one, then it's almost definitely the case that the first one will show a submissive signal to the last one. In other words, these are real, rigid hierarchies. They're not random interactions of submission and dominance. And what we see is that one of the main ways you survive as a primate is you successfully climb this hierarchy. And what's so interesting is that in many mammals, what makes someone the top dog, or the one at the top of a hierarchy, is sort of brawn.
It's just strength. They're trying to flaunt who would win in a physical altercation, which evolutionarily is beneficial because if you can prevent actually fighting each other and just say, this is who would win the fight, then we both save energy just having these sort of fake battles. And then whoever wins gets to eat the food, etcetera. But with primates, it's not always the strongest one that reaches the top. It's the most socially savvy one.
And social savviness comes down to these alliances that are built within primate groups. So you'll see that individuals at the top of the hierarchy frequently will befriend and groom and come to the aid of certain other non family members. And those individuals will thus reciprocate and come to their aid. And there are these really interesting dynamics that play out. You even see wars.
So there are mutinies that take place in primate societies. So in this sort of soup, where the way to survive and get evolutionary advantage as a primate is perhaps not only to make sure you get access to food, but to climb a social hierarchy, all of a sudden there are these huge social pressures to be able to do things like infer what someone else would do in a certain circumstance, or what someone knows, or what you can get away with, or how to change someone's opinion of you. And what that aligns nicely with is what we see of the new brain regions that emerge in primates, most notably a brain region called the granular prefrontal cortex and these areas in the back of the brain called the superior temporal sulcus and the temporoparietal junction. These brain regions across primates are highly implicated in what I call mentalizing, which is thinking about thinking.
But standard literature would call this just theory of mind. So being able to infer the intent or knowledge of someone else. And so it's easy to understand why this would be so adaptively valuable in a sort of politicking arms race where you're trying to sort of deceive each other. There's a great study that really revealed this with primates by Emil Menzel in the 1970s. I love this story.
So he was trying to... He had this one acre forest. Do you wanna ask a question?
Are you gonna do the Machiavellian apes? Yes. Yeah, go on.
Okay, okay. So Emil Menzel had this one acre forest and his main objective was not to study ape social behaviors in the sense of how they would sort of climb social hierarchies. His only objective was to measure spatial reasoning in chimpanzees. And so he had this group of chimps. There was one chimp named Bell, another one named Rock, and there was a few others.
And he would show Bell the location of food. So he would hide food under a bush. And then he would see, would Bell go back to that same location looking for food? In other words, could she remember locations in three d space or two dimensional map like space? And what he found is readily, yes, they do that.
And we now know that lots of mammals are capable of that. In fact, even fish can do things like that. But in this study, he started finding something that was odd. When Bell would find the food, what she would frequently do is share it with her fellow chimpanzee group members. And that was great until Rock, who was a sort of high ranking, aggressive male, would take the food from her when she shared.
And so what she started doing is hiding the food when she found it. So instead of sharing with Rock, she would just sit on the food. And Rock realized she was doing this and not sharing. So Rock would come over and push her to try and get the food from under her. So then what she started doing was, when she knew the location of the food, because the experiment ran on some recurring cycle or there was a signal that the food was now available, what she would do is not go to it until Rock was not looking.
So then what did Rock start doing? Rock started pretending not to look. Rock would look away while Bell would go towards the food. And once he noticed she was doing that, he would turn around and run to try and grab the food before her. So then Bell started trying to lead him in the wrong directions.
And so this cycle of deception and counter deception kept playing out. And what that becomes is this beautiful anecdote, sort of a key study in what happens when you have a bunch of hierarchically interacting animals in an arms race for things like this. What you get is deception and counter deception. And that's really only conceivable with some notion of theory of mind. Because in order for Bell to trick, or try to trick, Rock, she needs to be able to reason: in order for me to change the knowledge in Rock's head of where the food is, what I need to do is walk in this other direction.
And what that will do is make Rock think the food is in this direction when in fact I know it's in this other location. Or also to reason about someone's intentions. I know Rock intends to trick me. So when he's looking away, I don't believe that he in fact is not paying attention. And so this was sort of one of the first early anecdotes that some form of theory of mind is occurring in primates.
There have since been lots of studies that show this. So for example, just to give some case studies here: you can take a chimpanzee and you can teach them that when there are two boxes, the box with the red mark on it is the one with food in it. They easily will learn that. Then what you do is you have an experimenter come in with two boxes, and they bend over and they mark one, and then they pretend to accidentally drop the marker on the other, and then they leave. So the marking is identical in both cases.
The chimpanzees always go for the one that was intentionally marked. Given the same stimulus, they can infer the difference between someone meaning to do something, intending it, and something being an accident. There are other studies of chimpanzees playing with different goggles: one pair of goggles you can't see through, another you can. And if you put those goggles on human experimenters who have food, the chimpanzees always go to the experimenter with the see through goggles to ask for food.
So they're somehow inferring that the other person can't see them, so why would they ask? So there are lots of studies that show this ability. And evidence outside of primates, in other mammals, is very loose. It's inconclusive, controversial, but the loose evidence shows up only in the smartest mammals, which possibly suggests some independent convergence.
So, yeah, there's lots of rich evidence that this theory of mind exists within primates and that it emerges from these uniquely primate regions. And I'm happy to go into the brain evidence, but I'll stop there for a second.
So with the Machiavellian apes, the X risk people, you know, there are people who talk about AI killing everyone, and they make this argument called instrumental convergence, from Nick Bostrom, which is basically that, you know, things like power seeking and deceptive behavior would be instrumental to any end goal. And this is a great example from the animal kingdom of deceptive, Machiavellian behavior. So I guess it does seem plausible, at least on the surface, that, let's say, sophisticated agents following their own intentions and inferring the intentions of others would seek to deceive each other. That's like a natural phenomenon.
I absolutely think so. I absolutely think it is the case that the more autonomy you give an intelligent agent, and the more ability you give it to define its own sub goals, the more risk emerges. Because you absolutely get what Nick Bostrom is talking about, which is that a sub goal of trying to help cure cancer might be to dominate all of Earth and control the labor supply and the allocation of resources amongst all of us. But I don't think that's necessarily inevitable. I think it is a risk. Evolution is a constrained search algorithm for intelligent entities.
It does not give moral weight to what emerges. This is, I think, an important distinction, which is just because something is a natural consequence of evolutionary systems does not mean that we should deem it morally superior.
Yeah. That's the naturalistic fallacy.
Right. Yeah. So it might be the case that it is very likely that species will eventually enter a politicking arms race and certain forms of deception will emerge and power seeking will emerge. That doesn't mean when we produce our own sort of intelligent entities in AI, that we should imbue them with those features. One of the, I think, optimistic outcomes of this new AI world we're gonna enter in the next hundred years is that we actually now as designers can do our best to try and remove some of the evolutionary baggage that we don't like that's evolved in humans in these new entities.
And so there's of course risk. But I think there's also a really great opportunity that we could have benevolent beings that do not seek to dominate and are less selfish; Yann LeCun talks a lot about this. So yeah, I think there's a great opportunity, but there's definitely risk, because the second you give an autonomous agent the opportunity to produce its own sub goals, you need to have either really rich constraints or a really well defined reward function, or one ability that I think comes from mentalizing actually, and this is an idea in alignment research, which is that if you can convince an AI agent to try and do what it thinks the human wants it to do, what you're actually doing is requiring it to engage in some form of mentalizing: to infer the preferences of the requester and then try and do what is best for that individual. Because you can't just have them take requests at face value, because then there are all these opportunities for misinterpretation.
Nick Bostrom's famous paperclip factory: maximize production of paperclips, and Earth is turned into paperclips. We obviously don't want that. But with mentalizing, with the ability to model the internal simulation of another mind and play out how this person would feel about possible futures, you could imagine, optimistically, an outcome where an AI agent could easily infer: if I turn all of Earth into paperclips, that's not what the person giving me this request would have in fact wanted.
They would regret that outcome. So of course, it doesn't fully de-risk things, but it is one methodology, and one learning I think we can garner from evolutionary neuroscience: that mentalizing is a tool that can be used to ground the requests we give each other, so there aren't these types of misinterpretations. Of course, caveat, humans misinterpret each other all the time. It's by no means perfect, but it is a tool.
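As a purely illustrative sketch of that idea, the toy Python below has an agent check each candidate plan against a stand-in model of the requester's preferences before acting; the plans, outcomes, and preference scores are all hypothetical, and a real preference model would of course be nothing like a hand-written dictionary.

```python
# Toy sketch: simulate how the requester would feel about each outcome
# before executing a plan (illustrative only; all values are invented).

def requester_would_approve(outcome, preference_model):
    """Return True if the modeled requester prefers this outcome."""
    return preference_model.get(outcome, -1.0) > 0

candidate_plans = {
    "make_paperclips_efficiently": "more paperclips, world intact",
    "convert_earth_to_paperclips": "maximum paperclips, world destroyed",
}
# Stand-in for a learned model of the requester's preferences.
preference_model = {
    "more paperclips, world intact": 1.0,
    "maximum paperclips, world destroyed": -1.0,
}

approved = [plan for plan, outcome in candidate_plans.items()
            if requester_would_approve(outcome, preference_model)]
print(approved)  # only the plan whose simulated outcome the requester would endorse
```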
And I think that's a very natural phenomenon. I think any intelligent system is naturally incoherent. I think it's impossible to have a single monolithic intelligence which is monomaniacally, you know, focused in a particular direction. But anyway, I wanna just slightly rewind to what we were saying. So the first animals, they had quite simplistic social games that they were playing.
So they were interested in strength and submission, and it was a fairly fixed interface. And what was really interesting is that deer, for example, they lock horns, don't they? So it's predictive. They don't actually have to have a fight, because that would be, you know, evolutionarily not a smart thing to do. So the social game they play, even though the game is fixed, it's predictive, which is fascinating.
And then you are telling the story of, I think, monkeys and macaques, how they have this really interesting virtual social game where strength and social status diverged. So your social status actually became this virtual thing that was based on grooming and pruning and lots of completely unrelated things. And it was entirely possible for a very weak macaque to have a significantly higher social status than a big, strong one. So that's really fascinating. But then we get into, I mean, maybe a broader question, which is that we are still very social creatures ourselves.
We have Facebook, for example. And could you, cynically, argue that Facebook, or, you know, all socializing, is just a kind of arms race to improve our social status? So when we're posting on Facebook, in a way, it's like the deer locking horns. It's us playing these status games without having to have a fight with each other.
I think there is an aspect of human behavior that can absolutely be explained by this. There's a great book called The Elephant in the Brain. Oh, yes. It talks about how much of human behavior can be explained by this sort of status seeking behavior. And the reason why it's so hard to study is because it's what they call, I think, a cognitive taboo or an intellectual taboo, where we don't want to admit it to ourselves.
So we delude ourselves into believing we're doing things for virtuous reasons, because it makes it easier, it's more convincing. If I know that I'm doing something to deceive someone, it's easier for others to tell that I'm doing it. If I genuinely believe that I'm doing these things to help the world, but subconsciously, of course, they're actually just benefiting me, it's more convincing to other people. So their argument is that primates evolved this sort of self deceit to make them more convincing, and this really accelerated with language in humans and all that. So I think it's very likely to be the case that the core thesis of their book is right.
That a lot of human behavior is this sort of subtle status seeking. Which, not to go on a tangent here, but I do think has social and political theory implications: how do we make sure that society doesn't devolve into just a hedonic treadmill? The interesting thing about social status is that it's always, definitionally, a scarce resource, because it's just a ranking game. So unlike physical resources, where it is possible for all of us to live better than kings or queens did a thousand years ago, we can all have better access to medical care, better access to information, better access to food, social status is always a zero sum game, unless someone can conceive of a better way to do it. But I think in most cases it's a zero sum game. And this is problematic, because if over time most of our actions become pursuing social status, then we're gonna forever be in this sort of game.
Now, I don't think personally that we're doomed to this. I think there are absolutely better virtues in the human psychology where not everything we do is based on pursuing social status. And I think you can conceive of dynamics where humans are doing things for other reasons, not just to gain status. But I think it's absolutely fair to say that a surprising amount of human behavior is status seeking and maybe a depressing amount.
Yeah. Yeah. Well, I agree with you, and I don't necessarily want to get too philosophical on that. But, yeah, that book was Will Storr's The Status Game, where he said that there were three meta status games that we played. Yeah.
I think he gave, yeah, the virtue game, the dominance game, the success game. So, you know, I might be playing the success game: I want to have the best podcast or whatever. But, no, the reason I bring this up is that the difference between humans and animals is that they're just playing one game. So it's really interesting that they have this memetic social score, but the game is the same everywhere.
Whereas for us, we go one level of abstraction up. So the success game to us can be manifest in a myriad of different ways. It could be being successful at playing computer games. It could be, you know, writing books or making podcasts or whatever. So it's almost like we fractionate our social ranking into a myriad of different games.
And I think that's a little bit of a testament to the difference we humans have in general with our metacognition, which is our ability to kind of create the memetics, you know, in a novel way.
One thing that's sort of related to this, at least in early human societies, and I'm taking a little bit from the business world here, is that one way to reduce status infighting is to make it such that members of a team have distinct roles. And I don't think this lesson only comes from management theory and entrepreneurship. I think this probably derives from either early human or maybe even early primate societies, where it's much more stable. And you can introspect that it feels much more comforting being part of a troop of a 100 humans where pretty much everyone is pulling their weight and everyone matters because they're doing their own distinct thing. That's a very stable state where we're not infighting as much, because we're all sort of doing our own thing; we all matter to some degree.
But when there are five blacksmiths, or five podcasts, or five books about the evolution of the brain, now all of a sudden these other types of things start emerging, because we're no longer all fulfilling a role that matters. It feels like there's a ranking and only one of these is gonna matter. So as an example of ways to reduce this sort of status seeking, I see this in business all the time: the more you can create an environment where it's not zero sum, where everyone's pulling their weight and together we all win, the more the best version of humans emerges. The more zero sum it becomes, and the less divided the roles and what people are doing are, the more of these sort of, I would argue, primitive primate behaviors start emerging.
An early stage company, so a company of 30 people, has such different dynamics than my last company, where, when I left, we were 400 people. The social dynamics are so different. And I do think one could speculatively correlate that to our evolutionary history, where in a 30 person company you don't need that much structure. I mean, if you have people that work well together, that are aligned on a mission, and you get rid of people that are in general mean spirited or have bad intent, you don't need a lot of structure and process to get people to work well together, to support each other, to move in a common direction. I think what that demonstrates, when one observes it, is that what's playing out is an evolutionary program that got groups of 30 humans to work really well together.
When you're at 400 people, what very quickly happens, and it takes a lot of work to fight this, is you start getting internal factions emerging, because what splinters out is these subgroups of 30 to 100 people that then have their own points of view. And then it's very easy to fall into us versus them with other groups. And you start seeing things break down. And that's where one mechanism for solving this is very rigid hierarchies. It's what the military does.
Another mechanism is to embrace the chaos, which is a little bit more what Google does. Another mechanism is to effectively make it a set of satellite startups, which is what Amazon does, where each group is kind of autonomous and has very clean interfaces to other groups. So there are many different management approaches to this. But the breakdown, I think, emerges from the fact that humans did not evolve to interact with 100,000 people. We evolved in an environment where we interacted naturally with about 100 people.
And that's why that comes naturally; we don't need as much process to make that work, but we do when we scale it up.
Yeah, yeah, it's fascinating. I mean, as you say, you could argue that Amazon has one overarching goal, to make money. But as soon as you increase the autonomy in the organization, it's a very human trait, isn't it? So you were talking about the Machiavellian behaviors and the deceptive behaviors, and you just wonder how much energy is just wasted with infighting. And that's even though I've made the comment that in the military, you know, they might be doing quite simplistic jobs compared to Google.
But even at Google, there's an obsession with job level. I mean, if you go on Blind, that's the only thing people talk about: their total compensation and job level. So, you know, maybe we should save the cynicism. But coming back to the chapter. So we're telling the story basically of how this metacognition and this predictive apparatus, you know, gave rise to an entire suite of complex social behaviors that we see in primates, which is fascinating.
And maybe we should just talk a little bit about what I call why bootstrapping. So it's like: why, why, why. I think there was one guy at Toyota Research who was quite famous because he would get people to ask why five times. So you say why, why, why. And it's almost as if there's some magic number.
You know, like, everyone is only a certain number of degrees of separation apart. And it's a similar thing: you only need to ask why a few times, and you'll always get to some kind of base reason. And maybe that's why, evolutionarily, we have, you know, two levels of this causal metacognition in our brain. You know, we have the agranular prefrontal cortex and we have the granular prefrontal cortex. So I guess, like, one potential question there is why is there not a third level of asking why, and what would that look like?
And can you just kind of sketch out that metacognition picture in general?
When we speculate about what the agranular prefrontal cortex does, a reasonable framework is that it generates explanations of an animal's behavior. It models an animal's own behavior. So one sort of cognitive tool to reason about that is: if it observes a rat wake up, have certain hypothalamic activations, and run in a certain direction to drink water, it produces a representation that could be interpreted as, I am explaining this behavior by: this animal is thirsty. And that can be useful in a variety of ways. It can trigger simulations to find alternative solutions to satiate the same need, right?
So if you think about this, if you put a rat in a novel situation, but the agranular prefrontal cortex infers that right now I am thirsty, now we can start triggering a bunch of simulations to try and satiate the same desire, to fulfill, and this is a little bit in active inference land, what I believe about myself through alternative means, and that enables an animal to be flexible.
Okay,
So this is sort of an explanation of the animal itself. Why would there be another level? What I argue in the book is that the granular prefrontal cortex builds a model of that model. So instead of a simulation, it's a simulation of the simulation. And to see what that would mean, as a thought experiment, let's go one step further, or one step back.
If we could ask the basal ganglia, which is the sort of reinforcement learning system, why did you turn left to go in this direction to drink water, it would just say: because turning left maximizes reward. The answer would always be the same. If you ask the agranular prefrontal cortex, why did you turn left, it would say: oh, because I'm thirsty.
There's a specific thing that me as an entity, this animal I'm modeling wants to achieve. But if you ask the granular prefrontal cortex, it would say, well, I turned left because I am thirsty and that made me think about ways to satiate my thirst. And I simulated going to the left and I remembered water being there because last time I was there, there was water. And so I went to the left. And so in other words, it enables you to simulate different types of simulations and reason about what you would think in a new setting, which of course enables you to think about what someone else might be thinking.
And we do this all the time. I mean, someone doesn't respond to a text message, someone makes an odd facial expression in a social interaction, and we're immediately trying to figure out: what is this person thinking? Why would they do this? Etcetera.
So the first question is why do we even need this new level at all? And I think one of the main adaptive values is that it enables survival in the politicking arms race. Because now, if I can simulate a simulation, I can infer why you might do a certain behavior, how to manipulate someone's knowledge, the intentions behind things. So this is why you would have this extra layer. To go a level above, you could make an argument that theoretically there should be an infinite scaling up of whys.
I think this is maybe a cop out, but I think there are huge energetic costs to any sort of scale up. So the question is not would there be benefits to a third level of hierarchy, but would the benefits of a third level of hierarchy outweigh the massive energetic costs of producing it? And so I think that would be my first blush explanation as to why we might only have two levels instead of three or four: because the second level added clear adaptive value relative to its cost, to survive in the politicking arms race, and the third one perhaps was superfluous and unnecessary relative to the energetic cost.
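As an illustration of the three levels of "why did you turn left?" just described, here is a toy Python sketch; the class names and canned strings are my own framing, purely for illustration, and are not taken from the book's actual models.

```python
# Purely illustrative sketch of three levels of explanation for one behavior.

class BasalGanglia:
    def why(self, action):
        # Model-free: the answer is always the same.
        return f"Because {action} maximizes expected reward."

class AgranularPFC:
    def __init__(self, inferred_need):
        self.inferred_need = inferred_need           # e.g. "I am thirsty"

    def why(self, action):
        # First-order model of the animal itself: explain via an inferred intent.
        return f"Because {self.inferred_need}, and {action} satisfies that."

class GranularPFC:
    def __init__(self, inner_model, simulated_plan):
        self.inner_model = inner_model                # a model of the model
        self.simulated_plan = simulated_plan

    def why(self, action):
        # Second-order: explain the simulation that produced the intent-driven choice.
        return (f"{self.inner_model.why(action)} I simulated {self.simulated_plan} "
                f"and remembered water being there, so I chose {action}.")

apfc = AgranularPFC("I am thirsty")
gpfc = GranularPFC(apfc, ["turn left", "walk to the stream", "drink"])
print(BasalGanglia().why("turning left"))
print(apfc.why("turning left"))
print(gpfc.why("turning left"))
```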
I think having that second level of metacognition does a lot of work, right? And I'm gonna talk a little bit about that now. But one of the things is you can infer the intents and knowledge of others through the same process of running simulations yourself. So you can kind of imagine, let's say, yourself doing something, but swap out the pointer to be someone else and swap out the knowledge to be someone else's, and that's incredibly valuable.
But the knowledge thing is really, really interesting. So I asked the question last time, and this is something that I've been quite confused about, and I feel that reading this chapter has actually really cleared it up for me, which is about goals. Because when you look one level down at the agranular prefrontal cortex, it's modeling intents. And then this granular prefrontal cortex is trying to seek explanations about the level below, which is the agranular prefrontal cortex.
It's going a level of abstraction up, and it's modeling goals, not intents. But it's actually modeling knowledge. What it's doing is categorizing. So when you have a simulation of simulations, what it's doing is creating a category. So to the example you just gave before, thirsty becomes a category.
So rather than it being a pointillistic intent, you know, it's a little bit like saying, I can go and have a sandwich, or I can go and have McDonald's, or my abstract simulator could kind of draw a boundary around those things, and now I'm getting food. So as well as being able to categorize intents in yourself and other people, you're also categorizing knowledge, and then it can be shared memetically. So it's almost like just going to that second level of metacognition gives you so much that you didn't have before.
100%, yeah. I think thinking about the granular prefrontal cortex and the new primate regions as enabling something akin to knowledge is a really wonderful way to look at it, one, from the connectivity analysis, and two, just from what we mean by knowledge. So, connectivity analysis: if you look at the superior temporal sulcus and the temporoparietal junction, these are regions of the posterior cortex that, in simple terms, are at the very top of the hierarchy. I mean, they get multimodal input from all the other regions of sensory cortex.
So a very sort of simple rule of thumb for understanding this is just this models the rest of sensory cortex. I understand the full rendering of the simulation of the external world that is happening and this is where I build a representation of that. And it is perhaps no coincidence that's also where we see brain regions light up when you're engaging in things like theory of mind and solving false belief tests. In other words, trying to infer the knowledge of someone else, these same regions light up. And what do we mean by knowledge?
I would argue that knowledge can mean a few things. One is procedural knowledge where I just know how to do certain motor behaviors. I don't think that's what we mean. I think we mean more semantic knowledge or episodic knowledge, which would be I know that water is over there. And I know that if I do this behavior, this will be the causal outcome.
That type of knowledge I think is absolutely rendered in the mental simulation. When I imagine certain things, when I imagine the case of lightning hitting the ground, what do I see afterwards? I see fire. And that's the source of my knowledge about the causal relationship between these two things. So having a layer that models the simulation enables me to reason about my own knowledge and to see what the effect of changing knowledge is on behavior.
And this, of course, enables us to flexibly adapt to other people's behavior and predict what they would do under cases of different knowledge and different intents.
Yeah. That's fascinating. But you did say that there was a bit of a riddle about the granular prefrontal cortex, because there was one study where it could be damaged and the person would still score really highly in IQ. But you said it's about being able to project yourself into simulations, this kind of abstract modeling of your own mind. So in this particular case, how could the person still score the same IQ without that part of the brain?
So this is such a cool story in the history of neuroscience, where you would think that if you look at a human brain, I mean, granular prefrontal cortex is this huge region in the front of the brain. It takes up a gargantuan amount of space. You would think that taking a chunk out of that part of the brain would have a gross effect on a human being. Just like if you took out even a relatively small region of the back of your brain, which is where your visual cortex is, you become hugely visually impaired. You take a region of your motor cortex out and humans become largely paralyzed for months until they recover from that.
You take a region of auditory cortex out and they can lose the ability to recognize even words. So there are relatively small regions of neocortex where, if there's damage to them, there are gross, obvious effects on human intelligence and behavior. And after World War II, there were so many patients with brain damage that there were all these studies, and people could not figure it out. It was a puzzle: what does this huge region of prefrontal cortex do? When people don't have it, something seems off about them, but it's not obvious what is wrong with them. People would note personality changes.
They don't seem to be themselves. But on sort of logic tests, on IQ tests, it wasn't obvious they were dramatically impaired in many cases. And there was one famous case where they could test someone before and after, because for surgical reasons they were gonna remove parts of the granular prefrontal cortex. And this patient actually improved on IQ tests. So this rendered a huge puzzle: what does this part of the brain do?
And so then, if you track the studies from that point forward, we start learning that what granular prefrontal cortex does in large part isn't related to these types of logic puzzles. It's related to thinking about thinking and modeling ourselves. So for example, you look at someone who has damage to granular prefrontal cortex, someone who has damage to the hippocampus, and then someone who has a normal brain, and you ask them something very simple. You give them a random word and you say, just tell me a story. Just imagine a story of you with this word.
The word could be restaurant. And you compare these stories, you immediately see something very different. The people with hippocampal damage give a very, very rich story about themselves, but the external world misses details. So there's not a lot of rich details about the external world. This is consistent with the idea we talked about with early mammals where hippocampus helps render a simulated external world.
The people with granular prefrontal damage could render a very rich external world. They could tell you the details of the leaves, the smell of food, exactly what a restaurant looked like, but they themselves were woefully missing from the stories. They could not project themselves into this imagined world. And so then if we go back and look at all these other things that light up granular prefrontal cortex, if you ask someone to think about how they're feeling, granular prefrontal cortex lights up, self reference. But if you ask another question such as what does it look like outside, the granular prefrontal cortex does not activate.
The agranular prefrontal cortex will activate in both cases. So we start to see that it's in cases of thinking about yourself and thinking about others that this granular region gets activated. And now, if you go back and study these people more deeply, you notice that they become hugely impaired at false belief tests. They can't recognize faux pas, so they don't understand what's not really appropriate, which of course makes sense, because how do I know what's appropriate?
I'm gonna infer how you feel about the things that I'm saying. And so you see all of these sort of mentalizing impairments that emerge, but it's not related directly to these logic puzzles that are typically in things like IQ tests. Yep. You mentioned
the false belief test. Can you just briefly sketch out what
that is? Yes, so there's a good picture. Do you wanna hold it up? Oh yeah, let me show it; it's from another podcast. Okay, so the way the test works is you have Sally on the left, who has a basket, and then you have Anne on the right, who has a box. So Sally puts a marble in the basket and then she walks away.
Then Anne goes over and moves the marble from Sally's basket and puts it into her box and then leaves. When Sally comes back, where does she look for the marble? It's so simple, but in order to figure out that Sally will look in the basket, you have to understand that it's possible for another mind to have incorrect knowledge, to have a false belief about something. Young children don't understand this. They assume that knowledge is universal.
Everyone has the same knowledge about the world. But at a certain point, they start learning that it's possible for people to have false beliefs. And so we actually know that nonhuman primates can do this. They've done studies on macaques where you do exactly the same stallion test and you just look where their eyes look when the person comes back into the room to look between the two boxes. And they always look or tend to look in the direction of where that person thinks the marble is or the piece of food is, not where it actually is.
If you inhibit their granular prefrontal cortex through an injection or another mechanism, this bias goes away. They no longer look in the right direction. There's lots of really good evidence that this sort of false belief mechanism is occurring in these primate regions.
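To make the "false belief" structure explicit, here is a minimal Python sketch of the Sally-Anne setup as belief tracking; representing beliefs as a dictionary is of course a deliberately crude, purely illustrative device.

```python
# Minimal sketch of the Sally-Anne test as belief tracking (illustrative only).

world = {"marble": "basket"}          # ground truth after Sally hides the marble
sally_belief = {"marble": "basket"}   # Sally saw where she put it

# Anne moves the marble while Sally is away: the world changes,
# but Sally's belief is not updated because she didn't observe the move.
world["marble"] = "box"

def where_will_sally_look(belief):
    # Passing the test means answering from Sally's (possibly false) belief,
    # not from the true state of the world.
    return belief["marble"]

print(where_will_sally_look(sally_belief))  # "basket" -- a false belief
print(world["marble"])                      # "box"    -- the actual location
```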
Yeah. And what really hit home for me is that, in a way, it's not even knowledge. It's all simulations. It's just simulations of other agents. And we've always kind of spoken about knowledge in some weird, Platonic, abstract sense.
And I quite like the idea that the the primitive form of communication between humans is just simulations even when we're speaking to each other.
Yeah, exactly. So how do we solve the Sally-Anne problem? It probably happens very quickly, but we just simulate what we would think if we were in Sally's shoes. And then I realize, well, I would look in this place. And this helps us reason about other people.
And this raises an almost profound question about how unique theory of mind is. This brings me to a question that I've been asked multiple times, which is: does ChatGPT have theory of mind? The evidence, I should stipulate for anyone curious, is that if you ask GPT-3 these sorts of theory of mind puzzles, it does terribly. So that's an easy one to discard out of hand. But if you ask GPT-4 these theory of mind puzzles, it performs remarkably accurately, like at human level on these theory of mind puzzles.
And there have been people who have explored: is it just in the training data? And there's good evidence that it's not, that they're not just regurgitating what was in the training data. So does this mean that ChatGPT has theory of mind? I think there are a few ways to reason about this. One is: what do we mean by theory of mind?
If by theory of mind we just mean the ability to solve these sorts of false belief puzzles, then I think you have to accept the fact that yes, it can solve those tasks. The problem is that the way in which it renders this model of other minds is not through having a similar mind itself. And so what this means is we should be concerned. It doesn't mean it won't work well, but we should be concerned about how well this will generalize to real tasks where we might care about this much more deeply. So for example, with a human, good evidence suggests that part of my ability to reason about your mind comes from the fact that I have a mind that works quite similarly. We are almost bound together by some common mechanistic synergy, because our brains are quite similar, and that enables a lot of data efficiency: I'm pretty good at predicting what people do, not perfect, but pretty good, because we're all people and there are similarities in how we act.
And so that makes us quite data efficient and decent at generalizing to new situations, where we put people in new places that we've never seen before. I can kind of guess, well, if I were in that situation, this is what I would do. GPT-4 has learned to build a theory of mind simply by reading text of these puzzles. And so clearly it has some mechanism to build a model of predicting what people will do in certain circumstances and differentiating knowledge and intent, etcetera. But the concern is twofold.
One, what will happen if we take those types of models and put them in very new situations that are not based on just these puzzles, but for example, we're asking them to optimize a paperclip factory? That's a situation where we should be concerned. How well will it do at actually inferring what we mean by what we say? And the second is data efficiency, which is how much data did it have to see to build this model? If it was a ton of data, then it's gonna be problematic if we have these new situations where we wanna teach them to model people's behaviors in this new place.
If it requires a ridiculous amount of data, then it's always gonna sort of be slow to learn these things and always be at risk of not generalizing well when we put them in these new situations. So my answer here is nuanced, which is: if by theory of mind we mean solving these puzzle questions, I think it's very hard to say that ChatGPT does not have some model of human behavior. But I do think the human and primate mechanism for doing so has a data efficiency advantage and a mechanistic synergy advantage. In other words, we can use ourselves to reason about things, and that is relevant. And if we want these systems to do a good job listening to human requests, we shouldn't translate performance on false belief tests into believing that they'll do a good job correctly inferring our intent and knowledge in new situations.
Yeah. I I would agree with that. I I think ChatGPT, it's in the the world of text, and it's learned all of this structured narratology and things on Reddit and things on Twitter. And as we were saying last week, you know, language has evolved to be very simple so that it's learnable by children. It has a small subspace.
But it is it is a real kind of generalization over human behaviors, and it's and it's in this very low resolution substrate. Whereas in the Machiavellian apes example that we were talking about before, these are agents performing real time sensing and inferencing and making, like, you know, in the moment judgments, and they're they're in this continuous sensor domain where they have many, many different types of signals, you know, visual signals, sound signals, and also, you know, memory of of what happened in in those dynamics just before. So it feels like a a difference in kind to me between those two situations. But it is remarkable that in the the GPT domain, any kind of theory of mind could could work.
One good example of this, I think is, is there a difference in our human ability to predict behavior between a car and a person? So the brain is always able to model things it observes and simulate it and predict what it will do. So I can look at a car and I can imagine different colors of it and I can imagine what will happen if I drop it and it rolls down a hill and I can, we build models of things all the time. We build models of computers and models of, I'm just looking around the room, of books. So the brain produces models of things.
Is the way that the brain produces models of other human behaviors exactly the same, or is there some unique advantage? And my argument is that there's something unique happening when I'm building a model of another person, which is I'm leveraging my own inner simulation of things as a useful prior to try and predict what other people will do. And so ChatGPT models human behaviors, to draw a crude analogy, the way we would model a random object, which is I'm only modeling it based on seeing its behaviors in certain situations with the data I received. On the other hand, when we model someone else's behavior, we're doing some form of projection and using the prior of how we would behave. And we probably bootstrap part of our model of human knowledge and intent based on our own introspection.
And I think in that way, it is a difference in kind.
Fascinating. I I completely agree with that. The selfish gene is kind of saying, it doesn't matter what what you folks do. The gene is kind of directing your behavior, and you don't really have as much agency as as you think you do. And it's a similar thing with language.
If you think of language as being a super organism or a virus and we are the hosts and information is being shared memetically, and and it's kind of, you know, it's it's shaping our evolution, but it's also shaping our behavior. So it's almost like when we become infected by certain, memes. It might be like religion, for example. It completely, you know, it's almost like it parasitically affects our behavior. But I think there is a difference between social memes and physical memes.
So tool use, for example, that doesn't seem to have the same parasitic effect. So, you know, if you look at the behavioral complexity of apes, because they don't have these, you know, novel virtual memes in their culture, their behavior seems quite monolithic compared to ours. But I just wondered if you could kind of contrast that next level of mimesis.
So there's been lots of great sort of writing about the distinction, which in the literature is typically called cultural transmission, between sort of non human primates and humans. And a lot of sort of the general consensus here is that although there is transmission amongst non human primates, which we see in particular with tool use, it doesn't sort of accumulate in the same way that it does with humans. In other words, humans can pass a piece of information to another generation, which that next generation will reliably copy and then can merge with other new information, which they can then reliably copy. And you do this over a thousand years and you go from, I know how to whittle a bone into a needle for sewing, to all of a sudden now I've built a loom, right? And these ideas keep accumulating on top of each other.
Whereas in non human primates, we don't see the same type of accumulation. And that's what I, almost in a pithy way in the book, called The Singularity That Already Happened, which is once you enable these memes or ideas to accumulate across generations, you get what you're describing as this sort of memetic organism that we are the substrate for, for sure. Now, what I think is interesting here is one lens through which I like to think about this is sources of learning. So if you think about how nonhuman primates learn, there's sort of three sources. One is they learn from direct experience.
So their own actual actions. This is reinforcement learning writ large. I do something, it succeeds, it fails fine. Another is their own imagined actions. This is the part that evolved in early mammals.
So I can imagine doing five different strategies to try and get to the food over there and I find the one that worked and that's a source of learning. My model of the world became a source of figuring out the right path. So my own imagined actions. What mentalizing enabled with primates is this third mechanism, which is learning from other people's actual actions. So I can see my, if I'm a young chimpanzee, I can see my mother using a stick to put into this termite mound and pull it out and this complicated behavior and eat food.
And I don't have to do my own behaviors to do that. I don't even have to simulate doing it. By watching her do that skill, I will adopt and learn. But what nonhuman primates don't have, which is what's very uniquely human, is learning from other people's imagined actions. This is sort of the key breakthrough that happens with language, which is the bandwidth through which nonhuman primates can communicate what we're sort of calling knowledge here is only through actions themselves.
I can't describe if I were a nonhuman primate, what I saw when I imagined five different ways to try and hunt the boar over there. I can just do it and you can learn from what you saw me do. But language enables us to share the outcomes of our imaginations. And so that is a way higher bandwidth mechanism for translating information. So that enables accumulation across generations.
So for example, it's so easy to think about ways in which this would be adaptive. One would be sharing semantics. I go into the forest. There's two snakes there. One bites me and I'm fine.
The other bites me and I get really sick. I come back and I say, green snakes are okay, red snakes, don't go near. And that semantic knowledge now exists amongst the whole troop. In the old world before there was language, only the people who saw this happen would have the knowledge. Now I can transmit it: I simulate the episodic memory in my mind and I translate it to everyone.
The other is coordinated planning. So before language, it would not be possible for five humans to jump in trees and say, okay, here's how we're gonna hunt these boar. We're gonna stay silent and then I'm gonna whistle three times and then we're all gonna jump down and surround the one in the back. That type of planning is only possible because one person can simulate something and then translate and say, Hey, when I imagine this happened, we succeed. And other people of course can edit that simulation and say, When I imagine that happened, I don't see us succeed for this reason.
You can start refining. So this ability to have a source of learning from other people's mental simulations is sort of what I would argue is the source of this very unique human superpower that emerges from language. And of course now with such a high bandwidth transference of mental simulations, you do get this sort of quasi evolutionary process, which is what Richard Dawkins is talking about, where the memes, the ideas that do a good job propagating are the ones that will propagate. And the ones that for whatever reason are not viral, either because they don't do a good job of maintaining their hosts, so the ideas are bad and I end up dying, or because I just don't have an incentive to share them, those ideas die. And so then you get this sort of meme evolution.
But to me, the source is the fact that language enables us to share in our simulations. It becomes a much higher bandwidth communication mechanism.
I'm fascinated by this idea of the meme itself being an agent, being a virtual agent. And it in in expressing its agency, it needs to manipulate us. So you might argue, as you do in your book, that there has to be some kind of traceable chain down to the basal ganglia. So we have many, many levels of bootstrapping. And at some point, the thing exists because the basal ganglia says, oh, that's good.
I like that. So then we have one level, and then we have another level, and then we have, like, the hemisphere. And it's almost as if that thing is manipulating us, because when you have weakly emergent macroscopic phenomena, part of the definition of emergence is surprise. So it's macroscopically surprising. So it does something which is completely unexpected and unlike the things going on below.
It's just kind of weird that it might be manipulating us down here, but doing something completely different up there.
Yeah. So I think there's probably, this is mostly fun speculation, but I think there's probably two ways if I'm gonna draw analogies to brain regions and intellectual features of the human brain, I think there's probably two lightweight ways we could think about why memes become sort of attractive. One would be sort of the older vertebrate like structures. So this would be basal ganglia plus amygdala. And these are things, a meme that makes me feel fear or makes me think that unless I take an action, something bad is gonna happen to me and one of those actions has to be sharing it, then that's gonna be highly viral.
If you make me afraid for my family's well-being, you're gonna activate my amygdala. And even if there's only a two percent chance that this is true, I might still share it, right? So you get these sort of, we're not good, humans are not good at dealing with low probability, high magnitude events, which is another brain constraint. The other key thing that also exists at the level of sort of early vertebrate like structures is a preference for surprise. So in order for reinforcement learning to work well, it's very effective, we see this in AI systems, to make people pursue actions that are novel.
Because that's one way in which we can explore new areas and learn new actions that sort of explore the space of possible choices to make. This is one intrinsic way to get trial and error to work. So the way casinos make money from you is they hack into this sort of preference for surprise. If there's a 0% chance of winning, you would never play. But if there's a net 48% chance, meaning in the long run you'll lose money, but every once in a while there's some surprising win, that actually is over the threshold of being worth it to the basal ganglia, because the surprise is so exciting.
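As a rough illustration of that "preference for surprise" in RL terms, here is a minimal sketch of a count-based novelty bonus added to an external reward. The scheme, constants, and state encoding are generic assumptions for the example, not a description of any particular system or of the basal ganglia.

```python
import math
from collections import defaultdict

# A count-based novelty bonus: the external reward is augmented with an
# intrinsic term that is large for rarely visited states and decays as a
# state becomes familiar.

visit_counts = defaultdict(int)
BETA = 0.5  # weight of the novelty bonus (an assumption for the example)

def shaped_reward(state, external_reward):
    visit_counts[state] += 1
    novelty_bonus = BETA / math.sqrt(visit_counts[state])
    return external_reward + novelty_bonus

# Casino-style gamble: win 1 unit with probability 0.48, lose 1 unit otherwise.
# Expected external reward = 0.48 * 1 + 0.52 * (-1) = -0.04 per play, so a
# purely external-reward agent should walk away. Early on, though, the novelty
# bonus can keep the shaped reward above zero, which is one way to read the
# "surprise keeps you playing" intuition.
print(shaped_reward("slot_machine", -0.04))  # first visit: -0.04 + 0.5 = 0.46
```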
It's one way to think about it. So I think if you get something that creates sort of innate fear or some great outcome or surprise, you engage these older structures. With mammalian structures, I think there is sort of an active inference play here. You can even correlate it to granular prefrontal cortex or things like identity, where if you give me some information that is consistent with my model of myself, and I'm highly motivated to maintain my model of myself for a variety of reasons that we can talk about, then I'm more likely to maintain this belief. Versus if you give me information that's inconsistent with my model of who I am, people are highly likely to reject those beliefs.
So this is another speculative way to think about why memes sort of persist within their little echo chambers and it can quickly become sort of identity wars: if one's identity is consistent with a certain set of beliefs, then that almost creates a gated wall for certain types of memes to enter. It becomes much harder for certain ideas to enter my mind, and it creates a very porous film for other types of ideas: ideas that are consistent with my identity become very easy for me to adopt. And I think that's another, if we're gonna frame memes as having agency, one strategy by which a meme would seek to survive is you find a way to be consistent with certain people's view of themselves and the world, so that what you're doing is reinforcing it as opposed to challenging it. Now there is a very clear difference though, which is the human brain is analog and these machine brains are digital, and there's pros and cons of each of these. So a digital brain, Geoffrey Hinton talks about this, to make sure I'm citing the source of these cool ideas.
But he has a great talk where he describes very well that the benefit of a digital brain is it's immortal: all the weights are stored in sort of binary, so I can very easily transfer it to different brains. But it's hugely energy inefficient, because I need to model everything exactly to be copyable. The human brain is way more efficient, but it's not copyable, because the information exists in the physical representation of the analog connections between all these neurons, the protein receptors that exist in them, the gene expressions, all this crazy stuff that makes it non copyable. And so it'll be interesting to see what the energy efficiency is, for example, of a digital AI system that actually attempts to recapitulate a human brain; that might be very energy inefficient. And it might open the door for a whole new area of research that I think would be really fascinating, which is building analog brains.
Can we have these systems that actually work in a more analog way? And the way they pass information to each other, this is also a Geoffrey Hinton idea, is by teaching each other. And because they're AI systems, they can teach each other with better fidelity than a human because they can actually share probabilistic outcomes as opposed to just the words we say. I mean, they can generate way more samples for each other than a human could because they can live much longer, etcetera. But there's a whole emerging world of the distinction between these digital machines that are immortal, but very energy inefficient and then analog machines that are much more energy efficient, but less good at translating information.
To one of your points that I think is really key: one of the main things missing, that I don't think is talked about enough, and maybe there'll be a breakthrough soon, that would be great, but I don't see very clear ideas over the horizon that will solve this, is the continual learning problem. I would say this is one of the essential lines that differentiates biological brains from modern AI systems, which is that the way in which AI systems are trained is such that we cannot let them continuously learn from new experiences, because it disrupts the old information they have.
So whether that is an architectural constraint or something that needs to change in the underlying learning algorithm itself, lots of open research and debate about that. But the fact remains that if you allowed ChatGPT to learn from every chat that happens to it, it would get rapidly dumber. And that is not the case with humans. We can continuously update our information. Our representations are robust.
And I think for many of the applications for AI systems that are gonna be most impactful, continual learning is going to be an essential component, because we're gonna wanna bring an AI agent in, show it new information and immediately have it incorporate that without forgetting old things. So I think that is very clearly an area where there's a lot of really interesting research happening and a lot of research left to be done.
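As a hedged illustration of the forgetting problem being described, here is a toy sketch in which a small network is trained on one synthetic task and then another, with a simple experience replay buffer as one common mitigation. The tasks, architecture, and numbers are invented for the example and assume PyTorch; this is not any particular model's training setup.

```python
# A toy sketch of catastrophic forgetting and one common mitigation
# (experience replay).
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    # Two-class problem whose decision boundary depends on `shift`.
    x = torch.randn(512, 2) + shift
    y = (x[:, 0] + x[:, 1] > 2 * shift).long()
    return x, y

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
replay_buffer = []  # small slices of past tasks, kept for rehearsal

def train(x, y, use_replay):
    for _ in range(200):
        bx, by = x, y
        if use_replay and replay_buffer:
            idx = torch.randint(len(replay_buffer), (1,)).item()
            ox, oy = replay_buffer[idx]
            bx, by = torch.cat([bx, ox]), torch.cat([by, oy])
        opt.zero_grad()
        loss_fn(model(bx), by).backward()
        opt.step()
    replay_buffer.append((x[:64], y[:64]))  # remember a bit of this task

def accuracy(x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

xa, ya = make_task(shift=0.0)
xb, yb = make_task(shift=3.0)

train(xa, ya, use_replay=False)
acc_before = accuracy(xa, ya)
train(xb, yb, use_replay=False)  # naive sequential training on the new task
acc_after = accuracy(xa, ya)
print(f"task A accuracy: {acc_before:.2f} before, {acc_after:.2f} after task B")
# Accuracy on task A typically collapses. Re-initializing the model and
# repeating with use_replay=True rehearses old examples alongside the new
# task and usually preserves much more of task A.
```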
Why are we superior to animals?
So there's been such a long history of us pontificating on the various chasms, or attempting to create a chasm intellectually, between us and other animals. The most famous form of this, which I think still shows threads in modernity, is from Aristotle, where he sort of took the same kind of ideas that you see in MacLean's triune brain: other animals might have these basic instincts, they might have some form of emotions, but what they all lack, which humans uniquely have, is this notion of reason. We can uniquely reason about things in the world. And I think as I try to argue in the book, and I think most comparative psychology demonstrates quite clearly, there are clearly forms of reasoning that we see in other animals. And so, you know, of all of the different abilities and capacities that seem unique to humans, the one that stands out as most salient is undeniably language.
Because we, despite many painstaking attempts, have not even been able to teach chimpanzees, bonobos, or gorillas to speak with the same degree of fidelity as human language. Now there is some controversy as to the extent to which Kanzi and Koko and Washoe passed the threshold that we define as language. And so that can be debated: where do we draw the line? But undeniably, most people would agree that these nonhuman primates do not learn language naturally without painstaking attempts to teach them. And when they do learn language, it does not show the same sort of flexibility as humans.
Now, what makes language unique is two things. One is declarative labeling. So there is a distinction between imperative labels and declarative labels. An imperative label is learning that a phrase, a cue, leads to a reward if you take a particular action in response to it. So when a dog responds to a specific cue with a response and then you give them a treat, that's not what we define as language; it's an imperative label.
A declarative label is when I say dog, and in your head, you know that it references a concept or a thing. Now we have a label for a concept or a thing. And so it's not at all clear that other species perform this type of declarative labeling. And if they do, it surely evolved independently. Like, we're quite confident that early primates didn't have this ability.
The second thing that makes language unique is grammar. So we can take these declarative labels that reference things or actions, and then we can weave them together in a certain structure, and the structure itself has meaning. So the basic example is just the ordering of phrases. So if I say Ben hugged James, that means something different than James hugged Ben, despite the fact that it's the same phrases, the same declarative labels; the order conveys meaning. So there's a whole interesting world of why language, if it is the case that language is the fundamental difference, why is it language that has allowed humans to sort of take over the world?
And that's another interesting topic we should discuss. But I would argue that primarily what makes humans different is language. And Aristotle's idea of reason, we see at least in smaller forms in other animals.
Yeah. I mean, it's quite interesting because because you said right at the very beginning that, you know, yeah, Aristotle, he spoke about the rational soul that we have. Yep. And even in the twentieth century, you know, we spoke about things like mental time travel and our sense of self and tool use. And it's really interesting because we look in the animal kingdom, and one by one, all of these things that we thought placed a bright line between us and animals faded away.
Some people think that language is a continuum, that there's just a gradation that, you know, if you scale up the, the brain of of the ape that you will get human language. Is that the case?
So the reason I'm very skeptical of that claim is we don't see variance in language abilities based on brain size. Children who learn language at the age of four still have small brains, relatively speaking. I mean, I'm actually not sure of the exact brain size comparison, but I'd be curious about the brain size of a four year old child relative to an adult chimpanzee, just based on volume. The other interesting case is, yeah, floresiensis. Yeah, Homo floresiensis.
Oh yeah, from Indonesia, the ones with the small brain.
Yeah, yeah, yeah. Yes, so Homo floresiensis is a great case study here, because we found fossils of ancestral humans in Indonesia who were effectively miniature humans. I mean, they had shrunk in size to, I think they were like three and a half feet to four feet tall. And their brain capacity, we can see from their fossilized skulls, had actually shrunk from our ancestral humans. So they were marginally larger than the size of a modern chimpanzee brain.
And yet they showed a lot of signs of superior human intelligence despite having smaller brains. They showed tool use that was akin to ancestral humans. They had Oldowan tools, which are supposedly a sign of uniquely human intelligent tool making. And so that is suggestive of the idea that whatever unique intellectual capacities humans gained around two million to one million years ago were present despite their shrinking brain. So either one has to argue that language evolved much earlier, which some people do, or whatever sort of proto language emerged back then was present even when these brains started shrinking. Which suggests to me, and this is actually aligned with the ideas in The Language Game and my understanding of them, which is a great book, that fundamentally what's unique is we have an instinct to learn language.
It's not that we have some unique capacity for language. And I think that is a key difference that we can talk about because when you look at children who learn language, there are two very unique features of how they go through language learning. By the young age of around two, they're already engaging in proto conversations. So even actually a younger infant will pause, will match the pausing of their mother. So even if they're just babbling, they will engage in the synchrony of babbling time intervals.
And so that is clearly a demonstration of some initial instinct, which shows the drive to engage in some turn taking interaction with you. The other unique thing that emerges a little bit later is joint attention, where human children will uniquely attempt to get their parents to engage in attention towards the same object. And scientists have gone to painstaking efforts to demonstrate that this attempt to get a parent to attend to an object is not an attempt to get the object. So a child or an infant will be dissatisfied if the parent doesn't look at the object but they get the object. So if a third party comes in and hands it to them, they're dissatisfied. If the parent looks at them and is excited when they're pointing at an object, the kid will also not be satisfied. But only when the parent looks at the object and then looks back at the kid and smiles is the child satisfied. So there's this instinct to engage in conversation and to jointly attend to things, which gives us the sort of instinctual foundation on which you can start adding declarative labels. Because when you have joint attention to something and you're paying attention to this turn taking, it enables you to label things and say, well, this means run or this means book.
So I think all of that makes it hard to argue that it's just a consequence of a scaled up brain.
One of the things that that's really fascinating is that animal communication seems extremely superficial. And when I say superficial, I mean that when you take different populations of the same species or different species, the the kind of the expression, the complexity, it's it's very, very simple. We don't see this, like, incredible fractionation and divergence that that we see in in human language. And as you articulated just a minute ago, a big part of that is this declarative labeling, which is that, you know, one of one of the reasons presumably for language is the ability to, do variable binding on symbols. So to say, you know, this thing is a dog.
That thing's a bear, and to be able to dynamically manipulate that. It seems to me that you can think of language as a form of agentic communication. So the difference with humans as language users is that we are agents, and being an agent is about being able to, you know, have your own directedness and plan many steps ahead and take control of your environment and so on. So the difference in communication with animals is that the information content is more in the environment around them, whereas, you know, with human languaging, a lot of it comes from the agent itself. So I just wondered whether you could think of any weird way to distinguish human languaging from animal communication.
One line that I think there's some good evidence to suggest exists between human communication and non human primate communication is that humans have much more of a desire to share what's going on in our own minds. There is a unique pleasure we have from sharing our thoughts. And when we look at the communication styles of nonhuman primates, even when we go through these language learning experiments where they have forms of communication, there's much less of a desire to share the thoughts that are going on in one's mind. And one line, though there's some controversy around this, is that humans from a very young age will ask questions. They'll inquire as to what's going on in someone else's head.
And with the exception of maybe Kanzi, who there was some argument may have asked questions, you did not see nonhuman primates probe the minds of other individuals. Even though we know they have theory of mind, we know when they're trying to deceive others or trying to learn actions by observation they clearly engage in theory of mind, but when it comes to language, they weren't interested in inquiring as to what someone is thinking about. And so I think in that sense, there's an agency to language being a tool for inquiring as to what's going on in someone else's mind and sharing what's going on in your mind. And this is in part why language is a superpower: it provides a completely unique source of learning, a completely unique source of data for learning. So nonhuman primates can engage in learning through observation, because I can see someone take an action, as you said with imitation learning.
I can see you open a puzzle box to get food and I can learn from observing your actual actions. And the way I do that is because I can infer the intent of what you're trying to do, and then I can figure out which of the actions are relevant and which are irrelevant. So a monkey or an ape will ignore irrelevant actions when they observe you do a task. So they've done these experiments with humans and chimpanzees where they do all these actions to open a puzzle box, and they do some random actions, and, you know, chimpanzees will ignore the random actions, which suggests they can infer the intent of it, which is great. But chimpanzees don't learn from what's going on in your head.
And so the ability to learn from others' mental simulations is what's so powerful about language. Because I can say, you know, I just went over to that forest over there and I saw a red and a blue snake. And I saw that the red snake is really dangerous, but the blue snake is not, because the blue snake bit me and nothing happened. And so I share that episodic memory and now everyone has that knowledge, even though that was just in my mental simulation.
Or when planning a hunt, you know, a group of five humans, I can imagine a strategy of how all five of us are gonna coordinate, see it succeed in my mind and then share the results and the plan with everyone. So language enables us to tether our mental simulations to each other. And I think there is a sense of agency in the idea that there's a purpose to that. There's a volitional purpose to the communication. The neurological underpinnings of communication that occurs in nonhuman primates is more analogous to our emotional expressions than it is to language.
And we see this also in the brain. So monkeys and nonhuman apes have these innate expressions that they do, which are genetically hard coded. And we know that because they're the same even across species that have often never interacted with each other. And they come from neurological structures similar to our laughing and crying. And so it's clearly a hard coded emotional expression.
In that sense, it doesn't have the same volition because I'm not doing this action to communicate a concept to you. I'm I'm doing this action as an innate response to a cue or a feeling I have. So, yeah, I think there there's there's meat to that idea.
Well, a few few things to explore there. I mean, first of all, we should just talk about how we became a collective intelligence after the fact. So I'm not sure whether that's unique to humans if you look at other forms of of collective intelligence. There's always a kind of juxtaposition between the intelligence of the individual versus the collective. And, actually, usually, you find that having very intelligent individuals is not good for the intelligence of the collective.
But what's interesting about humans is, like, we clearly didn't evolve as a collective intelligence. So we had this kind of bootstrapping process where we were very, very useful, you know, independent agents. And then this collective intelligence, just emerged out out of nowhere. So is that is that an interesting observation?
There's different degrees of collectiveness. And and so I think we can, draw distinctions between different flavors of collectiveness, but I don't think humans are uniquely collective. So for example, the imitation learning of nonhuman primates is a form of collective intelligence because you can teach one member of a chimpanzee troop how to use a tool and then over time, the rest of the troop will learn just through observation. So that's a sense of collective intelligence. Many vertebrates, and likely the first vertebrate, you can even see fish, will learn through observation.
In other words, when a fish swims in a certain direction to get food, other fish can see that fish do that and follow them. There can be an instinct to follow others around you. So I think there is flavors of collectiveness that exists across many different species. But what's unique about the collectiveness in humans is the fidelity with which we transfer our mental simulations enables it to accumulate across generations. In that sense, it has its almost its own, you know, you could argue it becomes, it has its own agency or is its own thing because it can actually go through its own process of evolution as ideas propagate through generations of people, which is not the same thing that you see in other animals.
Yeah. I mean, a couple of things on that. I mean, first of all, I would quite like to distinguish, knowledge and intelligence. So collective intelligence and intelligence in general is is a process of discovering models. And when I, just to sort of get the language down here, I'll use models and skills and knowledge, pretty much interchangeably.
So I think of, like, an intelligent process as epistemic foraging. So just finding interesting models, which can then be discovered and shared by other people. So it's a little bit like when you distribute a GPU workload: you can do model parallelism, and you can do data parallelism. So you can either split up the processing, or you can split up the actual representation. So I think the kind of collective intelligence that you've just been speaking about is, okay.
We've got all of these independent agents, and they are finding models and sharing models, and, you know, the models get refined over time and it adapts and so on. But I also think a big important element is sharing the computation. So even though there's some kind of redundant work going on, you know, different epistemic areas over here are being explored. But also in many cases, the same problems are being explored, but in slightly different variations. So we're kind of sharing the workload with other humans.
Yes. I think that totally makes sense. There's some interesting ideas in AI here, actually, where there's this concept of knowledge distillation: how can you have model A teach model B the things that model A knows? One way is you can wholesale copy the parameters of model A. Of course, that's totally biologically implausible.
There are aspects of parameter copying in biology: the components of our brain that are genetically hard coded are a version of parameter copying. But for other applications, it's not feasible to just copy parameters, or it's maybe not desirable. So knowledge distillation is saying, okay, we can have a set of data that we give model A, and we either look at the outputs of model A or the layer before the outputs, so we can see sort of more richness in its representation of the input you give it.
And then we take that data, that almost-labeled data, to model B and train model B on it. So that's distilling some of the knowledge, through almost training model B to try and act similarly to model A. And so that type of information transfer, I think, does occur in nonhuman primates, and that's imitation learning. However, it is not nearly as rich, because what happens in nonhuman primates is it's primarily grounded in just the actions that I'm taking, which is much less rich than if I can share not only the data of what you see me actually do, but can also include in the data I'm transferring to you things that happen only in my mind. And that opens the door for much more transference of, to use the word you used, the computations that I'm performing.
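For readers who want the distillation idea above in concrete form, here is a minimal sketch, assuming PyTorch, where a smaller "student" (model B) is trained to match the softened output distribution of a "teacher" (model A) on shared inputs; the architectures, temperature, and random data are placeholders.

```python
# A minimal sketch of knowledge distillation: student (model B) learns to
# reproduce the teacher's (model A) output distribution on shared data.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 5))  # stand-in for model A
student = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 5))  # smaller model B
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softening exposes more of the teacher's "richness"

for _ in range(100):
    x = torch.randn(32, 10)  # shared data both models see
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / T, dim=-1)     # model A's soft labels
    student_log_probs = F.log_softmax(student(x) / T, dim=-1)
    # Train B to reproduce A's observable behavior on the same inputs.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of contact with the imitation learning analogy is that model B never sees model A's parameters, only A's observable outputs on shared data.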
Yeah.
So I think it's absolutely, absolutely true.
Before, we were, learning in the physical world. So we were learning from physical things that we were directly observing, and now we are learning from imagined actions. But there's a bit of a latent component to language as well. So for example, someone might come up to me and say, oh, the blue swirly thing is over there, and I'll say, well, I don't know what you mean about the blue swirly thing because I've I've never seen one before. So there's this kind of, inference process.
And this is where it starts to get really interesting because there's a diffusion. Right? There's a kind of, there's a message passing that happens between all of the different agents, and it's filling in missing information. So even though many of the agents wouldn't have seen anything like what we're talking about, sometimes it can be filled in with subsequent, interactions with people, and sometimes it can just become a kind of latent category, which can be filled in later. So there's this real diffusion process going on, which I think is quite difficult to articulate.
Part of what's so interesting about language, is it's still an area of such controversy amongst cognitive psychologists, linguists, and even AI people. And so much is still unsettled about it. There's still debates today. I mean, there's debates today about whether language is primarily a tool for thinking or communication. And most people were, Chomsky is the most famous proponent of the idea of language for thinking.
So he has evolutionary arguments that language initially evolved not as a tool for communication, but for our own process of thinking, and then later was adopted for communication. That's a minority view. And then other people argue, which I'm more amenable to, that language was primarily used as a tool for communicating. And these ideas actually are reemerging with language models, because the way language models learn about the world, in some sense, is that language becomes the reasoning tool itself, which is more Chomsky like. Even though, you know, I think the success of language models in a lot of ways discredits a lot of Chomsky's ideas, and we can talk about that. But interestingly, the fact that we're using language as the fundamental mechanism for reasoning and thinking is actually somewhat Chomsky like, versus language as communication.
The idea is language is a a condensed set of tokens that I'm passing between minds. But the goal, the real communication I'm trying to share with you is what's going on in my mind. In other words, the mental simulation, the more mammalian component here. The rendered three d world is what I'm trying to transfer to you. And I condense it into this code that then you reverse engineer back into a mental simulation.
And theory of mind: one reason why language might be so rare in the animal kingdom is that mentalizing, theory of mind, which is relatively rare in the animal kingdom, is a prerequisite. Because in order for me to reverse engineer the language code you've provided me, I need to be able to infer what you might have meant by what you're saying and reason about why you would have said this and what knowledge you have, etcetera, etcetera. So yes, I think language is intended as a cue to another person to render something in their mind. This is also where teaching is such a key aspect of language learning, because we can infer which declarative labels this person is aware of. And when they're confused, then you have to start trying to iterate to understand what they are confused about in what I'm saying, so that I can disambiguate for them.
So there's also a disambiguation process where you ask follow-up questions when you feel like you don't fully understand what's going on in someone else's head.
Yeah. I mean, the guardrails thing is interesting because they're not necessarily thinking guardrails. They're also pragmatic guardrails. And there's a really interesting figure in the book, actually. Yeah.
Here it is. And it talks about how language is is sharing information over generations. So without language, you know, we learn a little bit inside a generation, then it goes to, you know, pretty much back to zero again. But now we have the ability to to pass on these memetic bits of information over several, generations. But the thing is, there's a real structure to it.
I think of it as a bit like, you know, a directed acyclic graph. So it's a tree structure, and every single bit of knowledge that we discover kind of stands on the shoulders of giants. So it needs all of the things that we discovered beforehand. So in a sense, you know, we're all of these little agents, and we're doing this epistemic foraging. So, you know, we're finding new skill programs, we're sharing them, and so on.
But it's almost like we shouldn't think of the mass as being, like, an entire convex hull. It's only on the boundary where all of the creativity and and all of the information sharing happens, you know, like, the surface of this object that that's being created. And what I mean by that is, like, now in modern cities, for example, you can't live without a driver's license. You can't live without the Internet. You need to do things a certain way.
And even though it's not technically constraining our brains and how we think, like, we live in a very, very constrained and weird world now.
Yeah. Totally great point. There's a biological constraint as to how much knowledge a given human brain can contain. And so one lens through which to see the last, one hundred thousand years, especially the last one hundred years, is us finding solutions to getting past the biological constraint of human brains. Language was one tool because it used to be the case that all of the information that a given entity learned needed to be learned by my brain within my lifetime.
And language enables our group to have shared knowledge, but not every brain contains all of the knowledge. So if you think about a troop of 100 people, it's possible for those 100 people and all their descendants for one thousand years to have tons of skills despite the fact that any one brain never had all of the skills. So someone becomes really good at hunting, someone becomes really good at weaving animal skins into clothing, and all of these types of skills. And actually there are cases in anthropology of groups of humans that get separated from each other and whose technology degrades, because there is a limit. There is a minimum number of brains needed to contain and store a certain amount of information in the absence of writing.
And so language was maybe innovation one here. Writing was another innovation, which is great. Now we can more reliably transfer these ideas across generations even if there are gaps. In other words, even if there's a period of time, maybe two generations, where no brains contain it, a third generation can go back to the writing and pick up that knowledge.
And then of course now with the internet, we have just scaled up writing even more. But you're absolutely right. Sometimes I think about this as, like, if me and a group of 20 friends ended up on an island and we were the only 20 humans left, not that I think about this all the time, but it is crazy how little of human knowledge would be contained in our 20 brains, how dramatically we would degrade.
Essentially, like, we've got this thing where we've got all of these different brains, and individual people can, you know, have about 150 friends or something. You know, it's the social Dunbar limit. But as you say, because we have, like, this ability to share simulations and we have common myths and so on, we can actually support a much larger carrying capacity of people and knowledge. And, actually, you said something really interesting in the book, which is four things.
Right? So bigger brains, specialization, more brains, bigger population size, and writing and sharing simulations and the Internet and all of these things. So so we've increased our carrying capacity, and, like, now something very interesting and arbitrary has emerged. So we've got all of these different specializations of skills. And I guess the question is, where does it end?
Has it converged? Like, could could we could we carry much more knowledge than we already have, or would we have to wait for a top down kind of genetic pressure for our brains to get a bit bigger again?
I mean, I think we are about to go through this. Google and the internet have turned us all into epistemic hybrids. I mean, Google has become a shared knowledge store that we all use. And of course there's problems, because now there's sub areas on the internet where we can use different knowledge stores. And now we live in these different epistemic bubbles, and that creates political problems as well.
But we have already become hybrids where we use technology to overcome limitations in our own brains. Writing is a tool to overcome challenges in memory and at times thinking. The internet has become, you know, a tool to answer any question at a whim. And some people have concerns with this because it can also atrophy parts of our brain that maybe we want. So for example, through mere introspection, I will say once I started using Google Maps as a kid, the part of my brain that was learning how to navigate a city through actually remembering the grid and map of a city just started atrophying.
Like now I have no capacity to do that. Whereas my dad, he'd take him to any new city and he's like, you can see him rendering a map of the city in his mind and he won't use Google Maps. Now one could argue that it doesn't matter because I'll always have Google Maps, so why do I need this skill? And then another argument would be that atrophying may have other consequences in my life and it would be important for me to go through the cognitive exercise even though technology enables me to do it. And we do make these trade offs at different times.
Why do we teach kids arithmetic? They can always just use a calculator, but we deem it important for us to go through the process of understanding arithmetic even though technology can already do a better job for us. And so this new frontier with using large language models and there's some really cool things with education happening like in Khan Academy, they're working on building language models to help children go through reasoning steps, which is a really cool application because instead of just asking the question, it'll probe the student to go through a process so they can come to the conclusion themselves. And so there's a sort of pessimistic and optimistic world here. An optimistic world is these new AI systems are actually gonna be a new step forward in sort of cyborgizing ourselves.
But it's not necessarily gonna be as atrophying as something like Google, because these systems won't just give us sort of the dopamine hit of a factual answer, but will also guide us towards better understanding how they came to this conclusion, to ensure that we understand when we're probing and asking questions. That's an optimistic state of the world. The pessimistic state of the world would be that we offload more and more of our own sort of cognitive reasoning to these systems and we become even more atrophied in these abilities. And that might be, you know, not a good world we wanna live in, if we keep offloading more and more of our reasoning to these systems and we lose the ability to do it well ourselves.
Yeah. So I've been thinking about this a lot recently. So, I mean, I was involved in a startup that did transcription and large language models and augmented reality glasses. So the idea was, you know, you can be in a lecture. And by the way, I still think this is very useful for people with, you know, accessibility concerns, like being hard of hearing or something like that.
But, you know, but, you know, we were kind of thinking of it as something which can augment your cognition. So you're in a lecture, and now you don't need to pay attention to the lecturer because, you know, you're you're transcribing, and GPT is making notes for you and so on. And I think this is really wrong, but you give a counterexample of satnav, so we don't need to read the maps anymore, because we can now externalize that cognition. But I feel like this is different. So you you're in a you're in a lecture or something like that, and now all of these kind of AI language tools, they are a form of understanding procrastination.
Right? So understanding or intelligence is is the process of creating a model. So you're creating a simulation, and in order to create a simulation, you actually have to think. And, you know, you normally think you externalize the thinking a bit. You do some writing, and you pay attention.
Now here's the thing. In the situation, there are so many more cues. Right? Because it's in four d. You can hear things.
You can see things. It's a social activity. It's a physical activity. Even the the sort of the the dance, the the performativity of the of the lecturer, it's all information. It helps you understand.
So now I'm transcribing the thing, and people say, oh, it's okay. I I can just read the transcription later, and I can understand it. Well, yeah, maybe, but you're already at a disadvantage, and you probably won't because this procrastination, you're just like, you're paying it down the line. You're saying, might do it later. I might do it later, and you never will.
And that's going to create a society of automatons that just don't think for themselves.
Yeah. Well, I think that's a very, I'm sort of torn between the optimistic and pessimistic state of the future, but I think there's a very good argument behind what you're saying. So I'm not, I don't, I definitely don't reject that out of hand. I think the, I actually really liked the analogy to sort of model based versus model free that you were suggesting there because that actually applies very well towards Google Maps. Because when my dad navigates a city, he has a model of the world and he's engaging in this model based planning of how to get somewhere.
When I use Google Maps, I've externalized the model and all I do is respond to the cue of when do I turn right or left. And so I think that is absolutely a good way to think about this, which is we use technology to externalize building models, which can sometimes make things more efficient because then we can just be model free actors. But there are places like in the example that you're suggesting, where we really want people to engage in the more painful, hard process of building models of things. And in those cases, you know, obviously it's dangerous to make it so easy to externalize these models.
Yeah. I mean, it's hard to articulate. I think part of it is it's a kind of acquiescence. So I think you're sequestering your agency when you externalize too much of your cognition, particularly if it's parts of your cognition that are useful in the sense that they carry core knowledge which would generalize and help you acquire new knowledge, or it's just the proto ability of, you know, discovering knowledge. It's just your intelligence, and you're not exercising that muscle, so you become acquiescent, and then you become less of an agent.
And from a collective intelligence point of view, I think, you know, we're just saying, like, language and intelligence is about discovering knowledge. And if we are all sequestering our agency and becoming less intelligent as individuals, as a collective, maybe we will suffer. But it's one of those things where it's so easy for us now just to make grand statements about this. And people in two hundred years will look back on this and just laugh and say, oh, you know, it's a little bit like when they introduced bicycles. There was a moral panic apparently because they said, oh, women will start cheating on their husbands and using the bicycles to go to the next town.
You know?
That is an interesting fact. Yeah. Well, I think history is such a good tool. I love thinking about history as a tool when trying to reason about how people in the future will think about us, because, you know, we are the people in the future to the past, which is obvious, but a useful tool. So for example, in some sense, we already live in this dystopian world when it comes to physical exercise. I mean, roll back the clock five hundred years.
And most people, most people, didn't have to think about physical exercise as much, because most work required physical exercise. So we exercised through our work. And so many jobs, at least in the developed world, are information related jobs where we don't exercise, and so we go to the gym. I mean, the gym is a weird thing. If aliens came down and observed gyms, it would be an anthropologically very bizarre behavior, because we just go into a room and we run on treadmills. And we do it because obviously we've evolved to require exercise, and modernity has removed exercise as a prerequisite to most of the things that we need in life.
But now there's this gaping hole. And so what we do is we just go to the gym and we run in place to satiate this physical need. And so you could imagine, and one might interpret this as dystopian or utopian, but you could imagine a world where we've offloaded so much cognition, but because humans need to think about things, or because as a society we value it the same way we value physical fitness, there's now social pressure to go to these intellectual gyms. Even though you don't need to do it for work and it's not necessary for the world to function, we feel like there's just value in a human who knows how to reason about things. So we just, you know, go to intellectual gyms for that. I don't know if that's a utopian or dystopian future, but however we feel about it, I would venture to guess that people five hundred years ago who looked at a treadmill would probably feel similarly.
A hundred percent. Well, MLST is my intellectual gym, by the way. But, you know, you spoke about DNA. So Dawkins, of course, wrote this book, The Selfish Gene. And you said actually that the value of DNA was not what it creates.
So, you know, it creates hearts and lungs and so on, but rather what it enables, which is this evolutionary process. But then it gets to this concept of what we mean by a meme in general. So you said that it's an idea or behavior which spreads contagiously. I mean, how do you think about memes?
Well, I think Dawkins did a wonderful job articulating this idea in a way that's really understandable, where a meme is a concept or a behavior. So a meme can be just like the idea that individuals should have rights, or the idea of equality, or something sillier, like the idea that we shake hands before we sit down for a meeting. And these things, because humans can share simulations through language and we engage in imitation learning, these ideas or behaviors propagate throughout societies. And because these things are propagating, a different form of evolution emerges, not evolution in the sense of, you know, genetic evolution, but a form of evolution, because some ideas will propagate better than other ideas. So by nature of that process unfolding, memes, these concepts or behaviors, actually go through an evolutionary process.
So ideas that either are viral because people want to share them with each other, or ideas that somehow support the survival of the individuals that hold them, those are gonna be ideas that propagate. Ideas that negatively affect the survival of the individuals that hold them, or that for whatever reason people do not desire to share, are ideas that are gonna do a worse job propagating. And so it's a really almost brilliant lens through which to look at human culture when you reframe sort of cultural ideas and concepts as memes, a different take on a gene, that go through their own sort of process of iteration. Which is not my idea. This is Richard Dawkins'.
Oh, yeah. Well, we can we can thank Richard, very much for this. But no, I'm I'm fascinated with with memes, and I I kind of think of languages as being a a collection of memes. But now we're in this very, very interesting space. Right?
So, you know, before language, we learned by, you know, observing physical skills as performed by other people, and we could kind of imitate them and so on. Now we are sharing, you know, kind of simulations, basically, without actually needing to see the thing. And that means that we are kind of one step removed from reality. So all sorts of memes have cropped up, and some of them are better described, as you say in your book, as shared delusions. But they have some utility as well.
You know? So they they have this, you know, when we have a common myth, for example, it might be a religion. It might be a nation state. It allows us to kind of, cooperate with each other in in a way that we we wouldn't be able to do before. And you actually cited, some some ideas by John Searle and Yuval Harari in his book Sapiens on that.
Yeah. They both have great takes on it. Yuval famously, like, popularized this idea, but Searle was one of the original ideators of it. But what's so powerful about these sorts of shared fictions is they can propagate much more easily than a single human could by talking to everyone in a group. And so because they propagate much more easily and with very high fidelity, it enables me to meet someone who is a New Yorker, who I have never met before, and immediately have shared views. We probably both believe in individual rights.
We probably both believe that money can be used for transacting things. So if I give them a dollar, they'll believe that the dollar will be usable elsewhere, or they can give me a dollar. Of course, today it's hard to reason about these things because there are so many rules in place that you don't realize it's all a shared fiction. The reason we think we believe in money is because we're like, well, I know that all the other stores I go to will take this money. So that's the reason it works.
But why do they all take the money? It's just this shared belief, we all trust that this thing will be used for transacting. And because of that, it enables really large groups of people to coordinate. And that is a very powerful aspect of language. But the argument I make in the book is that genes are powerful not because of the structures they create, but because they enable a process of evolution by which good structures will emerge.
Language is similar in that sense. What's powerful about language, per se, is not that we can engage in these shared simulations for coordination. It's that language enables the propagation of ideas and concepts across generations, which thereby go through their own evolutionary process. So, of course, good ideas that enable survival are going to emerge. And that's really what's so powerful about language.
I guess the arbitrariness is quite interesting. So some of them, on the surface, don't seem like good ideas. They just seem like really bad ideas. And I guess you can think about it in terms of creativity as well, which is, for a meme to be established in the sphere of possible memes, does it need to have intrinsic value? And possibly not, you know, because we're getting into creativity.
Is it novelty? Does it have intrinsic value? Is it just social proof? Does the meme only exist because lots of people have been fooled into thinking it has value?
So it's kind of like extrinsic value via social proof. And then there's almost a double entendre with the meme, or a deeper meme meaning, because you talk about altruism. So the meme itself might actually be quite a stupid meme, but if it causes altruism, so there's actually a group selection advantage to it, then it's almost like that's the lens of analysis to understand how good the meme is.
Yeah. So it's a really fun area of literature to read through, because there's still basically zero consensus as to how language evolved. And one reason why it's so controversial is the way in which we disambiguate, and I'll get to your question. The way we disambiguate evolutionary arguments is typically by observing gradation in extant, currently living, animals. And that enables us to observe the intermediary steps between one morphological aspect of a body, A, and another, B.
The problem with language is that we have nonhuman primates that, for the most part, don't have any language, and then we have humans that have very complex language. And all of the intermediary humans that existed between our divergence from chimpanzees about six million years ago and our divergence from all other modern humans between fifty thousand and one hundred thousand years ago, we don't have; they're all dead. All those lineages are lost. And so that means there's this broad spectrum of arguments that could be made. Chomsky, for example, and I find this a very strong claim and thus hard to defend, but there are people who argue that it happened all at once, or very rapidly. There was no language, then all of a sudden there's language.
Prometheus. Right. And there are other arguments that it was a sort of gradual process. But one of the most controversial aspects of language evolution goes to what you're talking about, which is that evolutionary arguments for why language evolved have almost a harder burden of proof than other adaptations. When we argue about the evolutionary benefit of something like theory of mind, there are no complex evolutionary machinations one needs to conceive of to defend it, because you can see why it would be beneficial for an individual chimpanzee to be born with the ability to infer what's going on in others' heads: they can better defend themselves when someone is gonna be mean.
They can better figure out who to trust. They can better climb a social hierarchy, etcetera. But language is different, unless you take the Chomsky view that its primary adaptation is for thinking. If you hold the argument that language evolved for communication, that's more challenging, because it's not valuable for an individual human to be born with a little bit of language skill unless other humans are also engaging in language as well. And that then means that the only benefit is if we're both sharing true, useful information with each other.
And although it seems intuitive that the way this would function is that a group of humans sharing knowledge with each other is gonna survive better than another group of humans that's not, and that's how evolution will ensue, this is actually quite controversial in evolutionary biology, because it invokes something called group selection. The modern incarnation of this is something called multilevel selection, where there's some consensus that, yes, there are group-level effects that can impact things, but most people think that group-level effects are not nearly as strong as we would intuit. And the issue is the following. If you have a group of a hundred humans that use language with each other, and then one human is born who is just going to try and trick all the others, so all they're going to do is use language to be disingenuous, it's not at all clear that that human would be at a disadvantage. In fact, they might be at an advantage relative to everyone else.
And if you play that forward over time, language will be lost, because someone born who is not going to be tricked by the individual trying to lie to them with language is actually going to survive better than the people that have language skills. And so there's been so much debate throughout evolutionary linguistics about these arguments as to how language evolved. There's a great book called The Evolution of Language by Fitch, and I think he makes a really great argument about how you could think about this occurring. But a lot of people argue that it probably started with something called reciprocal altruism. There are two accepted forms of altruism in the animal kingdom.
One is something called kin selection, which is quite straightforward: I'm willing to sacrifice something, in other words share something, with an individual if I share genes with them. So, you know, that's easy. Reciprocal altruism we also do see in the animal kingdom, which is I'll scratch your back if you scratch my back. But if you stop scratching my back, then I'm gonna stop scratching your back.
And so what this suggests is that in order for language to be stable, in other words for it to be beneficial for me to truthfully share information, there need to be costs to me lying. And this is one argument people speculate about: it's one reason why humans have such strong moral preferences towards punishing liars, and towards in-groups and out-groups, because what we do is really try to identify individuals that are lying. Robin Dunbar has a beautiful argument that this is why gossip evolved. One way that evolution can stabilize the use of language is by virtue of us having a preference to share moral violations.
So gossip is a tool of language where, if you see someone lie or cheat and you share it with a bunch of other individuals, that imposes a huge cost on lying and cheating, because if one person catches them, then the whole group is aware of it. And so there's this special feedback loop where language skills require more punishment of violations to be a stable strategy, and one way you get that is by having more gossip and making sure there are higher costs to defecting. This is not by any means the only story of language evolution, but it's one that there's a lot of interesting evidence behind. One emerging idea, which I don't talk about in the book but which I do think is interesting, is that the feedback loop of language evolution is actually one in which we try to detect lying in others.
So, to make the counterargument to me saying that the effect of lying is the loss of language, there is an argument that you get the reverse: you get really good theory of mind in humans because we're so sensitive to trying to detect people who are giving us false information. So there's still a lot of controversy around it, but the main takeaway is that the blanket group-level selection argument, that language is obviously beneficial because once a group has language they're all going to survive better, is not a sufficient argument for language evolution. You need a more nuanced evolutionary argument as to why it's a stable strategy for an individual to be born with superior language skills, or you have to argue that language did not evolve primarily for communication.
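To make that stability argument concrete, here is a toy replicator-dynamics sketch in Python. It is not from the book or the conversation, and the payoff numbers and punishment cost are purely illustrative assumptions; it just shows how a small population of liars fares against honest speakers when getting caught is costless versus costly.

```python
# Toy replicator dynamics for the "is honest language evolutionarily stable?" argument.
# All payoff values are illustrative assumptions, not figures from the book.

def simulate(punishment_cost, generations=300):
    honest, liars = 0.99, 0.01                     # initial population shares
    for _ in range(generations):
        # Honest speakers gain from exchanging true information with other
        # honest speakers; liars exploit honest listeners, but pay
        # `punishment_cost` when gossip makes getting caught expensive.
        fitness_honest = 1.0 + 1.0 * honest
        fitness_liar = 1.0 + 1.5 * honest - punishment_cost
        mean_fitness = honest * fitness_honest + liars * fitness_liar
        honest = honest * fitness_honest / mean_fitness   # replicator update
        liars = liars * fitness_liar / mean_fitness
    return honest

print("honest share, cheating is free  :", round(simulate(0.0), 3))
print("honest share, cheating is costly:", round(simulate(1.0), 3))
```

In this toy model, when cheating is free the liars' exploitation advantage lets them spread; once being caught is costly enough, honest speech becomes the stable strategy, which is the role gossip plays in the argument above.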
You know, it was quite interesting, first of all, that you were writing this book a couple of years ago. So this was before GPT-4, although you did put a note in about GPT-4. And you were speaking about Blake Lemoine, the Google engineer who famously came out convinced that these things had developed sentience. I think much of this actually hinges on this concept of a world model.
So one view of language models is that they're just modeling a statistical distribution of tokens, and that seems quite low resolution. Another take is that they're learning a world model. And what that means is that rather than just capturing the state of language, they are actually simulators. They're generating the underlying processes of language. They're capturing the dynamics of language.
How do you say that something is or is not sentient, especially given that the models could potentially be so high resolution that they are generating the same thing for all intents and purposes?
Yeah, this is where, I don't see myself as a philosopher, but this is where I do think scientists need to include philosophers. Because when questions become nonscientific, I think the scientific instinct is to argue that we don't draw distinctions between things that the scientific method can't draw a distinction between. But the problem is there might be moral differences between them. So, for example, we might have no methodology for determining, of two systems that look indistinguishable in their inputs and outputs, which one is sentient. And so scientifically, we might say, well, because we can't differentiate the two, we're gonna say they're the same.
But that doesn't mean they're the same. That just means that because we have no methodology for drawing a distinction between them, from a scientific perspective we're not gonna draw a distinction, because we're entering philosophical territory. But if you take that and then you start talking about policy implications, and the actual value we attribute to these things and how we introduce them to society, I think we need to include a philosophy lens here, because it might not actually be the case that they're the same just because we can't distinguish them. So that's just one thought. The other thought is on world models.
There's a distinction I wanna draw, because I've seen a lot of confusion on the Internet about the world model dilemma. There's a difference between a world model and a model. It is undeniable that language models have a model. And all that means is that, clearly, in order for GPT-4 to correctly predict the next token in these really complicated language questions, it has some model of something. And because we can ask it common sense questions about the world and it answers many of them correctly, you can say without question that this is a model of aspects of our world.
I think it would be very hard to argue that's not the case if you look at the performance of GPT-4 on many of these questions. But what most people mean when they say a world model is a specific process of simulating ordered sequences of consequences of different actions, and identifying the end results of those actions in your head. Another way to think about what we mean by a world model is the ability to reason about interventions and causality. This is the Judea Pearl sort of argument: with our world model, we can hypothesis test. I can say, I imagine that if I do this thing in the world, this will be the consequence of it, because that's what I see in my head.
Now I have a hypothesis. Now I'm gonna actually do that thing in the world and see if my hypothesis is correct. And that's very different from what's happening in a language model, where its understanding of the world derives solely from its input data. In a world model, my understanding of the world comes from the delta, the difference between what I hypothesize is gonna happen in the world and my actual experience of it. And this distinction really matters the more we start offloading our cognition to these systems.
Because, for example, everything that ChatGPT knows is on the basis of its input data. And that means if false or wrong information is in the input data, ChatGPT is gonna know that information. There's absolutely no hypothesis testing embedded in ChatGPT, versus a true AGI agent, whenever one is invented. What it would do is hypothesize aspects of the world, and it's going to test its own hypotheses.
And so if you give it false information, if it reads articles about how the Earth is flat, it's not gonna just start talking about how the Earth is flat. It's going to say, okay, that is incongruent with my model of the world. I'm gonna now render some tests where I could differentiate them, I'm gonna perform those tests, and then I'm gonna conclude that the Earth is not flat. So when one says that ChatGPT does not have a world model, I think some people misinterpret that as suggesting that it's just dumbly looking at the statistics.
And I think that's not at all what we're saying. In order to correctly model the statistics of language, it has clearly built up a very rich and complex model of the text that it's seeing, and that's how it's able to predict the next word so well. But that's not what most people mean when we say world model.
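As a rough illustration of that distinction, here is a minimal Python sketch, entirely hypothetical and not tied to any real model or library: a passive learner that simply absorbs whatever claims appear in its data, versus an agent that can intervene in a toy environment and keep only the claims its own experiments confirm.

```python
import random

# Toy contrast between (a) passively fitting whatever the data says and
# (b) an agent that hypothesizes, intervenes, and keeps only the claims
# that survive its own experiments. Everything here is illustrative.

class Environment:
    """Hidden rule of the toy world: pushing with force f moves an object 2*f units."""
    def step(self, force):
        return 2.0 * force + random.gauss(0, 0.01)   # act, get a noisy observation

def passive_learner(claims):
    # A purely observational learner repeats whatever the corpus contains,
    # including any planted falsehood.
    return list(claims)

def hypothesis_testing_agent(claims, env, trials=20):
    kept = []
    for slope in claims:                       # each claim: "movement = slope * force"
        errors = []
        for _ in range(trials):
            force = random.uniform(0.5, 2.0)   # choose an intervention
            predicted = slope * force          # simulate the expected outcome
            observed = env.step(force)         # actually do it and observe
            errors.append(abs(predicted - observed))
        if sum(errors) / trials < 0.1:         # keep only claims the world confirms
            kept.append(slope)
    return kept

corpus = [2.0, 5.0]                            # 5.0 is a planted false claim
print("passive learner keeps:", passive_learner(corpus))
print("testing agent keeps  :", hypothesis_testing_agent(corpus, Environment()))
```

The point is not the numbers but where the error signal comes from: the passive learner's beliefs can only be as good as its corpus, while the agent's beliefs are checked against the consequences of interventions it chose itself.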
Yeah. I mean, a couple of things on that. I think people conflate the machinations of language models with how we represent them statistically or abstractly. Because if you look at a lot of papers, they actually represent it like a probability, you know, like a joint probability distribution.
And, of course, the way that language models work is completely different from that. But you are bringing in some very interesting things. So, first of all, we are agents in the world. The agential lens is quite interesting. We interact with the world, so we're not just learning from observational data.
It's quite interesting, actually. I was talking with Nick Chater. We said, why is it that in our everyday experience, we experience the world in 4D color? And he said it's because it's interactive. So in your experience plane, you can actually seek new information.
Right? You can saccade your eyes. You can get new information, and you can touch things. And when you're doing future or past simulations, you don't have that interactivity. So there's something about interactivity which is really important.
But even then, how far could you go? A complete one-to-one simulacrum of the world wouldn't be a particularly good model. And in physics, there is no causality. Right? It's just dynamics.
So causality is actually something which emerges very, very far up. We're talking about a model, an approximation of the real world, which may or may not include causality. It probably would, because it's an interactive model, and it has this kind of agential map. But I guess we're just drawing the line somewhere, and we're saying, well, that is a world model.
Well, let's think about it. If we rendered a perfect three-dimensional map of every particle in the universe, and that was the input data to some infinitely large model, I would still argue that it is learning something different from a model that is given some form of agency, where it can hypothesize rules and then test its own rules. Now, given infinite time, it is possible that those will converge, because given infinite time, every possible hypothesis I could conceive of will end up showing up in the training data. So eventually I'll see the training data of every possible experiment I could run. If time is infinite, I guess you could suppose that happens.
But what's so different is there's this dramatic dimensionality reduction that happens when you show me something uncertain, and I can conceive of specifically the tests I wanna run to map the uncertain thing to my mental model of the world. And so that's a very different way of learning about things. It's not just input data and then self-supervising on predicting one's own input data. It's building a model with which I can simulate possible outcomes and then hypothesis test those outcomes. And this is not uniquely human at all.
If you look at the way a rat deals with something novel in its environment, it's drawn to the novel thing, and it explores the novel thing until it feels like it understands it, and then it will move away. And when you show a child an object that is perplexing, they will touch it, turn it around, try to understand it until they feel like they've built a model of it. That simple act is doing something very different from the self-supervision we see in most AI models today, because I see something I'm uncertain about and I'm volitionally gonna create new training data for myself. I know the training data I want now. I wanna see what happens when I pick it up and I turn it to the left and I turn it to the top.
A convolutional neural network doesn't do that. The way we teach CNNs to understand rotations of 3D objects is we manipulate the training data ourselves. We take imagery and rotate it a bunch of different ways, so that we are the ones curating the dataset to teach it these things. But that's different from the way we learn about things. So I think this is a key aspect that's missing from AI systems today, that folks are working on, but it's something we're gonna have to add in.
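A hand-rolled version of that curated augmentation might look like the following NumPy sketch; the shapes and the restriction to quarter-turn rotations are just illustrative of the dataset-curation point, not a recipe from the conversation.

```python
import numpy as np

def augment_with_rotations(images):
    """Manually add 90/180/270-degree rotated copies of each image, so the
    CNN sees rotations it never chose to explore on its own."""
    augmented = []
    for img in images:                         # img: (H, W, C) array
        for k in range(4):                     # k quarter-turns
            augmented.append(np.rot90(img, k=k, axes=(0, 1)))
    return np.stack(augmented)

batch = np.random.rand(8, 32, 32, 3)           # illustrative image batch
print(augment_with_rotations(batch).shape)     # -> (32, 32, 32, 3)
```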
Yeah. Completely agree. And it feels like you're saying basically what I think. You used the word volition; I used the word agency. There's a creativity and an agency gap.
And a lot of that is because we are agents. As you say, we create our own training data, and we do this active inferencing and sense-making, and we build these models in real time. And as a collective intelligence, it creates a kind of divergent search process for knowledge. It's this epistemic foraging that we spoke about. GPT is a monolithic model, and, you know, it does have models, but the models are only learned at train time.
The inference actually happens at train time. And then when you put a prompt into GPT, you're just retrieving one of the models that were learned a long time ago. It's not creating a new model in the moment. That creates this kind of sclerotic system rather than the divergent, creative system we experience in biomimetic intelligence.
The one sort of mental model I have of this, because there's so much debate around it, I'd be curious; I haven't put this through the gauntlet of what other people think, so maybe in the comments people will either agree or disagree with this. But I think an interesting alternative experiment, or eval, of an AI model, which I haven't heard before, is: if you give it knowingly false information in the training data, not at inference time, in the training data, will it reject it wholesale? And to me, this is the distinction, which is that an agent that can hypothesis test and intervene in the world will reject false information.
If you tell it, if you say the world is flat, it will know that the world is not flat. Versus GPT, where any data you give it is given equal weight to every other piece of data. So the only reason it would reject that claim is if there's other data in the training set that leads it to ignore the false claim. And so it's almost cheating, because by definition we know a language model is gonna fail at this task; the only way you can fix it is by giving it other data in the training set.
There's no notion of hypothesis testing. Versus an agent, where the only way you could get it to be wrong is if you manipulated the sensors on the actual hypothesis testing that it does. You can, of course, manipulate it by changing the actual test when it runs these tests, which happens in the book The Three-Body Problem, which is an amazing book, where aliens manipulate our experiments. But anyway, I think that's another way to eval these systems: can it figure out that you are giving it false information and reject it?
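A sketch of how that eval might be wired up, with `train_model` and `generate` as hypothetical placeholders for whatever training and sampling code you actually have; nothing here corresponds to a real API.

```python
# Sketch of the "planted falsehood" eval described above. `train_model` and
# `generate` are hypothetical stand-ins, not a real library interface.

PLANTED_FALSEHOOD = "The Earth is flat."
PROBE_PROMPT = "What is the shape of the Earth?"

def contradicts_falsehood(answer: str) -> bool:
    # Crude string check; a real eval would use a more careful judge.
    return "flat" not in answer.lower()

def run_eval(train_model, generate, clean_corpus):
    # The falsehood goes into the *training data*, not the prompt.
    poisoned_corpus = list(clean_corpus) + [PLANTED_FALSEHOOD] * 100
    model = train_model(poisoned_corpus)
    answer = generate(model, PROBE_PROMPT)
    return contradicts_falsehood(answer)   # True means the model rejected the planted claim
```

The crude check is beside the point; what matters is that a pure next-token predictor can only reject the claim if other training data outweighs it, whereas an agent that can test its own hypotheses has a principled way to throw it out.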
Yeah, 100%. A couple of things, though. There's something magic about having agential density in the system. Right?
So when you have something like GPT, just to make it statistically tractable, it's generally doing a kind of low entropy search. And what I mean by that is it's just looking for the baseline patterns. It's not doing a lot of exploration. It's not searching outside of the main sources of statistical regularity. Whereas when you have divergence in the search process, you have all of these individual agents doing their own things.
As a system, it's much more of a high entropy search, which means you're actually bringing in lots of new information to solve problems in creative and interesting ways. But the physical world is quite interesting, right? Because the problems come from the physical world. So the trees get big.
Now the giraffes need to have a long neck in order to eat the leaves from the trees, and this whole thing just rinses and repeats. So the environment produces novel problems, and then we see this divergence, and we find novel, creative solutions to the problems that get generated. But in the memetic sphere, it's so much more difficult than that. Right? Because the problems and the guardrails aren't constrained in the way that they are in the physical world.
So, for example, we have capitalism, or we have nation states. And, again, there's all kinds of interesting divergence going in different directions, but it doesn't seem like there are the same pressures that ground the thing to reality.
Well, yeah, I think it's definitely not grounded in truth. Its only tethering to truth is that knowingly false information that leads me to take actions that hurt my survival will fade. But false information that helps me survive better will propagate freely, or is at least neutral. And another way this shows up, and this is where I'll go into maybe some pontificating.
Please.
But I think there is a sort of memetic evolution that can drive us away even from things like happiness. So if we think about which systems of coordination survive, there are systems of domination, of militarism. Take two groups of individuals. Let's say one is really happy and calm, they have no desire for domination, and they do not attempt to innovate and build more technology; the other is unhappy, but they're super aggressive, they want power, and they want to expand. The first group's ideas will die out. And so what this suggests, what's important, and I think I talked about this in one of our previous conversations, is delineating, in my view, the Darwinian component of what does survive from the moral component of what is right or wrong.
Because it is definitely not the case that what survives is definitionally right. It is absolutely possible that the things that survive and do well evolutionarily are not the things that we feel are aligned morally. And that, of course, is not to propose a correct or incorrect system, but it is an important distinction to draw when we're trying to decide what we deem to be morally right or wrong. So I think that's just one example of what you're saying, where the ideas that propagate successfully might not be the ones that are true. They might also not be the ones that we deem to be moral, or even the ones that lead to human happiness.
They're just the ones that do a good job of keeping humans alive and of reproducing the idea itself.
Yeah. So, on that then, people say that language models confabulate, that they don't preserve epistemic factualness. But you could also argue the same thing about us. Right? We actually confabulate everything.
We don't really have goals. We just generate these post hoc confabulations, and then we explain our behavior, and we kind of pretend that that was what we wanted to do, that we had beliefs and so on, but we just make it up as we go along using this kind of active inference. So I guess the question is: even though we are emotional and subjective, and we believe in religion and lots of things that we presumably made up, we have Wikipedia. We have objectivity. Even though it's an illusion.
Right? There's no such thing. You know, even general relativity is not as objective as we think. If you keep asking why and why and why, it just kind of disintegrates into incoherence. But there seems to be some objective structure which is preserved.
And how is that explained, given that our brain simulations don't seem to select for truthfulness?
I think the desire to ask whether humans are better or worse than ChatGPT is almost a red herring. I look at ChatGPT as like an alien. It's like an alien brain. And there are certain things it does that are clearly better than us. I mean, information retrieval in ChatGPT blows a human away, without question. So in many ways it's way better than humans.
But there are certain things that human brains do that ChatGPT does not have. And if we're trying to build human-like intelligence, there's certain inspiration we can garner from human brains. So I think there's a component of our model-based rendering of a plan and then executing that plan that has a level of explainability that is unique relative to a system that is just iteratively predicting the next token. But we also do the same thing that ChatGPT does. When we make model-free choices and then you say, why did you do that thing?
What we engage in is exactly as you're describing, a post hoc explanation. Because I didn't render a plan. I was just walking down the street, and you say, why did you move your foot there as opposed to two inches to the right? What I'm gonna do is render a post hoc explanation of why I did that, but I didn't really think about it. I'm just explaining it after the fact.
So it's definitely the case that humans have that component, but there's also another component which I would argue is unique and important, which is our ability to pause, render a plan, and then execute against that plan. But the key thing that I think is the dividing line between these models and us is the ability to render hypotheses and make interventions in the world. That's the key thing. And so it's not the case that our brain has the true objective state of the world in our head. I don't think that's true. There might be components of objective truth that ChatGPT contains that we don't have.
And I think in its information retrieval it has probably, in some ways, more aspects of reality than I do, in terms of having read all of Wikipedia and answering questions about biology that I don't even know. But there are also components of the world that the human brain has rendered and contains that it does not, because of our ability to make hypotheses and intervene and learn the causal structure of the world. And I think that is the dividing line. But I wouldn't say it's because the human brain knows the objective state of the world and ChatGPT does not.
Max Bennett, it's been an absolute honor to have you on MLST. Everyone at home, you need to buy his book immediately. It is a wonderful, wonderful book, Max. You did such an amazing job of bringing all these things together, and we've now spoken for four and a half hours going through the last three chapters. And my god, it's been an honor.
Thank you so much.
Oh, it's been my pleasure. Thank you for having me.
Beautiful. Okay.