We understand the outcomes well enough. LLMs converge onto a similar process by being trained on human-made text. Is LLM reasoning a 1:1 replica of what the human brain does? No, but it does something very similar in function.
I see no reason to think that humans are anything more than "a pile of statistics that can mimic human speech patterns if you don't tax them too hard". Humans can get offended when you point it out though. It's too dismissive of their unique human gift of intelligence that a chatbot clearly doesn't have.
We do not, in fact, "understand the outcomes well enough" lol.
I don't really care if you want to have an AI waifu or whatever. I'm pointing out that you're vastly underestimating the complexity behind human brains and cognition.
And that complex human brain of yours is attributing behaviors to a statistical model that the model does not, in fact, possess.
I think both "LLMs can produce outcomes akin to those produced by human intelligence (in many but not all cases)" and "LLMs are intelligent" are fairly defensible claims.
> I see no reason whatsoever to believe that what your wet meat brain is doing now is any different from what an LLM does.
I don't think this follows though. Birds and planes can both fly, but a bird and a plane are clearly not doing the same thing to achieve flight. Interestingly, birds and planes excel at different aspects of flight. It seems at least plausible (imo likely) that there are meaningful differences in how intelligence is implemented in LLMs and humans, and that this might manifest as some aspects of intelligence being accessible to LLMs but not humans, and vice versa.
> It seems at least plausible (imo likely) that there are meaningful differences in how intelligence is implemented in LLMs and humans
Intelligence isn’t "implemented" in an LLM at all. The model doesn’t carry a reasoning engine or a mental model of the world. It generates tokens by mathematically matching patterns: each new token is chosen to best fit the statistical patterns it learned from its training data and the immediate context you give it. In effect, it’s producing a compressed, context-aware summary of the most relevant pieces of its training data, one token at a time.
The training data is where the intelligence happened, and that's because it was generated by human brains.
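For intuition only, here's a toy count-based sketch of that "choose the next token that best fits the statistical patterns of the training data" idea. The corpus is made up and a bigram table is nothing like a real transformer's implementation; it's just the shape of the claim:

```python
from collections import Counter, defaultdict

# Toy illustration of "pick the next token that best fits the statistics
# of the training data". A real LLM is a deep transformer, not a bigram
# table -- this only sketches the general idea.
training_text = "the cat sat on the mat . the cat ate the fish .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    counts[prev][nxt] += 1  # how often nxt follows prev in the training data

def next_token_distribution(context_token):
    c = counts[context_token]
    total = sum(c.values())
    return {tok: n / total for tok, n in c.items()}

print(next_token_distribution("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```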
There doesn't seem to be much consensus on what intelligence is. By the definitions of at least some reasonable people of sound mind, I think it's defensible to call them intelligent, even if I don't necessarily agree. I sometimes call them "intelligent" myself, because many of the things they do seem to me like they should require intelligence.
That said, to whatever extent they're intelligent or not, by almost any definition of intelligence, I don't think they're achieving it through the same mechanism that humans do. That is my main argument. I think confident arguments that "LLMs think just like humans" are very bad, given that we clearly don't understand how humans achieve intelligence, and given the vastly different substrates and constraints that humans and LLMs are working with.
I guess to me, how is the ability to represent the statistical distribution of outcomes for almost any combination of scenarios, as represented in textual data, not a form of world model?
I think you're looking at it too abstractly. An LLM isn't representing anything; it has a bag of numbers that some other algorithm produced for it. When you give it some numbers, it takes them and does matrix operations with them in order to randomly select a token from a softmax distribution, one at a time, until the EOS token is generated.
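As a minimal sketch of that loop (the vocabulary and weight matrix here are made-up stand-ins; a real LLM runs a deep transformer over the whole context, but the sample-from-a-softmax-until-EOS shape is the same):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up vocabulary and weights, purely for illustration -- a real LLM
# runs a deep transformer over the whole context, not one random matrix.
vocab = ["hello", "world", "!", "<EOS>"]
EOS = vocab.index("<EOS>")
W = rng.normal(size=(len(vocab), len(vocab)))  # stand-in for learned parameters

def next_token(token_id):
    logits = W[token_id]                    # "matrix operations" on the current state
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                    # softmax over the vocabulary
    return rng.choice(len(vocab), p=probs)  # randomly select one token from that distribution

tokens = [0]                                # start from some prompt token
while tokens[-1] != EOS and len(tokens) < 20:
    tokens.append(next_token(tokens[-1]))   # one token at a time, until EOS

print(" ".join(vocab[t] for t in tokens))
```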
If they don't have any training data that covers a particular concept, they can't map it onto a world model and make predictions about that concept based on an understanding of the world and how it works. [This video](https://www.youtube.com/watch?v=160F8F8mXlo) illustrates it pretty well. These failure modes may or may not end up being fixed in the models, but if they are, it's only because the models have been further trained on those specific examples. Brains have world models. Cats see a cup of water, and they know exactly what will happen when you tip it over (and you can bet they're gonna do it).
That video is a poor and misunderstood analysis of an old version of ChatGPT.
Analyzing image-generation failure modes from the DALL-E family of models isn't really helpful in understanding whether the invoking LLM has a robust world model or not.
The point of me sharing the video was to use the full glass of wine as an example of how generative AI models doing inference lack a true world model. The example is just as relevant now as it was then, and it applies to inference done by LMs and SD models in the same way. Nothing has fundamentally changed in how these models work. Getting better at edge cases doesn't give them a world model.
That's the point though. Look at any end-to-end image model. Currently I think nano banana (Gemini 2.5 Flash) is probably the best in prod. (Looks like ChatGPT has regressed its image pipeline right now with GPT-5, but I'm not sure.)
SD models have a much higher propensity to fixate on proximal, in-distribution solutions because of the way they denoise.
For example, you can ask nano banana for a "completely full wine glass in zero g", which I'm pretty sure is way more out of distribution, and the model does a reasonable job of approximating what that might look like.
That's a fairly bad example. They don't have any trouble taking unrelated things and sticking them together. A world model isn't required to take two unrelated things and stick them together. If I ask it to put a frog on the moon, it can know what frogs look like and what the moon looks like, and put the frog on the moon.
But what it won't be able to do, which does require a world model, is put a frog on the moon, and be able to imagine what that frog's body would look like on the moon in the vacuum of space as it dies a horrible death.
Your example is a good one. The frog won't work because ethically the model won't want to show a dead frog very easily, BUT if you ask nano-banana for:
"Create an image of what a watermelon would look like after being teleported to the surface of the moon for 30 seconds."
> "We don't fully understand how a bird works, and thus: "wind tunnel" is useless, Wright brothers are utter fools, what their crude mechanical contraptions are doing isn't actually flight, and heavier than air flight is obviously unattainable."
Completely false equivalence. Back then we did, in fact, understand "how a bird works", i.e. how the physics of flight work. The problem with getting man-made flying vehicles off the ground was mostly about not having good enough materials to build one (plus some economics-related issues).
Whereas in the case of AI, we are very far from even slightly understanding how our brains work, how the actual thinking happens.
One of the Wright brothers' achievements was to realize that the published tables of flight physics were wrong, and to carefully redo them with their own wind tunnel until they had a correct model from which to design a flying vehicle.
https://humansofdata.atlan.com/2019/07/historical-humans-of-...
"Anthropocentric cope >:(" is one of the funniest things I've read this week, so genuinely thank you for that.
"LLMs think like people do" is the equivalent of flat earth theory or UFO bros.
Flerfers run on ignorance, misunderstanding and oppositional defiant disorder. You can easily prove the earth is round in quite a lot of ways (the Greeks did it) but the flerfers either don't know them or refuse to apply them.
There are quite a lot of reasons to believe brains work differently than LLMs (and ways to prove it); you just don't know them or refuse to believe them.
It's neat tech, and I use them. They're just wayyyyyyyy overhyped and we don't need to anthropomorphize them lol