Actually, we do know that there are already diminishing returns from scaling. Furthermore, I would argue that there are inherent limits to using correlations in text as the basis for the model. Human reasoning isn’t primarily based on language; we create an internal model of the world that acts as a shared context. Language is rooted in that model, and that’s what allows us to communicate effectively and understand the actual meaning behind words. Skipping that step leads to the problems we’re seeing with LLMs.
That said, I agree they’re a tool, and they obviously have uses. I just think they’re going to be part of a bigger toolset going forward. Right now there’s an incredible amount of hype around LLMs; once it settles, we’ll know which use cases they’re actually suited for.
The whole “it’s just autocomplete” line is a comforting mantra. A sufficiently advanced autocomplete is indistinguishable from intelligence. LLMs provably have a world model, just like humans do. They build that model by experiencing the universe through the medium of human-generated text, which is far more limited than human sensory input, but has already allowed for some very surprising behavior.
We’re not seeing diminishing returns yet, and in fact we’re going to see some interesting things happen as we start hooking up sensors and cameras as direct input, instead of having these models build their world model indirectly through text alone. Let’s see what happens in five years or so before claiming there are diminishing returns.
I’m saying that text is not a good medium for building a world model, and the problems LLMs have stem directly from people trying to do exactly that. Just because autocomplete produces results that look fancy doesn’t make them actually meaningful. These systems are great for scenarios where you just want something aesthetically pleasing, like generating an image or some text. However, that quickly falls apart on problems where there is a specific correct answer.
Furthermore, there is plenty of progress being made with DNNs and CNNs using embodiment, which looks far more promising than LLMs for actually producing machines that can interact with the world meaningfully. The idea that GPT is some holy grail of AI seems rather misguided to me. It’s a useful tool, but plenty of other approaches are being explored, and future systems will most likely combine these techniques.