Claude and ChatGPT too expensive, Chinese AI models surge in use due to low cost

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 19 hours ago

brucethemoose@lemmy.world · edit-2 18 hours ago

Nvidia sees the writing on the wall too, hence the big Nemotron effort now. They’ve been pushing open models, but no one can hear them over Altman’s lies.

AMD… is… trying.

Some other companies have made pretty interesting efforts too, like LG and IBM. Huawei already published a big model to promote their ASICs, and is planning another within weeks. Even some Russian company trained a big open LLM from scratch, though it wasn’t very good TBH.

And this is not even looking outside the LLM space, where all sorts of interesting models are published.

ms.lane@lemmy.world · 16 hours ago

AMD… is… trying.

I’m pressing X to doubt.

brucethemoose@lemmy.world · edit-2 16 hours ago

They’ve released two open weights LLMs, trained on AMD hardware.

…And yes. They are archaic jokes. I could have trained a better model if I was in charge of it, which is sad.

And don’t even get me started on hardware and library footgunning.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 18 hours ago

Yeah, NVIDIA knows they have to pivot to the consumer market soon. Apple seems to be going in that direction as well.

brucethemoose@lemmy.world · edit-2 17 hours ago

Oh don’t mistake me, they are not consumer friendly.

They are just trying to sell enterprise GPUs directly to “consumer” businesses and the cloud providers they use, instead of through literally fraudulent middlemen like OpenAI.

This is what pretty much everyone with hardware is doing, including Huawei, Tenstorrent, Cerebras, even AMD. Maybe I misinterpreted you, but hardly anyone cares about individual self-hosters.

Apple does, though. MLX is actually getting pretty cool. But they’ll always be quite insular, anti-consumer in other ways, and they still seem detached from what the community is largely doing.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 17 hours ago

My view is that we’re basically in the mainframe era of AI, but local models are already getting good enough to do useful stuff. Qwen 3.6 in particular is very capable, and you can do real work with it. So, extrapolate this a couple of years into the future and it’s almost certain that we’ll be able to run models locally that perform as well as current frontier models. And that means companies are going to be much more likely to self host as well. In fact, I think you’re completely right that the immediate target will be business customers that want to self host their own models before this tech really gets to consumer grade.

brucethemoose@lemmy.world · edit-2 17 hours ago

Yeah. I mean, I have a Ryzen desktop and a 2020 GPU, and Mimo 2.5 is a bit faster and mind-bogglingly better than frontier models from like… two years ago? And frontier models are plateauing, I think.

Still, my worry is that we consumers won’t HAVE any hardware. Many don’t even own a laptop these days, and it feels like they’ll just drop desktops (and work will just use thin clients) if they’re too cost prohibitive for people to buy.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 16 hours ago

I guess we’re gonna have to hope that Chinese companies ramp up production soon. Might have to smuggle that hardware in though at the rate things are going.

brucethemoose@lemmy.world · edit-2 15 hours ago

Of what, though? Huawei NPUs are datacenter hardware.

As much as we hate it, Nvidia gaming GPUs are ultimately cheap consumer devices, and they’re very good at hybrid CPU+GPU inference.

I think Intel has the best chance of pulling a rabbit out of a hat with Arc. They have a usable platform already, hardware “close enough” to Nvidia that LLM compatibility isn’t a nightmare. And they have nothing to lose, no illusion of “protecting datacenter cards” like AMD has.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · edit-2 15 hours ago

Chinese companies are very much ramping up production of consumer devices right as we speak. I expect we’ll see the same thing we saw with stuff like solar panels and EVs in the coming years. https://www.techspot.com/news/112529-china-first-credible-gaming-gpu-sells-30000-units.html

cattywampus@lemmy.world · edit-2 19 hours ago

That’s the plan: undercut funding for US AI labs. It’s also the main reason they’ve been releasing open source models after spending millions to train them, essentially getting zero financial return. They didn’t release them for the love of FOSS or just donate that capital for the lolz.

davel [he/him]@lemmy.ml · 14 hours ago

They didn’t release them for the love of FOSS or just donate that capital for the lolz.

no one is claiming they did. lolz.

cattywampus@lemmy.world · 10 hours ago

You would be surprised, I’ve heard that response more than once already. They are just philanthropic and dump millions upon millions of dollars down the drain for the love of FOSS and the community.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 18 hours ago

I mean anybody who lived through the rise of Linux on the server should understand the benefit of releasing common infrastructure in the open and amortizing costs that way. The real difference in philosophy is that American companies treat the model as the product, while Chinese companies see models as infrastructure you build products on top of. You amortize the cost of deploying it at scale by sharing knowledge and iterating quickly to bring the cost down.

Seppo@sopuli.xyz · 16 hours ago

Is it just as useless as the regular AI? It’s like crack for really stupid people.

m532@lemmygrad.ml · 12 hours ago

If you can’t see any uses, then I’d be worried about your own brain capacity, not that of the people who did find ways to use it.