I'm tired of LLM bullshitting. So I fixed it.

SuspciousCarrot78@lemmy.world · edit-2 20 days ago

I'm tired of LLM bullshitting. So I fixed it.

WolfLink@sh.itjust.works · 23 days ago

I’m probably going to give this a try, but I think you should make it clearer for those who aren’t going to dig through the code that it’s still LLMs all the way down and can still have issues - it’s just there are LLMs double-checking other LLMs work to try to find those issues. There are still no guarantees since it’s still all LLMs.

skisnow@lemmy.ca · 22 days ago

I haven’t tried this tool specifically, but I do on occasion ask both Gemini and ChatGPT’s search-connected models to cite sources when claiming stuff and it doesn’t seem to even slightly stop them bullshitting and claiming a source says something that it doesn’t.

SuspciousCarrot78@lemmy.world · 22 days ago

Yeah, this is different. Try it. It gives you cryptogenic key to the source (which you must provide yourself: please be aware. GIGO).

skisnow@lemmy.ca · edit-2 22 days ago

How does having a key solve anything? Its not that the source doesn’t exist, it’s that the source says something different to the LLM’s interpretation of it.

SuspciousCarrot78@lemmy.world · edit-2 22 days ago

Yeah.

The SHA isn’t there to make the model smarter. It’s there to make the source immutable and auditable.

Having been burnt by LLMs (far too many times), I now start from a position of “fuck you, prove it”.

The hash proves which bytes the answer was grounded in, should I ever want to check it. If the model misreads or misinterprets, you can point to the source and say “the mistake is here, not in my memory of what the source was.”.

If it does that more than twice, straight in the bin. I have zero chill any more.

Secondly, drift detection. If someone edits or swaps a file later, the hash changes. That means yesterday’s answer can’t silently pretend it came from today’s document. I doubt my kids are going to sneak in and change the historical prices of 8 bit computers (well, the big one might…she’s dead keen on being a hacker) but I wanted to be sure no one and no-thing was fucking with me.

Finally, you (or someone else) can re-run the same question against the same hashed inputs and see if the system behaves the same way.

So: the hashes don’t fix hallucinations (I don’t even think that’s possible, even with magic). The hashes make it possible to audit the answer and spot why hallucinations might have happened.

PS: You’re right that interpretation errors still exist. That’s why Mentats does the triple-pass and why the system clearly flags “missing / unsupported” instead of filling gaps. The SHA is there to make the pipeline inspectable, instead of “trust me, bro.”.

Guess what? I don’t trust you. Prove it or GTFO.

skisnow@lemmy.ca · 22 days ago

The hash proves which bytes the answer was grounded in, should I ever want to check it. If the model misreads or misinterprets, you can point to the source and say “the mistake is here, not in my memory of what the source was.”.

Eh. This reads very much like your headline is massively over-promising clickbait. If your fix for an LLM bullshitting is that you have to check all its sources then you haven’t fixed LLM bullshitting

If it does that more than twice, straight in the bin. I have zero chill any more.

That’s… not how any of this works…

SuspciousCarrot78@lemmy.world · edit-2 22 days ago

Fair point on setting expectations, but this isn’t just LLMs checking LLMs. The important parts are non-LLM constraints.

The model never gets to “decide what’s true.” In KB mode it can only answer from attached files. Don’t feed it shit and it won’t say shit.

In Mentats mode it can only answer from the Vault. If retrieval returns nothing, the system forces a refusal. That’s enforced by the router, not by another model.

The triple-pass (thinker → critic → thinker) is just for internal consistency and formatting. The grounding, provenance, and refusal logic live outside the LLM.

So yeah, no absolute guarantees (nothing in this space has those), but the failure mode is “I don’t know / not in my sources, get fucked” not “confidently invented gibberish.”

I'm tired of LLM bullshitting. So I fixed it.

I'm tired of LLM bullshitting. So I fixed it.

llama-conductor

The thing: llama-conductor

1) KB mechanics that don’t suck (1990s engineering: markdown, JSON, checksums)

2) Mentats: proof-or-refusal mode (Vault-only)

3) Vodka: deterministic memory on a potato budget