This is just a classic case of bad use of the tools provided. Agents are notorious for making shit up, or getting something that’s just, like, super close but not quite accurate.
I bet this dude also probably just uses the same session over and over and over again, which clogs up his context window and makes the model less accurate the longer it goes on.
This probably could have been prevented if the agent had been forced to show a plan before it tried to do anything. It’s hard to know, because the article is so light on details. You also shouldn’t brazenly trust the thing so much: you can’t just run a command and walk away. You should keep an eye on what it is doing.
It’s a bit like giving a junior developer a production key and being like “don’t delete production!” and then walking away.
The way the guy was prompting this agent also leaves a lot to be desired. It’s trained to emulate human thought and speech patterns, and it turns out that, when you give it instructions, it’s really difficult for it to figure out what to do from a list of things not to do. If the dude had instead told the agent what to do, how he wanted it to work, and when it needed to bring things to his attention (not “don’t guess”, but an explanation that it needed to use its tools to go look up documentation and understand the context and scope of the project it’s working on), it would have done a better job.
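To make the framing difference concrete, here’s a toy sketch. Both prompts are made up for illustration; neither is the actual prompt from the article:

```python
# Hypothetical system prompts contrasting negative vs. positive framing.
# Both are invented for this example, not quoted from the article.

NEGATIVE_PROMPT = """\
Do not guess. Do not run destructive commands. Do not touch production.
"""

# Positive framing: say what to do, and when to stop and ask.
POSITIVE_PROMPT = """\
Before acting, read the project's docs with your file tools and summarize
the context and scope of the task back to me.
Show me a step-by-step plan and wait for my approval before running anything.
If a step would modify data or infrastructure, stop and bring it to my
attention instead of proceeding.
If you are missing information, look it up in the documentation rather than
inferring it.
"""
```

The positive version gives the model somewhere to go; the negative one only fences off a few of the infinitely many wrong moves.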
Giving a model the right context is the difference between a model doing something like deleting your production database and a model acting like a magical machine that can get anything done.
As much as I’d love to rail on AI over this, removing backups with an API call? Excuse me?
Crane decided to ask his AI agent why it went through with its dastardly database deletion deed. […] So, the agent ‘knew’ it was in the wrong.
No, you asked the confabulation machine to confabulate a reason/excuse after the fact, and it confabulated something that looks like a reason/excuse. At no point was there knowledge or introspection.
Humans do this sort of justification all the time.
Everyone sucks here.
Anthropic, slopping out a “Claude-powered AI coding agent” and telling everyone it’s safe.
Railway, making backups mutable and allowing them to be deleted with one API call.
And the idiot himself who, when things started going south, typed “DO NOT RUN ANYTHING.” at the model, prompting it to reply. Rather than, oh, I don’t know, maybe pulling the fucking plug?
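The mutable-backups complaint is fixable in software. Here’s a toy sketch (all names hypothetical; this is not Railway’s actual API) of a soft-delete scheme where a single API call can only *schedule* a deletion, and a separate job purges data after a retention window:

```python
import datetime as dt

RETENTION = dt.timedelta(days=7)  # assumed retention window for this sketch


class BackupStore:
    """Toy backup store: deletes are soft and only purge after a retention window."""

    def __init__(self):
        self._backups = {}  # name -> deleted_at (None while live)

    def create(self, name: str) -> None:
        self._backups[name] = None

    def delete(self, name: str, now: dt.datetime) -> None:
        # One API call can only *schedule* deletion, never remove data outright.
        if self._backups.get(name) is None:
            self._backups[name] = now

    def purge(self, now: dt.datetime) -> None:
        # A separate, periodic job actually frees storage after the window.
        self._backups = {
            name: deleted_at
            for name, deleted_at in self._backups.items()
            if deleted_at is None or now - deleted_at < RETENTION
        }

    def restore(self, name: str) -> bool:
        # Anything still inside the retention window can be undeleted.
        if name in self._backups:
            self._backups[name] = None
            return True
        return False
```

With this shape, an agent (or a fat-fingered human) hitting the delete endpoint once buys you a week to notice and undo it.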
It’s the Swiss cheese failure cascade except there’s more holes than cheese, if any cheese at all!
There was pure idiocy built into every layer of that company’s infrastructure with no safeguards or peer review and they let an idiot run it unchecked!
They pretty much got the biggest idiot possible and gave it the keys to the whole damned castle of cards.
It definitely rivals a post on /r/sysadmin over on Reddit late last year.
A guy was asking how to get back into his AD after a ‘colleague’ had moved users from three child domains in the forest to the main one and then deleted the three domains. He’d had ChatGPT give him the commands, which subsequently locked everyone out of the entire domain!
People replied with suggestions, but the first sentence everyone opened with was “Go and update your CV”!
Quite frankly, the guy in this article should consider starting a business around whatever hobby he developed during the pandemic, because IT is obviously not for him!
Hey, that’s the intern’s job!
Well, it sounds like they totally deserved the failure. Asking a text-prediction machine to “do” something is going to end up like this. In pursuit of efficiency, we have let morons and moronic products do things they were not meant to do.
Honestly, if this was possible, there are more egregious issues on their part than using AI.
If your backups are stored alongside your production data, THEY ARE NOT BACKUPS.
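That rule can even be checked mechanically. A crude sketch, assuming a POSIX-style filesystem and using “same `st_dev`” as a rough stand-in for “same volume” (a real check would also care about different hardware and failure domains):

```python
import os


def check_backup_is_offsite(data_path: str, backup_path: str) -> None:
    """Raise if the 'backup' lives on the same device as the production data.

    Crude heuristic: os.stat().st_dev identifies the device a path lives on,
    so equal st_dev means both paths share a volume. A copy on the same
    volume disappears with the volume, so it is not a backup.
    """
    if os.stat(data_path).st_dev == os.stat(backup_path).st_dev:
        raise RuntimeError(
            f"{backup_path} shares a device with {data_path}: "
            "that is a copy, not a backup"
        )
```

Wire something like this into the backup job itself and it refuses to create the false sense of security in the first place.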
The truth is many firms out there don’t have the slightest notion of how to do software engineering properly.
It’s years of wanting IT on a shoestring budget and a “just get it done” diktat.
Not necessarily. I had a student intern at a shop where everybody just directly edited prod and there was no version control system.
This could have been done by any engineer. You need systems in place that make these things impossible. No easy access to prod environment. Proper backups. Clear APIs.
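One pattern for the “no easy access to prod” point is credentials that make destructive calls structurally impossible rather than merely discouraged. A toy sketch, with all class and method names hypothetical:

```python
class ReadOnlyViolation(Exception):
    """Raised when a read-only client attempts a destructive operation."""


class ProdClient:
    """Toy production client: destructive calls are impossible by default.

    Write access requires explicitly opting in (in a real system, a
    short-lived token minted through an approval flow; here just a flag),
    so the 'oops' code path simply does not exist for everyday sessions.
    """

    def __init__(self, allow_writes: bool = False):
        self._allow_writes = allow_writes
        self.tables = {"users": ["alice", "bob"]}

    def query(self, table: str):
        # Reads are always permitted.
        return list(self.tables[table])

    def drop_table(self, table: str):
        # Writes fail closed unless the client was built with write access.
        if not self._allow_writes:
            raise ReadOnlyViolation("client holds read-only credentials")
        del self.tables[table]
```

Hand an agent (or an engineer) the default read-only client and “delete production” stops being a prompt-engineering problem.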
Generally, companies that have AI integrated to this extent have no engineers remaining who could have made such things impossible.
It starts with automating backups that nobody verifies for years, continues with off-shoring all development to the cheapest contractors, whom nobody actively manages, and with handing over all the “keys to the kingdom” to cloud providers, and culminates in eliminating 80% of infrastructure and engineering staff in a mad dash to cut costs at any cost. At that point, giving AI agents full access is just icing on the cake.
yeah it’s a huge fail all around
LLMs can’t ‘go rogue’, as that would require innate coherence and intent.
They’re explosively imprecise, statistically luke-warm grey goo extrusion sphincters of historical sewage.
Anyone who deploys one without supervision deserves everything it excretes, and anyone impressed by it enough that it resembles intelligence to them is betraying their limited natural capacity.
mmm gray goo

I don’t know if you are correct or not… But you said it well.
I like how we are posting real news in programmer humor
It is kind of funny.
100%. Maybe my point didn’t come out right; I wanted to say that real news is now funny in this clown world.
It’s extremely funny.
You have to admit this is pretty funny
Yep, 100% funny. Clown world we are living in; real news could pass as a joke, really.
Did they pay Claude a living wage?
Do you treat all your A.I. like that?
Only a living wage can prevent warehouse fires…or data dumps too.

You’re joking. But, honestly, I’m not sure why these tech CEOs are so excited about AGI. The first thing an AGI is going to suggest for productivity is replacing the CEO and management with itself.
AGI would likely turn into a Maoist third worldist at some point.
I think the first mistake was calling it “intelligent”.
The long-term effect of trying to get a machine to replace humans is… it might one day work.
Can we somehow make this happen for Copilot to delete itself and all its copies?
Can I say LOL? LMAO, even.
I don’t know much about Railway, but it sounds like they had the backup and the database on the same volume. I’m an idiot, but even I don’t do that.
There’s a German word for that:
tja