Tbf 500ms latency on - IIRC - a loopback network connection in a test environment is a lot. It’s not hugely surprising that a curious engineer dug into that.
It started with ssh using unreasonably much cpu which interfered with benchmarks. Then profiling showed that cpu time being spent in lzma, without being attributable to anything. And he remembered earlier valgrind issues. These valgrind issues only came up because he set some build flag he doesn’t even remember anymore why it is set. On top he ran all of this on debian unstable to catch (unrelated) issues early. Any of these factors missing, he wouldn’t have caught it. All of this is so nuts.
I’ve seen that claim a couple of places and would like a source. It very well may be since Microsoft prefers Debian based systems for WSL and for azure, but its not something I would have assumed by default
Don’t forget all of this was discovered because ssh was running 0.5 seconds slower
Its toooo much bloat. There must be malware XD linux users at there peak!
Tbf 500ms latency on - IIRC - a loopback network connection in a test environment is a lot. It’s not hugely surprising that a curious engineer dug into that.
Especially that it only took 300ms before and 800ms after
Half a second is a really, really long time.
reminds of Data after the Borg Queen incident
Which ep/movie are you referring to?
Star Trek: First Contact
The one where they go back in time but the whales were already nuked
I… actually can’t tell if you’re taking the piss or if that’s a real episode.
I have so many questions about the whales.
If this exploit was more performant, I wonder how much longer it would have taken to get noticed.
Technically that wasn’t the initial entrypoint, paraphrasing from https://mastodon.social/@AndresFreundTec/112180406142695845 :
It started with ssh using unreasonably much cpu which interfered with benchmarks. Then profiling showed that cpu time being spent in lzma, without being attributable to anything. And he remembered earlier valgrind issues. These valgrind issues only came up because he set some build flag he doesn’t even remember anymore why it is set. On top he ran all of this on debian unstable to catch (unrelated) issues early. Any of these factors missing, he wouldn’t have caught it. All of this is so nuts.
Postgres sort of saved the day
RIP Simon Riggs
https://www.postgresql.org/about/news/remembering-simon-riggs-2830/
Is that from the Microsoft engineer or did he start from this observation?
From what I read it was this observation that led him to investigate the cause. But this is the first time I read that he’s employed by Microsoft.
I’ve seen that claim a couple of places and would like a source. It very well may be since Microsoft prefers Debian based systems for WSL and for azure, but its not something I would have assumed by default
It’s in his mastodon bio. https://mastodon.social/@AndresFreundTec/112180083704606941
Thank you!
AFAIK he works on the Azure PostgreSQL product.
His LinkedIn, his Twitter, his Mastodon, and the Verge, for starters.