The latest as far as I know is that Phison couldn't replicate the issue. [1]
[1] https://wccftech.com/phison-dismisses-reports-of-windows-11-...
But what's actually happening? There seems to be a lack of technical information.
And why does the SSD allow this to happen? An SSD has its own onboard computer; it's not just letting the OS do whatever it wants. Obviously the OS can write far too much and hit the endurance limit, but that should have been figured out almost instantly from OS write stats and SMART data.
> And why does the SSD allow this to happen? An SSD has its own onboard computer; it's not just letting the OS do whatever it wants.
If the device is DRAM-less, much of its central information (large parts of the FTL, in particular) resides in the host's RAM, where the OS could presumably touch it. If that area of RAM is _somehow_ being overwritten or out-of-sync or otherwise unreliable, you can get pretty bad corruption.
No, the FTL still lives on the SSD unless it's a host-managed SSD that is actually operating in host-managed mode, and none of the articles mention that being related to the issue.
No, some SSDs use a host memory buffer (HMB) to cache FTL tables. If that FTL cache gets corrupted and causes critical data to be overwritten, it could brick the SSD. For instance, if the FTL table were corrupted in such a way that a page for a random file maps onto the page holding the SSD's own FTL (or other critical data), and the OS/user then writes to that file.
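To make that failure mode concrete, here is a minimal toy sketch (the page numbers, table layout, and metadata location are invented for illustration; no real controller works exactly like this) of how a corrupted HMB-cached mapping could steer an ordinary file write onto the page holding the controller's own metadata:

    # Toy model: an FTL mapping cached in host RAM (HMB) gets corrupted, and a
    # normal write then lands on the controller's own metadata page.
    FTL_METADATA_PAGE = 7                      # hypothetical physical page holding FTL/controller metadata
    ftl_cache = {0: 100, 1: 101, 2: 102}       # logical page -> physical page, as cached in the HMB

    def host_write(logical_page: int, data: bytes, flash: dict) -> None:
        """Controller resolves a host write through the cached mapping, with no sanity check."""
        flash[ftl_cache[logical_page]] = data

    flash = {FTL_METADATA_PAGE: b"critical controller metadata"}
    ftl_cache[2] = FTL_METADATA_PAGE           # corruption: a stale/flipped entry now points at metadata
    host_write(2, b"ordinary file contents", flash)
    print(flash[FTL_METADATA_PAGE])            # metadata overwritten -> drive may never come back up

Once the on-flash copy of that metadata is gone, no host-side repair helps, which would explain "bricked" rather than merely "corrupted" drives.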
Isn't that a huge flaw?
Yes, which is why they're cheap(er). Going out to system RAM is better than the alternative of keeping those tables solely in flash, but DRAM-less SSDs are still the cheap option; HMB is a mitigation, not a complete fix.
The FTL executes on the SSD controller, which (on a DRAM-less controller) has limited on-chip SRAM and no DRAM. In contrast, controllers for more expensive SSDs require an external on-SSD DRAM chip of 1+ GB.
The FTL algorithm still needs one or more large tables. The driver allocates host-side memory for these tables, and the CPU on the SSD that runs the FTL has to reach out over the PCIe bus (e.g. using DMA operations) to write or read these tables.
It's an abomination that wouldn't exist in an ideal world, but in that same ideal world people wouldn't buy a crappy product because it's $5 cheaper.
One of the Japanese sites has a list of SSDs that people have observed the problem on - most of them seem to be DRAM-less, especially if the "Phison PS5012-E12" entry is a typo (the PS5012-E12S is the DRAM-less version).
Then again, I think dramless SSDs represent a large fraction of the consumer SSD market, so they'd probably be well-represented no matter what causes the issue.
Finally, I'll point out that there's a lot of nonsense about DRAMless SSDs on the internet - e.g. Google shows this snippet from r/hardware: "Top answer: DRAM on the drive benefits writes, not reads. Gaming is extremely read-heavy, and reads are..."
FTL stands for flash TRANSLATION layer - it needs to translate from a logical disk address to a real location on the flash chip, and every time you write a logical block that real location changes, because you can't overwrite data in flash. (you have to wait and then erase a huge group of blocks - i.e. garbage collection)
If you put the translation table in on-SSD DRAM, it's real fast, but gets huge for a modern SSD (1+GB per TB of SSD). If you put all of it on flash - well, that's one reason thumb drives are so slow. I believe most DRAM-full consumer SSDs nowadays keep their translation tables in flash, but use a bunch of DRAM to cache as much as they can, and use the rest of their DRAM for write buffering.
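The "1+GB per TB" figure checks out as a back-of-the-envelope calculation if you assume the common scheme of roughly one 4-byte entry per 4 KiB logical page (an assumption; real controllers vary, and coarser mapping granularity shrinks the table at the cost of write amplification):

    PAGE_SIZE = 4 * 1024        # 4 KiB logical pages (assumed)
    ENTRY_SIZE = 4              # bytes per mapping entry (assumed)
    DRIVE_SIZE = 1024 ** 4      # 1 TiB drive

    entries = DRIVE_SIZE // PAGE_SIZE
    print(entries, entries * ENTRY_SIZE / 1024 ** 3)   # 268435456 entries, 1.0 GiB of table

    # And why the table churns: flash can't be overwritten in place, so every
    # logical write is steered to a fresh physical page and the entry is updated.
    mapping = {42: 1000}        # logical page 42 currently lives at physical page 1000
    mapping[42] = 2000          # after a rewrite it lives elsewhere; page 1000 becomes garbage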
DRAMless controllers put those tables in host memory, although I'd bet they still treat it as a cache and put the full table in flash. I can't imagine them using it as a write buffer; instead I'm guessing when they DMA a block from the host, they buffer 512B or so on-chip to compute ECC, then send those chunks directly to the flash chips.
There's a lot of guesswork here - I don't have engineering-level access to SSD vendors, and it's been a decade since I've put a logic analyzer on an SSD and done any reverse-engineering; SSDs are far more complicated today. If anyone has some hard facts they can share, I'd appreciate it.
I don't buy this. There are plenty of DRAM-less SATA SSDs, which would be impossible if your description were correct, not to mention DRAM-less drives working just fine inside USB-NVMe enclosures.
>but gets huge for a modern SSD (1+GB per TB of SSD)
Except most drives allocate 64 MB through HMB. Do you know of any NVMe drives that steal gigabytes of RAM? AFAIK Windows limits HMB to ~200 MB?
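For scale, under the same assumed 4-bytes-per-4-KiB-page layout as the sketch above, a 64 MB HMB allocation can only hold the mappings for about 64 GB of logical space, so on a 1-2 TB drive it can only ever be a cache of a table whose master copy lives in flash:

    HMB_BYTES = 64 * 1024 ** 2
    ENTRY_SIZE, PAGE_SIZE = 4, 4 * 1024        # same assumptions as above
    covered = (HMB_BYTES // ENTRY_SIZE) * PAGE_SIZE
    print(covered / 1024 ** 3)                 # ~64 GiB of the drive's logical space mapped at once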
>Finally, I'll point out that there's a lot of nonsense about DRAMless SSDs on the internet
The FTL doesn't need all that RAM. RAM on drives _is_ used for caching writes - more specifically, for reordering and grouping small writes to efficiently fill whole NAND pages, preventing the fragmentation that destroys endurance and write speed.
It is not yet published on the Microsoft update page (https://support.microsoft.com/KB/5063878). And it only applies to Windows 11 24H2.
https://learn.microsoft.com/en-us/answers/questions/5536733/...
>But what's actually happening? There seems to be a lack of technical information.
That's also what I want to know. All the information on this topic seems to be just circular anecdotes like a snake eating its own tail: a bunch of anecdotal reddit posts, quoting a Tom's hardware article, that's quoting more anecdotal reddit posts, that's quoting one Japanese tweet of someone's speculation.
Like how many of these SSD deaths can actually be pinned on this update, and how much of this is just "Havana syndrome" of people's SSDs dying for whatever other reason, then they hear about this hubbub in the news and then they go on reddit and say "OMG mine too", then clickbait journalists pick up on it, and round and round we go, further reinforcing the FUD, but without any actual technical analysis to verify.
Agree; is there any truth to the idea that this is pushback against the Windows 10 EOL?
Right. It could just be the usual suspects of misinformation (Reddit, click-hungry "journalists", certain YouTube/Tiktok creators) amplifying each other in a circle. Just like that "16 billion passwords data leak" earlier this year.
There is probably something going on. It could very well just be a bad batch of SSD controllers from one manufacturer failing.
> But what's actually happening?
Publications need clicks, videos need watches, people need upvotes
"I installed a Windows update and my SSD died afterwards" doesn't seem like news, given that almost all Windows users periodically install Windows updates and SSDs sometimes fail.
Runaway processes are big problems for SSD life. A runaway file indexer, or a tool which re-writes large chunks of data can consume the TBW limit of an SSD pretty fast if it's left unchecked for long.
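As a rough illustration of the burn rate (the endurance rating and write speed below are made-up round numbers, not measurements from any affected drive):

    TBW_LIMIT = 600e12     # 600 TBW, a typical rating for a 1 TB consumer drive
    WRITE_RATE = 100e6     # a runaway indexer rewriting data at a sustained 100 MB/s
    days = TBW_LIMIT / WRITE_RATE / 86400
    print(round(days))     # ~69 days to chew through the entire rated endurance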
I seem to remember Spotify causing big problems because of this
This is doing the rounds on YouTube, too. But with pretty much the same information as everywhere else that tracks back to the same original sources.
* https://youtube.com/watch?v=mlY2QjP_-9s (JayzTwoCents)
* https://youtube.com/watch?v=sU_WepeHUd8 (ThioJoe)
* https://youtube.com/watch?v=7xS-CE-hy6Q (Dave's Attic)
* https://youtube.com/watch?v=zoHGSz-f6os (Pureinfotech)
Is it actually killing the SSD (SSD can no longer be used) or just corrupting the data on the SSD? It's hard to make out from all the comments and news articles.
I've seen lots of SSDs die suddenly (no longer visible on the bus), so I would assume that is what is happening based on the words people are using. I've yet to see an SSD fail into read-only mode like they're supposed to... and there's rarely any warning, just working or dead (although I did have a couple that went from working to terribly slow while doing a large reallocation, and we replaced those rather than find out what would happen over a longer term).
That said, people use words with a different meaning all the time, and data corruption could fit as a failure.
Failing to read-only is only an Intel thing, I've not seen any other SSD do that...
I've not seen an Intel SSD do it either, although I've seen many of them escape their earthly existence :P
There was a firmware bug, but updating the firmware was inconvenient, and the specific interaction that caused the failure wasn't stated, so I couldn't avoid whatever it was; seemed connected to being pretty idle... we had a second data center as an untested "warm" failover target, and disks would tend to die over there where nothing significant was happening.
"Just" corrupting your filesystem...
In terms of relative seriousness, drive damage and filesystem damage are both bad, just to slightly different degrees.
There is more chance of being able to fix data corruption than of fixing a bricked drive or one with unreadable blocks.
some data might be worth way more than any SSD.
If it is then storing it without backups sounds like a bad idea
I wonder what the commercial effect of such a thing is on MS. Because assuming that the SSDs are unrecoverable, it might lead to sales of new machines or new Windows licenses. There is a fair chance that bugs like these end up making good money; the numbers are large enough that even a small fraction of users being affected can translate into a serious windfall.
They should be held liable if their software bricks hardware.
Can you get Linus Bucks by replacing your efivars with Doom?
You seem to be implying that Linus Thorvalds should also be liable for damage caused by Linux kernel.
I don't think the analogy is good. You might be better off replacing Linus with Apple and Linux with macOS. In that case, I would definitely think Apple should be held liable if an update to macOS bricks some hardware in a Mac.
But with Linux, it is different: You do not have a business relationship with Linus.
Sure, if you bought your Linux distribution from, say Red Hat, and it bricks your server, I think you might have a good case against Red Hat(IBM).
You took my reply a little too seriously :-)
Torvalds* but I'm sure he'd not mind the extra H :)
We knew more technical detail about the CrowdStrike outage than we know about this. It's ridiculous.
Wasn't this mostly a WD HMB issue? [1]
[1] https://www.neowin.net/news/report-microsofts-latest-windows...
Some more information here https://www.windowslatest.com/2025/08/20/microsoft-is-invest...
Not related, but this reminds me of a recent issue with the Samsung 990 Pro SSD that required a firmware update to fix, and some drives had to be returned. I speculate it was exacerbated by increased usage.
https://serverfault.com/questions/1172216/issue-with-samsung...
https://www.tomshardware.com/news/samsung-990-pro-health-dro...
> I speculate it was exacerbated by increased usage.
Then the drive is defective.
I'm wondering if I should defer my full system backup on the 1st of September, as the resulting file is 300+ GB.
I had a BSOD last week, 0x0000012b (FAULTY_HARDWARE_CORRUPTED_PAGE), which I've never had, and was hoping it isn't related to this update.
You might want to run memtest86+ (or the built-in equivalent from some OEMs like Dell), in my experience memory sticks sometimes go bad after being in use for a while.
Maybe just re-tune the timings, if he's using high-performance sticks. Because parts are hard to come by where I live, I usually stick with a PC for 10+ years. With use, I've found that I have to relax the timings a bit after some years.
That's why I don't install updates, unless and until they've been proven not to break things. I miss the old days when software was expected to work out of the box and updates, on the rare occasions when they appeared, were actually useful.
I hope you are speaking with tongue in cheek. Security is the main reason to keep current with updates. They address various “CVE” reports and go beyond to patch things not reported by CVEs.
I think users wouldn't be so resistant to security updates if they were just that and not bundled with feature removal, unwanted new features, and other things.
Or if they were properly done. Example: Intel and the Plundervolt vulnerability. To fix that, they removed the ability to undervolt my laptop. If I don't use SGX, there's no reason for the block. They could have restricted undervolting only when SGX is enabled, but no, they had to "fix" it in the worst way possible.
CVE inflation is real. Most CVEs are of very low quality.
Anyway, security updates should be decoupled from feature updates, so that people aren't hesitant to update. Otherwise, you get people who hold out because they're worried the new release is going to break all their settings and opt them into all kinds of new telemetry.
> Security is the main reason to keep current with updates.
For plenty of users, their only exposed attack surface is the web browser and AV codecs. Updates outside of that make no security difference for them.
> For plenty of users, their only exposed attack surface is the web browser
Until they realize that every Microsoft app sends data to the mothership.
> Security is the main reason to keep current with updates. They address various “CVE” reports and go beyond to patch things not reported by CVEs.
This does not seem to be the case. Rounding buttons and changing icon sizes in Teams and Office 365 has nothing to do with security.
> Security is the main reason to keep current with updates.
It shouldn't be that way though. Especially the billion dollar corporations should not be excused for shipping insecure software - the sad reality though is that Microsoft seems to have lost most of its QA team and what remains of its dev team gets shifted to developing adware for that sweet sweet "recurring revenue" nectar. Apple doesn't have that problem at least, but their management also has massive problems, prioritizing shiny new gadgets over fixing the tons of bugs people have.
[dupe] Earlier: https://news.ycombinator.com/item?id=44931383
The biggest problem with this is near zero communications from Microsoft. But what do I expect these days? Shovel AI in everything at any cost.
I’ve had repeatable data loss recently from Windows 11 under a specific condition when copying directories in Explorer. The same case works fine on Windows 10 LTSC. I have absolutely no idea where to even raise this as an issue now. I’m not sure I even give a fuck.
If it breaks an SSD, would Microsoft be liable for the damage?
The EULA that nobody reads says no
I think you also didn't read the EULA. The EULA says something along the lines of:
> the statements incompatible with local law are to be disregarded as void
This is to protect the beneficiary of the EULA's terms (Microsoft) from the possibility that the entire EULA is rendered invalid because one of its statements is illegal.
So EULA doesn't say
> no
What it says instead is
> no, if that's legal where you use this software
Though this condition isn't placed right next to the statement like that.
I doubt it
Has Microsoft ever been liable for anything?
Install "Windows 10 IoT Enterprise 2021 LTSC" if you don't mind buying grey market keys. Less crapware, more mature and less enshittified than 11, and security fixes until 2032.
I don't want to endorse Windows at all (use Linux if you can!). But maybe you need it to occasionally test something or whatever.
You don't have to buy grey market keys; use the public ones installed through massgravel. Open source, hosted on Microsoft's own GitHub - it's practically an endorsement!
> https://github.com/massgravel/Microsoft-Activation-Scripts
Even though I professionally work with Linux, I still don't trust it enough for gaming. I know that Steam does great things with Proton; my issue is that I'm not the type of gamer who constantly plays the same game - I play a game for as long as the story or my interest lasts, then switch to the next game.
And after a whole day of debugging and hair pulling at work I just don't feel like then also debugging why a game is not running like it should.
But I heard I should give it a try again, last time I gave it a shot was 2-3 years ago. Big plus would be that I'd be completely free of Windows...
FWIW: I own 262 games in my Steam library and have played most of them at least once. I had no issue with any single game.
I don't play multiplayer games so I'm not concerned by anti cheats though.
Did you try Bazzite? The only issue I've had was having to select the Proton build of CS; everything else works out of the box. Except for games that need anti-cheat… so I still ended up with a Windows partition.
I guess he wants to use his general computing device as a general computing device and not as a console.
Maybe you know this, but Bazzite works perfectly well as a standard Linux desktop operating system. It comes with a non-gaming desktop environment and can be set up to boot directly into that desktop environment. It just defaults to the Steam gaming interface.
It's an immutable distribution...
> if you don't mind buying grey market keys
Please don't buy "grey market" MS keys (i.e. super cheap keys or keys for products not sold to end users, like LTSC).
Either buy keys from legitimate vendors or use alternative activation methods (emulated KMS, etc.). I believe a lot of these grey market keys come either from MSDN subscriptions or leaked MAK keys, in either case, you aren't really paying for the product, you're just funneling money to sketchy people.
Had to get windows to play anti-cheat games. The EU mandated N versions seem pretty bloat free to me.
Weirdly enough, I had one of those 10 IoT Enterprise 2021 LTSC systems kill an SSD in the past month: bad blocks. Intel 520 180GB. Probably coincidence, but I figured I'd mention it since this was also a system with a large OST file in use.
> Intel 520 180GB
Sorry but this drive is almost 15 years old.
Any word from Microsoft?
Tomorrow, somebody will still explain to you like you're a child that Linux has hardware incompatibilities (on the computer they bought last week the day it came out), and is just not ready for prime time.
They want to stick with Windows because it's safe and just works.
And I will continue to use non-upgradable Macs because, while I miss tinkering with and upgrading my computers, I simply don’t have time for it anymore.
And sadly they’ll still be right :<
I have a strong suspicion this was some kind of stock / market attack. Phison dropped 14% (and their main competitor Silicon Motion increased 7% incidentally), while every single "news / slop" points to a single original source, some random Japanese person called "necoru_cat" that posted a supposed list of affected models (full of spelling mistakes).
I'm actually very surprised a single person managed to pull off a scam of this magnitude and am very worried about what effect fabricated news (now helped by AI) will have in the future.
[flagged]
Nah, I used various Linux distros for years and the update problems happen there all the time, I think even more TBH, and require substantial technical expertise to fix them.
IMO, the only good way is "if it works, don't fix it", which means, no updates. People are seriously overhyping updates.
I stopped updating all the stuff - OSes, smart locks, Android apps, TVs, BP monitors - I have honestly had multiple update problems on ALL of the mentioned devices, multiple times. I only update a thing when I have an actual problem and there is a changelog stating that the bug is fixed, or when I want a new feature. You can handle security in other ways in almost all cases.
I think this IT update burden has gotten out of hand - I can't think of any other domain like it - my car, my house, my bicycle, my glasses DO NOT UPDATE and it's glorious - apart from physical damage, they work the same as yesterday.
I've also been using Linux for years (Arch, btw) and never had an update break my install or cause issues, I've only had to fix the Linux bootloader when Windows overwrote it after a major update, multiple times...
I've had plenty of issues with Arch updates.
In fact, I have a laptop right now that hasn't received updates because a shared object that `yay` depends on has been removed.
(this was from a long time ago).
I generally think that updates of mainstream distros like Debian will definitely *NOT* brick your system in almost any circumstance, and Arch tends to be somewhat solid, but every once in a while something dire happens with Arch, which keeps me from agreeing that updates are always seamless.
AUR issues are not arch issues
"AUR packages are user-produced content. These PKGBUILDs are completely unofficial and have not been thoroughly vetted. Any use of the provided files is at your own risk."
Usually just rebuilding an AUR package will fix most issues.
I would say it's fair, but then it leads to an interesting problem that could happen.
Build a tiny core, ask the community to extend: never be liable for any issue because important packages that are essentially required aren't part of the core and thus any criticism is invalid.
Not saying that's happening here, but it could happen by this definition.
But, ok, there are more issues anyway; for example, it's pretty common that you have to update archlinux-keyring before an upgrade can succeed, because the signing keys have rotated and someone has already packaged an update to something signed with the new key. That is definitely base.
Arch isn't liable regardless. It's a community driven project run by volunteers without contracts.
>it's pretty common that you have to update archlinux-keyring before an upgrade can succeed because the signing keys
"Since 2022-07-29, the archlinux-keyring-wkd-sync.service and the associated systemd timer have been created and enabled by default"
https://wiki.archlinux.org/title/Pacman/Package_signing#Upgr...
Listen, dude, this isn’t litigation.
The shit is broken sometimes, it’s ok, but we are here to be intellectually curious and have a discussion.
Lying to people about how broken it can be to update in reality is the opposite of that.
We aren’t here to pull down arch or the community; just here to spit facts.
I disagree that having to update a package, which is also now handled by an automatic service, qualifies as broken. Sure there are issues, but they are usually either self-induced (e.g. AUR), easily fixable (e.g. the keyring update), or broadcast publicly with a fix (e.g. Arch news).
At the end of the day, Arch is firmly a do-it-yourself distro where some user intervention is expected.
Right so,
> I've also been using Linux for years (Arch, btw) and never had an update break my install or cause issues
Is an anecdote that is worthy of being attacked with my own, given the context that people might come away thinking that Arch updates do not break their system.. right?
Right, just make sure that anecdote actually applies.
Yeah, Arch is basically unusable without the AUR, so that is a semantic difference without grounding in reality. Many people use Arch because of its package manager.
That is maybe how you feel, but it obviously works just fine without it, otherwise it would not be explicitly separated. I have no more than a handful of AUR packages, none of which are essential or without alternatives.
>Many people use Arch because of its package manager
True, pacman is great, but that has nothing to do with the AUR.
Using a rolling release distro is not really a fair comparison. Little frequent updates are a lot easier than giant catastrophic ones.
I've been using Linux for decades and have had updates break a machine many times, to the point where I'd rather just reimage than try to fix the pile of issues. So many times I've installed updates and suddenly the graphics don't work, audio is broken, where did my network adapters go, etc. To the point that now, for most Linux things I touch, when I want to update I just deploy a fresh image rather than actually install updates. I can't really trust the state of the machine after a year of quietly installing updates.
A line item on my agenda today is actually helping a team figure out why when they do a release upgrade on their pet Ubuntu VM practically everything they care about on the box breaks and helping them plan out strategies to un-pet these workloads.
I've had a good share of Windows updates making a mess of things, don't get me wrong. But I've had plenty of bad updates in Linux over the years.
Really? I love Arch, but even my hard core Linux friends warn me about update problems.
Agreed. Continuous updates are unnecessary if you have sane restrictions on your system, i.e. strict firewall, limit software being installed. Probably the only exception to that is the web browser itself, but even that can be mitigated to an extent by running with an ad blocker and javascript off by default.
Not only are they unnecessary, they are often best avoided. I recently had to reinstall Windows after I updated firewall software: the VPN stopped working afterwards, and some serious IT pros were not able to fix it (even with different clients tried). After we lost a couple of days, I reinstalled the OS.
This, of course, happened at the worst possible time.
i seriously hope you don't apply this "who needs security updates, just secure it on a different level"-mantra to your profession :O
i get it for private/home stuff (even then it would make me uncomfortable, but i see the appeal).
It depends on context, really. Security is a feature like anything else. In an ideal world I would agree with you, but professional security costs a lot of money and the stakeholder is not necessarily willing to pay for it versus actually observable features. Also, security might be irrelevant in a bunch of contexts, particularly when there are great recovery options (and an absence of PII) - which you need to have anyway, and which people are actually willing to pay for, since they partially or totally overlap with security.
Funny how whenever people mention using Linux instead of Windows, there is always a user who claims Linux is broken and hard to use, while most Linux users live a peaceful and uneventful life.
> Linux users live a peaceful and uneventful life.
So do most Windows users.
I love and use Linux, what do you mean? Maybe a little bit too defensive that your favorite OS is not perfect?
I think the issue is that we excuse worse behaviour of Windows because of a preconceived notion that "Windows is easy and linux is hard".
The amount of registry hacks and debloat scripts and messed up updates I've had to endure on Windows makes my Linux experience seem very breezy by comparison to be perfectly honest with you.
The main issue I have with my own reasoning here is that about 20 years ago when I was getting into Linux: it was hard; and thus I learned a lot. Something that is a literal 2 second fix for me might take someone else 2 days or not be solvable. I can't really understand how hard others have it.
But: my mum has run Linux Mint for the last 8 years without issue. So maybe it really is a skill issue. This woman is not tech-literate like I am. So, there's that.
I am sure your mum uses 0.000001% of Linux features. She could probably work equally well on Commodore 64 with browser and mail. So that amounts to nothing. Besides, she has you to fix any issues, and Mint is the recommended distribution to use if you are switching from Windows. So there's that.
Your own usage of Linux is irrelevant to the mainstream user. If you use anything for 20 years, you will probably be good with it if you care at least a bit. Most people don't care; it is a means to an end. Hell, I have been in IT since kindergarten and even I don't care sometimes; it happens now and then that I just want stuff to work without any issues.
Linux is harder than Windows. It can't be anything else, as it was not originally made for the mainstream user, but for engineers and geeks. We have come a long way since its inception, though, and today many distros offer a decent OS for the mainstream user.
Yeah, my mum could work with a Chromebook - but why does that matter? What are we talking about, ease of use or power users?
Because if it's power users: having command of the whole operating system is just objectively better. Nothing hidden or forbidden, and no non-deterministic oddities and quirks that actively prohibit debugging.
If it's ease of use: then the easiest laptop operating system to maintain is… Linux (in the form of a Chromebook).
That said, I really haven’t helped my mum with her laptop- I actually thought she didn’t use it because she didn’t ask me any support questions but it turned out that it “just worked” and she clicked the update pop-up once in a while like I asked her to.
> Linux is harder than Windows. It can't be anything else, as it was not originally made for the mainstream user
This sounds like cope. Accessibility can be better on Windows for this reason - people being paid to drudge through an accessibility checklist - but the reality is not this. CommodoreOS was also built for humans, and it's harder to use than OpenBSD (which makes no apologies to non-power users). Your entire thesis here is bunk - there are just countless examples of people thinking they know best for users but actively harming them instead (by hiding information or misleading them intentionally).
Well, if it is GNU(+Linux), there are alternative mechanisms of action other than waiting for a corporation to act for your benefit. I think that is a giant difference.
this issue has been going for 2 months
How are you defining "bricked"? The SSD device can no longer be enumerated on the PCIe/SATA bus, or it doesn't respond to ATA/NVMe commands, or it doesn't respond as expected, or it does but the data is always wrong? Does the same SSD work in another machine?
edit: The author of the comment I replied to has changed their comment to remove all details of their testing.
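For anyone trying to pin down which of those failure modes they're actually seeing, here's a rough checklist as a sketch (it assumes a Linux box with nvme-cli and smartmontools installed and is run as root; the /dev/nvme0 path is a placeholder for the drive under test):

    import subprocess

    def run(cmd):
        print("$", " ".join(cmd))
        return subprocess.run(cmd).returncode

    run(["lspci", "-nn"])                    # 1. does the device even enumerate on the PCIe bus?
    run(["nvme", "list"])                    # 2. does the controller answer NVMe admin commands?
    run(["nvme", "smart-log", "/dev/nvme0"]) # 3. health data: media errors, critical warnings, % used
    run(["smartctl", "-a", "/dev/nvme0"])
    # 4. if all of the above look fine, the "bricking" may really be filesystem
    #    corruption, which a read-only fsck or a full read test would show instead.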
yes
That’s a fragile, sort of roundabout comment. I can think of 90125 reasons closer to the edge that will move us back two squares.