It appears to work fine (it holds the home partition of the machine I daily drive) and I haven’t noticed any signs of failure. It isn’t noticeably slow either. Once upon a time I booted Windows off of it, which was incredibly slow to start up, but I haven’t noticed any slowness since repurposing it as the home partition for my personal files.

Articles online seem to suggest the life expectancy for an HDD is 5–7 years. Should I be worried? How do I know when to get a new drive?

  • Elise@beehaw.org · 1 day ago

    Hmm. Yeah, I’m thinking of keeping my operation lean and simple, with an online copy. One issue I’ve noticed is that sometimes files just get corrupted. Perhaps due to a radiation event? A parity drive could solve that, but I want something simpler. I’m thinking of just making a tar with a hash and then storing multiple copies. What do you think?
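    The tar-plus-hash idea can be sketched like this (a minimal sketch; the directory and file names are made up for illustration):

    ```shell
    # Stand-in data; replace with the real directory for that year.
    mkdir -p photos-2024 && echo "example" > photos-2024/note.txt
    # Bundle the year into one archive and record its checksum next to it.
    tar -czf photos-2024.tar.gz photos-2024/
    sha256sum photos-2024.tar.gz > photos-2024.tar.gz.sha256
    # Later, on any copy: verify the archive is still intact.
    sha256sum -c photos-2024.tar.gz.sha256
    ```

    Copy both the archive and the `.sha256` file to each storage location; if verification fails on one copy, restore it from another.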

    • ragebutt@lemmy.dbzer0.com · 1 day ago

      Bitrot sucks

      ZFS protects against this. It has historically been a pain to work with for home users, but the recent raidz expansion feature has made things a lot easier: you can now expand vdevs and grow an array without doubling the number of disks.
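      For reference, raidz expansion (OpenZFS 2.3+) is a single command; the pool and device names here are invented, so treat this as a sketch rather than a recipe:

      ```shell
      # Add one new disk to an existing raidz vdev; the pool grows in place
      # while staying online. (Requires OpenZFS >= 2.3.)
      zpool attach tank raidz1-0 /dev/sdX
      # Watch the expansion progress:
      zpool status tank
      ```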

      This is potentially a great option for someone like you who is just starting out, but it still requires a minimum of 3 disks and the associated hardware. Sucks for people like me though, who built arrays lonnnnng before ZFS had this feature! It was literally upstreamed less than a year ago, so good timing on your part (or maybe bad, maybe it doesn’t work well? I haven’t read much about it tbf, but from the small amount I have read it seems to work fine. They worked on it for years)

      Btrfs is also an option for similar reasons, as it has built-in protection against bitrot. If you read up on this there can be a lot of debate about whether it’s actually useful or dangerous. FWIW, the consensus seems to be that for single drives it’s fine.

      My setup has a separate RAID1 array of 2 TB NVMe drives, used as much higher speed cache/working storage for the services that run. E.g. if a torrent downloads, it goes to the NVMe first, since that storage is much easier to work with than the slow rotational drives (which are even slower because they are in a massive array); later, in the middle of the night, the file is moved to the large array for storage. Reading from the array is generally not an intensive operation, but writing to it can be, and it sometimes can’t keep up with a torrent that saturates my gigabit connection (or with operations that aren’t internet dependent, like muxing or transcoding a video file).

      Anyway, this NVMe array runs btrfs and has had 0 issues. That said, I personally wouldn’t recommend btrfs for RAID5/6, and given the nature of this array I don’t care at all about the data on it

      My main array has XFS. This doesn’t protect against bitrot. What you can do in this scenario is what I do: once a week I run a plugin that checksums all new files and verifies the checksums of old files. If checksums don’t match it warns me, and I can then restore the invalid file from backup and investigate for issues (SMART errors, a bad SATA cable, an ECC problem with RAM, etc).

      The upside of my XFS array is that I can expand it very easily and storage is maximized: I have 2 parity drives, and at any point I can simply pop in another drive and extend the array. This was not an option with ZFS until about 9 months ago. It’s a relatively “dangerous” setup, but my array isn’t storing critical data, it’s fully backed up despite that, and despite all of that it’s been going for 6+ years and has survived at least 3 drive failures
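      The weekly checksum pass described above could look roughly like this (a sketch only; the actual plugin isn’t named, and the `array` path and manifest file name are invented):

      ```shell
      # Stand-in for the real array mount point; replace with the actual path.
      mkdir -p array && echo "sample data" > array/file1.txt
      MANIFEST=checksums.sha256
      touch "$MANIFEST"
      # 1. Checksum any file not yet recorded in the manifest.
      find array -type f | while read -r f; do
          grep -qF "  $f" "$MANIFEST" || sha256sum "$f" >> "$MANIFEST"
      done
      # 2. Re-verify every recorded checksum; a mismatch means bitrot or
      #    corruption, so warn and restore the file from backup.
      sha256sum -c --quiet "$MANIFEST" || echo "WARNING: checksum mismatch, restore from backup"
      ```

      Run from cron weekly, this checksums new files on each pass and re-verifies everything already recorded.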

      That said, my approach is inferior to btrfs and ZFS, because in this scenario they could revert to a snapshot rather than needing to restore manually from backup. One day I will likely rebuild my array with ZFS, especially now that raidz expansion is complete; I was basically waiting for that
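      The snapshot-revert workflow mentioned above is roughly this on ZFS (dataset and snapshot names are invented; a sketch, not a tested recipe):

      ```shell
      # Take a cheap point-in-time snapshot of a dataset.
      zfs snapshot tank/media@weekly-2025-01-01
      # Later, if corruption is detected, revert the whole dataset to that
      # point instead of restoring individual files from backup.
      zfs rollback tank/media@weekly-2025-01-01
      ```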

      As always, double-check everything I say. It is very possible someone will reply and tell me I’m stupid and wrong for several reasons. People can be very passionate about filesystems

      • Elise@beehaw.org · 22 hours ago

        Where do you store the checksums? Is it for every file? I thought of just making a tar for each year, storing the hash next to it, and keeping a copy off-site.

        • ragebutt@lemmy.dbzer0.com · 20 hours ago

          I just keep them on a USB stick, with a copy on the array as well so they can also be checked for bitrot. Even doing it for every file it’s not that much data, and it’s scripted so it runs on a regular schedule (weekly, in my case).

          Actual file backups are what I store off-site: 2 copies, one here and one off. My data generally isn’t changed all that much, so I don’t bother continually backing up most directories. It doesn’t make sense to have 30 backups of my TV folder with my shows; they’re the same shows. I have some redundancy, I don’t just do one and done, but tape media is expensive so I don’t do monthly backups either.

          Tape is wildly impractical for most home users though, and off-site tape means you need a trusted place to put it that’s reasonably safe, with moderately decent climate/humidity. One advantage of tape, though, is that basically no one but the biggest of tech dorks is going to be able to read that data (versus something like leaving an external hard drive or Blu-ray at a friend’s house: even if you trust them a LOT, they might plug it in. Although encryption exists)

          It’s home data so it’s about balancing what makes sense with what’s cost effective and your risk tolerance

          Some data is crucial, of course. My personal documents are backed up far more regularly, like once an hour or so, and that’s where I use services like Backblaze. My business, which is healthcare oriented, is entirely different: that data is segregated and uses Backblaze as well as specialized software, since it handles PHI and HIPAA concerns. That’s backed up pretty much every few minutes.