Hello,

I’ve been thinking lately about my backup strategy as I finish building my NAS. I want to use ZFS, and my idea is to have two drives in a mirror (RAID-1) configuration and take periodic snapshots of the dataset. I want to do the same thing in a second location, so in the end my files would be on 4 different drives in 2 different locations and protected by snapshots from deletion or any other unwanted modification.

Would it be possible with this setup to just swap out one of the drives in one location, have ZFS automatically rebuild the data onto the new drive, and then take the swapped-out drive to the second location and do the same there, so that all drives end up exactly the same, instead of copying data manually? I believe all of the drives would need to be exactly the same size, though, is that right?

Is it a good idea in general, or should I ditch it? Or maybe just ditch the ZFS rebuilding part and use some kind of software for that instead?

Thank you for your help in advance!

  • unsaid0415@szmer.info
    1 year ago

    I’m not sure if I understood your question correctly, but perhaps it’d be more comfortable to use ZFS’s native replication mechanism (zfs send/receive) over the network. It’s “just snapshots”, but the first transfer syncs the whole initial dataset as well.

    A very simple ZFS to (Raspi+ZFS) setup is shown here, it relies on cron: https://blog.beardhatcode.be/2021/05/raspberry-pi-zfs-replication.html

    If you have two servers that run ZFS (e.g. TrueNAS), you can skip sanoid/syncoid and just use zfs send from one server to the other directly, using its network address.
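A minimal sketch of what that direct approach could look like, assuming SSH access between the two machines. The function only builds the shell pipeline as a string and does not run anything. The dataset, snapshot, and host names are made-up placeholders, and `replication_pipeline` is a hypothetical helper, not part of any ZFS tooling.

```python
import shlex


def replication_pipeline(dataset, snapshot, remote_host, remote_dataset, previous=None):
    """Build a `zfs send | ssh ... zfs receive` pipeline as a string.

    With `previous` set, the stream is incremental (only the changes
    between the two snapshots); without it, the full snapshot is sent.
    All names passed in are placeholders chosen by the caller.
    """
    send = ["zfs", "send"]
    if previous is not None:
        # -i asks for an incremental stream from the older snapshot
        send += ["-i", f"{dataset}@{previous}"]
    send.append(f"{dataset}@{snapshot}")

    # The receiving side runs `zfs receive` on the remote host via SSH
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]

    return " | ".join(shlex.join(part) for part in (send, receive))
```

Calling it once without `previous` gives the command for the initial full sync; later calls with `previous` set to the last snapshot both sides share give the smaller incremental transfers.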

    • biscuits@lemmy.sdfeu.orgOP
      1 year ago

      Yeah, I guess it may be risky to remove drives from the pool, so maybe it would be better to just build and move the whole secondary pool, as the other commenter pointed out (at least for the first time; smaller increments should be easier to handle). But do you think my strategy of using snapshots as backup is good overall, or should I use something else?

  • PriorProject@lemmy.world
    1 year ago

    I don’t know if what you’re suggesting is possible, which as I read it is to split your “live” raid-1 in half and use one drive to rebuild the “live” pool and the other drive to rebuild the “backups” pool. It might be, but I can’t think of any advantage to that approach and it’s not something I would have thought to attempt.

    I’d do one of:

    • Ship the data over the network using ZFS send or something like syncoid/sanoid (which use ZFS send under the hood). It might be slow, but is that an issue? Waiting a week for the initial sync might be fine.
    • But syncing by sneakernet is a good strategy too, and can be faster if your backup site is close or your connectivity is slow. In this case, I’d build the backup pool at the live site… ideally in an external drive bay… but one could plug it in internally as well. Then sync them with a local ZFS send, export the backup pool, detach and transport the backup pool to the backup site, then reattach the backup pool at the backup site and import it. Et voilà, the backup pool is running at the remote site fully populated with data, and subsequent ZFS sends will be incremental.

    Splitting and rebuilding your live pool might be possible, but I can imagine a lot that might go wrong, and I can’t see any reason to do it that way over export/import.
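The sneakernet steps above can be sketched roughly as the command plan below. This is only an illustration under assumptions: the pool names `tank` and `backup` and the snapshot name `seed` are placeholders, the backup pool is assumed to already exist at the live site, and `sneakernet_seed_plan` is a hypothetical helper that returns the commands as strings rather than executing them.

```python
def sneakernet_seed_plan(live_pool, backup_pool, snapshot="seed"):
    """Return the commands for seeding a backup pool locally and moving it.

    The plan is split by where each command would be run. Nothing is
    executed here; the strings are meant to be read and adapted.
    """
    live_site = [
        # Recursive snapshot of the live pool to use as the seed
        f"zfs snapshot -r {live_pool}@{snapshot}",
        # Local replication stream into the backup pool;
        # -u keeps the received file systems unmounted
        f"zfs send -R {live_pool}@{snapshot} | zfs receive -u {backup_pool}/{live_pool}",
        # Cleanly detach the backup pool before unplugging the drives
        f"zpool export {backup_pool}",
    ]
    backup_site = [
        # After reattaching the drives at the remote site
        f"zpool import {backup_pool}",
    ]
    return {"live site": live_site, "backup site": backup_site}
```

After the import, the seed snapshot exists on both sides, which is what lets later sends be incremental.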

    • biscuits@lemmy.sdfeu.orgOP
      1 year ago

      Thanks, I guess it’s an even better solution, and it doesn’t involve the somewhat risky step of removing drives from the pool. But do you think my strategy of using snapshots as backup is good overall, or should I use something else?

      • PriorProject@lemmy.world
        1 year ago

        Yeah, sending snapshots to a separate, often remote, pool is an extremely common backup strategy for folks who have long-term settled on ZFS. There’s very nice tooling for this that presents a more traditional schedule/retention-based interface to save you scripting snapshots and sends directly.

        • Sanoid is an old standby in that space.
        • Zrepl is getting a lot of traction lately and seems to be an up-and-coming option.
        • I use pyznap, but I don’t recommend it to others, as the maintainer is on a multi-year hiatus. It works great, but it isn’t getting active development, which makes it a poor bet in a crowded space with many great options. I plan to eval Zrepl when I get around to it.
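To give a feel for what a retention rule boils down to, here is a minimal sketch of a "keep the newest N snapshots" policy. It is not how Sanoid, Zrepl, or pyznap implement retention; `snapshots_to_prune` is a hypothetical helper, and the snapshot names and timestamps are assumed to have been collected by the caller.

```python
from datetime import datetime


def snapshots_to_prune(snapshots, keep_last):
    """Return the snapshot names that fall outside a keep-newest-N rule.

    `snapshots` maps a snapshot name to its creation time (datetime).
    The result is ordered oldest first; each name would be a candidate
    for `zfs destroy`. This function itself deletes nothing.
    """
    if keep_last < 0:
        raise ValueError("keep_last must be zero or positive")

    # Sort names by creation time, oldest first
    ordered = sorted(snapshots, key=snapshots.get)

    # Everything before the newest `keep_last` entries is prunable
    surplus = len(ordered) - keep_last
    return ordered[:max(surplus, 0)]
```

The real tools layer several such rules (hourly, daily, monthly) and handle the sends as well, which is the scripting they save you.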
  • greengnu@slrpnk.net
    1 year ago

    Your ZFS backup strategy should be to follow one of the following rulesets:

    3-2-1 [3 copies of the data at 2 different locations for the 1 purpose of preserving the data]

    4-3-2-1 [4 copies of the data at 3 different locations in 2 different types of media for the 1 purpose of preserving the data]

    5-4-3-2-1 [5 copies of the data at 4 different locations across 3 different continents in 2 different types of media for the 1 purpose of preserving the data]

    The details of the backup depend more on whether you have a second system to enable ZFS send/receive, or whether you have to transport deltas from ZFS send.
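For the case where deltas from ZFS send have to be transported by hand, one possible shape is to write the incremental stream to a file and replay it on the other side. The sketch below only builds the two commands as strings. The dataset names, snapshot names, and file path are placeholders, and `delta_to_file_commands` is a hypothetical helper.

```python
import shlex


def delta_to_file_commands(dataset, older, newer, stream_path, target_dataset):
    """Build the commands for moving an incremental stream as a file.

    Returns (export_cmd, import_cmd): the first is run where the live
    pool is, the second where the backup pool is, after the file has
    been carried over. Both sides must already share the older snapshot.
    """
    path = shlex.quote(stream_path)
    # Write only the changes between the two snapshots to a file
    export_cmd = f"zfs send -i {dataset}@{older} {dataset}@{newer} > {path}"
    # Replay the stream file into the backup dataset
    import_cmd = f"zfs receive {target_dataset} < {path}"
    return export_cmd, import_cmd
```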