I have a 56 TB local Unraid NAS that is parity-protected against a single drive failure. I think recovering one failed drive from parity covers maybe 95% of data-loss scenarios, but I'm always concerned about two drives failing at once, or a site- or system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn’t turn up anything.)

  • tal@lemmy.today
    5 hours ago

    I don’t know of a pre-wrapped utility to do that, but assuming that this is a Linux system, here’s a simple bash script that’d do it.

    #!/bin/bash
    
    # Set this.  Path to a new, not-yet-existing directory that will retain a copy of a list
    # of your files.  You probably don't actually want this in /tmp, or
    # it'll be wiped on reboot.
    
    file_list_location=/tmp/storage-history
    
    # Set this.  Path to location with files that you want to monitor.
    
    path_to_monitor=path-to-monitor
    
    # If the file list location doesn't yet exist, create it and start a repo
    # whose branch is named master regardless of git's configured default.
    if [[ ! -d "$file_list_location" ]]; then
        mkdir -p "$file_list_location"
        git -C "$file_list_location" init
        git -C "$file_list_location" symbolic-ref HEAD refs/heads/master
    fi
    
    # In case someone's checked out things at a different time.  Skipped on the
    # very first run, when master has no commits yet and can't be checked out.
    if git -C "$file_list_location" rev-parse --verify --quiet master >/dev/null; then
        git -C "$file_list_location" checkout master
    fi
    find "$path_to_monitor" | sort > "$file_list_location/files.txt"
    # Paths given to git are relative to the -C directory.
    git -C "$file_list_location" add files.txt
    git -C "$file_list_location" commit -m "Updated file list for $(date)"
    
    

    That’ll drop a text file at /tmp/storage-history/files.txt with a list of the files at that location, and create a git repo at /tmp/storage-history that will contain a history of that file.

    When your drive array kerplodes or something, your files.txt file will probably become empty if the mount goes away, but you’ll have a git repository containing a full history of your list of files, so you can go back to a list of the files there as they existed at any historical date.
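
    If you'd rather have the nightly run skip itself than commit an empty list when the array is gone, a small guard can go in before the find. This is only a minimal sketch, assuming an unmounted array leaves behind an empty (or missing) directory; tree_is_available is a made-up helper name, not something from the script above.

```shell
#!/bin/bash

# Sketch: succeed only if the monitored tree looks usable, meaning the
# directory exists and contains at least one entry.  An unmounted array
# usually leaves an empty mount point behind, which this rejects.
tree_is_available() {
    local dir=$1
    [[ -d "$dir" ]] &&
        [[ -n "$(find "$dir" -mindepth 1 -maxdepth 1 -print -quit 2>/dev/null)" ]]
}

# Hypothetical use near the top of the script, before files.txt is rewritten:
#   tree_is_available "$path_to_monitor" || { echo "skipping run" >&2; exit 1; }
```

    The trade-off is that the history then simply stops at the last good night instead of recording the moment everything vanished; either way the earlier commits are still there.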

    Run that script nightly out of your crontab or something ($ crontab -e to edit your crontab).
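
    For the crontab part, the entry is one line: five time fields, then the command. A rough example, assuming you saved the script as ~/bin/update-file-list.sh (that path and the 03:30 run time are just placeholders I picked):

```shell
#!/bin/bash

# Hypothetical location of the script; adjust to wherever you saved it.
script_path="$HOME/bin/update-file-list.sh"

# Fields: minute hour day-of-month month day-of-week command.
# This one runs at 03:30 every night.
cron_line="30 3 * * * $script_path"
echo "$cron_line"

# To install it without opening an editor, append it to your current crontab:
#   ( crontab -l 2>/dev/null; echo "$cron_line" ) | crontab -
```

    Paste the printed line into the editor that crontab -e opens, or use the commented-out one-liner. Remember to make the script executable (chmod +x) first.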

    As the script says, you need to choose a file_list_location (not /tmp, since that’ll be wiped on reboot), and set path_to_monitor to wherever the tree of files is that you want to keep track of (like, /mnt/file_array or whatever).

    You could save a bit of space by adding a line at the end to remove the current files.txt after generating the current git commit if you want. The next run will just regenerate files.txt anyway, and you can use git to regenerate a copy of the file for any historical day you want. If you’re not familiar with git, $ git log to find the hashref for a given day, $ git checkout <hashref> to move to where things were on that day.
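
    If you'd rather not move the checkout around at all, git can print an old copy of the list directly. A minimal sketch, assuming the repo layout from the script above (a single files.txt at the top of the repo); list_as_of is a helper name I made up, and the date in the usage line is only an example.

```shell
#!/bin/bash

# Print files.txt as it was in the last commit made on or before a given time.
#   $1: path to the repo (the script's file_list_location)
#   $2: a timestamp git understands; include the time of day, since git
#       fills in the current time when only a date is given
list_as_of() {
    local repo=$1 when=$2 rev
    rev=$(git -C "$repo" rev-list -1 --before="$when" HEAD) || return 1
    if [[ -z "$rev" ]]; then
        echo "no commit on or before $when" >&2
        return 1
    fi
    git -C "$repo" show "$rev:files.txt"
}
```

    Something like $ list_as_of /path/to/storage-history "2024-03-01 23:59:59" > old-files.txt would then give you the list as of that evening, without touching the working tree.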

    EDIT: Moved the git checkout up.

      • zorflieg@lemmy.world
        3 hours ago

        Abefinder/NeoFinder is great for cataloging, but it costs money. If you do a limited backup it’s good to know what you had. I use tape formatted to LTFS and catalog both the source and the finished tape with NeoFinder.