It’s fairly obvious why stopping a service while backing it up makes sense. Imagine backing up Immich while it’s running. You start the backup, db is backed up, now image assets are being copied. That could take an hour. While the assets are being backed up, a new image is uploaded. The live database knows about it, but the copy you’ve already backed up doesn’t. Then your backup process reaches the new image asset and copies it. If you restore this backup, Immich will contain an asset that the database doesn’t know about. To avoid scenarios like this, you’d stop Immich while the backup is running.
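To make the failure mode above concrete, here is a toy simulation. It is only a sketch: the database and the asset store are modeled as plain Python sets of file names, and the function names (`non_atomic_backup`, `orphaned_assets`) are made up for illustration and have nothing to do with Immich's actual internals.

```python
def non_atomic_backup(db_rows, asset_files, upload_during_copy):
    """Copy the database first, then the assets, with an upload in between."""
    db_backup = set(db_rows)  # step 1: the database is copied

    # An upload lands while the slow asset copy is still in progress.
    # The live database and the live asset store both learn about it.
    db_rows.add(upload_during_copy)
    asset_files.add(upload_during_copy)

    asset_backup = set(asset_files)  # step 2: assets are copied, new file included
    return db_backup, asset_backup


def orphaned_assets(db_backup, asset_backup):
    """Assets present in the backup that the backed-up database has no record of."""
    return asset_backup - db_backup


live_db = {"a.jpg", "b.jpg"}
live_assets = {"a.jpg", "b.jpg"}
db_copy, asset_copy = non_atomic_backup(live_db, live_assets, "c.jpg")
print(orphaned_assets(db_copy, asset_copy))  # {'c.jpg'}
```

The orphan appears only because the two copies were taken at different moments. Anything that makes both copies reflect the same instant removes it.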

Now consider a system that can do instant snapshots, like ZFS or LVM. Immich is running; you stop it, take a snapshot, then restart it. Then you back up Immich from the snapshot while Immich is running. This should reduce the downtime to the time it takes to create the snapshot. The state of the Immich data in the snapshot should be equivalent to that of a stopped Immich instance.
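The stop, snapshot, restart sequence could be scripted roughly like this. It is a sketch under assumptions: the service is assumed to run under Docker Compose and its data to live on a single ZFS dataset, and the dataset name, compose directory, and function name are hypothetical. The `run` parameter exists only so the sequence can be exercised without touching a real system.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_with_brief_stop(dataset, compose_dir, run=subprocess.run):
    """Stop the service, snapshot its dataset, and restart it no matter what."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    snapshot = f"{dataset}@backup-{stamp}"
    compose = ["docker", "compose", "--project-directory", compose_dir]

    run(compose + ["stop"], check=True)
    try:
        # Downtime is limited to the time this single command takes.
        run(["zfs", "snapshot", snapshot], check=True)
    finally:
        # Restart even if the snapshot failed, so a backup error
        # does not turn into an outage.
        run(compose + ["start"], check=True)
    return snapshot
```

A call such as `snapshot_with_brief_stop("tank/immich", "/srv/immich")` (both values invented) returns the snapshot name, which the slow backup job can then read from while the service is already back up.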

Now consider the same case, but without stopping Immich while taking the snapshot. In theory, the data you’re backing up should represent the complete state of Immich at a single point in time, eliminating the possibility of divergence between the database and the assets. It would, however, represent the state of a live Immich instance, including lock files and the like. Wouldn’t restoring from such a backup be equivalent to a kill -9, or to pulling the cable and then restarting the service? If a service can recover from a cable pull, is it reasonable to expect it to recover from a restore of a snapshot taken while live? If so, is there much point in stopping services during snapshots?

  • johntash@eviltoast.org
    2 months ago

    If you’re worried about a database being corrupt, I’d recommend doing an actual backup dump of the database and not only backing up the raw disk files for it.

    That should help provide some consistency. Of course, it takes longer too if it’s a big db.
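As an illustration of the dump approach suggested above, here is a minimal sketch assuming the database is PostgreSQL and that `pg_dump` is on the PATH. The database name, host, user, and output path are placeholders, and the helper names are invented for this example.

```python
import subprocess


def pg_dump_command(dbname, outfile, host="localhost", user="postgres"):
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        "--format=custom",  # compressed archive, restorable with pg_restore
        "--file", outfile,
    ]


def dump_database(dbname, outfile, run=subprocess.run, **kwargs):
    """Run the dump; check=True raises if pg_dump exits non-zero."""
    cmd = pg_dump_command(dbname, outfile, **kwargs)
    run(cmd, check=True)
    return cmd
```

`pg_dump` reads from a consistent view of the database even while it is in use, which is why a dump taken from a running service is safer than a plain copy of the raw files. The dump is still only consistent with itself, not with asset files copied at a different time.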

    • Avid Amoeba@lemmy.ca (OP)
      2 months ago

      I dump the db too.

      With that said if backing up the raw files of a db while the service is stopped can produce a bad backup, I think we have bigger problems. That’s because restoring the raw files and starting the service is functionally equivalent to just starting the service with its existing raw files. If that could cause a problem then the service can’t be trusted to be stopped and restarted either. Am I wrong?

      • johntash@eviltoast.org
        2 months ago

        I was talking about dumping the database as an alternative to backing up the raw database files without stopping the database first. Taking a filesystem-level snapshot of the raw database without stopping the database first also isn’t guaranteed to be consistent. Most databases are fairly resilient now though and can recover themselves even if the raw files aren’t completely consistent. Stopping the database first and then backing up the raw files should be fine.

        The important thing is to test restoring :)
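One way to act on "test restoring" is to restore into a scratch directory and compare it against the tree the backup was taken from (for example, the snapshot). The sketch below does that with SHA-256 digests. The function names are invented for illustration, and it only checks file contents. It does not verify that the service actually starts on the restored data, which is still worth checking by hand.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch,
            # but large assets would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[path.relative_to(root).as_posix()] = digest
    return digests


def compare_trees(original, restored):
    """Return (missing, unexpected, changed) relative paths in the restored tree."""
    a, b = tree_digests(original), tree_digests(restored)
    missing = sorted(a.keys() - b.keys())
    unexpected = sorted(b.keys() - a.keys())
    changed = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return missing, unexpected, changed
```

Three empty lists mean the restored files match the source byte for byte.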

  • Decronym@lemmy.decronym.xyz (bot)
    2 months ago

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters   More Letters
    LVM             (Linux) Logical Volume Manager for filesystem mapping
    NFS             Network File System, a Unix-based file-sharing protocol known for performance and efficiency
    SMTP            Simple Mail Transfer Protocol
    ZFS             Solaris/Linux filesystem focusing on data integrity

    4 acronyms in this thread; the most compressed thread commented on today has 4 acronyms.

    [Thread #902 for this sub, first seen 1st Aug 2024, 21:25]

  • MangoPenguin@lemmy.blahaj.zone
    2 months ago

    “You start the backup, db is backed up, now image assets are being copied. That could take an hour.”

    For the initial backup maybe, but subsequent incrementals should only take a minute or two.

    I don’t bother stopping services, it’s too time intensive to deal with setting that up.

    I’ve yet to meet any service that can’t recover smoothly from a kill -9 equivalent. Any that couldn’t sure wouldn’t be on my list of stuff I run anymore.

  • MaximilianKohler@lemmy.world
    2 months ago

    I ran into a similar problem with snapshots of a forum and email server – if there are scheduled emails when you take the snapshot, they get sent out again when you create a new test server from the snapshot. The same goes for the forum.

    I’m not sure what the solution is either. The emails are sent via an SMTP service, so it’s not as simple as disabling email (ports, firewall, etc.) on the new test server.

  • adr1an@programming.dev
    2 months ago

    Check out the “blue-green” deployment strategy. It’s used by many businesses, where an interrupted service might mean losing a sale, or a client forever… I tried it once with Nginx, but it was more pain than gain (for my personal use).