injecting old backup generations

9da088cc54e14ed48990cbe8e8c5611c
QUADRANT ORLANDO NEWBORN

From: Stefano Zacchiroli <zack@upsilon.cc>
Date: Fri, 26 Feb 2016 09:51:04 +0100

   [ forwarding here a discussion started with Lars on IRC ]
   
   Hi all,
     in migrating backup solutions to obnam, I've stumbled upon a use case
   that is not very well supported by obnam today: injecting old backups
   (say, ~5 years of history) as retrofitted obnam backup generations. That
   is a requirement in many cases, as you don't want to be forced to use
   different tools---obnam or your previous solution---depending on the age
   of the backup you want to explore.
   
   There are at least two problems in supporting this use case:
   
   a) faking paths: I want to be able to extract old backups in some dir,
      and tell obnam to backup those dirs as if they were the root dir
      (this is to avoid inconsistent path access when exploring pre-obnam
      backups and post-obnam ones)
   
   b) faking timestamps: once old backups are extracted, I want to inject
      them, telling obnam "consider this backup as dated $TIMESTAMP"
   
   Regarding (a), it ties to a more generally wanted obnam feature, i.e.,
   the ability to mangle paths. I was considering that (a) could be
   implemented by adding some sort of chroot support into obnam, but Lars
   told me he is pondering a more general "path mangling" support for
   obnam. Any pointers to the current status (spec, design, code) of that?
   
   Regarding (b), I haven't tried myself but I've been told that faketime
   works just fine. So that might be a way around this issue. OTOH the
   question remains of whether obnam would welcome a more specific
   "time override" flag as part of its native features.
   
   Also, there is the question of whether people would prefer adding two
   separate knobs that allow doing (a) and (b) independently, or rather
   adding a high-level "inject old backup" feature.
   
   I'd love to hear your thoughts on this matter!
   Cheers.
From: Lars Wirzenius <liw@liw.fi>
Date: Sun, 24 Jul 2016 12:42:46 +0300

   Hi, sorry for taking several months to respond.
   
   On Fri, Feb 26, 2016 at 09:51:04AM +0100, Stefano Zacchiroli wrote:
   > a) faking paths: I want to be able to extract old backups in some dir,
   >    and tell obnam to backup those dirs as if they were the root dir
   >    (this is to avoid inconsistent path access when exploring pre-obnam
   >    backups and post-obnam ones)
   
   Right. So this has been asked for a few times, and while I'm generally
   not happy about the complexity it will introduce, I could be willing
   to add it, since there are enough reasonable use cases for it, it seems.
   
   The way I'd implement this is to have a hook that a plugin can use to
   change the path that gets put into the repository. In other words,
   Obnam would then work like this:
   
       scan the filesystem, looking for paths for files to back up
       for each file, mangle the path via a hook
       use the mangled path to check if the file is in the repository
         already, and if not, put it into the repository using the
         mangled path
       obviously, use the unmangled path to access the live data file
   
   A plugin would then attach itself to the hook, and mangle the path
   based on configuration from the user. At its core, the plugin would do
   this:
   
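       def mangle_path(self, pathname):
           # Method of the plugin object: self.strip_dirs is the
           # user-configured prefix to drop from stored paths.
           if pathname.startswith(self.strip_dirs):
               pathname = pathname[len(self.strip_dirs):]
           return pathname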
   
   I've skipped one or two or a hundred details above, of course. Such as
   tests (unit and integration ones).
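
A minimal, self-contained sketch of the flow described above may help. It is not Obnam code: `StripPrefixPlugin`, `backup`, the dict standing in for the repository, and the `/restore/2011` directory are all hypothetical names chosen for illustration. It also matches the prefix only at a directory boundary, a detail the four-line version leaves out.

```python
class StripPrefixPlugin:
    """Hypothetical plugin: strips a configured directory prefix."""

    def __init__(self, strip_dirs):
        # Normalise so "/restore/2011/" and "/restore/2011" behave alike.
        self.strip_dirs = strip_dirs.rstrip("/")

    def mangle_path(self, pathname):
        prefix = self.strip_dirs
        if pathname == prefix:
            return "/"
        # Only strip at a directory boundary, so "/restore/2011x/a"
        # is left alone.
        if pathname.startswith(prefix + "/"):
            return pathname[len(prefix):]
        return pathname


def backup(live_paths, repo, hook):
    """Store each live path under its mangled name; return what was added."""
    added = []
    for live in live_paths:
        stored = hook(live)        # name used inside the repository
        if stored not in repo:     # lookup uses the mangled path
            repo[stored] = live    # live data is read via the unmangled path
            added.append(stored)
    return added


plugin = StripPrefixPlugin("/restore/2011")
repo = {}  # toy repository: mangled path -> live path it was read from
added = backup(
    ["/restore/2011/etc/fstab", "/restore/2011/home/zack/notes"],
    repo,
    plugin.mangle_path,
)
```

The point of the sketch is the split between the two names: the mangled path is used for everything repository-side, the unmangled one only for reading the live file.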
   
   I don't see a need to do the reverse hook for restoring.
   
   I've not done any coding for this. It should be a fairly simple task
   for someone who knows Python, except for the part of learning enough
   Obnam internals and test scaffolding to do a decent job.
   
   (Sometimes I think I should arrange a workshop for those who want to
   get familiar with Obnam internals. Alas, there's no funding.)
   
   > b) faking timestamps: once old backups are extracted, I want to inject
   >    them, telling obnam "consider this backup as dated $TIMESTAMP"
   
   This would similarly be a hook that mangles the metadata of the file.
   
       obnam reads metadata from live data
       obnam calls a hook to mangle the metadata
       obnam stores mangled metadata
   
   I'd do this as a generic metadata mangling hook rather than
   specifically for timestamps.
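
As a rough illustration of what a generic metadata hook could look like, here is a sketch under stated assumptions. `Metadata`, `run_hooks`, `clamp_mtime` and `force_owner` are hypothetical names, not Obnam's real classes or hook API, and the numbers are arbitrary example values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Metadata:
    # Hypothetical stand-in for per-file metadata; not Obnam's real class.
    st_mtime: int
    st_uid: int
    st_mode: int


def run_hooks(metadata, hooks):
    """Pass metadata through each registered hook, in order."""
    for hook in hooks:
        metadata = hook(metadata)
    return metadata


def clamp_mtime(limit):
    """Hook factory: cap mtime at `limit` (seconds since the epoch)."""
    def hook(md):
        return replace(md, st_mtime=min(md.st_mtime, limit))
    return hook


def force_owner(uid):
    """Hook factory: record every file as owned by `uid`."""
    def hook(md):
        return replace(md, st_uid=uid)
    return hook


# Metadata as read from the live file, then as it would be stored.
live = Metadata(st_mtime=1456476664, st_uid=1000, st_mode=0o100644)
stored = run_hooks(live, [clamp_mtime(1298710264), force_owner(0)])
```

Because each hook takes and returns a whole metadata object, timestamps are just one field among others, which is the argument for keeping the hook generic.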
   
   In addition to file metadata, there could be another hook for faking
   the generation timestamps, except that kinda already exists. The
   --pretend-time setting is there for testing purposes, but it could be
   used for setting the generation timestamps.
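
For the generation timestamps, a small driver script could build one `obnam backup` invocation per extracted snapshot. The sketch below only constructs the command lines and does not run them. The directory names, the repository path, the oldest-first ordering, and the `YYYY-MM-DD HH:MM:SS` timestamp format are assumptions to be checked against `obnam --help` for the installed version.

```python
from datetime import datetime


def obnam_backup_command(repository, root, taken_at):
    """Build (not run) an obnam command line for one old snapshot."""
    # Assumed timestamp format; verify against your obnam version.
    stamp = taken_at.strftime("%Y-%m-%d %H:%M:%S")
    return [
        "obnam",
        "backup",
        f"--repository={repository}",
        f"--pretend-time={stamp}",
        root,
    ]


# Hypothetical extracted snapshots: directory -> time the backup was taken.
snapshots = {
    "/restore/2012": datetime(2012, 1, 1, 3, 0, 0),
    "/restore/2011": datetime(2011, 1, 1, 3, 0, 0),
}

# Oldest first, so generations would be created in chronological order.
commands = [
    obnam_backup_command("/srv/obnam-repo", root, when)
    for root, when in sorted(snapshots.items(), key=lambda kv: kv[1])
]
```

Each list in `commands` could then be passed to `subprocess.run` once the flags have been verified.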
From: Stefano Zacchiroli <zack@upsilon.cc>
Date: Sun, 24 Jul 2016 12:45:48 +0200

   Hey Lars, thanks for your answer! All looks good to me, but I do have a
   question about a specific point you raised:
   
   On Sun, Jul 24, 2016 at 12:42:46PM +0300, Lars Wirzenius wrote:
   > I don't see a need to do the reverse hook for restoring.
   
   Why?
   
   Aside from the design elegance that round-tripping has in general,
   without the converse mapping people would be unable to just restore a
   system in place.  Not that one *usually* does that, but even if you
   restore in a temp dir having the content of that dir being isomorphic to
   the file-system part it comes from is really handy.
   
   Same goes for timestamps, with the added disadvantage that they would be
   even more difficult to fix upon restore.
   
   It is true that with round-tripping you'll go down the rabbit hole of
   whether you offer a mangling API that *guarantees* round-tripping (which
   would probably be awkward) or just expect users to do that themselves
   (and then allow users to shoot themselves in the foot). But round
   tripping still seems to me the obvious right way to go here. What am I
   missing? Maybe some intrinsic difficulty related to how obnam works
   internally?
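
To make the round-tripping concern concrete, here is a small sketch with hypothetical `mangle` and `unmangle` functions (not an Obnam API) for the prefix-stripping case. It shows that the pair round-trips only for paths under the prefix, and that two different live paths can map to the same stored path.

```python
def mangle(pathname, prefix):
    """Backup direction: drop `prefix` from paths that lie under it."""
    if pathname.startswith(prefix + "/"):
        return pathname[len(prefix):]
    return pathname


def unmangle(pathname, prefix):
    """Restore direction: put `prefix` back in front of every path."""
    return prefix + pathname


def round_trips(pathname, prefix):
    """True if restoring the mangled path gives back the original."""
    return unmangle(mangle(pathname, prefix), prefix) == pathname


prefix = "/restore/2011"

# A path under the prefix survives the round trip.
inside = round_trips("/restore/2011/etc/fstab", prefix)

# A path outside the prefix does not: it was never stripped, yet the
# restore side still prepends the prefix.
outside = round_trips("/etc/fstab", prefix)

# Two distinct live paths end up with the same stored name, so no
# inverse mapping can recover both.
collision = mangle("/restore/2011/etc/fstab", prefix) == mangle("/etc/fstab", prefix)
```

An API that guaranteed round-tripping would have to reject or constrain mappings like this one; leaving it to users means accepting cases like `outside` and `collision`.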
   
   Cheers.
From: Lars Wirzenius <liw@liw.fi>
Date: Sat, 30 Jul 2016 19:16:54 +0300

   On Sun, Jul 24, 2016 at 12:45:48PM +0200, Stefano Zacchiroli wrote:
   > Hey Lars, thanks for your answer! All looks good to me, but I do have a
   > question about a specific point you raised:
   > 
   > On Sun, Jul 24, 2016 at 12:42:46PM +0300, Lars Wirzenius wrote:
   > > I don't see a need to do the reverse hook for restoring.
   > 
   > Why?
   > 
   > Aside from the design elegance that round-tripping has in general,
   > without the converse mapping people would be unable to just restore a
   > system in place.
   
   Obnam already doesn't support restoring things in place. It only allows
   restoring to an empty directory, or a directory that doesn't exist and
   Obnam creates. For anything else you'll need to use "obnam mount" and
   cp, rsync, or some other standard file copying tool.
   
   Round-tripping can have a design elegance, but I'd really like to have
   a real use case for why it should be implemented.