Obnam FORMAT GREEN ALBATROSS status and roadmap

From: Lars Wirzenius <liw@liw.fi>
Date: Sat, 21 Jan 2017 16:45:32 +0200

   Happy 2017, everyone.
   
   I intend to get the new Obnam repository format, GREEN ALBATROSS,
   finished this year. It's January so I feel there's a lot of time, but
   if anyone wants to help, that would of course be quite welcome.
   
   Current status is that I use GA on almost all backup repositories and
   it's tolerably fast for backups, but forgets are very slow. Haven't
   tried fsck or restore much.
   
   I just added a short note about GA to the obnam.org front page, and
   also a separate page for a draft roadmap to finish GA. See
   <http://obnam.org/roadmap-ga/>.
   
   Also, if anyone would like to try GA now, that'd be quite welcome.
   Read the note on the website to understand the risks, however. Treat
   GA as alpha level software. If you do try, I'd be curious to hear
   about your experiences, and what your live data is like (number of
   files, total data, how much changes between backups, etc).
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 08 Feb 2017 12:34:07 +0100

   Hi Lars!
   
   On Sat, 21 Jan 2017 16:45:32 +0200, Lars Wirzenius <liw@liw.fi> wrote:
   > I intend to get the new Obnam repository format, GREEN ALBATROSS,
   > finished this year.
   
   :-)
   
   
   One idea I just had -- but I don't know if that is feasible to implement:
   you could (internally!) have obnam "write" both the "current" and the
   "Green Albatross" formats, and then every "read" operation would read
   both of them, and compare that they return the same data.  Then have
   obnam complain if that doesn't match (with the "current" format's data
   then taking precedence, I think).  Obviously, that will add a lot of
   overhead for every repository access, but any users willing to take that
   can then easily help testing the "Green Albatross" format.  But I don't
   know if the repository abstraction in obnam is sufficiently generic for
   being able to deal with two underlying repository formats, etc.
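
The dual-write, compare-on-read idea above can be sketched roughly as follows. This is only a minimal illustration, not Obnam code: `DictRepo`, `ComparingRepo`, and the `put`/`get` methods are hypothetical names, and a toy in-memory dictionary stands in for each repository format.

```python
class DictRepo:
    """Toy in-memory stand-in for one repository format (hypothetical)."""

    def __init__(self):
        self._chunks = {}

    def put(self, key, data):
        self._chunks[key] = data

    def get(self, key):
        return self._chunks[key]


class ComparingRepo:
    """Write to both repositories; on read, compare and prefer the primary."""

    def __init__(self, primary, candidate):
        self.primary = primary      # the "current" format, treated as trusted
        self.candidate = candidate  # the format under test
        self.mismatches = []        # keys whose data differed on read

    def put(self, key, data):
        # Every write goes to both formats, which doubles the write cost.
        self.primary.put(key, data)
        self.candidate.put(key, data)

    def get(self, key):
        expected = self.primary.get(key)
        try:
            actual = self.candidate.get(key)
        except KeyError:
            actual = None
        if actual != expected:
            self.mismatches.append(key)  # "complain", but keep going
        return expected  # the primary format's data takes precedence


repo = ComparingRepo(DictRepo(), DictRepo())
repo.put("chunk-1", b"hello")
repo.candidate.put("chunk-1", b"corrupted")  # simulate a bug in the new format
data = repo.get("chunk-1")
```

Here the read still returns the primary's data while the mismatch is recorded for later reporting. Whether Obnam's real repository abstraction could be wrapped this way is exactly the open question raised above.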
   
   
   Regards
    Thomas
   
   _______________________________________________
   obnam-dev mailing list
   obnam-dev@obnam.org
   http://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/obnam-dev-obnam.org
From: SanskritFritz <sanskritfritz@gmail.com>
Date: Wed, 8 Feb 2017 14:02:30 +0100

   On Wed, Feb 8, 2017 at 1:58 PM, Lars Wirzenius <liw@liw.fi> wrote:
   
   > On Wed, Feb 08, 2017 at 12:34:07PM +0100, Thomas Schwinge wrote:
   > > One idea I just had -- but I don't know if that is feasible to implement:
   > > you could (internally!) have obnam "write" both the "current" and the
   > > "Green Albatross" formats, and then every "read" operation would read
   > > both of them, and compare that they return the same data.  Then have
   > > obnam complain if that doesn't match (with the "current" format's data
   > > then taking precedence, I think).  Obviously, that will add a lot of
   > > overhead for every repository access, but any users willing to take that
   > > can then easily help testing the "Green Albatross" format.  But I don't
   > > know if the repository abstraction in obnam is sufficiently generic for
   > > being able to deal with two underlying repository formats, etc.
   >
   > I fear it'd make things so slow nobody would actually use this, even
   > for testing. It'd mean obnam was making two backups at the same time.
   >
   
   You could start two obnam instances in parallel :D
From: Lars Wirzenius <liw@liw.fi>
Date: Wed, 8 Feb 2017 14:58:24 +0200

   On Wed, Feb 08, 2017 at 12:34:07PM +0100, Thomas Schwinge wrote:
   > One idea I just had -- but I don't know if that is feasible to implement:
   > you could (internally!) have obnam "write" both the "current" and the
   > "Green Albatross" formats, and then every "read" operation would read
   > both of them, and compare that they return the same data.  Then have
   > obnam complain if that doesn't match (with the "current" format's data
   > then taking precedence, I think).  Obviously, that will add a lot of
   > overhead for every repository access, but any users willing to take that
   > can then easily help testing the "Green Albatross" format.  But I don't
   > know if the repository abstraction in obnam is sufficiently generic for
   > being able to deal with two underlying repository formats, etc.
   
   I fear it'd make things so slow nobody would actually use this, even
   for testing. It'd mean obnam was making two backups at the same time.
From: Jan Niggemann <jn@hz6.de>
Date: Wed, 08 Feb 2017 15:39:14 +0100

   Quoting Thomas Schwinge <thomas@codesourcery.com>:
   
   > Hi Lars!
   >
   > On Sat, 21 Jan 2017 16:45:32 +0200, Lars Wirzenius <liw@liw.fi> wrote:
   >> I intend to get the new Obnam repository format, GREEN ALBATROSS,
   >> finished this year.
   >
   > :-)
   >
   >
   > One idea I just had -- but I don't know if that is feasible to implement:
   > you could (internally!) have obnam "write" both the "current" and the
   > "Green Albatross" formats, and then every "read" operation would read
   > both of them, and compare that they return the same data.  Then have
   > obnam complain if that doesn't match (with the "current" format's data
   > then taking precedence, I think).  Obviously, that will add a lot of
   > overhead for every repository access, but any users willing to take that
   > can then easily help testing the "Green Albatross" format.  But I don't
   > know if the repository abstraction in obnam is sufficiently generic for
   > being able to deal with two underlying repository formats, etc.
   
   That's exactly what I started doing a couple of weeks ago: I back up
   to a repo using the current stable format and then re-run the same
   backup using GREEN ALBATROSS...
   I don't take measurements, though; I just wanted to know how the new
   format "feels"...
   
   Jan
From: Alexander Batischev <eual.jp@gmail.com>
Date: Fri, 26 May 2017 22:28:01 +0300

   Hi!
   
   On Sat, Jan 21, 2017 at 04:45:32PM +0200, Lars Wirzenius wrote:
   >Also, if anyone would like to try GA now, that'd be quite welcome. […] 
   >If you do try, I'd be curious to hear about your experiences, and what 
   >your live data is like (number of files, total data, how much changes 
   >between backups, etc).
   
   I switched from the default format to Green Albatross immediately after 
   reading this.
   
   I have `deduplicate` set to `verify`.
   
   My live data consists of around 100k files, which amount to 4 to 10 
   gigabytes in total (sometimes a backup runs before I've had a chance to 
   offload new photos to git-annex, so they bloat the repo a bit). 
   
   Obnam usually uploads around half a gigabyte of data, 80% of which is 
   claimed to be overhead. This seems about right as I don't change that 
   many files.
   
   The backup is performed from an internal SSD onto an external USB 2.0 HDD. 
   Both volumes are encrypted and I have an old CPU. Average speed is 
   around 5 MB/sec and it takes about two minutes to do the backup. Fast 
   enough for me; IIRC this is faster than the old format.
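
As a rough sanity check, the figures above are consistent with each other. The snippet below just redoes the arithmetic; it takes "around half a gigabyte" as 500 MB, which is an assumption, and the variable names are illustrative only.

```python
uploaded_mb = 500            # "around half a gigabyte", assumed to mean 500 MB
overhead_fraction = 0.80     # share of the upload reported as overhead
throughput_mb_per_s = 5.0    # reported average speed

# Portion of the upload that is actual file data rather than overhead.
file_data_mb = uploaded_mb * (1 - overhead_fraction)

# Time needed to move the whole upload at the reported speed.
transfer_seconds = uploaded_mb / throughput_mb_per_s
transfer_minutes = transfer_seconds / 60
```

That gives roughly 100 MB of real file data and a transfer time of 100 seconds, which fits the reported "about two minutes" per backup.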
   
   Obnam hung once while forgetting old generations. My `keep` setting is 
   `5d,104w,60m,100y` and I do backups every 3 days, so if it were a bug, 
   I'd see it much more often. I also tend to run low on free memory, so 
   that might be the cause too. And yes, 100y; I don't plan to die.
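
To illustrate how a `keep` value like the one above might select generations, here is a simplified sketch. It assumes that each rule keeps the newest generation from each of the N most recent periods that contain a backup; this is not taken from Obnam's source, and `parse_keep` and `generations_to_keep` are hypothetical helper names.

```python
from datetime import datetime, timedelta

# Map each unit to a function that names the period a timestamp falls in.
BUCKETS = {
    "d": lambda ts: (ts.year, ts.month, ts.day),
    "w": lambda ts: tuple(ts.isocalendar())[:2],  # (ISO year, ISO week)
    "m": lambda ts: (ts.year, ts.month),
    "y": lambda ts: (ts.year,),
}


def parse_keep(policy):
    """Turn '5d,104w' into [(5, 'd'), (104, 'w')]."""
    return [(int(rule[:-1]), rule[-1]) for rule in policy.split(",")]


def generations_to_keep(timestamps, policy):
    """Return the set of timestamps the (simplified) policy would retain."""
    keep = set()
    for count, unit in parse_keep(policy):
        newest_in_bucket = {}
        for ts in sorted(timestamps):
            newest_in_bucket[BUCKETS[unit](ts)] = ts  # later timestamp wins
        # Keep the newest generation from the `count` most recent periods.
        keep.update(sorted(newest_in_bucket.values())[-count:])
    return keep


# Ten backups, one every 3 days, starting on 2017-01-01.
start = datetime(2017, 1, 1)
backups = [start + timedelta(days=3 * i) for i in range(10)]
kept = generations_to_keep(backups, "5d,104w,60m,100y")
dropped = sorted(set(backups) - kept)
```

Under these assumptions, only generations that are neither among the five most recent daily ones nor the newest of their ISO week get dropped, so with backups every 3 days a forget run has little to remove each time.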
   
   The only pain point right now is `obnam verify`. It takes around an hour 
   to run the verification. I do it a couple of times a month, right after the 
   backup step, and it has never turned up anything I didn't expect (sometimes 
   I lose patience and start using the computer before verification is 
   done).
   
   Overall my experience with Green Albatross has been a positive one, and 
   at least for data sets as small as mine, I consider it to be ready for 
   daily use. Thanks to Lars and contributors for excellent work!