[rfc] Passphrase-based encryption


From: Wladimir Palant <gtiobnam@palant.de>
Date: Mon, 3 Jul 2017 00:14:44 +0200

   Hi,
   
   with GPG being great and all that, I'd still prefer having the option to 
   use a plain passphrase and AES encryption with obnam. IMHO, this 
   approach has two advantages:
   
   * Considerably simpler setup, you merely need to come up with a 
   high-entropy passphrase.
   * Much easier to back up - you don't need to worry about losing the 
   passphrase due to a hard drive crash. If you are afraid of forgetting 
   it, then writing it down and keeping it somewhere safe will do.
   
   It's particularly that second point which is important to me: if the GPG 
   key / passphrase used to encrypt my backup is lost, the backup becomes 
   completely useless. With GPG I need to back up the encryption key 
   separately, and doing so securely tends to be rather complicated.
   
   Sure, passphrases usually won't have the 256 bits of entropy necessary 
   to take full advantage of AES-256. However, this doesn't matter much as 
   long as they aren't easily guessable and a good (meaning slow) key 
   derivation algorithm is used.
   
   I'm currently trying out the following simple plugin to implement 
   encryption via passphrase (please don't comment on code quality, this 
   hasn't been polished):
   
   > import hashlib
   > import os
   > 
   > import obnamlib
   > from Crypto.Cipher import AES
   > 
   > class EncryptionPlugin(obnamlib.ObnamPlugin):
   >     def enable(self):
   >         self.tag = "encaespw"
   > 
   >         # There doesn't appear to be any "canonical" way to derive an AES key
   >         # from a passphrase. There is OpenSSL's enc tool but it uses a very
   >         # weak key derivation function (details under
   >         # https://security.stackexchange.com/a/29139/4778). So let's just use
   >         # PBKDF2 with a high number of iterations.
   >         # .get() avoids a bare KeyError when the variable is unset
   >         passphrase = os.environ.get('PASSPHRASE')
   >         if not passphrase:
   >             raise Exception('No encryption passphrase given')
   > 
   >         self.key = hashlib.pbkdf2_hmac('sha256', passphrase, 'aes key',
   >                                        256 * 1024, dklen=32)
   >         self.app.hooks.add_callback('repository-data', self, obnamlib.Hook.LATE_PRIORITY)
   > 
   > 
   >     def filter_read(self, encrypted, repo, toplevel):
   >         iv = encrypted[0:16]
   >         return AES.new(self.key, AES.MODE_CFB, iv).decrypt(encrypted[16:])
   > 
   > 
   >     def filter_write(self, cleartext, repo, toplevel):
   >         iv = os.urandom(16)
   >         return iv + AES.new(self.key, AES.MODE_CFB, iv).encrypt(cleartext)
   
   It works nicely and IMHO similar functionality could be added to the 
   official distribution. Notes:
   
   * The passphrase is being passed in via an environment variable rather 
   than command line parameters. While I am not a Linux expert, it's my 
   understanding that this is a more secure approach - the command line can 
   be seen by other users on the same computer, environment variables IMHO 
   cannot be accessed.
   
   * In my setup, the passphrase is mandatory (I don't want to create an 
   unencrypted backup by mistake). In the official encryption plugin, there 
   would rather be a command line option like 
   --encryption-backend=passphrase to enable passphrase-based encryption. 
   Also, the key size doesn't have to be hardcoded at 32 bytes (meaning 
   AES-256), there can be an additional option like 
   --encryption-algo=aes-128 to allow other key sizes.
   
   * I am currently using a hardcoded salt for PBKDF2. While not 
   particularly bad (only relevant if a large number of encrypted obnam 
   backups is being accessed by an unauthorized party), this isn't optimal 
   either. One solution would be having a random salt for each file, but 
   this would require deriving an individual key for each file and degrade 
   performance. The other solution would be generating a unique random salt 
   for each repository. This would create a single point of failure, 
   however: if the file storing that random salt gets corrupted, the entire 
   backup becomes unusable.
   
   * The current encryption plugin will use /dev/random rather than 
   /dev/urandom by default. This precaution might be justified when 
   generating encryption keys, yet I'm only calling os.urandom() to 
   generate the initialization vector. With a new initialization vector 
   being generated for each encrypted file, polling /dev/random might be 
   too slow here. Also, randomness of initialization vectors isn't as 
   critical and doesn't justify such measures IMHO.
   
   Any comments? I can write a patch if the general direction is approved.
   
   regards
   Wladimir
   
   _______________________________________________
   obnam-dev mailing list
   obnam-dev@obnam.org
   http://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/obnam-dev-obnam.org
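
The per-repository salt idea from the message above could look roughly like the following. This is a minimal sketch in Python 3 using only the standard library, not part of the posted plugin: the helper names `load_or_create_salt` and `derive_key`, the 16-byte salt length, and the salt file location are all assumptions made for illustration. The iteration count is the one used in the plugin, and `key_bytes` stands in for the suggested key-size option.

```python
import hashlib
import os

SALT_BYTES = 16          # assumed salt length, not taken from the plugin
ITERATIONS = 256 * 1024  # same iteration count as the posted plugin


def load_or_create_salt(path):
    """Return the repository salt stored at path, creating it on first use."""
    try:
        with open(path, "rb") as f:
            salt = f.read()
    except FileNotFoundError:
        salt = os.urandom(SALT_BYTES)
        # Mode "xb" fails if the file already exists, so an existing
        # salt is never silently overwritten.
        with open(path, "xb") as f:
            f.write(salt)
    if len(salt) != SALT_BYTES:
        raise ValueError("salt file has unexpected length")
    return salt


def derive_key(passphrase, salt, key_bytes=32):
    """Derive an AES key with PBKDF2-HMAC-SHA256 (32 bytes means AES-256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=key_bytes
    )
```

Since a salt is not secret, one way to soften the single-point-of-failure concern would be to keep a copy of it wherever the passphrase is written down.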
From: Lars Wirzenius <liw@liw.fi>
Date: Mon, 3 Jul 2017 08:05:40 +0300

   On Mon, Jul 03, 2017 at 12:14:44AM +0200, Wladimir Palant wrote:
   > Hi,
   > 
   > with GPG being great and all that, I'd still prefer having the option to use
   > a plain passphrase and AES encryption with obnam. IMHO, this approach has
   > two advantages:
   > 
   > * Considerably simpler setup, you merely need to come up with a high-entropy
   > passphrase.
   > * Much easier to back up - you don't need to worry about losing the
   > passphrase due to a hard drive crash. If you are afraid of forgetting it,
   > then writing it down and keeping it somewhere safe will do.
   
   If you want this, you should write a plugin that adds symmetric
   encryption in addition to the PGP-based one that Obnam currently
   provides. You should probably do it by only encrypting the symmetric
   encryption key that PGP encrypts. This would allow PGP and symmetric
   to be used on the same repo by different clients.
   
   I am afraid, however, that I am unlikely to accept the plugin into
   Obnam proper, since I don't think it makes things better. It's true
   that it will probably be easier to set up, but at the cost of more
   difficult key management.
   
   Backing up small files such as PGP keys is so easy I don't agree with
   that part of your argument. It's a matter of a few kilobytes. You
   could put the key into a QR code and print it on paper.
   
   Also, environment variables can be read by other processes, just like
   command line arguments can be. See /proc/*/environ. The environ files
   are only readable by the owner, but it's still not a way to pass
   secrets, in my opinion. Defense in depth, and all that.
   
   > * The current encryption plugin will use /dev/random rather than
   > /dev/urandom by default.
   
   Since 1.20 (October 2010) the default is /dev/urandom.
From: Wladimir Palant <gtiobnam@palant.de>
Date: Mon, 3 Jul 2017 09:48:45 +0200

   On 03.07.2017 07:05, Lars Wirzenius wrote:
   > If you want this, you should write a plugin that adds symmetric
   > encryption in addition to the PGP-based one that Obnam currently
   > provides. You should probably do it by only encrypting the symmetric
   > encryption key that PGP encrypts. This would allow PGP and symmetric
   > to be used on the same repo by different clients.
   
   Not really worth it as long as I'm the only one using that plugin, I'd 
   rather stay with my simple approach then.
   
   > I am afraid, however, that I am unlikely to accept the plugin into
   > Obnam proper, since I don't think it makes things better. It's true
   > that it will probably be easier to set up, but at the cost of more
   > difficult key management.
   
   No problem, if it isn't a good match for the overall concept then so be it.
   
   > Backing up small files such as PGP keys is so easy I don't agree with
   > that part of your argument. It's a matter of a few kilobytes. You
   > could put the key into a QR code and print it on paper.
   
   My thought was rather to encrypt it with a passphrase and store it next 
   to the actual backup. Doing this correctly turned out non-trivial, with 
   both GPG's own passphrase encryption and OpenSSL's enc tool using 
   suboptimal key derivation to say the least.
   
   > Also, environment variables can be read by other processes, just like
   > command line arguments can be. See /proc/*/environ. The environ files
   > are only readable by the owner, but it's still not a way to pass
   > secrets, in my opinion. Defense in depth, and all that.
   
   There aren't too many ways to pass secrets and AFAIK none of them will 
   protect against other processes running with the same privileges. For 
   example, you could require the passphrase to be stored in a file 
   readable only by the owner - but this protection will be equivalent to 
   the way /proc/*/environ is protected (or GPG keys for that matter).
   
   regards
   Wladimir
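
The file-based alternative mentioned above (a passphrase file readable only by its owner) can be sketched as follows. This is an illustrative Python 3 example, not code from Obnam or from the plugin; the function name `read_passphrase` is hypothetical. As noted in the message, this offers protection equivalent to that of /proc/*/environ rather than anything stronger.

```python
import os
import stat


def read_passphrase(path):
    """Read a passphrase from a file, refusing files open to group or others."""
    with open(path, "r", encoding="utf-8") as f:
        # Check the permissions of the file that was actually opened.
        mode = os.fstat(f.fileno()).st_mode
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(
                "%s must not be accessible by group or others" % path
            )
        passphrase = f.read().rstrip("\n")
    if not passphrase:
        raise ValueError("passphrase file is empty")
    return passphrase
```

A file created with `chmod 600` passes the check, while one with mode 644 is rejected.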
   
From: Wladimir Palant <gtiobnam@palant.de>
Date: Mon, 3 Jul 2017 21:16:35 +0200

   On 03.07.2017 20:29, Henri Sivonen wrote:
   > If you don't need AES specifically, you can find an XSalsa20+Poly1305
   > implementation at:
   > https://github.com/hsivonen/obnam/compare/salsa?expand=1
   
   Interesting, thank you for sharing. This is way more advanced than my 
   quick and dirty plugin of course.
   
   > I haven't had the time to write proper unit tests, benchmarks or docs,
   > which is why I haven't tried upstreaming it.
   
   Unfortunately, I assume that the arguments against upstreaming my 
   solution apply to yours just as well - so even with tests, benchmarks 
   and docs it won't get accepted.
   
   > Probably more important than letting users tweak the key size is to
   > make sure that the AEAD construction is good and suitable for use with
   > a randomly-generated nonce for the amount of data one would expect to
   > encrypt using Obnam. I don't know if CFB fits this, but
   > XSalsa20+Poly1305 or XChaCha20+Poly1305 should (the non-X variants of
   > Salsa20 and ChaCha20 *don't*).
   
   CFB uses initialization vectors (randomly generated for each file in my 
   case) which I think serve a similar purpose. But I'm not really familiar 
   with either Salsa20 or ChaCha20 so I would be grateful if you could 
   expand. What kind of issues is this about? Are you implying that these 
   algorithms would be better performance-wise? I don't really know how 
   they compare to AES but at least for me the performance is clearly 
   limited by the uplink and not by the CPU. In other scenarios it could be 
   completely different of course.
   
   regards
   Wladimir
   
From: Henri Sivonen <hsivonen@hsivonen.fi>
Date: Mon, 3 Jul 2017 21:29:53 +0300

   On Mon, Jul 3, 2017 at 1:14 AM, Wladimir Palant <gtiobnam@palant.de> wrote:
   > with GPG being great and all that, I'd still prefer having the option to use
   > a plain passphrase and AES encryption with obnam.
   
   If you don't need AES specifically, you can find an XSalsa20+Poly1305
   implementation at:
   https://github.com/hsivonen/obnam/compare/salsa?expand=1
   
   (It was written before libsodium had XChaCha20.)
   
   I haven't had the time to write proper unit tests, benchmarks or docs,
   which is why I haven't tried upstreaming it.
   
   > --encryption-algo=aes-128 to allow other key sizes.
   
   Probably more important than letting users tweak the key size is to
   make sure that the AEAD construction is good and suitable for use with
   a randomly-generated nonce for the amount of data one would expect to
   encrypt using Obnam. I don't know if CFB fits this, but
   XSalsa20+Poly1305 or XChaCha20+Poly1305 should (the non-X variants of
   Salsa20 and ChaCha20 *don't*).
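
An AEAD construction such as XSalsa20+Poly1305 authenticates the data as well as encrypting it, whereas CFB mode on its own does not detect modified ciphertext. The sketch below illustrates a different, simpler technique, encrypt-then-MAC, using only the Python 3 standard library. It is not what either plugin in this thread does. The names `seal` and `unseal` are hypothetical, the blob is treated as opaque bytes (for example IV plus ciphertext), and `mac_key` is assumed to be a second key independent of the encryption key.

```python
import hashlib
import hmac

TAG_BYTES = 32  # size of an HMAC-SHA256 tag


def seal(mac_key, blob):
    """Append an HMAC-SHA256 tag computed over the already-encrypted blob."""
    tag = hmac.new(mac_key, blob, hashlib.sha256).digest()
    return blob + tag


def unseal(mac_key, sealed):
    """Verify the tag and return the blob; raise if the data was modified."""
    if len(sealed) < TAG_BYTES:
        raise ValueError("data too short to contain a tag")
    blob, tag = sealed[:-TAG_BYTES], sealed[-TAG_BYTES:]
    expected = hmac.new(mac_key, blob, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return blob
```

The tag is checked before any decryption takes place, so tampered data is rejected without being processed further.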
From: Henri Sivonen <hsivonen@hsivonen.fi>
Date: Tue, 4 Jul 2017 09:41:32 +0300

   On Mon, Jul 3, 2017 at 10:16 PM, Wladimir Palant <gtiobnam@palant.de> wrote:
   > On 03.07.2017 20:29, Henri Sivonen wrote:
   >> Probably more important than letting users tweak the key size is to
   >> make sure that the AEAD construction is good and suitable for use with
   >> a randomly-generated nonce for the amount of data one would expect to
   >> encrypt using Obnam. I don't know if CFB fits this, but
   >> XSalsa20+Poly1305 or XChaCha20+Poly1305 should (the non-X variants of
   >> Salsa20 and ChaCha20 *don't*).
   >
   >
   > CFB uses initialization vectors (randomly generated for each file in my
   > case) which I think serve a similar purpose. But I'm not really familiar
   > with either Salsa20 or ChaCha20 so I would be grateful if you could expand.
   > What kind of issues is this about?
   
   If the nonce has too few bits, the probability of nonce reuse is more
   than negligible for randomly-generated nonces. The X in XSalsa20 and
   XChaCha20 stands for eXtended nonce: A nonce that's long enough that
   the probability of nonce reuse with randomly-generated nonces is
   considered negligible. XSalsa20 uses a 192-bit nonce. Salsa20 uses a
   64-bit nonce.
   
   A 192-bit nonce is considered long enough in order for it to be OK to
   generate the nonce simply by pulling the bits out of a random number
   generator while a 64-bit nonce is too short for that to be OK. I now
   fail to find a good paper that would explain why 192 bits is
   considered enough and how bad 128-bit nonces are, but it is a matter
   of probability. (I can't recall how the probability threshold for
   "negligible" is chosen.)
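
The "matter of probability" can be made concrete with the standard birthday-bound approximation: for n random nonces of b bits each, the chance of at least one repeat is roughly 1 - exp(-n(n-1) / 2^(b+1)). The Python 3 sketch below evaluates it for 64-, 128- and 192-bit nonces. The figure of 2^32 encrypted messages is an arbitrary number chosen for illustration, not an estimate of what Obnam produces, and the function name is hypothetical.

```python
import math


def collision_probability(nonce_bits, messages):
    """Approximate chance that at least two random nonces coincide."""
    exponent = messages * (messages - 1) / 2 ** (nonce_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


for bits in (64, 128, 192):
    print(bits, collision_probability(bits, 2 ** 32))
```

Under this assumption a 64-bit nonce collides with a probability of about 39%, while the 128-bit and 192-bit cases come out many orders of magnitude smaller. For comparison, the IV used with AES in the plugin earlier in the thread is 16 bytes, that is 128 bits.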
   
   > Are you implying that these algorithms
   > would be better performance-wise?
   
   At least ChaCha20 outperforms AES in the absence of hardware support
   for AES (such as Intel AES-NI).
   https://www.imperialviolet.org/2013/10/07/chacha20.html