Luckily we were able to restore the data, but now I (we) really want to learn what a proper setup would look like.
If you have any clear overview reading on the topic I'd be very interested to know about it.
In particular I'm wondering: how do you back up your encryption keys, or even put them in escrow somewhere? Assuming we don't rotate the keys constantly I would love to just save them in something like a password manager that's secured with 2FA/FIDO.
Would love to hear your thoughts!
We use Hashicorp's Vault product to manage SSH credentials, TLS certificates, as well as application secrets across thousands of users, tens of thousands of virtual machines, and hundreds of applications.
We pay for the enterprise version, but the free version is more than capable for most needs. Avoid a password manager if you can; it leads to poor security practices and availability issues depending on how and where the data is stored (IMHO, YMMV). Use single sign-on wherever possible.
Disclaimer: No affiliation other than a satisfied customer and frequent end user of the product.
1. Buy a bunch of YubiKeys, minimum of 2.
2. Create GPG keys and store them on YubiKeys. Follow this guide: https://github.com/drduh/YubiKey-Guide (if you want to, keep the secret keys, but in case of multiple YubiKeys I would not keep them anywhere). Remember to set the keys to always require touch.
3. Use GPG to encrypt your backups to multiple recipients (all of the YubiKeys).
4. Take care of the physical keys with proper storage and procedures. Do not store the keys together, have at least one in a really secure location, check if you have all the keys regularly, etc.
5. Test restores at least once per quarter, with a randomly selected key.
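Steps 3 and 5 could be scripted roughly like this. It's only a minimal sketch, assuming `gpg` is on the PATH; the key IDs, file names and helper names are made up for illustration and are not from the guide linked above.

```python
import random
import subprocess

# Hypothetical key IDs, one per YubiKey; replace with your own fingerprints.
YUBIKEY_RECIPIENTS = ["0xAAAA1111", "0xBBBB2222", "0xCCCC3333"]


def build_encrypt_command(source, dest, recipients):
    """Build a gpg command line that encrypts `source` to every recipient."""
    if len(recipients) < 2:
        raise ValueError("encrypt to at least two keys so losing one is survivable")
    cmd = ["gpg", "--batch", "--yes", "--output", dest]
    for key_id in recipients:
        cmd += ["--recipient", key_id]
    cmd += ["--encrypt", source]
    return cmd


def pick_restore_test_key(recipients, rng=random):
    """Pick one key at random for the quarterly restore test."""
    return rng.choice(recipients)


def encrypt_backup(source, dest, recipients=YUBIKEY_RECIPIENTS):
    """Run gpg; raises CalledProcessError if encryption fails."""
    subprocess.run(build_encrypt_command(source, dest, recipients), check=True)
```

Any one of the listed keys can then decrypt the backup, which is what makes the random restore test in step 5 meaningful.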
The advantages of this solution are that it is simple, works pretty well, and gets you a lot of mileage with relatively little inconvenience. You don't have the risk of keys being copied, and guarding physical keys is easier than digital ones.
You still have the problem of guarding the passphrases to the YubiKeys (if you use them), but that is much less of a problem than guarding the encryption keys. A passphrase without the physical key is useless.
This setup works for organizations from size 1 up to fairly large ones.
Note that some recently fashionable security consultants crap on GPG from great height, but do not provide an alternative. It's a tool that while having multiple flaws, does many jobs better than anything else out there.
Have an offline backup printed along with the disaster recovery checklist and documentation and put them in a safe in your company - the checklist should be dumb enough that your drunk self can use it at four in the morning, because you were the nearest employee when everything went down.
Ensure that you have stupid manual processes in place on rotation of the safe's PIN and encryption keys in general, including a sanity check if the newly generated keys actually work (e.g. if they are used for your backup storage, actually back something up and restore it). Ensure that the safe's PIN is available to at least another person and used regularly (e.g. because you store your backup tapes there).
If you feel that you need to change from this very simple system to a more complex one, ask yourself why. What does your change actually add in terms of security, and what risks does it add?
In the end, you want your system available to customers and the security you add is to not only secure the data, but actually to know who can access it (the auditing part).
- All "hot" keys were stored in an offline credential manager in specific vaults depending on who needed access to them. Only staff with actual clearance could request temporary access to a vault (fully background checked, 1 year employment, etc).
- Copies of each vault and our master CA cert were written to 4 encrypted USB sticks. Two stored on-site in the fire-safe and two off-site at our safety deposit box that only c-level staff could access. (We had the same process with our tokens and master logins for AWS).
- Any work using those keys was on a pair-up basis, so at least two people, one doing the work and the other observing.
- We had a detailed policy around this that covered each step in the process and who needs to approve them; everyone who could feasibly need to access the keys was briefed annually as part of our security awareness training.
We handled a LOT of sensitive financial data, so this was the most appropriate way that we could find that maintained both sensible availability and key control.
So in order to get to the keys you needed:
- Access to the fire safe (Senior Ops, Senior Security and C-Level only).
- The LUKS passphrase for the USB sticks (Senior Security and some C-Level only).
- The passphrase for the specific vault (Senior Security and some C-Level only).
I don't know how the passphrases were managed by our sec team, but I know that the C-Level staff had physical envelopes in their home safes.
Accessible over USB or HTTP, it supports every major crypto algorithm [1], and keys can be backed up onto another HSM via a wrap key (if they are marked as exportable -- you can also control what can and cannot be exported -- in fact, every operation may be allowed or disallowed per key).
Every operation is logged for audit, of course, and the device may be set up to require logs to be read before they are overwritten. In combination with configuring a special authentication key to access the logs, you can ensure that every operation on the HSM is logged to a remote store before additional operations may be completed.
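To make the "logs must be read before they are overwritten" behaviour concrete, here is a toy model of it. This is not the YubiHSM API; the class and method names are hypothetical and it only illustrates the blocking behaviour described above.

```python
from collections import deque


class AuditLogFull(Exception):
    """Raised when an operation would overwrite an unread log entry."""


class ForcedAuditDevice:
    """Toy device with a fixed-size log that refuses to overwrite unread entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._unread = deque()
        self._next_seq = 1

    def perform(self, operation):
        """Log and 'perform' an operation, or refuse if the log is full."""
        if len(self._unread) >= self.capacity:
            raise AuditLogFull("drain the log before performing more operations")
        seq = self._next_seq
        self._unread.append((seq, operation))
        self._next_seq += 1
        return seq

    def drain(self, remote_store):
        """Copy unread entries to the remote store, then free their slots."""
        count = 0
        while self._unread:
            remote_store.append(self._unread[0])  # ship the entry first...
            self._unread.popleft()                # ...then release the slot
            count += 1
        return count
```

With a capacity of 2, a third operation fails until something drains the log to the remote store, so no operation can go unrecorded.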
It does depend on your existing physical security, so that has to be taken into account when designing architectures including it. The micro form factor at least makes it trivial to put into an internal USB port.
And of course, if you require a more enterprise grade tool, you may want to use an HSM in combination with a tool like Hashicorp Vault to manage your keys throughout your organization.
[1] https://developers.yubico.com/YubiHSM2/Product_Overview/
FYI: Hashicorp vault just uses Shamir's Secret Sharing scheme under the hood: https://github.com/hashicorp/vault/blob/45b9f7c1a53506dc9722...
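For anyone who hasn't seen Shamir's scheme, here is a minimal sketch of the idea. It is not Vault's code and Vault's implementation differs in detail; this version works on a single integer modulo a prime, and the function names are my own.

```python
import secrets

# A Mersenne prime comfortably larger than a 256-bit secret.
PRIME = 2**521 - 1


def split_secret(secret, threshold, shares):
    """Split an integer secret into `shares` points; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 (needs Python 3.8+)."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Usage would be something like `shares = split_secret(key_as_int, 3, 5)` and later `recover_secret(shares[:3])`; fewer than the threshold number of shares reveals nothing about the secret.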
* The pricing for just storing keys is incredibly cheap.
* At least with Google KMS you can't delete the keys without a 24 hour waiting period (and you can alert on the deletion attempt), so that's a huge safeguard.
* You get key access auditing out of the box.
In addition to it, we use envwarden[1], which is a simple open-source wrapper around the Bitwarden CLI to manage our server secrets. It's super simple, but does the job for us well. We can then manage both passwords and keys in one place.
Disclaimer: I created envwarden. I'm not affiliated with Bitwarden in any way however. Just a happy customer.
I first wrote about it back in 2017 (1) and we released an open framework for multiple languages/frameworks (2).
1: https://neosmart.net/blog/2017/securestore-a-net-secrets-man...
2: https://neosmart.net/blog/2020/securestore-open-secrets-form...
It's designed to integrate really well with your existing CLI tools like vim, xargs, and diff. It offers user-based permissions, and secrets are encrypted into a single file that's safe to commit into your git repo. We can stream secrets out of it directly to our remote servers during deploys.
Unlike Vault you don't need to manage infra to run it -- it's just a file. Unlike cloud secret managers, there's no lock-in.
Application repos have the encrypted secrets (meta) stored in their repos using Ansible-vault and the .vault_keys are stored in Bitwarden.
Usage includes SSH authentication, file encryption (backups and exchanges), git commit signatures and password/secret storage using `pass`.
Copies of the offline master keys are stored on flash in safes onsite and offsite in bank vaults, and sub-keys are valid for one year.
We use Hashicorp's Vault for secrets that require automated access.
Utilized globally by individual developers and large enterprises such as JPMorgan Chase[1], and integrated into KMS services such as Google Cloud's[2].
1. https://venturebeat.com/2019/02/27/ionic-security-raises-40-...
2. https://cloud.google.com/blog/products/identity-security/clo...
This was early in the “cloud” epoch and at any rate they preferred in-house iron for entirely understandable reasons. Also, they needed to do some pretty interesting things, so they had a NAS populated with enterprise-grade SSDs and a native AES-based encryption scheme. This kept the key on what was basically a glorified USB key.
(For those of you who wish to know these details, the network between the blades and between the blades and the NAS was a nicely spec’d fibre channel network, and there was some iSCSI involved. The blades featured Itanium processors, which kind of gives the manufacturer away, and the firm had invested quite heavily in producing very high-performance code for those ill-fated microprocessors, but I digress.)
So... it happened that somebody lost the USB key. Well, not quite. Somebody took it home during a weekend (whilst the system was shut down) and their kid used it for school work.
This proved to be a “significant problem”. There was a backup, and it was encrypted and stored on an adjacent SAN. It wasn’t exactly stale, but it wasn’t entirely pristine either.
There was much woe and gnashing of teeth.
Nobody was fired because the dolt who maintained custody of the only USB key was the founder/CEO, so he couldn’t exactly blame himself.
But, yeah. That happened, sadly.
One threat model might be: Burglars sneaking in during the night and stealing the hard drives. Then you would store the keys in a different location than the disks.
Then you make it a routine to reboot parts of the fleet, like scheduled simulation/training so that everyone knows what to do when you actually need them.
In general I would suggest using a key vault. AWS, GCP, and Azure all have cloud versions that are backed by virtual HSMs built on top of actual physical HSMs. For the vast majority of usages they're good enough. Use admin account management to enforce 2FA/FIDO for all AWS/Azure/GCP logins. (You should be enforcing 2FA with phone/FIDO auth anyway.)
If you need truly paranoid backups, you can back the key up onto a portable hard drive that you lock in a safe in the closet, with a few key people who know the code.
I recommend against using a (cloud-synced) password manager. Cloud key vaults do the same thing but offer specific features relevant to server stuff. And if you want more paranoia, a physical safe is probably safer than extending your attack surface to a cloud-synced password manager.
Also: make sure that you set up a ~quarterly ritual of opening and verifying the backup. For crucial backup fallback systems, you want to make sure you actually use the system so that you know if it fails.
Poorly seems to be the answer industry wide. Both encryption and disaster recovery are hard tasks. Combine them and you have a recipe for a total mess.
Think of a lighter vault, with ACLs for people and/or machines to access keys and versioning to rotate keys.
In our case, Knox depends upon AWS KMS to "lock/unlock" its storage.
For context: We run a centralised salt-master; the salt master decrypts content using gpg filters as part of variable generation (salt "pillars"). So it's encrypted at rest and encrypted in our git repositories.
What we do/did, is:
* grab a pair of differently branded USB sticks.
* LUKS encrypt the USB sticks; we used a keyfile which is encrypted on our machines with our GPG key.
* encrypt salt's GPG private key with all of our keys.
* encrypt some of the "irreplaceable" private keys (IE: CA roots) with all of our GPG keys.
* store it all on the pair of USB drives
* put the USB drives in a real-life vault, give the keys to the office team.
We haven't needed to recover, but it's clearly documented how to recover if anything went wrong.
1) If you don't need it - don't store it. If you need it - but it needs to be encrypted - probably! don't store it.
2) Think one way hashes with salts, think deletion policies, rotations, small disk drives.
3) You will get owned!!
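On point 2, a salted one-way hash can be done with the Python standard library alone. A minimal sketch: the iteration count is an assumption to tune for your hardware, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_secret(secret, salt=None, iterations=ITERATIONS):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_secret(secret, salt, expected, iterations=ITERATIONS):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_secret(secret, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

You store only the salt and the digest, so there is nothing to decrypt if the database leaks.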
LDAP is ubiquitous enough as an auth method (how do you auth to Vault? You auth to LDAP with it...) that any service you run or use is likely to speak it.
Why this isn't done more often is a mystery to me and probably the number 1 source of credentials being baked into things accidentally: oh we need a service account into the Might have answered my own question there though.
It's been quite a few years since I interacted with them, but for some keys there is a server somewhere with an HSM installed, and two people have credentials for it. If you need something signed you send it to them, with a justification for why it needs to be signed with the real keys, and they will send you a path to get the signed file, and remind you to delete the signed file when you no longer need it.
This is overkill for some things, and probably would be considered sloppy for others.
For other parts, HashiCorp Vault/AWS KMS.
For standard password style secrets used by Ops, we use Team Password Manager, which we chose about 5 years ago because it was self-hosted, the database was encrypted, and it had fantastic audit capabilities.
There are different CAs for different purposes. There's an intermediate for device management, and another for user or service auth purposes.
For anyone who needs a super simple place to store their encryption keys that works with Heroku and has versioning, I think Doppler could help. It doesn’t have all the fancy (and really cool) features of KMS as it’s designed to be a kv store for secrets, but it could be helpful. We have a free tier for anyone who wants to try it out. https://doppler.com
Signing and authentication keys are expendable but encryption keys are worth keeping even after they've been rotated since decryption of existing data may be necessary.
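One way to picture this retention rule is a key ring that only ever encrypts with the newest key but never forgets old ones. This is a hypothetical sketch of the bookkeeping only (no cipher is included), and the class and method names are made up.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class EncryptionKeyRing:
    """Keeps every key version; only the newest is used for new data."""

    versions: dict = field(default_factory=dict)
    current: int = 0

    def rotate(self):
        """Generate a new 256-bit key and make it the current version."""
        self.current += 1
        self.versions[self.current] = secrets.token_bytes(32)
        return self.current

    def key_for_encrypt(self):
        """Return (version, key) for new data; always the latest version."""
        if self.current == 0:
            raise LookupError("no key generated yet; call rotate() first")
        return self.current, self.versions[self.current]

    def key_for_decrypt(self, version):
        """Return the key for old data; retired versions remain available."""
        return self.versions[version]
```

Each ciphertext would need to be stored alongside the version number it was encrypted with, so the right retired key can be looked up later.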
The key can be printed on paper and stored in a physical safe. Paper isn't a high density storage medium but it is remarkably durable and perfect for small amounts of data such as encryption keys. It also counts as an offline backup.
Keys can also be printed as QR codes. They support error correction and enable automatic data restoration. Even 4096 bit RSA keys fit in a binary mode QR code and the smaller ECC keys allow use of high error correction modes, making the data even more durable.
I wrote a binary decoding feature for ZBar in order to support this exact use case:
zbarcam --raw --oneshot -Sbinary > key
It's available on version 0.23.1 and above.
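As a rough way to check which error correction level a given key file can afford, here is a small sketch. The capacities are the byte-mode figures for the largest (version 40) QR symbol; the function name is hypothetical and the payload sizes in the usage note are illustrative, not measurements of real key files.

```python
# Byte-mode capacity of a version 40 QR code per error correction level.
QR_V40_BYTE_CAPACITY = {"L": 2953, "M": 2331, "Q": 1663, "H": 1273}

# Strongest error correction first.
LEVELS_STRONGEST_FIRST = ["H", "Q", "M", "L"]


def strongest_level_that_fits(payload_len):
    """Return the strongest EC level whose capacity holds the payload, or None."""
    for level in LEVELS_STRONGEST_FIRST:
        if payload_len <= QR_V40_BYTE_CAPACITY[level]:
            return level
    return None
```

For example, `strongest_level_that_fits(32)` gives `"H"` for a 32-byte key, while a 2400-byte blob only fits at level `"L"` and anything over 2953 bytes needs to be split across several codes.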
http://docs.oasis-open.org/kmip/spec/v1.4/kmip-spec-v1.4.htm...
Catch: Client machine is encrypted with VeraCrypt. VeraCrypt hidden drives; one password is kept by the product owner and another password is kept by the head of security.
Offline client machine is key for us.
We rotate encryption keys after quarterly security audit.
Not ideal, but it works... At least until one of us resigns (but turnover is quite low here, so crossed fingers).
Check it out and if you have any questions, feel free to ask here or open an issue in github. We also have a Rust version in the works for those interested in something native.
The main observation I'd make is that if you put your keys somewhere only accessible in production, you've made it impossible to test anywhere except production. If you do that, you need to create a process where people can ship some small bit of code to test if the production key setup genuinely works (hint: it won't).
AWS ACM for SSL certs.
AWS SSM for SSH, which eliminates the need for SSH keys.
Not everyone loves AWS (me included), but this stack works nicely in removing the need to ever touch raw encryption key files locally.
All of the key management is "built-in" and managed, so there really isn't much overhead. All software-based, with FIPS certified key management. It's very easy to encrypt data this way. It is expensive though.
Disclaimer: while I used to work for Vormetric / Thales, I no longer do.
Locked in a vault.
Keys are an actual secret key (within the HSM), unknown to any human.
Two people with two different access cards have to be present to enable key operations.
So, you don't have to worry about changing keys as employees come and go, because no one knows the actual keys.
There is a whole structure of spare cards stored in offsite secure storage in case a card is damaged, lost, or stolen.
Card set one is stored separately from card set two.
None of our secret material is particularly hard to revoke or replace; we don't run an internal CA or anything like that.
* self promotion * You did ask how people do it :). This is my way: I've written my own service, which has been in production for more than 3 years, http://pkhub.io (if you would like to try it, send me an email at admin@pkhub.io). This was before AWS Secrets Manager. The tooling is useful because of what I wrote for it: running your app with the secrets it needs for dev/stage/prod, accessing DBs, downloading and installing SSH keys into the SSH agent, and other utilities. * end *
Of course you could write all these yourself with AWS Secrets Manager.
There is Hashicorp's Vault, but tbh it always seemed way too complicated to set up.
My advice in general would be to get something secure but simple enough that your engineers can do their work and access the resources they need, without the "oh, only Bob has the keys on his laptop" situation.
[0] https://kubernetes.io/docs/concepts/configuration/secret/
How the 4 shareholders store their shares is up to them. Mine is in a secure note in 1Password.
I will be very interested in hearing how others do it.
You can rotate keys and facilitate key pinning scenarios.
Cheers!
Sadly now forced to figure out what to move to. We're considering 1Password with its CLI as a short term option, but will be wanting to move to Hashicorp Vault or similar in the mid-term.
My team uses LastPass
and that team now keeps growing and the feature never improves :)
Here is an air-gapped solution:
We don't have SSH keys because it's not the 90's and we don't have servers.