* repeatable boot randomness inside KVM guest @ 2018-04-14 19:59 Alexey Dobriyan 2018-04-14 22:41 ` Andy Lutomirski 2018-04-14 22:44 ` Theodore Y. Ts'o 0 siblings, 2 replies; 16+ messages in thread From: Alexey Dobriyan @ 2018-04-14 19:59 UTC (permalink / raw) To: linux-kernel, tytso, kvm; +Cc: security SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes allocation pattern inside a slab: #ifdef CONFIG_SLAB_FREELIST_RANDOM /* Pre-initialize the random sequence cache */ static int init_cache_random_seq(struct kmem_cache *s) { ... Then I printed actual random sequences for each kmem cache. Turned out they were all the same for most of the caches and they didn't vary across guest reboots. int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp) { ... /* Get best entropy at this stage of boot */ prandom_seed_state(&state, get_random_long()); Then I searched internet and turned out KVM can pass randomness via virtio-rng or something. So I linked /dev/urandom. And it didn't help! The only way to get randomness for SLAB is to enable RDRAND inside guest. Is it KVM bug? For the record I'm using qemu 2.11.1-r2 and whatever F27 ships now. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-14 19:59 repeatable boot randomness inside KVM guest Alexey Dobriyan @ 2018-04-14 22:41 ` Andy Lutomirski 2018-04-14 23:09 ` Alexey Dobriyan 2018-04-14 22:44 ` Theodore Y. Ts'o 1 sibling, 1 reply; 16+ messages in thread From: Andy Lutomirski @ 2018-04-14 22:41 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: LKML, Ted Ts'o, kvm list, security On Sat, Apr 14, 2018 at 12:59 PM, Alexey Dobriyan <adobriyan@gmail.com> wrote: > SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes > allocation pattern inside a slab: > > > #ifdef CONFIG_SLAB_FREELIST_RANDOM > /* Pre-initialize the random sequence cache */ > static int init_cache_random_seq(struct kmem_cache *s) > { > ... > > Then I printed actual random sequences for each kmem cache. > Turned out they were all the same for most of the caches and > they didn't vary across guest reboots. > > int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp) > { > ... > /* Get best entropy at this stage of boot */ > prandom_seed_state(&state, get_random_long()); > > Then I searched internet and turned out KVM can pass randomness via > virtio-rng or something. So I linked /dev/urandom. > > And it didn't help! > > The only way to get randomness for SLAB is to enable RDRAND inside guest. > > Is it KVM bug? > > For the record I'm using qemu 2.11.1-r2 and whatever F27 ships now. virtio-rng doesn't really do that. I have an ancient patch set to do exactly what you want, and I should dust it off. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-14 22:41 ` Andy Lutomirski @ 2018-04-14 23:09 ` Alexey Dobriyan 0 siblings, 0 replies; 16+ messages in thread From: Alexey Dobriyan @ 2018-04-14 23:09 UTC (permalink / raw) To: Andy Lutomirski; +Cc: LKML, Ted Ts'o, kvm list, security On Sat, Apr 14, 2018 at 03:41:42PM -0700, Andy Lutomirski wrote: > On Sat, Apr 14, 2018 at 12:59 PM, Alexey Dobriyan <adobriyan@gmail.com> wrote: > > SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes > > allocation pattern inside a slab: > > > > > > #ifdef CONFIG_SLAB_FREELIST_RANDOM > > /* Pre-initialize the random sequence cache */ > > static int init_cache_random_seq(struct kmem_cache *s) > > { > > ... > > > > Then I printed actual random sequences for each kmem cache. > > Turned out they were all the same for most of the caches and > > they didn't vary across guest reboots. > > > > int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp) > > { > > ... > > /* Get best entropy at this stage of boot */ > > prandom_seed_state(&state, get_random_long()); > > > > Then I searched internet and turned out KVM can pass randomness via > > virtio-rng or something. So I linked /dev/urandom. > > > > And it didn't help! > > > > The only way to get randomness for SLAB is to enable RDRAND inside guest. > > > > Is it KVM bug? > > > > For the record I'm using qemu 2.11.1-r2 and whatever F27 ships now. > > virtio-rng doesn't really do that. I have an ancient patch set to do > exactly what you want, and I should dust it off. Please, do. Here is a list of caches which aren't exactly randomly randomized with my setup. 
Many important ones are there :-( XXX name 'dma-kmalloc-96', r b1e6718e2e7147d4 XXX name 'dma-kmalloc-192', r a7664a0d69968019 XXX name 'dma-kmalloc-8', r 662c2e986443235c XXX name 'dma-kmalloc-16', r 770a9b620ae4cd62 XXX name 'dma-kmalloc-32', r 2e200073d5fa9f46 XXX name 'dma-kmalloc-64', r d8538fda83c74168 XXX name 'dma-kmalloc-128', r 9e4b956d09dd7d44 XXX name 'dma-kmalloc-256', r 8b14bcb58f9e18f5 XXX name 'dma-kmalloc-512', r 2bbace4b7120624a XXX name 'dma-kmalloc-1024', r 7cdf44406db52f5b XXX name 'dma-kmalloc-2048', r 18fe0ebf6bcfdf43 XXX name 'dma-kmalloc-4096', r 9f1a5eee118facf7 XXX name 'dma-kmalloc-8192', r f514d72a1cc441a2 XXX name 'kmalloc-8192', r 14843df817b556cc XXX name 'kmalloc-4096', r 52ed85fa9c691bbe XXX name 'kmalloc-2048', r fa81aa9222ff65a7 XXX name 'kmalloc-1024', r ae355c02d31f21d3 XXX name 'kmalloc-512', r 5fe0d22aaf2ef8d9 XXX name 'kmalloc-256', r 336d07a06917b95 XXX name 'kmalloc-192', r 6b6cd5399dd06d95 XXX name 'kmalloc-128', r 893b9e85369964ab XXX name 'kmalloc-96', r 179e185395d2612 XXX name 'kmalloc-64', r 29cf688b37eccea7 XXX name 'kmalloc-32', r fb7b4e7dca6de00a XXX name 'kmalloc-16', r a2a441fdc499d0c7 XXX name 'kmalloc-8', r e5454c7095ddd2be XXX name 'kmem_cache_node', r 500dc6126a47b229 XXX name 'kmem_cache', r 816c8c7bcde08372 XXX name 'task_group', r c09c4d1c1436ce97 XXX name 'radix_tree_node', r 4dd9540b830a4ea8 XXX name 'pool_workqueue', r 88b1e9d9a1f0b570 XXX name 'Acpi-Namespace', r 3e34d55f8f1cb140 XXX name 'Acpi-State', r b94e04635e77b48a XXX name 'Acpi-Parse', r d5374863b90f2a4c XXX name 'Acpi-ParseExt', r eefb2fff892f64a9 XXX name 'Acpi-Operand', r ce51949bcc80af13 XXX name 'pid', r cd6d8ee9e5209156 XXX name 'anon_vma', r c3a9273a68127ac7 XXX name 'anon_vma_chain', r a7cec15033c31a9b XXX name 'cred_jar', r fe4cc38c6d99cf63 XXX name 'task_struct', r eecb8895c6b7dbdb XXX name 'sighand_cache', r e5243c5eb2ce3a63 XXX name 'signal_cache', r 88b2e108d8ef81c7 XXX name 'files_cache', r ee29814e58dc909c XXX name 'fs_cache', r 
bc700a5f8fc28ff8 XXX name 'mm_struct', r f5230f99c7447359 XXX name 'vm_area_struct', r e30f3f8e648a9f88 XXX name 'nsproxy', r ae7c08b524a0f4d4 XXX name 'uts_namespace', r 6b1266178968ed99 XXX name 'buffer_head', r b24c10679dc55a11 XXX name 'names_cache', r 2e023b54e3ca5b8f XXX name 'dentry', r 83cc18634fbd74e8 XXX name 'inode_cache', r ff9a0ff3b4665cf5 XXX name 'filp', r 4fdad214b7ca7fc1 XXX name 'mnt_cache', r 8e726d32470b23e0 XXX name 'kernfs_node_cache', r 929c5f56778d365d XXX name 'bdev_cache', r 8a5520036bd0a464 XXX name 'sigqueue', r 2cf75c4d16191efb XXX name 'seq_file', r ec3ba1fe514524d5 XXX name 'proc_inode_cache', r b0c76cbbda5bb41f XXX name 'pde_opener', r 5f82f8e7100a517c XXX name 'proc_dir_entry', r ebabc4e93b52d7b8 XXX name 'shmem_inode_cache', r 2b25a3eb9aa32973 XXX name 'net_namespace', r 95793a7eae08a33f ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest
  2018-04-14 19:59 repeatable boot randomness inside KVM guest Alexey Dobriyan
  2018-04-14 22:41 ` Andy Lutomirski
@ 2018-04-14 22:44 ` Theodore Y. Ts'o
  2018-04-15  0:41   ` Matthew Wilcox
  2018-04-16 15:54   ` Kees Cook
  1 sibling, 2 replies; 16+ messages in thread
From: Theodore Y. Ts'o @ 2018-04-14 22:44 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, linux-mm

+linux-mm@kvack.org
kvm@vger.kernel.org, security@kernel.org moved to bcc

On Sat, Apr 14, 2018 at 10:59:21PM +0300, Alexey Dobriyan wrote:
> SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes
> allocation pattern inside a slab:
>
> int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp)
> {
>     ...
>     /* Get best entropy at this stage of boot */
>     prandom_seed_state(&state, get_random_long());
>
> Then I printed actual random sequences for each kmem cache.
> Turned out they were all the same for most of the caches and
> they didn't vary across guest reboots.

The problem is at the super-early state of the boot path, kernel code
can't allocate memory. This is something most device drivers kinda
assume they can do. :-)

So it means we haven't yet initialized the virtio-rng driver, and it's
before interrupts have been enabled, so we can't harvest any entropy
from interrupt timing. So that's why trying to use virtio-rng didn't
help.

> The only way to get randomness for SLAB is to enable RDRAND inside guest.
>
> Is it KVM bug?

No, it's not a KVM bug. The fundamental issue is in how the
CONFIG_SLAB_FREELIST_RANDOM is currently implemented.

What needs to happen is freelist should get randomized much later in
the boot sequence. Doing it later will require locking; I don't know
enough about the slab/slub code to know whether the slab_mutex would
be sufficient, or some other lock might need to be added.

The other thing I would note is that using prandom_u32_state()
doesn't really provide much security.
In fact, if the goal is to protect against a malicious attacker
trying to guess what addresses will be returned by the slab
allocator, I suspect it's much like the security patdowns done at
airports. It might protect against a really stupid attacker, but it's
mostly security theater.

The freelist randomization is only being done once; so it's not like
performance is really an issue. It would be much better to just use
get_random_u32() and be done with it. I'd drop using prandom_*
functions in slab.c and slub.c and slab_common.c, and just use a
really random number generator, if the goal is real security as
opposed to security for show....

(Not that there's necessarily anything wrong with security theater;
the US spends over 3 billion dollars a year on security theater. As
politicians know, symbolism can be important. :-)

Cheers,

- Ted
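The symptom reported at the top of the thread (identical freelist orders across guest reboots) is what one would expect whenever a deterministic PRNG is seeded with a value that carries no fresh entropy. The following user-space sketch illustrates that under loud assumptions: `toy_seed()`, `toy_next()` and `toy_freelist_order()` are hypothetical stand-ins written for this illustration (an xorshift64 generator plus a Fisher-Yates shuffle), not the kernel's prandom or slab code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy PRNG state (xorshift64); a hypothetical stand-in, NOT the kernel's prandom. */
struct toy_rnd_state {
	uint64_t s;
};

static void toy_seed(struct toy_rnd_state *st, uint64_t seed)
{
	/* xorshift must not start from 0, so substitute a fixed constant. */
	st->s = seed ? seed : 0x9e3779b97f4a7c15ULL;
}

static uint32_t toy_next(struct toy_rnd_state *st)
{
	uint64_t x = st->s;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	st->s = x;
	return (uint32_t)(x >> 32);
}

/* Fill list[] with 0..count-1, then Fisher-Yates shuffle it from one seed. */
static void toy_freelist_order(unsigned int *list, size_t count, uint64_t seed)
{
	struct toy_rnd_state st;
	size_t i;

	assert(count > 0);
	toy_seed(&st, seed);
	for (i = 0; i < count; i++)
		list[i] = (unsigned int)i;
	for (i = count - 1; i > 0; i--) {
		size_t j = toy_next(&st) % (i + 1);
		unsigned int tmp = list[i];

		list[i] = list[j];
		list[j] = tmp;
	}
}

/* Return 1 if the two orders are identical, 0 otherwise. */
static int toy_same_order(const unsigned int *a, const unsigned int *b,
			  size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (a[i] != b[i])
			return 0;
	return 1;
}
```

Calling `toy_freelist_order()` twice with the same seed yields the same permutation, which mirrors two boots in which the early seed source returns the same value. The quality of the shuffle is irrelevant here; only the unpredictability of the seed matters.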
* Re: repeatable boot randomness inside KVM guest 2018-04-14 22:44 ` Theodore Y. Ts'o @ 2018-04-15 0:41 ` Matthew Wilcox 2018-04-17 9:13 ` James Bottomley 2018-04-16 15:54 ` Kees Cook 1 sibling, 1 reply; 16+ messages in thread From: Matthew Wilcox @ 2018-04-15 0:41 UTC (permalink / raw) To: Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o wrote: > What needs to happen is freelist should get randomized much later in > the boot sequence. Doing it later will require locking; I don't know > enough about the slab/slub code to know whether the slab_mutex would > be sufficient, or some other lock might need to be added. Could we have the bootloader pass in some initial randomness? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-15 0:41 ` Matthew Wilcox @ 2018-04-17 9:13 ` James Bottomley 2018-04-17 11:47 ` Matthew Wilcox 0 siblings, 1 reply; 16+ messages in thread From: James Bottomley @ 2018-04-17 9:13 UTC (permalink / raw) To: Matthew Wilcox, Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Sat, 2018-04-14 at 17:41 -0700, Matthew Wilcox wrote: > On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o wrote: > > What needs to happen is freelist should get randomized much later > > in the boot sequence. Doing it later will require locking; I don't > > know enough about the slab/slub code to know whether the slab_mutex > > would be sufficient, or some other lock might need to be added. > > Could we have the bootloader pass in some initial randomness? Where would the bootloader get it from (securely) that the kernel can't? For example, if you compile in a TPM driver, the kernel will pick up 32 random entropy bytes from the TPM to seed the pool, but I think it happens too late to help with this problem currently. IMA also needs the TPM very early in the boot sequence, so I was wondering about using the initial EFI driver, which is present on boot, and then transitioning to the proper kernel TPM driver later, which would mean we could seed the pool earlier. As long as you mix it properly and limit the amount, it shouldn't necessarily be a source of actual compromise, but having an external input to our cryptographically secure entropy pool is an additional potential attack vector. James ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-17 9:13 ` James Bottomley @ 2018-04-17 11:47 ` Matthew Wilcox 2018-04-17 11:57 ` James Bottomley 0 siblings, 1 reply; 16+ messages in thread From: Matthew Wilcox @ 2018-04-17 11:47 UTC (permalink / raw) To: James Bottomley Cc: Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Tue, Apr 17, 2018 at 10:13:34AM +0100, James Bottomley wrote: > On Sat, 2018-04-14 at 17:41 -0700, Matthew Wilcox wrote: > > On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o wrote: > > > What needs to happen is freelist should get randomized much later > > > in the boot sequence. Doing it later will require locking; I don't > > > know enough about the slab/slub code to know whether the slab_mutex > > > would be sufficient, or some other lock might need to be added. > > > > Could we have the bootloader pass in some initial randomness? > > Where would the bootloader get it from (securely) that the kernel > can't? In this particular case, qemu is booting the kernel, so it can apply to /dev/random for some entropy. > For example, if you compile in a TPM driver, the kernel will > pick up 32 random entropy bytes from the TPM to seed the pool, but I > think it happens too late to help with this problem currently. IMA > also needs the TPM very early in the boot sequence, so I was wondering > about using the initial EFI driver, which is present on boot, and then > transitioning to the proper kernel TPM driver later, which would mean > we could seed the pool earlier. > > As long as you mix it properly and limit the amount, it shouldn't > necessarily be a source of actual compromise, but having an external > input to our cryptographically secure entropy pool is an additional > potential attack vector. I thought our model was that if somebody had compromised the bootloader, all bets were off. And also that we were free to mix in as many untrustworthy bytes of alleged entropy into the random pool as we liked. 
* Re: repeatable boot randomness inside KVM guest 2018-04-17 11:47 ` Matthew Wilcox @ 2018-04-17 11:57 ` James Bottomley 2018-04-17 14:07 ` Matthew Wilcox 2018-04-17 15:16 ` Theodore Y. Ts'o 0 siblings, 2 replies; 16+ messages in thread From: James Bottomley @ 2018-04-17 11:57 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Tue, 2018-04-17 at 04:47 -0700, Matthew Wilcox wrote: > On Tue, Apr 17, 2018 at 10:13:34AM +0100, James Bottomley wrote: > > On Sat, 2018-04-14 at 17:41 -0700, Matthew Wilcox wrote: > > > On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o wrote: > > > > What needs to happen is freelist should get randomized much > > > > later in the boot sequence. Doing it later will require > > > > locking; I don't know enough about the slab/slub code to know > > > > whether the slab_mutex would be sufficient, or some other lock > > > > might need to be added. > > > > > > Could we have the bootloader pass in some initial randomness? > > > > Where would the bootloader get it from (securely) that the kernel > > can't? > > In this particular case, qemu is booting the kernel, so it can apply > to /dev/random for some entropy. Well, yes, but wouldn't qemu virtualize /dev/random anyway so the guest kernel can get it from the HWRNG provided by qemu? > > For example, if you compile in a TPM driver, the kernel will > > pick up 32 random entropy bytes from the TPM to seed the pool, but > > I think it happens too late to help with this problem > > currently. IMA also needs the TPM very early in the boot sequence, > > so I was wondering about using the initial EFI driver, which is > > present on boot, and then transitioning to the proper kernel TPM > > driver later, which would mean we could seed the pool earlier. 
> > > > As long as you mix it properly and limit the amount, it shouldn't > > necessarily be a source of actual compromise, but having an > > external input to our cryptographically secure entropy pool is an > > additional potential attack vector. > > I thought our model was that if somebody had compromised the > bootloader, all bets were off. You don't have to compromise the bootloader to influence this, you merely have to trick it into providing the random number you wanted. The bigger you make the attack surface (the more inputs) the more likelihood of finding a trick that works. > And also that we were free to mix in as many untrustworthy bytes of > alleged entropy into the random pool as we liked. No, entropy mixing ensures that all you do with bad entropy is degrade the quality, but if the quality degrades to zero (as it might at boot when you've no other entropy sources so you feed in 100% bad entropy), then the random sequences become predictable. James ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-17 11:57 ` James Bottomley @ 2018-04-17 14:07 ` Matthew Wilcox 2018-04-17 15:20 ` James Bottomley 2018-04-17 15:16 ` Theodore Y. Ts'o 1 sibling, 1 reply; 16+ messages in thread From: Matthew Wilcox @ 2018-04-17 14:07 UTC (permalink / raw) To: James Bottomley Cc: Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Tue, Apr 17, 2018 at 12:57:12PM +0100, James Bottomley wrote: > On Tue, 2018-04-17 at 04:47 -0700, Matthew Wilcox wrote: > > On Tue, Apr 17, 2018 at 10:13:34AM +0100, James Bottomley wrote: > > > On Sat, 2018-04-14 at 17:41 -0700, Matthew Wilcox wrote: > > > > On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o wrote: > > > > > What needs to happen is freelist should get randomized much > > > > > later in the boot sequence. Doing it later will require > > > > > locking; I don't know enough about the slab/slub code to know > > > > > whether the slab_mutex would be sufficient, or some other lock > > > > > might need to be added. > > > > > > > > Could we have the bootloader pass in some initial randomness? > > > > > > Where would the bootloader get it from (securely) that the kernel > > > can't? > > > > In this particular case, qemu is booting the kernel, so it can apply > > to /dev/random for some entropy. > > Well, yes, but wouldn't qemu virtualize /dev/random anyway so the guest > kernel can get it from the HWRNG provided by qemu? The part of Ted's mail that I snipped explained that virtio-rng relies on being able to kmalloc memory, so by definition it can't provide entropy before kmalloc is initialised. > > I thought our model was that if somebody had compromised the > > bootloader, all bets were off. > > You don't have to compromise the bootloader to influence this, you > merely have to trick it into providing the random number you wanted. > The bigger you make the attack surface (the more inputs) the more > likelihood of finding a trick that works. 
> > > And also that we were free to mix in as many untrustworthy bytes of > > alleged entropy into the random pool as we liked. > > No, entropy mixing ensures that all you do with bad entropy is degrade > the quality, but if the quality degrades to zero (as it might at boot > when you've no other entropy sources so you feed in 100% bad entropy), > then the random sequences become predictable. I don't understand that. If I estimate that I have 'k' bytes of entropy in my pool, and then I mix in 'n' entirely predictable bytes, I should still have k bytes of entropy in the pool. If I withdraw k bytes from the pool, then yes the future output from the pool may be entirely predictable, but I have to know what those k bytes were. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest 2018-04-17 14:07 ` Matthew Wilcox @ 2018-04-17 15:20 ` James Bottomley 0 siblings, 0 replies; 16+ messages in thread From: James Bottomley @ 2018-04-17 15:20 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Y. Ts'o, Alexey Dobriyan, linux-kernel, linux-mm On Tue, 2018-04-17 at 07:07 -0700, Matthew Wilcox wrote: > On Tue, Apr 17, 2018 at 12:57:12PM +0100, James Bottomley wrote: > > On Tue, 2018-04-17 at 04:47 -0700, Matthew Wilcox wrote: > > > On Tue, Apr 17, 2018 at 10:13:34AM +0100, James Bottomley wrote: > > > > On Sat, 2018-04-14 at 17:41 -0700, Matthew Wilcox wrote: > > > > > On Sat, Apr 14, 2018 at 06:44:19PM -0400, Theodore Y. Ts'o > > > > > wrote: > > > > > > What needs to happen is freelist should get randomized much > > > > > > later in the boot sequence. Doing it later will require > > > > > > locking; I don't know enough about the slab/slub code to > > > > > > know whether the slab_mutex would be sufficient, or some > > > > > > other lock might need to be added. > > > > > > > > > > Could we have the bootloader pass in some initial randomness? > > > > > > > > Where would the bootloader get it from (securely) that the > > > > kernel can't? > > > > > > In this particular case, qemu is booting the kernel, so it can > > > apply to /dev/random for some entropy. > > > > Well, yes, but wouldn't qemu virtualize /dev/random anyway so the > > guest kernel can get it from the HWRNG provided by qemu? > > The part of Ted's mail that I snipped explained that virtio-rng > relies on being able to kmalloc memory, so by definition it can't > provide entropy before kmalloc is initialised. That sounds fixable ... > > > I thought our model was that if somebody had compromised the > > > bootloader, all bets were off. > > > > You don't have to compromise the bootloader to influence this, you > > merely have to trick it into providing the random number you > > wanted. 
The bigger you make the attack surface (the more inputs) > > the more likelihood of finding a trick that works. > > > > > And also that we were free to mix in as many untrustworthy > > > bytes of alleged entropy into the random pool as we liked. > > > > No, entropy mixing ensures that all you do with bad entropy is > > degrade the quality, but if the quality degrades to zero (as it > > might at boot when you've no other entropy sources so you feed in > > 100% bad entropy), then the random sequences become predictable. > > I don't understand that. If I estimate that I have 'k' bytes of > entropy in my pool, and then I mix in 'n' entirely predictable bytes, > I should still have k bytes of entropy in the pool. If I withdraw k > bytes from the pool, then yes the future output from the pool may be > entirely predictable, but I have to know what those k bytes were. If that were true, why are we debating this? I thought the problem was the alleged random sequences for slub placement were repeating on subsequent VM boots meaning there's effectively no entropy in the pool and we need to add some. James ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest
  2018-04-17 11:57 ` James Bottomley
  2018-04-17 14:07   ` Matthew Wilcox
@ 2018-04-17 15:16   ` Theodore Y. Ts'o
  2018-04-17 15:42     ` James Bottomley
  1 sibling, 1 reply; 16+ messages in thread
From: Theodore Y. Ts'o @ 2018-04-17 15:16 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, Alexey Dobriyan, linux-kernel, linux-mm

On Tue, Apr 17, 2018 at 12:57:12PM +0100, James Bottomley wrote:
>
> You don't have to compromise the bootloader to influence this, you
> merely have to trick it into providing the random number you wanted.
> The bigger you make the attack surface (the more inputs) the more
> likelihood of finding a trick that works.

There is a large class of devices where the bootloader can be
considered trusted. For example, all modern Chrome and Android
devices have signed bootloaders by default. And if you are using an
Amazon or Chrome VM, you are generally starting it with a known,
trusted boot image.

The reason why it's useful to have the bootloader get the entropy is
because it may have device-specific access and be able to leverage
whatever infrastructure was used to load the kernel and/or initramfs
to also load the equivalent of /var/lib/systemd/random-seed (or
/var/lib/urandom, et al.) --- and do this early enough that we can
have truly secure randomness for those kernel facilities that need
access to real randomness to initialize the stack canary, or to
initialize the slab cache.

There are other ways that this could be done, of course. If the UEFI
boot services are still available, you might be able to ask the UEFI
services to give you randomness. And yes, the hardware might be
backdoored to the fare-thee-well by the MSS (for devices manufactured
in China) or by an NSA Tailored Access Operations intercepting a
computer shipment in transit. But my vision was that this wouldn't
necessarily bump the entropy accounting or mark the CRNG as fully
initialized.
(If you work for the NSA and you're sure you won't do an own-goal,
you could enable a kernel boot option which marks the CRNG
initialized from entropy coming from UEFI or RDRAND or a TPM. But I
don't think it should be the default.)

The only goal was to get enough uncertainty so we can secure early
kernel users of entropy for security features such as kernel ASLR,
the kernel stack canary, SLAB freelist randomization, etc.

And by the way --- if you think it is easy / possible to get secure
random numbers easily from either a TPMv1 or TPMv2 w/o any early boot
services (e.g., no interrupts, no DMA, no page tables, no memory
allocation) that would be really good to know.

Cheers,

> No, entropy mixing ensures that all you do with bad entropy is degrade
> the quality, but if the quality degrades to zero (as it might at boot
> when you've no other entropy sources so you feed in 100% bad entropy),
> then the random sequences become predictable.

Actually, if you have good entropy mixing, you can mix super-bad
entropy --- e.g., completely known by the attacker, and it won't make
the entropy pool any worse. It can only help. It does require that
the entropy mixing algorithm should be reversible, so that mixing in
even a fully known sequence will not cause uncertainty to be lost.

The input_pool in the random driver is designed in such a way, which
is why /dev/[u]random is world-writable. Anyone can contribute
potential uncertainty into the pool. Regardless of whether they have
zero, partial, or full knowledge of the internal random state, they
won't have any more certainty of the pool after they mix in their
contribution. And an attacker which does not know the contribution,
and who might have partial knowledge of the pool, will have less
knowledge about the internal state afterwards.

Cheers,

- Ted
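The reversibility argument can be made concrete with a small sketch. For any fixed input, a reversible mixing step is a one-to-one map on pool states, so two pool states an attacker could not tell apart before mixing remain distinct afterwards, and no uncertainty is lost. The code below is a toy under stated assumptions: `toy_mix()` and `toy_unmix()` are hypothetical helpers operating on a 64-bit "pool" with a rotate-and-XOR step; the kernel's actual input_pool mixing function is different and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate helpers; r must be in 1..63 to avoid an undefined 64-bit shift. */
static uint64_t rotl64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

static uint64_t rotr64(uint64_t x, unsigned int r)
{
	return (x >> r) | (x << (64 - r));
}

/*
 * Toy mixing: for each input byte, rotate the pool and XOR the byte in.
 * Both steps are invertible, so for a fixed input this is a bijection
 * on the 64-bit pool state.
 */
static uint64_t toy_mix(uint64_t pool, const uint8_t *in, size_t len)
{
	size_t i;

	assert(in != NULL || len == 0);
	for (i = 0; i < len; i++)
		pool = rotl64(pool, 7) ^ in[i];
	return pool;
}

/* Undo toy_mix() for a known input by applying the inverse steps backwards. */
static uint64_t toy_unmix(uint64_t pool, const uint8_t *in, size_t len)
{
	size_t i = len;

	assert(in != NULL || len == 0);
	while (i-- > 0)
		pool = rotr64(pool ^ in[i], 7);
	return pool;
}
```

Because `toy_unmix()` recovers the previous state exactly, an attacker who chooses the input learns nothing new about the pool: every candidate state before mixing maps to exactly one candidate state after it.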
* Re: repeatable boot randomness inside KVM guest 2018-04-17 15:16 ` Theodore Y. Ts'o @ 2018-04-17 15:42 ` James Bottomley 2018-04-17 21:40 ` Theodore Y. Ts'o 0 siblings, 1 reply; 16+ messages in thread From: James Bottomley @ 2018-04-17 15:42 UTC (permalink / raw) To: Theodore Y. Ts'o Cc: Matthew Wilcox, Alexey Dobriyan, linux-kernel, linux-mm On Tue, 2018-04-17 at 11:16 -0400, Theodore Y. Ts'o wrote: > On Tue, Apr 17, 2018 at 12:57:12PM +0100, James Bottomley wrote: > > > > You don't have to compromise the bootloader to influence this, you > > merely have to trick it into providing the random number you > > wanted. The bigger you make the attack surface (the more inputs) > > the more likelihood of finding a trick that works. > > There is a large class of devices where the bootloader can be > considered trusted. For example, all modern Chrome and Android > devices have signed bootloaders by default. And if you are using an > Amazon or Chrome VM, you are generally started it with a known, > trusted boot image. Depends how the parameter is passed. If it can be influenced from the command line then a large class of "trusted boot" systems actually don't verify the command line, so you can boot a trusted system and still inject bogus command line parameters. This is definitely true of PC class secure boot. Not saying it will always be so, just illustrating why you don't necessarily want to expand the attack surface. > The reason why it's useful to have the bootloader get the entropy is > because it may device-specific access and be able to leverage > whatever infrastructure was used to load the kernel and/or > intialramfs to also load the equivalent of /var/lib/systemd/random- > seed (or /var/lib/urandom, et. al) --- and do this early enough that > we can have truely secure randomness for those kernel faciliteis that > need access to real randomness to initialize the stack canary, or > initializing the slab cache. 
OK, in the UEFI ideal world where every component is a perfectly written OS, perhaps you're right. In the more real world, do you trust the people who wrote the bootloader to understand and correctly implement the cryptographically secure process of obtaining a random input? > There are other ways that this could be done, of course. If the UEFI > boot services are still available, you might be able to ask the UEFI > services to give you randomness. And yes, the hardware might be > backdoored to the fare-the-well by the MSS (for devices manufactured > in China) or by an NSA Tailored Access Operations intercepting a > computer shipment in transit. But my vision was that this wouldn't > necessarily bump the entropy accounting or mark the CRNG as fully > intialized. (If you work for the NSA and you're sure you won't do an > own-goal, you could enable a kernel boot option which marks the CRNG > initialized from entropy coming from UEFI or RDRAND or a TPM. But I > don't think it should be the default.) > > The only goal was to get enough uncertainty so we can secure early > kernel users of entropy for security features such as kernel ASLR, > the kernel stack canary, SLAB freelist randomization, etc. > > And by the way --- if you think it is easy / possible to get secure > random numbers easily from either a TPMv1 or TPMv2 w/o any early boot > services (e.g., no interrupts, no DMA, no page tables, no memory > allocation) that would be really good to know. Well, as I said, I was planning to use the EFI driver (actually for IMA, but it works here too) which should be present to the kernel on boot. We also don't have quite the severe restrictions you say. The bootmem interface is usable for allocations (even ones that persist beyond init discard) and, although most TPMs are actually polled devices, it is possible to use interrupt drivers that do DMA via UEFI in early boot provided you know what you're doing. James ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: repeatable boot randomness inside KVM guest
  2018-04-17 15:42 ` James Bottomley
@ 2018-04-17 21:40   ` Theodore Y. Ts'o
  0 siblings, 0 replies; 16+ messages in thread
From: Theodore Y. Ts'o @ 2018-04-17 21:40 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, Alexey Dobriyan, linux-kernel, linux-mm

On Tue, Apr 17, 2018 at 04:42:39PM +0100, James Bottomley wrote:
> Depends how the parameter is passed. If it can be influenced from the
> command line then a large class of "trusted boot" systems actually
> don't verify the command line, so you can boot a trusted system and
> still inject bogus command line parameters. This is definitely true of
> PC class secure boot. Not saying it will always be so, just
> illustrating why you don't necessarily want to expand the attack
> surface.

Sure, this is why I don't really like the scheme of relying on the
command line. For one thing, the command-line is public, so if the
attacker can read /proc/cmdline, they'll have access to the entropy.

What I would prefer is an extension to the boot protocol so that some
number of bytes would be passed to the kernel as a separate bag of
bytes alongside the kernel command line and the initrd. The kernel
would mix that into the random driver (which is written so the basic
input pool and primary_crng can accept input in super-early boot).
This would be done *before* we relocate the kernel, so that kernel
ASLR code can relocate the kernel text to a properly unpredictable
number --- so this really is quite super-early boot.

> OK, in the UEFI ideal world where every component is a perfectly
> written OS, perhaps you're right. In the more real world, do you trust
> the people who wrote the bootloader to understand and correctly
> implement the cryptographically secure process of obtaining a random
> input?

In the default setup, I would expect the bootloader (such as grub)
would read the random initialization data from disk.
So it would work much like systemd reading from /var/lib/systemd/random-seed. And I would trust the bootloader implementors to be able to do this about as well as I would trust the systemd implementors. :-) It's not that hard, after all.... - Ted
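Editorial sketch of the scheme described in the message above: a separate bag of seed bytes from the bootloader is mixed into a pool without crediting any entropy, and the seed buffer is wiped afterwards. Everything here is hypothetical and for illustration only: the names `sketch_pool`, `sketch_pool_init` and `sketch_mix_boot_seed` do not exist in the kernel, the FNV-1a style mixing is a non-cryptographic stand-in for whatever the random driver really does, and wiping the buffer is an assumption not stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pool: one word of state plus an entropy credit counter. */
struct sketch_pool {
	uint64_t state;
	unsigned int entropy_credit_bits;
};

void sketch_pool_init(struct sketch_pool *p)
{
	assert(p != NULL);
	p->state = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */
	p->entropy_credit_bits = 0;
}

/*
 * Mix bootloader-supplied bytes into the pool. The mixing is FNV-1a,
 * chosen only because it is short; it is NOT cryptographically secure.
 * No entropy is credited, matching the idea that the seed should add
 * uncertainty without marking the CRNG as initialized.
 */
void sketch_mix_boot_seed(struct sketch_pool *p, uint8_t *seed, size_t len)
{
	size_t i;

	assert(p != NULL && (seed != NULL || len == 0));
	for (i = 0; i < len; i++) {
		p->state ^= seed[i];
		p->state *= 0x100000001b3ULL;	/* FNV-1a 64-bit prime */
	}
	/*
	 * Assumption: wipe the seed so it is not left lying in memory.
	 * Real code would need a wipe the compiler cannot optimize away.
	 */
	if (len)
		memset(seed, 0, len);
}
```

A caller would initialize the pool once, pass in the bytes handed over by the bootloader, and only then derive early values such as a KASLR offset from the pool state. The bootloader side would also have to replace the on-disk seed on every boot, otherwise two boots would mix in the same bytes.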
* Re: repeatable boot randomness inside KVM guest 2018-04-14 22:44 ` Theodore Y. Ts'o 2018-04-15 0:41 ` Matthew Wilcox @ 2018-04-16 15:54 ` Kees Cook 2018-04-16 16:15 ` Thomas Garnier 1 sibling, 1 reply; 16+ messages in thread From: Kees Cook @ 2018-04-16 15:54 UTC (permalink / raw) To: Theodore Y. Ts'o, Alexey Dobriyan, LKML, Linux-MM, Thomas Garnier On Sat, Apr 14, 2018 at 3:44 PM, Theodore Y. Ts'o <tytso@mit.edu> wrote: > +linux-mm@kvack.org > kvm@vger.kernel.org, security@kernel.org moved to bcc > > On Sat, Apr 14, 2018 at 10:59:21PM +0300, Alexey Dobriyan wrote: >> SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes >> allocation pattern inside a slab: >> >> int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp) >> { >> ... >> /* Get best entropy at this stage of boot */ >> prandom_seed_state(&state, get_random_long()); >> >> Then I printed actual random sequences for each kmem cache. >> Turned out they were all the same for most of the caches and >> they didn't vary across guest reboots. > > The problem is at the super-early state of the boot path, kernel code > can't allocate memory. This is something most device drivers kinda > assume they can do. :-) > > So it means we haven't yet initialized the virtio-rng driver, and it's > before interrupts have been enabled, so we can't harvest any entropy > from interrupt timing. So that's why trying to use virtio-rng didn't > help. > >> The only way to get randomness for SLAB is to enable RDRAND inside guest. >> >> Is it KVM bug? > > No, it's not a KVM bug. The fundamental issue is in how the > CONFIG_SLAB_FREELIST_RANDOM is currently implemented. > > What needs to happen is freelist should get randomized much later in > the boot sequence. Doing it later will require locking; I don't know > enough about the slab/slub code to know whether the slab_mutex would > be sufficient, or some other lock might need to be added. 
> > The other thing I would note is that using prandom_u32_state() doesn't > really provide much security. In fact, if the goal is to protect > against a malicious attacker trying to guess what addresses will be > returned by the slab allocator, I suspect it's much like the security > patdowns done at airports. It might protect against a really stupid > attacker, but it's mostly security theater. > > The freelist randomization is only being done once; so it's not like > performance is really an issue. It would be much better to just use > get_random_u32() and be done with it. I'd drop using prandom_* > functions in slab.c and slub.c and slab_common.c, and just use a > really random number generator, if the goal is real security as > opposed to security for show.... > > (Not that there's necessarily anything wrong with security theater; > the US spends over 3 billion dollars a year on security theater. As > politicians know, symbolism can be important. :-) I've added Thomas Garnier to CC (since he wrote this originally). He can speak to its position in the boot ordering and the effective entropy. -Kees -- Kees Cook Pixel Security
* Re: repeatable boot randomness inside KVM guest 2018-04-16 15:54 ` Kees Cook @ 2018-04-16 16:15 ` Thomas Garnier 2018-04-17 0:31 ` Alexey Dobriyan 0 siblings, 1 reply; 16+ messages in thread From: Thomas Garnier @ 2018-04-16 16:15 UTC (permalink / raw) To: Kees Cook; +Cc: tytso, Alexey Dobriyan, LKML, Linux-MM On Mon, Apr 16, 2018 at 8:54 AM Kees Cook <keescook@chromium.org> wrote: > On Sat, Apr 14, 2018 at 3:44 PM, Theodore Y. Ts'o <tytso@mit.edu> wrote: > > +linux-mm@kvack.org > > kvm@vger.kernel.org, security@kernel.org moved to bcc > > > > On Sat, Apr 14, 2018 at 10:59:21PM +0300, Alexey Dobriyan wrote: > >> SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes > >> allocation pattern inside a slab: > >> > >> int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count, gfp_t gfp) > >> { > >> ... > >> /* Get best entropy at this stage of boot */ > >> prandom_seed_state(&state, get_random_long()); > >> > >> Then I printed actual random sequences for each kmem cache. > >> Turned out they were all the same for most of the caches and > >> they didn't vary across guest reboots. > > > > The problem is at the super-early state of the boot path, kernel code > > can't allocate memory. This is something most device drivers kinda > > assume they can do. :-) > > > > So it means we haven't yet initialized the virtio-rng driver, and it's > > before interrupts have been enabled, so we can't harvest any entropy > > from interrupt timing. So that's why trying to use virtio-rng didn't > > help. > > > >> The only way to get randomness for SLAB is to enable RDRAND inside guest. > >> > >> Is it KVM bug? > > > > No, it's not a KVM bug. The fundamental issue is in how the > > CONFIG_SLAB_FREELIST_RANDOM is currently implemented. Entropy at early boot in VM has always been a problem for this feature or others. Did you look at the impact on other boot security features fetching random values? 
Does your VM have RDRAND support (we use get_random_long() which will fetch from RDRAND to provide as much entropy as possible at this point)? > > > > What needs to happen is freelist should get randomized much later in > > the boot sequence. Doing it later will require locking; I don't know > > enough about the slab/slub code to know whether the slab_mutex would > > be sufficient, or some other lock might need to be added. You can't re-randomize pre-allocated pages; that's why the cache is randomized that early. If you don't have RDRAND, we could re-randomize later at boot with more entropy, which could be useful in this specific case. > > > > The other thing I would note is that using prandom_u32_state() doesn't > > really provide much security. In fact, if the goal is to protect > > against a malicious attacker trying to guess what addresses will be > > returned by the slab allocator, I suspect it's much like the security > > patdowns done at airports. It might protect against a really stupid > > attacker, but it's mostly security theater. > > > > The freelist randomization is only being done once; so it's not like > > performance is really an issue. It would be much better to just use > > get_random_u32() and be done with it. I'd drop using prandom_* > > functions in slab.c and slub.c and slab_common.c, and just use a > > really random number generator, if the goal is real security as > > opposed to security for show.... The state is seeded with get_random_long() which will use RDRAND and any available entropy at this point. I am not sure of the value of calling get_random_long() on each iteration, especially if you don't have RDRAND. > > > > (Not that there's necessarily anything wrong with security theater; > > the US spends over 3 billion dollars a year on security theater. As > > politicians know, symbolism can be important. :-) > I've added Thomas Garnier to CC (since he wrote this originally). 
He > can speak to its position in the boot ordering and the effective > entropy. Thanks for including me. > -Kees > -- > Kees Cook > Pixel Security -- Thomas
* Re: repeatable boot randomness inside KVM guest 2018-04-16 16:15 ` Thomas Garnier @ 2018-04-17 0:31 ` Alexey Dobriyan 0 siblings, 0 replies; 16+ messages in thread From: Alexey Dobriyan @ 2018-04-17 0:31 UTC (permalink / raw) To: Thomas Garnier; +Cc: Kees Cook, tytso, LKML, Linux-MM On Mon, Apr 16, 2018 at 04:15:44PM +0000, Thomas Garnier wrote: > On Mon, Apr 16, 2018 at 8:54 AM Kees Cook <keescook@chromium.org> wrote: > > > On Sat, Apr 14, 2018 at 3:44 PM, Theodore Y. Ts'o <tytso@mit.edu> wrote: > > > +linux-mm@kvack.org > > > kvm@vger.kernel.org, security@kernel.org moved to bcc > > > > > > On Sat, Apr 14, 2018 at 10:59:21PM +0300, Alexey Dobriyan wrote: > > >> SLAB allocators got CONFIG_SLAB_FREELIST_RANDOM option which randomizes > > >> allocation pattern inside a slab: > > >> > > >> int cache_random_seq_create(struct kmem_cache *cachep, unsigned > int count, gfp_t gfp) > > >> { > > >> ... > > >> /* Get best entropy at this stage of boot */ > > >> prandom_seed_state(&state, get_random_long()); > > >> > > >> Then I printed actual random sequences for each kmem cache. > > >> Turned out they were all the same for most of the caches and > > >> they didn't vary across guest reboots. > > > > > > The problem is at the super-early state of the boot path, kernel code > > > can't allocate memory. This is something most device drivers kinda > > > assume they can do. :-) > > > > > > So it means we haven't yet initialized the virtio-rng driver, and it's > > > before interrupts have been enabled, so we can't harvest any entropy > > > from interrupt timing. So that's why trying to use virtio-rng didn't > > > help. > > > > > >> The only way to get randomness for SLAB is to enable RDRAND inside > guest. > > >> > > >> Is it KVM bug? > > > > > > No, it's not a KVM bug. The fundamental issue is in how the > > > CONFIG_SLAB_FREELIST_RANDOM is currently implemented. > > Entropy at early boot in VM has always been a problem for this feature or > others. 
Did you look at the impact on other boot security features fetching > random values? Does your VM have RDRAND support (we use get_random_long() > which will fetch from RDRAND to provide as much entropy as possible at this > point)? The problem is that "qemu-system-x86_64" by default neither exposes RDRAND to the guest nor uses entropy from the host to bootstrap. You need "-cpu host" or equivalent. Given that DMI strings act as a seed and that core kernel caches are created in a fixed order, those SLAB randomization sequences may be globally the same (I didn't check) or drawn from a small set. And of course there will be users who don't use RDRAND because it is an NSA backdoor.
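Editorial sketch of why the sequences repeat across reboots: when a freelist order is produced by shuffling from one seed, the seed fully determines the order, so a guest that gets the same early seed on every boot gets the same order every time. This is an assumption-laden illustration, not the kernel's code: xorshift32 stands in for the kernel's prandom state, and all `sketch_*` names and `SKETCH_MAX` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX 64	/* hypothetical upper bound on objects per slab */

/* Stand-in PRNG state (xorshift32); NOT the kernel's prandom. */
struct sketch_rnd_state {
	uint32_t s;
};

static void sketch_seed_state(struct sketch_rnd_state *st, uint32_t seed)
{
	/* xorshift32 must not start from zero */
	st->s = seed ? seed : 0x9e3779b9u;
}

static uint32_t sketch_next_u32(struct sketch_rnd_state *st)
{
	uint32_t x = st->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->s = x;
	return x;
}

/* Fill list[] with 0..count-1, then Fisher-Yates shuffle it from one seed. */
void sketch_freelist_shuffle(unsigned int *list, unsigned int count,
			     uint32_t seed)
{
	struct sketch_rnd_state st;
	unsigned int i, j, tmp;

	assert(count <= SKETCH_MAX);
	for (i = 0; i < count; i++)
		list[i] = i;

	sketch_seed_state(&st, seed);
	for (i = count; i > 1; i--) {
		j = sketch_next_u32(&st) % i;	/* 0 <= j < i */
		tmp = list[i - 1];
		list[i - 1] = list[j];
		list[j] = tmp;
	}
}

/* Return 1 if two "boots" with the given seeds produce the same order. */
int sketch_same_order(unsigned int count, uint32_t seed_a, uint32_t seed_b)
{
	unsigned int a[SKETCH_MAX], b[SKETCH_MAX];

	sketch_freelist_shuffle(a, count, seed_a);
	sketch_freelist_shuffle(b, count, seed_b);
	return memcmp(a, b, count * sizeof(a[0])) == 0;
}
```

With this sketch, `sketch_same_order(16, 1234, 1234)` returns 1, which mirrors a guest whose early seed does not change between boots. Drawing more values from the same generator does not help; only a different seed changes the result.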