From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753867AbdKFTFV (ORCPT ); Mon, 6 Nov 2017 14:05:21 -0500 Received: from mail-io0-f195.google.com ([209.85.223.195]:48622 "EHLO mail-io0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753101AbdKFTFT (ORCPT ); Mon, 6 Nov 2017 14:05:19 -0500 X-Google-Smtp-Source: ABhQp+R4gXxwe2jqdQjy/lbJiRw+mMm5z5NUE8pWJ1ZgB76Ab70DpcgTQeKmdZBJmVSodA1xKje+FJ8ReigF4f1tBAE= MIME-Version: 1.0 In-Reply-To: References: <001a11447f06b06409055cd621ed@google.com> <367fa416-1adb-53db-4295-c25cec17b893@redhat.com> <12909e30-82ec-f39b-743e-5fb1426b1b89@redhat.com> From: Dmitry Vyukov Date: Mon, 6 Nov 2017 20:04:58 +0100 Message-ID: Subject: Re: BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:LINE To: Paolo Bonzini Cc: David Hildenbrand , Nick Desaulniers , syzbot , Christian Borntraeger , "H. Peter Anvin" , KVM list , LKML , Lorenzo Stoakes , Michal Hocko , Ingo Molnar , =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , syzkaller-bugs@googlegroups.com, Thomas Gleixner , "the arch/x86 maintainers" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 6, 2017 at 5:14 PM, Paolo Bonzini wrote: > On 06/11/2017 17:01, David Hildenbrand wrote: >> On 06.11.2017 16:10, Nick Desaulniers wrote: >>> Does it have to be stack allocated? >> >> We can't use kmalloc and friends in emulate.c. We would have to >> introduce new emulator callbacks. >> >> a) for malloc and free. hmmm. >> b) for carrying out the fxrstr/fixup. >> >> Paolo, what do you suggest? > > You can use kmalloc. Any userspace user of emulate.c would have to > write a wrapper. Can you please tell me more about this? Is it used for testing? Is there an example code that builds and tests this in user-space? > But I'm not sure it's useful... maybe the > asm_safe+memcpy could be moved to a separate noinline function, so that > segmented_read_std is invoked with a leaner stack. > > Paolo > >>> >>> On Nov 6, 2017 3:52 AM, "David Hildenbrand" >> > wrote: >>> >>> On 31.10.2017 12:34, syzbot wrote: >>> > Hello, >>> > >>> > syzkaller hit the following crash on >>> > 91dfed74eabcdae9378131546c446442c29bf769 >>> > >>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master >>> >>> > compiler: gcc (GCC) 7.1.1 20170620 >>> > .config is attached >>> > Raw console output is attached. >>> > C reproducer is attached >>> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ >>> > for information about syzkaller reproducers >>> > >>> > >>> > in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109 >>> > 2 locks held by syzkaller879109/2909: >>> > #0: (&vcpu->mutex){+.+.}, at: [] >>> vcpu_load+0x1c/0x70 >>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:154 >>> > #1: (&kvm->srcu){....}, at: [] vcpu_enter_guest >>> > arch/x86/kvm/x86.c:6983 [inline] >>> > #1: (&kvm->srcu){....}, at: [] vcpu_run >>> > arch/x86/kvm/x86.c:7061 [inline] >>> > #1: (&kvm->srcu){....}, at: [] >>> > kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222 >>> > CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted >>> 4.13.0-rc4-next-20170811 >>> > #1 >>> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >>> 01/01/2011 >>> > Call Trace: >>> > __dump_stack lib/dump_stack.c:16 [inline] >>> > dump_stack+0x194/0x257 lib/dump_stack.c:52 >>> > ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014 >>> > __might_sleep+0x95/0x190 kernel/sched/core.c:5967 >>> > __might_fault+0xab/0x1d0 mm/memory.c:4383 >>> > __copy_from_user include/linux/uaccess.h:71 [inline] >>> > __kvm_read_guest_page+0x58/0xa0 >>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771 >>> > kvm_vcpu_read_guest_page+0x44/0x60 >>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791 >>> > kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407 >>> > kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466 >>> > segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819 >>> > em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022 >>> >>> >>> In em_fxrstor, we do a get_fpu(), which in return disables preemption. >>> >>> With preempt_disable(), we do a >>> segmented_read_std()->kvm_vcpu_read_guest_page(), triggering the >>> warning. >>> >>> > x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471 >>> > x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698 >>> > kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854 >>> > handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400 >>> > vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718 >>> > vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline] >>> > vcpu_run arch/x86/kvm/x86.c:7061 [inline] >>> > kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222 >>> > kvm_vcpu_ioctl+0x64c/0x1010 >>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591 >>> > vfs_ioctl fs/ioctl.c:45 [inline] >>> > do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 >>> > SYSC_ioctl fs/ioctl.c:700 [inline] >>> > SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 >>> > entry_SYSCALL_64_fastpath+0x1f/0xbe >>> >>> I don't really see a way to avoid two fxstate variables. Unloading the >>> fpu in between the fxstore/fxrstr could lead to host values getting >>> overwritten. Loading/saving the fpu in kvm_arch_vcpu_ioctl_run() would >>> most probably also not work, as the relevant portions of fxregs_state >>> would not get saved/restored. So the preemption would still be needed. >>> >>> >>> So all I can offer for now is the following (untested, can send as >>> proper patch if needed): >>> >>> >>> From f32d06c8c621c6d68e073e9bdb81a6280b6c9544 Mon Sep 17 00:00:00 2001 >>> From: David Hildenbrand > >>> Date: Mon, 6 Nov 2017 12:35:39 +0100 >>> Subject: [PATCH v1] KVM: x86: fix em_fxstor sleeping while in atomic >>> >>> Commit 9d643f63128b tried to optimize the stack size, but introduced a >>> guest memory access which might sleep while in atomic. >>> >>> Let's undo that part of the commit but keep the cleanups. >>> >>> Reported by syzbot: >>> >>> in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109 >>> 2 locks held by syzkaller879109/2909: >>> #0: (&vcpu->mutex){+.+.}, at: [] >>> vcpu_load+0x1c/0x70 >>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:154 >>> #1: (&kvm->srcu){....}, at: [] vcpu_enter_guest >>> arch/x86/kvm/x86.c:6983 [inline] >>> #1: (&kvm->srcu){....}, at: [] vcpu_run >>> arch/x86/kvm/x86.c:7061 [inline] >>> #1: (&kvm->srcu){....}, at: [] >>> kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222 >>> CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted >>> 4.13.0-rc4-next-20170811 >>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >>> 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:16 [inline] >>> dump_stack+0x194/0x257 lib/dump_stack.c:52 >>> ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014 >>> __might_sleep+0x95/0x190 kernel/sched/core.c:5967 >>> __might_fault+0xab/0x1d0 mm/memory.c:4383 >>> __copy_from_user include/linux/uaccess.h:71 [inline] >>> __kvm_read_guest_page+0x58/0xa0 >>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771 >>> kvm_vcpu_read_guest_page+0x44/0x60 >>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791 >>> kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407 >>> kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466 >>> segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819 >>> em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022 >>> x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471 >>> x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698 >>> kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854 >>> handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400 >>> vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718 >>> vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline] >>> vcpu_run arch/x86/kvm/x86.c:7061 [inline] >>> kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222 >>> kvm_vcpu_ioctl+0x64c/0x1010 >>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591 >>> vfs_ioctl fs/ioctl.c:45 [inline] >>> do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 >>> SYSC_ioctl fs/ioctl.c:700 [inline] >>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 >>> entry_SYSCALL_64_fastpath+0x1f/0xbe >>> RIP: 0033:0x437fc9 >>> RSP: 002b:00007ffc7b4d5ab8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 >>> RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000437fc9 >>> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005 >>> RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000020ae8000 >>> R10: 0000000000009120 R11: 0000000000000206 R12: 0000000000000000 >>> R13: 0000000000000004 R14: 0000000000000004 R15: 0000000020077000 >>> >>> Fixes: 9d643f63128b ("KVM: x86: avoid large stack allocations in >>> em_fxrstor") >>> Signed-off-by: David Hildenbrand >> > >>> --- >>> arch/x86/kvm/emulate.c | 16 +++++++++------- >>> 1 file changed, 9 insertions(+), 7 deletions(-) >>> >>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c >>> index fb0055953fbc..d87f01a2d6f4 100644 >>> --- a/arch/x86/kvm/emulate.c >>> +++ b/arch/x86/kvm/emulate.c >>> @@ -4002,7 +4002,7 @@ static int em_fxsave(struct x86_emulate_ctxt >>> *ctxt) >>> >>> static int em_fxrstor(struct x86_emulate_ctxt *ctxt) >>> { >>> - struct fxregs_state fx_state; >>> + struct fxregs_state fx_state, fx_old; >>> int rc; >>> size_t size; >>> >>> @@ -4010,19 +4010,21 @@ static int em_fxrstor(struct >>> x86_emulate_ctxt *ctxt) >>> if (rc != X86EMUL_CONTINUE) >>> return rc; >>> >>> + size = fxstate_size(ctxt); >>> + rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, >>> &fx_state, size); >>> + if (rc != X86EMUL_CONTINUE) >>> + return rc; >>> + >>> ctxt->ops->get_fpu(ctxt); >>> >>> - size = fxstate_size(ctxt); >>> if (size < __fxstate_size(16)) { >>> - rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state)); >>> + rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_old)); >>> if (rc != X86EMUL_CONTINUE) >>> goto out; >>> + memcpy(((void *)&fx_state) + size, ((void *)&fx_old) >>> + size, >>> + __fxstate_size(16) - size); >>> } >>> >>> - rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, >>> &fx_state, size); >>> - if (rc != X86EMUL_CONTINUE) >>> - goto out; >>> - >>> if (fx_state.mxcsr >> 16) { >>> rc = emulate_gp(ctxt, 0); >>> goto out; >>> -- >>> 2.13.6 >>> >>> >>> >>> -- >>> >>> Thanks, >>> >>> David / dhildenb >>> >> >> > > -- > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/fc4fec6a-13fc-e5f4-e78a-711f2fccec92%40redhat.com. > For more options, visit https://groups.google.com/d/optout.