From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751981AbdAPVe5 (ORCPT ); Mon, 16 Jan 2017 16:34:57 -0500 Received: from mail-lf0-f44.google.com ([209.85.215.44]:36560 "EHLO mail-lf0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751711AbdAPVey (ORCPT ); Mon, 16 Jan 2017 16:34:54 -0500 MIME-Version: 1.0 In-Reply-To: References: From: Dmitry Vyukov Date: Mon, 16 Jan 2017 22:34:26 +0100 Message-ID: Subject: Re: kvm: use-after-free in process_srcu To: Steve Rutherford Cc: syzkaller , Paolo Bonzini , =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , KVM list , LKML Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Jan 15, 2017 at 6:11 PM, Dmitry Vyukov wrote: > On Fri, Jan 13, 2017 at 10:19 AM, Dmitry Vyukov wrote: >> On Fri, Jan 13, 2017 at 4:30 AM, Steve Rutherford >> wrote: >>> I'm not that familiar with the kernel's workqueues, but this seems >>> like the classic "callback outlives the memory it references" >>> use-after-free, where the process_srcu callback is outliving struct >>> kvm (which contains the srcu_struct). If that's right, then calling >>> srcu_barrier (which should wait for all of the call_srcu callbacks to >>> complete, which are what enqueue the process_srcu callbacks) before >>> cleanup_srcu_struct in kvm_destroy_vm probably fixes this. >>> >>> The corresponding patch to virt/kvm/kvm_main.c looks something like: >>> static void kvm_destroy_vm(struct kvm *kvm) >>> { >>> ... >>> for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) >>> kvm_free_memslots(kvm, kvm->memslots[i]); >>> + srcu_barrier(&kvm->irq_srcu); >>> cleanup_srcu_struct(&kvm->irq_srcu); >>> + srcu_barrier(&kvm->srcu); >>> cleanup_srcu_struct(&kvm->srcu); >>> ... >>> >>> >>> Since we don't have a repro, this obviously won't be readily testable. >>> I find srcu subtle enough that I don't trust my reasoning fully (in >>> particular, I don't trust that waiting for all of the call_srcu >>> callbacks to complete also waits for all of the process_srcu >>> callbacks). Someone else know if that's the case? >> >> >> From the function description it looks like it should do the trick: >> >> 514 /** >> 515 * srcu_barrier - Wait until all in-flight call_srcu() callbacks complete. >> 516 * @sp: srcu_struct on which to wait for in-flight callbacks. >> 517 */ >> 518 void srcu_barrier(struct srcu_struct *sp) >> >> I see this failure happening several times per day. I've applied your >> patch locally and will check if I see these failures happening. > > > I have not seen the crash in 3 days, when usually I see several > crashes per night. So I think we can consider that the patch fixes the > crash: > > Tested-by: Dmitry Vyukov Unfortunately I hit it again with the patch applied. It definitely happens less frequently now, but still happens: BUG: KASAN: use-after-free in rcu_batch_empty kernel/rcu/srcu.c:64 [inline] at addr ffff88006ada0218 BUG: KASAN: use-after-free in rcu_batch_dequeue kernel/rcu/srcu.c:75 [inline] at addr ffff88006ada0218 BUG: KASAN: use-after-free in srcu_invoke_callbacks kernel/rcu/srcu.c:624 [inline] at addr ffff88006ada0218 BUG: KASAN: use-after-free in process_srcu+0x275/0x280 kernel/rcu/srcu.c:672 at addr ffff88006ada0218 Read of size 8 by task kworker/2:2/1073 CPU: 2 PID: 1073 Comm: kworker/2:2 Not tainted 4.10.0-rc3+ #164 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events_power_efficient process_srcu Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 kasan_object_err+0x1c/0x70 mm/kasan/report.c:165 print_address_description mm/kasan/report.c:203 [inline] kasan_report_error mm/kasan/report.c:287 [inline] kasan_report+0x1b6/0x460 mm/kasan/report.c:307 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:333 rcu_batch_empty kernel/rcu/srcu.c:64 [inline] rcu_batch_dequeue kernel/rcu/srcu.c:75 [inline] srcu_invoke_callbacks kernel/rcu/srcu.c:624 [inline] process_srcu+0x275/0x280 kernel/rcu/srcu.c:672 process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2098 worker_thread+0x223/0x1990 kernel/workqueue.c:2232 kthread+0x326/0x3f0 kernel/kthread.c:227 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:427 Object at ffff88006ada00c0, in cache kmalloc-16384 size: 16384 Allocated: PID = 32313 [] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 [] save_stack+0x43/0xd0 mm/kasan/kasan.c:502 [] set_track mm/kasan/kasan.c:514 [inline] [] kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:605 [] kmem_cache_alloc_trace+0x10b/0x670 mm/slab.c:3629 [] kvm_arch_alloc_vm include/linux/slab.h:490 [inline] [] kvm_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:613 [inline] [] kvm_dev_ioctl_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:3175 [inline] [] kvm_dev_ioctl+0x1be/0x11b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3219 [] vfs_ioctl fs/ioctl.c:43 [inline] [] do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683 [] SYSC_ioctl fs/ioctl.c:698 [inline] [] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689 [] entry_SYSCALL_64_fastpath+0x1f/0xc2 Freed: PID = 32310 [] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 [] save_stack+0x43/0xd0 mm/kasan/kasan.c:502 [] set_track mm/kasan/kasan.c:514 [inline] [] kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:578 [] __cache_free mm/slab.c:3505 [inline] [] kfree+0xd3/0x250 mm/slab.c:3822 [] kvm_arch_free_vm include/linux/kvm_host.h:781 [inline] [] kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:744 [inline] [] kvm_put_kvm+0x719/0x9a0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:759 [] kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:770 [] __fput+0x332/0x7f0 fs/file_table.c:208 [] ____fput+0x15/0x20 fs/file_table.c:244 [] task_work_run+0x18a/0x260 kernel/task_work.c:116 [] exit_task_work include/linux/task_work.h:21 [inline] [] do_exit+0x18e7/0x28a0 kernel/exit.c:839 [] do_group_exit+0x149/0x420 kernel/exit.c:943 [] get_signal+0x7e0/0x1820 kernel/signal.c:2313 [] do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807 [] exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:156 [] prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline] [] syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259 [] entry_SYSCALL_64_fastpath+0xc0/0xc2 Memory state around the buggy address: ffff88006ada0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88006ada0180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88006ada0200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88006ada0280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88006ada0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Disabling lock debugging due to kernel taint Kernel panic - not syncing: panic_on_warn set ... CPU: 2 PID: 1073 Comm: kworker/2:2 Tainted: G B 4.10.0-rc3+ #164 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events_power_efficient process_srcu Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 panic+0x1fb/0x412 kernel/panic.c:179 kasan_end_report+0x5b/0x60 mm/kasan/report.c:144 kasan_report_error mm/kasan/report.c:291 [inline] kasan_report+0x386/0x460 mm/kasan/report.c:307 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:333 rcu_batch_empty kernel/rcu/srcu.c:64 [inline] rcu_batch_dequeue kernel/rcu/srcu.c:75 [inline] srcu_invoke_callbacks kernel/rcu/srcu.c:624 [inline] process_srcu+0x275/0x280 kernel/rcu/srcu.c:672 process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2098 worker_thread+0x223/0x1990 kernel/workqueue.c:2232 kthread+0x326/0x3f0 kernel/kthread.c:227 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:427 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled