Subject: Re: kvm: use-after-free in process_srcu
From: Paolo Bonzini
To: Dmitry Vyukov, Paul McKenney
Cc: Steve Rutherford, syzkaller, Radim Krčmář, KVM list, LKML
Date: Tue, 17 Jan 2017 12:08:30 +0100
References: <754246063.9562871.1484603305281.JavaMail.zimbra@redhat.com>

On 17/01/2017 10:56, Dmitry Vyukov wrote:
>> I am seeing use-after-frees in process_srcu as struct srcu_struct
>> is already freed.  Before freeing struct srcu_struct, the code does
>> cleanup_srcu_struct(&kvm->irq_srcu).  We also tried to do:
>>
>> +       srcu_barrier(&kvm->irq_srcu);
>>         cleanup_srcu_struct(&kvm->irq_srcu);
>>
>> It reduced the rate of use-after-frees, but did not eliminate them
>> completely.  The full thread is here:
>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
>>
>> Does Paolo's fix above make sense to you?  Namely, adding
>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?
>
> I am not sure about the interaction of flush_delayed_work and
> srcu_reschedule... flush_delayed_work probably assumes that no work
> is queued concurrently, but what if srcu_reschedule queues another
> work item concurrently... can't it happen that flush_delayed_work
> will miss that newly scheduled work?

Newly scheduled callbacks would be a bug in SRCU usage, but my patch
is indeed insufficient.  Because of SRCU's two-phase algorithm, it is
possible that the first flush_delayed_work doesn't invoke all
callbacks.
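For context, the teardown pattern that triggers the report looks
roughly like this (a sketch modeled on the kvm_destroy_vm path in
virt/kvm/kvm_main.c, with unrelated cleanup elided):

    /*
     * Rough sketch of the KVM teardown path.  Both srcu_structs are
     * embedded in struct kvm, so a process_srcu work item that is
     * still queued when the VM is freed dereferences freed memory:
     * the use-after-free that syzkaller reports.
     */
    static void kvm_destroy_vm(struct kvm *kvm)
    {
            /* ... destroy devices, buses, MMU, drop references ... */
            cleanup_srcu_struct(&kvm->irq_srcu);  /* must flush sp->work */
            cleanup_srcu_struct(&kvm->srcu);
            kvm_arch_free_vm(kvm);                /* frees struct kvm itself */
    }

The fix therefore has to make cleanup_srcu_struct wait until no work
is queued at all, so that nothing can outlive the structure once the
caller frees it.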
Instead I would propose this (still untested, but this time with a
commit message):

---------------- 8< --------------
From: Paolo Bonzini
Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"

Even though there are no concurrent readers, it is possible that the
work item is queued for delayed processing when cleanup_srcu_struct
is called.  The work item needs to be flushed before returning, or a
use-after-free can ensue.

Furthermore, because of SRCU's two-phase algorithm it may take up to
two executions of srcu_advance_batches before all callbacks are
invoked.  This can happen if the first flush_delayed_work runs as
follows:

    srcu_read_lock
                            process_srcu
                              srcu_advance_batches
                                ...
                                if (!try_check_zero(sp, idx^1, trycount))
                                        // there is a reader
                                        return;
                              srcu_invoke_callbacks
                                ...
    srcu_read_unlock
    cleanup_srcu_struct
      flush_delayed_work
                            srcu_reschedule
                              queue_delayed_work

Now flush_delayed_work returns, but srcu_reschedule will *not* have
cleared sp->running to false.

Not-tested-by: Paolo Bonzini
Signed-off-by: Paolo Bonzini

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index 9b9cdd549caa..9470f1ba2ef2 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
 {
 	if (WARN_ON(srcu_readers_active(sp)))
 		return; /* Leakage unless caller handles error. */
+
+	/*
+	 * No readers active, so any pending callbacks will rush through
+	 * the two batches before sp->running becomes false.  No risk of
+	 * busy-waiting.
+	 */
+	while (sp->running)
+		flush_delayed_work(&sp->work);
+
 	free_percpu(sp->per_cpu_ref);
 	sp->per_cpu_ref = NULL;
 }

Thanks,

Paolo