From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751288AbdAQMGk (ORCPT ); Tue, 17 Jan 2017 07:06:40 -0500 Received: from mail-lf0-f65.google.com ([209.85.215.65]:35265 "EHLO mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750831AbdAQMGF (ORCPT ); Tue, 17 Jan 2017 07:06:05 -0500 Subject: Re: kvm: use-after-free in process_srcu To: Dmitry Vyukov References: <754246063.9562871.1484603305281.JavaMail.zimbra@redhat.com> Cc: Paul McKenney , Steve Rutherford , syzkaller , =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , KVM list , LKML From: Paolo Bonzini Message-ID: Date: Tue, 17 Jan 2017 13:03:28 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 17/01/2017 12:13, Dmitry Vyukov wrote: > On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini wrote: >> >> >> On 17/01/2017 10:56, Dmitry Vyukov wrote: >>>> I am seeing use-after-frees in process_srcu as struct srcu_struct is >>>> already freed. Before freeing struct srcu_struct, code does >>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do: >>>> >>>> + srcu_barrier(&kvm->irq_srcu); >>>> cleanup_srcu_struct(&kvm->irq_srcu); >>>> >>>> It reduced rate of use-after-frees, but did not eliminate them >>>> completely. The full threaded is here: >>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ >>>> >>>> Does Paolo's fix above make sense to you? Namely adding >>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()? >>> >>> I am not sure about interaction of flush_delayed_work and >>> srcu_reschedule... flush_delayed_work probably assumes that no work is >>> queued concurrently, but what if srcu_reschedule queues another work >>> concurrently... can't it happen that flush_delayed_work will miss that >>> newly scheduled work? >> >> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is > > I mean not srcu callbacks, but the sp->work being rescheduled. > Consider that callbacks are already scheduled. We call > flush_delayed_work, it waits for completion of process_srcu. But that > process_srcu schedules sp->work again in srcu_reschedule. > > >> indeed insufficient. Because of SRCU's two-phase algorithm, it's possible >> that the first flush_delayed_work doesn't invoke all callbacks. Instead I >> would propose this (still untested, but this time with a commit message): >> >> ---------------- 8< -------------- >> From: Paolo Bonzini >> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up" >> >> Even though there are no concurrent readers, it is possible that the >> work item is queued for delayed processing when cleanup_srcu_struct is >> called. The work item needs to be flushed before returning, or a >> use-after-free can ensue. >> >> Furthermore, because of SRCU's two-phase algorithm it may take up to >> two executions of srcu_advance_batches before all callbacks are invoked. >> This can happen if the first flush_delayed_work happens as follows >> >> srcu_read_lock >> process_srcu >> srcu_advance_batches >> ... >> if (!try_check_zero(sp, idx^1, trycount)) >> // there is a reader >> return; >> srcu_invoke_callbacks >> ... >> srcu_read_unlock >> cleanup_srcu_struct >> flush_delayed_work >> srcu_reschedule >> queue_delayed_work >> >> Now flush_delayed_work returns but srcu_reschedule will *not* have cleared >> sp->running to false. >> >> Not-tested-by: Paolo Bonzini >> Signed-off-by: Paolo Bonzini >> >> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c >> index 9b9cdd549caa..9470f1ba2ef2 100644 >> --- a/kernel/rcu/srcu.c >> +++ b/kernel/rcu/srcu.c >> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp) >> { >> if (WARN_ON(srcu_readers_active(sp))) >> return; /* Leakage unless caller handles error. */ >> + >> + /* >> + * No readers active, so any pending callbacks will rush through the two >> + * batches before sp->running becomes false. No risk of busy-waiting. >> + */ >> + while (sp->running) >> + flush_delayed_work(&sp->work); > > Unsynchronized accesses to shared state make me nervous. running is > meant to be protected with sp->queue_lock. I think it could just be while (flush_delayed_work(&sp->work)); but let's wait for Paul. Paolo > At least we will get back to you with a KTSAN report. > >> free_percpu(sp->per_cpu_ref); >> sp->per_cpu_ref = NULL; >> } >> >> >> Thanks, >> >> Paolo >> >> -- >> You received this message because you are subscribed to the Google Groups "syzkaller" group. >> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >