All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: "Steve Rutherford" <srutherford@google.com>,
	syzkaller <syzkaller@googlegroups.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"KVM list" <kvm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: kvm: use-after-free in process_srcu
Date: Tue, 17 Jan 2017 10:56:34 +0100	[thread overview]
Message-ID: <CACT4Y+Y82G0KHgBGLvcACSgVxczeJV1TK-umg3qeZXX4S3D2qw@mail.gmail.com> (raw)
In-Reply-To: <CACT4Y+YTuqCK24X-DYpbG6jg9Nqqvb2uikZzyZeqFUmUmGbipw@mail.gmail.com>

On Tue, Jan 17, 2017 at 10:47 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jan 16, 2017 at 10:48 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Dmitry Vyukov" <dvyukov@google.com>
>>> To: "Steve Rutherford" <srutherford@google.com>
>>> Cc: "syzkaller" <syzkaller@googlegroups.com>, "Paolo Bonzini" <pbonzini@redhat.com>, "Radim Krčmář"
>>> <rkrcmar@redhat.com>, "KVM list" <kvm@vger.kernel.org>, "LKML" <linux-kernel@vger.kernel.org>
>>> Sent: Monday, January 16, 2017 10:34:26 PM
>>> Subject: Re: kvm: use-after-free in process_srcu
>>>
>>> On Sun, Jan 15, 2017 at 6:11 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> > On Fri, Jan 13, 2017 at 10:19 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> >> On Fri, Jan 13, 2017 at 4:30 AM, Steve Rutherford
>>> >> <srutherford@google.com> wrote:
>>> >>> I'm not that familiar with the kernel's workqueues, but this seems
>>> >>> like the classic "callback outlives the memory it references"
>>> >>> use-after-free, where the process_srcu callback is outliving struct
>>> >>> kvm (which contains the srcu_struct). If that's right, then calling
>>> >>> srcu_barrier (which should wait for all of the call_srcu callbacks to
>>> >>> complete, which are what enqueue the process_srcu callbacks) before
>>> >>> cleanup_srcu_struct in kvm_destroy_vm probably fixes this.
>>> >>>
>>> >>> The corresponding patch to virt/kvm/kvm_main.c looks something like:
>>> >>> static void kvm_destroy_vm(struct kvm *kvm)
>>> >>> {
>>> >>> ...
>>> >>>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
>>> >>>                 kvm_free_memslots(kvm, kvm->memslots[i]);
>>> >>> +      srcu_barrier(&kvm->irq_srcu);
>>> >>>         cleanup_srcu_struct(&kvm->irq_srcu);
>>> >>> +      srcu_barrier(&kvm->srcu);
>>> >>>         cleanup_srcu_struct(&kvm->srcu);
>>> >>> ...
>>> >>>
>>> >>>
>>> >>> Since we don't have a repro, this obviously won't be readily testable.
>>> >>> I find srcu subtle enough that I don't trust my reasoning fully (in
>>> >>> particular, I don't trust that waiting for all of the call_srcu
>>> >>> callbacks to complete also waits for all of the process_srcu
>>> >>> callbacks). Someone else know if that's the case?
>>> >>
>>> >>
>>> >> From the function description it looks like it should do the trick:
>>> >>
>>> >> 514 /**
>>> >> 515  * srcu_barrier - Wait until all in-flight call_srcu() callbacks
>>> >> complete.
>>> >> 516  * @sp: srcu_struct on which to wait for in-flight callbacks.
>>> >> 517  */
>>> >> 518 void srcu_barrier(struct srcu_struct *sp)
>>> >>
>>> >> I see this failure happening several times per day. I've applied your
>>> >> patch locally and will check if I see these failures happening.
>>> >
>>> >
>>> > I have not seen the crash in 3 days, when usually I see several
>>> > crashes per night. So I think we can consider that the patch fixes the
>>> > crash:
>>> >
>>> > Tested-by: Dmitry Vyukov <dvyukov@google.com>
>>>
>>>
>>> Unfortunately I hit it again with the patch applied. It definitely
>>> happens less frequently now, but still happens:
>>
>> Try this:
>>
>> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
>> index 9b9cdd549caa..ef5599c65299 100644
>> --- a/kernel/rcu/srcu.c
>> +++ b/kernel/rcu/srcu.c
>> @@ -283,6 +283,7 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
>>  {
>>         if (WARN_ON(srcu_readers_active(sp)))
>>                 return; /* Leakage unless caller handles error. */
>> +       flush_delayed_work(&sp->work);
>>         free_percpu(sp->per_cpu_ref);
>>         sp->per_cpu_ref = NULL;
>>  }
>>
>> I think it should subsume Steve's patch, but I'm not 101% sure.  We
>> will have to run this through Paul.
>
>
> Hi +Pual,
>
> I am seeing use-after-frees in process_srcu as struct srcu_struct is
> already freed. Before freeing struct srcu_struct, code does
> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
>
> +      srcu_barrier(&kvm->irq_srcu);
>          cleanup_srcu_struct(&kvm->irq_srcu);
>
> It reduced rate of use-after-frees, but did not eliminate them
> completely. The full threaded is here:
> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
>
> Does Paolo's fix above make sense to you? Namely adding
> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?


I am not sure about interaction of flush_delayed_work and
srcu_reschedule... flush_delayed_work probably assumes that no work is
queued concurrently, but what if srcu_reschedule queues another work
concurrently... can't it happen that flush_delayed_work will miss that
newly scheduled work?

Meanwhile I will apply this change instead of Steve's change and see
what happens.

  reply	other threads:[~2017-01-17  9:56 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-11  6:46 kvm: use-after-free in process_srcu Dmitry Vyukov
2016-12-11  8:40 ` Vegard Nossum
2016-12-11  8:49   ` Dmitry Vyukov
2017-01-13  3:30     ` Steve Rutherford
2017-01-13  9:19       ` Dmitry Vyukov
2017-01-15 17:11         ` Dmitry Vyukov
2017-01-16 21:34           ` Dmitry Vyukov
2017-01-16 21:48             ` Paolo Bonzini
2017-01-17  9:47               ` Dmitry Vyukov
2017-01-17  9:56                 ` Dmitry Vyukov [this message]
2017-01-17 11:08                   ` Paolo Bonzini
2017-01-17 11:13                     ` Dmitry Vyukov
2017-01-17 12:03                       ` Paolo Bonzini
2017-01-17 20:34                         ` Paul E. McKenney
2017-01-18  8:53                           ` Paolo Bonzini
2017-01-18 22:15                             ` Paul E. McKenney
2017-01-19  9:27                               ` Paolo Bonzini
2017-01-19 21:52                                 ` Paul McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+Y82G0KHgBGLvcACSgVxczeJV1TK-umg3qeZXX4S3D2qw@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    --cc=srutherford@google.com \
    --cc=syzkaller@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.