All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Gardon <bgardon@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: LKML <linux-kernel@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
	Peter Xu <peterx@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Peter Shier <pshier@google.com>,
	Junaid Shahid <junaids@google.com>,
	Jim Mattson <jmattson@google.com>,
	Yulei Zhang <yulei.kernel@gmail.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Subject: Re: [PATCH 5/6] KVM: x86/mmu: Protect kvm->memslots with a mutex
Date: Wed, 28 Apr 2021 13:40:12 -0700	[thread overview]
Message-ID: <CANgfPd9kVJOAR_uq+oh9kE2gr00EUAGSPiJ9jMR9BdG2CAC+BA@mail.gmail.com> (raw)
In-Reply-To: <d936b13b-bb00-fc93-de3b-adc59fa32a7b@redhat.com>

On Wed, Apr 28, 2021 at 10:46 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 28/04/21 18:40, Ben Gardon wrote:
> > On Tue, Apr 27, 2021 at 11:25 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >> On 28/04/21 00:36, Ben Gardon wrote:
> >>> +void kvm_arch_assign_memslots(struct kvm *kvm, int as_id,
> >>> +                          struct kvm_memslots *slots)
> >>> +{
> >>> +     mutex_lock(&kvm->arch.memslot_assignment_lock);
> >>> +     rcu_assign_pointer(kvm->memslots[as_id], slots);
> >>> +     mutex_unlock(&kvm->arch.memslot_assignment_lock);
> >>> +}
> >>
> >> Does the assignment also needs the lock, or only the rmap allocation?  I
> >> would prefer the hook to be something like kvm_arch_setup_new_memslots.
> >
> > The assignment does need to be under the lock to prevent the following race:
> > 1. Thread 1 (installing a new memslot): Acquires memslot assignment
> > lock (or perhaps in this case rmap_allocation_lock would be more apt.)
> > 2. Thread 1: Check alloc_memslot_rmaps (it is false)
> > 3. Thread 1: doesn't allocate memslot rmaps for new slot
> > 4. Thread 1: Releases memslot assignment lock
> > 5. Thread 2 (allocating a shadow root): Acquires memslot assignment lock
> > 6. Thread 2: Sets alloc_memslot_rmaps = true
> > 7. Thread 2: Allocates rmaps for all existing slots
> > 8. Thread 2: Releases memslot assignment lock
> > 9. Thread 2: Sets shadow_mmu_active = true
> > 10. Thread 1: Installs the new memslots
> > 11. Thread 3: Null pointer dereference when trying to access rmaps on
> > the new slot.
>
> ... because thread 3 would be under mmu_lock and therefore cannot
> allocate the rmap itself (you have to do it in mmu_alloc_shadow_roots,
> as in patch 6).
>
> Related to this, your solution does not have to protect kvm_dup_memslots
> with the new lock, because the first update of the memslots will not go
> through kvm_arch_prepare_memory_region but it _will_ go through
> install_new_memslots and therefore through the new hook.  But overall I
> think I'd prefer to have a kvm->slots_arch_lock mutex in generic code,
> and place the call to kvm_dup_memslots and
> kvm_arch_prepare_memory_region inside that mutex.

That makes sense, and I think it also avoids a bug in this series.
Currently, if the rmaps are allocated between kvm_dup_memslots and and
install_new_memslots, we run into a problem where the copied slots
will have new rmaps allocated for them before installation.
Potentially creating all sorts of problems. I had fixed that issue in
the past by allocating the array of per-level rmaps at memslot
creation, seperate from the memslot. That meant that the copy would
get the newly allocated rmaps as well, but I thought that was obsolete
after some memslot refactors went in.

I hoped we could get away without that change, but I was probably
wrong. However, with this enlarged critical section, that should not
be an issue for creating memslots since kvm_dup_memslots will either
copy memslots with the rmaps already allocated, or the whole thing
will happen before the rmaps are allocated.

... However with the locking you propose below, we might still run
into issues on a move or delete, which would mean we'd still need the
separate memory allocation for the rmaps array. Or we do some
shenanigans where we try to copy the rmap pointers from the other set
of memslots.

I can put together a v2 with the seperate rmap memory and more generic
locking and see how that looks.

>
> That makes the new lock decently intuitive, and easily documented as
> "Architecture code can use slots_arch_lock if the contents of struct
> kvm_arch_memory_slot needs to be written outside
> kvm_arch_prepare_memory_region.  Unlike slots_lock, slots_arch_lock can
> be taken inside a ``kvm->srcu`` read-side critical section".
>
> I admit I haven't thought about it very thoroughly, but if something
> like this is enough, it is relatively pretty:
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9b8e30dd5b9b..6e5106365597 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1333,6 +1333,7 @@ static struct kvm_memslots
> *install_new_memslots(struct kvm *kvm,
>
>         rcu_assign_pointer(kvm->memslots[as_id], slots);
>
> +       mutex_unlock(&kvm->slots_arch_lock);
>         synchronize_srcu_expedited(&kvm->srcu);
>
>         /*
> @@ -1399,6 +1398,7 @@ static int kvm_set_memslot(struct kvm *kvm,
>         struct kvm_memslots *slots;
>         int r;
>
> +       mutex_lock(&kvm->slots_arch_lock);
>         slots = kvm_dup_memslots(__kvm_memslots(kvm, as_id), change);
>         if (!slots)
>                 return -ENOMEM;
> @@ -1427,6 +1427,7 @@ static int kvm_set_memslot(struct kvm *kvm,
>                  *      - kvm_is_visible_gfn (mmu_check_root)
>                  */
>                 kvm_arch_flush_shadow_memslot(kvm, slot);
> +               mutex_lock(&kvm->slots_arch_lock);
>         }
>
>         r = kvm_arch_prepare_memory_region(kvm, new, mem, change);
>
> It does make the critical section a bit larger, so that the
> initialization of the shadow page (which is in KVM_RUN context) contends
> with slightly more code than necessary.  However it's all but a
> performance critical situation, as it will only happen just once per VM.

I agree performance is not a huge concern here. Excluding
kvm_arch_flush_shadow_memslot from the critical section also helps a
lot because that's where most of the work could be if we're deleting /
moving a slot.
My only worry is the latency this could add to a nested VM launch, but
it seems pretty unlikely that that would be frequently coinciding with
a memslot change in practice.

>
> WDYT?
>
> Paolo
>
> > Putting the assignment under the lock prevents 5-8 from happening
> > between 2 and 10.
> >
> > I'm open to other ideas as far as how to prevent this race though. I
> > admit this solution is not the most elegant looking.
>
>

  reply	other threads:[~2021-04-28 20:40 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-27 22:36 [PATCH 0/6] Lazily allocate memslot rmaps Ben Gardon
2021-04-27 22:36 ` [PATCH 1/6] KVM: x86/mmu: Track if shadow MMU active Ben Gardon
2021-04-27 22:36 ` [PATCH 2/6] KVM: x86/mmu: Skip rmap operations if shadow MMU inactive Ben Gardon
2021-04-27 22:36 ` [PATCH 3/6] KVM: x86/mmu: Deduplicate rmap freeing in allocate_memslot_rmap Ben Gardon
2021-04-28 10:00   ` Paolo Bonzini
2021-04-28 16:23     ` Ben Gardon
2021-04-27 22:36 ` [PATCH 4/6] KVM: x86/mmu: Factor out allocating memslot rmap Ben Gardon
2021-04-27 22:36 ` [PATCH 5/6] KVM: x86/mmu: Protect kvm->memslots with a mutex Ben Gardon
2021-04-28  6:25   ` Paolo Bonzini
2021-04-28 16:40     ` Ben Gardon
2021-04-28 17:46       ` Paolo Bonzini
2021-04-28 20:40         ` Ben Gardon [this message]
2021-04-28 21:41           ` Paolo Bonzini
2021-04-28 21:46             ` Ben Gardon
2021-04-28 23:42               ` Paolo Bonzini
2021-04-29  0:40                 ` Sean Christopherson
2021-04-29  1:42                   ` Sean Christopherson
2021-04-29  7:02                   ` Paolo Bonzini
2021-04-29 17:45                     ` Ben Gardon
2021-04-27 22:36 ` [PATCH 6/6] KVM: x86/mmu: Lazily allocate memslot rmaps Ben Gardon
2021-04-28 10:03   ` Paolo Bonzini
2021-04-28 16:45     ` Ben Gardon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANgfPd9kVJOAR_uq+oh9kE2gr00EUAGSPiJ9jMR9BdG2CAC+BA@mail.gmail.com \
    --to=bgardon@google.com \
    --cc=jmattson@google.com \
    --cc=junaids@google.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=pshier@google.com \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=xiaoguangrong.eric@gmail.com \
    --cc=yulei.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.