From: Mark Rutland <mark.rutland@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: linux-kernel@vger.kernel.org, aleksandar.qemu.devel@gmail.com,
	alexandru.elisei@arm.com, anup.patel@wdc.com,
	aou@eecs.berkeley.edu, atish.patra@wdc.com,
	benh@kernel.crashing.org, borntraeger@linux.ibm.com,
	bp@alien8.de, catalin.marinas@arm.com, chenhuacai@kernel.org,
	dave.hansen@linux.intel.com, david@redhat.com,
	frankja@linux.ibm.com, frederic@kernel.org, gor@linux.ibm.com,
	hca@linux.ibm.com, imbrenda@linux.ibm.com, james.morse@arm.com,
	jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
	mingo@redhat.com, mpe@ellerman.id.au, nsaenzju@redhat.com,
	palmer@dabbelt.com, paulmck@kernel.org, paulus@samba.org,
	paul.walmsley@sifive.com, pbonzini@redhat.com, seanjc@google.com,
	suzuki.poulose@arm.com, tglx@linutronix.de,
	tsbogend@alpha.franken.de, vkuznets@redhat.com,
	wanpengli@tencent.com, will@kernel.org
Subject: Re: [PATCH 1/5] kvm: add exit_to_guest_mode() and enter_from_guest_mode()
Date: Thu, 13 Jan 2022 11:01:30 +0000
Message-ID: <YeAGit8JTO/AmAaU@FVFF77S0Q05N>
In-Reply-To: <87v8yqrwcs.wl-maz@kernel.org>

On Tue, Jan 11, 2022 at 05:54:59PM +0000, Marc Zyngier wrote:
> Hi Mark,
> 
> On Tue, 11 Jan 2022 15:35:35 +0000,
> Mark Rutland <mark.rutland@arm.com> wrote:
> > 
> > When transitioning to/from guest mode, it is necessary to inform
> > lockdep, tracing, and RCU in a specific order, similar to the
> > requirements for transitions to/from user mode. Additionally, it is
> > necessary to perform vtime accounting for a window around running the
> > guest, with RCU enabled, such that timer interrupts taken from the guest
> > can be accounted as guest time.
> > 
> > Most architectures don't handle all the necessary pieces, and have a
> > number of common bugs, including unsafe usage of RCU during the window
> > between guest_enter() and guest_exit().
> > 
> > On x86, this was dealt with across commits:
> > 
> >   87fa7f3e98a1310e ("x86/kvm: Move context tracking where it belongs")
> >   0642391e2139a2c1 ("x86/kvm/vmx: Add hardirq tracing to guest enter/exit")
> >   9fc975e9efd03e57 ("x86/kvm/svm: Add hardirq tracing on guest enter/exit")
> >   3ebccdf373c21d86 ("x86/kvm/vmx: Move guest enter/exit into .noinstr.text")
> >   135961e0a7d555fc ("x86/kvm/svm: Move guest enter/exit into .noinstr.text")
> >   160457140187c5fb ("KVM: x86: Defer vtime accounting 'til after IRQ handling")
> >   bc908e091b326467 ("KVM: x86: Consolidate guest enter/exit logic to common helpers")
> > 
> > ... but those fixes are specific to x86, and as the resulting logic
> > (while correct) is split across generic helper functions and
> > x86-specific helper functions, it is difficult to see that the
> > entry/exit accounting is balanced.
> > 
> > This patch adds generic helpers which architectures can use to handle
> > guest entry/exit consistently and correctly. The guest_{enter,exit}()
> > helpers are split into guest_timing_{enter,exit}() to perform vtime
> > accounting, and guest_context_{enter,exit}() to perform the necessary
> > context tracking and RCU management. The existing guest_{enter,exit}()
> > helpers are left as wrappers of these.
> > 
> > Atop this, new exit_to_guest_mode() and enter_from_guest_mode() helpers
> > are added to handle the ordering of lockdep, tracing, and RCU management.
> > These are named to align with exit_to_user_mode() and
> > enter_from_user_mode().
> > 
> > Subsequent patches will migrate architectures over to the new helpers,
> > following a sequence:
> > 
> > 	guest_timing_enter_irqoff();
> > 
> > 	exit_to_guest_mode();
> > 	< run the vcpu >
> > 	enter_from_guest_mode();
> > 
> > 	< take any pending IRQs >
> > 
> > 	guest_timing_exit_irqoff();
> > 
> > This sequence handles all of the above correctly, and more clearly
> > balances the entry and exit portions, making it easier to understand.
> > 
> > The existing helpers are marked as deprecated, and will be removed once
> > all architectures have been converted.
> > 
> > There should be no functional change as a result of this patch.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> 
> Thanks a lot for looking into this and writing this up. I have a
> couple of comments below, but that's pretty much cosmetic and is only
> there to ensure that I actually understand this stuff. FWIW:
> 
> Reviewed-by: Marc Zyngier <maz@kernel.org>

Thanks!

> > ---
> >  include/linux/kvm_host.h | 108 +++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 105 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index c310648cc8f1..13fcf7979880 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -29,6 +29,8 @@
> >  #include <linux/refcount.h>
> >  #include <linux/nospec.h>
> >  #include <linux/notifier.h>
> > +#include <linux/ftrace.h>
> > +#include <linux/instrumentation.h>
> >  #include <asm/signal.h>
> >  
> >  #include <linux/kvm.h>
> > @@ -362,8 +364,11 @@ struct kvm_vcpu {
> >  	int last_used_slot;
> >  };
> >  
> > -/* must be called with irqs disabled */
> > -static __always_inline void guest_enter_irqoff(void)
> > +/*
> > + * Start accounting time towards a guest.
> > + * Must be called before entering guest context.
> > + */
> > +static __always_inline void guest_timing_enter_irqoff(void)
> >  {
> >  	/*
> >  	 * This is running in ioctl context so its safe to assume that it's the
> > @@ -372,7 +377,17 @@ static __always_inline void guest_enter_irqoff(void)
> >  	instrumentation_begin();
> >  	vtime_account_guest_enter();
> >  	instrumentation_end();
> > +}
> >  
> > +/*
> > + * Enter guest context and enter an RCU extended quiescent state.
> > + *
> > + * This should be the last thing called before entering the guest, and must be
> > + * called after any potential use of RCU (including any potentially
> > + * instrumented code).
> 
> nit: "the last thing called" is terribly ambiguous. Any architecture
> obviously calls a ****load of stuff after this point. Should this be
> 'the last thing involving RCU' instead?

I agree this is unclear, and I struggled to find good wording for this. Is the
following any better?

/*
 * Enter guest context and enter an RCU extended quiescent state.
 *
 * Between guest_context_enter_irqoff() and guest_context_exit_irqoff() it is
 * unsafe to use any code which may directly or indirectly use RCU, tracing
 * (including IRQ flag tracing), or lockdep. All code in this period must be
 * non-instrumentable.
 */

If that's good I can add similar to guest_context_exit_irqoff().

[...]

> > +/**
> > + * exit_to_guest_mode - Fixup state when exiting to guest mode
> > + *
> > + * This is analogous to exit_to_user_mode(), and ensures we perform the
> > + * following in order:
> > + *
> > + * 1) Trace interrupts on state
> > + * 2) Invoke context tracking if enabled to adjust RCU state
> > + * 3) Tell lockdep that interrupts are enabled
> 
> nit: or rather, are about to be enabled? Certainly on arm64, the
> enable happens much later, right at the point where we enter the guest
> for real.

True; I'd cribbed the wording from the comment block above exit_to_user_mode(),
but I stripped the context that made that clear. I'll make that:

	/**
	 * exit_to_guest_mode - Fixup state when exiting to guest mode
	 *
	 * Entry to a guest will enable interrupts, but the kernel state is
	 * interrupts disabled when this is invoked. Also tell RCU about it.
	 *
	 * 1) Trace interrupts on state
	 * 2) Invoke context tracking if enabled to adjust RCU state
	 * 3) Tell lockdep that interrupts are enabled
	 *
	 * Invoked from architecture specific code before entering a guest.
	 * Must be called with interrupts disabled and the caller must be
	 * non-instrumentable.
	 * The caller has to invoke guest_timing_enter_irqoff() before this.
	 *
	 * Note: this is analogous to exit_to_user_mode().
	 */

... with likewise for enter_from_guest_mode(), if that's clear enough?
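
To illustrate the intended ordering, here is a minimal userspace sketch (not
kernel code). It models the sequence from the commit message together with the
three steps listed in the comment above. Every mock_* function, the order_log
buffer, and the rcu_watching/timing_guest flags are hypothetical names made up
for this sketch. The step order inside mock_enter_from_guest_mode() is assumed
to mirror exit_to_guest_mode() in reverse; the comment above does not spell
that out.

```c
#include <assert.h>
#include <string.h>

/* Userspace model only: all names below are hypothetical stand-ins. */
static char order_log[256];
static int rcu_watching = 1;	/* 1 while RCU may safely be used */
static int timing_guest;	/* 1 while time is accounted to the guest */

/* Append "step;" to the log so the call order can be inspected. */
static void log_step(const char *step)
{
	strncat(order_log, step, sizeof(order_log) - strlen(order_log) - 1);
	strncat(order_log, ";", sizeof(order_log) - strlen(order_log) - 1);
}

static void mock_guest_timing_enter(void)
{
	assert(rcu_watching);	/* vtime accounting runs with RCU usable */
	timing_guest = 1;
	log_step("timing_enter");
}

static void mock_exit_to_guest_mode(void)
{
	log_step("trace_irqs_on");	/* 1) trace interrupts on state */
	rcu_watching = 0;
	log_step("context_enter");	/* 2) context tracking adjusts RCU */
	log_step("lockdep_irqs_on");	/* 3) tell lockdep about IRQs */
}

static void mock_run_vcpu(void)
{
	assert(!rcu_watching);	/* guest runs inside the RCU EQS */
	log_step("run_vcpu");
}

static void mock_enter_from_guest_mode(void)
{
	/* Assumption: the mirror image of mock_exit_to_guest_mode(). */
	log_step("lockdep_irqs_off");
	rcu_watching = 1;
	log_step("context_exit");
	log_step("trace_irqs_off");
}

static void mock_handle_pending_irqs(void)
{
	/* IRQs taken here need RCU and are still accounted as guest time. */
	assert(rcu_watching && timing_guest);
	log_step("pending_irqs");
}

static void mock_guest_timing_exit(void)
{
	timing_guest = 0;
	log_step("timing_exit");
}

/* Run one modelled guest entry/exit and return the recorded order. */
const char *model_vcpu_run(void)
{
	order_log[0] = '\0';
	mock_guest_timing_enter();
	mock_exit_to_guest_mode();
	mock_run_vcpu();
	mock_enter_from_guest_mode();
	mock_handle_pending_irqs();
	mock_guest_timing_exit();
	return order_log;
}
```

Calling model_vcpu_run() returns the recorded step order as a single string.
The asserts encode the two properties the sequence is meant to guarantee in
this model: the vcpu only runs while RCU is not watching, and pending IRQs are
handled with RCU usable while time is still accounted to the guest.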

FWIW, the comment block for exit_to_user_mode() in
include/linux/entry-common.h says:

	/**
	 * exit_to_user_mode - Fixup state when exiting to user mode
	 *
	 * Syscall/interrupt exit enables interrupts, but the kernel state is
	 * interrupts disabled when this is invoked. Also tell RCU about it.
	 *
	 * 1) Trace interrupts on state
	 * 2) Invoke context tracking if enabled to adjust RCU state
	 * 3) Invoke architecture specific last minute exit code, e.g. speculation
	 *    mitigations, etc.: arch_exit_to_user_mode()
	 * 4) Tell lockdep that interrupts are enabled
	 *
	 * Invoked from architecture specific code when syscall_exit_to_user_mode()
	 * is not suitable as the last step before returning to userspace. Must be
	 * invoked with interrupts disabled and the caller must be
	 * non-instrumentable.
	 * The caller has to invoke syscall_exit_to_user_mode_work() before this.
	 */

Thanks,
Mark.
