All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoffer Dall <cdall@linaro.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: marc.zyngier@arm.com, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org
Subject: Re: [PATCH v2 4/9] KVM: arm/arm64: replace vcpu->arch.pause with a vcpu request
Date: Thu, 6 Apr 2017 16:14:43 +0200	[thread overview]
Message-ID: <20170406141443.GC27123@cbox> (raw)
In-Reply-To: <91ab3b0b-c30e-7a19-f60f-882d26388ef5@redhat.com>

On Wed, Apr 05, 2017 at 01:37:10PM +0200, Paolo Bonzini wrote:
> 
> 
> On 05/04/2017 09:09, Christoffer Dall wrote:
> >>>  - In the explanation you wrote, you use the term 'we' a lot, but when
> >>>    talking about SMP barriers, I think it only makes sense to talk about
> >>>    actions and observations between multiple CPUs and we have to be
> >>>    specific about which CPU observes or does what with respect to the
> >>>    other.  Maybe I'm being a stickler here, but there something here
> >>>    which is making me uneasy.
> >> The write1-mb-if(read2) / write2-mb-if(read1) pattern is pretty common,
> >> so I think it is justified to cut the ordering on the reasoning and just
> >> focus on what the two memory locations and conditions mean.
> > ok, but the pattern above was not common to me (and I'm pretty sure I'm
> > not the only fool in the bunch here), so if we can reference something
> > that explains that this is a known pattern which has been tried and
> > proven, that would be even better.
> 
> I found https://lwn.net/Articles/573436/ which shows this example:
> 
>   CPU 0					CPU 1
>   ---------------------			----------------------
>   WRITE_ONCE(x, 1);			WRITE_ONCE(y, 1);
>   smp_mb();				smp_mb();
>   r2 = READ_ONCE(y);			r4 = READ_ONCE(x);
> 
> And says that it is a bug if r2 == 0 && r4 == 0.  This is exactly what 
> happens in KVM:
> 
>   CPU 0					CPU 1
>   ---------------------			----------------------
>   vcpu->mode = IN_GUEST_MODE;		kvm_make_request(REQ, vcpu);
>   smp_mb();				smp_mb();
>   r2 = kvm_request_pending(vcpu)	r4 = (vcpu->mode == IN_GUEST_MODE)
>   if (r2)				if (r4)
> 	abort entry				kick();
> 
> If r2 sees no request and r4 doesn't kick there would be a bug.
> But why can't this happen?
> 
> - if no request is pending at the time of the read to r2, CPU 1 must
> not have executed kvm_make_request yet.  In CPU 0, kvm_request_pending
> must happen after vcpu->mode is set to IN_GUEST_MODE, therefore CPU 1
> will read IN_GUEST_MODE and kick.
> 
> - if no kick happens in CPU 1, CPU 0 must not have set vcpu->mode yet.
> In CPU 1, vcpu->mode is read after setting the request bit, therefore
> CPU 0 will see the request bit and abort the guest entry.
> 
> >>>  - Finally, it feels very hard to prove the correctness of this, and
> >>>    equally hard to test it (given how long we've been running with
> >>>    apparently racy code).  I would hope that we could abstract some of
> >>>    this into architecture generic things, that someone who eat memory
> >>>    barriers for breakfast could help us verify, but again, maybe this is
> >>>    Radim's series I'm asking for here.
> >>
> >> What I can do here is to suggest copying the paradigms from x86, which
> >> is quite battle tested (Windows hammers it really hard).
> >
> > That sounds reasonable, but I think part of the problem was that we
> > simply didn't understand what the paradigms were (see the
> > kvm_make_all_cpus_request above as an example), so Drew's action about
> > documenting what this all is and the constraints of using it is really
> > important for me to do that.
> 
> Yes, totally agreed on that.
> 
> >> For QEMU I did use model checking in the past for some similarly hairy
> >> synchronization code, but that is really just "executable documentation"
> >> because the model is not written in C.
> >>
> > I played with using blast on some of the KVM/ARM code a long time ago,
> > and while I was able to find a bug with it, it was sort of an obvious
> > bug, and the things I was able to do with it was pretty limited to the
> > problems I could imagine myself anyhow.  Perhaps this is what you mean
> > with executable documentation.
> 
> I prepared three examples of a spin model for KVM vCPU kicking, and
> the outcome was actually pretty surprising: the mode check seems not
> to be necessary.
> 
> I haven't covered all x86 cases so I'm not going to remove it right
> ahead, but for ARM it really seems like EXITING_GUEST_MODE is nothing
> but an optimization of consecutive kvm_vcpu_kicks.
> 
> All three models can use C preprocessor #defines to inject bugs:
> 
> - kvm-arm-pause.promela: the "paused" mechanism; the model proves that
>   the "paused" test in the interrupt-disabled region is necessary
> 
> - kvm-req.promela: the requests mechanism; the model proves that
>   the requests check in the interrupt-disabled region is necessary
> 
> - kvm-x86-pi.promela: the x86 posted interrupt mechanism (simplified
>   a bit); the model proves that KVM must disable interrupts before
>   checking for interrupts injected while outside guest mode
>   (commit b95234c84004, "kvm: x86: do not use KVM_REQ_EVENT for APICv
>   interrupt injection", 2017-02-15)
> 
> So it seems like there are no races after all in KVM/ARM code

No races after Drew's fix has been applied to set vcpu->mode =
IN_GUEST_MODE, before checking the pause flag, correct?  (I think that's
what the spin model below is modeling).

(Currently, we have a window between checking the pause flag for the
last time and etting mode = IN_GUEST_MODE, where we would loose IPIs and
not check any variables.)


> , though
> the code can still be cleaned up.  And I have been convinced of the wrong
> thing all the time. :)
> 
> But why is KVM/ARM using KVM_REQ_VCPU_EXIT
> just fine without checking for requests (kvm-req.promela)?  Because,
> as mentioned earlier in the thread, KVM/ARM is using kvm_make_all_vcpus_request
> simply to kick all VCPUs.  The paused variable _is_ checked after disabling
> interrupts, and that is fine.
> 
> After this experiment, I think I like Drew's KVM_REQ_PAUSE more than I did
> yesterday.  However, yet another alternative is to leave pause/power_off as
> they are, while taking some inspiration from his patch to do some cleanups:
> 
> 1) change the "if"
> 
>                 if (ret <= 0 || need_new_vmid_gen(vcpu->kvm) ||
>                         vcpu->arch.power_off || vcpu->arch.pause) {
> 
> to test kvm_requests_pending instead of pause/power_off
> 
> 2) clear KVM_REQ_VCPU_EXIT before the other "if":
> 
>                 if (vcpu->arch.power_off || vcpu->arch.pause)
>                         vcpu_sleep(vcpu);
> 

I like using requests as only requests from one thread to the VCPU
thread, and not to maintain specific state about a VCPU.

The benefit of Drew's approach is that since these pieces of state are
boolean, you can have just a single check in the critical path in the
run loop instead of having to access multiple fields.

I think I'll let Drew decide at this point what he prefers.

> 
> In any case, the no-wakeup behavior of kvm_make_all_vcpus_request suits
> either use of requests (KVM_REQ_PAUSE or "fixed" KVM_REQ_VCPU_EXIT).
> 

Agreed.

Thanks,
-Christoffer

> /* To run the model checker:
>  *
>  *      spin -a kvm-arm-pause.promela
>  *      gcc -O2 pan.c
>  *      ./a.out -a -f
>  *
>  * Remove the tests using -DREMOVE_MODE_TEST, -DREMOVE_PAUSED_TEST
>  * right after -a.  The mode test is not necessary, the paused test is.
>  */
> #define OUTSIDE_GUEST_MODE      0
> #define IN_GUEST_MODE           1
> #define EXITING_GUEST_MODE      2
> 
> bool kick;
> bool paused;
> int vcpu_mode = OUTSIDE_GUEST_MODE;
> 
> active proctype vcpu_run()
> {
>     do
>         :: true -> {
>             /* In paused state, sleep with interrupts on */
>             if
>                 :: !paused -> skip;
>             fi;
> 
>             /* IPIs are eaten until interrupts are turned off.  */
>             kick = 0;
> 
>             /* Interrupts are now off. */
>             vcpu_mode = IN_GUEST_MODE;
> 
>             if
> #ifndef REMOVE_MODE_TEST
>                 :: vcpu_mode != IN_GUEST_MODE -> skip;
> #endif
> #ifndef REMOVE_PAUSED_TEST
>                 :: paused -> skip;
> #endif
>                 :: else -> {
>                     do
>                         /* Stay in guest mode until an IPI comes */
>                         :: kick -> break;
>                     od;
>                 }
>             fi;
>             vcpu_mode = OUTSIDE_GUEST_MODE;
> 
>             /* Turn on interrupts */
>         }
>     od
> }
> 
> active proctype vcpu_kick()
> {
>     int old;
> 
>     do
>         :: true -> {
>             paused = 1;
>             /* cmpxchg */
>             atomic {
>                 old = vcpu_mode;
>                 if
>                     :: vcpu_mode == IN_GUEST_MODE ->
>                         vcpu_mode = EXITING_GUEST_MODE;
>                     :: else -> skip;
>                 fi;
>             }
> 
>             if
>                 :: old == IN_GUEST_MODE -> kick = 1;
>                 :: else -> skip;
>             fi;
> 
>             if
>                :: vcpu_mode == OUTSIDE_GUEST_MODE -> paused = 0;
>             fi; 
>         }
>     od;
> }
> 
> never {
>     do
>        /* After an arbitrarily long prefix */
>        :: 1 -> skip;
> 
>        /* if we get a pause request */
>        :: paused -> break;
>     od; 
> 
> accept:
>     /* we must eventually leave guest mode (this condition is reversed!) */
>     do
>        :: vcpu_mode != OUTSIDE_GUEST_MODE
>     od; 
> }

> /* To run the model checker:
>  *
>  *      spin -a kvm-req.promela
>  *      gcc -O2 pan.c
>  *      ./a.out -a -f
>  *
>  * Remove the tests using -DREMOVE_MODE_TEST, -DREMOVE_REQ_TEST
>  * right after -a.  The mode test is not necessary, the vcpu_req test is.
>  */
> #define OUTSIDE_GUEST_MODE      0
> #define IN_GUEST_MODE           1
> #define EXITING_GUEST_MODE      2
> 
> bool kick;
> int vcpu_req;
> int vcpu_mode = OUTSIDE_GUEST_MODE;
> 
> active proctype vcpu_run()
> {
>     do
>         :: true -> {
>             /* Requests are processed with interrupts on */
>             vcpu_req = 0;
> 
>             /* IPIs are eaten until interrupts are turned off.  */
>             kick = 0;
> 
>             /* Interrupts are now off. */
>             vcpu_mode = IN_GUEST_MODE;
> 
>             if
> #ifndef REMOVE_MODE_TEST
>                 :: vcpu_mode != IN_GUEST_MODE -> skip;
> #endif
> #ifndef REMOVE_REQ_TEST
>                 :: vcpu_req -> skip;
> #endif
>                 :: else -> {
>                     do
>                         /* Stay in guest mode until an IPI comes */
>                         :: kick -> break;
>                     od;
>                 }
>             fi;
>             vcpu_mode = OUTSIDE_GUEST_MODE;
> 
>             /* Turn on interrupts */
>         }
>     od
> }
> 
> active proctype vcpu_kick()
> {
>     int old;
> 
>     do
>         :: true -> {
>             vcpu_req = 1;
>             if
>                 :: old == 0 -> {
>                     /* cmpxchg */
>                     atomic {
>                         old = vcpu_mode;
>                         if
>                             :: vcpu_mode == IN_GUEST_MODE ->
>                                 vcpu_mode = EXITING_GUEST_MODE;
>                             :: else -> skip;
>                         fi;
>                     }
> 
>                     if
>                         :: old == IN_GUEST_MODE -> kick = 1;
>                         :: else -> skip;
>                     fi;
>                 }
>                 :: else -> skip;
>             fi;
>         }
>     od;
> }
> 
> never {
>     do
>        /* After an arbitrarily long prefix */
>        :: 1 -> skip;
> 
>        /* we get in guest mode */
>        :: vcpu_mode == IN_GUEST_MODE -> break;
>     od; 
> 
> accept:
>     /* and never leave it (this condition is reversed!) */
>     do
>        :: vcpu_mode != OUTSIDE_GUEST_MODE
>     od; 
> }

> /* To run the model checker:
>  *
>  *      spin -a kvm-x86-pi.promela
>  *      gcc -O2 pan.c
>  *      ./a.out -a -f
>  *
>  * Remove the test using -DREMOVE_MODE_TEST, move the PIR->IRR sync
>  * before local_irq_disable() with SYNC_WITH_INTERRUPTS_ENABLED.  The
>  * mode test is not necessary, while sync_pir_to_irr must be placed
>  * after interrupts are disabled.
>  */
> #define OUTSIDE_GUEST_MODE      0
> #define IN_GUEST_MODE           1
> #define EXITING_GUEST_MODE      2
> 
> bool kick;
> bool posted_interrupt;
> int vcpu_pir;
> int vcpu_mode = OUTSIDE_GUEST_MODE;
> 
> active proctype vcpu_run()
> {
>     do
>         :: true -> {
> #ifdef SYNC_WITH_INTERRUPTS_ENABLED
>             /* Guest interrupts are injected with interrupts off */
>             vcpu_pir = 0;
> #endif
> 
>             /* Both kinds of IPI are eaten until interrupts are turned off.  */
>             atomic {
>                 kick = 0;
>                 posted_interrupt = 0;
>             }
> 
>             /* Interrupts are now off. */
>             vcpu_mode = IN_GUEST_MODE;
> 
> #ifndef SYNC_WITH_INTERRUPTS_ENABLED
>             /* Guest interrupts are injected with interrupts off */
>             vcpu_pir = 0;
> #endif
> 
>             if
> #ifndef REMOVE_MODE_TEST
>                 :: vcpu_mode != IN_GUEST_MODE -> skip;
> #endif
> 
>                 :: else -> {
>                     do
>                         /* Stay in guest mode until an IPI comes */
>                         :: kick -> break;
> 
>                         /* The processor handles the posted interrupt IPI */
>                         :: posted_interrupt -> vcpu_pir = 0;
>                     od;
>                 }
>             fi;
>             vcpu_mode = OUTSIDE_GUEST_MODE;
> 
>             /* Turn on interrupts */
>         }
>     od
> }
> 
> active proctype vcpu_posted_interrupt()
> {
>     int old;
> 
>     do
>         :: vcpu_pir == 0 -> {
>             vcpu_pir = 1;
>             if
>                 :: vcpu_mode == IN_GUEST_MODE ->
>                     /* If in guest mode, we can send a posted interrupt IPI */
>                     posted_interrupt = 1;
> 
>                 :: else -> {
>                     /* Else, do a kvm_vcpu_kick.  */
>                     atomic {
>                         old = vcpu_mode;
>                         if
>                             :: vcpu_mode == IN_GUEST_MODE ->
>                                 vcpu_mode = EXITING_GUEST_MODE;
>                             :: else -> skip;
>                         fi;
>                     }
> 
>                     if
>                         :: old == IN_GUEST_MODE -> kick = 1;
>                         :: else -> skip;
>                     fi;
>                 }
>             fi;
>         }
>     od;
> }
> 
> never {
>     do
>        /* After an arbitrarily long prefix */
>        :: 1 -> skip;
> 
>        /* if we get an interrupt */
>        :: vcpu_pir -> break;
>     od; 
> 
> accept:
>     /* we must eventually inject it (this condition is reversed!) */
>     do
>        :: vcpu_pir
>     od; 
> }

  reply	other threads:[~2017-04-06 14:14 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 16:06 [PATCH v2 0/9] KVM: arm/arm64: race fixes and vcpu requests Andrew Jones
2017-03-31 16:06 ` [PATCH v2 1/9] KVM: add kvm_request_pending Andrew Jones
2017-04-04 15:30   ` Christoffer Dall
2017-04-04 16:41     ` Andrew Jones
2017-04-05 13:10       ` Radim Krčmář
2017-04-05 17:39         ` Christoffer Dall
2017-04-05 18:30           ` Paolo Bonzini
2017-04-05 20:20           ` Radim Krčmář
2017-04-06 12:02             ` Andrew Jones
2017-04-06 14:37               ` Christoffer Dall
2017-04-06 15:08                 ` Andrew Jones
2017-04-07 15:33                   ` Paolo Bonzini
2017-04-08 18:19                     ` Christoffer Dall
2017-04-06 14:25             ` Christoffer Dall
2017-04-07 13:15               ` Radim Krčmář
2017-04-08 18:23                 ` Christoffer Dall
2017-04-08 19:32                   ` Paolo Bonzini
2017-04-11 21:06                     ` Radim Krčmář
2017-03-31 16:06 ` [PATCH v2 2/9] KVM: Add documentation for VCPU requests Andrew Jones
2017-04-04 15:24   ` Christoffer Dall
2017-04-04 17:06     ` Andrew Jones
2017-04-04 17:23       ` Christoffer Dall
2017-04-04 17:36         ` Paolo Bonzini
2017-04-05 14:11         ` Radim Krčmář
2017-04-05 17:45           ` Christoffer Dall
2017-04-05 18:29             ` Paolo Bonzini
2017-04-05 20:46               ` Radim Krčmář
2017-04-06 14:29                 ` Christoffer Dall
2017-04-07 11:44                   ` Paolo Bonzini
2017-04-06 14:27               ` Christoffer Dall
2017-04-06 10:18   ` Christian Borntraeger
2017-04-06 12:08     ` Andrew Jones
2017-04-06 12:29     ` Radim Krčmář
2017-03-31 16:06 ` [PATCH v2 3/9] KVM: arm/arm64: prepare to use vcpu requests Andrew Jones
2017-04-04 15:34   ` Christoffer Dall
2017-04-04 17:06     ` Andrew Jones
2017-03-31 16:06 ` [PATCH v2 4/9] KVM: arm/arm64: replace vcpu->arch.pause with a vcpu request Andrew Jones
2017-04-04 13:39   ` Marc Zyngier
2017-04-04 14:47     ` Andrew Jones
2017-04-04 14:51       ` Paolo Bonzini
2017-04-04 15:05         ` Marc Zyngier
2017-04-04 17:07         ` Andrew Jones
2017-04-04 16:04   ` Christoffer Dall
2017-04-04 16:24     ` Paolo Bonzini
2017-04-04 17:19       ` Christoffer Dall
2017-04-04 17:35         ` Paolo Bonzini
2017-04-04 17:57           ` Christoffer Dall
2017-04-04 18:15             ` Paolo Bonzini
2017-04-04 18:38               ` Christoffer Dall
2017-04-04 18:18           ` Andrew Jones
2017-04-04 18:59             ` Paolo Bonzini
2017-04-04 17:57     ` Andrew Jones
2017-04-04 19:04       ` Christoffer Dall
2017-04-04 20:10         ` Paolo Bonzini
2017-04-05  7:09           ` Christoffer Dall
2017-04-05 11:37             ` Paolo Bonzini
2017-04-06 14:14               ` Christoffer Dall [this message]
2017-04-07 11:47                 ` Paolo Bonzini
2017-04-08  8:35                   ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 5/9] KVM: arm/arm64: replace vcpu->arch.power_off " Andrew Jones
2017-04-04 17:37   ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 6/9] KVM: arm/arm64: use a vcpu request on irq injection Andrew Jones
2017-04-04 17:42   ` Christoffer Dall
2017-04-04 18:27     ` Andrew Jones
2017-04-04 18:59     ` Paolo Bonzini
2017-04-04 18:51   ` Paolo Bonzini
2017-03-31 16:06 ` [PATCH v2 7/9] KVM: arm/arm64: PMU: remove request-less vcpu kick Andrew Jones
2017-04-04 17:46   ` Christoffer Dall
2017-04-04 18:29     ` Andrew Jones
2017-04-04 19:35       ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on Andrew Jones
2017-04-04 19:42   ` Christoffer Dall
2017-04-05  8:35     ` Andrew Jones
2017-04-05  8:50       ` Christoffer Dall
2017-04-05  9:12         ` Andrew Jones
2017-04-05  9:30           ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 9/9] KVM: arm/arm64: avoid race by caching MPIDR Andrew Jones
2017-04-04 19:44   ` Christoffer Dall
2017-04-05  8:50     ` Andrew Jones
2017-04-05 11:03       ` Christoffer Dall
2017-04-05 11:14         ` Andrew Jones
2017-04-03 15:28 ` [PATCH v2 0/9] KVM: arm/arm64: race fixes and vcpu requests Christoffer Dall
2017-04-03 17:11   ` Paolo Bonzini
2017-04-04  7:27   ` Andrew Jones
2017-04-04 16:05     ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170406141443.GC27123@cbox \
    --to=cdall@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=marc.zyngier@arm.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.