From: Stefano Stabellini <sstabellini@kernel.org>
To: Julien Grall <julien.grall@arm.com>
Cc: xen-devel@lists.xenproject.org, nd@arm.com,
	Stefano Stabellini <sstabellini@kernel.org>
Subject: Re: [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr
Date: Thu, 16 Feb 2017 15:39:31 -0800 (PST)	[thread overview]
Message-ID: <alpine.DEB.2.10.1702161524080.9566@sstabellini-ThinkPad-X260> (raw)
In-Reply-To: <e5db92ff-febf-f458-978f-bbdf0b505c24@arm.com>

On Thu, 16 Feb 2017, Julien Grall wrote:
> On 16/02/2017 22:10, Stefano Stabellini wrote:
> > On Thu, 16 Feb 2017, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/02/17 02:05, Stefano Stabellini wrote:
> > > > Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
> > > > result in the wrong pcpu being set as irq target, see
> > > > http://marc.info/?l=xen-devel&m=148218667104072.
> > > > 
> > > > To solve the issue, add barriers: remove an irq from the inflight
> > > > queue only after the affinity has been set. On the other side, write the
> > > > new vcpu target before checking GIC_IRQ_GUEST_MIGRATING and inflight.
> > > > 
> > > > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > > > ---
> > > >  xen/arch/arm/gic.c     | 3 ++-
> > > >  xen/arch/arm/vgic-v2.c | 4 ++--
> > > >  xen/arch/arm/vgic-v3.c | 4 +++-
> > > >  3 files changed, 7 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > > > index a5348f2..bb52959 100644
> > > > --- a/xen/arch/arm/gic.c
> > > > +++ b/xen/arch/arm/gic.c
> > > > @@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> > > >               !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >              gic_raise_guest_irq(v, irq, p->priority);
> > > >          else {
> > > > -            list_del_init(&p->inflight);
> > > >              if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >              {
> > > >                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> > > >                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> > > >              }
> > > > +            smp_mb();
> > > > +            list_del_init(&p->inflight);
> > > 
> > > I don't understand why you remove from the inflight list afterwards. If
> > > you do that, you introduce the same problem as discussed in
> > > <7a78c859-fa6f-ba10-b574-d8edd46ea934@arm.com>
> > > 
> > > As long as the interrupt is routed to the pCPU running gic_update_one_lr,
> > > the interrupt cannot fire because interrupts are masked.
> > 
> > This is not accurate: it is possible to receive a second interrupt
> > notification while the first one is still active.
> 
> Can you give more details here? If you look at how gic_update_one_lr is called
> from gic_clear_lrs, interrupts are masked.
> 
> So it cannot be received by Xen while you are in gic_update_one_lr and before
> irq_set_affinity is called.

Yes, you are right, I meant generally. In this case, it is as you say.


> > > However, as soon as irq_set_affinity is called the interrupt may fire
> > > on the other pCPU.
> > 
> > This is true.
> > 
> > 
> > > However, list_del_init is not atomic and not protected by any lock. So
> > > vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.
> > > 
> > > Did I miss anything?
> > 
> > Moving list_del_init later ensures that there are no conflicts between
> > gic_update_one_lr and vgic_store_itargetsr (more specifically,
> > vgic_migrate_irq). If you look at the implementation of
> > vgic_migrate_irq, all checks depend on list_empty(&p->inflight). If we
> > don't move list_del_init later, given that vgic_migrate_irq can be
> > called with a different vgic lock taken than gic_update_one_lr, the
> > following scenario can happen:
> > 
> > 
> >   <GIC_IRQ_GUEST_MIGRATING is set>      <GIC_IRQ_GUEST_MIGRATING is set>
> >   CPU0: gic_update_one_lr               CPU1: vgic_store_itargetsr
> >   ----------------------------------------------------------------------
> >   remove from inflight
> >   clear GIC_IRQ_GUEST_MIGRATING
> >   read rank->vcpu (intermediate)
> 
> This is only true if vgic_store_itargetsr tests GIC_IRQ_GUEST_MIGRATING
> here and finds it clear.

Right, that's the scenario, see the right column. If you meant to say
something else, I couldn't understand, sorry.

I have been playing with rearranging these few lines of code in
gic_update_one_lr and vgic_store_itargetsr/vgic_migrate_irq quite a bit,
but I couldn't figure out a way to solve all races. This patch is one of
the best outcomes I found. If you can figure out a way to rearrange this
code to not be racy, but still lockless, let me know!


> >                                         set rank->vcpu (final)
> >                                         vgic_migrate_irq
> >                                           if (!inflight) irq_set_affinity (final)
> >   irq_set_affinity (intermediate)
> > 
> > 
> > As a result, the irq affinity is set to the wrong cpu. With this patch,
> > this problem doesn't occur.
> > 
> > However, you are right that both in the case of gic_update_one_lr and
> > vgic_migrate_irq, as well as the case of gic_update_one_lr and
> > vgic_vcpu_inject_irq that you mentioned, list_del_init (from
> > gic_update_one_lr) is potentially run as the same time as list_empty
> > (from vgic_migrate_irq or from vgic_vcpu_inject_irq), and they are not
> > atomic.
> > 
> > Also see this other potential issue:
> > http://marc.info/?l=xen-devel&m=148703220714075
> > 
> > All these concurrent accesses are difficult to understand and to deal
> > with. This is why my original suggestion was to use the old vcpu vgic
> > lock, rather than try to ensure safe concurrent accesses everywhere.
> > That option is still open and would solve both problems.
> > We only need to:
> > 
> > - store the vcpu to which an irq is currently injected
> > http://marc.info/?l=xen-devel&m=148237295020488
> > - check the new irq->vcpu field, and take the right vgic lock
> > something like http://marc.info/?l=xen-devel&m=148237295920492&w=2, but
> > would need improvements
> > 
> > Much simpler, right?
> 
> Wouldn't it be easier to just take desc->lock to protect the concurrent
> access?

One more lock is almost never easier :) Things are going to get more
entangled. Also, given your accurate observation that
vgic_vcpu_inject_irq wouldn't be a problem if we place
list_del_init(&p->inflight) before irq_set_affinity, I think the problem
is rather limited and could be solved easily with the locks we already have.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Thread overview: 9+ messages
2017-02-11  2:05 [PATCH v4 1/2] arm: read/write rank->vcpu atomically Stefano Stabellini
2017-02-11  2:05 ` [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr Stefano Stabellini
2017-02-14  0:29   ` Stefano Stabellini
2017-02-16 19:36   ` Julien Grall
2017-02-16 22:10     ` Stefano Stabellini
2017-02-16 23:12       ` Julien Grall
2017-02-16 23:39         ` Stefano Stabellini [this message]
2017-02-16 18:55 ` [PATCH v4 1/2] arm: read/write rank->vcpu atomically Julien Grall
2017-02-16 19:59   ` Stefano Stabellini
