From mboxrd@z Thu Jan  1 00:00:00 1970
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: [PATCH v8 03/10] xen/arm: inflight irqs during
	migration
Date: Wed, 23 Jul 2014 15:45:51 +0100
Message-ID: <alpine.DEB.2.02.1407231513540.2293@kaball.uk.xensource.com>
References: <alpine.DEB.2.02.1407101908280.29039@kaball.uk.xensource.com>
	<1405016003-19131-3-git-send-email-stefano.stabellini@eu.citrix.com>
	<1405601051.31127.9.camel@kazak.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <1405601051.31127.9.camel@kazak.uk.xensource.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: julien.grall@citrix.com, xen-devel@lists.xensource.com, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
List-Id: xen-devel@lists.xenproject.org

On Thu, 17 Jul 2014, Ian Campbell wrote:
> On Thu, 2014-07-10 at 19:13 +0100, Stefano Stabellini wrote:
> > We need to take special care when migrating irqs that are already
> > inflight from one vcpu to another. See "The effect of changes to an
> > GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> > Controller Architecture Specification.
> > 
> > The main issue from the Xen point of view is that the lr_pending and
> > inflight lists are per-vcpu. The lock we take to protect them is also
> > per-vcpu.
> > 
> > In order to avoid issues, if the irq is still lr_pending, we can
> > immediately move it to the new vcpu for injection.
> > 
> > Otherwise if it is in a GICH_LR register, set a new flag
> > GIC_IRQ_GUEST_MIGRATING, so that we can recognize when we receive an irq
> > while the previous one is still inflight (given that we are only dealing
> > with hardware interrupts here, it just means that its LR hasn't been
> > cleared yet on the old vcpu).  If GIC_IRQ_GUEST_MIGRATING is set, we
> > only set GIC_IRQ_GUEST_QUEUED and interrupt the old vcpu. To know which
> > one is the old vcpu, we introduce a new field to pending_irq, called
> > vcpu_migrate_from.
> > When clearing the LR on the old vcpu, we take special care of injecting
> > the interrupt into the new vcpu. To do that we need to release the old
> > vcpu lock before taking the new vcpu lock.
> 
> I still think this is an awful lot of complexity and scaffolding for
> something which is rare on the scale of things and which could be almost
> trivially handled by requesting a maintenance interrupt for one EOI and
> completing the move at that point.

Requesting a maintenance interrupt is not as simple as it looks:
- at the moment we have no way to edit a live GICH_LR register; we would
have to add a function for that;
- if we request a maintenance interrupt then we also need to EOI the
physical IRQ, which is something that we don't do anymore (except for
PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI, but that is another matter). We would
need to track which physical irqs need to be EOI'ed by Xen and which
don't.
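
To make the second point concrete, here is a rough standalone sketch
(not Xen code). It assumes the GICv2 GICH_LR layout, where bit 19 only
acts as the "maintenance interrupt on EOI" bit when HW=0 and is part of
the physical interrupt ID when HW=1. The macro and helper names are made
up for illustration, and the priority field is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed GICv2 GICH_LR fields; names are illustrative, not Xen's. */
#define LR_VIRTUAL_MASK    0x3ffu       /* [9:0]   virtual interrupt ID       */
#define LR_PHYSICAL_SHIFT  10           /* [19:10] physical ID, only if HW=1  */
#define LR_PHYSICAL_MASK   0x3ffu
#define LR_EOI_MAINT       (1u << 19)   /* [19]    maint. irq on EOI, HW=0    */
#define LR_STATE_PENDING   (1u << 28)   /* [29:28] state, 01 = pending        */
#define LR_HW              (1u << 31)   /* [31]    hardware interrupt         */

/* Hypothetical helper: LR value for a pending hardware interrupt. */
static uint32_t lr_make_hw(uint32_t virq, uint32_t pirq)
{
    return LR_HW | LR_STATE_PENDING |
           ((pirq & LR_PHYSICAL_MASK) << LR_PHYSICAL_SHIFT) |
           (virq & LR_VIRTUAL_MASK);
}

/*
 * Hypothetical helper: rewrite an LR value so that the guest EOI raises
 * a maintenance interrupt. The HW bit and the physical ID have to go,
 * so *must_eoi reports whether the hypervisor now has to EOI the
 * physical interrupt itself.
 */
static uint32_t lr_request_maintenance(uint32_t lr, bool *must_eoi)
{
    assert(must_eoi != NULL);
    *must_eoi = (lr & LR_HW) != 0;
    lr &= ~(LR_HW | (LR_PHYSICAL_MASK << LR_PHYSICAL_SHIFT));
    return lr | LR_EOI_MAINT;
}
```

With this encoding a hardware interrupt cannot keep HW=1 and also ask
for a maintenance interrupt, which is why the physical EOI would fall
back on Xen.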

Also requesting a maintenance interrupt would only guarantee that the
vcpu is interrupted as soon as possible, but it won't save us from
having to introduce GIC_IRQ_GUEST_MIGRATING. It would only let us skip
adding vcpu_migrate_from and the 5 lines of code in
vgic_vcpu_inject_irq.

Overall I thought that this approach would be easier.


> In order to avoid a simple maint interrupt you are adding code to the
> normal interrupt path and a potential SGI back to another processor (and
> I hope I'm misreading this but it looks like an SGI back again to finish
> off?). That's got to be way more costly to the first interrupt on the
> new VCPU than the cost of a maintenance IRQ on the old one.
> 
> I think avoiding maintenance interrupts in general is a worthy goal, but
> there are times when they are the most appropriate mechanism.

To be clear, the case we are talking about is when the guest kernel wants
to migrate an interrupt that is currently inflight in a GICH_LR register.

Requesting a maintenance interrupt for it would only make sure that the
old vcpu is interrupted soon after the EOI. Without it, we need to
identify which one is the old vcpu (in case of 2 consecutive migrations),
which is what I introduced vcpu_migrate_from for, and kick it when we
receive the second interrupt while the first is still inflight. That is
exactly and only the few lines of code you quoted below.

It is one more SGI in the uncommon case when we receive a second
physical interrupt before the old vcpu has been interrupted.  In the
vast majority of cases the old vcpu has already been interrupted by
something else or by the second irq itself (we haven't changed affinity
yet) and there is no need for the additional SGI.
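
The two cases can be modelled in a few lines of standalone C. This is
only a sketch of the flow described above, not the patch itself: the
structs, the function names and the counters are made up, plain bools
stand in for the GIC_IRQ_GUEST_QUEUED and GIC_IRQ_GUEST_MIGRATING bits,
and locking and memory barriers are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vcpu_model {
    int kicks;     /* SGIs sent to make this vcpu enter the hypervisor */
    int injected;  /* interrupts queued for injection on this vcpu     */
};

struct irq_model {
    bool queued;                      /* models GIC_IRQ_GUEST_QUEUED    */
    bool migrating;                   /* models GIC_IRQ_GUEST_MIGRATING */
    struct vcpu_model *migrate_from;  /* models vcpu_migrate_from       */
};

/* The guest retargets the irq while it still sits in an LR on old. */
static void model_migrate(struct irq_model *irq, struct vcpu_model *old)
{
    irq->migrate_from = old;
    irq->migrating = true;
}

/* A new physical interrupt arrives for the (new) target vcpu. */
static void model_inject(struct irq_model *irq, struct vcpu_model *target)
{
    irq->queued = true;
    if (irq->migrating) {
        /* The previous instance still occupies an LR on the old vcpu:
         * leave the irq queued and kick the old vcpu. */
        assert(irq->migrate_from != NULL);
        irq->migrate_from->kicks++;
        return;
    }
    target->injected++;
    irq->queued = false;
}

/* The old vcpu enters the hypervisor and clears the LR. */
static void model_clear_lr(struct irq_model *irq, struct vcpu_model *new_target)
{
    if (!irq->migrating)
        return;
    irq->migrating = false;
    irq->migrate_from = NULL;
    if (irq->queued) {
        /* An interrupt arrived in the meantime: hand it to the new vcpu. */
        new_target->injected++;
        irq->queued = false;
    }
}
```

Calling model_migrate(), model_inject() and then model_clear_lr() is the
uncommon ordering that costs one kick. If model_clear_lr() runs before
model_inject(), which is the common ordering, no kick is sent and the
interrupt goes straight to the new vcpu.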


> > @@ -344,6 +385,21 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
> >      }
> >  
> >      set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> > +    vcpu_migrate_from = n->vcpu_migrate_from;
> > +    /* update QUEUED before MIGRATING */
> > +    smp_wmb();
> > +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> > +    {
> > +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> > +
> > +        /* The old vcpu must have EOIed the SGI but not cleared the LR.
> > +         * Give it a kick. */
> 
> You mean SPI I think.

Yes, you are right.