From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: [PATCH v5 4/6] xen/arm: inflight irqs during migration
Date: Wed, 18 Jun 2014 12:04:03 +0100
Message-ID: <1403089443.24051.20.camel@kazak.uk.xensource.com>
References: <1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: julien.grall@citrix.com, xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Wed, 2014-06-11 at 17:27 +0100, Stefano Stabellini wrote:
> We need to take special care when migrating irqs that are already
> inflight from one vcpu to another. See "The effect of changes to an
> GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> Controller Architecture Specification.
>
> The main issue from the Xen point of view is that the lr_pending and
> inflight lists are per-vcpu. The lock we take to protect them is also
> per-vcpu.
>
> In order to avoid issues, we set a new flag GIC_IRQ_GUEST_MIGRATING, so
> that we can recognize when we receive an irq while the previous one is
> still inflight (given that we are only dealing with hardware interrupts
> here, it just means that its LR hasn't been cleared yet on the old vcpu).
>
> If GIC_IRQ_GUEST_MIGRATING is set, we only set GIC_IRQ_GUEST_QUEUED and
> interrupt the old vcpu. When clearing the LR on the old vcpu, we take
> special care of injecting the interrupt into the new vcpu. To do that we
> need to release the old vcpu lock before taking the new vcpu lock.
>
> Using barriers and test_bit on GIC_IRQ_GUEST_MIGRATING, make sure that
> vgic_migrate_irq can run at the same time as gic_update_one_lr on
> different cpus with consistent results.
>
> Signed-off-by: Stefano Stabellini
>
> ---
>
> Changes in v5:
> - pass unsigned long to find_next_bit for alignment on aarch64;
> - call vgic_get_target_vcpu instead of gic_add_to_lr_pending to add the
>   irq in the right inflight queue;
> - add barrier and bit tests to make sure that vgic_migrate_irq and
>   gic_update_one_lr can run simultaneously on different cpus without
>   issues;
> - rework the loop to identify the new vcpu when ITARGETSR is written;
> - use find_first_bit instead of find_next_bit.
> ---
>  xen/arch/arm/gic.c           | 25 ++++++++++++++--
>  xen/arch/arm/vgic.c          | 66 +++++++++++++++++++++++++++++++++++++-----
>  xen/include/asm-arm/domain.h |  4 +++
>  3 files changed, 85 insertions(+), 10 deletions(-)
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 5e502df..8ed242e 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -677,10 +677,31 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>              clear_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
>          p->lr = GIC_INVALID_LR;
>          if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
> -             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
> +             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
> +             !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
>              gic_raise_guest_irq(v, irq, p->priority);
> -        else
> +        else {
>              list_del_init(&p->inflight);
> +
> +            /* inflight is cleared before clearing
> +             * GIC_IRQ_GUEST_MIGRATING */
> +            dsb(sy);

Is sy really necessary? This is all about state stored in RAM, not the
actual GIC, right? I think "ish" is probably sufficient. Possibly even a
dmb of some sort.
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 4d655af..e640de9 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -543,6 +562,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>      int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
>      int gicd_reg = REG(offset);
>      uint32_t tr;
> +    unsigned long trl;
> +    int i;
>
>      switch ( gicd_reg )
>      {
> @@ -626,22 +647,45 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
>          if ( rank == NULL) goto write_ignore;
>          /* 8-bit vcpu mask for this domain */
> -        tr = (1 << v->domain->max_vcpus) - 1;
> -        tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> -        tr &= *r;
> +        trl = (1 << v->domain->max_vcpus) - 1;
> +        if ( dabt.size == 2 )
> +            trl = trl | (trl << 8) | (trl << 16) | (trl << 24);
> +        else
> +            trl = (trl << (8 * (offset & 0x3)));
> +        trl &= *r;

What does trl stand for? Why is it an unsigned long when r and tr are
uint32_t?

>          /* ignore zero writes */
> -        if ( !tr )
> +        if ( !trl )
>              goto write_ignore;
>          if ( dabt.size == 2 &&
> -            !((tr & 0xff) && (tr & (0xff << 8)) &&
> -             (tr & (0xff << 16)) && (tr & (0xff << 24))))
> +            !((trl & 0xff) && (trl & (0xff << 8)) &&
> +             (trl & (0xff << 16)) && (trl & (0xff << 24))))
>              goto write_ignore;

I'm still not sure what this is ;-)

>          vgic_lock_rank(v, rank);
> +        i = 0;
> +        while ( (i = find_next_bit((const unsigned long *)&trl, 32, i)) < 32 )

Unnecessary cast (if trl really should be an unsigned long).

> +        {
> +            unsigned int irq, target, old_target;
> +            unsigned long old_target_mask;
> +            struct vcpu *v_target, *v_old;
> +
> +            target = i % 8;
> +            old_target_mask = byte_read(rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)], 0, i/8);
> +            old_target = find_first_bit((const unsigned long *) &old_target_mask, 8);

Another unnecessary cast.
> +
> +            if ( target != old_target )

Given the implementation only supports a single target I'm wondering if
maybe we should just store the configured target. If we weren't required
to read back the actual value then we could even dispense with
rank->itargets completely and just fabricate it.

> +            {
> +                irq = offset + (i / 8);
> +                v_target = v->domain->vcpu[target];
> +                v_old = v->domain->vcpu[old_target];
> +                vgic_migrate_irq(v_old, v_target, irq);
> +            }
> +            i += 8 - target;

I think this whole loop would be less subtle as:

    for ( byte = 0 ; byte < 4 ; byte++ )
    {
        target = find_first_bit((char *)(&tr) + byte, 8);
        if ( target >= 8 )
            continue;
        old_target = ...
    }

The use of find_next_bit is not really semantically what is happening
here.

Ian.