From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: Julien Grall <julien.grall@linaro.org>
Cc: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"jaeyong.yoo@samsung.com" <jaeyong.yoo@samsung.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Bug report and patch about IRQ freezing after gic_restore_state
Date: Tue, 21 May 2013 13:00:58 +0100
Message-ID: <alpine.DEB.2.02.1305211249381.4799@kaball.uk.xensource.com>
In-Reply-To: <519A20FE.3030307@linaro.org>

On Mon, 20 May 2013, Julien Grall wrote:
> On 05/20/2013 01:41 AM, Jaeyong Yoo wrote:
> 
> Hello,
> 
> > I'm running Xen on an Arndale board, and if I run both iperf and du in Dom0,
> > one of the IRQs (either SATA or network) suddenly stops occurring.
> > After some investigation, I found that during a context switch in Xen,
> > IRQs in an LR (about to be delivered to a domain) can be lost and never occur again.
> > This is the call sequence in which the problem occurs:
> > (in context switching)
> >   - schedule_tail 
> >       - ctxt_switch_from 
> >       - local_irq_enable 
> >       - // after this part, some IRQ can occur and could be directly written to LR 
> >       - ctxt_switch_to 
> >           - ... (some more functions) 
> >           - // before the above IRQ is delivered to Dom (and maintenance IRQ not called),
> >             // gic_restore_state can be called 
> >           - gic_restore_state /* when restoring gic state, the above IRQ
> >                                        * (written to LR) is overwritten
> >                                        * with the previously saved values, and
> >                                        * the corresponding IRQ never occurs again */
> > 
> > I made the following patch (i.e., enable local IRQs after gic_restore_state)
> > to prevent the above problem.
> 
> Thanks for the patch. I have been looking into a similar error on the
> Arndale board for a couple of days.

Indeed, thanks for the analysis of the bug and the patch!

It is a particularly difficult bug to track down because it can only
happen if an irq arrives after ctxt_switch_from and before
ctxt_switch_to, and the irq is for the next vcpu to be scheduled on the
pcpu (otherwise the v == current check at the beginning of
gic_set_guest_irq would catch that).
Rather than extending the check in gic_set_guest_irq, I think it is wise
to run ctxt_switch_to with interrupts disabled.
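
To make the window concrete, here is a minimal user-space model of it. It
is not Xen code: the structs, the helper names and the single-slot
"pending" latch are all hypothetical simplifications, and the model skips
the v == current check entirely. It only shows why an LR written between
the early local_irq_enable and gic_restore_state is lost, and why keeping
interrupts masked until after the restore avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_LRS 4

/* Hypothetical stand-ins for Xen state; none of these are real Xen symbols. */
struct model_vcpu {
    uint32_t saved_lr[NR_LRS];  /* LR contents saved at the last switch-out */
};

struct model_pcpu {
    uint32_t hw_lr[NR_LRS];     /* models the physical list registers */
    bool irqs_enabled;          /* true when physical IRQs are unmasked */
    uint32_t pending_irq;       /* IRQ latched while masked; 0 means none */
};

/* Models the physical IRQ handler writing a guest IRQ straight into LR0. */
static void inject(struct model_pcpu *c, uint32_t irq)
{
    c->hw_lr[0] = irq;
}

/* A device raises an IRQ: taken at once if unmasked, otherwise latched. */
static void raise_irq(struct model_pcpu *c, uint32_t irq)
{
    assert(irq != 0);           /* 0 is reserved for "nothing pending" */
    if (c->irqs_enabled)
        inject(c, irq);
    else
        c->pending_irq = irq;
}

/* Models local_irq_enable: unmask and take any latched IRQ. */
static void irq_enable(struct model_pcpu *c)
{
    c->irqs_enabled = true;
    if (c->pending_irq) {
        inject(c, c->pending_irq);
        c->pending_irq = 0;
    }
}

/* Models gic_restore_state: blindly overwrite the LRs with the saved copy. */
static void restore_state(struct model_pcpu *c, const struct model_vcpu *n)
{
    memcpy(c->hw_lr, n->saved_lr, sizeof(c->hw_lr));
}

/*
 * Runs one modelled context switch during which an IRQ for the next vcpu
 * arrives inside the window, and returns what LR0 holds afterwards.
 */
uint32_t switch_with_irq_in_window(bool enable_before_restore, uint32_t irq)
{
    struct model_pcpu cpu = { .irqs_enabled = false };
    struct model_vcpu next = { .saved_lr = {0} };

    /* ctxt_switch_from(prev) would run here. */
    if (enable_before_restore)
        irq_enable(&cpu);       /* old ordering: unmask before the restore */

    raise_irq(&cpu, irq);       /* the IRQ arrives in the window */
    restore_state(&cpu, &next); /* overwrites whatever is in the LRs */

    if (!enable_before_restore)
        irq_enable(&cpu);       /* proposed ordering: unmask after restore */

    return cpu.hw_lr[0];
}
```

In this model, switch_with_irq_in_window(true, 42) returns 0 (the LR entry
was overwritten by the restore), while switch_with_irq_in_window(false, 42)
returns 42 (the IRQ stayed pending until after the restore and was then
injected).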


> > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com> 
> > --- 
> >  xen/arch/arm/domain.c |    4 ++-- 
> >  xen/arch/arm/gic.c    |    4 ++-- 
> >  2 files changed, 4 insertions(+), 4 deletions(-) 
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c 
> > index f71b582..2c3b132 100644 
> > --- a/xen/arch/arm/domain.c 
> > +++ b/xen/arch/arm/domain.c 
> > @@ -141,6 +141,8 @@ static void ctxt_switch_to(struct vcpu *n) 
> >      /* VGIC */ 
> >      gic_restore_state(n); 
> > +    local_irq_enable(); 
> > +
> 
> Could you move the local_irq_enable right after ctxt_switch_to?

Right, good idea.


> >      /* XXX VFP */ 
> >      /* XXX MPU */ 
> > @@ -215,8 +217,6 @@ static void schedule_tail(struct vcpu *prev) 
> >  { 
> >      ctxt_switch_from(prev); 
> > -    local_irq_enable(); 
> > - 
> >      /* TODO 
> >         update_runstate_area(current); 
> >      */ 
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c 
> > index d4f0a43..8186ad8 100644 
> > --- a/xen/arch/arm/gic.c 
> > +++ b/xen/arch/arm/gic.c 
> > @@ -81,11 +81,11 @@ void gic_restore_state(struct vcpu *v) 
> >      if ( is_idle_vcpu(v) ) 
> >          return; 
> > -    spin_lock_irq(&gic.lock); 
> > +    spin_lock(&gic.lock); 
> >      this_cpu(lr_mask) = v->arch.lr_mask; 
> >      for ( i=0; i<nr_lrs; i++) 
> >          GICH[GICH_LR + i] = v->arch.gic_lr[i]; 
> > -    spin_unlock_irq(&gic.lock); 
> > +    spin_unlock(&gic.lock); 
> 
> As IRQs are disabled and the GICH registers can only be modified by
> the current physical CPU, I think you can remove the spin_{,un}lock and
> replace it with a dsb.

Yes, we can remove the spin_lock but I don't think we need a dsb
there. See the presence of an isb() two lines below.
