linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Raj, Ashok" <ashok.raj@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "Raj, Ashok" <ashok.raj@linux.intel.com>,
	Evan Green <evgreen@chromium.org>,
	Mathias Nyman <mathias.nyman@linux.intel.com>,
	x86@kernel.org, linux-pci <linux-pci@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Ghorai, Sukumar" <sukumar.ghorai@intel.com>,
	"Amara, Madhusudanarao" <madhusudanarao.amara@intel.com>,
	"Nandamuri, Srikanth" <srikanth.nandamuri@intel.com>,
	Ashok Raj <ashok.raj@intel.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Date: Thu, 7 May 2020 05:18:50 -0700	[thread overview]
Message-ID: <20200507121850.GB85463@otc-nc-03> (raw)
In-Reply-To: <875zdarr4h.fsf@nanos.tec.linutronix.de>

Hi Thomas

We did a bit more tracing and it looks like the IRR check is actually
not happening on the right cpu. See below.

On Tue, May 05, 2020 at 11:47:26PM +0200, Thomas Gleixner wrote:
> >
> > msi_set_affinit ()
> > {
> > ....
> >         unlock_vector_lock();
> >
> >         /*
> >          * Check whether the transition raced with a device interrupt and
> >          * is pending in the local APICs IRR. It is safe to do this outside
> >          * of vector lock as the irq_desc::lock of this interrupt is still
> >          * held and interrupts are disabled: The check is not accessing the
> >          * underlying vector store. It's just checking the local APIC's
> >          * IRR.
> >          */
> >         if (lapic_vector_set_in_irr(cfg->vector))
> >                 irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
> 
> No. This catches the transitional interrupt to the new vector on the
> original CPU, i.e. the one which is running that code.

Mathias added some trace to his xhci driver when the isr is called.

Below is the tail of my trace with last two times xhci_irq isr is called:

    <idle>-0     [003] d.h.   200.277971: xhci_irq: xhci irq
    <idle>-0     [003] d.h.   200.278052: xhci_irq: xhci irq

Just trying to follow your steps below with traces. The traces follow
the same comments in the source.

> 
> Again the steps are:
> 
>  1) Allocate new vector on new CPU

        /* Allocate a new target vector */
        ret = parent->chip->irq_set_affinity(parent, mask, force);

migration/3-24    [003] d..1   200.283012: msi_set_affinity: msi_set_affinity: quirk: 1: new vector allocated, new cpu = 0

> 
>  2) Set new vector on original CPU

        /* Redirect it to the new vector on the local CPU temporarily */
        old_cfg.vector = cfg->vector;
        irq_msi_update_msg(irqd, &old_cfg);

migration/3-24    [003] d..1   200.283033: msi_set_affinity: msi_set_affinity: Redirect to new vector 33 on old cpu 6
> 
>  3) Set new vector on new CPU

        /* Now transition it to the target CPU */
        irq_msi_update_msg(irqd, cfg);

     migration/3-24    [003] d..1   200.283044: msi_set_affinity: msi_set_affinity: Transition to new target cpu 0 vector 33



     if (lapic_vector_set_in_irr(cfg->vector))
	irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);


migration/3-24    [003] d..1   200.283046: msi_set_affinity: msi_set_affinity: Update Done [IRR 0]: irq 123 localsw: Nvec 33 Napic 0

> 
> So we have 3 points where an interrupt can fire:
> 
>  A) Before #2
> 
>  B) After #2 and before #3
> 
>  C) After #3
> 
> #A is hitting the old vector which is still valid on the old CPU and
>    will be handled once interrupts are enabled with the correct irq
>    descriptor - Normal operation (same as with maskable MSI)
> 
> #B This must be checked in the IRR because the there is no valid vector
>    on the old CPU.

The check for IRR seems like on a random cpu3 vs checking for the new vector 33
on old cpu 6?

This is the place when we force the retrigger without the IRR check things seem to fix itself.

> 
> #C is handled on the new vector on the new CPU
> 


Did we miss something? 

Cheers,
Ashok

  reply	other threads:[~2020-05-07 12:18 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-18 19:25 MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug Mathias Nyman
2020-03-19 20:24 ` Evan Green
2020-03-20  8:07   ` Mathias Nyman
2020-03-20  9:52 ` Thomas Gleixner
2020-03-23  9:42   ` Mathias Nyman
2020-03-23 14:10     ` Thomas Gleixner
2020-03-23 20:32       ` Mathias Nyman
2020-03-24  0:24         ` Thomas Gleixner
2020-03-24 16:17           ` Evan Green
2020-03-24 19:03             ` Thomas Gleixner
2020-05-01 18:43               ` Raj, Ashok
2020-05-05 19:36                 ` Thomas Gleixner
2020-05-05 20:16                   ` Raj, Ashok
2020-05-05 21:47                     ` Thomas Gleixner
2020-05-07 12:18                       ` Raj, Ashok [this message]
2020-05-07 12:53                         ` Thomas Gleixner
2020-05-07 17:57                           ` Raj, Ashok
2020-05-07 19:41                             ` Thomas Gleixner
2020-03-25 17:12             ` Mathias Nyman
     [not found] <20200508005528.GB61703@otc-nc-03>
2020-05-08 11:04 ` Thomas Gleixner
2020-05-08 16:09   ` Raj, Ashok
2020-05-08 16:49     ` Thomas Gleixner
2020-05-11 19:03       ` Raj, Ashok
2020-05-11 20:14         ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200507121850.GB85463@otc-nc-03 \
    --to=ashok.raj@intel.com \
    --cc=ashok.raj@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=evgreen@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=madhusudanarao.amara@intel.com \
    --cc=mathias.nyman@linux.intel.com \
    --cc=srikanth.nandamuri@intel.com \
    --cc=sukumar.ghorai@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).