All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: "Raj\, Ashok" <ashok.raj@intel.com>
Cc: "Raj\, Ashok" <ashok.raj@linux.intel.com>,
	Evan Green <evgreen@chromium.org>,
	Mathias Nyman <mathias.nyman@linux.intel.com>,
	x86@kernel.org, linux-pci <linux-pci@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>, "Ghorai\,
	Sukumar" <sukumar.ghorai@intel.com>, "Amara\,
	Madhusudanarao" <madhusudanarao.amara@intel.com>, "Nandamuri\,
	Srikanth" <srikanth.nandamuri@intel.com>,
	Ashok Raj <ashok.raj@intel.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Date: Mon, 11 May 2020 22:14:04 +0200	[thread overview]
Message-ID: <87h7wm5iwj.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200511190341.GA95413@otc-nc-03>

Ashok,

"Raj, Ashok" <ashok.raj@intel.com> writes:
> On Fri, May 08, 2020 at 06:49:15PM +0200, Thomas Gleixner wrote:
>> "Raj, Ashok" <ashok.raj@intel.com> writes:
>> > With legacy MSI we can have these races and kernel is trying to do the
>> > song and dance, but we see this happening even when IR is turned on.
>> > Which is perplexing. I think when we have IR, once we do the change vector 
>> > and flush the interrupt entry cache, if there was an outstandng one in 
>> > flight it should be in IRR. Possibly should be clearned up by the
>> > send_cleanup_vector() i suppose.
>> 
>> Ouch. With IR this really should never happen and yes the old vector
>> will catch one which was raised just before the migration disabled the
>> IR entry. During the change nothing can go wrong because the entry is
>> disabled and only reenabled after it's flushed which will send a pending
>> one to the new vector.
>
> with IR, I'm not sure if we actually mask the interrupt except when
> its a Posted Interrupt.

Indeed. Memory tricked me.

> We do an atomic update to IRTE, with cmpxchg_double
>
> 	ret = cmpxchg_double(&irte->low, &irte->high,
> 			     irte->low, irte->high,
> 			     irte_modified->low, irte_modified->high);

We only do this if:

        if ((irte->pst == 1) || (irte_modified->pst == 1))

i.e. the irte entry was or is going to be used for posted interrupts.

In the non-posted remapping case we do:

       set_64bit(&irte->low, irte_modified->low);
       set_64bit(&irte->high, irte_modified->high);

> followed by flushing the interrupt entry cache. After which any 
> old ones in flight before the flush should be sittig in IRR
> on the outgoing cpu.

Correct.

> The send_cleanup_vector() sends IPI to the apic_id->old_cpu which 
> would be the cpu we are running on correct? and this is a self_ipi
> to IRQ_MOVE_CLEANUP_VECTOR.

Yes, for the IR case the cleanup vector IPI is sent right away because
the IR logic allegedly guarantees that after flushing the IRTE cache no
interrupt can be sent to the old vector. IOW, after the flush a vector
sent to the previous CPU must be already in the IRR of that CPU.

> smp_irq_move_cleanup_interrupt() seems to check IRR with 
> apicid_prev_vector()
>
> 	irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
> 	if (irr & (1U << (vector % 32))) {
> 		apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
> 		continue;
> 	}
>
> And this would allow any pending IRR bits in the outgoing CPU to 
> call the relevant ISR's before draining all vectors on the outgoing
> CPU.

No. When the CPU goes offline it does not handle anything after moving
the interrupts away. The pending bits are handled in fixup_irq() after
the forced migration. If the vector is set in the IRR it sends an IPI to
the new target CPU.

That IRR check is paranoia for the following case where interrupt
migration happens between live CPUs:

    Old CPU

    set affinity                <- device sends to old vector/cpu
    ...                 
    irte_...()
    send_cleanup_vector();

    handle_cleanup_vector()     <- Would remove the vector
                                   without that IRR check and
                                   then result in a spurious interrupt

This should actually never happen because the cleanup vector is the
lowest priority vector.

> I couldn't quite pin down how the device ISR's are hooked up through
> this send_cleanup_vector() and what follows.

Device ISRs don't know anything about the underlying irq machinery.

The non IR migration does:

    Old CPU          New CPU

    set affinity
    ...                 
                                <- device sends to new vector/cpu
                     handle_irq
                       ack()
                        apic_ack_irq()
                          irq_complete_move()
                            send_cleanup_vector()

    handle_cleanup_vector()

In case that an interrupt still hits the old CPU:

    set affinity                <- device sends to old vector/cpu
    ...
    handle_irq
     ack()
      apic_ack_irq()
       irq_complete_move()      (Does not send cleanup vector because wrong CPU)

                                <- device sends to new vector/cpu
                     handle_irq
                       ack()
                        apic_ack_irq()
                          irq_complete_move()
                            send_cleanup_vector()

    handle_cleanup_vector()

Thanks,

        tglx

  reply	other threads:[~2020-05-11 20:14 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200508005528.GB61703@otc-nc-03>
2020-05-08 11:04 ` MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug Thomas Gleixner
2020-05-08 16:09   ` Raj, Ashok
2020-05-08 16:49     ` Thomas Gleixner
2020-05-11 19:03       ` Raj, Ashok
2020-05-11 20:14         ` Thomas Gleixner [this message]
2020-03-18 19:25 Mathias Nyman
2020-03-19 20:24 ` Evan Green
2020-03-20  8:07   ` Mathias Nyman
2020-03-20  9:52 ` Thomas Gleixner
2020-03-23  9:42   ` Mathias Nyman
2020-03-23 14:10     ` Thomas Gleixner
2020-03-23 20:32       ` Mathias Nyman
2020-03-24  0:24         ` Thomas Gleixner
2020-03-24 16:17           ` Evan Green
2020-03-24 19:03             ` Thomas Gleixner
2020-05-01 18:43               ` Raj, Ashok
2020-05-05 19:36                 ` Thomas Gleixner
2020-05-05 20:16                   ` Raj, Ashok
2020-05-05 21:47                     ` Thomas Gleixner
2020-05-07 12:18                       ` Raj, Ashok
2020-05-07 12:53                         ` Thomas Gleixner
2020-05-07 17:57                           ` Raj, Ashok
2020-05-07 19:41                             ` Thomas Gleixner
2020-03-25 17:12             ` Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7wm5iwj.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=ashok.raj@intel.com \
    --cc=ashok.raj@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=evgreen@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=madhusudanarao.amara@intel.com \
    --cc=mathias.nyman@linux.intel.com \
    --cc=srikanth.nandamuri@intel.com \
    --cc=sukumar.ghorai@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.