linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Ramon Fried <rfried.dev@gmail.com>
Cc: hkallweit1@gmail.com, Bjorn Helgaas <bhelgaas@google.com>,
	maz@kernel.org, lorenzo.pieralisi@arm.com,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs
Date: Fri, 17 Jan 2020 15:38:36 +0100	[thread overview]
Message-ID: <87zhem172r.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <CAGi-RUJtqdLtFBVMxL8TOQ3LGRqqrV4Ge7Fu9mTyDoQVYxtA5g@mail.gmail.com>

Ramon,

Ramon Fried <rfried.dev@gmail.com> writes:
>> So from a software perspective you want to do something like this:
>>
>> gic_irq_handler()
>> {
>>    mask_ack(gic_irqX);
>>
>>    pending = read(msi_status);
>>    for_each_bit(bit, pending) {
>>        ack(msi_status, bit);  // This clears the latch in the MSI block
>>        handle_irq(irqof(bit));
>>    }
>>    unmask(gic_irqX);
>> }
>>
>> And that works perfectly correct without masking the MSI interrupt at
>> the PCI level for a threaded handler simply because the PCI device
>> will not send another interrupt until the previous one has been
>> handled by the driver unless the PCI device is broken.
>
> I'm missing something here, isn't this implementation blocks IRQ's only during
> the HW handler and not during the threaded handler ? (Assuming that I selected
> handle_level_irq() as the default handler)

handle_level_irq() is the proper handler for the actual GIC interrupt
which does the demultiplexing. The MSI interrupts want to have
handle_edge_irq().

> Actually my implementation current implementation is very similar to what
> you just described:
>
> static void eq_msi_isr(struct irq_desc *desc)
> {
>         struct irq_chip *chip = irq_desc_get_chip(desc);
>         struct eq_msi *msi;
>         u16 status;
>         unsigned long bitmap;
>         u32 bit;
>         u32 virq;
>
>         chained_irq_enter(chip, desc);
>         msi = irq_desc_get_handler_data(desc);
>
>         while ((status = readw(msi->gcsr_regs_base + LINK_GCSR5_OFFSET)
>                 & MSI_IRQ_REQ) != 0) {
>                 pr_debug("MSI: %x\n", status >> 12);
>
>                 bitmap = status >> 12;
>                 for_each_set_bit(bit, &bitmap, msi->num_of_vectors) {
>                         virq = irq_find_mapping(msi->inner_domain, bit);
>                         if (virq) {
>                                 generic_handle_irq(virq);
>                         } else {
>                                 pr_err("unexpected MSI\n");
>                                 handle_bad_irq(desc);

Now if you look at the example I gave you there is a subtle difference:

>>    pending = read(msi_status);
>>    for_each_bit(bit, pending) {
>>        ack(msi_status, bit);  // This clears the latch in the MSI block
>>        handle_irq(irqof(bit));
>>    }

And this clearing is important when one of the MSI interrupts is
actually having a threaded handler.

 MSI interrupt fires
  -> sets bit in msi_status
    -> MSI block raises GIC interrupt because msi_status != 0

 CPU handles GIC interrupt
   pending = read(msi_status);
   for_each_bit(bit, pending)
      handle_irq()
        primary_handler()
           -> WAKEUP_THREAD

   RETI, but msi_status is still != 0

> Additionally the domain allocation is defined like:
> static int eq_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>                                  unsigned int nr_irqs, void *args)
> {
>         struct eq_msi *msi = domain->host_data;
>         unsigned long bit;
>         u32 mask;
>
>         /* We only allow 32 MSI per device */
>         WARN_ON(nr_irqs > 32);
>         if (nr_irqs > 32)
>                 return -ENOSPC;
>
>         bit = find_first_zero_bit(msi->used, msi->num_of_vectors);
>         if (bit >= msi->num_of_vectors)
>                 return -ENOSPC;
>
>         set_bit(bit, msi->used);
>
>         mask = readw(msi->gcsr_regs_base + LINK_GCSR6_OFFSET);
>         mask |= BIT(bit) << 12;
>         writew(mask, msi->gcsr_regs_base + LINK_GCSR6_OFFSET);
>
>         irq_domain_set_info(domain, virq, bit, &eq_msi_bottom_irq_chip,
>                             domain->host_data, handle_level_irq,

This is wrong. MSI is edge type, not level and you are really mixing up
the concepts here.

The fact that the MSI block raises a level interrupt on the output side
has absolutely nothing to do with the type of the MSI interrupt itself.

MSI is edge type by definition and this does not change just because
there is a translation unit between the MSI interrupt and the CPU
controller.

The actual MSI interrupts do not even know about the existance of that
MSI block at all. They do not care, as all they need to know is a
message and an address. When an interrupt is raised in the device the
MSI chip associated to the device (PCI or something else) writes this
message to the address exactly ONCE. And this exactly ONCE defines the
edge nature of MSI.

A proper designed MSI device should not send another message before the
interrupt handler which is associated to the device has handled the
interrupt at the device level.

So you really have to understand that the GIC interrupt and the MSI
interrupts are two different entities. They just have a 'connection'
because the message/address which is handed to the MSI device triggers
that GIC interrupt via the MSI translation unit. But they are still
different and independent entities.

See?

Thanks,

        tglx


        




  reply	other threads:[~2020-01-17 14:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-09 10:27 MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs Ramon Fried
2020-01-02  8:19 ` Ramon Fried
2020-01-13 14:59 ` Ramon Fried
2020-01-14 12:15 ` Thomas Gleixner
2020-01-14 21:38   ` Ramon Fried
2020-01-14 21:40     ` Ramon Fried
2020-01-14 22:54       ` Thomas Gleixner
2020-01-16  0:19         ` Ramon Fried
2020-01-16  1:39           ` Thomas Gleixner
2020-01-16  7:58             ` Ramon Fried
2020-01-16  9:32               ` Thomas Gleixner
2020-01-17 13:32                 ` Ramon Fried
2020-01-17 14:38                   ` Thomas Gleixner [this message]
2020-01-17 15:43                     ` Ramon Fried
2020-01-17 17:11                       ` Thomas Gleixner
2020-01-17 20:01                         ` Ramon Fried
2020-01-17 22:47                           ` Thomas Gleixner
2020-01-20  8:02                             ` Ramon Fried

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zhem172r.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=bhelgaas@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=maz@kernel.org \
    --cc=rfried.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).