All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sandor Bodo-Merle <sbodomerle@gmail.com>
To: Ray Jui <ray.jui@broadcom.com>
Cc: "Pali Rohár" <pali@kernel.org>,
	linux-pci@vger.kernel.org, bcm-kernel-feedback-list@broadcom.com
Subject: Re: pcie-iproc-msi.c: Bug in Multi-MSI support?
Date: Fri, 21 May 2021 16:00:16 +0200	[thread overview]
Message-ID: <CABLWAfSV1WhToV8VWVUyGOPPr4mPC_Md6pb=1RkO0y=eK+JW7w@mail.gmail.com> (raw)
In-Reply-To: <CABLWAfQsspRJdqHJ1o5WGzDYne_uSk-8qdAzQ1LtzGtgM4Yhxg@mail.gmail.com>

Reading back the patches - and comparing them to the multi-MSI driver
versions that are using "order_base_2(nr_irqs)"
it seems that our engineering went on a different route.
We store the allocated hwirq in the MSI message data field (shifted by
5 - to leave room for the max 32 vectors).
So instead of using something like find_next_bit() - and i guess in
that case you really need properly aligned bitfields - we
use all the available data from the msi vector to be able to infer the
proper virq number.

btw - my interpretation can be totally bogus - since it was a while i
looked on the MSI code.

On Fri, May 21, 2021 at 1:39 PM Sandor Bodo-Merle <sbodomerle@gmail.com> wrote:
>
> Digging up our patch queue - i found another multi-MSI related fix:
>
>     Unfortunately the reverse mapping of the hwirq - as made by
>     irq_find_mapping() was not applied to the message data only, but
>     also to the MSI vector, which was lost as a result.
>     Make sure that the reverse mapping is applied to the proper hwirq,
>     contained in the message data.
>     Tested on Saber2 and Katana2.
>
>     Fixes: fc54bae288182 ("PCI: iproc: Allow allocation of multiple MSIs")
>
> diff --git drivers/pci/host/pcie-iproc-msi.c drivers/pci/host/pcie-iproc-msi.c
> index 990fc906d73d..708fdb1065f8 100644
> --- drivers/pci/host/pcie-iproc-msi.c
> +++ drivers/pci/host/pcie-iproc-msi.c
> @@ -237,7 +237,12 @@ static void iproc_msi_irq_compose_msi_msg(struct
> irq_data *data,
>         addr = msi->msi_addr + iproc_msi_addr_offset(msi, data->hwirq);
>         msg->address_lo = lower_32_bits(addr);
>         msg->address_hi = upper_32_bits(addr);
> -       msg->data = data->hwirq << 5;
> +       /*
> +        * Since we have multiple hwirq mapped to a single MSI vector,
> +        * now we need to derive the hwirq at CPU0.  It can then be used to
> +        * mapped back to virq.
> +        */
> +       msg->data = hwirq_to_canonical_hwirq(msi, data->hwirq) << 5;
>  }
>
>  static struct irq_chip iproc_msi_bottom_irq_chip = {
> @@ -307,14 +312,8 @@ static inline u32 decode_msi_hwirq(struct
> iproc_msi *msi, u32 eq, u32 head)
>         offs = iproc_msi_eq_offset(msi, eq) + head * sizeof(u32);
>         msg = (u32 *)(msi->eq_cpu + offs);
>         hwirq = readl(msg);
> -       hwirq = (hwirq >> 5) + (hwirq & 0x1f);
>
> -       /*
> -        * Since we have multiple hwirq mapped to a single MSI vector,
> -        * now we need to derive the hwirq at CPU0.  It can then be used to
> -        * mapped back to virq.
> -        */
> -       return hwirq_to_canonical_hwirq(msi, hwirq);
> +       return hwirq;
>  }
>
>  static void iproc_msi_handler(struct irq_desc *desc)
> @@ -360,7 +359,7 @@ static void iproc_msi_handler(struct irq_desc *desc)
>                 /* process all outstanding events */
>                 while (nr_events--) {
>                         hwirq = decode_msi_hwirq(msi, eq, head);
> -                       virq = irq_find_mapping(msi->inner_domain, hwirq);
> +                       virq = irq_find_mapping(msi->inner_domain,
> hwirq >> 5) + (hwirq & 0x1f);
>                         generic_handle_irq(virq);
>
>                         head++;
>
> On Thu, May 20, 2021 at 7:11 PM Ray Jui <ray.jui@broadcom.com> wrote:
> >
> >
> >
> > On 5/20/2021 7:22 AM, Sandor Bodo-Merle wrote:
> > > On Thu, May 20, 2021 at 4:05 PM Pali Rohár <pali@kernel.org> wrote:
> > >>
> > >> Hello!
> > >>
> > >> On Thursday 20 May 2021 15:47:46 Sandor Bodo-Merle wrote:
> > >>> Hi Pali,
> > >>>
> > >>> thanks for catching this - i dig up the followup fixup commit we have
> > >>> for the iproc multi MSI (it was sent to Broadcom - but unfortunately
> > >>> we missed upstreaming it).
> > >>>
> > >>> Commit fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
> > >>> failed to reserve the proper number of bits from the inner domain.
> > >>> We need to allocate the proper amount of bits otherwise the domains for
> > >>> multiple PCIe endpoints may overlap and freeing one of them will result
> > >>> in freeing unrelated MSI vectors.
> > >>>
> > >>> Fixes: fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
> > >>> ---
> > >>>  drivers/pci/host/pcie-iproc-msi.c | 8 ++++----
> > >>>  1 file changed, 4 insertions(+), 4 deletions(-)
> > >>>
> > >>> diff --git drivers/pci/host/pcie-iproc-msi.c drivers/pci/host/pcie-iproc-msi.c
> > >>> index 708fdb1065f8..a00492dccb74 100644
> > >>> --- drivers/pci/host/pcie-iproc-msi.c
> > >>> +++ drivers/pci/host/pcie-iproc-msi.c
> > >>> @@ -260,11 +260,11 @@ static int iproc_msi_irq_domain_alloc(struct
> > >>> irq_domain *domain,
> > >>>
> > >>>         mutex_lock(&msi->bitmap_lock);
> > >>>
> > >>> -       /* Allocate 'nr_cpus' number of MSI vectors each time */
> > >>> +       /* Allocate 'nr_irqs' multiplied by 'nr_cpus' number of MSI
> > >>> vectors each time */
> > >>>         hwirq = bitmap_find_next_zero_area(msi->bitmap, msi->nr_msi_vecs, 0,
> > >>> -                                          msi->nr_cpus, 0);
> > >>> +                                          msi->nr_cpus * nr_irqs, 0);
> > >>
> > >> I'm not sure if this construction is correct. Multi-MSI interrupts needs
> > >> to be aligned to number of requested interrupts. So if wifi driver asks
> > >> for 32 Multi-MSI interrupts then first allocated interrupt number must
> > >> be dividable by 32.
> > >>
> > >
> > > Ahh - i guess you are right. In our internal engineering we always
> > > request 32 vectors.
> > > IIRC the multiply by "nr_irqs" was added for iqr affinity to work correctly.
> > >
> >
> > May I ask which platforms are you guys running this driver on? Cygnus or
> > Northstar? Not that it matters, but just out of curiosity.
> >
> > Let me start by explaining how MSI support works in this driver, or more
> > precisely, for all platforms that support this iProc based event queue
> > MSI scheme:
> >
> > In iProc PCIe core, each MSI group is serviced by a GIC interrupt
> > (hwirq) and a dedicated event queue (event queue is paired up with
> > hwirq).  Each MSI group can support up to 64 MSI vectors. Note 64 is the
> > depth of the event queue.
> >
> > The number of MSI groups varies between different iProc SoCs. The total
> > number of CPU cores also varies. To support MSI IRQ affinity, we
> > distribute GIC interrupts across all available CPUs.  MSI vector is
> > moved from one GIC interrupt to another to steer to the target CPU.
> >
> > Assuming:
> > The number of MSI groups (the number of total hwirq for this PCIe
> > controller) is M
> > The number of CPU cores is N
> > M is always a multiple of N (we ensured that in the setup function)
> >
> > Therefore:
> > Total number of raw MSI vectors = M * 64
> > Total number of supported MSI vectors = (M * 64) / N
> >
> > I guess I'm not too clear on what you mean by "multi-MSI interrupts
> > needs to be aligned to number of requested interrupts.". Would you be
> > able to plug this into the above explanation so we can have a more clear
> > understanding of what you mean here?
> >
> > In general, I don't see much issue of allocating 'msi->nr_cpus *
> > nr_irqs' here as long as we can still meet the affinity distribution
> > requirement as mentioned above.
> >
> > Example in the dw PCIe driver does the following for reserving in the
> > bitmap:
> >
> > bitmap_find_free_region(pp->msi_irq_in_use, pp->num_vectors,
> > order_base_2(nr_irqs));
> >
> > >>>         if (hwirq < msi->nr_msi_vecs) {
> > >>> -               bitmap_set(msi->bitmap, hwirq, msi->nr_cpus);
> > >>> +               bitmap_set(msi->bitmap, hwirq, msi->nr_cpus * nr_irqs);
> > >>
> > >> And another issue is that only power of 2 interrupts for Multi-MSI can
> > >> be allocated. Otherwise one number may be allocated to more devices.
> > >>
> > >> But I'm not sure how number of CPUs affects it as other PCIe controller
> > >> drivers do not use number of CPUs.
> >
> > As I explained above, we definitely need to take nr_cpus into account,
> > since we cannot move around the physical interrupt between CPUs.
> > Instead, we dedicate each physical interrupt to the CPU and service the
> > MSI across different event queues accordingly when user change irq affinity.
> >
> > >>
> > >> Other drivers are using bitmap_find_free_region() function with
> > >> order_base_2(nr_irqs) as argument.
> > >>
> >
> > Yes we should do that too.
> >
> > >> I hope that somebody else more skilled with MSI interrupts look at these
> > >> constructions if are correct or needs more rework.
> > >>
> > >
> > > I see the point - i'll ask also in our engineering department ...
> > >
> > >>>         } else {
> > >>>                 mutex_unlock(&msi->bitmap_lock);
> > >>>                 return -ENOSPC;
> > >>> @@ -292,7 +292,7 @@ static void iproc_msi_irq_domain_free(struct
> > >>> irq_domain *domain,
> > >>>         mutex_lock(&msi->bitmap_lock);
> > >>>
> > >>>         hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq);
> > >>> -       bitmap_clear(msi->bitmap, hwirq, msi->nr_cpus);
> > >>> +       bitmap_clear(msi->bitmap, hwirq, msi->nr_cpus * nr_irqs);
> > >>>
> > >>>         mutex_unlock(&msi->bitmap_lock);
> > >>>
> > >>>
> > >>> On Thu, May 20, 2021 at 2:04 PM Pali Rohár <pali@kernel.org> wrote:
> > >>>>
> > >>>> Hello!
> > >>>>
> > >>>> I think there is a bug in pcie-iproc-msi.c driver. It declares
> > >>>> Multi MSI support via MSI_FLAG_MULTI_PCI_MSI flag, see:
> > >>>>
> > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-iproc-msi.c?h=v5.12#n174
> > >>>>
> > >>>> but its iproc_msi_irq_domain_alloc() function completely ignores nr_irqs
> > >>>> argument when allocating interrupt numbers from bitmap, see:
> > >>>>
> > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-iproc-msi.c?h=v5.12#n246
> > >>>>
> > >>>> I think this this is incorrect as alloc callback should allocate nr_irqs
> > >>>> multi interrupts as caller requested. All other drivers with Multi MSI
> > >>>> support are doing it.
> > >>>>
> > >>>> Could you look at it?

  reply	other threads:[~2021-05-21 14:00 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20 12:00 pcie-iproc-msi.c: Bug in Multi-MSI support? Pali Rohár
2021-05-20 13:47 ` Sandor Bodo-Merle
2021-05-20 14:05   ` Pali Rohár
2021-05-20 14:22     ` Sandor Bodo-Merle
2021-05-20 17:11       ` Ray Jui
2021-05-20 19:31         ` Sandor Bodo-Merle
2021-05-21 11:39         ` Sandor Bodo-Merle
2021-05-21 14:00           ` Sandor Bodo-Merle [this message]
2021-05-24 10:37         ` Marc Zyngier
2021-05-25 17:27           ` Ray Jui
2021-05-25 17:35             ` Pali Rohár
2021-05-25 17:39               ` Ray Jui
2021-05-26  7:57             ` Marc Zyngier
2021-05-26 16:10               ` Sandor Bodo-Merle
2021-06-02  8:34                 ` Marc Zyngier
2021-06-03 16:59                   ` Ray Jui
2021-06-04  9:17                     ` Sandor Bodo-Merle
2021-06-04 16:47                       ` Ray Jui
2021-05-26 17:34               ` Ray Jui
2021-05-26 20:18                 ` Sandor Bodo-Merle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABLWAfSV1WhToV8VWVUyGOPPr4mPC_Md6pb=1RkO0y=eK+JW7w@mail.gmail.com' \
    --to=sbodomerle@gmail.com \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=pali@kernel.org \
    --cc=ray.jui@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.