On 6/2/2021 1:34 AM, Marc Zyngier wrote:
> On Wed, 26 May 2021 17:10:24 +0100,
> Sandor Bodo-Merle wrote:
>>
>> [1 ]
>> The following patch addresses the allocation issue - but indeed - it
>> won't fix the atomicity of IRQ affinity changes in this driver (the
>> majority of our products rely on single-core SoCs; we also use a
>> dual-core SoC, but we don't change the initial IRQ affinity).
>>
>> On Wed, May 26, 2021 at 9:57 AM Marc Zyngier wrote:
>>>
>>> On Tue, 25 May 2021 18:27:54 +0100,
>>> Ray Jui wrote:
>>>>
>>>> On 5/24/2021 3:37 AM, Marc Zyngier wrote:
>>>>> On Thu, 20 May 2021 18:11:32 +0100,
>>>>> Ray Jui wrote:
>>>>>>
>>>>>> On 5/20/2021 7:22 AM, Sandor Bodo-Merle wrote:
>>>
>>> [...]
>>>
>>>>>> I guess I'm not too clear on what you mean by "multi-MSI interrupts
>>>>>> need to be aligned to the number of requested interrupts". Would you
>>>>>> be able to plug this into the above explanation so we can have a
>>>>>> clearer understanding of what you mean here?
>>>>>
>>>>> That's a generic PCI requirement: if you are providing a Multi-MSI
>>>>> configuration, the base vector number has to be size-aligned
>>>>> (2-aligned for 2 MSIs, 4-aligned for 4, up to 32), and the end-point
>>>>> supplies up to 5 bits that are OR-ed into the base vector number,
>>>>> with a *single* doorbell address. You effectively provide a single MSI
>>>>> number and a single address, and the device knows how to drive 2^n MSIs.
>>>>>
>>>>> This is different from MSI-X, which defines multiple individual
>>>>> vectors, each with their own doorbell address.
>>>>>
>>>>> The main problem you have here (other than the broken allocation
>>>>> mechanism) is that moving an interrupt from one core to another
>>>>> implies moving the doorbell address to that of another MSI
>>>>> group. This isn't possible for Multi-MSI, as all the MSIs must have
>>>>> the same doorbell address.
>>>>> As far as I can see, there is no way to
>>>>> support Multi-MSI together with affinity changes on this HW, and you
>>>>> should stop advertising support for this feature.
>>>>>
>>>>
>>>> I was not aware of the fact that Multi-MSI needs to use the same
>>>> doorbell address (aka the MSI posted write address?). Thank you for
>>>> helping to point it out. In this case, yes, like you said, we cannot
>>>> possibly support both Multi-MSI and affinity at the same time, since
>>>> supporting affinity requires us to move from one event queue (and IRQ)
>>>> to another, which will have a different doorbell address.
>>>>
>>>> Do you think it makes sense to only advertise the multi-MSI capability
>>>> in the single CPU core case (detected at runtime via
>>>> 'num_possible_cpus')? This would at least allow multi-MSI to work on
>>>> the single-CPU-core platforms that Sandor and Pali use.
>>>
>>> I don't think this makes much sense. Single-CPU machines are an oddity
>>> these days, and I'd rather you simplify this (already pretty
>>> complicated) driver.
>>>
>>>>> There is also a more general problem here, which is the atomicity of
>>>>> the update on affinity change. If you are moving an interrupt from one
>>>>> CPU to the other, it seems you change both the vector number and the
>>>>> target address. If that is the case, this isn't atomic, and you may
>>>>> end up with the device generating a message based on a half-applied
>>>>> update.
>>>>
>>>> Are you referring to the callbacks in 'irq_set_affinity' and
>>>> 'irq_compose_msi_msg'? In such a case, can you help to recommend a
>>>> solution for it (or is there no solution given this architecture)? It
>>>> does not appear such atomicity can be enforced from the irq framework
>>>> level.
>>>
>>> irq_compose_msi_msg() is only one part of the problem. The core of the
>>> issue is that the programming of the end-point is not atomic (you need
>>> to update a 32bit payload *and* a 64bit address).
>>>
>>> A solution to work around it would be to rework the way you allocate
>>> the vectors, making them constant across all CPUs so that only the
>>> address changes when changing the affinity.
>>>
>>> Thanks,
>>>
>>> M.
>>>
>>> --
>>> Without deviation from the norm, progress is not possible.
>> [2 0001-PCI-iproc-fix-the-base-vector-number-allocation-for-.patch ]
>> From df31c9c0333ca4922b7978b30719348e368bea3c Mon Sep 17 00:00:00 2001
>> From: Sandor Bodo-Merle
>> Date: Wed, 26 May 2021 17:48:16 +0200
>> Subject: [PATCH] PCI: iproc: fix the base vector number allocation for Multi
>>  MSI
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> Commit fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
>> failed to reserve the proper number of bits from the inner domain.
>> Natural alignment of the base vector number was also not guaranteed.
>>
>> Fixes: fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
>> Reported-by: Pali Rohár
>> Signed-off-by: Sandor Bodo-Merle
>> ---
>>  drivers/pci/controller/pcie-iproc-msi.c | 18 ++++++++----------
>>  1 file changed, 8 insertions(+), 10 deletions(-)
>>
>> diff --git drivers/pci/controller/pcie-iproc-msi.c drivers/pci/controller/pcie-iproc-msi.c
>> index eede4e8f3f75..fa2734dd8482 100644
>> --- drivers/pci/controller/pcie-iproc-msi.c
>> +++ drivers/pci/controller/pcie-iproc-msi.c
>> @@ -252,18 +252,15 @@ static int iproc_msi_irq_domain_alloc(struct irq_domain *domain,
>>
>>  	mutex_lock(&msi->bitmap_lock);
>>
>> -	/* Allocate 'nr_cpus' number of MSI vectors each time */
>> -	hwirq = bitmap_find_next_zero_area(msi->bitmap, msi->nr_msi_vecs, 0,
>> -					   msi->nr_cpus, 0);
>> -	if (hwirq < msi->nr_msi_vecs) {
>> -		bitmap_set(msi->bitmap, hwirq, msi->nr_cpus);
>> -	} else {
>> -		mutex_unlock(&msi->bitmap_lock);
>> -		return -ENOSPC;
>> -	}
>> +	/* Allocate 'nr_irqs' multiplied by 'nr_cpus' number of MSI vectors each time */
>> +	hwirq = bitmap_find_free_region(msi->bitmap, msi->nr_msi_vecs,
>> +					order_base_2(msi->nr_cpus * nr_irqs));
>>
>>  	mutex_unlock(&msi->bitmap_lock);
>>
>> +	if (hwirq < 0)
>> +		return -ENOSPC;
>> +
>>  	for (i = 0; i < nr_irqs; i++) {
>>  		irq_domain_set_info(domain, virq + i, hwirq + i,
>>  				    &iproc_msi_bottom_irq_chip,
>> @@ -284,7 +281,8 @@ static void iproc_msi_irq_domain_free(struct irq_domain *domain,
>>  	mutex_lock(&msi->bitmap_lock);
>>
>>  	hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq);
>> -	bitmap_clear(msi->bitmap, hwirq, msi->nr_cpus);
>> +	bitmap_release_region(msi->bitmap, hwirq,
>> +			      order_base_2(msi->nr_cpus * nr_irqs));
>>
>>  	mutex_unlock(&msi->bitmap_lock);
>>
>
> This looks reasonable. However, this doesn't change the issue that you
> have with SMP systems and Multi-MSI. I'd like to see a more complete
> patch (disabling Multi-MSI on SMP, at the very least).
>

Yeah, agree with you that we want to see this patch at least disable
multi-MSI when it detects 'num_possible_cpus' > 1.

> Thanks,
>
> M.
>