linux-kernel.vger.kernel.org archive mirror
From: luojiaxing <luojiaxing@huawei.com>
To: Marc Zyngier <maz@kernel.org>, John Garry <john.garry@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Zhou Wang <wangzhou1@hisilicon.com>,
	<linux-kernel@vger.kernel.org>, <qianweili@huawei.com>
Subject: Re: PCI MSI issue with reinserting a driver
Date: Fri, 27 Aug 2021 16:33:31 +0800
Message-ID: <e4689914-508c-b1f1-a372-cb890d64f391@huawei.com>
In-Reply-To: <3d3d0155e66429968cb4f6b4feeae4b3@kernel.org>


On 2021/2/4 1:23, Marc Zyngier wrote:
> On 2021-02-02 15:46, John Garry wrote:
>> On 02/02/2021 14:48, Marc Zyngier wrote:
>>>>>
>>>>> Not sure. I also now notice an error for the SAS PCI driver on D06
>>>>> when nr_cpus < 16, which means the number of MSI vectors allocated is
>>>>> < 32, so it looks like the same problem. There we try to allocate 16 +
>>>>> min(nr_cpus, 16) MSI.
>>>>>
>>>>> Anyway, let me have a look today to see what is going wrong.
>>>>>
>>>> Could this be the problem:
>>>>
>>>> nr_cpus=11
>>>>
>>>> In alloc path, we have:
>>>>     its_alloc_device_irq(nvecs=27 = 16+11)
>>>>       bitmap_find_free_region(order = 5);
>>>> In free path, we have:
>>>>     its_irq_domain_free(nvecs = 1), called once for each of the 27 vecs
>>>>       bitmap_release_region(order = 0)
>>>>
>>>> So we allocate 32 bits, but only free 27, and the 2nd alloc of 32 fails.
>>
>> [ ... ]
>>
>>>>
>>>>
>>>> But I'm not sure that we have any requirement for those map bits to be
>>>> consecutive.
>>>
>>> We can't really do that. All the events must be contiguous,
>>> and there are also a lot of assumptions in the ITS driver that
>>> LPI allocations are contiguous too.
>>>
>>> But there is also the fact that for Multi-MSI, we *must*
>>> allocate 32 vectors. Any driver could assume that if we have
>>> allocated 17 vectors, then there are another 15 available.
>>>
>>> My question still stands: how was this working with the previous
>>> behaviour?
>>
>> Because previously in this scenario we would allocate 32 bits and free
>> 32 bits in the map; but now we allocate 32 bits, yet only free 27 - so
>> we leak 5 bits. This comes from how irq_domain_free_irqs_hierarchy()
>> now frees per interrupt, instead of all irqs per domain.
>>
>> Before:
>>  In free path, we have:
>>      its_irq_domain_free(nvecs = 27)
>>        bitmap_release_region(count order = 5 == 32bits)
>>
>> Current:
>>  In free path, we have:
>>      its_irq_domain_free(nvecs = 1), called once for each of the 27 vecs
>>        bitmap_release_region(count order = 0 == 1bit)
>
> Right. I was focusing on the patch and blindly ignored the explanation
> at the top of the email. Apologies for that.
>
> I'm not overly keen on handling this in the ITS though, and I'd rather
> we try and do it in the generic code. How about this (compile tested
> only)?
>
> Thanks,
>
>         M.


Hi Marc, just a friendly reminder about this issue. We tested 5.14-rc4
and found that the issue still exists; the bugfix below has not been
merged into the kernel.

Do you have any plans to merge this patch?


Thanks

Jiaxing


>
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6aacd342cd14..cfccad83c2df 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -1399,8 +1399,19 @@ static void irq_domain_free_irqs_hierarchy(struct irq_domain *domain,
>          return;
>
>      for (i = 0; i < nr_irqs; i++) {
> -        if (irq_domain_get_irq_data(domain, irq_base + i))
> -            domain->ops->free(domain, irq_base + i, 1);
> +        int n;
> +
> +        /* Find the largest possible span of IRQs to free in one go */
> +        for (n = 0;
> +             ((i + n) < nr_irqs &&
> +              irq_domain_get_irq_data(domain, irq_base + i + n));
> +             n++);
> +
> +        if (!n)
> +            continue;
> +
> +        domain->ops->free(domain, irq_base + i, n);
> +        i += n;
>      }
>  }
>
>


