* [IRQ] IRQ affinity not working properly?
@ 2021-01-29 19:17 Chris Friesen
From: Chris Friesen @ 2021-01-29 19:17 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Hi,
I'm not subscribed to the list, please cc me on replies.
I have a CentOS 7 linux system with 48 logical CPUs and a number of
Intel NICs running the i40e driver. It was booted with
irqaffinity=0-1,24-25 in the kernel boot args, resulting in
/proc/irq/default_smp_affinity showing "0000,03000003". CPUs 2-11 are
set as "isolated" in the kernel boot args. The irqbalance daemon is not
running.
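(For reference, that mask value follows directly from the CPU list; a minimal shell sketch of the arithmetic, assuming the CPUs of interest fit in the low 32-bit word shown:)

```shell
# Derive the expected affinity bitmask from the boot-arg CPU list 0-1,24-25.
mask=0
for cpu in 0 1 24 25; do
  mask=$(( mask | (1 << cpu) ))
done
printf '%08x\n' "$mask"   # prints 03000003, matching /proc/irq/default_smp_affinity
```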
The problem I'm seeing is that /proc/interrupts shows iavf interrupts
(associated with physical devices running the i40e driver) on other CPUs
than the expected affinity. For example, here are some iavf interrupts
on CPU 4 where I would not expect to see any interrupts given that "cat
/proc/irq/<NUM>/smp_affinity_list" reports "0-1,24-25" for all these
interrupts. (Sorry for the line wrapping.)
cat /proc/interrupts | grep -e CPU -e 941: -e 942: -e 943: -e 944: -e 945: -e 961: -e 962: -e 963: -e 964: -e 965:
            CPU0     CPU1     CPU2     CPU3     CPU4     CPU5
 941:          0        0        0        0    28490        0   IR-PCI-MSI-edge   iavf-0000:b5:03.6:mbx
 942:          0        0        0        0   333832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
 943:          0        0        0        0   300842        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
 944:          0        0        0        0   333845        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
 945:          0        0        0        0   333822        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
 961:          0        0        0        0    28492        0   IR-PCI-MSI-edge   iavf-0000:b5:02.7:mbx
 962:          0        0        0        0   435608        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
 963:          0        0        0        0   394832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
 964:          0        0        0        0   398414        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
 965:          0        0        0        0   192847        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
There were IRQs coming in on the "iavf-0000:b5:02.7:mbx" interrupt at
roughly 1 per second without any traffic, while the interrupt rate on
the "iavf-net1-TxRx-<X>" seemed to be related to traffic.
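(A small helper along these lines prints the configured and, on kernels that expose effective_affinity_list, the effective affinity for an IRQ side by side; a sketch — the `base` parameter is only there so the logic can be exercised against test data rather than a live /proc:)

```shell
# Sketch: show configured vs. effective affinity for one IRQ number.
# effective_affinity_list is only present on kernels built with
# GENERIC_IRQ_EFFECTIVE_AFF_MASK; missing files print as "?".
show_affinity() {
  irq=$1
  base=${2:-/proc/irq}
  printf 'IRQ %s: configured=%s effective=%s\n' "$irq" \
    "$(cat "$base/$irq/smp_affinity_list" 2>/dev/null || echo '?')" \
    "$(cat "$base/$irq/effective_affinity_list" 2>/dev/null || echo '?')"
}
# e.g.: for irq in 941 942 943 944 945 961 962 963 964 965; do show_affinity "$irq"; done
```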
Is this expected? It seems like the IRQ subsystem is not respecting the
configured SMP affinity for the interrupt in question. I've also seen
the same behaviour with igb interrupts.
Anyone have any ideas?
Thanks,
Chris
* Re: [IRQ] IRQ affinity not working properly?
From: Thomas Gleixner @ 2021-03-28 18:45 UTC (permalink / raw)
To: Chris Friesen, LKML
On Fri, Jan 29 2021 at 13:17, Chris Friesen wrote:
> I have a CentOS 7 linux system with 48 logical CPUs and a number of
Kernel version?
> Intel NICs running the i40e driver. It was booted with
> irqaffinity=0-1,24-25 in the kernel boot args, resulting in
> /proc/irq/default_smp_affinity showing "0000,03000003". CPUs 2-11 are
> set as "isolated" in the kernel boot args. The irqbalance daemon is not
> running.
>
> The problem I'm seeing is that /proc/interrupts shows iavf interrupts
> (associated with physical devices running the i40e driver) on other CPUs
> than the expected affinity. For example, here are some iavf interrupts
> on CPU 4 where I would not expect to see any interrupts given that "cat
> /proc/irq/<NUM>/smp_affinity_list" reports "0-1,24-25" for all these
> interrupts. (Sorry for the line wrapping.)
>
> cat /proc/interrupts | grep -e CPU -e 941: -e 942: -e 943: -e 944: -e 945: -e 961: -e 962: -e 963: -e 964: -e 965:
>
>             CPU0     CPU1     CPU2     CPU3     CPU4     CPU5
>  941:          0        0        0        0    28490        0   IR-PCI-MSI-edge   iavf-0000:b5:03.6:mbx
>  942:          0        0        0        0   333832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
>  943:          0        0        0        0   300842        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
>  944:          0        0        0        0   333845        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
>  945:          0        0        0        0   333822        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
>  961:          0        0        0        0    28492        0   IR-PCI-MSI-edge   iavf-0000:b5:02.7:mbx
>  962:          0        0        0        0   435608        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
>  963:          0        0        0        0   394832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
>  964:          0        0        0        0   398414        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
>  965:          0        0        0        0   192847        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
>
> There were IRQs coming in on the "iavf-0000:b5:02.7:mbx" interrupt at
> roughly 1 per second without any traffic, while the interrupt rate on
> the "iavf-net1-TxRx-<X>" seemed to be related to traffic.
>
> Is this expected? It seems like the IRQ subsystem is not respecting the
> configured SMP affinity for the interrupt in question. I've also seen
> the same behaviour with igb interrupts.
No it's not expected. Do you see the same behaviour with a recent
mainline kernel, i.e. 5.10 or 5.11?
Thanks,
tglx
* Re: [IRQ] IRQ affinity not working properly?
From: Nitesh Narayan Lal @ 2021-04-21 13:31 UTC (permalink / raw)
To: Thomas Gleixner, Chris Friesen, LKML, Jesse Brandeburg
On 3/28/21 2:45 PM, Thomas Gleixner wrote:
> On Fri, Jan 29 2021 at 13:17, Chris Friesen wrote:
>> I have a CentOS 7 linux system with 48 logical CPUs and a number of
<snip>
>> IR-PCI-MSI-edge iavf-net1-TxRx-3
>>  961:          0        0        0        0    28492        0   IR-PCI-MSI-edge   iavf-0000:b5:02.7:mbx
>>  962:          0        0        0        0   435608        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
>>  963:          0        0        0        0   394832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
>>  964:          0        0        0        0   398414        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
>>  965:          0        0        0        0   192847        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
>>
>> There were IRQs coming in on the "iavf-0000:b5:02.7:mbx" interrupt at
>> roughly 1 per second without any traffic, while the interrupt rate on
>> the "iavf-net1-TxRx-<X>" seemed to be related to traffic.
>>
>> Is this expected? It seems like the IRQ subsystem is not respecting the
>> configured SMP affinity for the interrupt in question. I've also seen
>> the same behaviour with igb interrupts.
> No it's not expected. Do you see the same behaviour with a recent
> mainline kernel, i.e. 5.10 or 5.11?
>
>
Jesse pointed me to this thread; apologies that it took a while for me
to respond here.
I agree it will be interesting to see which kernel version Chris is
reproducing the issue with.
Initially, I thought that this issue was the same as the one that we have
been discussing in another thread [1].
However, in that case, the smp affinity mask itself is incorrect and doesn't
follow the default smp affinity mask (with irqbalance disabled).
[1] https://lore.kernel.org/lkml/1a044a14-0884-eedb-5d30-28b4bec24b23@redhat.com/
--
Thanks
Nitesh
* Re: [IRQ] IRQ affinity not working properly?
From: Thomas Gleixner @ 2021-04-22 15:42 UTC (permalink / raw)
To: Nitesh Narayan Lal, Chris Friesen, LKML, Jesse Brandeburg
On Wed, Apr 21 2021 at 09:31, Nitesh Narayan Lal wrote:
> On 3/28/21 2:45 PM, Thomas Gleixner wrote:
>> On Fri, Jan 29 2021 at 13:17, Chris Friesen wrote:
>>> I have a CentOS 7 linux system with 48 logical CPUs and a number of
>
> <snip>
>
>>> IR-PCI-MSI-edge iavf-net1-TxRx-3
>>>  961:          0        0        0        0    28492        0   IR-PCI-MSI-edge   iavf-0000:b5:02.7:mbx
>>>  962:          0        0        0        0   435608        0   IR-PCI-MSI-edge   iavf-net1-TxRx-0
>>>  963:          0        0        0        0   394832        0   IR-PCI-MSI-edge   iavf-net1-TxRx-1
>>>  964:          0        0        0        0   398414        0   IR-PCI-MSI-edge   iavf-net1-TxRx-2
>>>  965:          0        0        0        0   192847        0   IR-PCI-MSI-edge   iavf-net1-TxRx-3
>>>
>>> There were IRQs coming in on the "iavf-0000:b5:02.7:mbx" interrupt at
>>> roughly 1 per second without any traffic, while the interrupt rate on
>>> the "iavf-net1-TxRx-<X>" seemed to be related to traffic.
>>>
>>> Is this expected? It seems like the IRQ subsystem is not respecting the
>>> configured SMP affinity for the interrupt in question. I've also seen
>>> the same behaviour with igb interrupts.
>> No it's not expected. Do you see the same behaviour with a recent
>> mainline kernel, i.e. 5.10 or 5.11?
>>
>>
> Jesse pointed me to this thread and apologies that it took a while for me
> to respond here.
>
> I agree it will be interesting to see with which kernel version Chris is
> reproducing the issue.
And the output of
/proc/irq/$NUMBER/smp_affinity_list
/proc/irq/$NUMBER/effective_affinity_list
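A loop along these lines would collect both files for every iavf vector (a sketch; the parsing helper takes the file path as a parameter so it can be tried on sample text):

```shell
# Sketch: extract the IRQ numbers whose /proc/interrupts line matches a pattern.
irqs_matching() {
  awk -v pat="$1" '$0 ~ pat { sub(/:$/, "", $1); print $1 }' "${2:-/proc/interrupts}"
}
# On the affected system:
#   for irq in $(irqs_matching iavf); do
#     echo "== IRQ $irq"
#     cat /proc/irq/$irq/smp_affinity_list /proc/irq/$irq/effective_affinity_list
#   done
```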
> Initially, I thought that this issue is the same as the one that we have
> been discussing in another thread [1].
>
> However, in that case, the smp affinity mask itself is incorrect and doesn't
> follow the default smp affinity mask (with irqbalance disabled).
That's the question...
Thanks,
tglx
* Re: [IRQ] IRQ affinity not working properly?
From: Chris Friesen @ 2021-04-22 17:00 UTC (permalink / raw)
To: Thomas Gleixner, Nitesh Narayan Lal, LKML, Jesse Brandeburg
On 4/22/2021 9:42 AM, Thomas Gleixner wrote:
> On Wed, Apr 21 2021 at 09:31, Nitesh Narayan Lal wrote:
>> I agree it will be interesting to see with which kernel version Chris is
>> reproducing the issue.
>
> And the output of
>
> /proc/irq/$NUMBER/smp_affinity_list
> /proc/irq/$NUMBER/effective_affinity_list
I haven't forgotten about this, but I've had other priorities. Hoping
to get back to it in May sometime.
Chris