From: Nitesh Narayan Lal <nitesh@redhat.com>
To: Jesse Brandeburg <jesse.brandeburg@intel.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"frederic@kernel.org" <frederic@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>,
linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
juri.lelli@redhat.com, abelits@marvell.com, bhelgaas@google.com,
linux-pci@vger.kernel.org, rostedt@goodmis.org, mingo@kernel.org,
peterz@infradead.org, davem@davemloft.net,
akpm@linux-foundation.org, sfr@canb.auug.org.au,
stephen@networkplumber.org, rppt@linux.vnet.ibm.com,
jinyuqi@huawei.com, zhangshaokun@hisilicon.com,
Network Development <netdev@vger.kernel.org>,
"sassmann@redhat.com" <sassmann@redhat.com>,
"Yang, Lihong" <lihong.yang@intel.com>
Subject: Re: [Patch v4 1/3] lib: Restrict cpumask_local_spread to houskeeping CPUs
Date: Thu, 8 Apr 2021 14:49:22 -0400
Message-ID: <b21853ab-e3b4-889a-07bd-742db8aa2e4b@redhat.com>
In-Reply-To: <1a044a14-0884-eedb-5d30-28b4bec24b23@redhat.com>
On 4/7/21 11:18 AM, Nitesh Narayan Lal wrote:
> On 4/6/21 1:22 PM, Jesse Brandeburg wrote:
>> Continuing a thread from a bit ago...
>>
>> Nitesh Narayan Lal wrote:
>>
>>>> After a little more digging, I found out why cpumask_local_spread change
>>>> affects the general/initial smp_affinity for certain device IRQs.
>>>>
>>>> After the introduction of the commit:
>>>>
>>>> e2e64a932 genirq: Set initial affinity in irq_set_affinity_hint()
>>>>
>>> Continuing the conversation about the above commit and adding Jesse.
>>> I was trying to understand the problem that the commit message describes:
>>> "The default behavior of the kernel is somewhat undesirable as all
>>> requested interrupts end up on CPU0 after registration." I have also been
>>> trying to reproduce this behavior without the patch, but I have not
>>> managed to, maybe because I am missing something here.
>>>
>>> @Jesse Can you please explain? FWIU IRQ affinity should be decided based on
>>> the default affinity mask.
> Thanks, Jesse for responding.
>
>> The original issue as seen, was that if you rmmod/insmod a driver
>> *without* irqbalance running, the default irq mask is -1, which means
>> any CPU. The older kernels (this issue was patched in 2014) used to use
>> that affinity mask, but the value programmed into all the interrupt
>> registers "actual affinity" would end up delivering all interrupts to
>> CPU0,
> So does that mean the affinity mask for the IRQs was different from where
> the IRQs were actually delivered?
> Or was the affinity mask itself changed to 0 instead of -1 after the
> rmmod and insmod?
>
> I did a quick test on top of 5.12.0-rc6, comparing the i40e IRQ affinity
> mask before removing the kernel module with the mask after rmmod+insmod,
> and didn't find any difference.
>
>> and if the machine was under traffic load incoming when the
>> driver loaded, CPU0 would start to poll among all the different netdev
>> queues, all on CPU0.
>>
>> The above then leads to the condition that the device is stuck polling
>> even if the affinity gets updated from user space, and the polling will
>> continue until traffic stops.
>>
>>> The problem with the commit is that when we overwrite the affinity mask
>>> based on the hinting mask we completely ignore the default SMP affinity
>>> mask. If we do want to overwrite the affinity based on the hint mask we
>>> should at least consider the default SMP affinity.
> For the issue where the IRQs don't follow the default_smp_affinity mask
> because of this patch, the following steps reproduce it easily with the
> latest Linux kernel:
>
> # Kernel
> 5.12.0-rc6+
>
> # Other parameters in the cmdline
> isolcpus=2-39,44-79 nohz=on nohz_full=2-39,44-79
> rcu_nocbs=2-39,44-79
>
> # cat /proc/irq/default_smp_affinity
> 0000,00000f00,00000003 [Corresponds to HK CPUs - 0, 1, 40, 41, 42 and 43]
>
> # Create VFs and check IRQ affinity mask
>
> /proc/irq/1423/iavf-ens1f1v3-TxRx-3
> 3
> /proc/irq/1424/iavf-0000:3b:0b.0:mbx
> 0
> 40
> 42
> /proc/irq/1425/iavf-ens1f1v8-TxRx-0
> 0
> /proc/irq/1426/iavf-ens1f1v8-TxRx-1
> 1
> /proc/irq/1427/iavf-ens1f1v8-TxRx-2
> 2
> /proc/irq/1428/iavf-ens1f1v8-TxRx-3
> 3
> ...
> /proc/irq/1475/iavf-ens1f1v15-TxRx-0
> 0
> /proc/irq/1476/iavf-ens1f1v15-TxRx-1
> 1
> /proc/irq/1477/iavf-ens1f1v15-TxRx-2
> 2
> /proc/irq/1478/iavf-ens1f1v15-TxRx-3
> 3
> /proc/irq/1479/iavf-0000:3b:0a.0:mbx
> 0
> 40
> 42
> ...
> /proc/irq/240/iavf-ens1f1v3-TxRx-0
> 0
> /proc/irq/248/iavf-ens1f1v3-TxRx-1
> 1
> /proc/irq/249/iavf-ens1f1v3-TxRx-2
> 2
>
>
> Trace dump:
> ----------
> ..
> 11551082: NetworkManager-1734 [040] 8167.465719: vector_activate: irq=1478 is_managed=0 can_reserve=1 reserve=0
> 11551090: NetworkManager-1734 [040] 8167.465720: vector_alloc: irq=1478 vector=65 reserved=1 ret=0
> 11551093: NetworkManager-1734 [040] 8167.465721: vector_update: irq=1478 vector=65 cpu=42 prev_vector=0 prev_cpu=0
> 11551097: NetworkManager-1734 [040] 8167.465721: vector_config: irq=1478 vector=65 cpu=42 apicdest=0x00000200
> 11551357: NetworkManager-1734 [040] 8167.465768: vector_alloc: irq=1478 vector=46 reserved=0 ret=0
> 11551360: NetworkManager-1734 [040] 8167.465769: vector_update: irq=1478 vector=46 cpu=3 prev_vector=65 prev_cpu=42
> 11551364: NetworkManager-1734 [040] 8167.465770: vector_config: irq=1478 vector=46 cpu=3 apicdest=0x00040100
> ..
>
> As we can see in the above trace, the initial affinity for IRQ 1478 was
> correctly set as per the default_smp_affinity mask, which includes CPU 42.
> Later on, however, it is updated to CPU 3, which is the CPU returned by
> cpumask_local_spread().
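The same check can be done mechanically on the trace. The sketch below is only an illustration: it assumes the two vector_update events for IRQ 1478 have been rejoined onto single lines and trimmed to their payload, and the function name `isolated_targets` is made up for this example.

```python
import re

# Housekeeping CPUs, as given by the default_smp_affinity mask above.
HOUSEKEEPING = {0, 1, 40, 41, 42, 43}

# The two vector_update events for IRQ 1478, trimmed to their payload.
TRACE = """\
vector_update: irq=1478 vector=65 cpu=42 prev_vector=0 prev_cpu=0
vector_update: irq=1478 vector=46 cpu=3 prev_vector=65 prev_cpu=42
"""

PATTERN = re.compile(r"vector_update:\s+irq=(\d+) vector=(\d+) cpu=(\d+)")


def isolated_targets(trace, allowed):
    """Return (irq, vector, cpu) for every update whose CPU is not allowed."""
    hits = []
    for match in PATTERN.finditer(trace):
        irq, vector, cpu = map(int, match.groups())
        if cpu not in allowed:
            hits.append((irq, vector, cpu))
    return hits


print(isolated_targets(TRACE, HOUSEKEEPING))
```

Only the second update, the one that moves the IRQ from CPU 42 to CPU 3, is reported.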
>
>> Maybe the right thing is to fix which CPUs are passed in as the valid
>> mask, or make sure the kernel cross checks that what the driver asks
>> for is a "valid CPU"?
>>
> Sure, if we can still reproduce the problem that your patch was fixing, then
> maybe we can consider adding a new API like cpumask_local_spread_irq, which
> would also take the default_smp_affinity mask into account before returning
> the CPU.
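To make the idea concrete, here is a toy Python model of what such a helper might do. It is not kernel code: cpumask_local_spread_irq is only a proposed name, the model ignores NUMA locality entirely, and the fallback to the unrestricted candidate set when the intersection is empty is an assumption of this sketch.

```python
def local_spread_irq(i, candidates, default_affinity):
    """Toy model: pick the i-th CPU (wrapping around) among the candidate
    CPUs that are also present in the default affinity mask."""
    allowed = sorted(set(candidates) & set(default_affinity))
    if not allowed:
        # Assumed fallback: ignore the mask rather than return no CPU.
        allowed = sorted(candidates)
    return allowed[i % len(allowed)]


housekeeping = {0, 1, 40, 41, 42, 43}
picks = [local_spread_irq(i, range(80), housekeeping) for i in range(8)]
print(picks)
```

With the 80-CPU layout from the reproduction steps, every pick stays within the housekeeping CPUs and wraps around once they are exhausted.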
>
Didn't realize that netdev ml was not included, so adding that.
--
Nitesh