linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] Preventing job distribution to isolated CPUs
@ 2020-06-23 19:23 Nitesh Narayan Lal
  2020-06-23 19:23 ` [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs Nitesh Narayan Lal
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-23 19:23 UTC (permalink / raw)
  To: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, akpm,
	sfr, stephen, rppt

This patch set originated from one of the patches posted earlier as
part of the "Task_isolation" mode [1] patch series by Alex Belits
<abelits@marvell.com>. I am proposing only a couple of changes in
this patch set compared to what Alex posted earlier.
 
 
Context
=======
On a broad level, all three patches in this set make the affected
driver/library code respect isolated CPUs by not pinning any job on
them. Failing to do so can impact latency in RT use cases.


Patches
=======
* Patch1:
  The first patch makes cpumask_local_spread() aware of the
  isolated CPUs. It ensures that the CPUs returned by this
  API include only housekeeping CPUs.

* Patch2:
  This patch ensures that a probe function that is called
  using work_on_cpu() doesn't run any task on an isolated CPU.

* Patch3:
  This patch makes store_rps_map() aware of the isolated
  CPUs so that RPS doesn't queue any jobs on an isolated CPU.


Proposed Changes
================
To fix the above-mentioned issues, Alex used housekeeping_cpumask().
The only changes that I am proposing here are:
- Removing the dependency on CONFIG_TASK_ISOLATION that Alex proposed,
  since it is safe to rely on housekeeping_cpumask() even when no CPUs
  are isolated: in that case it falls back to all available CPUs in any
  of the above scenarios.
- Using both HK_FLAG_DOMAIN and HK_FLAG_WQ in all three patches,
  because we want the above fixes to apply not only with isolcpus but
  also with mechanisms such as systemd's CPU affinity.


Testing
=======
* Patch 1:
  The fix for cpumask_local_spread() was tested by creating VFs,
  loading the iavf module, and adding a tracepoint to confirm that only
  housekeeping CPUs are picked when an appropriate isolation profile is
  set up, and all remaining CPUs when no CPU isolation is configured.

* Patch 2:
  To test the PCI fix, I hotplugged a virtio-net-pci device from the
  QEMU console and forced its addition to a specific NUMA node to
  trigger the code path that includes the proposed fix, and verified
  via a tracepoint that only housekeeping CPUs are included.

* Patch 3:
  To test the fix in store_rps_map(), I tried configuring an isolated
  CPU by writing its mask to /sys/class/net/en*/queues/rx*/rps_cpus,
  which resulted in 'write error: Invalid argument'. When a
  non-isolated CPU was written to rps_cpus, the operation succeeded
  without any error.


Changes from v2[2]:
==================
- Patch1: Removed the extra while loop from cpumask_local_spread() and
  fixed the code styling issues.
- Patch3: Changed to use cpumask_empty() for verifying that the
  requested CPUs are available among the housekeeping CPUs.

Changes from v1[3]:
==================
- Included the suggestions made by Bjorn Helgaas in the commit message.
- Included the 'Reviewed-by' and 'Acked-by' received for Patch-2.


[1] https://patchwork.ozlabs.org/project/netdev/patch/51102eebe62336c6a4e584c7a503553b9f90e01c.camel@marvell.com/
[2] https://patchwork.ozlabs.org/project/linux-pci/cover/20200622234510.240834-1-nitesh@redhat.com/
[3] https://patchwork.ozlabs.org/project/linux-pci/cover/20200610161226.424337-1-nitesh@redhat.com/


Alex Belits (3):
  lib: Restrict cpumask_local_spread to housekeeping CPUs
  PCI: Restrict probe functions to housekeeping CPUs
  net: Restrict receive packets queuing to housekeeping CPUs

 drivers/pci/pci-driver.c |  5 ++++-
 lib/cpumask.c            | 16 +++++++++++-----
 net/core/net-sysfs.c     | 10 +++++++++-
 3 files changed, 24 insertions(+), 7 deletions(-)

-- 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-23 19:23 [PATCH v3 0/3] Preventing job distribution to isolated CPUs Nitesh Narayan Lal
@ 2020-06-23 19:23 ` Nitesh Narayan Lal
  2020-06-24 12:13   ` Frederic Weisbecker
  2020-06-24 19:26   ` Andrew Morton
  2020-06-23 19:23 ` [Patch v3 2/3] PCI: Restrict probe functions to housekeeping CPUs Nitesh Narayan Lal
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-23 19:23 UTC (permalink / raw)
  To: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, akpm,
	sfr, stephen, rppt

From: Alex Belits <abelits@marvell.com>

The current implementation of cpumask_local_spread() does not respect
isolated CPUs, i.e., even if a CPU has been isolated for a real-time
task, the function will still return it to the caller for pinning of
its IRQ threads. Having these unwanted IRQ threads on an isolated CPU
adds a latency overhead.

Restrict the CPUs that are returned for spreading IRQs only to the
available housekeeping CPUs.

Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
---
 lib/cpumask.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index fb22fb266f93..d73104995981 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/numa.h>
+#include <linux/sched/isolation.h>
 
 /**
  * cpumask_next - get the next cpu in a cpumask
@@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
  */
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int cpu;
+	int cpu, hk_flags;
+	const struct cpumask *mask;
 
+	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+	mask = housekeeping_cpumask(hk_flags);
 	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
+	i %= cpumask_weight(mask);
 
 	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
+		for_each_cpu(cpu, mask) {
 			if (i-- == 0)
 				return cpu;
+		}
 	} else {
 		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
+		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
 			if (i-- == 0)
 				return cpu;
+		}
 
-		for_each_cpu(cpu, cpu_online_mask) {
+		for_each_cpu(cpu, mask) {
 			/* Skip NUMA nodes, done above. */
 			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
 				continue;
-- 
2.18.4



* [Patch v3 2/3] PCI: Restrict probe functions to housekeeping CPUs
  2020-06-23 19:23 [PATCH v3 0/3] Preventing job distribution to isolated CPUs Nitesh Narayan Lal
  2020-06-23 19:23 ` [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs Nitesh Narayan Lal
@ 2020-06-23 19:23 ` Nitesh Narayan Lal
  2020-06-23 19:23 ` [Patch v3 3/3] net: Restrict receive packets queuing " Nitesh Narayan Lal
  2020-06-24 10:08 ` [PATCH v3 0/3] Preventing job distribution to isolated CPUs Peter Zijlstra
  3 siblings, 0 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-23 19:23 UTC (permalink / raw)
  To: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, akpm,
	sfr, stephen, rppt

From: Alex Belits <abelits@marvell.com>

pci_call_probe() prevents the nesting of work_on_cpu() for the scenario
where a VF device is probed from the work_on_cpu() context of the PF.

Change the cpumask used in pci_call_probe() from all online CPUs to
housekeeping CPUs only. This ensures that no additional latency
overhead is caused by pinning probe jobs on isolated CPUs.

Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
---
 drivers/pci/pci-driver.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index da6510af1221..449466f71040 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/isolation.h>
 #include <linux/cpu.h>
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
@@ -333,6 +334,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
 	int error, node, cpu;
+	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
 	/*
@@ -353,7 +355,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	    pci_physfn_is_probed(dev))
 		cpu = nr_cpu_ids;
 	else
-		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+		cpu = cpumask_any_and(cpumask_of_node(node),
+				      housekeeping_cpumask(hk_flags));
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
-- 
2.18.4



* [Patch v3 3/3] net: Restrict receive packets queuing to housekeeping CPUs
  2020-06-23 19:23 [PATCH v3 0/3] Preventing job distribution to isolated CPUs Nitesh Narayan Lal
  2020-06-23 19:23 ` [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs Nitesh Narayan Lal
  2020-06-23 19:23 ` [Patch v3 2/3] PCI: Restrict probe functions to housekeeping CPUs Nitesh Narayan Lal
@ 2020-06-23 19:23 ` Nitesh Narayan Lal
  2020-06-24 10:08 ` [PATCH v3 0/3] Preventing job distribution to isolated CPUs Peter Zijlstra
  3 siblings, 0 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-23 19:23 UTC (permalink / raw)
  To: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, akpm,
	sfr, stephen, rppt

From: Alex Belits <abelits@marvell.com>

With the existing implementation of store_rps_map(), packets are queued
in the receive path on the backlog queues of other CPUs irrespective of
whether those CPUs are isolated or not. This can add a latency overhead
to any RT workload that is running on the same CPU.

Ensure that store_rps_map() only uses available housekeeping CPUs for
storing the rps_map.

Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
---
 net/core/net-sysfs.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index e353b822bb15..677868fea316 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -11,6 +11,7 @@
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/isolation.h>
 #include <linux/nsproxy.h>
 #include <net/sock.h>
 #include <net/net_namespace.h>
@@ -741,7 +742,7 @@ static ssize_t store_rps_map(struct netdev_rx_queue *queue,
 {
 	struct rps_map *old_map, *map;
 	cpumask_var_t mask;
-	int err, cpu, i;
+	int err, cpu, i, hk_flags;
 	static DEFINE_MUTEX(rps_map_mutex);
 
 	if (!capable(CAP_NET_ADMIN))
@@ -756,6 +757,13 @@ static ssize_t store_rps_map(struct netdev_rx_queue *queue,
 		return err;
 	}
 
+	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+	cpumask_and(mask, mask, housekeeping_cpumask(hk_flags));
+	if (cpumask_empty(mask)) {
+		free_cpumask_var(mask);
+		return -EINVAL;
+	}
+
 	map = kzalloc(max_t(unsigned int,
 			    RPS_MAP_SIZE(cpumask_weight(mask)), L1_CACHE_BYTES),
 		      GFP_KERNEL);
-- 
2.18.4



* Re: [PATCH v3 0/3] Preventing job distribution to isolated CPUs
  2020-06-23 19:23 [PATCH v3 0/3] Preventing job distribution to isolated CPUs Nitesh Narayan Lal
                   ` (2 preceding siblings ...)
  2020-06-23 19:23 ` [Patch v3 3/3] net: Restrict receive packets queuing " Nitesh Narayan Lal
@ 2020-06-24 10:08 ` Peter Zijlstra
  3 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-06-24 10:08 UTC (permalink / raw)
  To: Nitesh Narayan Lal
  Cc: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, tglx, davem, akpm, sfr,
	stephen, rppt

On Tue, Jun 23, 2020 at 03:23:28PM -0400, Nitesh Narayan Lal wrote:
> This patch-set is originated from one of the patches that have been
> posted earlier as a part of "Task_isolation" mode [1] patch series
> by Alex Belits <abelits@marvell.com>. There are only a couple of
> changes that I am proposing in this patch-set compared to what Alex
> has posted earlier.

> 
> Alex Belits (3):
>   lib: Restrict cpumask_local_spread to houskeeping CPUs
>   PCI: Restrict probe functions to housekeeping CPUs
>   net: Restrict receive packets queuing to housekeeping CPUs
> 
>  drivers/pci/pci-driver.c |  5 ++++-
>  lib/cpumask.c            | 16 +++++++++++-----
>  net/core/net-sysfs.c     | 10 +++++++++-
>  3 files changed, 24 insertions(+), 7 deletions(-)

This looks reasonable to me; who is expected to merge this? Should I
take it through the scheduler tree like most of the nohz_full, or what
do we do?


* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-23 19:23 ` [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs Nitesh Narayan Lal
@ 2020-06-24 12:13   ` Frederic Weisbecker
  2020-06-24 20:37     ` Nitesh Narayan Lal
  2020-06-24 19:26   ` Andrew Morton
  1 sibling, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2020-06-24 12:13 UTC (permalink / raw)
  To: Nitesh Narayan Lal
  Cc: linux-kernel, linux-api, mtosatti, juri.lelli, abelits, bhelgaas,
	linux-pci, rostedt, mingo, peterz, tglx, davem, akpm, sfr,
	stephen, rppt

On Tue, Jun 23, 2020 at 03:23:29PM -0400, Nitesh Narayan Lal wrote:
> From: Alex Belits <abelits@marvell.com>
> 
> The current implementation of cpumask_local_spread() does not respect the
> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
> it will return it to the caller for pinning of its IRQ threads. Having
> these unwanted IRQ threads on an isolated CPU adds up to a latency
> overhead.
> 
> Restrict the CPUs that are returned for spreading IRQs only to the
> available housekeeping CPUs.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
> ---
>  lib/cpumask.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index fb22fb266f93..d73104995981 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -6,6 +6,7 @@
>  #include <linux/export.h>
>  #include <linux/memblock.h>
>  #include <linux/numa.h>
> +#include <linux/sched/isolation.h>
>  
>  /**
>   * cpumask_next - get the next cpu in a cpumask
> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>   */
>  unsigned int cpumask_local_spread(unsigned int i, int node)
>  {
> -	int cpu;
> +	int cpu, hk_flags;
> +	const struct cpumask *mask;
>  
> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;

This should be HK_FLAG_MANAGED_IRQ instead of HK_FLAG_WQ, since this
function seems to be used mostly to select CPUs to affine managed IRQs.
In the end the cpumask you pass to the IRQ core will be filtered
through HK_FLAG_MANAGED_IRQ anyway, so better to select an appropriate
one in the first place to avoid an empty cpumask intersection.

Now even if cpumask_local_spread() is currently mostly used to select
managed IRQ targets, the name and role of the function don't refer to
that. Probably cpumask_local_spread() should take an HK_ flag as a
parameter so that it can correctly handle future users?

That being said, I plan to merge HK_FLAG_RCU, HK_FLAG_MISC, HK_FLAG_SCHED,
HK_FLAG_WQ and HK_FLAG_TIMER into HK_FLAG_UNBOUND since it doesn't make sense
to divide them all. And the actual flag used inside cpumask_local_spread()
could end up being HK_FLAG_DOMAIN | HK_FLAG_UNBOUND. So probably you don't
need to worry about that and just change the HK_FLAG_WQ in your patch
with HK_FLAG_MANAGED_IRQ.

Thanks.


* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-23 19:23 ` [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs Nitesh Narayan Lal
  2020-06-24 12:13   ` Frederic Weisbecker
@ 2020-06-24 19:26   ` Andrew Morton
  2020-06-24 20:38     ` Nitesh Narayan Lal
                       ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Andrew Morton @ 2020-06-24 19:26 UTC (permalink / raw)
  To: Nitesh Narayan Lal
  Cc: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, sfr,
	stephen, rppt, yuqi jin, Shaokun Zhang

On Tue, 23 Jun 2020 15:23:29 -0400 Nitesh Narayan Lal <nitesh@redhat.com> wrote:

> From: Alex Belits <abelits@marvell.com>
> 
> The current implementation of cpumask_local_spread() does not respect the
> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
> it will return it to the caller for pinning of its IRQ threads. Having
> these unwanted IRQ threads on an isolated CPU adds up to a latency
> overhead.
> 
> Restrict the CPUs that are returned for spreading IRQs only to the
> available housekeeping CPUs.
> 
> ...
>
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -6,6 +6,7 @@
>  #include <linux/export.h>
>  #include <linux/memblock.h>
>  #include <linux/numa.h>
> +#include <linux/sched/isolation.h>
>  
>  /**
>   * cpumask_next - get the next cpu in a cpumask
> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>   */
>  unsigned int cpumask_local_spread(unsigned int i, int node)
>  {
> -	int cpu;
> +	int cpu, hk_flags;
> +	const struct cpumask *mask;
>  
> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
> +	mask = housekeeping_cpumask(hk_flags);
>  	/* Wrap: we always want a cpu. */
> -	i %= num_online_cpus();
> +	i %= cpumask_weight(mask);
>  
>  	if (node == NUMA_NO_NODE) {
> -		for_each_cpu(cpu, cpu_online_mask)
> +		for_each_cpu(cpu, mask) {
>  			if (i-- == 0)
>  				return cpu;
> +		}
>  	} else {
>  		/* NUMA first. */
> -		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
> +		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
>  			if (i-- == 0)
>  				return cpu;
> +		}
>  
> -		for_each_cpu(cpu, cpu_online_mask) {
> +		for_each_cpu(cpu, mask) {
>  			/* Skip NUMA nodes, done above. */
>  			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
>  				continue;

Are you aware of these changes to cpumask_local_spread()?
https://lore.kernel.org/lkml/1582768688-2314-1-git-send-email-zhangshaokun@hisilicon.com/

I don't see a lot of overlap but it would be nice for you folks to
check each other's homework ;)




* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-24 12:13   ` Frederic Weisbecker
@ 2020-06-24 20:37     ` Nitesh Narayan Lal
  0 siblings, 0 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-24 20:37 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, linux-api, mtosatti, juri.lelli, abelits, bhelgaas,
	linux-pci, rostedt, mingo, peterz, tglx, davem, akpm, sfr,
	stephen, rppt




On 6/24/20 8:13 AM, Frederic Weisbecker wrote:
> On Tue, Jun 23, 2020 at 03:23:29PM -0400, Nitesh Narayan Lal wrote:
>> From: Alex Belits <abelits@marvell.com>
>>
>> The current implementation of cpumask_local_spread() does not respect the
>> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
>> it will return it to the caller for pinning of its IRQ threads. Having
>> these unwanted IRQ threads on an isolated CPU adds up to a latency
>> overhead.
>>
>> Restrict the CPUs that are returned for spreading IRQs only to the
>> available housekeeping CPUs.
>>
>> Signed-off-by: Alex Belits <abelits@marvell.com>
>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
>> ---
>>  lib/cpumask.c | 16 +++++++++++-----
>>  1 file changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/cpumask.c b/lib/cpumask.c
>> index fb22fb266f93..d73104995981 100644
>> --- a/lib/cpumask.c
>> +++ b/lib/cpumask.c
>> @@ -6,6 +6,7 @@
>>  #include <linux/export.h>
>>  #include <linux/memblock.h>
>>  #include <linux/numa.h>
>> +#include <linux/sched/isolation.h>
>>  
>>  /**
>>   * cpumask_next - get the next cpu in a cpumask
>> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>>   */
>>  unsigned int cpumask_local_spread(unsigned int i, int node)
>>  {
>> -	int cpu;
>> +	int cpu, hk_flags;
>> +	const struct cpumask *mask;
>>  
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
> This should be HK_FLAG_MANAGED_IRQ instead of HK_FLAG_WQ since this
> function seem to be used mostly to select CPUs to affine managed IRQs.

IIRC, there are drivers such as ixgbe that use cpumask_local_spread()
while affining normal IRQs as well.
But I can recheck that.

> In the end the cpumask you pass to IRQ core will be filtered throughout
> HK_FLAG_MANAGED_IRQ anyway so better select an appropriate one in the
> first place to avoid an empty cpumask intersection.
>
> Now even if cpumask_local_spread() is currently mostly used to select
> managed irq targets, the name and role of the function don't refer to that.
> Probably cpumask_local_spread() should take HK_ flag in parameter so that
> it can correctly handle future users?
>
> That being said, I plan to merge HK_FLAG_RCU, HK_FLAG_MISC, HK_FLAG_SCHED,
> HK_FLAG_WQ and HK_FLAG_TIMER into HK_FLAG_UNBOUND since it doesn't make sense
> to divide them all.

That would be nice.

>  And the actual flag used inside cpumask_local_spread()
> could end up being HK_FLAG_DOMAIN | HK_FLAG_UNBOUND. So probably you don't
> need to worry about that and just change the HK_FLAG_WQ in your patch
> with HK_FLAG_MANAGED_IRQ.
>
> Thanks.
>
-- 
Thanks
Nitesh




* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-24 19:26   ` Andrew Morton
@ 2020-06-24 20:38     ` Nitesh Narayan Lal
  2020-06-24 23:31     ` Nitesh Narayan Lal
  2020-06-29  9:01     ` Shaokun Zhang
  2 siblings, 0 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-24 20:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, sfr,
	stephen, rppt, yuqi jin, Shaokun Zhang




On 6/24/20 3:26 PM, Andrew Morton wrote:
> On Tue, 23 Jun 2020 15:23:29 -0400 Nitesh Narayan Lal <nitesh@redhat.com> wrote:
>
>> From: Alex Belits <abelits@marvell.com>
>>
>> The current implementation of cpumask_local_spread() does not respect the
>> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
>> it will return it to the caller for pinning of its IRQ threads. Having
>> these unwanted IRQ threads on an isolated CPU adds up to a latency
>> overhead.
>>
>> Restrict the CPUs that are returned for spreading IRQs only to the
>> available housekeeping CPUs.
>>
>> ...
>>
>> --- a/lib/cpumask.c
>> +++ b/lib/cpumask.c
>> @@ -6,6 +6,7 @@
>>  #include <linux/export.h>
>>  #include <linux/memblock.h>
>>  #include <linux/numa.h>
>> +#include <linux/sched/isolation.h>
>>  
>>  /**
>>   * cpumask_next - get the next cpu in a cpumask
>> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>>   */
>>  unsigned int cpumask_local_spread(unsigned int i, int node)
>>  {
>> -	int cpu;
>> +	int cpu, hk_flags;
>> +	const struct cpumask *mask;
>>  
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> +	mask = housekeeping_cpumask(hk_flags);
>>  	/* Wrap: we always want a cpu. */
>> -	i %= num_online_cpus();
>> +	i %= cpumask_weight(mask);
>>  
>>  	if (node == NUMA_NO_NODE) {
>> -		for_each_cpu(cpu, cpu_online_mask)
>> +		for_each_cpu(cpu, mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  	} else {
>>  		/* NUMA first. */
>> -		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
>> +		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  
>> -		for_each_cpu(cpu, cpu_online_mask) {
>> +		for_each_cpu(cpu, mask) {
>>  			/* Skip NUMA nodes, done above. */
>>  			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
>>  				continue;
> Are you aware of these changes to cpu_local_spread()?
> https://lore.kernel.org/lkml/1582768688-2314-1-git-send-email-zhangshaokun@hisilicon.com/
>
> I don't see a lot of overlap but it would be nice for you folks to
> check each other's homework ;)

Sure, I will take a look.
Thanks

>
-- 
Nitesh




* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-24 19:26   ` Andrew Morton
  2020-06-24 20:38     ` Nitesh Narayan Lal
@ 2020-06-24 23:31     ` Nitesh Narayan Lal
  2020-06-29  9:01     ` Shaokun Zhang
  2 siblings, 0 replies; 11+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-24 23:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, sfr,
	stephen, rppt, yuqi jin, Shaokun Zhang




On 6/24/20 3:26 PM, Andrew Morton wrote:
> On Tue, 23 Jun 2020 15:23:29 -0400 Nitesh Narayan Lal <nitesh@redhat.com> wrote:
>
>> From: Alex Belits <abelits@marvell.com>
>>
>> The current implementation of cpumask_local_spread() does not respect the
>> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
>> it will return it to the caller for pinning of its IRQ threads. Having
>> these unwanted IRQ threads on an isolated CPU adds up to a latency
>> overhead.
>>
>> Restrict the CPUs that are returned for spreading IRQs only to the
>> available housekeeping CPUs.
>>
>> ...
>>
>> --- a/lib/cpumask.c
>> +++ b/lib/cpumask.c
>> @@ -6,6 +6,7 @@
>>  #include <linux/export.h>
>>  #include <linux/memblock.h>
>>  #include <linux/numa.h>
>> +#include <linux/sched/isolation.h>
>>  
>>  /**
>>   * cpumask_next - get the next cpu in a cpumask
>> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>>   */
>>  unsigned int cpumask_local_spread(unsigned int i, int node)
>>  {
>> -	int cpu;
>> +	int cpu, hk_flags;
>> +	const struct cpumask *mask;
>>  
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> +	mask = housekeeping_cpumask(hk_flags);
>>  	/* Wrap: we always want a cpu. */
>> -	i %= num_online_cpus();
>> +	i %= cpumask_weight(mask);
>>  
>>  	if (node == NUMA_NO_NODE) {
>> -		for_each_cpu(cpu, cpu_online_mask)
>> +		for_each_cpu(cpu, mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  	} else {
>>  		/* NUMA first. */
>> -		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
>> +		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  
>> -		for_each_cpu(cpu, cpu_online_mask) {
>> +		for_each_cpu(cpu, mask) {
>>  			/* Skip NUMA nodes, done above. */
>>  			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
>>  				continue;
> Are you aware of these changes to cpu_local_spread()?
> https://lore.kernel.org/lkml/1582768688-2314-1-git-send-email-zhangshaokun@hisilicon.com/
>
> I don't see a lot of overlap but it would be nice for you folks to
> check each other's homework ;)

I took a look at the patch and as you said there is not much overlap.
The idea of keeping isolated CPUs untouched for RT environments will be valid
for the optimization that Shaokun is suggesting as well.
I am not sure about the current state of the patch-set but I will certainly keep
an eye on it.

>
>
-- 
Thanks
Nitesh




* Re: [Patch v3 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs
  2020-06-24 19:26   ` Andrew Morton
  2020-06-24 20:38     ` Nitesh Narayan Lal
  2020-06-24 23:31     ` Nitesh Narayan Lal
@ 2020-06-29  9:01     ` Shaokun Zhang
  2 siblings, 0 replies; 11+ messages in thread
From: Shaokun Zhang @ 2020-06-29  9:01 UTC (permalink / raw)
  To: Andrew Morton, Nitesh Narayan Lal
  Cc: linux-kernel, linux-api, frederic, mtosatti, juri.lelli, abelits,
	bhelgaas, linux-pci, rostedt, mingo, peterz, tglx, davem, sfr,
	stephen, rppt, yuqi jin

Hi Andrew,

On 2020/6/25 3:26, Andrew Morton wrote:
> On Tue, 23 Jun 2020 15:23:29 -0400 Nitesh Narayan Lal <nitesh@redhat.com> wrote:
> 
>> From: Alex Belits <abelits@marvell.com>
>>
>> The current implementation of cpumask_local_spread() does not respect the
>> isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
>> it will return it to the caller for pinning of its IRQ threads. Having
>> these unwanted IRQ threads on an isolated CPU adds up to a latency
>> overhead.
>>
>> Restrict the CPUs that are returned for spreading IRQs only to the
>> available housekeeping CPUs.
>>
>> ...
>>
>> --- a/lib/cpumask.c
>> +++ b/lib/cpumask.c
>> @@ -6,6 +6,7 @@
>>  #include <linux/export.h>
>>  #include <linux/memblock.h>
>>  #include <linux/numa.h>
>> +#include <linux/sched/isolation.h>
>>  
>>  /**
>>   * cpumask_next - get the next cpu in a cpumask
>> @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>>   */
>>  unsigned int cpumask_local_spread(unsigned int i, int node)
>>  {
>> -	int cpu;
>> +	int cpu, hk_flags;
>> +	const struct cpumask *mask;
>>  
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> +	mask = housekeeping_cpumask(hk_flags);
>>  	/* Wrap: we always want a cpu. */
>> -	i %= num_online_cpus();
>> +	i %= cpumask_weight(mask);
>>  
>>  	if (node == NUMA_NO_NODE) {
>> -		for_each_cpu(cpu, cpu_online_mask)
>> +		for_each_cpu(cpu, mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  	} else {
>>  		/* NUMA first. */
>> -		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
>> +		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
>>  			if (i-- == 0)
>>  				return cpu;
>> +		}
>>  
>> -		for_each_cpu(cpu, cpu_online_mask) {
>> +		for_each_cpu(cpu, mask) {
>>  			/* Skip NUMA nodes, done above. */
>>  			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
>>  				continue;
> 
> Are you aware of these changes to cpu_local_spread()?
> https://lore.kernel.org/lkml/1582768688-2314-1-git-send-email-zhangshaokun@hisilicon.com/
> 
> I don't see a lot of overlap but it would be nice for you folks to

Yeah, it's a different issue from Nitesh's. As for our patch, it has
been in linux-next for a long time; will it be merged into Linus's tree?

Thanks,
Shaokun

> check each other's homework ;)
> 
> 
> 
> .
> 


