* [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
@ 2022-02-10  9:35 Xiongfeng Wang
  2022-02-15  2:29 ` Xiongfeng Wang
  2022-02-15  9:38 ` Xiongfeng Wang
  0 siblings, 2 replies; 7+ messages in thread
From: Xiongfeng Wang @ 2022-02-10  9:35 UTC (permalink / raw)
  To: axboe, hch; +Cc: linux-block, linux-kernel, yuyufen, guohanjun, wangxiongfeng2

When NOHZ_FULL is enabled, for example in HPC deployments, CPUs are
divided into housekeeping CPUs and non-housekeeping CPUs. Non-housekeeping
CPUs are NOHZ_FULL CPUs and are often monopolized by a userspace process,
such as an HPC application. No interruption of any kind is expected there.

blk_mq_hctx_next_cpu() selects each CPU in 'hctx->cpumask' in turn
to schedule the work function blk_mq_run_work_fn(). When 'hctx->cpumask'
contains both housekeeping and non-housekeeping CPUs, a housekeeping
CPU that wants to issue an I/O request may schedule a worker on a
non-housekeeping CPU. This may hurt the performance of the userspace
application running on the non-housekeeping CPUs.

So let's just schedule the worker on the current CPU when the
current CPU is a housekeeping CPU.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
---
 block/blk-mq.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1adfe4824ef5..ff9a4bf16858 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -24,6 +24,7 @@
 #include <linux/sched/sysctl.h>
 #include <linux/sched/topology.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/isolation.h>
 #include <linux/delay.h>
 #include <linux/crash_dump.h>
 #include <linux/prefetch.h>
@@ -2036,6 +2037,8 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 					unsigned long msecs)
 {
+	int work_cpu;
+
 	if (unlikely(blk_mq_hctx_stopped(hctx)))
 		return;
 
@@ -2050,7 +2053,17 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		put_cpu();
 	}
 
-	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
+	/*
+	 * Avoid housekeeping CPUs scheduling a worker on a non-housekeeping
+	 * CPU
+	 */
+	if (tick_nohz_full_enabled() && housekeeping_cpu(smp_processor_id(),
+							 HK_FLAG_WQ))
+		work_cpu = smp_processor_id();
+	else
+		work_cpu = blk_mq_hctx_next_cpu(hctx);
+
+	kblockd_mod_delayed_work_on(work_cpu, &hctx->run_work,
 				    msecs_to_jiffies(msecs));
 }
 
-- 
2.20.1
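
The selection policy described in the patch above can be illustrated with
a small userspace model. This is only a sketch, not kernel code: the
64-bit mask type, struct hctx_model and the model_*() helpers are
hypothetical names made up for illustration, and the round-robin is a
simplification that ignores the batching done by the real
blk_mq_hctx_next_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: bit N set means CPU N is in the mask (at most 64 CPUs). */
typedef uint64_t cpumask_model_t;

/* Hypothetical stand-in for the two blk_mq_hw_ctx fields that matter here. */
struct hctx_model {
	cpumask_model_t cpumask;	/* CPUs mapped to this hardware queue */
	int next_cpu;			/* CPU handed out by the last selection */
};

static bool mask_test_cpu(cpumask_model_t mask, int cpu)
{
	return (mask >> cpu) & 1;
}

/* Simplified round-robin over the queue's mask (no batching). */
static int model_next_cpu(struct hctx_model *hctx)
{
	assert(hctx->cpumask != 0);
	do {
		hctx->next_cpu = (hctx->next_cpu + 1) % 64;
	} while (!mask_test_cpu(hctx->cpumask, hctx->next_cpu));
	return hctx->next_cpu;
}

/*
 * Policy proposed by the patch: if NOHZ_FULL is in use and the submitting
 * CPU is a housekeeping CPU, run the work locally; otherwise fall back to
 * the round-robin, which may still return a non-housekeeping CPU.
 */
static int model_pick_work_cpu(struct hctx_model *hctx, int this_cpu,
			       bool nohz_full_enabled,
			       cpumask_model_t housekeeping)
{
	if (nohz_full_enabled && mask_test_cpu(housekeeping, this_cpu))
		return this_cpu;
	return model_next_cpu(hctx);
}
```

With a queue mapped to CPUs 0-3 and only CPUs 0-1 marked housekeeping,
the plain round-robin eventually hands out CPU 2 or 3, while
model_pick_work_cpu() called from CPU 0 always returns 0. Note that the
model says nothing about whether the submitting CPU belongs to the
queue's mask at all.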



* Re: [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
  2022-02-10  9:35 [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU Xiongfeng Wang
@ 2022-02-15  2:29 ` Xiongfeng Wang
  2022-02-15  4:37   ` Ming Lei
  2022-02-15  9:38 ` Xiongfeng Wang
  1 sibling, 1 reply; 7+ messages in thread
From: Xiongfeng Wang @ 2022-02-15  2:29 UTC (permalink / raw)
  To: axboe, hch, ming.lei; +Cc: linux-block, linux-kernel, yuyufen, guohanjun

Hi Ming,

Sorry to disturb you. It's just that I think you may be interested in this
patch. I found the following commit written by you.
  commit 11ea68f553e244851d15793a7fa33a97c46d8271
  genirq, sched/isolation: Isolate from handling managed interrupts
It keeps managed interrupts off the non-housekeeping CPUs as long as
those CPUs do not issue I/O. But the work function
blk_mq_run_work_fn() may still run on the non-housekeeping CPUs.
I would appreciate it a lot if you could take a look.

Thanks,
Xiongfeng


* Re: [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
  2022-02-15  2:29 ` Xiongfeng Wang
@ 2022-02-15  4:37   ` Ming Lei
  2022-02-15  9:32     ` Xiongfeng Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2022-02-15  4:37 UTC (permalink / raw)
  To: Xiongfeng Wang; +Cc: axboe, hch, linux-block, linux-kernel, yuyufen, guohanjun

Hello Xiongfeng,

On Tue, Feb 15, 2022 at 10:29:51AM +0800, Xiongfeng Wang wrote:
> Hi Ming,
> 
> Sorry to disturb you. It's just that I think you may be interested in this
> patch. I found the following commit written by you.
>   commit 11ea68f553e244851d15793a7fa33a97c46d8271
>   genirq, sched/isolation: Isolate from handling managed interrupts
> It keeps managed interrupts off the non-housekeeping CPUs as long as
> those CPUs do not issue I/O. But the work function
> blk_mq_run_work_fn() may still run on the non-housekeeping CPUs.
> I would appreciate it a lot if you could take a look.

Yeah, commit 11ea68f553e24 touches the irq subsystem to try not to assign
isolated CPUs to a managed irq's effective affinity.

Here blk-mq just selects one CPU and calls mod_delayed_work_on()
to execute the run queue handler on the specified CPU. There are lots of
such bound wq users in the tree, so I guess this is a generic wq or
scheduler problem rather than a blk-mq specific issue. Not sure
if it is good to address it in the block layer.

thanks,
Ming

-- 
Ming



* Re: [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
  2022-02-15  4:37   ` Ming Lei
@ 2022-02-15  9:32     ` Xiongfeng Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Xiongfeng Wang @ 2022-02-15  9:32 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, hch, linux-block, linux-kernel, yuyufen, guohanjun

Hi Ming,

Thanks for your reply!

On 2022/2/15 12:37, Ming Lei wrote:
> Hello Xiongfeng,
> 
> On Tue, Feb 15, 2022 at 10:29:51AM +0800, Xiongfeng Wang wrote:
>> Hi Ming,
>>
>> Sorry to disturb you. It's just that I think you may be interested in this
>> patch. I found the following commit written by you.
>>   commit 11ea68f553e244851d15793a7fa33a97c46d8271
>>   genirq, sched/isolation: Isolate from handling managed interrupts
>> It keeps managed interrupts off the non-housekeeping CPUs as long as
>> those CPUs do not issue I/O. But the work function
>> blk_mq_run_work_fn() may still run on the non-housekeeping CPUs.
>> I would appreciate it a lot if you could take a look.
> 
> Yeah, commit 11ea68f553e24 touches the irq subsystem to try not to assign
> isolated CPUs to a managed irq's effective affinity.
> 
> Here blk-mq just selects one CPU and calls mod_delayed_work_on()
> to execute the run queue handler on the specified CPU. There are lots of
> such bound wq users in the tree, so I guess this is a generic wq or
> scheduler problem rather than a blk-mq specific issue. Not sure
> if it is good to address it in the block layer.

Yes, I also found some other worker threads running on the non-housekeeping
CPUs. Some of them need to read per-cpu data, such as drain_local_pages_wq().
But the workqueue subsystem doesn't know whether a work item reads per-cpu
data or whether it can be migrated to another CPU.

For workqueues marked WQ_UNBOUND, the following commit moves the worker
threads to the housekeeping CPUs.
    commit 1bda3f8087fce9063da0b8aef87f17a3fe541aca
    sched/isolation: Isolate workqueues when "nohz_full=" is set
But for workqueues without the WQ_UNBOUND flag, the workqueue subsystem
doesn't know whether the worker threads can be migrated to another CPU.

So I think maybe the subsystem that creates the workqueue should decide
whether the worker threads can be migrated.

Thanks,
Xiongfeng
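
The bound/unbound distinction drawn above can be sketched as a toy
userspace model. This is an illustration under stated assumptions, not
the workqueue implementation: the mask type and model_allowed_cpus() are
hypothetical, and the unbound case simply assumes that unbound work is
confined to the housekeeping mask once "nohz_full=" is set, as the
commit cited above is described to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: bit N set means CPU N is in the mask (at most 64 CPUs). */
typedef uint64_t cpumask_model_t;

/*
 * Return the set of CPUs a work item may run on in this model.
 *
 * Unbound work: any housekeeping CPU (assumed effect of the isolation
 * commit). Bound work: only the CPU it was queued on, because the
 * workqueue core cannot tell whether the handler depends on per-cpu data.
 */
static cpumask_model_t model_allowed_cpus(bool wq_unbound, int queued_cpu,
					  cpumask_model_t housekeeping)
{
	assert(queued_cpu >= 0 && queued_cpu < 64);
	if (wq_unbound)
		return housekeeping;
	return (cpumask_model_t)1 << queued_cpu;
}
```

In this model a bound work item queued on isolated CPU 3 can only run
there, which is why the choice of CPU has to be made by the caller that
queues the work.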


* Re: [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
  2022-02-10  9:35 [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU Xiongfeng Wang
  2022-02-15  2:29 ` Xiongfeng Wang
@ 2022-02-15  9:38 ` Xiongfeng Wang
  1 sibling, 0 replies; 7+ messages in thread
From: Xiongfeng Wang @ 2022-02-15  9:38 UTC (permalink / raw)
  To: axboe, hch, Frederic Weisbecker
  Cc: linux-block, linux-kernel, yuyufen, guohanjun

Hi Frederic,

Sorry to disturb you. It's just that I think you may be interested in this
patch. I noticed you are reviewing some other CPU isolation patches. I would
appreciate it a lot if you could take a look. Or just ignore it if you are
not interested.

Thanks,
Xiongfeng


* Re: [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
  2023-02-23  7:48 Xiongfeng Wang
@ 2023-02-23 15:06 ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2023-02-23 15:06 UTC (permalink / raw)
  To: Xiongfeng Wang
  Cc: axboe, hch, linux-block, linux-kernel, wangxiongfeng2, liwei391,
	wangkefeng.wang

On Thu, Feb 23, 2023 at 03:48:26PM +0800, Xiongfeng Wang wrote:
> From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> 
> When NOHZ_FULL is enabled, for example in HPC deployments, CPUs are
> divided into housekeeping CPUs and non-housekeeping CPUs. Non-housekeeping
> CPUs are NOHZ_FULL CPUs and are often monopolized by a userspace process,
> such as an HPC application. No interruption of any kind is expected there.
> 
> blk_mq_hctx_next_cpu() selects each CPU in 'hctx->cpumask' in turn
> to schedule the work function blk_mq_run_work_fn(). When 'hctx->cpumask'
> contains both housekeeping and non-housekeeping CPUs, a housekeeping
> CPU that wants to issue an I/O request may schedule a worker on a
> non-housekeeping CPU. This may hurt the performance of the userspace
> application running on the non-housekeeping CPUs.
> 
> So let's just schedule the worker on the current CPU when the
> current CPU is a housekeeping CPU.

This looks like an odd non-systemic bandaid.  Shouldn't we have a more
generic way to ensure nothing ever gets onto these non-housekeeping CPUs,
by making sure they never show up in the cpumask, and never get
completion IPIs?
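
The mask-based alternative suggested above might look roughly like the
following userspace sketch. It is not a proposal from the thread: the
mask type and model_dispatch_mask() are hypothetical, the fallback for
queues mapped only to isolated CPUs is an assumption added so the model
never returns an empty mask for a usable queue, and completion IPIs are
not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: bit N set means CPU N is in the mask (at most 64 CPUs). */
typedef uint64_t cpumask_model_t;

/*
 * Restrict the CPUs a queue may dispatch work to.
 *
 * Non-housekeeping CPUs are filtered out of the queue's mask up front,
 * so a later round-robin over the result can never pick one. If the
 * queue is mapped only to non-housekeeping CPUs, keep the original mask
 * (assumed fallback) so the queue can still be run.
 */
static cpumask_model_t model_dispatch_mask(cpumask_model_t hctx_mask,
					   cpumask_model_t housekeeping)
{
	cpumask_model_t filtered = hctx_mask & housekeeping;

	return filtered ? filtered : hctx_mask;
}
```

Filtering once when the mask is built keeps the per-request path free of
isolation checks, in contrast to the check added at dispatch time by the
patch under review.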


* [RFC PATCH] blk-mq: avoid housekeeping CPUs scheduling a worker on a non-housekeeping CPU
@ 2023-02-23  7:48 Xiongfeng Wang
  2023-02-23 15:06 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Xiongfeng Wang @ 2023-02-23  7:48 UTC (permalink / raw)
  To: axboe, hch
  Cc: linux-block, linux-kernel, wangxiongfeng2, wangxiongfeng,
	liwei391, wangkefeng.wang

From: Xiongfeng Wang <wangxiongfeng2@huawei.com>

When NOHZ_FULL is enabled, for example in HPC deployments, CPUs are
divided into housekeeping CPUs and non-housekeeping CPUs. Non-housekeeping
CPUs are NOHZ_FULL CPUs and are often monopolized by a userspace process,
such as an HPC application. No interruption of any kind is expected there.

blk_mq_hctx_next_cpu() selects each CPU in 'hctx->cpumask' in turn
to schedule the work function blk_mq_run_work_fn(). When 'hctx->cpumask'
contains both housekeeping and non-housekeeping CPUs, a housekeeping
CPU that wants to issue an I/O request may schedule a worker on a
non-housekeeping CPU. This may hurt the performance of the userspace
application running on the non-housekeeping CPUs.

So let's just schedule the worker on the current CPU when the
current CPU is a housekeeping CPU.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
---
 block/blk-mq.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d3494a796ba8..1e84d393cce3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -21,6 +21,7 @@
 #include <linux/llist.h>
 #include <linux/cpu.h>
 #include <linux/cache.h>
+#include <linux/sched/isolation.h>
 #include <linux/sched/sysctl.h>
 #include <linux/sched/topology.h>
 #include <linux/sched/signal.h>
@@ -2245,6 +2246,8 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 					unsigned long msecs)
 {
+	int work_cpu;
+
 	if (unlikely(blk_mq_hctx_stopped(hctx)))
 		return;
 
@@ -2255,7 +2258,17 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		}
 	}
 
-	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
+	/*
+	 * Avoid housekeeping CPUs scheduling a worker on a non-housekeeping
+	 * CPU
+	 */
+	if (tick_nohz_full_enabled() && housekeeping_cpu(smp_processor_id(),
+							 HK_TYPE_WQ))
+		work_cpu = smp_processor_id();
+	else
+		work_cpu = blk_mq_hctx_next_cpu(hctx);
+
+	kblockd_mod_delayed_work_on(work_cpu, &hctx->run_work,
 				    msecs_to_jiffies(msecs));
 }
 
-- 
2.25.1


