linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/numa: do not balance tasks onto isolated cpus
@ 2018-07-23  5:39 Chen Lin
  2018-07-23  8:40 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Chen Lin @ 2018-07-23  5:39 UTC (permalink / raw)
  To: mingo, peterz
  Cc: linux-kernel, jiang.biao2, zhong.weidong, tan.hu, Chen Lin, Tan Hu

From: Chen Lin <cheng.lin130@zte.com.cn>

NUMA balancing has not taken *isolcpus(isolated cpus)* into 
consideration. It may migrate tasks onto isolated cpus and the 
migrated tasks will never escape from the isolated cpus, which will
break the isolation provided by *isolcpus* boot parameter and 
intrduce various problems.

This patch ensure NUMA balancing not to balance tasks onto iaolated
cpus. 

Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
---
 kernel/sched/core.c | 9 ++++++---
 kernel/sched/fair.c | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fe365c9..f9ce90c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1302,10 +1302,12 @@ int migrate_swap(struct task_struct *cur, struct task_struct *p)
 	if (!cpu_active(arg.src_cpu) || !cpu_active(arg.dst_cpu))
 		goto out;

-	if (!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
+	if ((!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
+            || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
 		goto out;

-	if (!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
+	if ((!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
+            || cpumask_test_cpu(arg.src_cpu, cpu_isolated_map))
 		goto out;

 	trace_sched_swap_numa(cur, arg.src_cpu, p, arg.dst_cpu);
@@ -5508,7 +5510,8 @@ int migrate_task_to(struct task_struct *p, int target_cpu)
 	if (curr_cpu == target_cpu)
 		return 0;

-	if (!cpumask_test_cpu(target_cpu, &p->cpus_allowed))
+	if ((!cpumask_test_cpu(target_cpu, &p->cpus_allowed))
+            || cpumask_test_cpu(target_cpu, cpu_isolated_map))
 		return -EINVAL;

 	/* TODO: This is not properly updating schedstats */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2f0a0be..a91f8fe 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1724,7 +1724,8 @@ static void task_numa_find_cpu(struct task_numa_env *env,

 	for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
 		/* Skip this CPU if the source task cannot migrate */
-		if (!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
+		if ((!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
+                    || cpumask_test_cpu(cpu, cpu_isolated_map))
 			continue;

 		env->dst_cpu = cpu;
--
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-23  5:39 [PATCH] sched/numa: do not balance tasks onto isolated cpus Chen Lin
@ 2018-07-23  8:40 ` Peter Zijlstra
  2018-07-24  1:11   ` jiang.biao2
  2018-07-23 20:06 ` kbuild test robot
  2018-07-23 20:06 ` kbuild test robot
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-07-23  8:40 UTC (permalink / raw)
  To: Chen Lin
  Cc: mingo, linux-kernel, jiang.biao2, zhong.weidong, tan.hu,
	Chen Lin, Tan Hu

On Mon, Jul 23, 2018 at 01:39:30PM +0800, Chen Lin wrote:
> From: Chen Lin <cheng.lin130@zte.com.cn>
> 
> NUMA balancing has not taken *isolcpus(isolated cpus)* into 
> consideration. It may migrate tasks onto isolated cpus and the 
> migrated tasks will never escape from the isolated cpus, which will
> break the isolation provided by *isolcpus* boot parameter and 
> intrduce various problems.
> 
> This patch ensure NUMA balancing not to balance tasks onto iaolated
> cpus. 

I'm not sure what kernel you're patching, but cpu_isolated_map doesn't
exist anymore. Also, if it steps on isolated CPUs, this is the wrong fix
anyway. Load-balancing should be constrained to the current root domain.

> Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
> Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
> Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>

This SoB chain is invalid.

> ---
>  kernel/sched/core.c | 9 ++++++---
>  kernel/sched/fair.c | 3 ++-
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index fe365c9..f9ce90c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1302,10 +1302,12 @@ int migrate_swap(struct task_struct *cur, struct task_struct *p)
>  	if (!cpu_active(arg.src_cpu) || !cpu_active(arg.dst_cpu))
>  		goto out;
> 
> -	if (!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
> +	if ((!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
> +            || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
>  		goto out;
> 
> -	if (!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
> +	if ((!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
> +            || cpumask_test_cpu(arg.src_cpu, cpu_isolated_map))
>  		goto out;
> 
>  	trace_sched_swap_numa(cur, arg.src_cpu, p, arg.dst_cpu);
> @@ -5508,7 +5510,8 @@ int migrate_task_to(struct task_struct *p, int target_cpu)
>  	if (curr_cpu == target_cpu)
>  		return 0;
> 
> -	if (!cpumask_test_cpu(target_cpu, &p->cpus_allowed))
> +	if ((!cpumask_test_cpu(target_cpu, &p->cpus_allowed))
> +            || cpumask_test_cpu(target_cpu, cpu_isolated_map))
>  		return -EINVAL;
> 
>  	/* TODO: This is not properly updating schedstats */
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2f0a0be..a91f8fe 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1724,7 +1724,8 @@ static void task_numa_find_cpu(struct task_numa_env *env,
> 
>  	for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
>  		/* Skip this CPU if the source task cannot migrate */
> -		if (!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
> +		if ((!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
> +                    || cpumask_test_cpu(cpu, cpu_isolated_map))
>  			continue;
> 
>  		env->dst_cpu = cpu;
> --
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-23  5:39 [PATCH] sched/numa: do not balance tasks onto isolated cpus Chen Lin
  2018-07-23  8:40 ` Peter Zijlstra
@ 2018-07-23 20:06 ` kbuild test robot
  2018-07-23 20:06 ` kbuild test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2018-07-23 20:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: kbuild-all, mingo, peterz, linux-kernel, jiang.biao2,
	zhong.weidong, tan.hu, Chen Lin, Tan Hu

[-- Attachment #1: Type: text/plain, Size: 2576 bytes --]

Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.18-rc6 next-20180723]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chen-Lin/sched-numa-do-not-balance-tasks-onto-isolated-cpus/20180724-031803
config: i386-randconfig-x008-201829 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/sched/core.c: In function 'migrate_swap':
>> kernel/sched/core.c:1283:46: error: 'cpu_isolated_map' undeclared (first use in this function); did you mean 'cpu_core_map'?
                || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
                                                 ^~~~~~~~~~~~~~~~
                                                 cpu_core_map
   kernel/sched/core.c:1283:46: note: each undeclared identifier is reported only once for each function it appears in

vim +1283 kernel/sched/core.c

  1256	
  1257	/*
  1258	 * Cross migrate two tasks
  1259	 */
  1260	int migrate_swap(struct task_struct *cur, struct task_struct *p)
  1261	{
  1262		struct migration_swap_arg arg;
  1263		int ret = -EINVAL;
  1264	
  1265		arg = (struct migration_swap_arg){
  1266			.src_task = cur,
  1267			.src_cpu = task_cpu(cur),
  1268			.dst_task = p,
  1269			.dst_cpu = task_cpu(p),
  1270		};
  1271	
  1272		if (arg.src_cpu == arg.dst_cpu)
  1273			goto out;
  1274	
  1275		/*
  1276		 * These three tests are all lockless; this is OK since all of them
  1277		 * will be re-checked with proper locks held further down the line.
  1278		 */
  1279		if (!cpu_active(arg.src_cpu) || !cpu_active(arg.dst_cpu))
  1280			goto out;
  1281	
  1282		if ((!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
> 1283	            || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
  1284			goto out;
  1285	
  1286		if ((!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
  1287	            || cpumask_test_cpu(arg.src_cpu, cpu_isolated_map))
  1288			goto out;
  1289	
  1290		trace_sched_swap_numa(cur, arg.src_cpu, p, arg.dst_cpu);
  1291		ret = stop_two_cpus(arg.dst_cpu, arg.src_cpu, migrate_swap_stop, &arg);
  1292	
  1293	out:
  1294		return ret;
  1295	}
  1296	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25146 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-23  5:39 [PATCH] sched/numa: do not balance tasks onto isolated cpus Chen Lin
  2018-07-23  8:40 ` Peter Zijlstra
  2018-07-23 20:06 ` kbuild test robot
@ 2018-07-23 20:06 ` kbuild test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2018-07-23 20:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: kbuild-all, mingo, peterz, linux-kernel, jiang.biao2,
	zhong.weidong, tan.hu, Chen Lin, Tan Hu

[-- Attachment #1: Type: text/plain, Size: 1848 bytes --]

Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.18-rc6 next-20180723]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chen-Lin/sched-numa-do-not-balance-tasks-onto-isolated-cpus/20180724-031803
config: x86_64-randconfig-x008-201829 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   kernel//sched/fair.c: In function 'task_numa_find_cpu':
>> kernel//sched/fair.c:1726:46: error: 'cpu_isolated_map' undeclared (first use in this function); did you mean 'cpu_core_map'?
                        || cpumask_test_cpu(cpu, cpu_isolated_map))
                                                 ^~~~~~~~~~~~~~~~
                                                 cpu_core_map
   kernel//sched/fair.c:1726:46: note: each undeclared identifier is reported only once for each function it appears in

vim +1726 kernel//sched/fair.c

  1717	
  1718	static void task_numa_find_cpu(struct task_numa_env *env,
  1719					long taskimp, long groupimp)
  1720	{
  1721		int cpu;
  1722	
  1723		for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
  1724			/* Skip this CPU if the source task cannot migrate */
  1725			if ((!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
> 1726	                    || cpumask_test_cpu(cpu, cpu_isolated_map))
  1727				continue;
  1728	
  1729			env->dst_cpu = cpu;
  1730			task_numa_compare(env, taskimp, groupimp);
  1731		}
  1732	}
  1733	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27462 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-23  8:40 ` Peter Zijlstra
@ 2018-07-24  1:11   ` jiang.biao2
  2018-07-24  9:06     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: jiang.biao2 @ 2018-07-24  1:11 UTC (permalink / raw)
  To: peterz
  Cc: chen.lin130, mingo, linux-kernel, zhong.weidong, tan.hu,
	cheng.lin130, tan.hu


[-- Attachment #1.1: Type: text/plain, Size: 1345 bytes --]

>
> On Mon, Jul 23, 2018 at 01:39:30PM +0800, Chen Lin wrote:
>> From: Chen Lin <cheng.lin130@zte.com.cn>
>>
>> NUMA balancing has not taken *isolcpus(isolated cpus)* into
>> consideration. It may migrate tasks onto isolated cpus and the
>> migrated tasks will never escape from the isolated cpus, which will
>> break the isolation provided by *isolcpus* boot parameter and
>> intrduce various problems.
>>
>> This patch ensure NUMA balancing not to balance tasks onto iaolated
>> cpus.
>
> I'm not sure what kernel you're patching, but cpu_isolated_map doesn't
> exist anymore. Also, if it steps on isolated CPUs, this is the wrong fix
> anyway. Load-balancing should be constrained to the current root domain.
Indeed, we used 4.14 version and made wrong assumption, sorry for that.
We will retest on latest mainline version, rework the patch and send another
version.

>> Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
>> Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
>> Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
> 
>This SoB chain is invalid.
Mm, we don't quite understand what the *Signed-off-by* precisely means,
Does it only mean DCO(developer certificate of origin)?
As we understood, multiple SoBs could be used to indicate co-authors.
If SoB only means DCO, how can the patches reflect co-authors?

Thanks a lot.
Jiang,
Regards

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-24  1:11   ` jiang.biao2
@ 2018-07-24  9:06     ` Peter Zijlstra
  2018-07-25  0:30       ` jiang.biao2
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-07-24  9:06 UTC (permalink / raw)
  To: jiang.biao2
  Cc: chen.lin130, mingo, linux-kernel, zhong.weidong, tan.hu,
	cheng.lin130, tan.hu

On Tue, Jul 24, 2018 at 09:11:37AM +0800, jiang.biao2@zte.com.cn wrote:

> >> Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
> >> Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
> >> Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
> > 
> >This SoB chain is invalid.

> Mm, we don't quite understand what the *Signed-off-by* precisely means,
> Does it only mean DCO(developer certificate of origin)?
> As we understood, multiple SoBs could be used to indicate co-authors.
> If SoB only means DCO, how can the patches reflect co-authors?

It specifically does not allow for co-authorship. I think there's a
Co-Authored-by: tag invented by some people (check the git logs) but
especially for such a dinky little patch, I wouldn't bother. Maybe have
your co-workers review the patch or something.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-24  9:06     ` Peter Zijlstra
@ 2018-07-25  0:30       ` jiang.biao2
  2018-07-25  8:35         ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: jiang.biao2 @ 2018-07-25  0:30 UTC (permalink / raw)
  To: peterz
  Cc: chen.lin130, mingo, linux-kernel, zhong.weidong, tan.hu,
	cheng.lin130, tan.hu


[-- Attachment #1.1: Type: text/plain, Size: 1015 bytes --]

Hi,
> On Tue, Jul 24, 2018 at 09:11:37AM +0800, jiang.biao2@zte.com.cn wrote:
>
>> >> Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
>> >> Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
>> >> Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
>> >
>> >This SoB chain is invalid.
>
>> Mm, we don't quite understand what the *Signed-off-by* precisely means,
>> Does it only mean DCO(developer certificate of origin)?
>> As we understood, multiple SoBs could be used to indicate co-authors.
>> If SoB only means DCO, how can the patches reflect co-authors?
>
> It specifically does not allow for co-authorship. I think there's a
> Co-Authored-by: tag invented by some people (check the git logs) but
> especially for such a dinky little patch, I wouldn't bother. Maybe have
> your co-workers review the patch or something.
I do find some(very few) Co-Authored-by in git logs, and we'll try to avoid
using that. Sometimes little patches may include much background work of
different people.:) 
Thanks lot.

Jiang,
Regard

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-25  0:30       ` jiang.biao2
@ 2018-07-25  8:35         ` Peter Zijlstra
  2018-07-25  8:52           ` jiang.biao2
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-07-25  8:35 UTC (permalink / raw)
  To: jiang.biao2
  Cc: chen.lin130, mingo, linux-kernel, zhong.weidong, tan.hu,
	cheng.lin130, tan.hu

On Wed, Jul 25, 2018 at 08:30:03AM +0800, jiang.biao2@zte.com.cn wrote:
> I do find some(very few) Co-Authored-by in git logs, and we'll try to avoid
> using that. Sometimes little patches may include much background work of
> different people.:) 

I sometimes use Debugged-by: to attribute the sometimes significant work
of actually finding what's wrong.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
  2018-07-25  8:35         ` Peter Zijlstra
@ 2018-07-25  8:52           ` jiang.biao2
  0 siblings, 0 replies; 9+ messages in thread
From: jiang.biao2 @ 2018-07-25  8:52 UTC (permalink / raw)
  To: peterz
  Cc: chen.lin130, mingo, linux-kernel, zhong.weidong, tan.hu,
	cheng.lin130, tan.hu


[-- Attachment #1.1: Type: text/plain, Size: 433 bytes --]

> On Wed, Jul 25, 2018 at 08:30:03AM +0800, jiang.biao2@zte.com.cn wrote:
>> I do find some(very few) Co-Authored-by in git logs, and we'll try to avoid
>> using that. Sometimes little patches may include much background work of
>> different people.:)
>
> I sometimes use Debugged-by: to attribute the sometimes significant work
> of actually finding what's wrong.
That seems a nice way. Thanks a lot for your advice.

Jiang,
Regards

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-07-25  8:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-23  5:39 [PATCH] sched/numa: do not balance tasks onto isolated cpus Chen Lin
2018-07-23  8:40 ` Peter Zijlstra
2018-07-24  1:11   ` jiang.biao2
2018-07-24  9:06     ` Peter Zijlstra
2018-07-25  0:30       ` jiang.biao2
2018-07-25  8:35         ` Peter Zijlstra
2018-07-25  8:52           ` jiang.biao2
2018-07-23 20:06 ` kbuild test robot
2018-07-23 20:06 ` kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).