All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	juri.lelli@redhat.com, dietmar.eggemann@arm.com,
	vincent.guittot@linaro.org, Steven Rostedt <rostedt@goodmis.org>,
	bsegall@google.com, mgorman@suse.de, paulmck@kernel.org,
	tglx@linutronix.de,
	"James.Bottomley@hansenpartnership.com" 
	<James.Bottomley@HansenPartnership.com>,
	deller@gmx.de, linuxppc-dev@lists.ozlabs.org,
	linux-parisc@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>
Subject: Re: [PATCH v2] sched/core: fix illegal RCU from offline CPUs
Date: Fri, 17 Apr 2020 09:26:56 -0400	[thread overview]
Message-ID: <BBA124FA-7924-4782-AC9D-7B1B98BE817F@lca.pw> (raw)
In-Reply-To: <87369mt9kf.fsf@mpe.ellerman.id.au>



> On Apr 2, 2020, at 7:24 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Qian Cai <cai@lca.pw> writes:
>> From: Peter Zijlstra <peterz@infradead.org>
>> 
>> In the CPU-offline process, it calls mmdrop() after idle entry and the
>> subsequent call to cpuhp_report_idle_dead(). Once execution passes the
>> call to rcu_report_dead(), RCU is ignoring the CPU, which results in
>> lockdep complaining when mmdrop() uses RCU from either memcg or
>> debugobjects below.
>> 
>> Fix it by cleaning up the active_mm state from BP instead. Every arch
>> which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
>> from AP. The only exception is parisc because it switches them to
>> &init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
>> but the patch will still work there because it calls mmgrab(&init_mm) in
>> smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().
> 
> Thanks for debugging this. How did you hit it in the first place?
> 
> A link to the original thread would have helped me:
> 
>  https://lore.kernel.org/lkml/20200113190331.12788-1-cai@lca.pw/
> 
>> WARNING: suspicious RCU usage
>> -----------------------------
>> kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!
>> 
>> other info that might help us debug this:
>> 
>> RCU used illegally from offline CPU!
>> Call Trace:
>> dump_stack+0xf4/0x164 (unreliable)
>> lockdep_rcu_suspicious+0x140/0x164
>> get_work_pool+0x110/0x150
>> __queue_work+0x1bc/0xca0
>> queue_work_on+0x114/0x120
>> css_release+0x9c/0xc0
>> percpu_ref_put_many+0x204/0x230
>> free_pcp_prepare+0x264/0x570
>> free_unref_page+0x38/0xf0
>> __mmdrop+0x21c/0x2c0
>> idle_task_exit+0x170/0x1b0
>> pnv_smp_cpu_kill_self+0x38/0x2e0
>> cpu_die+0x48/0x64
>> arch_cpu_idle_dead+0x30/0x50
>> do_idle+0x2f4/0x470
>> cpu_startup_entry+0x38/0x40
>> start_secondary+0x7a8/0xa80
>> start_secondary_resume+0x10/0x14
> 
> Do we know when this started happening? ie. can we determine a Fixes
> tag?
> 
>> <Peter to sign off here>
>> Signed-off-by: Qian Cai <cai@lca.pw>
>> ---
>> arch/powerpc/platforms/powernv/smp.c |  1 -
>> include/linux/sched/mm.h             |  2 ++
>> kernel/cpu.c                         | 18 +++++++++++++++++-
>> kernel/sched/core.c                  |  5 +++--
>> 4 files changed, 22 insertions(+), 4 deletions(-)
>> 
>> diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
>> index 13e251699346..b2ba3e95bda7 100644
>> --- a/arch/powerpc/platforms/powernv/smp.c
>> +++ b/arch/powerpc/platforms/powernv/smp.c
>> @@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void)
>> 	/* Standard hot unplug procedure */
>> 
>> 	idle_task_exit();
>> -	current->active_mm = NULL; /* for sanity */
> 
> If I'm reading it right, we'll now be running with active_mm == init_mm
> in the offline loop.
> 
> I guess that's fine, I can't think of any reason it would matter, and it
> seems like we were NULL'ing it out just for paranoia's sake not because
> of any actual problem.
> 
> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

Peter, can you take a look at this patch when you have a chance?

> 
> 
> cheers
> 
>> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
>> index c49257a3b510..a132d875d351 100644
>> --- a/include/linux/sched/mm.h
>> +++ b/include/linux/sched/mm.h
>> @@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm)
>> 		__mmdrop(mm);
>> }
>> 
>> +void mmdrop(struct mm_struct *mm);
>> +
>> /*
>>  * This has to be called after a get_task_mm()/mmget_not_zero()
>>  * followed by taking the mmap_sem for writing before modifying the
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 2371292f30b0..244d30544377 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -3,6 +3,7 @@
>>  *
>>  * This code is licenced under the GPL.
>>  */
>> +#include <linux/sched/mm.h>
>> #include <linux/proc_fs.h>
>> #include <linux/smp.h>
>> #include <linux/init.h>
>> @@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu)
>> 	return bringup_wait_for_ap(cpu);
>> }
>> 
>> +static int finish_cpu(unsigned int cpu)
>> +{
>> +	struct task_struct *idle = idle_thread_get(cpu);
>> +	struct mm_struct *mm = idle->active_mm;
>> +
>> +	/*
>> +	 * idle_task_exit() will have switched to &init_mm, now
>> +	 * clean up any remaining active_mm state.
>> +	 */
>> +	if (mm != &init_mm)
>> +		idle->active_mm = &init_mm;
>> +	mmdrop(mm);
>> +	return 0;
>> +}
>> +
>> /*
>>  * Hotplug state machine related functions
>>  */
>> @@ -1549,7 +1565,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
>> 	[CPUHP_BRINGUP_CPU] = {
>> 		.name			= "cpu:bringup",
>> 		.startup.single		= bringup_cpu,
>> -		.teardown.single	= NULL,
>> +		.teardown.single	= finish_cpu,
>> 		.cant_stop		= true,
>> 	},
>> 	/* Final state before CPU kills itself */
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index a2694ba82874..8787958339d5 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -6200,13 +6200,14 @@ void idle_task_exit(void)
>> 	struct mm_struct *mm = current->active_mm;
>> 
>> 	BUG_ON(cpu_online(smp_processor_id()));
>> +	BUG_ON(current != this_rq()->idle);
>> 
>> 	if (mm != &init_mm) {
>> 		switch_mm(mm, &init_mm, current);
>> -		current->active_mm = &init_mm;
>> 		finish_arch_post_lock_switch();
>> 	}
>> -	mmdrop(mm);
>> +
>> +	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
>> }
>> 
>> /*
>> -- 
>> 2.21.0 (Apple Git-122.2)


WARNING: multiple messages have this Message-ID (diff)
From: Qian Cai <cai@lca.pw>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: juri.lelli@redhat.com,
	"James.Bottomley@hansenpartnership.com"
	<James.Bottomley@HansenPartnership.com>,
	vincent.guittot@linaro.org, linux-parisc@vger.kernel.org,
	paulmck@kernel.org, Peter Zijlstra <peterz@infradead.org>,
	deller@gmx.de, Nicholas Piggin <npiggin@gmail.com>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	bsegall@google.com, linux-mm@kvack.org,
	Ingo Molnar <mingo@redhat.com>,
	mgorman@suse.de, tglx@linutronix.de,
	linuxppc-dev@lists.ozlabs.org, dietmar.eggemann@arm.com
Subject: Re: [PATCH v2] sched/core: fix illegal RCU from offline CPUs
Date: Fri, 17 Apr 2020 09:26:56 -0400	[thread overview]
Message-ID: <BBA124FA-7924-4782-AC9D-7B1B98BE817F@lca.pw> (raw)
In-Reply-To: <87369mt9kf.fsf@mpe.ellerman.id.au>



> On Apr 2, 2020, at 7:24 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Qian Cai <cai@lca.pw> writes:
>> From: Peter Zijlstra <peterz@infradead.org>
>> 
>> In the CPU-offline process, it calls mmdrop() after idle entry and the
>> subsequent call to cpuhp_report_idle_dead(). Once execution passes the
>> call to rcu_report_dead(), RCU is ignoring the CPU, which results in
>> lockdep complaining when mmdrop() uses RCU from either memcg or
>> debugobjects below.
>> 
>> Fix it by cleaning up the active_mm state from BP instead. Every arch
>> which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
>> from AP. The only exception is parisc because it switches them to
>> &init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
>> but the patch will still work there because it calls mmgrab(&init_mm) in
>> smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().
> 
> Thanks for debugging this. How did you hit it in the first place?
> 
> A link to the original thread would have helped me:
> 
>  https://lore.kernel.org/lkml/20200113190331.12788-1-cai@lca.pw/
> 
>> WARNING: suspicious RCU usage
>> -----------------------------
>> kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!
>> 
>> other info that might help us debug this:
>> 
>> RCU used illegally from offline CPU!
>> Call Trace:
>> dump_stack+0xf4/0x164 (unreliable)
>> lockdep_rcu_suspicious+0x140/0x164
>> get_work_pool+0x110/0x150
>> __queue_work+0x1bc/0xca0
>> queue_work_on+0x114/0x120
>> css_release+0x9c/0xc0
>> percpu_ref_put_many+0x204/0x230
>> free_pcp_prepare+0x264/0x570
>> free_unref_page+0x38/0xf0
>> __mmdrop+0x21c/0x2c0
>> idle_task_exit+0x170/0x1b0
>> pnv_smp_cpu_kill_self+0x38/0x2e0
>> cpu_die+0x48/0x64
>> arch_cpu_idle_dead+0x30/0x50
>> do_idle+0x2f4/0x470
>> cpu_startup_entry+0x38/0x40
>> start_secondary+0x7a8/0xa80
>> start_secondary_resume+0x10/0x14
> 
> Do we know when this started happening? ie. can we determine a Fixes
> tag?
> 
>> <Peter to sign off here>
>> Signed-off-by: Qian Cai <cai@lca.pw>
>> ---
>> arch/powerpc/platforms/powernv/smp.c |  1 -
>> include/linux/sched/mm.h             |  2 ++
>> kernel/cpu.c                         | 18 +++++++++++++++++-
>> kernel/sched/core.c                  |  5 +++--
>> 4 files changed, 22 insertions(+), 4 deletions(-)
>> 
>> diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
>> index 13e251699346..b2ba3e95bda7 100644
>> --- a/arch/powerpc/platforms/powernv/smp.c
>> +++ b/arch/powerpc/platforms/powernv/smp.c
>> @@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void)
>> 	/* Standard hot unplug procedure */
>> 
>> 	idle_task_exit();
>> -	current->active_mm = NULL; /* for sanity */
> 
> If I'm reading it right, we'll now be running with active_mm == init_mm
> in the offline loop.
> 
> I guess that's fine, I can't think of any reason it would matter, and it
> seems like we were NULL'ing it out just for paranoia's sake not because
> of any actual problem.
> 
> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

Peter, can you take a look at this patch when you have a chance?

> 
> 
> cheers
> 
>> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
>> index c49257a3b510..a132d875d351 100644
>> --- a/include/linux/sched/mm.h
>> +++ b/include/linux/sched/mm.h
>> @@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm)
>> 		__mmdrop(mm);
>> }
>> 
>> +void mmdrop(struct mm_struct *mm);
>> +
>> /*
>>  * This has to be called after a get_task_mm()/mmget_not_zero()
>>  * followed by taking the mmap_sem for writing before modifying the
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 2371292f30b0..244d30544377 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -3,6 +3,7 @@
>>  *
>>  * This code is licenced under the GPL.
>>  */
>> +#include <linux/sched/mm.h>
>> #include <linux/proc_fs.h>
>> #include <linux/smp.h>
>> #include <linux/init.h>
>> @@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu)
>> 	return bringup_wait_for_ap(cpu);
>> }
>> 
>> +static int finish_cpu(unsigned int cpu)
>> +{
>> +	struct task_struct *idle = idle_thread_get(cpu);
>> +	struct mm_struct *mm = idle->active_mm;
>> +
>> +	/*
>> +	 * idle_task_exit() will have switched to &init_mm, now
>> +	 * clean up any remaining active_mm state.
>> +	 */
>> +	if (mm != &init_mm)
>> +		idle->active_mm = &init_mm;
>> +	mmdrop(mm);
>> +	return 0;
>> +}
>> +
>> /*
>>  * Hotplug state machine related functions
>>  */
>> @@ -1549,7 +1565,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
>> 	[CPUHP_BRINGUP_CPU] = {
>> 		.name			= "cpu:bringup",
>> 		.startup.single		= bringup_cpu,
>> -		.teardown.single	= NULL,
>> +		.teardown.single	= finish_cpu,
>> 		.cant_stop		= true,
>> 	},
>> 	/* Final state before CPU kills itself */
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index a2694ba82874..8787958339d5 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -6200,13 +6200,14 @@ void idle_task_exit(void)
>> 	struct mm_struct *mm = current->active_mm;
>> 
>> 	BUG_ON(cpu_online(smp_processor_id()));
>> +	BUG_ON(current != this_rq()->idle);
>> 
>> 	if (mm != &init_mm) {
>> 		switch_mm(mm, &init_mm, current);
>> -		current->active_mm = &init_mm;
>> 		finish_arch_post_lock_switch();
>> 	}
>> -	mmdrop(mm);
>> +
>> +	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
>> }
>> 
>> /*
>> -- 
>> 2.21.0 (Apple Git-122.2)


  parent reply	other threads:[~2020-04-17 13:27 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-01 21:40 [PATCH v2] sched/core: fix illegal RCU from offline CPUs Qian Cai
2020-04-01 21:40 ` Qian Cai
2020-04-02 11:24 ` Michael Ellerman
2020-04-02 11:24   ` Michael Ellerman
2020-04-02 14:00   ` Qian Cai
2020-04-02 14:00     ` Qian Cai
2020-04-02 14:00     ` Qian Cai
2020-04-02 15:54     ` Paul E. McKenney
2020-04-02 15:54       ` Paul E. McKenney
2020-04-02 15:54       ` Paul E. McKenney
2020-04-02 16:19       ` Qian Cai
2020-04-02 16:19         ` Qian Cai
2020-04-02 16:19         ` Qian Cai
2020-04-02 16:57         ` Paul E. McKenney
2020-04-02 16:57           ` Paul E. McKenney
2020-04-02 16:57           ` Paul E. McKenney
2020-04-17 13:26   ` Qian Cai [this message]
2020-04-17 13:26     ` Qian Cai
2020-04-17 13:26     ` Qian Cai
2020-04-21 13:56     ` Peter Zijlstra
2020-04-21 13:56       ` Peter Zijlstra
2020-04-21 13:56       ` Peter Zijlstra
2020-05-01 18:22 ` [tip: sched/core] sched/core: Fix " tip-bot2 for Peter Zijlstra
  -- strict thread matches above, loose matches on Subject: below --
2020-01-13 19:03 [PATCH v2] sched/core: fix " Qian Cai
2020-01-20 10:16 ` Peter Zijlstra
2020-01-20 20:35   ` Qian Cai
2020-01-21 10:35     ` Peter Zijlstra
2020-01-24  4:21       ` Qian Cai
2020-01-24  5:02         ` Matthew Wilcox
2020-03-30  2:42       ` Qian Cai
2020-04-01 21:05         ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BBA124FA-7924-4782-AC9D-7B1B98BE817F@lca.pw \
    --to=cai@lca.pw \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=bsegall@google.com \
    --cc=deller@gmx.de \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.