From: "Paul E. McKenney" <paulmck@kernel.org>
To: Valentin Schneider <vschneid@redhat.com>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
bristot@redhat.com
Subject: Re: "Dying CPU not properly vacated" splat
Date: Mon, 25 Apr 2022 17:03:28 -0700
Message-ID: <20220426000328.GY4285@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <xhsmh1qxkakof.mognet@vschneid.remote.csb>
On Mon, Apr 25, 2022 at 10:59:44PM +0100, Valentin Schneider wrote:
> On 25/04/22 10:33, Paul E. McKenney wrote:
> > On Mon, Apr 25, 2022 at 05:15:13PM +0100, Valentin Schneider wrote:
> >>
> >> Hi Paul,
> >>
> >> On 21/04/22 12:38, Paul E. McKenney wrote:
> >> > Hello!
> >> >
> >> > The rcutorture TREE03 scenario got the following splat, which appears
> >> > to be a one-off, or if not, having an MTBF in the thousands of hours,
> >> > even assuming that it is specific to TREE03. (If it is not specific to
> >> > TREE03, we are talking tens of thousands of hours of rcutorture runtime.)
> >> >
> >> > So just in case this rings any bells or there are some diagnostics I
> >> > should add in case this ever happens again. ;-)
> >>
> >> There should be a dump of the enqueued tasks right after the snippet you've
> >> sent, any chance you could share that if it's there? That should tell us
> >> which tasks are potentially misbehaving.
> >
> > And now that I know to look for them, there they are! Thank you!!!
> >
> > CPU7 enqueued tasks (2 total):
> > pid: 52, name: migration/7
> > pid: 135, name: rcu_torture_rea
> > smpboot: CPU 7 is now offline
> >
> > So what did rcu_torture_reader() do wrong here? ;-)
> >
>
> So on teardown, CPUHP_AP_SCHED_WAIT_EMPTY->sched_cpu_wait_empty() waits for
> the rq to be empty. Tasks must *not* be enqueued onto that CPU after that
> step has been run - if there are per-CPU tasks bound to that CPU, they must
> be unbound in their respective hotplug callback.
>
> For instance for workqueue.c, we have workqueue_offline_cpu() as a hotplug
> callback which invokes unbind_workers(cpu), the interesting bit being:
>
>     for_each_pool_worker(worker, pool) {
>         kthread_set_per_cpu(worker->task, -1);
>         WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>     }
>
> The rcu_torture_reader() kthreads aren't bound to any particular CPU, are
> they? I can't find any code that would indicate they are - and in that case
> it means we have a problem with is_cpu_allowed() or related.

I did not intend that the rcu_torture_reader() kthreads be bound, and
I am not seeing anything that binds them.

Thoughts? (Other than that validating any alleged fix will be quite
"interesting".)

Thanx, Paul
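
For readers following along, the rule discussed above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel's code: struct model_cpu, struct model_task, model_cpu_allowed() and model_cpu_vacated() are hypothetical names, the mask is a plain unsigned long rather than a cpumask, and recognizing the stopper thread by its "migration/" name is a shortcut made up for this sketch. It only tries to capture two ideas from the thread: an unbound kthread should not be placed on a dying CPU, and the dying CPU should end up with nothing but its stopper thread enqueued.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for kernel state; not kernel types. */
struct model_cpu {
	bool online;
	bool dying;	/* set once teardown of this CPU has begun */
};

struct model_task {
	int pid;
	const char *name;
	unsigned long cpus_mask;	/* bit N set: task may run on CPU N */
	bool per_cpu_kthread;		/* models a kthread marked per-CPU */
};

/*
 * Rough model of the placement rule for kernel threads: a per-CPU kthread
 * may stay on its CPU while that CPU is online; any other kthread must
 * stay off a CPU that is being torn down.
 */
static bool model_cpu_allowed(const struct model_task *t,
			      const struct model_cpu *cpus, int cpu)
{
	if (!(t->cpus_mask & (1UL << cpu)))
		return false;
	if (t->per_cpu_kthread)
		return cpus[cpu].online;
	if (cpus[cpu].dying)
		return false;
	return cpus[cpu].online;
}

/*
 * Model of the final check: the dying CPU counts as vacated only if the
 * single task left on its runqueue is the stopper thread. Matching on the
 * "migration/" name prefix is an assumption of this sketch.
 */
static bool model_cpu_vacated(const struct model_task *rq, size_t nr_running)
{
	return nr_running == 1 && strncmp(rq[0].name, "migration/", 10) == 0;
}
```

Feeding this model the two tasks from the dump above (migration/7 and rcu_torture_rea on a dying CPU 7) gives "allowed" for the stopper, "not allowed" for the unbound reader, and "not vacated" for the runqueue, which is the situation the splat reports. The open question of how the reader got enqueued there anyway is exactly what the model cannot answer.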