From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()
Date: Sun, 7 Jun 2020 11:57:32 -0700 [thread overview]
Message-ID: <20200607185732.GA18906@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200606172942.GA30594@paulmck-ThinkPad-P72>
On Sat, Jun 06, 2020 at 10:29:42AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 05, 2020 at 05:51:26PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 05, 2020 at 11:41:59AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> > >
> > > And in case it is helpful, here is the output of "git bisect view",
> > > which lists rather more commits than "git bisect run" claims, but there
> > > are only a few scheduler commits below. I don't see anything that
> > > I can prove can cause this problem, but there are some that are at
> > > least related to this code path.
> > >
> > > Is there anything that is actually relevant?
> >
> > And the run with the WARN and tracing did hit two failures, and the
> > corresponding console logs are in the attached tarball. Both of them
> > triggered a warning as well as the failure.
>
> And the current state of the bisection, for whatever it is worth.
The bisection finished, finally!
90b5363acd47 ("sched: Clean up scheduler_ipi()")
I don't see anything immediately obvious, but then again, I do not
claim to understand this code. If you have additional diagnostics,
please let me know.
Thanx, Paul
------------------------------------------------------------------------
commit 90b5363acd4739769c3f38c1aff16171bd133e8c
Author: Peter Zijlstra (Intel) <peterz@infradead.org>
Date: Fri Mar 27 11:44:56 2020 +0100
sched: Clean up scheduler_ipi()
The scheduler IPI has grown weird and wonderful over the years, time
for spring cleaning.
Move all the non-trivial stuff out of it and into a regular smp function
call IPI. This then reduces the schedule_ipi() to most of it's former NOP
glory and ensures to keep the interrupt vector lean and mean.
Aside of that avoiding the full irq_enter() in the x86 IPI implementation
is incorrect as scheduler_ipi() can be instrumented. To work around that
scheduler_ipi() had an irq_enter/exit() hack when heavy work was
pending. This is gone now.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134058.361859938@linutronix.de
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b58efb1..cd2070d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -219,6 +219,13 @@ void update_rq_clock(struct rq *rq)
update_rq_clock_task(rq, delta);
}
+static inline void
+rq_csd_init(struct rq *rq, call_single_data_t *csd, smp_call_func_t func)
+{
+ csd->flags = 0;
+ csd->func = func;
+ csd->info = rq;
+}
#ifdef CONFIG_SCHED_HRTICK
/*
@@ -314,16 +321,14 @@ void hrtick_start(struct rq *rq, u64 delay)
hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
HRTIMER_MODE_REL_PINNED_HARD);
}
+
#endif /* CONFIG_SMP */
static void hrtick_rq_init(struct rq *rq)
{
#ifdef CONFIG_SMP
- rq->hrtick_csd.flags = 0;
- rq->hrtick_csd.func = __hrtick_start;
- rq->hrtick_csd.info = rq;
+ rq_csd_init(rq, &rq->hrtick_csd, __hrtick_start);
#endif
-
hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
rq->hrtick_timer.function = hrtick;
}
@@ -650,6 +655,16 @@ static inline bool got_nohz_idle_kick(void)
return false;
}
+static void nohz_csd_func(void *info)
+{
+ struct rq *rq = info;
+
+ if (got_nohz_idle_kick()) {
+ rq->idle_balance = 1;
+ raise_softirq_irqoff(SCHED_SOFTIRQ);
+ }
+}
+
#else /* CONFIG_NO_HZ_COMMON */
static inline bool got_nohz_idle_kick(void)
@@ -2292,6 +2307,11 @@ void sched_ttwu_pending(void)
rq_unlock_irqrestore(rq, &rf);
}
+static void wake_csd_func(void *info)
+{
+ sched_ttwu_pending();
+}
+
void scheduler_ipi(void)
{
/*
@@ -2300,34 +2320,6 @@ void scheduler_ipi(void)
* this IPI.
*/
preempt_fold_need_resched();
-
- if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
- return;
-
- /*
- * Not all reschedule IPI handlers call irq_enter/irq_exit, since
- * traditionally all their work was done from the interrupt return
- * path. Now that we actually do some work, we need to make sure
- * we do call them.
- *
- * Some archs already do call them, luckily irq_enter/exit nest
- * properly.
- *
- * Arguably we should visit all archs and update all handlers,
- * however a fair share of IPIs are still resched only so this would
- * somewhat pessimize the simple resched case.
- */
- irq_enter();
- sched_ttwu_pending();
-
- /*
- * Check if someone kicked us for doing the nohz idle load balance.
- */
- if (unlikely(got_nohz_idle_kick())) {
- this_rq()->idle_balance = 1;
- raise_softirq_irqoff(SCHED_SOFTIRQ);
- }
- irq_exit();
}
static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
@@ -2336,9 +2328,9 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
- if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
+ if (llist_add(&p->wake_entry, &rq->wake_list)) {
if (!set_nr_if_polling(rq->idle))
- smp_send_reschedule(cpu);
+ smp_call_function_single_async(cpu, &rq->wake_csd);
else
trace_sched_wake_idle_without_ipi(cpu);
}
@@ -6693,12 +6685,16 @@ void __init sched_init(void)
rq->avg_idle = 2*sysctl_sched_migration_cost;
rq->max_idle_balance_cost = sysctl_sched_migration_cost;
+ rq_csd_init(rq, &rq->wake_csd, wake_csd_func);
+
INIT_LIST_HEAD(&rq->cfs_tasks);
rq_attach_root(rq, &def_root_domain);
#ifdef CONFIG_NO_HZ_COMMON
rq->last_blocked_load_update_tick = jiffies;
atomic_set(&rq->nohz_flags, 0);
+
+ rq_csd_init(rq, &rq->nohz_csd, nohz_csd_func);
#endif
#endif /* CONFIG_SMP */
hrtick_rq_init(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 46b7bd4..6b7f147 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10000,12 +10000,11 @@ static void kick_ilb(unsigned int flags)
return;
/*
- * Use smp_send_reschedule() instead of resched_cpu().
- * This way we generate a sched IPI on the target CPU which
+ * This way we generate an IPI on the target CPU which
* is idle. And the softirq performing nohz idle load balance
* will be run before returning from the IPI.
*/
- smp_send_reschedule(ilb_cpu);
+ smp_call_function_single_async(ilb_cpu, &cpu_rq(ilb_cpu)->nohz_csd);
}
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 978c6fa..21416b3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -889,9 +889,10 @@ struct rq {
#ifdef CONFIG_SMP
unsigned long last_blocked_load_update_tick;
unsigned int has_blocked_load;
+ call_single_data_t nohz_csd;
#endif /* CONFIG_SMP */
unsigned int nohz_tick_stopped;
- atomic_t nohz_flags;
+ atomic_t nohz_flags;
#endif /* CONFIG_NO_HZ_COMMON */
unsigned long nr_load_updates;
@@ -978,7 +979,7 @@ struct rq {
/* This is used to determine avg_idle's max value */
u64 max_idle_balance_cost;
-#endif
+#endif /* CONFIG_SMP */
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
u64 prev_irq_time;
@@ -1020,6 +1021,7 @@ struct rq {
#endif
#ifdef CONFIG_SMP
+ call_single_data_t wake_csd;
struct llist_head wake_list;
#endif
next prev parent reply other threads:[~2020-06-07 18:57 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-04 22:54 BUG: kernel NULL pointer dereference from check_preempt_wakeup() Paul E. McKenney
2020-06-05 10:38 ` Peter Zijlstra
2020-06-05 10:56 ` Peter Zijlstra
2020-06-05 13:14 ` Peter Zijlstra
2020-06-05 14:16 ` Paul E. McKenney
2020-06-05 18:41 ` Paul E. McKenney
[not found] ` <20200606005126.GA21507@paulmck-ThinkPad-P72>
2020-06-06 17:29 ` Paul E. McKenney
2020-06-07 18:57 ` Paul E. McKenney [this message]
2020-06-09 15:40 ` Paul E. McKenney
2020-06-13 2:48 ` Paul E. McKenney
2020-06-13 7:26 ` Thomas Gleixner
2020-06-13 14:57 ` Paul E. McKenney
2020-06-13 23:40 ` Paul E. McKenney
2020-06-15 8:29 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200607185732.GA18906@paulmck-ThinkPad-P72 \
--to=paulmck@kernel.org \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.