* [PATCH v2 0/3] cpu/hotplug: rollback and "fail" interface fixes
@ 2021-02-16 10:35 vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: vincent.donnefort @ 2021-02-16 10:35 UTC (permalink / raw)
To: peterz, tglx; +Cc: linux-kernel, valentin.schneider, Vincent Donnefort
From: Vincent Donnefort <vincent.donnefort@arm.com>
This patch-set intends to unify steps call throughout hotplug and
hotunplug.
It also improves the "fail" interface, which can now be reset and will
reject states for which a failure can't be recovered.
v2:
- Reject all DEAD steps in the fail interface.
- Do not try to recover from a DEAD step failure.
- WARN on DEAD step failure.
- Additional comment in cpuhp_reset_state().
- Fix WARN_ON coding style issues.
Vincent Donnefort (3):
cpu/hotplug: Allowing to reset fail injection
cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
cpu/hotplug: Add cpuhp_invoke_callback_range()
kernel/cpu.c | 194 ++++++++++++++++++++++++++++++++-------------------
1 file changed, 123 insertions(+), 71 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection
2021-02-16 10:35 [PATCH v2 0/3] cpu/hotplug: rollback and "fail" interface fixes vincent.donnefort
@ 2021-02-16 10:35 ` vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
` (2 more replies)
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
2 siblings, 3 replies; 13+ messages in thread
From: vincent.donnefort @ 2021-02-16 10:35 UTC (permalink / raw)
To: peterz, tglx; +Cc: linux-kernel, valentin.schneider, Vincent Donnefort
From: Vincent Donnefort <vincent.donnefort@arm.com>
Currently, the only way of resetting the fail injection is to trigger a
hotplug, hotunplug or both. This is rather annoying for testing
and, as the default value for this file is -1, it seems pretty natural to
let a user write it.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 4e11e91010e1..093f96fb0824 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2200,6 +2200,11 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (ret)
return ret;
+ if (fail == CPUHP_INVALID) {
+ st->fail = fail;
+ return count;
+ }
+
if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE)
return -EINVAL;
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
2021-02-16 10:35 [PATCH v2 0/3] cpu/hotplug: rollback and "fail" interface fixes vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
@ 2021-02-16 10:35 ` vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
` (2 more replies)
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
2 siblings, 3 replies; 13+ messages in thread
From: vincent.donnefort @ 2021-02-16 10:35 UTC (permalink / raw)
To: peterz, tglx; +Cc: linux-kernel, valentin.schneider, Vincent Donnefort
From: Vincent Donnefort <vincent.donnefort@arm.com>
The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are
triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic
state can be rolled back.
DEAD callbacks too can't fail and disallow recovery. As a consequence,
during hotunplug, the fail injection interface should prohibit all states
from CPUHP_BRINGUP_CPU to CPUHP_ONLINE.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 093f96fb0824..d44877095b8c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1038,9 +1038,13 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
* to do the further cleanups.
*/
ret = cpuhp_down_callbacks(cpu, st, target);
- if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
- cpuhp_reset_state(st, prev_state);
- __cpuhp_kick_ap(st);
+ if (ret && st->state < prev_state) {
+ if (st->state == CPUHP_TEARDOWN_CPU) {
+ cpuhp_reset_state(st, prev_state);
+ __cpuhp_kick_ap(st);
+ } else {
+ WARN(1, "DEAD callback error for CPU%d", cpu);
+ }
}
out:
@@ -2214,6 +2218,15 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (cpuhp_is_atomic_state(fail))
return -EINVAL;
+ /*
+ * DEAD callbacks cannot fail...
+ * ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter
+ * triggering STARTING callbacks, a failure in this state would
+ * hinder rollback.
+ */
+ if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU)
+ return -EINVAL;
+
/*
* Cannot fail anything that doesn't have callbacks.
*/
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range()
2021-02-16 10:35 [PATCH v2 0/3] cpu/hotplug: rollback and "fail" interface fixes vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
@ 2021-02-16 10:35 ` vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
` (2 more replies)
2 siblings, 3 replies; 13+ messages in thread
From: vincent.donnefort @ 2021-02-16 10:35 UTC (permalink / raw)
To: peterz, tglx; +Cc: linux-kernel, valentin.schneider, Vincent Donnefort
From: Vincent Donnefort <vincent.donnefort@arm.com>
Factorizing and unifying cpuhp callback range invocations, especially for
the hotunplug path, where two different ways of decrementing were used. The
first one, decrements before the callback is called:
cpuhp_thread_fun()
state = st->state;
st->state--;
cpuhp_invoke_callback(state);
The second one, after:
take_down_cpu()|cpuhp_down_callbacks()
cpuhp_invoke_callback(st->state);
st->state--;
This is problematic for rolling back the steps in case of error, as
depending on the decrement, the rollback will start from N or N-1. It also
makes tracing inconsistent, between steps run in the cpuhp thread and
the others.
Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty
steps.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d44877095b8c..382ef48e1271 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
return cpuhp_hp_states + state;
}
+static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
+{
+ return bringup ? !step->startup.single : !step->teardown.single;
+}
+
/**
* cpuhp_invoke_callback _ Invoke the callbacks for a given state
* @cpu: The cpu for which the callback should be invoked
@@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
if (st->fail == state) {
st->fail = CPUHP_INVALID;
-
- if (!(bringup ? step->startup.single : step->teardown.single))
- return 0;
-
return -EAGAIN;
}
+ if (cpuhp_step_empty(bringup, step)) {
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
if (!step->multi_instance) {
WARN_ON_ONCE(lastp && *lastp);
cb = bringup ? step->startup.single : step->teardown.single;
- if (!cb)
- return 0;
+
trace_cpuhp_enter(cpu, st->target, state, cb);
ret = cb(cpu);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
cbm = bringup ? step->startup.multi : step->teardown.multi;
- if (!cbm)
- return 0;
/* Single invocation for instance add/remove */
if (node) {
@@ -468,6 +471,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
static inline void
cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
{
+ st->target = prev_state;
+
+ /*
+ * Already rolling back. No need invert the bringup value or to change
+ * the current state.
+ */
+ if (st->rollback)
+ return;
+
st->rollback = true;
/*
@@ -481,7 +493,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
st->state++;
}
- st->target = prev_state;
st->bringup = !st->bringup;
}
@@ -584,10 +595,53 @@ static int finish_cpu(unsigned int cpu)
* Hotplug state machine related functions
*/
-static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
+/*
+ * Get the next state to run. Empty ones will be skipped. Returns true if a
+ * state must be run.
+ *
+ * st->state will be modified ahead of time, to match state_to_run, as if it
+ * has already ran.
+ */
+static bool cpuhp_next_state(bool bringup,
+ enum cpuhp_state *state_to_run,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
{
- for (st->state--; st->state > st->target; st->state--)
- cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+ do {
+ if (bringup) {
+ if (st->state >= target)
+ return false;
+
+ *state_to_run = ++st->state;
+ } else {
+ if (st->state <= target)
+ return false;
+
+ *state_to_run = st->state--;
+ }
+
+ if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run)))
+ break;
+ } while (true);
+
+ return true;
+}
+
+static int cpuhp_invoke_callback_range(bool bringup,
+ unsigned int cpu,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
+{
+ enum cpuhp_state state;
+ int err = 0;
+
+ while (cpuhp_next_state(bringup, &state, st, target)) {
+ err = cpuhp_invoke_callback(cpu, state, bringup, NULL, NULL);
+ if (err)
+ break;
+ }
+
+ return err;
}
static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
@@ -610,16 +664,12 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state prev_state = st->state;
int ret = 0;
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- if (ret) {
- if (can_rollback_cpu(st)) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
- }
- break;
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+ if (ret) {
+ cpuhp_reset_state(st, prev_state);
+ if (can_rollback_cpu(st))
+ WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
+ prev_state));
}
return ret;
}
@@ -683,17 +733,9 @@ static void cpuhp_thread_fun(unsigned int cpu)
state = st->cb_state;
st->should_run = false;
} else {
- if (bringup) {
- st->state++;
- state = st->state;
- st->should_run = (st->state < st->target);
- WARN_ON_ONCE(st->state > st->target);
- } else {
- state = st->state;
- st->state--;
- st->should_run = (st->state > st->target);
- WARN_ON_ONCE(st->state < st->target);
- }
+ st->should_run = cpuhp_next_state(bringup, &state, st, st->target);
+ if (!st->should_run)
+ goto end;
}
WARN_ON_ONCE(!cpuhp_is_ap_state(state));
@@ -721,6 +763,7 @@ static void cpuhp_thread_fun(unsigned int cpu)
st->should_run = false;
}
+end:
cpuhp_lock_release(bringup);
lockdep_release_cpus_lock();
@@ -874,19 +917,18 @@ static int take_cpu_down(void *_param)
return err;
/*
- * We get here while we are in CPUHP_TEARDOWN_CPU state and we must not
- * do this step again.
+ * Must be called from CPUHP_TEARDOWN_CPU, which means, as we are going
+ * down, that the current state is CPUHP_TEARDOWN_CPU - 1.
*/
- WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
- st->state--;
+ WARN_ON(st->state != (CPUHP_TEARDOWN_CPU - 1));
+
/* Invoke the former CPU_DYING callbacks */
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- /*
- * DYING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+
+ /*
+ * DYING must not fail!
+ */
+ WARN_ON_ONCE(ret);
/* Give up timekeeping duties */
tick_handover_do_timer();
@@ -968,27 +1010,22 @@ void cpuhp_report_idle_dead(void)
cpuhp_complete_idle_dead, st, 0);
}
-static void undo_cpu_down(unsigned int cpu, struct cpuhp_cpu_state *st)
-{
- for (st->state++; st->state < st->target; st->state++)
- cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
-}
-
static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
enum cpuhp_state prev_state = st->state;
int ret = 0;
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- if (ret) {
- st->target = prev_state;
- if (st->state < prev_state)
- undo_cpu_down(cpu, st);
- break;
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+ if (ret) {
+
+ cpuhp_reset_state(st, prev_state);
+
+ if (st->state < prev_state)
+ WARN_ON(cpuhp_invoke_callback_range(true, cpu, st,
+ prev_state));
}
+
return ret;
}
@@ -1161,14 +1198,12 @@ void notify_cpu_starting(unsigned int cpu)
rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
cpumask_set_cpu(cpu, &cpus_booted_once_mask);
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- /*
- * STARTING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+
+ /*
+ * STARTING must not fail!
+ */
+ WARN_ON_ONCE(ret);
}
/*
@@ -1774,8 +1809,7 @@ static int cpuhp_issue_call(int cpu, enum cpuhp_state state, bool bringup,
* If there's nothing to do, we done.
* Relies on the union for multi_instance.
*/
- if ((bringup && !sp->startup.single) ||
- (!bringup && !sp->teardown.single))
+ if (cpuhp_step_empty(bringup, sp))
return 0;
/*
* The non AP bound callbacks can fail on bringup. On teardown
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Allowing to reset fail injection
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
@ 2021-03-02 9:01 ` tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-02 9:01 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: e58422d9789d5c61bb039e40e4e10781d8d43ac3
Gitweb: https://git.kernel.org/tip/e58422d9789d5c61bb039e40e4e10781d8d43ac3
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:04
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 01 Mar 2021 18:17:26 +01:00
cpu/hotplug: Allowing to reset fail injection
Currently, the only way of resetting the fail injection is to trigger a
hotplug, hotunplug or both. This is rather annoying for testing
and, as the default value for this file is -1, it seems pretty natural to
let a user write it.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-2-vincent.donnefort@arm.com
---
kernel/cpu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302e..9121edf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2207,6 +2207,11 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (ret)
return ret;
+ if (fail == CPUHP_INVALID) {
+ st->fail = fail;
+ return count;
+ }
+
if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE)
return -EINVAL;
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
@ 2021-03-02 9:01 ` tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-02 9:01 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 92e1512c5329c7675b66163d357d00d95107fa03
Gitweb: https://git.kernel.org/tip/92e1512c5329c7675b66163d357d00d95107fa03
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:05
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 01 Mar 2021 18:17:26 +01:00
cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are
triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic
state can be rolled back.
DEAD callbacks too can't fail and disallow recovery. As a consequence,
during hotunplug, the fail injection interface should prohibit all states
from CPUHP_BRINGUP_CPU to CPUHP_ONLINE.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-3-vincent.donnefort@arm.com
---
kernel/cpu.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9121edf..680ed8f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1045,9 +1045,13 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
* to do the further cleanups.
*/
ret = cpuhp_down_callbacks(cpu, st, target);
- if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
- cpuhp_reset_state(st, prev_state);
- __cpuhp_kick_ap(st);
+ if (ret && st->state < prev_state) {
+ if (st->state == CPUHP_TEARDOWN_CPU) {
+ cpuhp_reset_state(st, prev_state);
+ __cpuhp_kick_ap(st);
+ } else {
+ WARN(1, "DEAD callback error for CPU%d", cpu);
+ }
}
out:
@@ -2222,6 +2226,15 @@ static ssize_t write_cpuhp_fail(struct device *dev,
return -EINVAL;
/*
+ * DEAD callbacks cannot fail...
+ * ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter
+ * triggering STARTING callbacks, a failure in this state would
+ * hinder rollback.
+ */
+ if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU)
+ return -EINVAL;
+
+ /*
* Cannot fail anything that doesn't have callbacks.
*/
mutex_lock(&cpuhp_state_mutex);
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Add cpuhp_invoke_callback_range()
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
@ 2021-03-02 9:01 ` tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-02 9:01 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: c91b0dcb6482096e7af4adbf39cfe3296af74a78
Gitweb: https://git.kernel.org/tip/c91b0dcb6482096e7af4adbf39cfe3296af74a78
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:06
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 01 Mar 2021 18:17:27 +01:00
cpu/hotplug: Add cpuhp_invoke_callback_range()
Factorizing and unifying cpuhp callback range invocations, especially for
the hotunplug path, where two different ways of decrementing were used. The
first one, decrements before the callback is called:
cpuhp_thread_fun()
state = st->state;
st->state--;
cpuhp_invoke_callback(state);
The second one, after:
take_down_cpu()|cpuhp_down_callbacks()
cpuhp_invoke_callback(st->state);
st->state--;
This is problematic for rolling back the steps in case of error, as
depending on the decrement, the rollback will start from N or N-1. It also
makes tracing inconsistent, between steps run in the cpuhp thread and
the others.
Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty
steps.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-4-vincent.donnefort@arm.com
---
kernel/cpu.c | 170 ++++++++++++++++++++++++++++++--------------------
1 file changed, 102 insertions(+), 68 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 680ed8f..23505d6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
return cpuhp_hp_states + state;
}
+static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
+{
+ return bringup ? !step->startup.single : !step->teardown.single;
+}
+
/**
* cpuhp_invoke_callback _ Invoke the callbacks for a given state
* @cpu: The cpu for which the callback should be invoked
@@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
if (st->fail == state) {
st->fail = CPUHP_INVALID;
-
- if (!(bringup ? step->startup.single : step->teardown.single))
- return 0;
-
return -EAGAIN;
}
+ if (cpuhp_step_empty(bringup, step)) {
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
if (!step->multi_instance) {
WARN_ON_ONCE(lastp && *lastp);
cb = bringup ? step->startup.single : step->teardown.single;
- if (!cb)
- return 0;
+
trace_cpuhp_enter(cpu, st->target, state, cb);
ret = cb(cpu);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
cbm = bringup ? step->startup.multi : step->teardown.multi;
- if (!cbm)
- return 0;
/* Single invocation for instance add/remove */
if (node) {
@@ -475,6 +478,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
static inline void
cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
{
+ st->target = prev_state;
+
+ /*
+ * Already rolling back. No need invert the bringup value or to change
+ * the current state.
+ */
+ if (st->rollback)
+ return;
+
st->rollback = true;
/*
@@ -488,7 +500,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
st->state++;
}
- st->target = prev_state;
st->bringup = !st->bringup;
}
@@ -591,10 +602,53 @@ static int finish_cpu(unsigned int cpu)
* Hotplug state machine related functions
*/
-static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
+/*
+ * Get the next state to run. Empty ones will be skipped. Returns true if a
+ * state must be run.
+ *
+ * st->state will be modified ahead of time, to match state_to_run, as if it
+ * has already ran.
+ */
+static bool cpuhp_next_state(bool bringup,
+ enum cpuhp_state *state_to_run,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
{
- for (st->state--; st->state > st->target; st->state--)
- cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+ do {
+ if (bringup) {
+ if (st->state >= target)
+ return false;
+
+ *state_to_run = ++st->state;
+ } else {
+ if (st->state <= target)
+ return false;
+
+ *state_to_run = st->state--;
+ }
+
+ if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run)))
+ break;
+ } while (true);
+
+ return true;
+}
+
+static int cpuhp_invoke_callback_range(bool bringup,
+ unsigned int cpu,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
+{
+ enum cpuhp_state state;
+ int err = 0;
+
+ while (cpuhp_next_state(bringup, &state, st, target)) {
+ err = cpuhp_invoke_callback(cpu, state, bringup, NULL, NULL);
+ if (err)
+ break;
+ }
+
+ return err;
}
static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
@@ -617,16 +671,12 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state prev_state = st->state;
int ret = 0;
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- if (ret) {
- if (can_rollback_cpu(st)) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
- }
- break;
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+ if (ret) {
+ cpuhp_reset_state(st, prev_state);
+ if (can_rollback_cpu(st))
+ WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
+ prev_state));
}
return ret;
}
@@ -690,17 +740,9 @@ static void cpuhp_thread_fun(unsigned int cpu)
state = st->cb_state;
st->should_run = false;
} else {
- if (bringup) {
- st->state++;
- state = st->state;
- st->should_run = (st->state < st->target);
- WARN_ON_ONCE(st->state > st->target);
- } else {
- state = st->state;
- st->state--;
- st->should_run = (st->state > st->target);
- WARN_ON_ONCE(st->state < st->target);
- }
+ st->should_run = cpuhp_next_state(bringup, &state, st, st->target);
+ if (!st->should_run)
+ goto end;
}
WARN_ON_ONCE(!cpuhp_is_ap_state(state));
@@ -728,6 +770,7 @@ static void cpuhp_thread_fun(unsigned int cpu)
st->should_run = false;
}
+end:
cpuhp_lock_release(bringup);
lockdep_release_cpus_lock();
@@ -881,19 +924,18 @@ static int take_cpu_down(void *_param)
return err;
/*
- * We get here while we are in CPUHP_TEARDOWN_CPU state and we must not
- * do this step again.
+ * Must be called from CPUHP_TEARDOWN_CPU, which means, as we are going
+ * down, that the current state is CPUHP_TEARDOWN_CPU - 1.
*/
- WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
- st->state--;
+ WARN_ON(st->state != (CPUHP_TEARDOWN_CPU - 1));
+
/* Invoke the former CPU_DYING callbacks */
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- /*
- * DYING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+
+ /*
+ * DYING must not fail!
+ */
+ WARN_ON_ONCE(ret);
/* Give up timekeeping duties */
tick_handover_do_timer();
@@ -975,27 +1017,22 @@ void cpuhp_report_idle_dead(void)
cpuhp_complete_idle_dead, st, 0);
}
-static void undo_cpu_down(unsigned int cpu, struct cpuhp_cpu_state *st)
-{
- for (st->state++; st->state < st->target; st->state++)
- cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
-}
-
static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
enum cpuhp_state prev_state = st->state;
int ret = 0;
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- if (ret) {
- st->target = prev_state;
- if (st->state < prev_state)
- undo_cpu_down(cpu, st);
- break;
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+ if (ret) {
+
+ cpuhp_reset_state(st, prev_state);
+
+ if (st->state < prev_state)
+ WARN_ON(cpuhp_invoke_callback_range(true, cpu, st,
+ prev_state));
}
+
return ret;
}
@@ -1168,14 +1205,12 @@ void notify_cpu_starting(unsigned int cpu)
rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
cpumask_set_cpu(cpu, &cpus_booted_once_mask);
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- /*
- * STARTING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+
+ /*
+ * STARTING must not fail!
+ */
+ WARN_ON_ONCE(ret);
}
/*
@@ -1781,8 +1816,7 @@ static int cpuhp_issue_call(int cpu, enum cpuhp_state state, bool bringup,
* If there's nothing to do, we done.
* Relies on the union for multi_instance.
*/
- if ((bringup && !sp->startup.single) ||
- (!bringup && !sp->teardown.single))
+ if (cpuhp_step_empty(bringup, sp))
return 0;
/*
* The non AP bound callbacks can fail on bringup. On teardown
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
@ 2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-03 9:49 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 5e7f238920174248049ff840eff43c94f3a2e67e
Gitweb: https://git.kernel.org/tip/5e7f238920174248049ff840eff43c94f3a2e67e
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:05
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 03 Mar 2021 10:33:00 +01:00
cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are
triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic
state can be rolled back.
DEAD callbacks too can't fail and disallow recovery. As a consequence,
during hotunplug, the fail injection interface should prohibit all states
from CPUHP_BRINGUP_CPU to CPUHP_ONLINE.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-3-vincent.donnefort@arm.com
---
kernel/cpu.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9121edf..680ed8f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1045,9 +1045,13 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
* to do the further cleanups.
*/
ret = cpuhp_down_callbacks(cpu, st, target);
- if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
- cpuhp_reset_state(st, prev_state);
- __cpuhp_kick_ap(st);
+ if (ret && st->state < prev_state) {
+ if (st->state == CPUHP_TEARDOWN_CPU) {
+ cpuhp_reset_state(st, prev_state);
+ __cpuhp_kick_ap(st);
+ } else {
+ WARN(1, "DEAD callback error for CPU%d", cpu);
+ }
}
out:
@@ -2222,6 +2226,15 @@ static ssize_t write_cpuhp_fail(struct device *dev,
return -EINVAL;
/*
+ * DEAD callbacks cannot fail...
+ * ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter
+ * triggering STARTING callbacks, a failure in this state would
+ * hinder rollback.
+ */
+ if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU)
+ return -EINVAL;
+
+ /*
* Cannot fail anything that doesn't have callbacks.
*/
mutex_lock(&cpuhp_state_mutex);
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Add cpuhp_invoke_callback_range()
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
@ 2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-03 9:49 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 8b89220650146d59e9a8af2e5f12fc582539609e
Gitweb: https://git.kernel.org/tip/8b89220650146d59e9a8af2e5f12fc582539609e
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:06
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 03 Mar 2021 10:33:00 +01:00
cpu/hotplug: Add cpuhp_invoke_callback_range()
Factorizing and unifying cpuhp callback range invocations, especially for
the hotunplug path, where two different ways of decrementing were used. The
first one, decrements before the callback is called:
cpuhp_thread_fun()
state = st->state;
st->state--;
cpuhp_invoke_callback(state);
The second one, after:
take_down_cpu()|cpuhp_down_callbacks()
cpuhp_invoke_callback(st->state);
st->state--;
This is problematic for rolling back the steps in case of error, as
depending on the decrement, the rollback will start from N or N-1. It also
makes tracing inconsistent, between steps run in the cpuhp thread and
the others.
Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty
steps.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-4-vincent.donnefort@arm.com
---
kernel/cpu.c | 170 ++++++++++++++++++++++++++++++--------------------
1 file changed, 102 insertions(+), 68 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 680ed8f..23505d6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
return cpuhp_hp_states + state;
}
+static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
+{
+ return bringup ? !step->startup.single : !step->teardown.single;
+}
+
/**
* cpuhp_invoke_callback _ Invoke the callbacks for a given state
* @cpu: The cpu for which the callback should be invoked
@@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
if (st->fail == state) {
st->fail = CPUHP_INVALID;
-
- if (!(bringup ? step->startup.single : step->teardown.single))
- return 0;
-
return -EAGAIN;
}
+ if (cpuhp_step_empty(bringup, step)) {
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
if (!step->multi_instance) {
WARN_ON_ONCE(lastp && *lastp);
cb = bringup ? step->startup.single : step->teardown.single;
- if (!cb)
- return 0;
+
trace_cpuhp_enter(cpu, st->target, state, cb);
ret = cb(cpu);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
cbm = bringup ? step->startup.multi : step->teardown.multi;
- if (!cbm)
- return 0;
/* Single invocation for instance add/remove */
if (node) {
@@ -475,6 +478,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
static inline void
cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
{
+ st->target = prev_state;
+
+ /*
+ * Already rolling back. No need invert the bringup value or to change
+ * the current state.
+ */
+ if (st->rollback)
+ return;
+
st->rollback = true;
/*
@@ -488,7 +500,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
st->state++;
}
- st->target = prev_state;
st->bringup = !st->bringup;
}
@@ -591,10 +602,53 @@ static int finish_cpu(unsigned int cpu)
* Hotplug state machine related functions
*/
-static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
+/*
+ * Get the next state to run. Empty ones will be skipped. Returns true if a
+ * state must be run.
+ *
+ * st->state will be modified ahead of time, to match state_to_run, as if it
+ * has already ran.
+ */
+static bool cpuhp_next_state(bool bringup,
+ enum cpuhp_state *state_to_run,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
{
- for (st->state--; st->state > st->target; st->state--)
- cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+ do {
+ if (bringup) {
+ if (st->state >= target)
+ return false;
+
+ *state_to_run = ++st->state;
+ } else {
+ if (st->state <= target)
+ return false;
+
+ *state_to_run = st->state--;
+ }
+
+ if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run)))
+ break;
+ } while (true);
+
+ return true;
+}
+
+static int cpuhp_invoke_callback_range(bool bringup,
+ unsigned int cpu,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
+{
+ enum cpuhp_state state;
+ int err = 0;
+
+ while (cpuhp_next_state(bringup, &state, st, target)) {
+ err = cpuhp_invoke_callback(cpu, state, bringup, NULL, NULL);
+ if (err)
+ break;
+ }
+
+ return err;
}
static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
@@ -617,16 +671,12 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state prev_state = st->state;
int ret = 0;
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- if (ret) {
- if (can_rollback_cpu(st)) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
- }
- break;
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+ if (ret) {
+ cpuhp_reset_state(st, prev_state);
+ if (can_rollback_cpu(st))
+ WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
+ prev_state));
}
return ret;
}
@@ -690,17 +740,9 @@ static void cpuhp_thread_fun(unsigned int cpu)
state = st->cb_state;
st->should_run = false;
} else {
- if (bringup) {
- st->state++;
- state = st->state;
- st->should_run = (st->state < st->target);
- WARN_ON_ONCE(st->state > st->target);
- } else {
- state = st->state;
- st->state--;
- st->should_run = (st->state > st->target);
- WARN_ON_ONCE(st->state < st->target);
- }
+ st->should_run = cpuhp_next_state(bringup, &state, st, st->target);
+ if (!st->should_run)
+ goto end;
}
WARN_ON_ONCE(!cpuhp_is_ap_state(state));
@@ -728,6 +770,7 @@ static void cpuhp_thread_fun(unsigned int cpu)
st->should_run = false;
}
+end:
cpuhp_lock_release(bringup);
lockdep_release_cpus_lock();
@@ -881,19 +924,18 @@ static int take_cpu_down(void *_param)
return err;
/*
- * We get here while we are in CPUHP_TEARDOWN_CPU state and we must not
- * do this step again.
+ * Must be called from CPUHP_TEARDOWN_CPU, which means, as we are going
+ * down, that the current state is CPUHP_TEARDOWN_CPU - 1.
*/
- WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
- st->state--;
+ WARN_ON(st->state != (CPUHP_TEARDOWN_CPU - 1));
+
/* Invoke the former CPU_DYING callbacks */
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- /*
- * DYING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+
+ /*
+ * DYING must not fail!
+ */
+ WARN_ON_ONCE(ret);
/* Give up timekeeping duties */
tick_handover_do_timer();
@@ -975,27 +1017,22 @@ void cpuhp_report_idle_dead(void)
cpuhp_complete_idle_dead, st, 0);
}
-static void undo_cpu_down(unsigned int cpu, struct cpuhp_cpu_state *st)
-{
- for (st->state++; st->state < st->target; st->state++)
- cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
-}
-
static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
enum cpuhp_state prev_state = st->state;
int ret = 0;
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- if (ret) {
- st->target = prev_state;
- if (st->state < prev_state)
- undo_cpu_down(cpu, st);
- break;
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+ if (ret) {
+
+ cpuhp_reset_state(st, prev_state);
+
+ if (st->state < prev_state)
+ WARN_ON(cpuhp_invoke_callback_range(true, cpu, st,
+ prev_state));
}
+
return ret;
}
@@ -1168,14 +1205,12 @@ void notify_cpu_starting(unsigned int cpu)
rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
cpumask_set_cpu(cpu, &cpus_booted_once_mask);
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- /*
- * STARTING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+
+ /*
+ * STARTING must not fail!
+ */
+ WARN_ON_ONCE(ret);
}
/*
@@ -1781,8 +1816,7 @@ static int cpuhp_issue_call(int cpu, enum cpuhp_state state, bool bringup,
* If there's nothing to do, we done.
* Relies on the union for multi_instance.
*/
- if ((bringup && !sp->startup.single) ||
- (!bringup && !sp->teardown.single))
+ if (cpuhp_step_empty(bringup, sp))
return 0;
/*
* The non AP bound callbacks can fail on bringup. On teardown
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Allowing to reset fail injection
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
@ 2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-03 9:49 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 6d06c515e9151dc858e391bd6bebce0b684eec4f
Gitweb: https://git.kernel.org/tip/6d06c515e9151dc858e391bd6bebce0b684eec4f
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:04
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 03 Mar 2021 10:33:00 +01:00
cpu/hotplug: Allowing to reset fail injection
Currently, the only way of resetting the fail injection is to trigger a
hotplug, hotunplug or both. This is rather annoying for testing
and, as the default value for this file is -1, it seems pretty natural to
let a user write it.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-2-vincent.donnefort@arm.com
---
kernel/cpu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302e..9121edf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2207,6 +2207,11 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (ret)
return ret;
+ if (fail == CPUHP_INVALID) {
+ st->fail = fail;
+ return count;
+ }
+
if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE)
return -EINVAL;
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Add cpuhp_invoke_callback_range()
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
@ 2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-06 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel),
Ingo Molnar, x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 453e41085183980087f8a80dada523caf1131c3c
Gitweb: https://git.kernel.org/tip/453e41085183980087f8a80dada523caf1131c3c
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:06
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00
cpu/hotplug: Add cpuhp_invoke_callback_range()
Factorizing and unifying cpuhp callback range invocations, especially for
the hotunplug path, where two different ways of decrementing were used. The
first one, decrements before the callback is called:
cpuhp_thread_fun()
state = st->state;
st->state--;
cpuhp_invoke_callback(state);
The second one, after:
take_down_cpu()|cpuhp_down_callbacks()
cpuhp_invoke_callback(st->state);
st->state--;
This is problematic for rolling back the steps in case of error, as
depending on the decrement, the rollback will start from N or N-1. It also
makes tracing inconsistent, between steps run in the cpuhp thread and
the others.
Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty
steps.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-4-vincent.donnefort@arm.com
---
kernel/cpu.c | 170 ++++++++++++++++++++++++++++++--------------------
1 file changed, 102 insertions(+), 68 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 680ed8f..23505d6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
return cpuhp_hp_states + state;
}
+static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
+{
+ return bringup ? !step->startup.single : !step->teardown.single;
+}
+
/**
* cpuhp_invoke_callback _ Invoke the callbacks for a given state
* @cpu: The cpu for which the callback should be invoked
@@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
if (st->fail == state) {
st->fail = CPUHP_INVALID;
-
- if (!(bringup ? step->startup.single : step->teardown.single))
- return 0;
-
return -EAGAIN;
}
+ if (cpuhp_step_empty(bringup, step)) {
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
if (!step->multi_instance) {
WARN_ON_ONCE(lastp && *lastp);
cb = bringup ? step->startup.single : step->teardown.single;
- if (!cb)
- return 0;
+
trace_cpuhp_enter(cpu, st->target, state, cb);
ret = cb(cpu);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
cbm = bringup ? step->startup.multi : step->teardown.multi;
- if (!cbm)
- return 0;
/* Single invocation for instance add/remove */
if (node) {
@@ -475,6 +478,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
static inline void
cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
{
+ st->target = prev_state;
+
+ /*
+ * Already rolling back. No need invert the bringup value or to change
+ * the current state.
+ */
+ if (st->rollback)
+ return;
+
st->rollback = true;
/*
@@ -488,7 +500,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
st->state++;
}
- st->target = prev_state;
st->bringup = !st->bringup;
}
@@ -591,10 +602,53 @@ static int finish_cpu(unsigned int cpu)
* Hotplug state machine related functions
*/
-static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
+/*
+ * Get the next state to run. Empty ones will be skipped. Returns true if a
+ * state must be run.
+ *
+ * st->state will be modified ahead of time, to match state_to_run, as if it
+ * has already ran.
+ */
+static bool cpuhp_next_state(bool bringup,
+ enum cpuhp_state *state_to_run,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
{
- for (st->state--; st->state > st->target; st->state--)
- cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+ do {
+ if (bringup) {
+ if (st->state >= target)
+ return false;
+
+ *state_to_run = ++st->state;
+ } else {
+ if (st->state <= target)
+ return false;
+
+ *state_to_run = st->state--;
+ }
+
+ if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run)))
+ break;
+ } while (true);
+
+ return true;
+}
+
+static int cpuhp_invoke_callback_range(bool bringup,
+ unsigned int cpu,
+ struct cpuhp_cpu_state *st,
+ enum cpuhp_state target)
+{
+ enum cpuhp_state state;
+ int err = 0;
+
+ while (cpuhp_next_state(bringup, &state, st, target)) {
+ err = cpuhp_invoke_callback(cpu, state, bringup, NULL, NULL);
+ if (err)
+ break;
+ }
+
+ return err;
}
static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
@@ -617,16 +671,12 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state prev_state = st->state;
int ret = 0;
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- if (ret) {
- if (can_rollback_cpu(st)) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
- }
- break;
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+ if (ret) {
+ cpuhp_reset_state(st, prev_state);
+ if (can_rollback_cpu(st))
+ WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
+ prev_state));
}
return ret;
}
@@ -690,17 +740,9 @@ static void cpuhp_thread_fun(unsigned int cpu)
state = st->cb_state;
st->should_run = false;
} else {
- if (bringup) {
- st->state++;
- state = st->state;
- st->should_run = (st->state < st->target);
- WARN_ON_ONCE(st->state > st->target);
- } else {
- state = st->state;
- st->state--;
- st->should_run = (st->state > st->target);
- WARN_ON_ONCE(st->state < st->target);
- }
+ st->should_run = cpuhp_next_state(bringup, &state, st, st->target);
+ if (!st->should_run)
+ goto end;
}
WARN_ON_ONCE(!cpuhp_is_ap_state(state));
@@ -728,6 +770,7 @@ static void cpuhp_thread_fun(unsigned int cpu)
st->should_run = false;
}
+end:
cpuhp_lock_release(bringup);
lockdep_release_cpus_lock();
@@ -881,19 +924,18 @@ static int take_cpu_down(void *_param)
return err;
/*
- * We get here while we are in CPUHP_TEARDOWN_CPU state and we must not
- * do this step again.
+ * Must be called from CPUHP_TEARDOWN_CPU, which means, as we are going
+ * down, that the current state is CPUHP_TEARDOWN_CPU - 1.
*/
- WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
- st->state--;
+ WARN_ON(st->state != (CPUHP_TEARDOWN_CPU - 1));
+
/* Invoke the former CPU_DYING callbacks */
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- /*
- * DYING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+
+ /*
+ * DYING must not fail!
+ */
+ WARN_ON_ONCE(ret);
/* Give up timekeeping duties */
tick_handover_do_timer();
@@ -975,27 +1017,22 @@ void cpuhp_report_idle_dead(void)
cpuhp_complete_idle_dead, st, 0);
}
-static void undo_cpu_down(unsigned int cpu, struct cpuhp_cpu_state *st)
-{
- for (st->state++; st->state < st->target; st->state++)
- cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
-}
-
static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
enum cpuhp_state prev_state = st->state;
int ret = 0;
- for (; st->state > target; st->state--) {
- ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
- if (ret) {
- st->target = prev_state;
- if (st->state < prev_state)
- undo_cpu_down(cpu, st);
- break;
- }
+ ret = cpuhp_invoke_callback_range(false, cpu, st, target);
+ if (ret) {
+
+ cpuhp_reset_state(st, prev_state);
+
+ if (st->state < prev_state)
+ WARN_ON(cpuhp_invoke_callback_range(true, cpu, st,
+ prev_state));
}
+
return ret;
}
@@ -1168,14 +1205,12 @@ void notify_cpu_starting(unsigned int cpu)
rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
cpumask_set_cpu(cpu, &cpus_booted_once_mask);
- while (st->state < target) {
- st->state++;
- ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
- /*
- * STARTING must not fail!
- */
- WARN_ON_ONCE(ret);
- }
+ ret = cpuhp_invoke_callback_range(true, cpu, st, target);
+
+ /*
+ * STARTING must not fail!
+ */
+ WARN_ON_ONCE(ret);
}
/*
@@ -1781,8 +1816,7 @@ static int cpuhp_issue_call(int cpu, enum cpuhp_state state, bool bringup,
* If there's nothing to do, we done.
* Relies on the union for multi_instance.
*/
- if ((bringup && !sp->startup.single) ||
- (!bringup && !sp->teardown.single))
+ if (cpuhp_step_empty(bringup, sp))
return 0;
/*
* The non AP bound callbacks can fail on bringup. On teardown
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
@ 2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-06 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel),
Ingo Molnar, x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 62f250694092dd5fef9900dc3126f07110bf9d48
Gitweb: https://git.kernel.org/tip/62f250694092dd5fef9900dc3126f07110bf9d48
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:05
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00
cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are
triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic
state can be rolled back.
DEAD callbacks too can't fail and disallow recovery. As a consequence,
during hotunplug, the fail injection interface should prohibit all states
from CPUHP_BRINGUP_CPU to CPUHP_ONLINE.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-3-vincent.donnefort@arm.com
---
kernel/cpu.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9121edf..680ed8f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1045,9 +1045,13 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
* to do the further cleanups.
*/
ret = cpuhp_down_callbacks(cpu, st, target);
- if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
- cpuhp_reset_state(st, prev_state);
- __cpuhp_kick_ap(st);
+ if (ret && st->state < prev_state) {
+ if (st->state == CPUHP_TEARDOWN_CPU) {
+ cpuhp_reset_state(st, prev_state);
+ __cpuhp_kick_ap(st);
+ } else {
+ WARN(1, "DEAD callback error for CPU%d", cpu);
+ }
}
out:
@@ -2222,6 +2226,15 @@ static ssize_t write_cpuhp_fail(struct device *dev,
return -EINVAL;
/*
+ * DEAD callbacks cannot fail...
+ * ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter
+ * triggering STARTING callbacks, a failure in this state would
+ * hinder rollback.
+ */
+ if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU)
+ return -EINVAL;
+
+ /*
* Cannot fail anything that doesn't have callbacks.
*/
mutex_lock(&cpuhp_state_mutex);
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip: sched/core] cpu/hotplug: Allowing to reset fail injection
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
@ 2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-03-06 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Vincent Donnefort, Peter Zijlstra (Intel),
Ingo Molnar, x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 3ae70c251f344976428d1f6ee61ea7b4e170fec3
Gitweb: https://git.kernel.org/tip/3ae70c251f344976428d1f6ee61ea7b4e170fec3
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Tue, 16 Feb 2021 10:35:04
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00
cpu/hotplug: Allowing to reset fail injection
Currently, the only way of resetting the fail injection is to trigger a
hotplug, hotunplug or both. This is rather annoying for testing
and, as the default value for this file is -1, it seems pretty natural to
let a user write it.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210216103506.416286-2-vincent.donnefort@arm.com
---
kernel/cpu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302e..9121edf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2207,6 +2207,11 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (ret)
return ret;
+ if (fail == CPUHP_INVALID) {
+ st->fail = fail;
+ return count;
+ }
+
if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE)
return -EINVAL;
^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-03-06 11:43 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-16 10:35 [PATCH v2 0/3] cpu/hotplug: rollback and "fail" interface fixes vincent.donnefort
2021-02-16 10:35 ` [PATCH v2 1/3] cpu/hotplug: Allowing to reset fail injection vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2021-02-16 10:35 ` [PATCH v2 2/3] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
2021-02-16 10:35 ` [PATCH v2 3/3] cpu/hotplug: Add cpuhp_invoke_callback_range() vincent.donnefort
2021-03-02 9:01 ` [tip: sched/core] " tip-bot2 for Vincent Donnefort
2021-03-03 9:49 ` tip-bot2 for Vincent Donnefort
2021-03-06 11:42 ` tip-bot2 for Vincent Donnefort
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).