linux-kernel.vger.kernel.org archive mirror
* [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
@ 2022-07-25  9:59 Vincent Donnefort
  2022-07-25 15:07 ` Valentin Schneider
  2022-08-17  9:46 ` Thorsten Leemhuis
  0 siblings, 2 replies; 7+ messages in thread
From: Vincent Donnefort @ 2022-07-25  9:59 UTC
  To: peterz, tglx
  Cc: linux-kernel, vschneid, regressions, kernel-team,
	Vincent Donnefort, Derek Dolney

The DYING/STARTING callbacks are not expected to fail. However, as reported
by Derek, drivers such as tboot are still free to return errors within
those sections, which halts the hot(un)plug and leaves the CPU in an
unrecoverable state.

No rollback being possible there, let's only log the failures and proceed
with the following steps. This restores the hotplug behaviour prior to
commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
Reported-by: Derek Dolney <z23@posteo.net>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Tested-by: Derek Dolney <z23@posteo.net>

v4 -> v5:
   - Remove WARN, only log broken states with pr_warn.
v3 -> v4:
   - Sorry ... wrong commit description style ...
v2 -> v3:
   - Tested-by tag.
   - Refine commit description.
   - Bugzilla link.
v1 -> v2:
   - Commit message rewording.
   - More details in the warnings.
   - Some variable renaming

diff --git a/kernel/cpu.c b/kernel/cpu.c
index bbad5e375d3b..621e5af42d57 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -663,21 +663,51 @@ static bool cpuhp_next_state(bool bringup,
 	return true;
 }
 
-static int cpuhp_invoke_callback_range(bool bringup,
-				       unsigned int cpu,
-				       struct cpuhp_cpu_state *st,
-				       enum cpuhp_state target)
+static int __cpuhp_invoke_callback_range(bool bringup,
+					 unsigned int cpu,
+					 struct cpuhp_cpu_state *st,
+					 enum cpuhp_state target,
+					 bool nofail)
 {
 	enum cpuhp_state state;
-	int err = 0;
+	int ret = 0;
 
 	while (cpuhp_next_state(bringup, &state, st, target)) {
+		int err;
+
 		err = cpuhp_invoke_callback(cpu, state, bringup, NULL, NULL);
-		if (err)
+		if (!err)
+			continue;
+
+		if (nofail) {
+			pr_warn("CPU %u %s state %s (%d) failed (%d)\n",
+				cpu, bringup ? "UP" : "DOWN",
+				cpuhp_get_step(st->state)->name,
+				st->state, err);
+			ret = -1;
+		} else {
+			ret = err;
 			break;
+		}
 	}
 
-	return err;
+	return ret;
+}
+
+static inline int cpuhp_invoke_callback_range(bool bringup,
+					      unsigned int cpu,
+					      struct cpuhp_cpu_state *st,
+					      enum cpuhp_state target)
+{
+	return __cpuhp_invoke_callback_range(bringup, cpu, st, target, false);
+}
+
+static inline void cpuhp_invoke_callback_range_nofail(bool bringup,
+						      unsigned int cpu,
+						      struct cpuhp_cpu_state *st,
+						      enum cpuhp_state target)
+{
+	__cpuhp_invoke_callback_range(bringup, cpu, st, target, true);
 }
 
 static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
@@ -999,7 +1029,6 @@ static int take_cpu_down(void *_param)
 	struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
 	enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
 	int err, cpu = smp_processor_id();
-	int ret;
 
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();
@@ -1012,13 +1041,11 @@ static int take_cpu_down(void *_param)
 	 */
 	WARN_ON(st->state != (CPUHP_TEARDOWN_CPU - 1));
 
-	/* Invoke the former CPU_DYING callbacks */
-	ret = cpuhp_invoke_callback_range(false, cpu, st, target);
-
 	/*
+	 * Invoke the former CPU_DYING callbacks
 	 * DYING must not fail!
 	 */
-	WARN_ON_ONCE(ret);
+	cpuhp_invoke_callback_range_nofail(false, cpu, st, target);
 
 	/* Give up timekeeping duties */
 	tick_handover_do_timer();
@@ -1296,16 +1323,14 @@ void notify_cpu_starting(unsigned int cpu)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
 	enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
-	int ret;
 
 	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
 	cpumask_set_cpu(cpu, &cpus_booted_once_mask);
-	ret = cpuhp_invoke_callback_range(true, cpu, st, target);
 
 	/*
 	 * STARTING must not fail!
 	 */
-	WARN_ON_ONCE(ret);
+	cpuhp_invoke_callback_range_nofail(true, cpu, st, target);
 }
 
 /*
-- 
2.37.1.359.gd136c6c3e2-goog
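
For readers skimming the thread, the control flow the patch introduces can be
sketched outside the kernel. This is only an illustrative userspace model, not
the kernel code: struct step, invoke_range(), the demo callbacks and the -22
error value are all hypothetical names and values chosen for the example. It
shows the two modes: with nofail unset the walk stops at the first failing
step and returns its error; with nofail set every step still runs, each
failure is logged, and -1 is returned if anything failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a hotplug step: 0 on success, negative on error. */
struct step {
	const char *name;
	int (*fn)(unsigned int cpu);
};

static int step_ok(unsigned int cpu) { (void)cpu; return 0; }
static int step_broken(unsigned int cpu) { (void)cpu; return -22; }

/*
 * Invoke steps[0..nr_steps-1] in order; *ran counts the callbacks invoked.
 * nofail == false: stop at the first failure and return its error code.
 * nofail == true:  log each failure, keep going, return -1 if any failed.
 */
static int invoke_range(const struct step *steps, int nr_steps,
			unsigned int cpu, bool nofail, int *ran)
{
	int ret = 0;

	assert(steps && ran);
	*ran = 0;

	for (int i = 0; i < nr_steps; i++) {
		int err = steps[i].fn(cpu);

		(*ran)++;
		if (!err)
			continue;

		if (!nofail)
			return err;	/* caller may roll back */

		/* No rollback possible: only report and carry on. */
		fprintf(stderr, "CPU %u state %s (%d) failed (%d)\n",
			cpu, steps[i].name, i, err);
		ret = -1;
	}

	return ret;
}

/* Demo table: the middle step always fails. */
static const struct step demo_steps[] = {
	{ "first",  step_ok },
	{ "broken", step_broken },
	{ "last",   step_ok },
};
```

Calling invoke_range(demo_steps, 3, 0, false, &ran) stops after the second
step, whereas passing true for nofail runs all three, which mirrors why the
DYING/STARTING sections no longer leave the CPU half-way through.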



* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-07-25  9:59 [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections Vincent Donnefort
@ 2022-07-25 15:07 ` Valentin Schneider
  2022-07-25 15:16   ` Vincent Donnefort
  2022-08-17  9:46 ` Thorsten Leemhuis
  1 sibling, 1 reply; 7+ messages in thread
From: Valentin Schneider @ 2022-07-25 15:07 UTC
  To: Vincent Donnefort, peterz, tglx
  Cc: linux-kernel, regressions, kernel-team, Vincent Donnefort, Derek Dolney

On 25/07/22 10:59, Vincent Donnefort wrote:
> The DYING/STARTING callbacks are not expected to fail. However, as reported
> by Derek, drivers such as tboot are still free to return errors within
> those sections, which halts the hot(un)plug and leaves the CPU in an
> unrecoverable state.
>
> No rollback being possible there, let's only log the failures and proceed
> with the following steps. This restores the hotplug behaviour prior to
> commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
> Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> Reported-by: Derek Dolney <z23@posteo.net>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> Tested-by: Derek Dolney <z23@posteo.net>
>

The changelog has some undesired stowaways below, but regardless:
Reviewed-by: Valentin Schneider <vschneid@redhat.com>

> v4 -> v5:
>    - Remove WARN, only log broken states with pr_warn.
> v3 -> v4:
>    - Sorry ... wrong commit description style ...
> v2 -> v3:
>    - Tested-by tag.
>    - Refine commit description.
>    - Bugzilla link.
> v1 -> v2:
>    - Commit message rewording.
>    - More details in the warnings.
>    - Some variable renaming
>



* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-07-25 15:07 ` Valentin Schneider
@ 2022-07-25 15:16   ` Vincent Donnefort
  0 siblings, 0 replies; 7+ messages in thread
From: Vincent Donnefort @ 2022-07-25 15:16 UTC
  To: Valentin Schneider
  Cc: peterz, tglx, linux-kernel, regressions, kernel-team, Derek Dolney

On Mon, Jul 25, 2022 at 04:07:47PM +0100, Valentin Schneider wrote:
> On 25/07/22 10:59, Vincent Donnefort wrote:
> > The DYING/STARTING callbacks are not expected to fail. However, as reported
> > by Derek, drivers such as tboot are still free to return errors within
> > those sections, which halts the hot(un)plug and leaves the CPU in an
> > unrecoverable state.
> >
> > No rollback being possible there, let's only log the failures and proceed
> > with the following steps. This restores the hotplug behaviour prior to
> > commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
> > Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> > Reported-by: Derek Dolney <z23@posteo.net>
> > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> > Tested-by: Derek Dolney <z23@posteo.net>
> >
> 
> The changelog has some undesired stowaways below, but regardless:
> Reviewed-by: Valentin Schneider <vschneid@redhat.com>

Arg indeed, my bad. But thanks for the tag!

> 
> > v4 -> v5:
> >    - Remove WARN, only log broken states with pr_warn.
> > v3 -> v4:
> >    - Sorry ... wrong commit description style ...
> > v2 -> v3:
> >    - Tested-by tag.
> >    - Refine commit description.
> >    - Bugzilla link.
> > v1 -> v2:
> >    - Commit message rewording.
> >    - More details in the warnings.
> >    - Some variable renaming
> >
> 


* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-07-25  9:59 [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections Vincent Donnefort
  2022-07-25 15:07 ` Valentin Schneider
@ 2022-08-17  9:46 ` Thorsten Leemhuis
  2022-09-20  9:59   ` Thorsten Leemhuis
  1 sibling, 1 reply; 7+ messages in thread
From: Thorsten Leemhuis @ 2022-08-17  9:46 UTC
  To: Vincent Donnefort, peterz, tglx, Borislav Petkov
  Cc: linux-kernel, vschneid, regressions, kernel-team, Derek Dolney

[CCing boris]

Hi, this is your Linux kernel regression tracker.

On 25.07.22 11:59, Vincent Donnefort wrote:
> The DYING/STARTING callbacks are not expected to fail. However, as reported
> by Derek, drivers such as tboot are still free to return errors within
> those sections, which halts the hot(un)plug and leaves the CPU in an
> unrecoverable state.
> 
> No rollback being possible there, let's only log the failures and proceed
> with the following steps. This restores the hotplug behaviour prior to
> commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
> Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> Reported-by: Derek Dolney <z23@posteo.net>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> Tested-by: Derek Dolney <z23@posteo.net>

What's the status here? Did that patch fixing a regression fall
through the cracks? It looks like nothing has happened for 3 weeks now,
that's why I wondered, but maybe I missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.




* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-08-17  9:46 ` Thorsten Leemhuis
@ 2022-09-20  9:59   ` Thorsten Leemhuis
  2022-09-27 10:05     ` Vincent Donnefort
  0 siblings, 1 reply; 7+ messages in thread
From: Thorsten Leemhuis @ 2022-09-20  9:59 UTC
  To: tglx
  Cc: linux-kernel, vschneid, kernel-team, Derek Dolney,
	Vincent Donnefort, peterz

On 17.08.22 11:46, Thorsten Leemhuis wrote:
> 
> Hi, this is your Linux kernel regression tracker.
> 
> On 25.07.22 11:59, Vincent Donnefort wrote:
>> The DYING/STARTING callbacks are not expected to fail. However, as reported
>> by Derek, drivers such as tboot are still free to return errors within
>> those sections, which halts the hot(un)plug and leaves the CPU in an
>> unrecoverable state.
>>
>> No rollback being possible there, let's only log the failures and proceed
>> with the following steps. This restores the hotplug behaviour prior to
>> commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
>> Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
>> Reported-by: Derek Dolney <z23@posteo.net>
>> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
>> Tested-by: Derek Dolney <z23@posteo.net>
> 
> What's the status here? Did that patch fixing a regression fall
> through the cracks? It looks like nothing has happened for 3 weeks now,
> that's why I wondered, but maybe I missed something.

Hmm, Vincent seems to be MIA, at least I see no recent messages from him
on lore. Odd. But well, it's still a fix for a regression and it's up to
v5 already; Valentin already added his Reviewed-by, too. Would be a
shame to waste this.

Thomas, could you maybe take a look at the patch?  Maybe we're lucky and
the patch is already good to go...

Ciao, Thorsten

#regzbot poke



* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-09-20  9:59   ` Thorsten Leemhuis
@ 2022-09-27 10:05     ` Vincent Donnefort
  2022-09-27 10:20       ` Thorsten Leemhuis
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Donnefort @ 2022-09-27 10:05 UTC
  To: Thorsten Leemhuis
  Cc: tglx, linux-kernel, vschneid, kernel-team, Derek Dolney, peterz

On Tue, Sep 20, 2022 at 11:59:06AM +0200, Thorsten Leemhuis wrote:
> On 17.08.22 11:46, Thorsten Leemhuis wrote:
> > 
> > Hi, this is your Linux kernel regression tracker.
> > 
> > On 25.07.22 11:59, Vincent Donnefort wrote:
> >> The DYING/STARTING callbacks are not expected to fail. However, as reported
> >> by Derek, drivers such as tboot are still free to return errors within
> >> those sections, which halts the hot(un)plug and leaves the CPU in an
> >> unrecoverable state.
> >>
> >> No rollback being possible there, let's only log the failures and proceed
> >> with the following steps. This restores the hotplug behaviour prior to
> >> commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> >>
> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
> >> Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
> >> Reported-by: Derek Dolney <z23@posteo.net>
> >> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> >> Tested-by: Derek Dolney <z23@posteo.net>
> > 
> > What's the status here? Did that patch fixing a regression fall
> > through the cracks? It looks like nothing has happened for 3 weeks now,
> > that's why I wondered, but maybe I missed something.
> 
> Hmm, Vincent seems to be MIA, at least I see no recent messages from him
> on lore. Odd. But well, it's still a fix for a regression and it's up to
> v5 already; Valentin already added his Reviewed-by, too. Would be a
> shame to waste this.
> 
> Thomas, could you maybe take a look at the patch?  Maybe we're lucky and
> the patch is already good to go...
> 
> Ciao, Thorsten
> 
> #regzbot poke

Hi Thorsten,

AFAIK, this patch is still valid. I don't think I have any further action to
take on that, though.

[...]


* Re: [PATCH v5] cpu/hotplug: Do not bail-out in DYING/STARTING sections
  2022-09-27 10:05     ` Vincent Donnefort
@ 2022-09-27 10:20       ` Thorsten Leemhuis
  0 siblings, 0 replies; 7+ messages in thread
From: Thorsten Leemhuis @ 2022-09-27 10:20 UTC
  To: Vincent Donnefort
  Cc: tglx, linux-kernel, vschneid, kernel-team, Derek Dolney, peterz

On 27.09.22 12:05, Vincent Donnefort wrote:
> On Tue, Sep 20, 2022 at 11:59:06AM +0200, Thorsten Leemhuis wrote:
>> On 17.08.22 11:46, Thorsten Leemhuis wrote:
>>>
>>> On 25.07.22 11:59, Vincent Donnefort wrote:
>>>> The DYING/STARTING callbacks are not expected to fail. However, as reported
>>>> by Derek, drivers such as tboot are still free to return errors within
>>>> those sections, which halts the hot(un)plug and leaves the CPU in an
>>>> unrecoverable state.
>>>>
>>>> No rollback being possible there, let's only log the failures and proceed
>>>> with the following steps. This restores the hotplug behaviour prior to
>>>> commit 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
>>>>
>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215867
>>>> Fixes: 453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")
>>>> Reported-by: Derek Dolney <z23@posteo.net>
>>>> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
>>>> Tested-by: Derek Dolney <z23@posteo.net>
>>>
>>> What's the status here? Did that patch fixing a regression fall
>>> through the cracks? It looks like nothing has happened for 3 weeks now,
>>> that's why I wondered, but maybe I missed something.
>>
>> Hmm, Vincent seems to be MIA, at least I see no recent messages from him
>> on lore. Odd. But well, it's still a fix for a regression and it's up to
>> v5 already; Valentin already added his Reviewed-by, too. Would be a
>> shame to waste this.
>>
>> Thomas, could you maybe take a look at the patch?  Maybe we're lucky and
>> the patch is already good to go...
> 
> AFAIK, this patch is still valid.

Great, thx for confirming!

> I don't think I have any further action to take on
> that, though.

Well, it seems in this case someone needs to knock on some doors to get
the maintainers to look at this fix so the regression finally gets
resolved, as it seems they haven't looked closely at the patch, for good
or bad reasons. I hope this mail exchange was enough to get things
rolling again; otherwise, sooner or later, we will have to get Linus
involved. :-/

Ciao, Thorsten

