* [bug report] PM / reboot: Eliminate race between reboot and suspend
@ 2019-11-19 6:18 Dan Carpenter
0 siblings, 0 replies; 6+ messages in thread
From: Dan Carpenter @ 2019-11-19 6:18 UTC (permalink / raw)
To: kernelfans; +Cc: linux-pm
Hello Pingfan Liu,
The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
and suspend" from Jul 31, 2018, leads to the following static checker
warning:
kernel/reboot.c:395 __do_sys_reboot()
error: double unlocked 'system_transition_mutex' (orig line 387)
kernel/reboot.c
310 SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
311 void __user *, arg)
312 {
313 struct pid_namespace *pid_ns = task_active_pid_ns(current);
314 char buffer[256];
315 int ret = 0;
316
317 /* We only trust the superuser with rebooting the system. */
318 if (!ns_capable(pid_ns->user_ns, CAP_SYS_BOOT))
319 return -EPERM;
320
321 /* For safety, we require "magic" arguments. */
322 if (magic1 != LINUX_REBOOT_MAGIC1 ||
323 (magic2 != LINUX_REBOOT_MAGIC2 &&
324 magic2 != LINUX_REBOOT_MAGIC2A &&
325 magic2 != LINUX_REBOOT_MAGIC2B &&
326 magic2 != LINUX_REBOOT_MAGIC2C))
327 return -EINVAL;
328
329 /*
330 * If pid namespaces are enabled and the current task is in a child
331 * pid_namespace, the command is handled by reboot_pid_ns() which will
332 * call do_exit().
333 */
334 ret = reboot_pid_ns(pid_ns, cmd);
335 if (ret)
336 return ret;
337
338 /* Instead of trying to make the power_off code look like
339 * halt when pm_power_off is not set do it the easy way.
340 */
341 if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
342 cmd = LINUX_REBOOT_CMD_HALT;
343
344 mutex_lock(&system_transition_mutex);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This used to be reboot_mutex.
345 switch (cmd) {
346 case LINUX_REBOOT_CMD_RESTART:
347 kernel_restart(NULL);
348 break;
349
350 case LINUX_REBOOT_CMD_CAD_ON:
351 C_A_D = 1;
352 break;
353
354 case LINUX_REBOOT_CMD_CAD_OFF:
355 C_A_D = 0;
356 break;
357
358 case LINUX_REBOOT_CMD_HALT:
359 kernel_halt();
360 do_exit(0);
361 panic("cannot halt");
362
363 case LINUX_REBOOT_CMD_POWER_OFF:
364 kernel_power_off();
365 do_exit(0);
366 break;
367
368 case LINUX_REBOOT_CMD_RESTART2:
369 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
370 if (ret < 0) {
371 ret = -EFAULT;
372 break;
373 }
374 buffer[sizeof(buffer) - 1] = '\0';
375
376 kernel_restart(buffer);
377 break;
378
379 #ifdef CONFIG_KEXEC_CORE
380 case LINUX_REBOOT_CMD_KEXEC:
381 ret = kernel_kexec();
382 break;
383 #endif
384
385 #ifdef CONFIG_HIBERNATION
386 case LINUX_REBOOT_CMD_SW_SUSPEND:
387 ret = hibernate();
^^^^^^^^^^^^^^^^^
The first thing that hibernate() does is call lock_system_sleep() which
use to take the pm_mutex, but now it takes system_transition_mutex so
it will just hang forever.
388 break;
389 #endif
390
391 default:
392 ret = -EINVAL;
393 break;
394 }
395 mutex_unlock(&system_transition_mutex);
396 return ret;
397 }
regards,
dan carpenter
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
2021-01-21 14:42 ` Rafael J. Wysocki
@ 2021-01-22 7:38 ` Baoquan He
0 siblings, 0 replies; 6+ messages in thread
From: Baoquan He @ 2021-01-22 7:38 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Pingfan Liu, Dave Young, Kexec Mailing List, Dan Carpenter, Linux PM
On 01/21/21 at 03:42pm, Rafael J. Wysocki wrote:
> On Thu, Jan 21, 2021 at 10:14 AM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > > > Hello Pingfan Liu,
> > > >
> > > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > > > and suspend" from Jul 31, 2018, leads to the following static checker
> > > > warning:
> > > >
> > > > kernel/power/main.c:27 lock_system_sleep()
> > > > warn: called with lock held. '&system_transition_mutex'
> > >
> > > This is a good finding. I think we can simply remove the lock/unlock
> > > pair of system_transition_mutex in kernel_kexec() function. The dead
> > > lock should be easily triggered, but it hasn't caused any failure report
> > > because the feature 'kexec jump' is almost not used by anyone as far as
> > > I know. We may need to find out who is using it and where it's used
> > > through an inquiry. Before that, we can just remove the lock operation
> > > inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
> > >
> > >
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index 80905e5aa8ae..a0b6780740c8 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
> > >
> > > #ifdef CONFIG_KEXEC_JUMP
> > > if (kexec_image->preserve_context) {
> > > - lock_system_sleep();
> > > pm_prepare_console();
> > > error = freeze_processes();
> > > if (error) {
> > > @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
> > > thaw_processes();
> > > Restore_console:
> > > pm_restore_console();
> > > - unlock_system_sleep();
> >
> > This should work since the only caller syscall_reboot has already
> > placed kernel_kexec() under the protection of system_transition_mutex.
> >
> > Thanks for the fix.
> >
> > Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
>
> OK, so can anyone please submit that patch formally (Cc linux-pm, please)?
I will submit a patch with Pingfan's ack, thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
2021-01-21 9:10 ` Pingfan Liu
@ 2021-01-21 14:42 ` Rafael J. Wysocki
2021-01-22 7:38 ` Baoquan He
0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2021-01-21 14:42 UTC (permalink / raw)
To: Pingfan Liu, Baoquan He
Cc: Dan Carpenter, Kexec Mailing List, Linux PM, Dave Young
On Thu, Jan 21, 2021 at 10:14 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Hi,
> >
> > On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > > Hello Pingfan Liu,
> > >
> > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > > and suspend" from Jul 31, 2018, leads to the following static checker
> > > warning:
> > >
> > > kernel/power/main.c:27 lock_system_sleep()
> > > warn: called with lock held. '&system_transition_mutex'
> >
> > This is a good finding. I think we can simply remove the lock/unlock
> > pair of system_transition_mutex in kernel_kexec() function. The dead
> > lock should be easily triggered, but it hasn't caused any failure report
> > because the feature 'kexec jump' is almost not used by anyone as far as
> > I know. We may need to find out who is using it and where it's used
> > through an inquiry. Before that, we can just remove the lock operation
> > inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
> >
> >
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 80905e5aa8ae..a0b6780740c8 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
> >
> > #ifdef CONFIG_KEXEC_JUMP
> > if (kexec_image->preserve_context) {
> > - lock_system_sleep();
> > pm_prepare_console();
> > error = freeze_processes();
> > if (error) {
> > @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
> > thaw_processes();
> > Restore_console:
> > pm_restore_console();
> > - unlock_system_sleep();
>
> This should work since the only caller syscall_reboot has already
> placed kernel_kexec() under the protection of system_transition_mutex.
>
> Thanks for the fix.
>
> Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
OK, so can anyone please submit that patch formally (Cc linux-pm, please)?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
2021-01-20 9:30 ` Baoquan He
@ 2021-01-21 9:10 ` Pingfan Liu
2021-01-21 14:42 ` Rafael J. Wysocki
0 siblings, 1 reply; 6+ messages in thread
From: Pingfan Liu @ 2021-01-21 9:10 UTC (permalink / raw)
To: Baoquan He; +Cc: Dan Carpenter, Kexec Mailing List, linux-pm, Dave Young
On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
>
> Hi,
>
> On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > Hello Pingfan Liu,
> >
> > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > and suspend" from Jul 31, 2018, leads to the following static checker
> > warning:
> >
> > kernel/power/main.c:27 lock_system_sleep()
> > warn: called with lock held. '&system_transition_mutex'
>
> This is a good finding. I think we can simply remove the lock/unlock
> pair of system_transition_mutex in kernel_kexec() function. The dead
> lock should be easily triggered, but it hasn't caused any failure report
> because the feature 'kexec jump' is almost not used by anyone as far as
> I know. We may need to find out who is using it and where it's used
> through an inquiry. Before that, we can just remove the lock operation
> inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
>
>
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 80905e5aa8ae..a0b6780740c8 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
>
> #ifdef CONFIG_KEXEC_JUMP
> if (kexec_image->preserve_context) {
> - lock_system_sleep();
> pm_prepare_console();
> error = freeze_processes();
> if (error) {
> @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
> thaw_processes();
> Restore_console:
> pm_restore_console();
> - unlock_system_sleep();
This should work since the only caller syscall_reboot has already
placed kernel_kexec() under the protection of system_transition_mutex.
Thanks for the fix.
Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
2020-11-18 18:59 Dan Carpenter
@ 2021-01-20 9:30 ` Baoquan He
2021-01-21 9:10 ` Pingfan Liu
0 siblings, 1 reply; 6+ messages in thread
From: Baoquan He @ 2021-01-20 9:30 UTC (permalink / raw)
To: Dan Carpenter; +Cc: kernelfans, kexec, linux-pm, dyoung
Hi,
On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
>
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
>
> kernel/power/main.c:27 lock_system_sleep()
> warn: called with lock held. '&system_transition_mutex'
This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
#ifdef CONFIG_KEXEC_JUMP
if (kexec_image->preserve_context) {
- lock_system_sleep();
pm_prepare_console();
error = freeze_processes();
if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
thaw_processes();
Restore_console:
pm_restore_console();
- unlock_system_sleep();
}
#endif
>
> kernel/reboot.c
> 345
> 346 mutex_lock(&system_transition_mutex);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
>
> 347 switch (cmd) {
> 348 case LINUX_REBOOT_CMD_RESTART:
> 349 kernel_restart(NULL);
> 350 break;
> 351
> 352 case LINUX_REBOOT_CMD_CAD_ON:
> 353 C_A_D = 1;
> 354 break;
> 355
> 356 case LINUX_REBOOT_CMD_CAD_OFF:
> 357 C_A_D = 0;
> 358 break;
> 359
> 360 case LINUX_REBOOT_CMD_HALT:
> 361 kernel_halt();
> 362 do_exit(0);
> 363 panic("cannot halt");
> 364
> 365 case LINUX_REBOOT_CMD_POWER_OFF:
> 366 kernel_power_off();
> 367 do_exit(0);
> 368 break;
> 369
> 370 case LINUX_REBOOT_CMD_RESTART2:
> 371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
> 372 if (ret < 0) {
> 373 ret = -EFAULT;
> 374 break;
> 375 }
> 376 buffer[sizeof(buffer) - 1] = '\0';
> 377
> 378 kernel_restart(buffer);
> 379 break;
> 380
> 381 #ifdef CONFIG_KEXEC_CORE
> 382 case LINUX_REBOOT_CMD_KEXEC:
> 383 ret = kernel_kexec();
> ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
>
> 384 break;
> 385 #endif
>
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
>
> kernel/kexec_core.c
> 1125 int kernel_kexec(void)
> 1126 {
> 1127 int error = 0;
> 1128
> 1129 if (!mutex_trylock(&kexec_mutex))
> 1130 return -EBUSY;
> 1131 if (!kexec_image) {
> 1132 error = -EINVAL;
> 1133 goto Unlock;
> 1134 }
> 1135
> 1136 #ifdef CONFIG_KEXEC_JUMP
> 1137 if (kexec_image->preserve_context) {
> 1138 lock_system_sleep();
> ^^^^^^^^^^^^^^^^^^^
> Here.
>
> 1139 pm_prepare_console();
> 1140 error = freeze_processes();
> 1141 if (error) {
> 1142 error = -EBUSY;
> 1143 goto Restore_console;
> 1144 }
> 1145 suspend_console();
> 1146 error = dpm_suspend_start(PMSG_FREEZE);
> 1147 if (error)
> 1148 goto Resume_console;
> 1149 /* At this point, dpm_suspend_start() has been called,
> 1150 * but *not* dpm_suspend_end(). We *must* call
> 1151 * dpm_suspend_end() now. Otherwise, drivers for
> 1152 * some devices (e.g. interrupt controllers) become
> 1153 * desynchronized with the actual state of the
> 1154 * hardware at resume time, and evil weirdness ensues.
> 1155 */
> 1156 error = dpm_suspend_end(PMSG_FREEZE);
> 1157 if (error)
> 1158 goto Resume_devices;
> 1159 error = suspend_disable_secondary_cpus();
> 1160 if (error)
>
> regards,
> dan carpenter
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bug report] PM / reboot: Eliminate race between reboot and suspend
@ 2020-11-18 18:59 Dan Carpenter
2021-01-20 9:30 ` Baoquan He
0 siblings, 1 reply; 6+ messages in thread
From: Dan Carpenter @ 2020-11-18 18:59 UTC (permalink / raw)
To: kernelfans; +Cc: linux-pm, kexec
Hello Pingfan Liu,
The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
and suspend" from Jul 31, 2018, leads to the following static checker
warning:
kernel/power/main.c:27 lock_system_sleep()
warn: called with lock held. '&system_transition_mutex'
kernel/reboot.c
345
346 mutex_lock(&system_transition_mutex);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The patch changed the code to take this lock.
347 switch (cmd) {
348 case LINUX_REBOOT_CMD_RESTART:
349 kernel_restart(NULL);
350 break;
351
352 case LINUX_REBOOT_CMD_CAD_ON:
353 C_A_D = 1;
354 break;
355
356 case LINUX_REBOOT_CMD_CAD_OFF:
357 C_A_D = 0;
358 break;
359
360 case LINUX_REBOOT_CMD_HALT:
361 kernel_halt();
362 do_exit(0);
363 panic("cannot halt");
364
365 case LINUX_REBOOT_CMD_POWER_OFF:
366 kernel_power_off();
367 do_exit(0);
368 break;
369
370 case LINUX_REBOOT_CMD_RESTART2:
371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
372 if (ret < 0) {
373 ret = -EFAULT;
374 break;
375 }
376 buffer[sizeof(buffer) - 1] = '\0';
377
378 kernel_restart(buffer);
379 break;
380
381 #ifdef CONFIG_KEXEC_CORE
382 case LINUX_REBOOT_CMD_KEXEC:
383 ret = kernel_kexec();
^^^^^^^^^^^^^^^^^^^^
Called with lock held.
384 break;
385 #endif
But kernel_kexec() also tries to take the &system_transition_mutex so
it will dead lock.
kernel/kexec_core.c
1125 int kernel_kexec(void)
1126 {
1127 int error = 0;
1128
1129 if (!mutex_trylock(&kexec_mutex))
1130 return -EBUSY;
1131 if (!kexec_image) {
1132 error = -EINVAL;
1133 goto Unlock;
1134 }
1135
1136 #ifdef CONFIG_KEXEC_JUMP
1137 if (kexec_image->preserve_context) {
1138 lock_system_sleep();
^^^^^^^^^^^^^^^^^^^
Here.
1139 pm_prepare_console();
1140 error = freeze_processes();
1141 if (error) {
1142 error = -EBUSY;
1143 goto Restore_console;
1144 }
1145 suspend_console();
1146 error = dpm_suspend_start(PMSG_FREEZE);
1147 if (error)
1148 goto Resume_console;
1149 /* At this point, dpm_suspend_start() has been called,
1150 * but *not* dpm_suspend_end(). We *must* call
1151 * dpm_suspend_end() now. Otherwise, drivers for
1152 * some devices (e.g. interrupt controllers) become
1153 * desynchronized with the actual state of the
1154 * hardware at resume time, and evil weirdness ensues.
1155 */
1156 error = dpm_suspend_end(PMSG_FREEZE);
1157 if (error)
1158 goto Resume_devices;
1159 error = suspend_disable_secondary_cpus();
1160 if (error)
regards,
dan carpenter
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-01-22 7:40 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-19 6:18 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter
2020-11-18 18:59 Dan Carpenter
2021-01-20 9:30 ` Baoquan He
2021-01-21 9:10 ` Pingfan Liu
2021-01-21 14:42 ` Rafael J. Wysocki
2021-01-22 7:38 ` Baoquan He
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).