From: Baoquan He <bhe@redhat.com> To: Dan Carpenter <dan.carpenter@oracle.com> Cc: kernelfans@gmail.com, kexec@lists.infradead.org, linux-pm@vger.kernel.org, dyoung@redhat.com Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend Date: Wed, 20 Jan 2021 17:30:15 +0800 [thread overview] Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw) In-Reply-To: <20201118185917.GA433776@mwanda> Hi, On 11/18/20 at 09:59pm, Dan Carpenter wrote: > Hello Pingfan Liu, > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot > and suspend" from Jul 31, 2018, leads to the following static checker > warning: > > kernel/power/main.c:27 lock_system_sleep() > warn: called with lock held. '&system_transition_mutex' This is a good finding. I think we can simply remove the lock/unlock pair of system_transition_mutex in kernel_kexec() function. The dead lock should be easily triggered, but it hasn't caused any failure report because the feature 'kexec jump' is almost not used by anyone as far as I know. We may need to find out who is using it and where it's used through an inquiry. Before that, we can just remove the lock operation inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks. diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index 80905e5aa8ae..a0b6780740c8 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -1134,7 +1134,6 @@ int kernel_kexec(void) #ifdef CONFIG_KEXEC_JUMP if (kexec_image->preserve_context) { - lock_system_sleep(); pm_prepare_console(); error = freeze_processes(); if (error) { @@ -1197,7 +1196,6 @@ int kernel_kexec(void) thaw_processes(); Restore_console: pm_restore_console(); - unlock_system_sleep(); } #endif > > kernel/reboot.c > 345 > 346 mutex_lock(&system_transition_mutex); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > The patch changed the code to take this lock. > > 347 switch (cmd) { > 348 case LINUX_REBOOT_CMD_RESTART: > 349 kernel_restart(NULL); > 350 break; > 351 > 352 case LINUX_REBOOT_CMD_CAD_ON: > 353 C_A_D = 1; > 354 break; > 355 > 356 case LINUX_REBOOT_CMD_CAD_OFF: > 357 C_A_D = 0; > 358 break; > 359 > 360 case LINUX_REBOOT_CMD_HALT: > 361 kernel_halt(); > 362 do_exit(0); > 363 panic("cannot halt"); > 364 > 365 case LINUX_REBOOT_CMD_POWER_OFF: > 366 kernel_power_off(); > 367 do_exit(0); > 368 break; > 369 > 370 case LINUX_REBOOT_CMD_RESTART2: > 371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1); > 372 if (ret < 0) { > 373 ret = -EFAULT; > 374 break; > 375 } > 376 buffer[sizeof(buffer) - 1] = '\0'; > 377 > 378 kernel_restart(buffer); > 379 break; > 380 > 381 #ifdef CONFIG_KEXEC_CORE > 382 case LINUX_REBOOT_CMD_KEXEC: > 383 ret = kernel_kexec(); > ^^^^^^^^^^^^^^^^^^^^ > Called with lock held. > > 384 break; > 385 #endif > > But kernel_kexec() also tries to take the &system_transition_mutex so > it will dead lock. > > kernel/kexec_core.c > 1125 int kernel_kexec(void) > 1126 { > 1127 int error = 0; > 1128 > 1129 if (!mutex_trylock(&kexec_mutex)) > 1130 return -EBUSY; > 1131 if (!kexec_image) { > 1132 error = -EINVAL; > 1133 goto Unlock; > 1134 } > 1135 > 1136 #ifdef CONFIG_KEXEC_JUMP > 1137 if (kexec_image->preserve_context) { > 1138 lock_system_sleep(); > ^^^^^^^^^^^^^^^^^^^ > Here. > > 1139 pm_prepare_console(); > 1140 error = freeze_processes(); > 1141 if (error) { > 1142 error = -EBUSY; > 1143 goto Restore_console; > 1144 } > 1145 suspend_console(); > 1146 error = dpm_suspend_start(PMSG_FREEZE); > 1147 if (error) > 1148 goto Resume_console; > 1149 /* At this point, dpm_suspend_start() has been called, > 1150 * but *not* dpm_suspend_end(). We *must* call > 1151 * dpm_suspend_end() now. Otherwise, drivers for > 1152 * some devices (e.g. interrupt controllers) become > 1153 * desynchronized with the actual state of the > 1154 * hardware at resume time, and evil weirdness ensues. > 1155 */ > 1156 error = dpm_suspend_end(PMSG_FREEZE); > 1157 if (error) > 1158 goto Resume_devices; > 1159 error = suspend_disable_secondary_cpus(); > 1160 if (error) > > regards, > dan carpenter > > _______________________________________________ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec >
WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com> To: Dan Carpenter <dan.carpenter@oracle.com> Cc: dyoung@redhat.com, kexec@lists.infradead.org, kernelfans@gmail.com, linux-pm@vger.kernel.org Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend Date: Wed, 20 Jan 2021 17:30:15 +0800 [thread overview] Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw) In-Reply-To: <20201118185917.GA433776@mwanda> Hi, On 11/18/20 at 09:59pm, Dan Carpenter wrote: > Hello Pingfan Liu, > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot > and suspend" from Jul 31, 2018, leads to the following static checker > warning: > > kernel/power/main.c:27 lock_system_sleep() > warn: called with lock held. '&system_transition_mutex' This is a good finding. I think we can simply remove the lock/unlock pair of system_transition_mutex in kernel_kexec() function. The dead lock should be easily triggered, but it hasn't caused any failure report because the feature 'kexec jump' is almost not used by anyone as far as I know. We may need to find out who is using it and where it's used through an inquiry. Before that, we can just remove the lock operation inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks. diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index 80905e5aa8ae..a0b6780740c8 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -1134,7 +1134,6 @@ int kernel_kexec(void) #ifdef CONFIG_KEXEC_JUMP if (kexec_image->preserve_context) { - lock_system_sleep(); pm_prepare_console(); error = freeze_processes(); if (error) { @@ -1197,7 +1196,6 @@ int kernel_kexec(void) thaw_processes(); Restore_console: pm_restore_console(); - unlock_system_sleep(); } #endif > > kernel/reboot.c > 345 > 346 mutex_lock(&system_transition_mutex); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > The patch changed the code to take this lock. > > 347 switch (cmd) { > 348 case LINUX_REBOOT_CMD_RESTART: > 349 kernel_restart(NULL); > 350 break; > 351 > 352 case LINUX_REBOOT_CMD_CAD_ON: > 353 C_A_D = 1; > 354 break; > 355 > 356 case LINUX_REBOOT_CMD_CAD_OFF: > 357 C_A_D = 0; > 358 break; > 359 > 360 case LINUX_REBOOT_CMD_HALT: > 361 kernel_halt(); > 362 do_exit(0); > 363 panic("cannot halt"); > 364 > 365 case LINUX_REBOOT_CMD_POWER_OFF: > 366 kernel_power_off(); > 367 do_exit(0); > 368 break; > 369 > 370 case LINUX_REBOOT_CMD_RESTART2: > 371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1); > 372 if (ret < 0) { > 373 ret = -EFAULT; > 374 break; > 375 } > 376 buffer[sizeof(buffer) - 1] = '\0'; > 377 > 378 kernel_restart(buffer); > 379 break; > 380 > 381 #ifdef CONFIG_KEXEC_CORE > 382 case LINUX_REBOOT_CMD_KEXEC: > 383 ret = kernel_kexec(); > ^^^^^^^^^^^^^^^^^^^^ > Called with lock held. > > 384 break; > 385 #endif > > But kernel_kexec() also tries to take the &system_transition_mutex so > it will dead lock. > > kernel/kexec_core.c > 1125 int kernel_kexec(void) > 1126 { > 1127 int error = 0; > 1128 > 1129 if (!mutex_trylock(&kexec_mutex)) > 1130 return -EBUSY; > 1131 if (!kexec_image) { > 1132 error = -EINVAL; > 1133 goto Unlock; > 1134 } > 1135 > 1136 #ifdef CONFIG_KEXEC_JUMP > 1137 if (kexec_image->preserve_context) { > 1138 lock_system_sleep(); > ^^^^^^^^^^^^^^^^^^^ > Here. > > 1139 pm_prepare_console(); > 1140 error = freeze_processes(); > 1141 if (error) { > 1142 error = -EBUSY; > 1143 goto Restore_console; > 1144 } > 1145 suspend_console(); > 1146 error = dpm_suspend_start(PMSG_FREEZE); > 1147 if (error) > 1148 goto Resume_console; > 1149 /* At this point, dpm_suspend_start() has been called, > 1150 * but *not* dpm_suspend_end(). We *must* call > 1151 * dpm_suspend_end() now. Otherwise, drivers for > 1152 * some devices (e.g. interrupt controllers) become > 1153 * desynchronized with the actual state of the > 1154 * hardware at resume time, and evil weirdness ensues. > 1155 */ > 1156 error = dpm_suspend_end(PMSG_FREEZE); > 1157 if (error) > 1158 goto Resume_devices; > 1159 error = suspend_disable_secondary_cpus(); > 1160 if (error) > > regards, > dan carpenter > > _______________________________________________ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec > _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2021-01-20 11:06 UTC|newest] Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-11-18 18:59 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter 2020-11-18 18:59 ` Dan Carpenter 2021-01-20 9:30 ` Baoquan He [this message] 2021-01-20 9:30 ` Baoquan He 2021-01-21 9:10 ` Pingfan Liu 2021-01-21 9:10 ` Pingfan Liu 2021-01-21 14:42 ` Rafael J. Wysocki 2021-01-21 14:42 ` Rafael J. Wysocki 2021-01-22 7:38 ` Baoquan He 2021-01-22 7:38 ` Baoquan He -- strict thread matches above, loose matches on Subject: below -- 2019-11-19 6:18 Dan Carpenter
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210120093015.GE20161@MiWiFi-R3L-srv \ --to=bhe@redhat.com \ --cc=dan.carpenter@oracle.com \ --cc=dyoung@redhat.com \ --cc=kernelfans@gmail.com \ --cc=kexec@lists.infradead.org \ --cc=linux-pm@vger.kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.