All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: kernelfans@gmail.com, kexec@lists.infradead.org,
	linux-pm@vger.kernel.org, dyoung@redhat.com
Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
Date: Wed, 20 Jan 2021 17:30:15 +0800	[thread overview]
Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20201118185917.GA433776@mwanda>

Hi,

On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
> 
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
> 
> 	kernel/power/main.c:27 lock_system_sleep()
> 	warn: called with lock held.  '&system_transition_mutex'

This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks. 


diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
-		lock_system_sleep();
 		pm_prepare_console();
 		error = freeze_processes();
 		if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
 		thaw_processes();
  Restore_console:
 		pm_restore_console();
-		unlock_system_sleep();
 	}
 #endif
 

> 
> kernel/reboot.c
>    345  
>    346          mutex_lock(&system_transition_mutex);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
> 
>    347          switch (cmd) {
>    348          case LINUX_REBOOT_CMD_RESTART:
>    349                  kernel_restart(NULL);
>    350                  break;
>    351  
>    352          case LINUX_REBOOT_CMD_CAD_ON:
>    353                  C_A_D = 1;
>    354                  break;
>    355  
>    356          case LINUX_REBOOT_CMD_CAD_OFF:
>    357                  C_A_D = 0;
>    358                  break;
>    359  
>    360          case LINUX_REBOOT_CMD_HALT:
>    361                  kernel_halt();
>    362                  do_exit(0);
>    363                  panic("cannot halt");
>    364  
>    365          case LINUX_REBOOT_CMD_POWER_OFF:
>    366                  kernel_power_off();
>    367                  do_exit(0);
>    368                  break;
>    369  
>    370          case LINUX_REBOOT_CMD_RESTART2:
>    371                  ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
>    372                  if (ret < 0) {
>    373                          ret = -EFAULT;
>    374                          break;
>    375                  }
>    376                  buffer[sizeof(buffer) - 1] = '\0';
>    377  
>    378                  kernel_restart(buffer);
>    379                  break;
>    380  
>    381  #ifdef CONFIG_KEXEC_CORE
>    382          case LINUX_REBOOT_CMD_KEXEC:
>    383                  ret = kernel_kexec();
>                         ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
> 
>    384                  break;
>    385  #endif
> 
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
> 
> kernel/kexec_core.c
>   1125  int kernel_kexec(void)
>   1126  {
>   1127          int error = 0;
>   1128  
>   1129          if (!mutex_trylock(&kexec_mutex))
>   1130                  return -EBUSY;
>   1131          if (!kexec_image) {
>   1132                  error = -EINVAL;
>   1133                  goto Unlock;
>   1134          }
>   1135  
>   1136  #ifdef CONFIG_KEXEC_JUMP
>   1137          if (kexec_image->preserve_context) {
>   1138                  lock_system_sleep();
>                         ^^^^^^^^^^^^^^^^^^^
> Here.
> 
>   1139                  pm_prepare_console();
>   1140                  error = freeze_processes();
>   1141                  if (error) {
>   1142                          error = -EBUSY;
>   1143                          goto Restore_console;
>   1144                  }
>   1145                  suspend_console();
>   1146                  error = dpm_suspend_start(PMSG_FREEZE);
>   1147                  if (error)
>   1148                          goto Resume_console;
>   1149                  /* At this point, dpm_suspend_start() has been called,
>   1150                   * but *not* dpm_suspend_end(). We *must* call
>   1151                   * dpm_suspend_end() now.  Otherwise, drivers for
>   1152                   * some devices (e.g. interrupt controllers) become
>   1153                   * desynchronized with the actual state of the
>   1154                   * hardware at resume time, and evil weirdness ensues.
>   1155                   */
>   1156                  error = dpm_suspend_end(PMSG_FREEZE);
>   1157                  if (error)
>   1158                          goto Resume_devices;
>   1159                  error = suspend_disable_secondary_cpus();
>   1160                  if (error)
> 
> regards,
> dan carpenter
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 


WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: dyoung@redhat.com, kexec@lists.infradead.org,
	kernelfans@gmail.com, linux-pm@vger.kernel.org
Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
Date: Wed, 20 Jan 2021 17:30:15 +0800	[thread overview]
Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20201118185917.GA433776@mwanda>

Hi,

On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
> 
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
> 
> 	kernel/power/main.c:27 lock_system_sleep()
> 	warn: called with lock held.  '&system_transition_mutex'

This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks. 


diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
-		lock_system_sleep();
 		pm_prepare_console();
 		error = freeze_processes();
 		if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
 		thaw_processes();
  Restore_console:
 		pm_restore_console();
-		unlock_system_sleep();
 	}
 #endif
 

> 
> kernel/reboot.c
>    345  
>    346          mutex_lock(&system_transition_mutex);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
> 
>    347          switch (cmd) {
>    348          case LINUX_REBOOT_CMD_RESTART:
>    349                  kernel_restart(NULL);
>    350                  break;
>    351  
>    352          case LINUX_REBOOT_CMD_CAD_ON:
>    353                  C_A_D = 1;
>    354                  break;
>    355  
>    356          case LINUX_REBOOT_CMD_CAD_OFF:
>    357                  C_A_D = 0;
>    358                  break;
>    359  
>    360          case LINUX_REBOOT_CMD_HALT:
>    361                  kernel_halt();
>    362                  do_exit(0);
>    363                  panic("cannot halt");
>    364  
>    365          case LINUX_REBOOT_CMD_POWER_OFF:
>    366                  kernel_power_off();
>    367                  do_exit(0);
>    368                  break;
>    369  
>    370          case LINUX_REBOOT_CMD_RESTART2:
>    371                  ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
>    372                  if (ret < 0) {
>    373                          ret = -EFAULT;
>    374                          break;
>    375                  }
>    376                  buffer[sizeof(buffer) - 1] = '\0';
>    377  
>    378                  kernel_restart(buffer);
>    379                  break;
>    380  
>    381  #ifdef CONFIG_KEXEC_CORE
>    382          case LINUX_REBOOT_CMD_KEXEC:
>    383                  ret = kernel_kexec();
>                         ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
> 
>    384                  break;
>    385  #endif
> 
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
> 
> kernel/kexec_core.c
>   1125  int kernel_kexec(void)
>   1126  {
>   1127          int error = 0;
>   1128  
>   1129          if (!mutex_trylock(&kexec_mutex))
>   1130                  return -EBUSY;
>   1131          if (!kexec_image) {
>   1132                  error = -EINVAL;
>   1133                  goto Unlock;
>   1134          }
>   1135  
>   1136  #ifdef CONFIG_KEXEC_JUMP
>   1137          if (kexec_image->preserve_context) {
>   1138                  lock_system_sleep();
>                         ^^^^^^^^^^^^^^^^^^^
> Here.
> 
>   1139                  pm_prepare_console();
>   1140                  error = freeze_processes();
>   1141                  if (error) {
>   1142                          error = -EBUSY;
>   1143                          goto Restore_console;
>   1144                  }
>   1145                  suspend_console();
>   1146                  error = dpm_suspend_start(PMSG_FREEZE);
>   1147                  if (error)
>   1148                          goto Resume_console;
>   1149                  /* At this point, dpm_suspend_start() has been called,
>   1150                   * but *not* dpm_suspend_end(). We *must* call
>   1151                   * dpm_suspend_end() now.  Otherwise, drivers for
>   1152                   * some devices (e.g. interrupt controllers) become
>   1153                   * desynchronized with the actual state of the
>   1154                   * hardware at resume time, and evil weirdness ensues.
>   1155                   */
>   1156                  error = dpm_suspend_end(PMSG_FREEZE);
>   1157                  if (error)
>   1158                          goto Resume_devices;
>   1159                  error = suspend_disable_secondary_cpus();
>   1160                  if (error)
> 
> regards,
> dan carpenter
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2021-01-20 11:06 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-18 18:59 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter
2020-11-18 18:59 ` Dan Carpenter
2021-01-20  9:30 ` Baoquan He [this message]
2021-01-20  9:30   ` Baoquan He
2021-01-21  9:10   ` Pingfan Liu
2021-01-21  9:10     ` Pingfan Liu
2021-01-21 14:42     ` Rafael J. Wysocki
2021-01-21 14:42       ` Rafael J. Wysocki
2021-01-22  7:38       ` Baoquan He
2021-01-22  7:38         ` Baoquan He
  -- strict thread matches above, loose matches on Subject: below --
2019-11-19  6:18 Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210120093015.GE20161@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=dan.carpenter@oracle.com \
    --cc=dyoung@redhat.com \
    --cc=kernelfans@gmail.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-pm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.