From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753447AbcANKJZ (ORCPT ); Thu, 14 Jan 2016 05:09:25 -0500 Received: from mail-wm0-f65.google.com ([74.125.82.65]:36244 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753013AbcANKJS (ORCPT ); Thu, 14 Jan 2016 05:09:18 -0500 Date: Thu, 14 Jan 2016 11:09:13 +0100 From: Ingo Molnar To: Keerthy Cc: Keerthy , linux-kernel@vger.kernel.org, edubezval@gmail.com, grygorii.strashko@ti.com, nm@ti.com, linux-pm@vger.kernel.org, linux-omap@vger.kernel.org, joel@jms.id.au, akpm@linux-foundation.org, linux-arm-kernel@lists.infradead.org, peterz@infradead.org, dyoung@redhat.com, josh@joshtriplett.org, mpe@ellerman.id.au, Thomas Gleixner , Peter Zijlstra Subject: Re: [PATCH v2] reboot: Backup orderly_poweroff Message-ID: <20160114100913.GB15857@gmail.com> References: <1452688405-15087-1-git-send-email-j-keerthy@ti.com> <20160114090520.GA4351@gmail.com> <569767EC.2010704@ti.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <569767EC.2010704@ti.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Keerthy wrote: > Hi Ingo, > > On Thursday 14 January 2016 02:35 PM, Ingo Molnar wrote: > > > >* Keerthy wrote: > > > >>orderly_poweroff is triggered when a graceful shutdown > >>of system is desired. This may be used in many critical states of the > >>kernel such as when subsystems detects conditions such as critical > >>temperature conditions. However, in certain conditions in system > >>boot up sequences like those in the middle of driver probes being > >>initiated, userspace will be unable to power off the system in a clean > >>manner and leaves the system in a critical state. In cases like these, > >>the /sbin/poweroff will return success (having forked off to attempt > >>powering off the system. However, the system overall will fail to > >>completely poweroff (since other modules will be probed) and the system > >>is still functional with no userspace (since that would have shut itself > >>off). > >> > >>However, there is no clean way of detecting such failure of userspace > >>powering off the system. In such scenarios, it is necessary for a backup > >>workqueue to be able to force a shutdown of the system when orderly > >>shutdown is not successful after a configurable time period. > >> > >>Signed-off-by: Keerthy > >>Suggested-by: Eduardo Valentin > >>Reported-by: Nishanth Menon > >>--- > >>Links to previous discussion can be found here: > >> > >>http://www.spinics.net/lists/linux-omap/msg124925.html > >> > >>Boot tested on DRA7. > >> > >>changes in v2: > >> > >> * Changed #ifdef to #if CONFIG_SHUTDOWN_BACKUP_DELAY_MS > >> > >> arch/Kconfig | 7 +++++++ > >> kernel/reboot.c | 23 ++++++++++++++++++----- > >> 2 files changed, 25 insertions(+), 5 deletions(-) > >> > >>Index: linux/arch/Kconfig > >>=================================================================== > >>--- linux.orig/arch/Kconfig 2016-01-11 15:26:07.732173131 +0530 > >>+++ linux/arch/Kconfig 2016-01-11 15:26:07.728173205 +0530 > >>@@ -37,6 +37,18 @@ > >> def_bool y > >> depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !PPC64 > >> > >>+config SHUTDOWN_BACKUP_DELAY_MS > >>+ int "Backup shutdown delay in milli-seconds" > >>+ default 0 > >>+ help > >>+ The number of milliseconds to delay before backup workqueue > >>+ executes attempting to poweroff the system after the > >>+ orderly_poweroff function has failed to complete. > >>+ > >>+ If set to 0, the backup workqueue is not active. The value > >>+ should be conservatively configured based on userspace latencies > >>+ expected for a given system. > > > >I don't really understand this. In what circumstances can a reboot fail? > > > >I think that is what should be fixed: a reboot should never fail, instead of > >introducing some sort of fragile timeout based method. > > Here is the complete description of the scenario which was reported by Nishanth > who encountered the issue. The link has bootlogs and description of the exact > case which led to this patch. > > http://www.spinics.net/lists/linux-omap/msg124923.html it's a reply in the middle of a discussion ... What I managed to decode is that this: static int __orderly_poweroff(bool force) { int ret; ret = run_cmd(poweroff_cmd); if (ret && force) { pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * poweroff asap. Or not even bother syncing if we're doing an * emergency shutdown? */ emergency_sync(); kernel_power_off(); } return ret; } could fail to actually power the system off, if the run_cmd(poweroff_cmd) 'succeeds', but due to a user-space bug it does not actually call the real poweroff system call? Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 From: mingo@kernel.org (Ingo Molnar) Date: Thu, 14 Jan 2016 11:09:13 +0100 Subject: [PATCH v2] reboot: Backup orderly_poweroff In-Reply-To: <569767EC.2010704@ti.com> References: <1452688405-15087-1-git-send-email-j-keerthy@ti.com> <20160114090520.GA4351@gmail.com> <569767EC.2010704@ti.com> Message-ID: <20160114100913.GB15857@gmail.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org * Keerthy wrote: > Hi Ingo, > > On Thursday 14 January 2016 02:35 PM, Ingo Molnar wrote: > > > >* Keerthy wrote: > > > >>orderly_poweroff is triggered when a graceful shutdown > >>of system is desired. This may be used in many critical states of the > >>kernel such as when subsystems detects conditions such as critical > >>temperature conditions. However, in certain conditions in system > >>boot up sequences like those in the middle of driver probes being > >>initiated, userspace will be unable to power off the system in a clean > >>manner and leaves the system in a critical state. In cases like these, > >>the /sbin/poweroff will return success (having forked off to attempt > >>powering off the system. However, the system overall will fail to > >>completely poweroff (since other modules will be probed) and the system > >>is still functional with no userspace (since that would have shut itself > >>off). > >> > >>However, there is no clean way of detecting such failure of userspace > >>powering off the system. In such scenarios, it is necessary for a backup > >>workqueue to be able to force a shutdown of the system when orderly > >>shutdown is not successful after a configurable time period. > >> > >>Signed-off-by: Keerthy > >>Suggested-by: Eduardo Valentin > >>Reported-by: Nishanth Menon > >>--- > >>Links to previous discussion can be found here: > >> > >>http://www.spinics.net/lists/linux-omap/msg124925.html > >> > >>Boot tested on DRA7. > >> > >>changes in v2: > >> > >> * Changed #ifdef to #if CONFIG_SHUTDOWN_BACKUP_DELAY_MS > >> > >> arch/Kconfig | 7 +++++++ > >> kernel/reboot.c | 23 ++++++++++++++++++----- > >> 2 files changed, 25 insertions(+), 5 deletions(-) > >> > >>Index: linux/arch/Kconfig > >>=================================================================== > >>--- linux.orig/arch/Kconfig 2016-01-11 15:26:07.732173131 +0530 > >>+++ linux/arch/Kconfig 2016-01-11 15:26:07.728173205 +0530 > >>@@ -37,6 +37,18 @@ > >> def_bool y > >> depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !PPC64 > >> > >>+config SHUTDOWN_BACKUP_DELAY_MS > >>+ int "Backup shutdown delay in milli-seconds" > >>+ default 0 > >>+ help > >>+ The number of milliseconds to delay before backup workqueue > >>+ executes attempting to poweroff the system after the > >>+ orderly_poweroff function has failed to complete. > >>+ > >>+ If set to 0, the backup workqueue is not active. The value > >>+ should be conservatively configured based on userspace latencies > >>+ expected for a given system. > > > >I don't really understand this. In what circumstances can a reboot fail? > > > >I think that is what should be fixed: a reboot should never fail, instead of > >introducing some sort of fragile timeout based method. > > Here is the complete description of the scenario which was reported by Nishanth > who encountered the issue. The link has bootlogs and description of the exact > case which led to this patch. > > http://www.spinics.net/lists/linux-omap/msg124923.html it's a reply in the middle of a discussion ... What I managed to decode is that this: static int __orderly_poweroff(bool force) { int ret; ret = run_cmd(poweroff_cmd); if (ret && force) { pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * poweroff asap. Or not even bother syncing if we're doing an * emergency shutdown? */ emergency_sync(); kernel_power_off(); } return ret; } could fail to actually power the system off, if the run_cmd(poweroff_cmd) 'succeeds', but due to a user-space bug it does not actually call the real poweroff system call? Thanks, Ingo