All of lore.kernel.org
 help / color / mirror / Atom feed
* better oopsing when frozen
@ 2011-07-25  8:43 Oliver Neukum
  0 siblings, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-25  8:43 UTC (permalink / raw)
  To: linux-pm, linux-kernel, rjw

Hi Rafael,

I had a problem with the kernel stopping the machine forever because I got an
oops while tasks were frozen. It seems to me that we should thaw when this
happens. How about this approach?

	Regards
		Oliver

>From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oliver@neukum.org>
Date: Fri, 22 Jul 2011 11:20:19 +0200
Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen

If an oops kills the task suspending or snapshotting
is system, the system is dead because the action is
never completed and the tasks never thawed.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
---
 include/linux/freezer.h |    1 +
 kernel/panic.c          |    2 ++
 kernel/power/process.c  |   11 +++++++++++
 3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 1effc8b..9907cf6 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
 extern void refrigerator(void);
 extern int freeze_processes(void);
 extern void thaw_processes(void);
+extern void thaw_in_oops(void);
 
 static inline int try_to_freeze(void)
 {
diff --git a/kernel/panic.c b/kernel/panic.c
index 6923167..255e662 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/nmi.h>
 #include <linux/dmi.h>
+#include <linux/freezer.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -355,6 +356,7 @@ void oops_exit(void)
 	do_oops_enter_exit();
 	print_oops_end_marker();
 	kmsg_dump(KMSG_DUMP_OOPS);
+	thaw_in_oops();
 }
 
 #ifdef WANT_WARN_ON_SLOWPATH
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 0cf3a27..20994cd 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -22,6 +22,9 @@
  */
 #define TIMEOUT	(20 * HZ)
 
+/* in case we oops while processes are frozen */
+static bool tasks_fozen = false;
+
 static inline int freezable(struct task_struct * p)
 {
 	if ((p == current) ||
@@ -131,6 +134,7 @@ static int try_to_freeze_tasks(bool sig_only)
 			elapsed_csecs % 100);
 	}
 
+	tasks_fozen = (todo == 0);
 	return todo ? -EBUSY : 0;
 }
 
@@ -189,7 +193,14 @@ void thaw_processes(void)
 	thaw_workqueues();
 	thaw_tasks(true);
 	thaw_tasks(false);
+	tasks_fozen = false;
 	schedule();
 	printk("done.\n");
 }
 
+void thaw_in_oops(void)
+{
+	if (tasks_fozen)
+		thaw_processes();
+}
+
-- 
1.7.1


-- 
- - - 
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) 
Maxfeldstraße 5                         
90409 Nürnberg 
Germany 
- - - 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 21:17         ` Oliver Neukum
@ 2011-07-29 21:30           ` Rafael J. Wysocki
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-29 21:30 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Andrew Morton, linux-pm, linux-kernel

On Friday, July 29, 2011, Oliver Neukum wrote:
> Am Freitag, 29. Juli 2011, 23:09:44 schrieb Rafael J. Wysocki:
> > On Friday, July 29, 2011, Oliver Neukum wrote:
> > > Am Freitag, 29. Juli 2011, 22:27:54 schrieb Rafael J. Wysocki:
> > > > However, there's another problem I didn't think of before.  Namely,
> > > > we cannot thaw tasks before resuming devices in case we've already
> > > > suspended them, because that will defeat the very purpose of the
> > > > freezing in the first place.
> > > 
> > > That purpose is already defeated. The machine cannot be deader than dead.
> > 
> > OK, so what exactly is the purpose of the thawing, then?
> 
> We may be lucky enough to get the oops written out to disk. The oops may happen
> in a driver rarely used and be late in the sequence.

OK, so is there any guarantee that we won't corrupt things this way?

Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 21:09       ` [linux-pm] " Rafael J. Wysocki
  2011-07-29 21:17         ` Oliver Neukum
@ 2011-07-29 21:17         ` Oliver Neukum
  1 sibling, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-29 21:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-pm, linux-kernel

Am Freitag, 29. Juli 2011, 23:09:44 schrieb Rafael J. Wysocki:
> On Friday, July 29, 2011, Oliver Neukum wrote:
> > Am Freitag, 29. Juli 2011, 22:27:54 schrieb Rafael J. Wysocki:
> > > However, there's another problem I didn't think of before.  Namely,
> > > we cannot thaw tasks before resuming devices in case we've already
> > > suspended them, because that will defeat the very purpose of the
> > > freezing in the first place.
> > 
> > That purpose is already defeated. The machine cannot be deader than dead.
> 
> OK, so what exactly is the purpose of the thawing, then?

We may be lucky enough to get the oops written out to disk. The oops may happen
in a driver rarely used and be late in the sequence.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 21:04     ` [linux-pm] " Oliver Neukum
@ 2011-07-29 21:09       ` Rafael J. Wysocki
  2011-07-29 21:09       ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-29 21:09 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Andrew Morton, linux-pm, linux-kernel

On Friday, July 29, 2011, Oliver Neukum wrote:
> Am Freitag, 29. Juli 2011, 22:27:54 schrieb Rafael J. Wysocki:
> > However, there's another problem I didn't think of before.  Namely,
> > we cannot thaw tasks before resuming devices in case we've already
> > suspended them, because that will defeat the very purpose of the
> > freezing in the first place.
> 
> That purpose is already defeated. The machine cannot be deader than dead.

OK, so what exactly is the purpose of the thawing, then?

Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 20:27   ` Rafael J. Wysocki
@ 2011-07-29 21:04     ` Oliver Neukum
  2011-07-29 21:04     ` [linux-pm] " Oliver Neukum
  1 sibling, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-29 21:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-pm, linux-kernel

Am Freitag, 29. Juli 2011, 22:27:54 schrieb Rafael J. Wysocki:
> However, there's another problem I didn't think of before.  Namely,
> we cannot thaw tasks before resuming devices in case we've already
> suspended them, because that will defeat the very purpose of the
> freezing in the first place.

That purpose is already defeated. The machine cannot be deader than dead.
We could try to go through the device tree. But the machine just oopsed
quite likely doing just that, so this doesn't seem wise to me.

IMHO if we oops while frozen, chances are the system is going to die.
It certainly dies now. However, we must certainly not thaw if a valid
image is already on disk, lest we corrupt a filesystem.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 20:27   ` Oliver Neukum
@ 2011-07-29 20:29     ` Rafael J. Wysocki
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-29 20:29 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel

On Friday, July 29, 2011, Oliver Neukum wrote:
> Am Freitag, 29. Juli 2011, 21:54:30 schrieb Pavel Machek:
> > On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > > Hi Rafael,
> > > 
> > > I had a problem with the kernel stopping the machine forever because I got an
> > > oops while tasks were frozen. It seems to me that we should thaw when this
> > > happens. How about this approach?
> > 
> > Danger, Will Robinson. You must not write to the filesystems after
> > hibernation started. By thawing, you may do just that.
> > 
> > It should be safe to thaw as long as final suspend signature is not
> > on disk and will not be written there.
> 
> Yes. But this applies only if I use kernel-space hibernation, doesn't it?
> So I need a second flag for the signature being written. Any other problem?

Yes, please see my reply to Pavel.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 19:54 ` [linux-pm] " Pavel Machek
  2011-07-29 20:27   ` Oliver Neukum
  2011-07-29 20:27   ` Oliver Neukum
@ 2011-07-29 20:27   ` Rafael J. Wysocki
  2011-07-29 21:04     ` Oliver Neukum
  2011-07-29 21:04     ` [linux-pm] " Oliver Neukum
  2 siblings, 2 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-29 20:27 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Oliver Neukum, linux-pm, Andrew Morton, linux-kernel

On Friday, July 29, 2011, Pavel Machek wrote:
> On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Danger, Will Robinson. You must not write to the filesystems after
> hibernation started. By thawing, you may do just that.
> 
> It should be safe to thaw as long as final suspend signature is not
> on disk and will not be written there.

This only applies to hibernation and only when we're going to restore
from the image the presumably hasn't been saved yet, right?

However, there's another problem I didn't think of before.  Namely,
we cannot thaw tasks before resuming devices in case we've already
suspended them, because that will defeat the very purpose of the
freezing in the first place.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-29 19:54 ` [linux-pm] " Pavel Machek
  2011-07-29 20:27   ` Oliver Neukum
@ 2011-07-29 20:27   ` Oliver Neukum
  2011-07-29 20:27   ` Rafael J. Wysocki
  2 siblings, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-29 20:27 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-pm, linux-kernel

Am Freitag, 29. Juli 2011, 21:54:30 schrieb Pavel Machek:
> On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Danger, Will Robinson. You must not write to the filesystems after
> hibernation started. By thawing, you may do just that.
> 
> It should be safe to thaw as long as final suspend signature is not
> on disk and will not be written there.

Yes. But this applies only if I use kernel-space hibernation, doesn't it?
So I need a second flag for the signature being written. Any other problem?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-25  8:43 Oliver Neukum
                   ` (4 preceding siblings ...)
  2011-07-29 19:54 ` [linux-pm] " Pavel Machek
@ 2011-07-29 19:54 ` Pavel Machek
  5 siblings, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2011-07-29 19:54 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel

On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?

Danger, Will Robinson. You must not write to the filesystems after
hibernation started. By thawing, you may do just that.

It should be safe to thaw as long as final suspend signature is not
on disk and will not be written there.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-27  9:21     ` Rafael J. Wysocki
@ 2011-07-27 10:23       ` Oliver Neukum
  2011-07-27 10:23       ` Oliver Neukum
  1 sibling, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-27 10:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-pm, linux-kernel

Am Mittwoch, 27. Juli 2011, 11:21:21 schrieb Rafael J. Wysocki:
> > mm...  The patch as proposed is very simple, direct, explicit.  I
> > suspect that trying to embed this operation within some other one would
> > end up producing a less clear result.  Sometimes we do exceptional and
> > weird things, and leaving the code exceptional and weird-looking is
> > better than hiding it in some framework, if you follow what I mean.
> 
> My point is that instead of using the oom_killer_disabled variable in
> page_alloc.c, we could use a oom_killer_disabled() function returning
> the value of the new tasks_are_frozen variable.  Then,
> oom_killer_disable/enable() won't be necessary any more.

Well, if we do this, it'll be a separate change. Not that I disagree. 

	Regards
		Oliver
-- 
- - - 
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) 
Maxfeldstraße 5                         
90409 Nürnberg 
Germany 
- - - 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-27  9:21     ` Rafael J. Wysocki
  2011-07-27 10:23       ` Oliver Neukum
@ 2011-07-27 10:23       ` Oliver Neukum
  1 sibling, 0 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-27 10:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-pm, linux-kernel

Am Mittwoch, 27. Juli 2011, 11:21:21 schrieb Rafael J. Wysocki:
> > mm...  The patch as proposed is very simple, direct, explicit.  I
> > suspect that trying to embed this operation within some other one would
> > end up producing a less clear result.  Sometimes we do exceptional and
> > weird things, and leaving the code exceptional and weird-looking is
> > better than hiding it in some framework, if you follow what I mean.
> 
> My point is that instead of using the oom_killer_disabled variable in
> page_alloc.c, we could use a oom_killer_disabled() function returning
> the value of the new tasks_are_frozen variable.  Then,
> oom_killer_disable/enable() won't be necessary any more.

Well, if we do this, it'll be a separate change. Not that I disagree. 

	Regards
		Oliver
-- 
- - - 
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) 
Maxfeldstraße 5                         
90409 Nürnberg 
Germany 
- - - 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-26 22:45   ` Andrew Morton
  2011-07-27  9:21     ` Rafael J. Wysocki
@ 2011-07-27  9:21     ` Rafael J. Wysocki
  2011-07-27 10:23       ` Oliver Neukum
  2011-07-27 10:23       ` Oliver Neukum
  1 sibling, 2 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-27  9:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oliver Neukum, linux-pm, linux-kernel

On Wednesday, July 27, 2011, Andrew Morton wrote:
> On Wed, 27 Jul 2011 00:24:11 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > Hi Rafael,
> > > 
> > > I had a problem with the kernel stopping the machine forever because I got an
> > > oops while tasks were frozen. It seems to me that we should thaw when this
> > > happens. How about this approach?
> > 
> > Well, we do something like this already for the OOM killer (see
> > oom_killer_disable() and friends), so I think it would be better to
> > simply extend/modify that mechanism instead of adding a new one
> > doing almost exactly the same thing.
> > 
> > I have no complaints about adding thaw_in_oops(), though, so long as
> > Andrew thinks it makes sense.
> 
> mm...  The patch as proposed is very simple, direct, explicit.  I
> suspect that trying to embed this operation within some other one would
> end up producing a less clear result.  Sometimes we do exceptional and
> weird things, and leaving the code exceptional and weird-looking is
> better than hiding it in some framework, if you follow what I mean.

My point is that instead of using the oom_killer_disabled variable in
page_alloc.c, we could use a oom_killer_disabled() function returning
the value of the new tasks_are_frozen variable.  Then,
oom_killer_disable/enable() won't be necessary any more.

> It does need some code comments to explain to people what it's doing
> and more importantly why it's doing it.  Also, something which doesn't
> break the build when CONFIG_FREEZER=n would be nice.

Right.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-26 22:45   ` Andrew Morton
@ 2011-07-27  9:21     ` Rafael J. Wysocki
  2011-07-27  9:21     ` Rafael J. Wysocki
  1 sibling, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-27  9:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oliver Neukum, linux-pm, linux-kernel

On Wednesday, July 27, 2011, Andrew Morton wrote:
> On Wed, 27 Jul 2011 00:24:11 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > Hi Rafael,
> > > 
> > > I had a problem with the kernel stopping the machine forever because I got an
> > > oops while tasks were frozen. It seems to me that we should thaw when this
> > > happens. How about this approach?
> > 
> > Well, we do something like this already for the OOM killer (see
> > oom_killer_disable() and friends), so I think it would be better to
> > simply extend/modify that mechanism instead of adding a new one
> > doing almost exactly the same thing.
> > 
> > I have no complaints about adding thaw_in_oops(), though, so long as
> > Andrew thinks it makes sense.
> 
> mm...  The patch as proposed is very simple, direct, explicit.  I
> suspect that trying to embed this operation within some other one would
> end up producing a less clear result.  Sometimes we do exceptional and
> weird things, and leaving the code exceptional and weird-looking is
> better than hiding it in some framework, if you follow what I mean.

My point is that instead of using the oom_killer_disabled variable in
page_alloc.c, we could use a oom_killer_disabled() function returning
the value of the new tasks_are_frozen variable.  Then,
oom_killer_disable/enable() won't be necessary any more.

> It does need some code comments to explain to people what it's doing
> and more importantly why it's doing it.  Also, something which doesn't
> break the build when CONFIG_FREEZER=n would be nice.

Right.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-26 22:24 ` Rafael J. Wysocki
@ 2011-07-26 22:45   ` Andrew Morton
  2011-07-27  9:21     ` Rafael J. Wysocki
  2011-07-27  9:21     ` Rafael J. Wysocki
  2011-07-26 22:45   ` Andrew Morton
  1 sibling, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2011-07-26 22:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Oliver Neukum, linux-pm, linux-kernel

On Wed, 27 Jul 2011 00:24:11 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Well, we do something like this already for the OOM killer (see
> oom_killer_disable() and friends), so I think it would be better to
> simply extend/modify that mechanism instead of adding a new one
> doing almost exactly the same thing.
> 
> I have no complaints about adding thaw_in_oops(), though, so long as
> Andrew thinks it makes sense.

mm...  The patch as proposed is very simple, direct, explicit.  I
suspect that trying to embed this operation within some other one would
end up producing a less clear result.  Sometimes we do exceptional and
weird things, and leaving the code exceptional and weird-looking is
better than hiding it in some framework, if you follow what I mean.

It does need some code comments to explain to people what it's doing
and more importantly why it's doing it.  Also, something which doesn't
break the build when CONFIG_FREEZER=n would be nice.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-26 22:24 ` Rafael J. Wysocki
  2011-07-26 22:45   ` Andrew Morton
@ 2011-07-26 22:45   ` Andrew Morton
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2011-07-26 22:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Oliver Neukum, linux-pm, linux-kernel

On Wed, 27 Jul 2011 00:24:11 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Well, we do something like this already for the OOM killer (see
> oom_killer_disable() and friends), so I think it would be better to
> simply extend/modify that mechanism instead of adding a new one
> doing almost exactly the same thing.
> 
> I have no complaints about adding thaw_in_oops(), though, so long as
> Andrew thinks it makes sense.

mm...  The patch as proposed is very simple, direct, explicit.  I
suspect that trying to embed this operation within some other one would
end up producing a less clear result.  Sometimes we do exceptional and
weird things, and leaving the code exceptional and weird-looking is
better than hiding it in some framework, if you follow what I mean.

It does need some code comments to explain to people what it's doing
and more importantly why it's doing it.  Also, something which doesn't
break the build when CONFIG_FREEZER=n would be nice.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-25  8:43 Oliver Neukum
                   ` (2 preceding siblings ...)
  2011-07-26 22:24 ` Rafael J. Wysocki
@ 2011-07-26 22:24 ` Rafael J. Wysocki
  2011-07-26 22:45   ` Andrew Morton
  2011-07-26 22:45   ` Andrew Morton
  2011-07-29 19:54 ` [linux-pm] " Pavel Machek
  2011-07-29 19:54 ` Pavel Machek
  5 siblings, 2 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-26 22:24 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Andrew Morton

Hi,

On Monday, July 25, 2011, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?

Well, we do something like this already for the OOM killer (see
oom_killer_disable() and friends), so I think it would be better to
simply extend/modify that mechanism instead of adding a new one
doing almost exactly the same thing.

I have no complaints about adding thaw_in_oops(), though, so long as
Andrew thinks it makes sense.

Thanks,
Rafael 


> From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
> From: Oliver Neukum <oliver@neukum.org>
> Date: Fri, 22 Jul 2011 11:20:19 +0200
> Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen
> 
> If an oops kills the task suspending or snapshotting
> is system, the system is dead because the action is
> never completed and the tasks never thawed.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.de>
> ---
>  include/linux/freezer.h |    1 +
>  kernel/panic.c          |    2 ++
>  kernel/power/process.c  |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 1effc8b..9907cf6 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
>  extern void refrigerator(void);
>  extern int freeze_processes(void);
>  extern void thaw_processes(void);
> +extern void thaw_in_oops(void);
>  
>  static inline int try_to_freeze(void)
>  {
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 6923167..255e662 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -23,6 +23,7 @@
>  #include <linux/init.h>
>  #include <linux/nmi.h>
>  #include <linux/dmi.h>
> +#include <linux/freezer.h>
>  
>  #define PANIC_TIMER_STEP 100
>  #define PANIC_BLINK_SPD 18
> @@ -355,6 +356,7 @@ void oops_exit(void)
>  	do_oops_enter_exit();
>  	print_oops_end_marker();
>  	kmsg_dump(KMSG_DUMP_OOPS);
> +	thaw_in_oops();
>  }
>  
>  #ifdef WANT_WARN_ON_SLOWPATH
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 0cf3a27..20994cd 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -22,6 +22,9 @@
>   */
>  #define TIMEOUT	(20 * HZ)
>  
> +/* in case we oops while processes are frozen */
> +static bool tasks_fozen = false;
> +
>  static inline int freezable(struct task_struct * p)
>  {
>  	if ((p == current) ||
> @@ -131,6 +134,7 @@ static int try_to_freeze_tasks(bool sig_only)
>  			elapsed_csecs % 100);
>  	}
>  
> +	tasks_fozen = (todo == 0);
>  	return todo ? -EBUSY : 0;
>  }
>  
> @@ -189,7 +193,14 @@ void thaw_processes(void)
>  	thaw_workqueues();
>  	thaw_tasks(true);
>  	thaw_tasks(false);
> +	tasks_fozen = false;
>  	schedule();
>  	printk("done.\n");
>  }
>  
> +void thaw_in_oops(void)
> +{
> +	if (tasks_fozen)
> +		thaw_processes();
> +}
> +
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-25  8:43 Oliver Neukum
  2011-07-25 10:10 ` Borislav Petkov
  2011-07-25 10:10 ` Borislav Petkov
@ 2011-07-26 22:24 ` Rafael J. Wysocki
  2011-07-26 22:24 ` Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2011-07-26 22:24 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Andrew Morton

Hi,

On Monday, July 25, 2011, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?

Well, we do something like this already for the OOM killer (see
oom_killer_disable() and friends), so I think it would be better to
simply extend/modify that mechanism instead of adding a new one
doing almost exactly the same thing.

I have no complaints about adding thaw_in_oops(), though, so long as
Andrew thinks it makes sense.

Thanks,
Rafael 


> From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
> From: Oliver Neukum <oliver@neukum.org>
> Date: Fri, 22 Jul 2011 11:20:19 +0200
> Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen
> 
> If an oops kills the task suspending or snapshotting
> is system, the system is dead because the action is
> never completed and the tasks never thawed.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.de>
> ---
>  include/linux/freezer.h |    1 +
>  kernel/panic.c          |    2 ++
>  kernel/power/process.c  |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 1effc8b..9907cf6 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
>  extern void refrigerator(void);
>  extern int freeze_processes(void);
>  extern void thaw_processes(void);
> +extern void thaw_in_oops(void);
>  
>  static inline int try_to_freeze(void)
>  {
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 6923167..255e662 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -23,6 +23,7 @@
>  #include <linux/init.h>
>  #include <linux/nmi.h>
>  #include <linux/dmi.h>
> +#include <linux/freezer.h>
>  
>  #define PANIC_TIMER_STEP 100
>  #define PANIC_BLINK_SPD 18
> @@ -355,6 +356,7 @@ void oops_exit(void)
>  	do_oops_enter_exit();
>  	print_oops_end_marker();
>  	kmsg_dump(KMSG_DUMP_OOPS);
> +	thaw_in_oops();
>  }
>  
>  #ifdef WANT_WARN_ON_SLOWPATH
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 0cf3a27..20994cd 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -22,6 +22,9 @@
>   */
>  #define TIMEOUT	(20 * HZ)
>  
> +/* in case we oops while processes are frozen */
> +static bool tasks_fozen = false;
> +
>  static inline int freezable(struct task_struct * p)
>  {
>  	if ((p == current) ||
> @@ -131,6 +134,7 @@ static int try_to_freeze_tasks(bool sig_only)
>  			elapsed_csecs % 100);
>  	}
>  
> +	tasks_fozen = (todo == 0);
>  	return todo ? -EBUSY : 0;
>  }
>  
> @@ -189,7 +193,14 @@ void thaw_processes(void)
>  	thaw_workqueues();
>  	thaw_tasks(true);
>  	thaw_tasks(false);
> +	tasks_fozen = false;
>  	schedule();
>  	printk("done.\n");
>  }
>  
> +void thaw_in_oops(void)
> +{
> +	if (tasks_fozen)
> +		thaw_processes();
> +}
> +
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-25  8:43 Oliver Neukum
@ 2011-07-25 10:10 ` Borislav Petkov
  2011-07-25 10:10 ` Borislav Petkov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2011-07-25 10:10 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, rjw

On Mon, Jul 25, 2011 at 10:43:19AM +0200, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?
> 
> 	Regards
> 		Oliver
> 
> From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
> From: Oliver Neukum <oliver@neukum.org>
> Date: Fri, 22 Jul 2011 11:20:19 +0200
> Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen
> 
> If an oops kills the task suspending or snapshotting
> is system, the system is dead because the action is
> never completed and the tasks never thawed.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.de>
> ---
>  include/linux/freezer.h |    1 +
>  kernel/panic.c          |    2 ++
>  kernel/power/process.c  |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 1effc8b..9907cf6 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
>  extern void refrigerator(void);
>  extern int freeze_processes(void);
>  extern void thaw_processes(void);
> +extern void thaw_in_oops(void);
>  
>  static inline int try_to_freeze(void)
>  {
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 6923167..255e662 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -23,6 +23,7 @@
>  #include <linux/init.h>
>  #include <linux/nmi.h>
>  #include <linux/dmi.h>
> +#include <linux/freezer.h>
>  
>  #define PANIC_TIMER_STEP 100
>  #define PANIC_BLINK_SPD 18
> @@ -355,6 +356,7 @@ void oops_exit(void)
>  	do_oops_enter_exit();
>  	print_oops_end_marker();
>  	kmsg_dump(KMSG_DUMP_OOPS);
> +	thaw_in_oops();
>  }
>  
>  #ifdef WANT_WARN_ON_SLOWPATH
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 0cf3a27..20994cd 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -22,6 +22,9 @@
>   */
>  #define TIMEOUT	(20 * HZ)
>  
> +/* in case we oops while processes are frozen */
> +static bool tasks_fozen = false;

I think you mean 'tasks_frozen' here :).

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: better oopsing when frozen
  2011-07-25  8:43 Oliver Neukum
  2011-07-25 10:10 ` Borislav Petkov
@ 2011-07-25 10:10 ` Borislav Petkov
  2011-07-26 22:24 ` Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2011-07-25 10:10 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel

On Mon, Jul 25, 2011 at 10:43:19AM +0200, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?
> 
> 	Regards
> 		Oliver
> 
> From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
> From: Oliver Neukum <oliver@neukum.org>
> Date: Fri, 22 Jul 2011 11:20:19 +0200
> Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen
> 
> If an oops kills the task suspending or snapshotting
> is system, the system is dead because the action is
> never completed and the tasks never thawed.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.de>
> ---
>  include/linux/freezer.h |    1 +
>  kernel/panic.c          |    2 ++
>  kernel/power/process.c  |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 1effc8b..9907cf6 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
>  extern void refrigerator(void);
>  extern int freeze_processes(void);
>  extern void thaw_processes(void);
> +extern void thaw_in_oops(void);
>  
>  static inline int try_to_freeze(void)
>  {
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 6923167..255e662 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -23,6 +23,7 @@
>  #include <linux/init.h>
>  #include <linux/nmi.h>
>  #include <linux/dmi.h>
> +#include <linux/freezer.h>
>  
>  #define PANIC_TIMER_STEP 100
>  #define PANIC_BLINK_SPD 18
> @@ -355,6 +356,7 @@ void oops_exit(void)
>  	do_oops_enter_exit();
>  	print_oops_end_marker();
>  	kmsg_dump(KMSG_DUMP_OOPS);
> +	thaw_in_oops();
>  }
>  
>  #ifdef WANT_WARN_ON_SLOWPATH
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 0cf3a27..20994cd 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -22,6 +22,9 @@
>   */
>  #define TIMEOUT	(20 * HZ)
>  
> +/* in case we oops while processes are frozen */
> +static bool tasks_fozen = false;

I think you mean 'tasks_frozen' here :).

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* better oopsing when frozen
@ 2011-07-25  8:43 Oliver Neukum
  2011-07-25 10:10 ` Borislav Petkov
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Oliver Neukum @ 2011-07-25  8:43 UTC (permalink / raw)
  To: linux-pm, linux-kernel, rjw

Hi Rafael,

I had a problem with the kernel stopping the machine forever because I got an
oops while tasks were frozen. It seems to me that we should thaw when this
happens. How about this approach?

	Regards
		Oliver

>From 6f3b5e7a5c7ccf3564bdd2e703eba7eee753ecdc Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oliver@neukum.org>
Date: Fri, 22 Jul 2011 11:20:19 +0200
Subject: [PATCH] unfreeze tasks if an oops happens while tasks are frozen

If an oops kills the task suspending or snapshotting
is system, the system is dead because the action is
never completed and the tasks never thawed.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
---
 include/linux/freezer.h |    1 +
 kernel/panic.c          |    2 ++
 kernel/power/process.c  |   11 +++++++++++
 3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 1effc8b..9907cf6 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
 extern void refrigerator(void);
 extern int freeze_processes(void);
 extern void thaw_processes(void);
+extern void thaw_in_oops(void);
 
 static inline int try_to_freeze(void)
 {
diff --git a/kernel/panic.c b/kernel/panic.c
index 6923167..255e662 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/nmi.h>
 #include <linux/dmi.h>
+#include <linux/freezer.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -355,6 +356,7 @@ void oops_exit(void)
 	do_oops_enter_exit();
 	print_oops_end_marker();
 	kmsg_dump(KMSG_DUMP_OOPS);
+	thaw_in_oops();
 }
 
 #ifdef WANT_WARN_ON_SLOWPATH
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 0cf3a27..20994cd 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -22,6 +22,9 @@
  */
 #define TIMEOUT	(20 * HZ)
 
+/* in case we oops while processes are frozen */
+static bool tasks_fozen = false;
+
 static inline int freezable(struct task_struct * p)
 {
 	if ((p == current) ||
@@ -131,6 +134,7 @@ static int try_to_freeze_tasks(bool sig_only)
 			elapsed_csecs % 100);
 	}
 
+	tasks_fozen = (todo == 0);
 	return todo ? -EBUSY : 0;
 }
 
@@ -189,7 +193,14 @@ void thaw_processes(void)
 	thaw_workqueues();
 	thaw_tasks(true);
 	thaw_tasks(false);
+	tasks_fozen = false;
 	schedule();
 	printk("done.\n");
 }
 
+void thaw_in_oops(void)
+{
+	if (tasks_fozen)
+		thaw_processes();
+}
+
-- 
1.7.1


-- 
- - - 
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) 
Maxfeldstraße 5                         
90409 Nürnberg 
Germany 
- - - 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-07-29 21:30 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-25  8:43 better oopsing when frozen Oliver Neukum
  -- strict thread matches above, loose matches on Subject: below --
2011-07-25  8:43 Oliver Neukum
2011-07-25 10:10 ` Borislav Petkov
2011-07-25 10:10 ` Borislav Petkov
2011-07-26 22:24 ` Rafael J. Wysocki
2011-07-26 22:24 ` Rafael J. Wysocki
2011-07-26 22:45   ` Andrew Morton
2011-07-27  9:21     ` Rafael J. Wysocki
2011-07-27  9:21     ` Rafael J. Wysocki
2011-07-27 10:23       ` Oliver Neukum
2011-07-27 10:23       ` Oliver Neukum
2011-07-26 22:45   ` Andrew Morton
2011-07-29 19:54 ` [linux-pm] " Pavel Machek
2011-07-29 20:27   ` Oliver Neukum
2011-07-29 20:29     ` Rafael J. Wysocki
2011-07-29 20:27   ` Oliver Neukum
2011-07-29 20:27   ` Rafael J. Wysocki
2011-07-29 21:04     ` Oliver Neukum
2011-07-29 21:04     ` [linux-pm] " Oliver Neukum
2011-07-29 21:09       ` Rafael J. Wysocki
2011-07-29 21:09       ` [linux-pm] " Rafael J. Wysocki
2011-07-29 21:17         ` Oliver Neukum
2011-07-29 21:30           ` Rafael J. Wysocki
2011-07-29 21:17         ` Oliver Neukum
2011-07-29 19:54 ` Pavel Machek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.