All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: intermittent suspend problem again
@ 2009-10-28 11:43 Ferenc Wagner
  2009-10-28 18:56 ` Rafael J. Wysocki
  0 siblings, 1 reply; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-28 11:43 UTC (permalink / raw)
  To: linux-pm

Hi,

Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
raised its ugly head again, please see my last comments on that bug.
2.6.32-rc5 feels particularly bad, with frequent failures to switch
off the machine after "S|" or freezes after "Snapshotting system".
The former does not cause much trouble in itself, as the machine can
be switched off and resumed all right, but the latter is nasty.
Suspend to RAM works all the time.  The issue is not reproducible,
unfortunately, and the kernel change happened almost together with a
BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
still works stably with the new BIOS.  I'll report back my findings in
a couple of days.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-28 11:43 intermittent suspend problem again Ferenc Wagner
@ 2009-10-28 18:56 ` Rafael J. Wysocki
  2009-10-29  0:11   ` Ferenc Wagner
  0 siblings, 1 reply; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-28 18:56 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm

On Wednesday 28 October 2009, Ferenc Wagner wrote:
> Hi,
> 
> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
> raised its ugly head again, please see my last comments on that bug.

This very well may be a separete bug, so please file a new bugzilla report
on this and mark it as a regression.

> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> off the machine after "S|" or freezes after "Snapshotting system".
> The former does not cause much trouble in itself, as the machine can
> be switched off and resumed all right, but the latter is nasty.
> Suspend to RAM works all the time.  The issue is not reproducible,
> unfortunately, and the kernel change happened almost together with a
> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> still works stably with the new BIOS.  I'll report back my findings in
> a couple of days.

OK, thanks.

Still, I'm really afraid we won't be able to debug it any further without a
reproducible test case.

Best,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-28 18:56 ` Rafael J. Wysocki
@ 2009-10-29  0:11   ` Ferenc Wagner
  2009-10-29 18:36     ` Rafael J. Wysocki
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
  0 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-29  0:11 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> 
>> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
>> raised its ugly head again, please see my last comments on that bug.
>
> This very well may be a separete bug, so please file a new bugzilla report
> on this and mark it as a regression.

Done.

>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>> off the machine after "S|" or freezes after "Snapshotting system".
>> The former does not cause much trouble in itself, as the machine can
>> be switched off and resumed all right, but the latter is nasty.
>> Suspend to RAM works all the time.  The issue is not reproducible,
>> unfortunately, and the kernel change happened almost together with a
>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>> still works stably with the new BIOS.  I'll report back my findings in
>> a couple of days.
>
> OK, thanks.
>
> Still, I'm really afraid we won't be able to debug it any further without a
> reproducible test case.

I've got another, fully reproducible but nevertheless neglected ACPI
problem, already mentioned in #13894:
https://bugs.freedesktop.org/show_bug.cgi?id=22126.  Well, it's
probably far-fetched, but maybe the two are somehow related...  Can't
you perhaps suggest a way forward there?  Or some tricks to create a
reproducible test case here?  Btw. my gut feeling is that hibernation
is getting slower with each kernel release.  I didn't measure it, and
didn't even care about comparable initial states... But could anything
explain this, or is it sheer impatience?
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-29  0:11   ` Ferenc Wagner
  2009-10-29 18:36     ` Rafael J. Wysocki
@ 2009-10-29 18:36     ` Rafael J. Wysocki
  2009-10-29 22:31       ` Ferenc Wagner
                         ` (3 more replies)
  1 sibling, 4 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-29 18:36 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Thursday 29 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 28 October 2009, Ferenc Wagner wrote:
> > 
> >> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
> >> raised its ugly head again, please see my last comments on that bug.
> >
> > This very well may be a separete bug, so please file a new bugzilla report
> > on this and mark it as a regression.
> 
> Done.

Which number is this?

> >> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >> off the machine after "S|" or freezes after "Snapshotting system".
> >> The former does not cause much trouble in itself, as the machine can
> >> be switched off and resumed all right, but the latter is nasty.
> >> Suspend to RAM works all the time.  The issue is not reproducible,
> >> unfortunately, and the kernel change happened almost together with a
> >> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >> still works stably with the new BIOS.  I'll report back my findings in
> >> a couple of days.
> >
> > OK, thanks.
> >
> > Still, I'm really afraid we won't be able to debug it any further without a
> > reproducible test case.
> 
> I've got another, fully reproducible but nevertheless neglected ACPI
> problem, already mentioned in #13894:
> https://bugs.freedesktop.org/show_bug.cgi?id=22126.

A side note: I'm totally unhappy with _kernel_ bugs being handled at
bugs.freedesktop.org without a notice anywhere else.  Even though they are
related to the graphics, the kernel developers in general at least deserve the
information that the bugs have been reported.

In this particulare case, the bug is clearly related to ACPI and linux-acpi
should have received a notification about it.

> Well, it's probably far-fetched, but maybe the two are somehow related...

Very well may be.

> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> reproducible test case here?

Well, you can test if the problem is reproducible in the "shutdown" mode of
hibernation.

> Btw. my gut feeling is that hibernation
> is getting slower with each kernel release.  I didn't measure it, and
> didn't even care about comparable initial states... But could anything
> explain this, or is it sheer impatience?

Which part of it is getting slower?  Saving the image, suspending devices or
the entire hibernation overall?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-29  0:11   ` Ferenc Wagner
@ 2009-10-29 18:36     ` Rafael J. Wysocki
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-29 18:36 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Thursday 29 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 28 October 2009, Ferenc Wagner wrote:
> > 
> >> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
> >> raised its ugly head again, please see my last comments on that bug.
> >
> > This very well may be a separete bug, so please file a new bugzilla report
> > on this and mark it as a regression.
> 
> Done.

Which number is this?

> >> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >> off the machine after "S|" or freezes after "Snapshotting system".
> >> The former does not cause much trouble in itself, as the machine can
> >> be switched off and resumed all right, but the latter is nasty.
> >> Suspend to RAM works all the time.  The issue is not reproducible,
> >> unfortunately, and the kernel change happened almost together with a
> >> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >> still works stably with the new BIOS.  I'll report back my findings in
> >> a couple of days.
> >
> > OK, thanks.
> >
> > Still, I'm really afraid we won't be able to debug it any further without a
> > reproducible test case.
> 
> I've got another, fully reproducible but nevertheless neglected ACPI
> problem, already mentioned in #13894:
> https://bugs.freedesktop.org/show_bug.cgi?id=22126.

A side note: I'm totally unhappy with _kernel_ bugs being handled at
bugs.freedesktop.org without a notice anywhere else.  Even though they are
related to the graphics, the kernel developers in general at least deserve the
information that the bugs have been reported.

In this particulare case, the bug is clearly related to ACPI and linux-acpi
should have received a notification about it.

> Well, it's probably far-fetched, but maybe the two are somehow related...

Very well may be.

> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> reproducible test case here?

Well, you can test if the problem is reproducible in the "shutdown" mode of
hibernation.

> Btw. my gut feeling is that hibernation
> is getting slower with each kernel release.  I didn't measure it, and
> didn't even care about comparable initial states... But could anything
> explain this, or is it sheer impatience?

Which part of it is getting slower?  Saving the image, suspending devices or
the entire hibernation overall?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
  2009-10-29 22:31       ` Ferenc Wagner
@ 2009-10-29 22:31       ` Ferenc Wagner
  2009-10-30 18:18         ` Rafael J. Wysocki
  2009-10-30 18:18         ` [linux-pm] " Rafael J. Wysocki
  2009-11-11 11:29       ` Ferenc Wagner
  2009-11-11 11:29       ` [linux-pm] " Ferenc Wagner
  3 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-29 22:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>> 
>>>> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
>>>> raised its ugly head again, please see my last comments on that bug.
>>>
>>> This very well may be a separete bug, so please file a new bugzilla report
>>> on this and mark it as a regression.
>> 
>> Done.
>
> Which number is this?

http://bugzilla.kernel.org/show_bug.cgi?id=14504
Submitted containing the following paragraph only:

>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>> The former does not cause much trouble in itself, as the machine can
>>>> be switched off and resumed all right, but the latter is nasty.
>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>> unfortunately, and the kernel change happened almost together with a
>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>> a couple of days.
>>>
>>> OK, thanks.
>>>
>>> Still, I'm really afraid we won't be able to debug it any further without a
>>> reproducible test case.
>> 
>> I've got another, fully reproducible but nevertheless neglected ACPI
>> problem, already mentioned in #13894:
>> https://bugs.freedesktop.org/show_bug.cgi?id=22126.
>
> A side note: I'm totally unhappy with _kernel_ bugs being handled at
> bugs.freedesktop.org without a notice anywhere else.  Even though they are
> related to the graphics, the kernel developers in general at least deserve the
> information that the bugs have been reported.
>
> In this particulare case, the bug is clearly related to ACPI and linux-acpi
> should have received a notification about it.

When the ACPI relation became clear to me, I notified linux-acpi, see
http://thread.gmane.org/gmane.linux.acpi.devel/42172/focus=42230

>> Well, it's probably far-fetched, but maybe the two are somehow related...
>
> Very well may be.
>
>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>> reproducible test case here?
>
> Well, you can test if the problem is reproducible in the "shutdown" mode of
> hibernation.

Ok, I'll go back to 2.6.32-rc5 for testing that.  Does that make any
difference in the "Snapshotting system" phase?  Freezes happen that
time, too, before writing out the image.

>> Btw. my gut feeling is that hibernation is getting slower with each
>> kernel release.  I didn't measure it, and didn't even care about
>> comparable initial states... But could anything explain this, or is
>> it sheer impatience?
>
> Which part of it is getting slower?  Saving the image, suspending
> devices or the entire hibernation overall?

"Snapshotting system" before saving the image and saving the image as
well.  If s2disk didn't report funny huge negative ratios all the
time, I'd probably have tried to correlate this with the number of
saved pages or similar...  But anyway, this is a minor nit, it's still
far from being unbearable.  If only it worked all the time!
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
@ 2009-10-29 22:31       ` Ferenc Wagner
  2009-10-29 22:31       ` [linux-pm] " Ferenc Wagner
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-29 22:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>> 
>>>> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
>>>> raised its ugly head again, please see my last comments on that bug.
>>>
>>> This very well may be a separete bug, so please file a new bugzilla report
>>> on this and mark it as a regression.
>> 
>> Done.
>
> Which number is this?

http://bugzilla.kernel.org/show_bug.cgi?id=14504
Submitted containing the following paragraph only:

>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>> The former does not cause much trouble in itself, as the machine can
>>>> be switched off and resumed all right, but the latter is nasty.
>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>> unfortunately, and the kernel change happened almost together with a
>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>> a couple of days.
>>>
>>> OK, thanks.
>>>
>>> Still, I'm really afraid we won't be able to debug it any further without a
>>> reproducible test case.
>> 
>> I've got another, fully reproducible but nevertheless neglected ACPI
>> problem, already mentioned in #13894:
>> https://bugs.freedesktop.org/show_bug.cgi?id=22126.
>
> A side note: I'm totally unhappy with _kernel_ bugs being handled at
> bugs.freedesktop.org without a notice anywhere else.  Even though they are
> related to the graphics, the kernel developers in general at least deserve the
> information that the bugs have been reported.
>
> In this particulare case, the bug is clearly related to ACPI and linux-acpi
> should have received a notification about it.

When the ACPI relation became clear to me, I notified linux-acpi, see
http://thread.gmane.org/gmane.linux.acpi.devel/42172/focus=42230

>> Well, it's probably far-fetched, but maybe the two are somehow related...
>
> Very well may be.
>
>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>> reproducible test case here?
>
> Well, you can test if the problem is reproducible in the "shutdown" mode of
> hibernation.

Ok, I'll go back to 2.6.32-rc5 for testing that.  Does that make any
difference in the "Snapshotting system" phase?  Freezes happen that
time, too, before writing out the image.

>> Btw. my gut feeling is that hibernation is getting slower with each
>> kernel release.  I didn't measure it, and didn't even care about
>> comparable initial states... But could anything explain this, or is
>> it sheer impatience?
>
> Which part of it is getting slower?  Saving the image, suspending
> devices or the entire hibernation overall?

"Snapshotting system" before saving the image and saving the image as
well.  If s2disk didn't report funny huge negative ratios all the
time, I'd probably have tried to correlate this with the number of
saved pages or similar...  But anyway, this is a minor nit, it's still
far from being unbearable.  If only it worked all the time!
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-29 22:31       ` [linux-pm] " Ferenc Wagner
  2009-10-30 18:18         ` Rafael J. Wysocki
@ 2009-10-30 18:18         ` Rafael J. Wysocki
  2009-10-30 19:03             ` [linux-pm] " Ferenc Wagner
  1 sibling, 1 reply; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-30 18:18 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Thursday 29 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>> 
> >>>> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
> >>>> raised its ugly head again, please see my last comments on that bug.
> >>>
> >>> This very well may be a separete bug, so please file a new bugzilla report
> >>> on this and mark it as a regression.
> >> 
> >> Done.
> >
> > Which number is this?
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=14504

Thanks.

> Submitted containing the following paragraph only:

That should be sufficient.

> >>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>> The former does not cause much trouble in itself, as the machine can
> >>>> be switched off and resumed all right, but the latter is nasty.
> >>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>> unfortunately, and the kernel change happened almost together with a
> >>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>> a couple of days.
> >>>
> >>> OK, thanks.
> >>>
> >>> Still, I'm really afraid we won't be able to debug it any further without a
> >>> reproducible test case.
> >> 
> >> I've got another, fully reproducible but nevertheless neglected ACPI
> >> problem, already mentioned in #13894:
> >> https://bugs.freedesktop.org/show_bug.cgi?id=22126.
> >
> > A side note: I'm totally unhappy with _kernel_ bugs being handled at
> > bugs.freedesktop.org without a notice anywhere else.  Even though they are
> > related to the graphics, the kernel developers in general at least deserve the
> > information that the bugs have been reported.
> >
> > In this particulare case, the bug is clearly related to ACPI and linux-acpi
> > should have received a notification about it.
> 
> When the ACPI relation became clear to me, I notified linux-acpi, see
> http://thread.gmane.org/gmane.linux.acpi.devel/42172/focus=42230

OK, thanks.

> >> Well, it's probably far-fetched, but maybe the two are somehow related...
> >
> > Very well may be.
> >
> >> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >> reproducible test case here?
> >
> > Well, you can test if the problem is reproducible in the "shutdown" mode of
> > hibernation.
> 
> Ok, I'll go back to 2.6.32-rc5 for testing that.  Does that make any
> difference in the "Snapshotting system" phase?

Yes, it does.

> Freezes happen that time, too, before writing out the image.
> 
> >> Btw. my gut feeling is that hibernation is getting slower with each
> >> kernel release.  I didn't measure it, and didn't even care about
> >> comparable initial states... But could anything explain this, or is
> >> it sheer impatience?
> >
> > Which part of it is getting slower?  Saving the image, suspending
> > devices or the entire hibernation overall?
> 
> "Snapshotting system" before saving the image

That may be a result of changing the way in which image memory is reserved.
How much memory is there in your machine?

> and saving the image as well.  If s2disk didn't report funny huge negative
> ratios all the time,

Hmm.  This looks like a bug in s2disk.

> I'd probably have tried to correlate this with the number of
> saved pages or similar...  But anyway, this is a minor nit, it's still
> far from being unbearable.  If only it worked all the time!

It should.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-29 22:31       ` [linux-pm] " Ferenc Wagner
@ 2009-10-30 18:18         ` Rafael J. Wysocki
  2009-10-30 18:18         ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-30 18:18 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Thursday 29 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>> 
> >>>> Something similar to http://bugzilla.kernel.org/show_bug.cgi?id=13894
> >>>> raised its ugly head again, please see my last comments on that bug.
> >>>
> >>> This very well may be a separete bug, so please file a new bugzilla report
> >>> on this and mark it as a regression.
> >> 
> >> Done.
> >
> > Which number is this?
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=14504

Thanks.

> Submitted containing the following paragraph only:

That should be sufficient.

> >>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>> The former does not cause much trouble in itself, as the machine can
> >>>> be switched off and resumed all right, but the latter is nasty.
> >>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>> unfortunately, and the kernel change happened almost together with a
> >>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>> a couple of days.
> >>>
> >>> OK, thanks.
> >>>
> >>> Still, I'm really afraid we won't be able to debug it any further without a
> >>> reproducible test case.
> >> 
> >> I've got another, fully reproducible but nevertheless neglected ACPI
> >> problem, already mentioned in #13894:
> >> https://bugs.freedesktop.org/show_bug.cgi?id=22126.
> >
> > A side note: I'm totally unhappy with _kernel_ bugs being handled at
> > bugs.freedesktop.org without a notice anywhere else.  Even though they are
> > related to the graphics, the kernel developers in general at least deserve the
> > information that the bugs have been reported.
> >
> > In this particulare case, the bug is clearly related to ACPI and linux-acpi
> > should have received a notification about it.
> 
> When the ACPI relation became clear to me, I notified linux-acpi, see
> http://thread.gmane.org/gmane.linux.acpi.devel/42172/focus=42230

OK, thanks.

> >> Well, it's probably far-fetched, but maybe the two are somehow related...
> >
> > Very well may be.
> >
> >> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >> reproducible test case here?
> >
> > Well, you can test if the problem is reproducible in the "shutdown" mode of
> > hibernation.
> 
> Ok, I'll go back to 2.6.32-rc5 for testing that.  Does that make any
> difference in the "Snapshotting system" phase?

Yes, it does.

> Freezes happen that time, too, before writing out the image.
> 
> >> Btw. my gut feeling is that hibernation is getting slower with each
> >> kernel release.  I didn't measure it, and didn't even care about
> >> comparable initial states... But could anything explain this, or is
> >> it sheer impatience?
> >
> > Which part of it is getting slower?  Saving the image, suspending
> > devices or the entire hibernation overall?
> 
> "Snapshotting system" before saving the image

That may be a result of changing the way in which image memory is reserved.
How much memory is there in your machine?

> and saving the image as well.  If s2disk didn't report funny huge negative
> ratios all the time,

Hmm.  This looks like a bug in s2disk.

> I'd probably have tried to correlate this with the number of
> saved pages or similar...  But anyway, this is a minor nit, it's still
> far from being unbearable.  If only it worked all the time!

It should.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-30 18:18         ` [linux-pm] " Rafael J. Wysocki
@ 2009-10-30 19:03             ` Ferenc Wagner
  0 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-30 19:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> Which part of it is getting slower?  Saving the image, suspending
>>> devices or the entire hibernation overall?
>> 
>> "Snapshotting system" before saving the image
>
> That may be a result of changing the way in which image memory is reserved.
> How much memory is there in your machine?

512 MB.

>> and saving the image as well.  If s2disk didn't report funny huge negative
>> ratios all the time,
>
> Hmm.  This looks like a bug in s2disk.

Definitely.  Do you also experience this?  Probably an easy one, but
I've never had the chance to check the CVS version (running 0.7 at the
moment).  I can probably give 0.8 a spin if you deem necessary.  I
always thought it wasn't more than a cosmetic flaw.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
@ 2009-10-30 19:03             ` Ferenc Wagner
  0 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-30 19:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> Which part of it is getting slower?  Saving the image, suspending
>>> devices or the entire hibernation overall?
>> 
>> "Snapshotting system" before saving the image
>
> That may be a result of changing the way in which image memory is reserved.
> How much memory is there in your machine?

512 MB.

>> and saving the image as well.  If s2disk didn't report funny huge negative
>> ratios all the time,
>
> Hmm.  This looks like a bug in s2disk.

Definitely.  Do you also experience this?  Probably an easy one, but
I've never had the chance to check the CVS version (running 0.7 at the
moment).  I can probably give 0.8 a spin if you deem necessary.  I
always thought it wasn't more than a cosmetic flaw.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-30 19:03             ` [linux-pm] " Ferenc Wagner
  (?)
  (?)
@ 2009-10-30 20:38             ` Rafael J. Wysocki
  2009-10-31 12:02               ` Alan Jenkins
  2009-10-31 12:02               ` [linux-pm] " Alan Jenkins
  -1 siblings, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-30 20:38 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Friday 30 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> Which part of it is getting slower?  Saving the image, suspending
> >>> devices or the entire hibernation overall?
> >> 
> >> "Snapshotting system" before saving the image
> >
> > That may be a result of changing the way in which image memory is reserved.
> > How much memory is there in your machine?
> 
> 512 MB.

So it's likely the slowdown results from the memory management rework.
Hopefully, it'll improve in the future.

> >> and saving the image as well.  If s2disk didn't report funny huge negative
> >> ratios all the time,
> >
> > Hmm.  This looks like a bug in s2disk.
> 
> Definitely.  Do you also experience this?

Not really, but I use newer versions.

> Probably an easy one, but I've never had the chance to check the CVS version
> (running 0.7 at the moment).  I can probably give 0.8 a spin if you deem
> necessary.  I always thought it wasn't more than a cosmetic flaw.

It probably is.  You can try the current version from my git tree at:
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-30 19:03             ` [linux-pm] " Ferenc Wagner
  (?)
@ 2009-10-30 20:38             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-30 20:38 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Friday 30 October 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> Which part of it is getting slower?  Saving the image, suspending
> >>> devices or the entire hibernation overall?
> >> 
> >> "Snapshotting system" before saving the image
> >
> > That may be a result of changing the way in which image memory is reserved.
> > How much memory is there in your machine?
> 
> 512 MB.

So it's likely the slowdown results from the memory management rework.
Hopefully, it'll improve in the future.

> >> and saving the image as well.  If s2disk didn't report funny huge negative
> >> ratios all the time,
> >
> > Hmm.  This looks like a bug in s2disk.
> 
> Definitely.  Do you also experience this?

Not really, but I use newer versions.

> Probably an easy one, but I've never had the chance to check the CVS version
> (running 0.7 at the moment).  I can probably give 0.8 a spin if you deem
> necessary.  I always thought it wasn't more than a cosmetic flaw.

It probably is.  You can try the current version from my git tree at:
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-30 20:38             ` [linux-pm] " Rafael J. Wysocki
  2009-10-31 12:02               ` Alan Jenkins
@ 2009-10-31 12:02               ` Alan Jenkins
  2009-10-31 14:06                 ` Ferenc Wagner
  2009-10-31 14:06                 ` [linux-pm] " Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Alan Jenkins @ 2009-10-31 12:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ferenc Wagner, linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao,
	LKML, ACPI Devel Maling List, Len Brown

On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday 30 October 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>
>> > On Thursday 29 October 2009, Ferenc Wagner wrote:
>> >
>> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:

>> >> and saving the image as well.  If s2disk didn't report funny huge
>> >> negative
>> >> ratios all the time,
>> >
>> > Hmm.  This looks like a bug in s2disk.
>>
>> Definitely.  Do you also experience this?
>
> Not really, but I use newer versions.
>
>> Probably an easy one, but I've never had the chance to check the CVS
>> version
>> (running 0.7 at the moment).  I can probably give 0.8 a spin if you deem
>> necessary.  I always thought it wasn't more than a cosmetic flaw.
>
> It probably is.  You can try the current version from my git tree at:
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
>
> Thanks,
> Rafael

I seem to recall reporting this and finding that the latest version
fixed the bug simply by removing the code which printed the ratio :-).

Regards
Alan

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-30 20:38             ` [linux-pm] " Rafael J. Wysocki
@ 2009-10-31 12:02               ` Alan Jenkins
  2009-10-31 12:02               ` [linux-pm] " Alan Jenkins
  1 sibling, 0 replies; 88+ messages in thread
From: Alan Jenkins @ 2009-10-31 12:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Ferenc Wagner, Andrew Morton

On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday 30 October 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>
>> > On Thursday 29 October 2009, Ferenc Wagner wrote:
>> >
>> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:

>> >> and saving the image as well.  If s2disk didn't report funny huge
>> >> negative
>> >> ratios all the time,
>> >
>> > Hmm.  This looks like a bug in s2disk.
>>
>> Definitely.  Do you also experience this?
>
> Not really, but I use newer versions.
>
>> Probably an easy one, but I've never had the chance to check the CVS
>> version
>> (running 0.7 at the moment).  I can probably give 0.8 a spin if you deem
>> necessary.  I always thought it wasn't more than a cosmetic flaw.
>
> It probably is.  You can try the current version from my git tree at:
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
>
> Thanks,
> Rafael

I seem to recall reporting this and finding that the latest version
fixed the bug simply by removing the code which printed the ratio :-).

Regards
Alan

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-31 12:02               ` [linux-pm] " Alan Jenkins
  2009-10-31 14:06                 ` Ferenc Wagner
@ 2009-10-31 14:06                 ` Ferenc Wagner
  2009-10-31 19:11                   ` Rafael J. Wysocki
  2009-10-31 19:11                   ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-31 14:06 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Rafael J. Wysocki, linux-pm, Jesse Barnes, Andrew Morton,
	yakui.zhao, LKML, ACPI Devel Maling List, Len Brown

Alan Jenkins <sourcejedi.lkml@googlemail.com> writes:

> On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Friday 30 October 2009, Ferenc Wagner wrote:
>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>
>>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
>>>>
>>>>> and saving the image as well.  If s2disk didn't report funny huge
>>>>> negative ratios all the time,
>>>>
>>>> Hmm.  This looks like a bug in s2disk.
>>>
>>> Definitely.  Do you also experience this?
>>
>> Not really, but I use newer versions.
>>
>>> Probably an easy one, but I've never had the chance to check the CVS
>>> version (running 0.7 at the moment).  I can probably give 0.8 a spin
>>> if you deem necessary.  I always thought it wasn't more than a
>>> cosmetic flaw.
>>
>> It probably is.  You can try the current version from my git tree at:
>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
>
> I seem to recall reporting this and finding that the latest version
> fixed the bug simply by removing the code which printed the ratio :-).

Heh, maybe, but the version compiled from Rafael's git tree prints the
ratio all right, and its value is even positive and less than 1.  So I
confirm that the ratio issue is fixed.  I'll be running with this
version from now on, and if it exhibits the same original issue, I'll
switch to shutdown mode and gather some experience running that.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-31 12:02               ` [linux-pm] " Alan Jenkins
@ 2009-10-31 14:06                 ` Ferenc Wagner
  2009-10-31 14:06                 ` [linux-pm] " Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-10-31 14:06 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: yakui.zhao, LKML, Jesse Barnes, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Alan Jenkins <sourcejedi.lkml@googlemail.com> writes:

> On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Friday 30 October 2009, Ferenc Wagner wrote:
>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>
>>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
>>>>
>>>>> and saving the image as well.  If s2disk didn't report funny huge
>>>>> negative ratios all the time,
>>>>
>>>> Hmm.  This looks like a bug in s2disk.
>>>
>>> Definitely.  Do you also experience this?
>>
>> Not really, but I use newer versions.
>>
>>> Probably an easy one, but I've never had the chance to check the CVS
>>> version (running 0.7 at the moment).  I can probably give 0.8 a spin
>>> if you deem necessary.  I always thought it wasn't more than a
>>> cosmetic flaw.
>>
>> It probably is.  You can try the current version from my git tree at:
>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
>
> I seem to recall reporting this and finding that the latest version
> fixed the bug simply by removing the code which printed the ratio :-).

Heh, maybe, but the version compiled from Rafael's git tree prints the
ratio all right, and its value is even positive and less than 1.  So I
confirm that the ratio issue is fixed.  I'll be running with this
version from now on, and if it exhibits the same original issue, I'll
switch to shutdown mode and gather some experience running that.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-31 14:06                 ` [linux-pm] " Ferenc Wagner
  2009-10-31 19:11                   ` Rafael J. Wysocki
@ 2009-10-31 19:11                   ` Rafael J. Wysocki
  2009-11-01 21:53                     ` Ferenc Wagner
  2009-11-01 21:53                     ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-31 19:11 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: Alan Jenkins, linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao,
	LKML, ACPI Devel Maling List, Len Brown

On Saturday 31 October 2009, Ferenc Wagner wrote:
> Alan Jenkins <sourcejedi.lkml@googlemail.com> writes:
> 
> > On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> On Friday 30 October 2009, Ferenc Wagner wrote:
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>
> >>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> and saving the image as well.  If s2disk didn't report funny huge
> >>>>> negative ratios all the time,
> >>>>
> >>>> Hmm.  This looks like a bug in s2disk.
> >>>
> >>> Definitely.  Do you also experience this?
> >>
> >> Not really, but I use newer versions.
> >>
> >>> Probably an easy one, but I've never had the chance to check the CVS
> >>> version (running 0.7 at the moment).  I can probably give 0.8 a spin
> >>> if you deem necessary.  I always thought it wasn't more than a
> >>> cosmetic flaw.
> >>
> >> It probably is.  You can try the current version from my git tree at:
> >> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
> >
> > I seem to recall reporting this and finding that the latest version
> > fixed the bug simply by removing the code which printed the ratio :-).
> 
> Heh, maybe, but the version compiled from Rafael's git tree prints the
> ratio all right, and its value is even positive and less than 1.  So I
> confirm that the ratio issue is fixed.  I'll be running with this
> version from now on, and if it exhibits the same original issue, I'll
> switch to shutdown mode and gather some experience running that.

Well, the problem you reported is a kernel issue and switching to the newer
user space is not likely to help.

Best,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-31 14:06                 ` [linux-pm] " Ferenc Wagner
@ 2009-10-31 19:11                   ` Rafael J. Wysocki
  2009-10-31 19:11                   ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-10-31 19:11 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Alan Jenkins, Andrew Morton

On Saturday 31 October 2009, Ferenc Wagner wrote:
> Alan Jenkins <sourcejedi.lkml@googlemail.com> writes:
> 
> > On 10/30/09, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> On Friday 30 October 2009, Ferenc Wagner wrote:
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>
> >>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> and saving the image as well.  If s2disk didn't report funny huge
> >>>>> negative ratios all the time,
> >>>>
> >>>> Hmm.  This looks like a bug in s2disk.
> >>>
> >>> Definitely.  Do you also experience this?
> >>
> >> Not really, but I use newer versions.
> >>
> >>> Probably an easy one, but I've never had the chance to check the CVS
> >>> version (running 0.7 at the moment).  I can probably give 0.8 a spin
> >>> if you deem necessary.  I always thought it wasn't more than a
> >>> cosmetic flaw.
> >>
> >> It probably is.  You can try the current version from my git tree at:
> >> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-utils.git
> >
> > I seem to recall reporting this and finding that the latest version
> > fixed the bug simply by removing the code which printed the ratio :-).
> 
> Heh, maybe, but the version compiled from Rafael's git tree prints the
> ratio all right, and its value is even positive and less than 1.  So I
> confirm that the ratio issue is fixed.  I'll be running with this
> version from now on, and if it exhibits the same original issue, I'll
> switch to shutdown mode and gather some experience running that.

Well, the problem you reported is a kernel issue and switching to the newer
user space is not likely to help.

Best,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-31 19:11                   ` [linux-pm] " Rafael J. Wysocki
@ 2009-11-01 21:53                     ` Ferenc Wagner
  2009-11-03 11:02                       ` Ferenc Wagner
  2009-11-03 11:02                       ` Ferenc Wagner
  2009-11-01 21:53                     ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-01 21:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 31 October 2009, Ferenc Wagner wrote:
>
>> Heh, maybe, but the version compiled from Rafael's git tree prints the
>> ratio all right, and its value is even positive and less than 1.  So I
>> confirm that the ratio issue is fixed.  I'll be running with this
>> version from now on, and if it exhibits the same original issue, I'll
>> switch to shutdown mode and gather some experience running that.
>
> Well, the problem you reported is a kernel issue and switching to the newer
> user space is not likely to help.

Well, I haven't had my hopes overly high, but wanted to have a concrete
baseline.  So: with the new s2disk I got a freeze again after S|.  After
a manual power off and a successful resume, I switched to shutdown mode
and hibernated again, and got the exact same freeze (apart from a
slightly different image size).  Power off, resume, switch to reboot
mode, hibernate, and this worked.  Switch back to shutdown, now that
worked as well...  Eh.  In my earlier bug report I think I noted that
after such a hibernation failure a straight shutdown didn't power off
the computer as it otherwise does, which feels consistent with the
above.

With the uswsusp 0.7, a typical freeze looked like this:

s2disk: Snapshotting system
s2disk: System snapshot ready. Preparing to write
s2disk: Image size: 240872 kilobytes
s2disk: Free swap: 1333596 kilobytes
s2disk: Saving 60217 image data pages (press backspace to abort) ... 100% done (60217 pages)
s2disk: Compression ratio -63208.85
S|

With the new version the ratio is 0.42 with similar numbers, which
sounds sane at least.  However, 60217 * 4 = 240868 = 240872 - 4, wasn't
the number of saved pages one off?  The new version seems to get this
right, though.

Now something else, which may or may not be related.  I supervise a
computing farm running a very old OS: Debian Sarge.  The kernel was
somewhat newer: 2.6.24 until recently, when new machines arrived to the
lab, which couldn't boot that 2.6.24 kernel.  So I upgraded to 2.6.31,
which works quite well, apart from one thing: halt doesn't power off the
machines anymore.  All the same under 2.6.32-rc5: they simply freeze
after reaching halt -d -f -i -h -p in the shutdown sequence.

I'm pretty much stumped here, but will try to get some SysRq dumps out
of these machines at least.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-31 19:11                   ` [linux-pm] " Rafael J. Wysocki
  2009-11-01 21:53                     ` Ferenc Wagner
@ 2009-11-01 21:53                     ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-01 21:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 31 October 2009, Ferenc Wagner wrote:
>
>> Heh, maybe, but the version compiled from Rafael's git tree prints the
>> ratio all right, and its value is even positive and less than 1.  So I
>> confirm that the ratio issue is fixed.  I'll be running with this
>> version from now on, and if it exhibits the same original issue, I'll
>> switch to shutdown mode and gather some experience running that.
>
> Well, the problem you reported is a kernel issue and switching to the newer
> user space is not likely to help.

Well, I haven't had my hopes overly high, but wanted to have a concrete
baseline.  So: with the new s2disk I got a freeze again after S|.  After
a manual power off and a successful resume, I switched to shutdown mode
and hibernated again, and got the exact same freeze (apart from a
slightly different image size).  Power off, resume, switch to reboot
mode, hibernate, and this worked.  Switch back to shutdown, now that
worked as well...  Eh.  In my earlier bug report I think I noted that
after such a hibernation failure a straight shutdown didn't power off
the computer as it otherwise does, which feels consistent with the
above.

With the uswsusp 0.7, a typical freeze looked like this:

s2disk: Snapshotting system
s2disk: System snapshot ready. Preparing to write
s2disk: Image size: 240872 kilobytes
s2disk: Free swap: 1333596 kilobytes
s2disk: Saving 60217 image data pages (press backspace to abort) ... 100% done (60217 pages)
s2disk: Compression ratio -63208.85
S|

With the new version the ratio is 0.42 with similar numbers, which
sounds sane at least.  However, 60217 * 4 = 240868 = 240872 - 4, wasn't
the number of saved pages one off?  The new version seems to get this
right, though.

Now something else, which may or may not be related.  I supervise a
computing farm running a very old OS: Debian Sarge.  The kernel was
somewhat newer: 2.6.24 until recently, when new machines arrived to the
lab, which couldn't boot that 2.6.24 kernel.  So I upgraded to 2.6.31,
which works quite well, apart from one thing: halt doesn't power off the
machines anymore.  All the same under 2.6.32-rc5: they simply freeze
after reaching halt -d -f -i -h -p in the shutdown sequence.

I'm pretty much stumped here, but will try to get some SysRq dumps out
of these machines at least.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-01 21:53                     ` Ferenc Wagner
@ 2009-11-03 11:02                       ` Ferenc Wagner
  2009-11-03 11:02                       ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-03 11:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

Ferenc Wagner <wferi@niif.hu> writes:

A)

> So: with the new s2disk I got a freeze again after S|.  After a manual
> power off and a successful resume, I switched to shutdown mode and
> hibernated again, and got the exact same freeze (apart from a slightly
> different image size).  Power off, resume, switch to reboot mode,
> hibernate, and this worked.  Switch back to shutdown, now that worked
> as well...  Eh.  In my earlier bug report I think I noted that after
> such a hibernation failure a straight shutdown didn't power off the
> computer as it otherwise does, which feels consistent with the above.

B)

> Now something else, which may or may not be related.  I supervise a
> computing farm running a very old OS: Debian Sarge.  The kernel was
> somewhat newer: 2.6.24 until recently, when new machines arrived to the
> lab, which couldn't boot that 2.6.24 kernel.  So I upgraded to 2.6.31,
> which works quite well, apart from one thing: halt doesn't power off the
> machines anymore.  All the same under 2.6.32-rc5: they simply freeze
> after reaching halt -d -f -i -h -p in the shutdown sequence.

Hi,

so now I've got three ACPI and/or PM related problems: A) and B) above
with hibernation and halt/poweroff, and the C) graphics and suspend
related described at http://bugs.freedesktop.org/show_bug.cgi?id=22126
(which turned out to be ACPI related recently).

B) and C) are 100% reproducible, although the machines exhibiting C) are
at a different geographic location, so I have to bug a remote admin for
power switching.

A) and C) are exhibited by my laptop.

I'd be grateful for debugging tips for any of the above.  While A) is
not reproducible, it happens often enough with the platform method (I
haven't got enough data with the shutdown method yet).  I'm willing to
recompile kernels, read up on documentation and code and have parallel
port LEDs handy (for the laptop).  But I've got no experience with ACPI
or PM, sadly.  However, 2.6.32 is currently nominated as more than a
couple of distros' choice for long term stable support, so I'm willing
to invest substantial effort into fixing these issues.

If anybody has debugging tips, please share (and note if further
questions aren't welcome).
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-01 21:53                     ` Ferenc Wagner
  2009-11-03 11:02                       ` Ferenc Wagner
@ 2009-11-03 11:02                       ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-03 11:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

A)

> So: with the new s2disk I got a freeze again after S|.  After a manual
> power off and a successful resume, I switched to shutdown mode and
> hibernated again, and got the exact same freeze (apart from a slightly
> different image size).  Power off, resume, switch to reboot mode,
> hibernate, and this worked.  Switch back to shutdown, now that worked
> as well...  Eh.  In my earlier bug report I think I noted that after
> such a hibernation failure a straight shutdown didn't power off the
> computer as it otherwise does, which feels consistent with the above.

B)

> Now something else, which may or may not be related.  I supervise a
> computing farm running a very old OS: Debian Sarge.  The kernel was
> somewhat newer: 2.6.24 until recently, when new machines arrived to the
> lab, which couldn't boot that 2.6.24 kernel.  So I upgraded to 2.6.31,
> which works quite well, apart from one thing: halt doesn't power off the
> machines anymore.  All the same under 2.6.32-rc5: they simply freeze
> after reaching halt -d -f -i -h -p in the shutdown sequence.

Hi,

so now I've got three ACPI and/or PM related problems: A) and B) above
with hibernation and halt/poweroff, and the C) graphics and suspend
related described at http://bugs.freedesktop.org/show_bug.cgi?id=22126
(which turned out to be ACPI related recently).

B) and C) are 100% reproducible, although the machines exhibiting C) are
at a different geographic location, so I have to bug a remote admin for
power switching.

A) and C) are exhibited by my laptop.

I'd be grateful for debugging tips for any of the above.  While A) is
not reproducible, it happens often enough with the platform method (I
haven't got enough data with the shutdown method yet).  I'm willing to
recompile kernels, read up on documentation and code and have parallel
port LEDs handy (for the laptop).  But I've got no experience with ACPI
or PM, sadly.  However, 2.6.32 is currently nominated as more than a
couple of distros' choice for long term stable support, so I'm willing
to invest substantial effort into fixing these issues.

If anybody has debugging tips, please share (and note if further
questions aren't welcome).
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
                         ` (2 preceding siblings ...)
  2009-11-11 11:29       ` Ferenc Wagner
@ 2009-11-11 11:29       ` Ferenc Wagner
  2009-11-11 11:38         ` Rafael J. Wysocki
  2009-11-11 11:38         ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-11 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>> 
>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>> The former does not cause much trouble in itself, as the machine can
>>>> be switched off and resumed all right, but the latter is nasty.
>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>> unfortunately, and the kernel change happened almost together with a
>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>> a couple of days.
>>>
>>> OK, thanks.
>>>
>>> Still, I'm really afraid we won't be able to debug it any further without a
>>> reproducible test case.
>>
>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>> reproducible test case here?
>
> Well, you can test if the problem is reproducible in the "shutdown" mode of
> hibernation.

Well, both failure modes happen with "shutdown" mode as well (the S|
freeze with yesterday's git, too), but still not reproducibly.  When
s2disk is stuck in "Snapshotting system", the system is not completely
dead, it echoes line feeds and Ctrl-C at least (as added to #14504).

I wonder what you did if the issue was reproducible...  Is that totally
unapplicable if the problem happens with 10% probability only?  Slow,
sure, but until I manage to set up an automated testing bench...
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
  2009-10-29 22:31       ` Ferenc Wagner
  2009-10-29 22:31       ` [linux-pm] " Ferenc Wagner
@ 2009-11-11 11:29       ` Ferenc Wagner
  2009-11-11 11:29       ` [linux-pm] " Ferenc Wagner
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-11 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday 29 October 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>> 
>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>> The former does not cause much trouble in itself, as the machine can
>>>> be switched off and resumed all right, but the latter is nasty.
>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>> unfortunately, and the kernel change happened almost together with a
>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>> a couple of days.
>>>
>>> OK, thanks.
>>>
>>> Still, I'm really afraid we won't be able to debug it any further without a
>>> reproducible test case.
>>
>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>> reproducible test case here?
>
> Well, you can test if the problem is reproducible in the "shutdown" mode of
> hibernation.

Well, both failure modes happen with "shutdown" mode as well (the S|
freeze with yesterday's git, too), but still not reproducibly.  When
s2disk is stuck in "Snapshotting system", the system is not completely
dead, it echoes line feeds and Ctrl-C at least (as added to #14504).

I wonder what you did if the issue was reproducible...  Is that totally
unapplicable if the problem happens with 10% probability only?  Slow,
sure, but until I manage to set up an automated testing bench...
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-11 11:29       ` [linux-pm] " Ferenc Wagner
  2009-11-11 11:38         ` Rafael J. Wysocki
@ 2009-11-11 11:38         ` Rafael J. Wysocki
  2009-11-11 13:29           ` Ferenc Wagner
  2009-11-11 13:29           ` [linux-pm] " Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-11 11:38 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Wednesday 11 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>> 
> >>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>> The former does not cause much trouble in itself, as the machine can
> >>>> be switched off and resumed all right, but the latter is nasty.
> >>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>> unfortunately, and the kernel change happened almost together with a
> >>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>> a couple of days.
> >>>
> >>> OK, thanks.
> >>>
> >>> Still, I'm really afraid we won't be able to debug it any further without a
> >>> reproducible test case.
> >>
> >> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >> reproducible test case here?
> >
> > Well, you can test if the problem is reproducible in the "shutdown" mode of
> > hibernation.
> 
> Well, both failure modes happen with "shutdown" mode as well (the S|
> freeze with yesterday's git, too), but still not reproducibly.  When
> s2disk is stuck in "Snapshotting system", the system is not completely
> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
> 
> I wonder what you did if the issue was reproducible...  Is that totally
> unapplicable if the problem happens with 10% probability only?  Slow,
> sure, but until I manage to set up an automated testing bench...

I would try to identify the commit that made the problem appear using git
bisection.  However, this is really difficult with problems that are not
reliably reproducible.

Failing that, I would add some instrumentation to the code to identify the
exact place where it hangs.

BTW, did you carry out the /sys/power/pm_test "core" test on the box?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-11 11:29       ` [linux-pm] " Ferenc Wagner
@ 2009-11-11 11:38         ` Rafael J. Wysocki
  2009-11-11 11:38         ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-11 11:38 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Wednesday 11 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday 29 October 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>> 
> >>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>> The former does not cause much trouble in itself, as the machine can
> >>>> be switched off and resumed all right, but the latter is nasty.
> >>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>> unfortunately, and the kernel change happened almost together with a
> >>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>> a couple of days.
> >>>
> >>> OK, thanks.
> >>>
> >>> Still, I'm really afraid we won't be able to debug it any further without a
> >>> reproducible test case.
> >>
> >> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >> reproducible test case here?
> >
> > Well, you can test if the problem is reproducible in the "shutdown" mode of
> > hibernation.
> 
> Well, both failure modes happen with "shutdown" mode as well (the S|
> freeze with yesterday's git, too), but still not reproducibly.  When
> s2disk is stuck in "Snapshotting system", the system is not completely
> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
> 
> I wonder what you did if the issue was reproducible...  Is that totally
> unapplicable if the problem happens with 10% probability only?  Slow,
> sure, but until I manage to set up an automated testing bench...

I would try to identify the commit that made the problem appear using git
bisection.  However, this is really difficult with problems that are not
reliably reproducible.

Failing that, I would add some instrumentation to the code to identify the
exact place where it hangs.

BTW, did you carry out the /sys/power/pm_test "core" test on the box?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-11 11:38         ` [linux-pm] " Rafael J. Wysocki
  2009-11-11 13:29           ` Ferenc Wagner
@ 2009-11-11 13:29           ` Ferenc Wagner
  2009-11-11 14:47             ` Rafael J. Wysocki
  2009-11-11 14:47             ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-11 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>>>> 
>>>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>>>> The former does not cause much trouble in itself, as the machine can
>>>>>> be switched off and resumed all right, but the latter is nasty.
>>>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>>>> unfortunately, and the kernel change happened almost together with a
>>>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>>>> a couple of days.
>>>>>
>>>>> OK, thanks.
>>>>>
>>>>> Still, I'm really afraid we won't be able to debug it any further without a
>>>>> reproducible test case.
>>>>
>>>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>>>> reproducible test case here?
>>>
>>> Well, you can test if the problem is reproducible in the "shutdown" mode of
>>> hibernation.
>> 
>> Well, both failure modes happen with "shutdown" mode as well (the S|
>> freeze with yesterday's git, too), but still not reproducibly.  When
>> s2disk is stuck in "Snapshotting system", the system is not completely
>> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
>> 
>> I wonder what you did if the issue was reproducible...  Is that totally
>> unapplicable if the problem happens with 10% probability only?  Slow,
>> sure, but until I manage to set up an automated testing bench...
>
> I would try to identify the commit that made the problem appear using git
> bisection.  However, this is really difficult with problems that are not
> reliably reproducible.

Indeed.  I'm thinking about setting up a script, which does nothing but
hibernates the laptop in a loop, and get my router provide a constant
stream of WOL packets to restart it.  If it always freezes in bounded
time that will make bisecting possible, if slow.

> Failing that, I would add some instrumentation to the code to identify the
> exact place where it hangs.

I managed to achieve this with my STR problem, see
http://bugs.freedesktop.org/show_bug.cgi?id=22126#c17, but maybe that
status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL);
wasn't deep enough, as it got no followup.  How deep should one go to be
useful?

I can probably do so again, if slower; but this case may also be easier
if I can depend on working console output.  Which are the interesting
parts for instrumentation?  Can those parts produce console output to
VGA or netconsole?  Wouldn't switching on ACPI debugging before invoking
s2disk be useful?  Which parts of it (to avoid it spitting out MBs of
useless characters)?

> BTW, did you carry out the /sys/power/pm_test "core" test on the box?

I'm not clear on how to do that with user space suspend.  Simply set it
to "cores" before invoking s2disk?  I already did the test for STR (see
http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
with the current kernel tonight.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-11 11:38         ` [linux-pm] " Rafael J. Wysocki
@ 2009-11-11 13:29           ` Ferenc Wagner
  2009-11-11 13:29           ` [linux-pm] " Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-11 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Thursday 29 October 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
>>>>> 
>>>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
>>>>>> off the machine after "S|" or freezes after "Snapshotting system".
>>>>>> The former does not cause much trouble in itself, as the machine can
>>>>>> be switched off and resumed all right, but the latter is nasty.
>>>>>> Suspend to RAM works all the time.  The issue is not reproducible,
>>>>>> unfortunately, and the kernel change happened almost together with a
>>>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
>>>>>> still works stably with the new BIOS.  I'll report back my findings in
>>>>>> a couple of days.
>>>>>
>>>>> OK, thanks.
>>>>>
>>>>> Still, I'm really afraid we won't be able to debug it any further without a
>>>>> reproducible test case.
>>>>
>>>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
>>>> reproducible test case here?
>>>
>>> Well, you can test if the problem is reproducible in the "shutdown" mode of
>>> hibernation.
>> 
>> Well, both failure modes happen with "shutdown" mode as well (the S|
>> freeze with yesterday's git, too), but still not reproducibly.  When
>> s2disk is stuck in "Snapshotting system", the system is not completely
>> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
>> 
>> I wonder what you did if the issue was reproducible...  Is that totally
>> unapplicable if the problem happens with 10% probability only?  Slow,
>> sure, but until I manage to set up an automated testing bench...
>
> I would try to identify the commit that made the problem appear using git
> bisection.  However, this is really difficult with problems that are not
> reliably reproducible.

Indeed.  I'm thinking about setting up a script, which does nothing but
hibernates the laptop in a loop, and get my router provide a constant
stream of WOL packets to restart it.  If it always freezes in bounded
time that will make bisecting possible, if slow.

> Failing that, I would add some instrumentation to the code to identify the
> exact place where it hangs.

I managed to achieve this with my STR problem, see
http://bugs.freedesktop.org/show_bug.cgi?id=22126#c17, but maybe that
status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL);
wasn't deep enough, as it got no followup.  How deep should one go to be
useful?

I can probably do so again, if slower; but this case may also be easier
if I can depend on working console output.  Which are the interesting
parts for instrumentation?  Can those parts produce console output to
VGA or netconsole?  Wouldn't switching on ACPI debugging before invoking
s2disk be useful?  Which parts of it (to avoid it spitting out MBs of
useless characters)?

> BTW, did you carry out the /sys/power/pm_test "core" test on the box?

I'm not clear on how to do that with user space suspend.  Simply set it
to "cores" before invoking s2disk?  I already did the test for STR (see
http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
with the current kernel tonight.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-11 13:29           ` [linux-pm] " Ferenc Wagner
  2009-11-11 14:47             ` Rafael J. Wysocki
@ 2009-11-11 14:47             ` Rafael J. Wysocki
  2009-11-13 16:35               ` Ferenc Wagner
  2009-11-13 16:35               ` [linux-pm] " Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-11 14:47 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Wednesday 11 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Thursday 29 October 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>>>> 
> >>>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>>>> The former does not cause much trouble in itself, as the machine can
> >>>>>> be switched off and resumed all right, but the latter is nasty.
> >>>>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>>>> unfortunately, and the kernel change happened almost together with a
> >>>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>>>> a couple of days.
> >>>>>
> >>>>> OK, thanks.
> >>>>>
> >>>>> Still, I'm really afraid we won't be able to debug it any further without a
> >>>>> reproducible test case.
> >>>>
> >>>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >>>> reproducible test case here?
> >>>
> >>> Well, you can test if the problem is reproducible in the "shutdown" mode of
> >>> hibernation.
> >> 
> >> Well, both failure modes happen with "shutdown" mode as well (the S|
> >> freeze with yesterday's git, too), but still not reproducibly.  When
> >> s2disk is stuck in "Snapshotting system", the system is not completely
> >> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
> >> 
> >> I wonder what you did if the issue was reproducible...  Is that totally
> >> unapplicable if the problem happens with 10% probability only?  Slow,
> >> sure, but until I manage to set up an automated testing bench...
> >
> > I would try to identify the commit that made the problem appear using git
> > bisection.  However, this is really difficult with problems that are not
> > reliably reproducible.
> 
> Indeed.  I'm thinking about setting up a script, which does nothing but
> hibernates the laptop in a loop, and get my router provide a constant
> stream of WOL packets to restart it.  If it always freezes in bounded
> time that will make bisecting possible, if slow.

Alternatively, you can use the RTC alarm to wake up the machine.

> > Failing that, I would add some instrumentation to the code to identify the
> > exact place where it hangs.
> 
> I managed to achieve this with my STR problem, see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c17, but maybe that
> status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL);
> wasn't deep enough, as it got no followup.  How deep should one go to be
> useful?

No, this is deep enough and indicates a BIOS issue.

> I can probably do so again, if slower; but this case may also be easier
> if I can depend on working console output.  Which are the interesting
> parts for instrumentation?  Can those parts produce console output to
> VGA or netconsole?  Wouldn't switching on ACPI debugging before invoking
> s2disk be useful?  Which parts of it (to avoid it spitting out MBs of
> useless characters)?

I usually don't do that and if the issue is reproducible in the "shutdown"
mode, ACPI is most probably not involved.

> > BTW, did you carry out the /sys/power/pm_test "core" test on the box?
> 
> I'm not clear on how to do that with user space suspend.  Simply set it
> to "cores" before invoking s2disk?

Yes, echo "core" to /sys/power/pm_test before executing s2disk.

> I already did the test for STR (see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> with the current kernel tonight.

OK, thanks.

Best,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-11 13:29           ` [linux-pm] " Ferenc Wagner
@ 2009-11-11 14:47             ` Rafael J. Wysocki
  2009-11-11 14:47             ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-11 14:47 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Wednesday 11 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Thursday 29 October 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>>>> 
> >>>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>>>> The former does not cause much trouble in itself, as the machine can
> >>>>>> be switched off and resumed all right, but the latter is nasty.
> >>>>>> Suspend to RAM works all the time.  The issue is not reproducible,
> >>>>>> unfortunately, and the kernel change happened almost together with a
> >>>>>> BIOS upgrade.  Yesterday I switched back to 2.6.31 to see whether it
> >>>>>> still works stably with the new BIOS.  I'll report back my findings in
> >>>>>> a couple of days.
> >>>>>
> >>>>> OK, thanks.
> >>>>>
> >>>>> Still, I'm really afraid we won't be able to debug it any further without a
> >>>>> reproducible test case.
> >>>>
> >>>> Can't you perhaps suggest a way forward there?  Or some tricks to create a
> >>>> reproducible test case here?
> >>>
> >>> Well, you can test if the problem is reproducible in the "shutdown" mode of
> >>> hibernation.
> >> 
> >> Well, both failure modes happen with "shutdown" mode as well (the S|
> >> freeze with yesterday's git, too), but still not reproducibly.  When
> >> s2disk is stuck in "Snapshotting system", the system is not completely
> >> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
> >> 
> >> I wonder what you did if the issue was reproducible...  Is that totally
> >> unapplicable if the problem happens with 10% probability only?  Slow,
> >> sure, but until I manage to set up an automated testing bench...
> >
> > I would try to identify the commit that made the problem appear using git
> > bisection.  However, this is really difficult with problems that are not
> > reliably reproducible.
> 
> Indeed.  I'm thinking about setting up a script, which does nothing but
> hibernates the laptop in a loop, and get my router provide a constant
> stream of WOL packets to restart it.  If it always freezes in bounded
> time that will make bisecting possible, if slow.

Alternatively, you can use the RTC alarm to wake up the machine.

> > Failing that, I would add some instrumentation to the code to identify the
> > exact place where it hangs.
> 
> I managed to achieve this with my STR problem, see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c17, but maybe that
> status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL);
> wasn't deep enough, as it got no followup.  How deep should one go to be
> useful?

No, this is deep enough and indicates a BIOS issue.

> I can probably do so again, if slower; but this case may also be easier
> if I can depend on working console output.  Which are the interesting
> parts for instrumentation?  Can those parts produce console output to
> VGA or netconsole?  Wouldn't switching on ACPI debugging before invoking
> s2disk be useful?  Which parts of it (to avoid it spitting out MBs of
> useless characters)?

I usually don't do that and if the issue is reproducible in the "shutdown"
mode, ACPI is most probably not involved.

> > BTW, did you carry out the /sys/power/pm_test "core" test on the box?
> 
> I'm not clear on how to do that with user space suspend.  Simply set it
> to "cores" before invoking s2disk?

Yes, echo "core" to /sys/power/pm_test before executing s2disk.

> I already did the test for STR (see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> with the current kernel tonight.

OK, thanks.

Best,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-11 14:47             ` [linux-pm] " Rafael J. Wysocki
  2009-11-13 16:35               ` Ferenc Wagner
@ 2009-11-13 16:35               ` Ferenc Wagner
  2009-11-13 19:59                 ` Rafael J. Wysocki
  2009-11-13 19:59                 ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-13 16:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> Yes, echo "core" to /sys/power/pm_test before executing s2disk.

It snapshots the system and returns, producing the same console output
as s2ram (is this the expected behaviour?)  I ran this several times in
a loop, and experienced no problems at all.  Maybe it depends on the
amount of memory used...  I saw a freeze saying "99% done" (ie. not
100%), btw.  Are other pm_test values meaningful with s2disk?  Is this
handled explicitly in s2disk, or does simply the kernel act as if it was
resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?

> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>
>> I already did the test for STR (see
>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
>> with the current kernel tonight.
>
> OK, thanks.

No change on this front, FWIW.  But rc7 is out now, I'll test again.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-11 14:47             ` [linux-pm] " Rafael J. Wysocki
@ 2009-11-13 16:35               ` Ferenc Wagner
  2009-11-13 16:35               ` [linux-pm] " Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-13 16:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> Yes, echo "core" to /sys/power/pm_test before executing s2disk.

It snapshots the system and returns, producing the same console output
as s2ram (is this the expected behaviour?)  I ran this several times in
a loop, and experienced no problems at all.  Maybe it depends on the
amount of memory used...  I saw a freeze saying "99% done" (ie. not
100%), btw.  Are other pm_test values meaningful with s2disk?  Is this
handled explicitly in s2disk, or does simply the kernel act as if it was
resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?

> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>
>> I already did the test for STR (see
>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
>> with the current kernel tonight.
>
> OK, thanks.

No change on this front, FWIW.  But rc7 is out now, I'll test again.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-13 16:35               ` [linux-pm] " Ferenc Wagner
  2009-11-13 19:59                 ` Rafael J. Wysocki
@ 2009-11-13 19:59                 ` Rafael J. Wysocki
  2009-11-14  1:50                   ` Ferenc Wagner
  2009-11-14  1:50                   ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-13 19:59 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Friday 13 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > Yes, echo "core" to /sys/power/pm_test before executing s2disk.
> 
> It snapshots the system and returns, producing the same console output
> as s2ram (is this the expected behaviour?)  I ran this several times in
> a loop, and experienced no problems at all.  Maybe it depends on the
> amount of memory used...  I saw a freeze saying "99% done" (ie. not
> 100%), btw.

The number is not always accurate because of rounding errors.  I think we can
safely assume that it always happens after the entire image has been written.

> Are other pm_test values meaningful with s2disk?  Is this
> handled explicitly in s2disk, or does simply the kernel act as if it was
> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?

The latter.

> > On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >
> >> I already did the test for STR (see
> >> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> >> with the current kernel tonight.
> >
> > OK, thanks.
> 
> No change on this front, FWIW.  But rc7 is out now, I'll test again.

Not sure if that's going to work, but yes please test it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-13 16:35               ` [linux-pm] " Ferenc Wagner
@ 2009-11-13 19:59                 ` Rafael J. Wysocki
  2009-11-13 19:59                 ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-13 19:59 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Friday 13 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > Yes, echo "core" to /sys/power/pm_test before executing s2disk.
> 
> It snapshots the system and returns, producing the same console output
> as s2ram (is this the expected behaviour?)  I ran this several times in
> a loop, and experienced no problems at all.  Maybe it depends on the
> amount of memory used...  I saw a freeze saying "99% done" (ie. not
> 100%), btw.

The number is not always accurate because of rounding errors.  I think we can
safely assume that it always happens after the entire image has been written.

> Are other pm_test values meaningful with s2disk?  Is this
> handled explicitly in s2disk, or does simply the kernel act as if it was
> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?

The latter.

> > On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >
> >> I already did the test for STR (see
> >> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> >> with the current kernel tonight.
> >
> > OK, thanks.
> 
> No change on this front, FWIW.  But rc7 is out now, I'll test again.

Not sure if that's going to work, but yes please test it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-13 19:59                 ` [linux-pm] " Rafael J. Wysocki
@ 2009-11-14  1:50                   ` Ferenc Wagner
  2009-11-14 18:52                     ` Rafael J. Wysocki
  2009-11-14 18:52                     ` Rafael J. Wysocki
  2009-11-14  1:50                   ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-14  1:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Friday 13 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> Yes, echo "core" to /sys/power/pm_test before executing s2disk.
>> 
>> It snapshots the system and returns, producing the same console output
>> as s2ram (is this the expected behaviour?)  I ran this several times in
>> a loop, and experienced no problems at all.  Maybe it depends on the
>> amount of memory used...  I saw a freeze saying "99% done" (ie. not
>> 100%), btw.
>
> The number is not always accurate because of rounding errors.  I think we can
> safely assume that it always happens after the entire image has been written.

Probably, "done" isn't output otherwise.

>> Are other pm_test values meaningful with s2disk?  Is this
>> handled explicitly in s2disk, or does simply the kernel act as if it was
>> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?
>
> The latter.

Ok, I found the code.  Are other pm_test values meaningful, or possibly
harmful?  I think I tried freezer, which resulted in a seemingly perfect
suspend, but the machine didn't try to resume afterwards, but booted
normally instead...

>>> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>>>
>>>> I already did the test for STR (see
>>>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
>>>> with the current kernel tonight.
>>>
>>> OK, thanks.
>> 
>> No change on this front, FWIW.  But rc7 is out now, I'll test again.
>
> Not sure if that's going to work, but yes please test it.

The KMS related STR freeze (evaluating the _PTS method) is still there.
I'm continuing testing s2disk with the platform method under rc7 (with
some instrumentation added).

Btw, s2ram -f works fine otherwise (no KMS), and my machine is not in
the whitelist.  I'm not sure whether the KMS problem disqualifies it
(shall I report it to suspend-devel?), but it can be identified by:
    sys_vendor   = "IBM"
    sys_product  = "1834S5G"
    sys_version  = "ThinkPad R50e"
    bios_version = "1WET90WW (2.10 )"
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-13 19:59                 ` [linux-pm] " Rafael J. Wysocki
  2009-11-14  1:50                   ` Ferenc Wagner
@ 2009-11-14  1:50                   ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-14  1:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Friday 13 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> Yes, echo "core" to /sys/power/pm_test before executing s2disk.
>> 
>> It snapshots the system and returns, producing the same console output
>> as s2ram (is this the expected behaviour?)  I ran this several times in
>> a loop, and experienced no problems at all.  Maybe it depends on the
>> amount of memory used...  I saw a freeze saying "99% done" (ie. not
>> 100%), btw.
>
> The number is not always accurate because of rounding errors.  I think we can
> safely assume that it always happens after the entire image has been written.

Probably, "done" isn't output otherwise.

>> Are other pm_test values meaningful with s2disk?  Is this
>> handled explicitly in s2disk, or does simply the kernel act as if it was
>> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?
>
> The latter.

Ok, I found the code.  Are other pm_test values meaningful, or possibly
harmful?  I think I tried freezer, which resulted in a seemingly perfect
suspend, but the machine didn't try to resume afterwards, but booted
normally instead...

>>> On Wednesday 11 November 2009, Ferenc Wagner wrote:
>>>
>>>> I already did the test for STR (see
>>>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
>>>> with the current kernel tonight.
>>>
>>> OK, thanks.
>> 
>> No change on this front, FWIW.  But rc7 is out now, I'll test again.
>
> Not sure if that's going to work, but yes please test it.

The KMS related STR freeze (evaluating the _PTS method) is still there.
I'm continuing testing s2disk with the platform method under rc7 (with
some instrumentation added).

Btw, s2ram -f works fine otherwise (no KMS), and my machine is not in
the whitelist.  I'm not sure whether the KMS problem disqualifies it
(shall I report it to suspend-devel?), but it can be identified by:
    sys_vendor   = "IBM"
    sys_product  = "1834S5G"
    sys_version  = "ThinkPad R50e"
    bios_version = "1WET90WW (2.10 )"
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-14  1:50                   ` Ferenc Wagner
@ 2009-11-14 18:52                     ` Rafael J. Wysocki
  2009-11-18  1:12                       ` Ferenc Wagner
                                         ` (3 more replies)
  2009-11-14 18:52                     ` Rafael J. Wysocki
  1 sibling, 4 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-14 18:52 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Saturday 14 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Friday 13 November 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> Yes, echo "core" to /sys/power/pm_test before executing s2disk.
> >> 
> >> It snapshots the system and returns, producing the same console output
> >> as s2ram (is this the expected behaviour?)  I ran this several times in
> >> a loop, and experienced no problems at all.  Maybe it depends on the
> >> amount of memory used...  I saw a freeze saying "99% done" (ie. not
> >> 100%), btw.
> >
> > The number is not always accurate because of rounding errors.  I think we can
> > safely assume that it always happens after the entire image has been written.
> 
> Probably, "done" isn't output otherwise.
> 
> >> Are other pm_test values meaningful with s2disk?  Is this
> >> handled explicitly in s2disk, or does simply the kernel act as if it was
> >> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?
> >
> > The latter.
> 
> Ok, I found the code.  Are other pm_test values meaningful, or possibly
> harmful?

They are supposed to work as for suspend.

> I think I tried freezer, which resulted in a seemingly perfect
> suspend, but the machine didn't try to resume afterwards, but booted
> normally instead...

So this sounds like there's a bug (will check).

> >>> On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> I already did the test for STR (see
> >>>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> >>>> with the current kernel tonight.
> >>>
> >>> OK, thanks.
> >> 
> >> No change on this front, FWIW.  But rc7 is out now, I'll test again.
> >
> > Not sure if that's going to work, but yes please test it.
> 
> The KMS related STR freeze (evaluating the _PTS method) is still there.
> I'm continuing testing s2disk with the platform method under rc7 (with
> some instrumentation added).
> 
> Btw, s2ram -f works fine otherwise (no KMS), and my machine is not in
> the whitelist.  I'm not sure whether the KMS problem disqualifies it

No, it doesn't.

> (shall I report it to suspend-devel?),

Yes, please.

> but it can be identified by:
>     sys_vendor   = "IBM"
>     sys_product  = "1834S5G"
>     sys_version  = "ThinkPad R50e"
>     bios_version = "1WET90WW (2.10 )"

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-14  1:50                   ` Ferenc Wagner
  2009-11-14 18:52                     ` Rafael J. Wysocki
@ 2009-11-14 18:52                     ` Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-14 18:52 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Saturday 14 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Friday 13 November 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> Yes, echo "core" to /sys/power/pm_test before executing s2disk.
> >> 
> >> It snapshots the system and returns, producing the same console output
> >> as s2ram (is this the expected behaviour?)  I ran this several times in
> >> a loop, and experienced no problems at all.  Maybe it depends on the
> >> amount of memory used...  I saw a freeze saying "99% done" (ie. not
> >> 100%), btw.
> >
> > The number is not always accurate because of rounding errors.  I think we can
> > safely assume that it always happens after the entire image has been written.
> 
> Probably, "done" isn't output otherwise.
> 
> >> Are other pm_test values meaningful with s2disk?  Is this
> >> handled explicitly in s2disk, or does simply the kernel act as if it was
> >> resumed instead of providing the system image after SNAPSHOT_CREATE_IMAGE?
> >
> > The latter.
> 
> Ok, I found the code.  Are other pm_test values meaningful, or possibly
> harmful?

They are supposed to work as for suspend.

> I think I tried freezer, which resulted in a seemingly perfect
> suspend, but the machine didn't try to resume afterwards, but booted
> normally instead...

So this sounds like there's a bug (will check).

> >>> On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> I already did the test for STR (see
> >>>> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> >>>> with the current kernel tonight.
> >>>
> >>> OK, thanks.
> >> 
> >> No change on this front, FWIW.  But rc7 is out now, I'll test again.
> >
> > Not sure if that's going to work, but yes please test it.
> 
> The KMS related STR freeze (evaluating the _PTS method) is still there.
> I'm continuing testing s2disk with the platform method under rc7 (with
> some instrumentation added).
> 
> Btw, s2ram -f works fine otherwise (no KMS), and my machine is not in
> the whitelist.  I'm not sure whether the KMS problem disqualifies it

No, it doesn't.

> (shall I report it to suspend-devel?),

Yes, please.

> but it can be identified by:
>     sys_vendor   = "IBM"
>     sys_product  = "1834S5G"
>     sys_version  = "ThinkPad R50e"
>     bios_version = "1WET90WW (2.10 )"

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-14 18:52                     ` Rafael J. Wysocki
  2009-11-18  1:12                       ` Ferenc Wagner
@ 2009-11-18  1:12                       ` Ferenc Wagner
  2009-11-18 14:05                       ` Ferenc Wagner
  2009-11-18 14:05                       ` [linux-pm] " Ferenc Wagner
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18  1:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

[-- Attachment #1: Type: text/plain, Size: 2403 bytes --]

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 14 November 2009, Ferenc Wagner wrote:
>
>> Are other pm_test values meaningful, or possibly harmful?
>
> They are supposed to work as for suspend.
>
>> I think I tried freezer, which resulted in a seemingly perfect
>> suspend, but the machine didn't try to resume afterwards, but booted
>> normally instead...
>
> So this sounds like there's a bug (will check).

I rechecked this: the freezer "test" goes on to suspend the machine;
generally it's even possible to resume from the image, but this still
should be a bug.  Can you reproduce it?

Meanwhile I managed to freeze the machine in the "Snapshotting system"
phase (that is, in the SNAPSHOT_CREATE_IMAGE ioctl) again.  SysRq
reacted, but didn't produce any output besides the name of the invoked
function or the help text.  It couldn't power off the machine, but it
was able to reboot it.

Since I've instrumented s2disk and the hibernation path, no freeze
happened during hibernating the machine.  However, it instead froze once
rebooting, when I wanted to replace the kernel.  It was the usual stuff:
everything smooth until the last step, then the final syscall with the
magic constants, then silence... It's starting to look like this bug has
nothing to do with hibernation after all, it's just the shutdown method
I use most often, so it surfaced with that.

I tried various things after starting with init=/bin/bash, but I wasn't
able to cancel the suspend then, so I introduced the "always cancel"
parameter (please find my current patch queue attached).  With that, I'm
able to freeze the machine in 2-5 tries: after a couple of perfect runs,
s2disk -P"always cancel=y" returns normally to the starting screen, but
I'm left with a totally unresponsive machine.  If I didn't botch my
patches, this may be a trace to follow: still not 100% reproducible, but
almost.

Btw. no matter I tried setting suspend loglevel to 1 or 2, usual
unqualified printks didn't make it to the console s2disk uses (not even
ones from before suspend_console() in hibernation_platform_enter, or
even simple ioctl traces for /dev/snapshot).  Would it be possible to
work around this by skipping prepare_console or similar?

And a last thing: when I set resume device to /root/strace (yes, that
binary), s2disk gave a rater strange report:
s2disk: Invalid resume device. Reason: Success.
-- 
Regards,
Feri.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Set-config-file-location-to-etc-uswsusp.conf-the-D.patch --]
[-- Type: text/x-diff, Size: 748 bytes --]

>From 9df438995d040c8d4e46f3d79d0aabcd5092ee6b Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 10 Nov 2009 01:15:59 +0100
Subject: [PATCH] Set config file location to /etc/uswsusp.conf (the Debian default)

---
 config_parser.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/config_parser.h b/config_parser.h
index b1adb7f..3a85793 100644
--- a/config_parser.h
+++ b/config_parser.h
@@ -27,4 +27,4 @@ int parse(char *my_name, char *file_name, struct config_par *parv);
 void usage(char *my_name, struct option options[], const char *short_options);
 void version(char *my_name, char *extra_version);
 
-#define CONFIG_FILE	"/etc/suspend.conf"
+#define CONFIG_FILE	"/etc/uswsusp.conf"
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Add-some-debugging-output.patch --]
[-- Type: text/x-diff, Size: 1066 bytes --]

>From 26f25561e1608984faac83b7df1e9d4030be1c83 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 10 Nov 2009 01:16:20 +0100
Subject: [PATCH] Add some debugging output

---
 config_parser.c |    2 ++
 suspend.c       |    2 +-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/config_parser.c b/config_parser.c
index d017ccd..91cc4fc 100644
--- a/config_parser.c
+++ b/config_parser.c
@@ -66,6 +66,8 @@ int parse_line(char *str, struct config_par *parv)
 					if (sscanf(str, fmt, parv[i].ptr) <= 0)
 						error = -EINVAL;
 				}
+				if (!error)
+					fprintf (stderr, "Parsed %s=%s\n", parv[i].name, str);
 				break;
 			}
 		}
diff --git a/suspend.c b/suspend.c
index 51cab6f..5248dba 100644
--- a/suspend.c
+++ b/suspend.c
@@ -2421,7 +2421,7 @@ int main(int argc, char *argv[])
 
 	ret = 0;
 	if (stat(resume_dev_name, &stat_buf)) {
-		suspend_error("Could not stat the resume device file.");
+		suspend_error("Could not stat the resume device file '%s'.", resume_dev_name);
 		ret = ENODEV;
 		goto Umount;
 	}
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-Add-debugging-around-syscalls-and-ioctls.patch --]
[-- Type: text/x-diff, Size: 5215 bytes --]

>From f3472f84e44ccf41010aecc3971801c8825ae045 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Sat, 14 Nov 2009 01:25:05 +0100
Subject: [PATCH] Add debugging around syscalls and ioctls

---
 suspend.c |   22 +++++++++++++++++++---
 swsusp.h  |   29 ++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/suspend.c b/suspend.c
index 5248dba..6a7dbb4 100644
--- a/suspend.c
+++ b/suspend.c
@@ -295,17 +295,26 @@ static int atomic_snapshot(int dev, int *in_suspend)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_CREATE_IMAGE...\n");
 	error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend);
+	fprintf (stderr, "SNAPSHOT_CREATE_IMAGE: %d\n", error);
 	if (error && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_CREATE_IMAGE");
+		fprintf (stderr, "SNAPSHOT_ATOMIC_SNAPSHOT...\n");
 		error = ioctl(dev, SNAPSHOT_ATOMIC_SNAPSHOT, in_suspend);
+		fprintf (stderr, "SNAPSHOT_ATOMIC_SNAPSHOT: %d\n", error);
 	}
 	return error;
 }
 
 static inline int free_snapshot(int dev)
 {
-	return ioctl(dev, SNAPSHOT_FREE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_FREE...\n");
+	error = ioctl(dev, SNAPSHOT_FREE, 0);
+	fprintf (stderr, "SNAPSHOT_FREE: %d\n", error);
+	return error;
 }
 
 static int set_image_size(int dev, unsigned int size)
@@ -329,10 +338,14 @@ static int platform_enter(int dev)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_POWER_OFF...\n");
 	error = ioctl(dev, SNAPSHOT_POWER_OFF, 0);
+	fprintf (stderr, "SNAPSHOT_POWER_OFF: %d\n", error);
 	if (error  && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_POWER_OFF");
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_ENTER)...\n");
 		error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_ENTER);
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_ENTER): %d\n", error);
 	}
 	return error;
 }
@@ -1258,7 +1271,7 @@ static int flush_buffer(struct swap_writer *handle)
  */
 static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 {
-	unsigned int m, writeout_rate;
+	unsigned int m, writeout_rate, expected_nr_pages = nr_pages;
 	ssize_t ret;
 	struct termios newtrm, savedtrm;
 	int abort_possible, key, error = 0;
@@ -1350,7 +1363,7 @@ static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 		if (!error)
 			error = save_extents(handle, 1);
 		if (!error)
-			printf(" done (%u pages)\n", nr_pages);
+			printf(" done (%u pages of expected %u)\n", nr_pages, expected_nr_pages);
 	}
 
  Exit:
@@ -1670,12 +1683,15 @@ static void suspend_shutdown(int snapshot_fd)
 	splash.set_caption("Done.");
 
 	if (shutdown_method == SHUTDOWN_METHOD_REBOOT) {
+		fprintf (stderr, "Shutdown method was REBOOT, so rebooting...\n");
 		reboot();
 	} else if (shutdown_method == SHUTDOWN_METHOD_PLATFORM) {
+		fprintf (stderr, "Shutdown method was PLATFORM, so entering...\n");
 		if (platform_enter(snapshot_fd))
 			suspend_error("Could not enter the hibernation state, "
 					"calling power_off.");
 	}
+	fprintf (stderr, "Shutdown method was SHUTDOWN or something went wrong, so powering off...\n");
 	power_off();
 	/* Signature is on disk, it is very dangerous to continue now.
 	 * We'd do resume with stale caches on next boot. */
diff --git a/swsusp.h b/swsusp.h
index e6abc83..fe47702 100644
--- a/swsusp.h
+++ b/swsusp.h
@@ -90,41 +90,64 @@ static inline void report_unsupported_ioctl(char *name)
 
 static inline int freeze(int dev)
 {
-	return ioctl(dev, SNAPSHOT_FREEZE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_FREEZE...\n");
+	error = ioctl(dev, SNAPSHOT_FREEZE, 0);
+	fprintf (stderr, "SNAPSHOT_FREEZE: %d\n", error);
+	return error;
 }
 
 static inline int unfreeze(int dev)
 {
-	return ioctl(dev, SNAPSHOT_UNFREEZE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_UNFREEZE...\n");
+	error = ioctl(dev, SNAPSHOT_UNFREEZE, 0);
+	fprintf (stderr, "SNAPSHOT_UNFREEZE: %d\n", error);
+	return error;
 }
 
 static inline int platform_prepare(int dev)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_PLATFORM_SUPPORT(1)...\n");
 	error = ioctl(dev, SNAPSHOT_PLATFORM_SUPPORT, 1);
+	fprintf (stderr, "SNAPSHOT_PLATFORM_SUPPORT(1): %d\n", error);
 	if (error && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_PLATFORM_SUPPORT");
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_PREPARE)...\n");
 		error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_PREPARE);
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_PREPARE): %d\n", error);
 	}
 	return error;
 }
 
 static inline int platform_finish(int dev)
 {
-	return ioctl(dev, SNAPSHOT_PMOPS, PMOPS_FINISH);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_FINISH)...\n");
+	error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_FINISH);
+	fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_FINISH): %d\n", error);
+	return error;
 }
 
 static inline void reboot(void)
 {
+	fprintf (stderr, "reboot syscall...\n");
 	syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
 		LINUX_REBOOT_CMD_RESTART, 0);
+	fprintf (stderr, "reboot syscall failed!\n");
 }
 
 static inline void power_off(void)
 {
+	fprintf (stderr, "power off syscall...\n");
 	syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
 		LINUX_REBOOT_CMD_POWER_OFF, 0);
+	fprintf (stderr, "power off syscall failed!\n");
 }
 
 #ifndef SYS_sync_file_range
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0004-Add-the-always-cancel-parameter.patch --]
[-- Type: text/x-diff, Size: 1696 bytes --]

>From 9b04b5ab4d4e963aef630fa96f84e1c433c49476 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 17 Nov 2009 23:37:02 +0100
Subject: [PATCH] Add the 'always cancel' parameter

---
 resume.c  |    5 +++++
 suspend.c |   12 ++++++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/resume.c b/resume.c
index 50ea2c8..4b70fe4 100644
--- a/resume.c
+++ b/resume.c
@@ -148,6 +148,11 @@ static struct config_par parameters[] = {
 		.fmt = "%c",
 		.ptr = NULL,
 	},
+	{
+		.name = "always cancel",
+		.fmt = "%c",
+		.ptr = NULL,
+	},
 #ifdef CONFIG_THREADS
 	{
 		.name = "threads",
diff --git a/suspend.c b/suspend.c
index 6a7dbb4..2dc1311 100644
--- a/suspend.c
+++ b/suspend.c
@@ -96,6 +96,7 @@ static enum {
 } shutdown_method = SHUTDOWN_METHOD_PLATFORM;
 static int resume_pause;
 static char verify_image;
+static char always_cancel;
 #ifdef CONFIG_THREADS
 static char use_threads;
 #else
@@ -204,6 +205,11 @@ static struct config_par parameters[] = {
 		.fmt = "%c",
 		.ptr = &verify_image,
 	},
+	{
+		.name = "always cancel",
+		.fmt = "%c",
+		.ptr = &always_cancel,
+	},
 #ifdef CONFIG_THREADS
 	{
 		.name = "threads",
@@ -1373,6 +1379,9 @@ static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 	if (abort_possible)
 		splash.restore_abort(&savedtrm);
 
+	if (always_cancel)
+		error = -EINTR;
+
 	return error;
 }
 
@@ -2339,6 +2348,9 @@ int main(int argc, char *argv[])
 	if (verify_image != 'y' && verify_image != 'Y')
 		verify_image = 0;
 
+	if (always_cancel != 'y' && always_cancel != 'Y')
+		always_cancel = 0;
+
 #ifdef CONFIG_THREADS
 	if (use_threads != 'y' && use_threads != 'Y')
 		use_threads = 0;
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-14 18:52                     ` Rafael J. Wysocki
@ 2009-11-18  1:12                       ` Ferenc Wagner
  2009-11-18  1:12                       ` [linux-pm] " Ferenc Wagner
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18  1:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2403 bytes --]

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 14 November 2009, Ferenc Wagner wrote:
>
>> Are other pm_test values meaningful, or possibly harmful?
>
> They are supposed to work as for suspend.
>
>> I think I tried freezer, which resulted in a seemingly perfect
>> suspend, but the machine didn't try to resume afterwards, but booted
>> normally instead...
>
> So this sounds like there's a bug (will check).

I rechecked this: the freezer "test" goes on to suspend the machine;
generally it's even possible to resume from the image, but this still
should be a bug.  Can you reproduce it?

Meanwhile I managed to freeze the machine in the "Snapshotting system"
phase (that is, in the SNAPSHOT_CREATE_IMAGE ioctl) again.  SysRq
reacted, but didn't produce any output besides the name of the invoked
function or the help text.  It couldn't power off the machine, but it
was able to reboot it.

Since I've instrumented s2disk and the hibernation path, no freeze
happened during hibernating the machine.  However, it instead froze once
rebooting, when I wanted to replace the kernel.  It was the usual stuff:
everything smooth until the last step, then the final syscall with the
magic constants, then silence... It's starting to look like this bug has
nothing to do with hibernation after all, it's just the shutdown method
I use most often, so it surfaced with that.

I tried various things after starting with init=/bin/bash, but I wasn't
able to cancel the suspend then, so I introduced the "always cancel"
parameter (please find my current patch queue attached).  With that, I'm
able to freeze the machine in 2-5 tries: after a couple of perfect runs,
s2disk -P"always cancel=y" returns normally to the starting screen, but
I'm left with a totally unresponsive machine.  If I didn't botch my
patches, this may be a trace to follow: still not 100% reproducible, but
almost.

Btw. no matter I tried setting suspend loglevel to 1 or 2, usual
unqualified printks didn't make it to the console s2disk uses (not even
ones from before suspend_console() in hibernation_platform_enter, or
even simple ioctl traces for /dev/snapshot).  Would it be possible to
work around this by skipping prepare_console or similar?

And a last thing: when I set resume device to /root/strace (yes, that
binary), s2disk gave a rater strange report:
s2disk: Invalid resume device. Reason: Success.
-- 
Regards,
Feri.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Set-config-file-location-to-etc-uswsusp.conf-the-D.patch --]
[-- Type: text/x-diff, Size: 748 bytes --]

>From 9df438995d040c8d4e46f3d79d0aabcd5092ee6b Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 10 Nov 2009 01:15:59 +0100
Subject: [PATCH] Set config file location to /etc/uswsusp.conf (the Debian default)

---
 config_parser.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/config_parser.h b/config_parser.h
index b1adb7f..3a85793 100644
--- a/config_parser.h
+++ b/config_parser.h
@@ -27,4 +27,4 @@ int parse(char *my_name, char *file_name, struct config_par *parv);
 void usage(char *my_name, struct option options[], const char *short_options);
 void version(char *my_name, char *extra_version);
 
-#define CONFIG_FILE	"/etc/suspend.conf"
+#define CONFIG_FILE	"/etc/uswsusp.conf"
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Add-some-debugging-output.patch --]
[-- Type: text/x-diff, Size: 1066 bytes --]

>From 26f25561e1608984faac83b7df1e9d4030be1c83 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 10 Nov 2009 01:16:20 +0100
Subject: [PATCH] Add some debugging output

---
 config_parser.c |    2 ++
 suspend.c       |    2 +-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/config_parser.c b/config_parser.c
index d017ccd..91cc4fc 100644
--- a/config_parser.c
+++ b/config_parser.c
@@ -66,6 +66,8 @@ int parse_line(char *str, struct config_par *parv)
 					if (sscanf(str, fmt, parv[i].ptr) <= 0)
 						error = -EINVAL;
 				}
+				if (!error)
+					fprintf (stderr, "Parsed %s=%s\n", parv[i].name, str);
 				break;
 			}
 		}
diff --git a/suspend.c b/suspend.c
index 51cab6f..5248dba 100644
--- a/suspend.c
+++ b/suspend.c
@@ -2421,7 +2421,7 @@ int main(int argc, char *argv[])
 
 	ret = 0;
 	if (stat(resume_dev_name, &stat_buf)) {
-		suspend_error("Could not stat the resume device file.");
+		suspend_error("Could not stat the resume device file '%s'.", resume_dev_name);
 		ret = ENODEV;
 		goto Umount;
 	}
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-Add-debugging-around-syscalls-and-ioctls.patch --]
[-- Type: text/x-diff, Size: 5215 bytes --]

>From f3472f84e44ccf41010aecc3971801c8825ae045 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Sat, 14 Nov 2009 01:25:05 +0100
Subject: [PATCH] Add debugging around syscalls and ioctls

---
 suspend.c |   22 +++++++++++++++++++---
 swsusp.h  |   29 ++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/suspend.c b/suspend.c
index 5248dba..6a7dbb4 100644
--- a/suspend.c
+++ b/suspend.c
@@ -295,17 +295,26 @@ static int atomic_snapshot(int dev, int *in_suspend)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_CREATE_IMAGE...\n");
 	error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend);
+	fprintf (stderr, "SNAPSHOT_CREATE_IMAGE: %d\n", error);
 	if (error && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_CREATE_IMAGE");
+		fprintf (stderr, "SNAPSHOT_ATOMIC_SNAPSHOT...\n");
 		error = ioctl(dev, SNAPSHOT_ATOMIC_SNAPSHOT, in_suspend);
+		fprintf (stderr, "SNAPSHOT_ATOMIC_SNAPSHOT: %d\n", error);
 	}
 	return error;
 }
 
 static inline int free_snapshot(int dev)
 {
-	return ioctl(dev, SNAPSHOT_FREE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_FREE...\n");
+	error = ioctl(dev, SNAPSHOT_FREE, 0);
+	fprintf (stderr, "SNAPSHOT_FREE: %d\n", error);
+	return error;
 }
 
 static int set_image_size(int dev, unsigned int size)
@@ -329,10 +338,14 @@ static int platform_enter(int dev)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_POWER_OFF...\n");
 	error = ioctl(dev, SNAPSHOT_POWER_OFF, 0);
+	fprintf (stderr, "SNAPSHOT_POWER_OFF: %d\n", error);
 	if (error  && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_POWER_OFF");
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_ENTER)...\n");
 		error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_ENTER);
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_ENTER): %d\n", error);
 	}
 	return error;
 }
@@ -1258,7 +1271,7 @@ static int flush_buffer(struct swap_writer *handle)
  */
 static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 {
-	unsigned int m, writeout_rate;
+	unsigned int m, writeout_rate, expected_nr_pages = nr_pages;
 	ssize_t ret;
 	struct termios newtrm, savedtrm;
 	int abort_possible, key, error = 0;
@@ -1350,7 +1363,7 @@ static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 		if (!error)
 			error = save_extents(handle, 1);
 		if (!error)
-			printf(" done (%u pages)\n", nr_pages);
+			printf(" done (%u pages of expected %u)\n", nr_pages, expected_nr_pages);
 	}
 
  Exit:
@@ -1670,12 +1683,15 @@ static void suspend_shutdown(int snapshot_fd)
 	splash.set_caption("Done.");
 
 	if (shutdown_method == SHUTDOWN_METHOD_REBOOT) {
+		fprintf (stderr, "Shutdown method was REBOOT, so rebooting...\n");
 		reboot();
 	} else if (shutdown_method == SHUTDOWN_METHOD_PLATFORM) {
+		fprintf (stderr, "Shutdown method was PLATFORM, so entering...\n");
 		if (platform_enter(snapshot_fd))
 			suspend_error("Could not enter the hibernation state, "
 					"calling power_off.");
 	}
+	fprintf (stderr, "Shutdown method was SHUTDOWN or something went wrong, so powering off...\n");
 	power_off();
 	/* Signature is on disk, it is very dangerous to continue now.
 	 * We'd do resume with stale caches on next boot. */
diff --git a/swsusp.h b/swsusp.h
index e6abc83..fe47702 100644
--- a/swsusp.h
+++ b/swsusp.h
@@ -90,41 +90,64 @@ static inline void report_unsupported_ioctl(char *name)
 
 static inline int freeze(int dev)
 {
-	return ioctl(dev, SNAPSHOT_FREEZE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_FREEZE...\n");
+	error = ioctl(dev, SNAPSHOT_FREEZE, 0);
+	fprintf (stderr, "SNAPSHOT_FREEZE: %d\n", error);
+	return error;
 }
 
 static inline int unfreeze(int dev)
 {
-	return ioctl(dev, SNAPSHOT_UNFREEZE, 0);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_UNFREEZE...\n");
+	error = ioctl(dev, SNAPSHOT_UNFREEZE, 0);
+	fprintf (stderr, "SNAPSHOT_UNFREEZE: %d\n", error);
+	return error;
 }
 
 static inline int platform_prepare(int dev)
 {
 	int error;
 
+	fprintf (stderr, "SNAPSHOT_PLATFORM_SUPPORT(1)...\n");
 	error = ioctl(dev, SNAPSHOT_PLATFORM_SUPPORT, 1);
+	fprintf (stderr, "SNAPSHOT_PLATFORM_SUPPORT(1): %d\n", error);
 	if (error && errno == ENOTTY) {
 		report_unsupported_ioctl("SNAPSHOT_PLATFORM_SUPPORT");
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_PREPARE)...\n");
 		error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_PREPARE);
+		fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_PREPARE): %d\n", error);
 	}
 	return error;
 }
 
 static inline int platform_finish(int dev)
 {
-	return ioctl(dev, SNAPSHOT_PMOPS, PMOPS_FINISH);
+	int error;
+
+	fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_FINISH)...\n");
+	error = ioctl(dev, SNAPSHOT_PMOPS, PMOPS_FINISH);
+	fprintf (stderr, "SNAPSHOT_PMOPS(PMOPS_FINISH): %d\n", error);
+	return error;
 }
 
 static inline void reboot(void)
 {
+	fprintf (stderr, "reboot syscall...\n");
 	syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
 		LINUX_REBOOT_CMD_RESTART, 0);
+	fprintf (stderr, "reboot syscall failed!\n");
 }
 
 static inline void power_off(void)
 {
+	fprintf (stderr, "power off syscall...\n");
 	syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
 		LINUX_REBOOT_CMD_POWER_OFF, 0);
+	fprintf (stderr, "power off syscall failed!\n");
 }
 
 #ifndef SYS_sync_file_range
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0004-Add-the-always-cancel-parameter.patch --]
[-- Type: text/x-diff, Size: 1696 bytes --]

>From 9b04b5ab4d4e963aef630fa96f84e1c433c49476 Mon Sep 17 00:00:00 2001
From: Ferenc Wagner <wferi@niif.hu>
Date: Tue, 17 Nov 2009 23:37:02 +0100
Subject: [PATCH] Add the 'always cancel' parameter

---
 resume.c  |    5 +++++
 suspend.c |   12 ++++++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/resume.c b/resume.c
index 50ea2c8..4b70fe4 100644
--- a/resume.c
+++ b/resume.c
@@ -148,6 +148,11 @@ static struct config_par parameters[] = {
 		.fmt = "%c",
 		.ptr = NULL,
 	},
+	{
+		.name = "always cancel",
+		.fmt = "%c",
+		.ptr = NULL,
+	},
 #ifdef CONFIG_THREADS
 	{
 		.name = "threads",
diff --git a/suspend.c b/suspend.c
index 6a7dbb4..2dc1311 100644
--- a/suspend.c
+++ b/suspend.c
@@ -96,6 +96,7 @@ static enum {
 } shutdown_method = SHUTDOWN_METHOD_PLATFORM;
 static int resume_pause;
 static char verify_image;
+static char always_cancel;
 #ifdef CONFIG_THREADS
 static char use_threads;
 #else
@@ -204,6 +205,11 @@ static struct config_par parameters[] = {
 		.fmt = "%c",
 		.ptr = &verify_image,
 	},
+	{
+		.name = "always cancel",
+		.fmt = "%c",
+		.ptr = &always_cancel,
+	},
 #ifdef CONFIG_THREADS
 	{
 		.name = "threads",
@@ -1373,6 +1379,9 @@ static int save_image(struct swap_writer *handle, unsigned int nr_pages)
 	if (abort_possible)
 		splash.restore_abort(&savedtrm);
 
+	if (always_cancel)
+		error = -EINTR;
+
 	return error;
 }
 
@@ -2339,6 +2348,9 @@ int main(int argc, char *argv[])
 	if (verify_image != 'y' && verify_image != 'Y')
 		verify_image = 0;
 
+	if (always_cancel != 'y' && always_cancel != 'Y')
+		always_cancel = 0;
+
 #ifdef CONFIG_THREADS
 	if (use_threads != 'y' && use_threads != 'Y')
 		use_threads = 0;
-- 
1.5.6.5


[-- Attachment #6: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-14 18:52                     ` Rafael J. Wysocki
                                         ` (2 preceding siblings ...)
  2009-11-18 14:05                       ` Ferenc Wagner
@ 2009-11-18 14:05                       ` Ferenc Wagner
  2009-11-18 22:13                         ` Rafael J. Wysocki
  2009-11-18 22:13                         ` Rafael J. Wysocki
  3 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18 14:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

Ferenc Wagner <wferi@niif.hu> writes:

> Since I've instrumented s2disk and the hibernation path, no freeze
> happened during hibernating the machine.

Not until I removed the delays from hibernation_platform_enter(), which
were put there previously to get step-by-step feedback.  Removing them
again resulted in a freeze in short course, maybe just two hibernations
later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
Does it mean that some device driver is at fault?  I'll check if it
always fails at the same point (although tracing into dpm_suspend_start
isn't pure fun because of the multitude of devices it loops over).  Is
there any way to get printk output from that phase?

Side question: If I run s2disk from the init=/bin/bash prompt, the
instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
fires before the "Snapshotting system" phase, but it does not fire if I
hibernate from the full running desktop.  (That instrumentation was put
there to investigate the KMS-triggered STR freeze.)  What could explain
this?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-14 18:52                     ` Rafael J. Wysocki
  2009-11-18  1:12                       ` Ferenc Wagner
  2009-11-18  1:12                       ` [linux-pm] " Ferenc Wagner
@ 2009-11-18 14:05                       ` Ferenc Wagner
  2009-11-18 14:05                       ` [linux-pm] " Ferenc Wagner
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18 14:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

> Since I've instrumented s2disk and the hibernation path, no freeze
> happened during hibernating the machine.

Not until I removed the delays from hibernation_platform_enter(), which
were put there previously to get step-by-step feedback.  Removing them
again resulted in a freeze in short course, maybe just two hibernations
later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
Does it mean that some device driver is at fault?  I'll check if it
always fails at the same point (although tracing into dpm_suspend_start
isn't pure fun because of the multitude of devices it loops over).  Is
there any way to get printk output from that phase?

Side question: If I run s2disk from the init=/bin/bash prompt, the
instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
fires before the "Snapshotting system" phase, but it does not fire if I
hibernate from the full running desktop.  (That instrumentation was put
there to investigate the KMS-triggered STR freeze.)  What could explain
this?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-18 14:05                       ` [linux-pm] " Ferenc Wagner
@ 2009-11-18 22:13                         ` Rafael J. Wysocki
  2009-11-18 22:54                           ` Ferenc Wagner
                                             ` (6 more replies)
  2009-11-18 22:13                         ` Rafael J. Wysocki
  1 sibling, 7 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-18 22:13 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Wednesday 18 November 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > Since I've instrumented s2disk and the hibernation path, no freeze
> > happened during hibernating the machine.
> 
> Not until I removed the delays from hibernation_platform_enter(), which
> were put there previously to get step-by-step feedback.  Removing them
> again resulted in a freeze in short course, maybe just two hibernations
> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> Does it mean that some device driver is at fault?

A driver or one of the platform hooks.

> I'll check if it always fails at the same point (although tracing into
> dpm_suspend_start isn't pure fun because of the multitude of devices it
> loops over).  Is there any way to get printk output from that phase?

Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

> Side question: If I run s2disk from the init=/bin/bash prompt, the
> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
> fires before the "Snapshotting system" phase, but it does not fire if I
> hibernate from the full running desktop.  (That instrumentation was put
> there to investigate the KMS-triggered STR freeze.)  What could explain
> this?

It looks like it uses the "shutdown" method when run with init=/bin/bash, but
I don't know why exactly.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-18 14:05                       ` [linux-pm] " Ferenc Wagner
  2009-11-18 22:13                         ` Rafael J. Wysocki
@ 2009-11-18 22:13                         ` Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-18 22:13 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Wednesday 18 November 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > Since I've instrumented s2disk and the hibernation path, no freeze
> > happened during hibernating the machine.
> 
> Not until I removed the delays from hibernation_platform_enter(), which
> were put there previously to get step-by-step feedback.  Removing them
> again resulted in a freeze in short course, maybe just two hibernations
> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> Does it mean that some device driver is at fault?

A driver or one of the platform hooks.

> I'll check if it always fails at the same point (although tracing into
> dpm_suspend_start isn't pure fun because of the multitude of devices it
> loops over).  Is there any way to get printk output from that phase?

Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

> Side question: If I run s2disk from the init=/bin/bash prompt, the
> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
> fires before the "Snapshotting system" phase, but it does not fire if I
> hibernate from the full running desktop.  (That instrumentation was put
> there to investigate the KMS-triggered STR freeze.)  What could explain
> this?

It looks like it uses the "shutdown" method when run with init=/bin/bash, but
I don't know why exactly.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
@ 2009-11-18 22:54                           ` Ferenc Wagner
  2009-11-18 22:54                           ` Ferenc Wagner
                                             ` (5 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18 22:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>> Ferenc Wagner <wferi@niif.hu> writes:
>> 
>> > Since I've instrumented s2disk and the hibernation path, no freeze
>> > happened during hibernating the machine.
>> 
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback.  Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over).  Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

I've been running with CONFIG_PM_VERBOSE=y for a good while, but that
didn't help getting for example the result of the following printks to
the VGA console (0x3bc is the parallel port):

@@ -445,34 +446,66 @@ int hibernation_platform_enter(void)
         * hibernation_ops->finish() before saving the image, so we should let
         * the firmware know that we're going to enter the sleep state after all
         */
+       printk ("hibernation_ops->begin()...\n");
+       outb(16, 0x3bc);
        error = hibernation_ops->begin();
+       outb(17, 0x3bc);
+       printk ("hibernation_ops->begin(): %d\n", error);
        if (error)
                goto Close;

However, my dmesg is full of lines like

agpgart-intel 0000:00:00.0: preparing freeze
pci 0000:00:00.1: preparing freeze
pci 0000:00:00.3: preparing freeze

etc., I'll check it they are the same all the time.  Anyway, the above
printk strings aren't present in dmesg after a successful resume even,
so I must be doing something wrong...  The parport pins do change, though.
Maybe explicit levels would work better?  I can't see any other
difference from the pm_dev_dbg macro producing the above lines.

>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>> fires before the "Snapshotting system" phase, but it does not fire if I
>> hibernate from the full running desktop.  (That instrumentation was put
>> there to investigate the KMS-triggered STR freeze.)  What could explain
>> this?
>
> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
> I don't know why exactly.

Thanks for the tip, I'll check this too.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
  2009-11-18 22:54                           ` Ferenc Wagner
@ 2009-11-18 22:54                           ` Ferenc Wagner
  2009-11-19 12:00                             ` [linux-pm] " Ferenc Wagner
                                             ` (4 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-18 22:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>> Ferenc Wagner <wferi@niif.hu> writes:
>> 
>> > Since I've instrumented s2disk and the hibernation path, no freeze
>> > happened during hibernating the machine.
>> 
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback.  Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over).  Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

I've been running with CONFIG_PM_VERBOSE=y for a good while, but that
didn't help getting for example the result of the following printks to
the VGA console (0x3bc is the parallel port):

@@ -445,34 +446,66 @@ int hibernation_platform_enter(void)
         * hibernation_ops->finish() before saving the image, so we should let
         * the firmware know that we're going to enter the sleep state after all
         */
+       printk ("hibernation_ops->begin()...\n");
+       outb(16, 0x3bc);
        error = hibernation_ops->begin();
+       outb(17, 0x3bc);
+       printk ("hibernation_ops->begin(): %d\n", error);
        if (error)
                goto Close;

However, my dmesg is full of lines like

agpgart-intel 0000:00:00.0: preparing freeze
pci 0000:00:00.1: preparing freeze
pci 0000:00:00.3: preparing freeze

etc., I'll check it they are the same all the time.  Anyway, the above
printk strings aren't present in dmesg after a successful resume even,
so I must be doing something wrong...  The parport pins do change, though.
Maybe explicit levels would work better?  I can't see any other
difference from the pm_dev_dbg macro producing the above lines.

>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>> fires before the "Snapshotting system" phase, but it does not fire if I
>> hibernate from the full running desktop.  (That instrumentation was put
>> there to investigate the KMS-triggered STR freeze.)  What could explain
>> this?
>
> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
> I don't know why exactly.

Thanks for the tip, I'll check this too.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
@ 2009-11-19 12:00                             ` Ferenc Wagner
  2009-11-18 22:54                           ` Ferenc Wagner
                                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-19 12:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>
>>> Ferenc Wagner <wferi@niif.hu> writes:
>>> 
>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>> hibernate from the full running desktop.  (That instrumentation was put
>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>> this?
>>
>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>> I don't know why exactly.
>
> Thanks for the tip, I'll check this too.

While looking into this, I found a reproducible kernel panic:

1. boot with init=/bin/bash
2. mount /usr; swapon -a
3. plug in a USB pendrive
4. s2disk (machine goes to sleep)
5. power on, proceed with resuming, press Enter to cancel resume pause
6. ACPI: Hardware changed while hibernated, cannot resume!
   Kernel panic - not syncing: ACPI S4 hardware signature mismatch

Does it make sense?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
@ 2009-11-19 12:00                             ` Ferenc Wagner
  0 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-19 12:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>
>>> Ferenc Wagner <wferi@niif.hu> writes:
>>> 
>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>> hibernate from the full running desktop.  (That instrumentation was put
>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>> this?
>>
>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>> I don't know why exactly.
>
> Thanks for the tip, I'll check this too.

While looking into this, I found a reproducible kernel panic:

1. boot with init=/bin/bash
2. mount /usr; swapon -a
3. plug in a USB pendrive
4. s2disk (machine goes to sleep)
5. power on, proceed with resuming, press Enter to cancel resume pause
6. ACPI: Hardware changed while hibernated, cannot resume!
   Kernel panic - not syncing: ACPI S4 hardware signature mismatch

Does it make sense?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-19 12:00                             ` [linux-pm] " Ferenc Wagner
  (?)
@ 2009-11-19 13:02                             ` Ferenc Wagner
  2009-11-19 19:42                               ` Rafael J. Wysocki
  2009-11-19 19:42                               ` Rafael J. Wysocki
  -1 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-19 13:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

> Ferenc Wagner <wferi@niif.hu> writes:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <wferi@niif.hu> writes:
>>>> 
>>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>>> hibernate from the full running desktop.  (That instrumentation was put
>>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>>> this?
>>>
>>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>>> I don't know why exactly.
>>
>> Thanks for the tip, I'll check this too.
>
> While looking into this, I found a reproducible kernel panic:
>
> 1. boot with init=/bin/bash
> 2. mount /usr; swapon -a
> 3. plug in a USB pendrive
> 4. s2disk (machine goes to sleep)
> 5. power on, proceed with resuming, press Enter to cancel resume pause
> 6. ACPI: Hardware changed while hibernated, cannot resume!
>    Kernel panic - not syncing: ACPI S4 hardware signature mismatch
>
> Does it make sense?

Yes it does: it's the BIOS USB support playing its childish games.
I can disable it most of the time, except when booting from USB...

I wonder if this problem sould be handled more gracefully, now that USB
persistence is enabled by default.  It works just fine for STR, but
potentially panics after hibernation.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-19 12:00                             ` [linux-pm] " Ferenc Wagner
  (?)
  (?)
@ 2009-11-19 13:02                             ` Ferenc Wagner
  -1 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-19 13:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

> Ferenc Wagner <wferi@niif.hu> writes:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <wferi@niif.hu> writes:
>>>> 
>>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>>> hibernate from the full running desktop.  (That instrumentation was put
>>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>>> this?
>>>
>>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>>> I don't know why exactly.
>>
>> Thanks for the tip, I'll check this too.
>
> While looking into this, I found a reproducible kernel panic:
>
> 1. boot with init=/bin/bash
> 2. mount /usr; swapon -a
> 3. plug in a USB pendrive
> 4. s2disk (machine goes to sleep)
> 5. power on, proceed with resuming, press Enter to cancel resume pause
> 6. ACPI: Hardware changed while hibernated, cannot resume!
>    Kernel panic - not syncing: ACPI S4 hardware signature mismatch
>
> Does it make sense?

Yes it does: it's the BIOS USB support playing its childish games.
I can disable it most of the time, except when booting from USB...

I wonder if this problem sould be handled more gracefully, now that USB
persistence is enabled by default.  It works just fine for STR, but
potentially panics after hibernation.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-19 13:02                             ` Ferenc Wagner
  2009-11-19 19:42                               ` Rafael J. Wysocki
@ 2009-11-19 19:42                               ` Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-19 19:42 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Thursday 19 November 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > Ferenc Wagner <wferi@niif.hu> writes:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>
> >>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> Ferenc Wagner <wferi@niif.hu> writes:
> >>>> 
> >>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
> >>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
> >>>> fires before the "Snapshotting system" phase, but it does not fire if I
> >>>> hibernate from the full running desktop.  (That instrumentation was put
> >>>> there to investigate the KMS-triggered STR freeze.)  What could explain
> >>>> this?
> >>>
> >>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
> >>> I don't know why exactly.
> >>
> >> Thanks for the tip, I'll check this too.
> >
> > While looking into this, I found a reproducible kernel panic:
> >
> > 1. boot with init=/bin/bash
> > 2. mount /usr; swapon -a
> > 3. plug in a USB pendrive
> > 4. s2disk (machine goes to sleep)
> > 5. power on, proceed with resuming, press Enter to cancel resume pause
> > 6. ACPI: Hardware changed while hibernated, cannot resume!
> >    Kernel panic - not syncing: ACPI S4 hardware signature mismatch
> >
> > Does it make sense?
> 
> Yes it does: it's the BIOS USB support playing its childish games.
> I can disable it most of the time, except when booting from USB...
> 
> I wonder if this problem sould be handled more gracefully, now that USB
> persistence is enabled by default.  It works just fine for STR, but
> potentially panics after hibernation.

Add acpi_sleep=s4_nohwsig to the kernel command line.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-19 13:02                             ` Ferenc Wagner
@ 2009-11-19 19:42                               ` Rafael J. Wysocki
  2009-11-19 19:42                               ` Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-19 19:42 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Thursday 19 November 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > Ferenc Wagner <wferi@niif.hu> writes:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>
> >>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> Ferenc Wagner <wferi@niif.hu> writes:
> >>>> 
> >>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
> >>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
> >>>> fires before the "Snapshotting system" phase, but it does not fire if I
> >>>> hibernate from the full running desktop.  (That instrumentation was put
> >>>> there to investigate the KMS-triggered STR freeze.)  What could explain
> >>>> this?
> >>>
> >>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
> >>> I don't know why exactly.
> >>
> >> Thanks for the tip, I'll check this too.
> >
> > While looking into this, I found a reproducible kernel panic:
> >
> > 1. boot with init=/bin/bash
> > 2. mount /usr; swapon -a
> > 3. plug in a USB pendrive
> > 4. s2disk (machine goes to sleep)
> > 5. power on, proceed with resuming, press Enter to cancel resume pause
> > 6. ACPI: Hardware changed while hibernated, cannot resume!
> >    Kernel panic - not syncing: ACPI S4 hardware signature mismatch
> >
> > Does it make sense?
> 
> Yes it does: it's the BIOS USB support playing its childish games.
> I can disable it most of the time, except when booting from USB...
> 
> I wonder if this problem sould be handled more gracefully, now that USB
> persistence is enabled by default.  It works just fine for STR, but
> potentially panics after hibernation.

Add acpi_sleep=s4_nohwsig to the kernel command line.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
                                             ` (3 preceding siblings ...)
  2009-11-21 23:59                           ` Ferenc Wagner
@ 2009-11-21 23:59                           ` Ferenc Wagner
  2009-11-28 19:01                           ` Ferenc Wagner
  2009-11-28 19:01                           ` Ferenc Wagner
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-21 23:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>
>>> Ferenc Wagner <wferi@niif.hu> writes:
>>> 
>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>> happened during hibernating the machine.
>>> 
>>> Not until I removed the delays from hibernation_platform_enter(), which
>>> were put there previously to get step-by-step feedback.  Removing them
>>> again resulted in a freeze in short course, maybe just two hibernations
>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>> Does it mean that some device driver is at fault?
>>
>> A driver or one of the platform hooks.

This result (the freeze happens in dpm_suspend_start) may not stand
anymore: the parport_pc module was loaded and this falsified my results.
That is, after suspending that device, the parport feedback ceased
working, outbs becoming ineffective.

>>> I'll check if it always fails at the same point (although tracing into
>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>> loops over).  Is there any way to get printk output from that phase?
>>
>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>
> I've been running with CONFIG_PM_VERBOSE=y for a good while, but that
> didn't help getting for example the result of the following printks to
> the VGA console (0x3bc is the parallel port):
>
> @@ -445,34 +446,66 @@ int hibernation_platform_enter(void)
>          * hibernation_ops->finish() before saving the image, so we should let
>          * the firmware know that we're going to enter the sleep state after all
>          */
> +       printk ("hibernation_ops->begin()...\n");
> +       outb(16, 0x3bc);
>         error = hibernation_ops->begin();
> +       outb(17, 0x3bc);
> +       printk ("hibernation_ops->begin(): %d\n", error);
>         if (error)
>                 goto Close;

The problem was the very low (1) default suspend loglevel.  After
raising it, the incriminated messages (and lots of other stuff)
appeared.

> However, my dmesg is full of lines like
>
> agpgart-intel 0000:00:00.0: preparing freeze
> pci 0000:00:00.1: preparing freeze
> pci 0000:00:00.3: preparing freeze
>
> etc., I'll check it they are the same all the time.  Anyway, the above
> printk strings aren't present in dmesg after a successful resume even,
> so I must be doing something wrong...

These printks of mine are clearly out of scope of dmesg, because they
don't happen in the context of the resumed system.  I think I'm starting
to understand...

>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>> hibernate from the full running desktop.  (That instrumentation was put
>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>> this?
>>
>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>> I don't know why exactly.
>
> Thanks for the tip, I'll check this too.

Looks like this was the effect of portport_pc being loaded when I run
s2disk from the desktop, but not when I started it after init=/bin/bash.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
                                             ` (2 preceding siblings ...)
  2009-11-19 12:00                             ` [linux-pm] " Ferenc Wagner
@ 2009-11-21 23:59                           ` Ferenc Wagner
  2009-11-21 23:59                           ` [linux-pm] " Ferenc Wagner
                                             ` (2 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-21 23:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>
>>> Ferenc Wagner <wferi@niif.hu> writes:
>>> 
>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>> happened during hibernating the machine.
>>> 
>>> Not until I removed the delays from hibernation_platform_enter(), which
>>> were put there previously to get step-by-step feedback.  Removing them
>>> again resulted in a freeze in short course, maybe just two hibernations
>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>> Does it mean that some device driver is at fault?
>>
>> A driver or one of the platform hooks.

This result (the freeze happens in dpm_suspend_start) may not stand
anymore: the parport_pc module was loaded and this falsified my results.
That is, after suspending that device, the parport feedback ceased
working, outbs becoming ineffective.

>>> I'll check if it always fails at the same point (although tracing into
>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>> loops over).  Is there any way to get printk output from that phase?
>>
>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>
> I've been running with CONFIG_PM_VERBOSE=y for a good while, but that
> didn't help getting for example the result of the following printks to
> the VGA console (0x3bc is the parallel port):
>
> @@ -445,34 +446,66 @@ int hibernation_platform_enter(void)
>          * hibernation_ops->finish() before saving the image, so we should let
>          * the firmware know that we're going to enter the sleep state after all
>          */
> +       printk ("hibernation_ops->begin()...\n");
> +       outb(16, 0x3bc);
>         error = hibernation_ops->begin();
> +       outb(17, 0x3bc);
> +       printk ("hibernation_ops->begin(): %d\n", error);
>         if (error)
>                 goto Close;

The problem was the very low (1) default suspend loglevel.  After
raising it, the incriminated messages (and lots of other stuff)
appeared.

> However, my dmesg is full of lines like
>
> agpgart-intel 0000:00:00.0: preparing freeze
> pci 0000:00:00.1: preparing freeze
> pci 0000:00:00.3: preparing freeze
>
> etc., I'll check it they are the same all the time.  Anyway, the above
> printk strings aren't present in dmesg after a successful resume even,
> so I must be doing something wrong...

These printks of mine are clearly out of scope of dmesg, because they
don't happen in the context of the resumed system.  I think I'm starting
to understand...

>>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>>> fires before the "Snapshotting system" phase, but it does not fire if I
>>> hibernate from the full running desktop.  (That instrumentation was put
>>> there to investigate the KMS-triggered STR freeze.)  What could explain
>>> this?
>>
>> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
>> I don't know why exactly.
>
> Thanks for the tip, I'll check this too.

Looks like this was the effect of portport_pc being loaded when I run
s2disk from the desktop, but not when I started it after init=/bin/bash.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
                                             ` (4 preceding siblings ...)
  2009-11-21 23:59                           ` [linux-pm] " Ferenc Wagner
@ 2009-11-28 19:01                           ` Ferenc Wagner
  2009-11-29  0:29                             ` Rafael J. Wysocki
  2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
  2009-11-28 19:01                           ` Ferenc Wagner
  6 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-28 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>
>> Ferenc Wagner <wferi@niif.hu> writes:
>> 
>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>> happened during hibernating the machine.
>> 
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback.  Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over).  Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

The last message now was:

e100: 0000:02:08.0: hibernate, may wakeup

Looks like hibernating the e100 driver is unstable.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-18 22:13                         ` Rafael J. Wysocki
                                             ` (5 preceding siblings ...)
  2009-11-28 19:01                           ` Ferenc Wagner
@ 2009-11-28 19:01                           ` Ferenc Wagner
  6 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-28 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>
>> Ferenc Wagner <wferi@niif.hu> writes:
>> 
>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>> happened during hibernating the machine.
>> 
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback.  Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over).  Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

The last message now was:

e100: 0000:02:08.0: hibernate, may wakeup

Looks like hibernating the e100 driver is unstable.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-28 19:01                           ` Ferenc Wagner
  2009-11-29  0:29                             ` Rafael J. Wysocki
@ 2009-11-29  0:29                             ` Rafael J. Wysocki
  2009-11-29 10:12                               ` Ferenc Wagner
                                                 ` (2 more replies)
  1 sibling, 3 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-29  0:29 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: linux-pm, Jesse Barnes, Andrew Morton, yakui.zhao, LKML,
	ACPI Devel Maling List, Len Brown

On Saturday 28 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >
> >> Ferenc Wagner <wferi@niif.hu> writes:
> >> 
> >>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>> happened during hibernating the machine.
> >> 
> >> Not until I removed the delays from hibernation_platform_enter(), which
> >> were put there previously to get step-by-step feedback.  Removing them
> >> again resulted in a freeze in short course, maybe just two hibernations
> >> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >> Does it mean that some device driver is at fault?
> >
> > A driver or one of the platform hooks.
> >
> >> I'll check if it always fails at the same point (although tracing into
> >> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >> loops over).  Is there any way to get printk output from that phase?
> >
> > Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> 
> The last message now was:
> 
> e100: 0000:02:08.0: hibernate, may wakeup
> 
> Looks like hibernating the e100 driver is unstable.

Can you verify that by trying to hibernate without the e100 driver?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-28 19:01                           ` Ferenc Wagner
@ 2009-11-29  0:29                             ` Rafael J. Wysocki
  2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-29  0:29 UTC (permalink / raw)
  To: Ferenc Wagner
  Cc: LKML, Jesse Barnes, yakui.zhao, ACPI Devel Maling List, linux-pm,
	Andrew Morton

On Saturday 28 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >
> >> Ferenc Wagner <wferi@niif.hu> writes:
> >> 
> >>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>> happened during hibernating the machine.
> >> 
> >> Not until I removed the delays from hibernation_platform_enter(), which
> >> were put there previously to get step-by-step feedback.  Removing them
> >> again resulted in a freeze in short course, maybe just two hibernations
> >> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >> Does it mean that some device driver is at fault?
> >
> > A driver or one of the platform hooks.
> >
> >> I'll check if it always fails at the same point (although tracing into
> >> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >> loops over).  Is there any way to get printk output from that phase?
> >
> > Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> 
> The last message now was:
> 
> e100: 0000:02:08.0: hibernate, may wakeup
> 
> Looks like hibernating the e100 driver is unstable.

Can you verify that by trying to hibernate without the e100 driver?

Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
@ 2009-11-29 10:12                               ` Ferenc Wagner
  2009-11-29 15:07                                 ` Rafael J. Wysocki
  2009-11-29 15:07                                 ` [linux-pm] " Rafael J. Wysocki
  2009-11-29 10:12                               ` [linux-pm] " Ferenc Wagner
  2009-11-29 10:12                               ` Ferenc Wagner
  2 siblings, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-29 10:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML, linux-netdev

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 28 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <wferi@niif.hu> writes:
>>>> 
>>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>>> happened during hibernating the machine.
>>>> 
>>>> Not until I removed the delays from hibernation_platform_enter(), which
>>>> were put there previously to get step-by-step feedback.  Removing them
>>>> again resulted in a freeze in short course, maybe just two hibernations
>>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>>> Does it mean that some device driver is at fault?
>>>
>>> A driver or one of the platform hooks.
>>>
>>>> I'll check if it always fails at the same point (although tracing into
>>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>>> loops over).  Is there any way to get printk output from that phase?
>>>
>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>> 
>> The last message now was:
>> 
>> e100: 0000:02:08.0: hibernate, may wakeup
>> 
>> Looks like hibernating the e100 driver is unstable.
>
> Can you verify that by trying to hibernate without the e100 driver?

Not really, as I still can't reliable reproduce the issue.  Since I'm
running with suspend loglevel = 8, it's happened only twice (in a row),
with seemingly exact same console output.  Some earlier freezes also
happened in dpm_suspend_start, at least.  However, I can certainly add
e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
with that.  Or I can try stress-testing the module, but not sure, how.
Interestingly, git log v2.6.31.. -- e100.c is tiny, but 8fbd962e affects
the suspend/resume routines through e100_up.  This could explain the
timing-sensitive nature of the issue.  I took the liberty to change the
Cc list, maybe linux-netdev can lend us a hand.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
  2009-11-29 10:12                               ` Ferenc Wagner
@ 2009-11-29 10:12                               ` Ferenc Wagner
  2009-11-29 10:12                               ` Ferenc Wagner
  2 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-29 10:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML, linux-netdev

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 28 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <wferi@niif.hu> writes:
>>>> 
>>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>>> happened during hibernating the machine.
>>>> 
>>>> Not until I removed the delays from hibernation_platform_enter(), which
>>>> were put there previously to get step-by-step feedback.  Removing them
>>>> again resulted in a freeze in short course, maybe just two hibernations
>>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>>> Does it mean that some device driver is at fault?
>>>
>>> A driver or one of the platform hooks.
>>>
>>>> I'll check if it always fails at the same point (although tracing into
>>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>>> loops over).  Is there any way to get printk output from that phase?
>>>
>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>> 
>> The last message now was:
>> 
>> e100: 0000:02:08.0: hibernate, may wakeup
>> 
>> Looks like hibernating the e100 driver is unstable.
>
> Can you verify that by trying to hibernate without the e100 driver?

Not really, as I still can't reliable reproduce the issue.  Since I'm
running with suspend loglevel = 8, it's happened only twice (in a row),
with seemingly exact same console output.  Some earlier freezes also
happened in dpm_suspend_start, at least.  However, I can certainly add
e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
with that.  Or I can try stress-testing the module, but not sure, how.
Interestingly, git log v2.6.31.. -- e100.c is tiny, but 8fbd962e affects
the suspend/resume routines through e100_up.  This could explain the
timing-sensitive nature of the issue.  I took the liberty to change the
Cc list, maybe linux-netdev can lend us a hand.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
  2009-11-29 10:12                               ` Ferenc Wagner
  2009-11-29 10:12                               ` [linux-pm] " Ferenc Wagner
@ 2009-11-29 10:12                               ` Ferenc Wagner
  2 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-11-29 10:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-netdev, Andrew Morton, LKML

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday 28 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <wferi@niif.hu> writes:
>>>> 
>>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>>> happened during hibernating the machine.
>>>> 
>>>> Not until I removed the delays from hibernation_platform_enter(), which
>>>> were put there previously to get step-by-step feedback.  Removing them
>>>> again resulted in a freeze in short course, maybe just two hibernations
>>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>>> Does it mean that some device driver is at fault?
>>>
>>> A driver or one of the platform hooks.
>>>
>>>> I'll check if it always fails at the same point (although tracing into
>>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>>> loops over).  Is there any way to get printk output from that phase?
>>>
>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>> 
>> The last message now was:
>> 
>> e100: 0000:02:08.0: hibernate, may wakeup
>> 
>> Looks like hibernating the e100 driver is unstable.
>
> Can you verify that by trying to hibernate without the e100 driver?

Not really, as I still can't reliable reproduce the issue.  Since I'm
running with suspend loglevel = 8, it's happened only twice (in a row),
with seemingly exact same console output.  Some earlier freezes also
happened in dpm_suspend_start, at least.  However, I can certainly add
e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
with that.  Or I can try stress-testing the module, but not sure, how.
Interestingly, git log v2.6.31.. -- e100.c is tiny, but 8fbd962e affects
the suspend/resume routines through e100_up.  This could explain the
timing-sensitive nature of the issue.  I took the liberty to change the
Cc list, maybe linux-netdev can lend us a hand.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-29 10:12                               ` Ferenc Wagner
  2009-11-29 15:07                                 ` Rafael J. Wysocki
@ 2009-11-29 15:07                                 ` Rafael J. Wysocki
  2009-12-01 10:29                                   ` Ferenc Wagner
  2009-12-01 10:29                                   ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-29 15:07 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML, linux-netdev

On Sunday 29 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Saturday 28 November 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> Ferenc Wagner <wferi@niif.hu> writes:
> >>>> 
> >>>>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>>>> happened during hibernating the machine.
> >>>> 
> >>>> Not until I removed the delays from hibernation_platform_enter(), which
> >>>> were put there previously to get step-by-step feedback.  Removing them
> >>>> again resulted in a freeze in short course, maybe just two hibernations
> >>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >>>> Does it mean that some device driver is at fault?
> >>>
> >>> A driver or one of the platform hooks.
> >>>
> >>>> I'll check if it always fails at the same point (although tracing into
> >>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >>>> loops over).  Is there any way to get printk output from that phase?
> >>>
> >>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >> 
> >> The last message now was:
> >> 
> >> e100: 0000:02:08.0: hibernate, may wakeup
> >> 
> >> Looks like hibernating the e100 driver is unstable.
> >
> > Can you verify that by trying to hibernate without the e100 driver?
> 
> Not really, as I still can't reliable reproduce the issue.  Since I'm
> running with suspend loglevel = 8, it's happened only twice (in a row),
> with seemingly exact same console output.  Some earlier freezes also
> happened in dpm_suspend_start, at least.  However, I can certainly add
> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> with that.

That's what I'd do.  In addition to that, you can run multiple
hibernation/resume cycles in a tight loop using the RTC wakealarm.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-29 10:12                               ` Ferenc Wagner
@ 2009-11-29 15:07                                 ` Rafael J. Wysocki
  2009-11-29 15:07                                 ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-11-29 15:07 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, linux-netdev, Andrew Morton, LKML

On Sunday 29 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Saturday 28 November 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> Ferenc Wagner <wferi@niif.hu> writes:
> >>>> 
> >>>>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>>>> happened during hibernating the machine.
> >>>> 
> >>>> Not until I removed the delays from hibernation_platform_enter(), which
> >>>> were put there previously to get step-by-step feedback.  Removing them
> >>>> again resulted in a freeze in short course, maybe just two hibernations
> >>>> later.  The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >>>> Does it mean that some device driver is at fault?
> >>>
> >>> A driver or one of the platform hooks.
> >>>
> >>>> I'll check if it always fails at the same point (although tracing into
> >>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >>>> loops over).  Is there any way to get printk output from that phase?
> >>>
> >>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >> 
> >> The last message now was:
> >> 
> >> e100: 0000:02:08.0: hibernate, may wakeup
> >> 
> >> Looks like hibernating the e100 driver is unstable.
> >
> > Can you verify that by trying to hibernate without the e100 driver?
> 
> Not really, as I still can't reliable reproduce the issue.  Since I'm
> running with suspend loglevel = 8, it's happened only twice (in a row),
> with seemingly exact same console output.  Some earlier freezes also
> happened in dpm_suspend_start, at least.  However, I can certainly add
> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> with that.

That's what I'd do.  In addition to that, you can run multiple
hibernation/resume cycles in a tight loop using the RTC wakealarm.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-11-29 15:07                                 ` [linux-pm] " Rafael J. Wysocki
@ 2009-12-01 10:29                                   ` Ferenc Wagner
  2009-12-01 12:28                                     ` Rafael J. Wysocki
  2009-12-01 12:28                                     ` [linux-pm] " Rafael J. Wysocki
  2009-12-01 10:29                                   ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-01 10:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML, linux-netdev

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Sunday 29 November 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>> 
>>>> The last message now was:
>>>> 
>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>> 
>>>> Looks like hibernating the e100 driver is unstable.
>>>
>>> Can you verify that by trying to hibernate without the e100 driver?
>> 
>> Not really, as I still can't reliable reproduce the issue.  Since I'm
>> running with suspend loglevel = 8, it's happened only twice (in a row),
>> with seemingly exact same console output.  Some earlier freezes also
>> happened in dpm_suspend_start, at least.  However, I can certainly add
>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>> with that.
>
> That's what I'd do.

That worked out mosty OK (no freeze in quite some hibernation cycles),
but I'm continuing testing it.

On the other hand, I reverted 8fbd962e3, recompiled and replaced the
module, and got the freeze during hibernation.  And that was the bulk of
the changes since 2.6.31...  I'll revert the rest and test again, but
that seems purely cosmetic, so no high hopes.

> In addition to that, you can run multiple hibernation/resume cycles in
> a tight loop using the RTC wakealarm.

I'll do so, as soon as I find a way to automatically supply the dm-crypt
passphrase... or even better, learn to hibernate to ramdisk from the
initramfs. :)
-- 
Cheers,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-11-29 15:07                                 ` [linux-pm] " Rafael J. Wysocki
  2009-12-01 10:29                                   ` Ferenc Wagner
@ 2009-12-01 10:29                                   ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-01 10:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-netdev, Andrew Morton, LKML

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Sunday 29 November 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>> 
>>>> The last message now was:
>>>> 
>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>> 
>>>> Looks like hibernating the e100 driver is unstable.
>>>
>>> Can you verify that by trying to hibernate without the e100 driver?
>> 
>> Not really, as I still can't reliable reproduce the issue.  Since I'm
>> running with suspend loglevel = 8, it's happened only twice (in a row),
>> with seemingly exact same console output.  Some earlier freezes also
>> happened in dpm_suspend_start, at least.  However, I can certainly add
>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>> with that.
>
> That's what I'd do.

That worked out mosty OK (no freeze in quite some hibernation cycles),
but I'm continuing testing it.

On the other hand, I reverted 8fbd962e3, recompiled and replaced the
module, and got the freeze during hibernation.  And that was the bulk of
the changes since 2.6.31...  I'll revert the rest and test again, but
that seems purely cosmetic, so no high hopes.

> In addition to that, you can run multiple hibernation/resume cycles in
> a tight loop using the RTC wakealarm.

I'll do so, as soon as I find a way to automatically supply the dm-crypt
passphrase... or even better, learn to hibernate to ramdisk from the
initramfs. :)
-- 
Cheers,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-01 10:29                                   ` Ferenc Wagner
  2009-12-01 12:28                                     ` Rafael J. Wysocki
@ 2009-12-01 12:28                                     ` Rafael J. Wysocki
  2009-12-01 17:46                                       ` Ferenc Wagner
  2009-12-01 17:46                                       ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-01 12:28 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML, linux-netdev

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Sunday 29 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>> 
> >>>> The last message now was:
> >>>> 
> >>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>> 
> >>>> Looks like hibernating the e100 driver is unstable.
> >>>
> >>> Can you verify that by trying to hibernate without the e100 driver?
> >> 
> >> Not really, as I still can't reliable reproduce the issue.  Since I'm
> >> running with suspend loglevel = 8, it's happened only twice (in a row),
> >> with seemingly exact same console output.  Some earlier freezes also
> >> happened in dpm_suspend_start, at least.  However, I can certainly add
> >> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >> with that.
> >
> > That's what I'd do.
> 
> That worked out mosty OK (no freeze in quite some hibernation cycles),
> but I'm continuing testing it.

Great, please let me know how it works out.

> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> module, and got the freeze during hibernation.  And that was the bulk of
> the changes since 2.6.31...  I'll revert the rest and test again, but
> that seems purely cosmetic, so no high hopes.
> 
> > In addition to that, you can run multiple hibernation/resume cycles in
> > a tight loop using the RTC wakealarm.
> 
> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> passphrase... or even better, learn to hibernate to ramdisk from the
> initramfs. :)

Well, you don't need to use swap encryptuon for _testing_. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-01 10:29                                   ` Ferenc Wagner
@ 2009-12-01 12:28                                     ` Rafael J. Wysocki
  2009-12-01 12:28                                     ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-01 12:28 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, linux-netdev, Andrew Morton, LKML

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Sunday 29 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>> 
> >>>> The last message now was:
> >>>> 
> >>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>> 
> >>>> Looks like hibernating the e100 driver is unstable.
> >>>
> >>> Can you verify that by trying to hibernate without the e100 driver?
> >> 
> >> Not really, as I still can't reliable reproduce the issue.  Since I'm
> >> running with suspend loglevel = 8, it's happened only twice (in a row),
> >> with seemingly exact same console output.  Some earlier freezes also
> >> happened in dpm_suspend_start, at least.  However, I can certainly add
> >> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >> with that.
> >
> > That's what I'd do.
> 
> That worked out mosty OK (no freeze in quite some hibernation cycles),
> but I'm continuing testing it.

Great, please let me know how it works out.

> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> module, and got the freeze during hibernation.  And that was the bulk of
> the changes since 2.6.31...  I'll revert the rest and test again, but
> that seems purely cosmetic, so no high hopes.
> 
> > In addition to that, you can run multiple hibernation/resume cycles in
> > a tight loop using the RTC wakealarm.
> 
> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> passphrase... or even better, learn to hibernate to ramdisk from the
> initramfs. :)

Well, you don't need to use swap encryptuon for _testing_. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-01 12:28                                     ` [linux-pm] " Rafael J. Wysocki
@ 2009-12-01 17:46                                       ` Ferenc Wagner
  2009-12-01 21:32                                         ` Rafael J. Wysocki
  2009-12-01 21:32                                         ` Rafael J. Wysocki
  2009-12-01 17:46                                       ` Ferenc Wagner
  1 sibling, 2 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-01 17:46 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML, netdev

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Sunday 29 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>>>
>>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>>> 
>>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>>>> 
>>>>>> The last message now was:
>>>>>> 
>>>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>>>> 
>>>>>> Looks like hibernating the e100 driver is unstable.
>>>>>
>>>>> Can you verify that by trying to hibernate without the e100 driver?
>>>> 
>>>> Not really, as I still can't reliable reproduce the issue.  Since I'm
>>>> running with suspend loglevel = 8, it's happened only twice (in a row),
>>>> with seemingly exact same console output.  Some earlier freezes also
>>>> happened in dpm_suspend_start, at least.  However, I can certainly add
>>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>>>> with that.
>>>
>>> That's what I'd do.
>> 
>> That worked out mosty OK (no freeze in quite some hibernation cycles),
>> but I'm continuing testing it.
>
> Great, please let me know how it works out.

Will do.  On the negative side, this tends to confuse NetworkManager.

>> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
>> module, and got the freeze during hibernation.  And that was the bulk of
>> the changes since 2.6.31...  I'll revert the rest and test again, but
>> that seems purely cosmetic, so no high hopes.
>> 
>>> In addition to that, you can run multiple hibernation/resume cycles in
>>> a tight loop using the RTC wakealarm.
>> 
>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>> passphrase... or even better, learn to hibernate to ramdisk from the
>> initramfs. :)
>
> Well, you don't need to use swap encryption for _testing_. :-)

I use partition encryption, everything except for /boot is encrypted.
Apropos: does s2disk perform encryption with a temporary key even if I
don't supply and RSA key, to protect mlocked application data from being
present in the swap after restore?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-01 12:28                                     ` [linux-pm] " Rafael J. Wysocki
  2009-12-01 17:46                                       ` Ferenc Wagner
@ 2009-12-01 17:46                                       ` Ferenc Wagner
  1 sibling, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-01 17:46 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: netdev, linux-pm, Andrew Morton, LKML

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Sunday 29 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>> 
>>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>>>
>>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>>> 
>>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>>>> 
>>>>>> The last message now was:
>>>>>> 
>>>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>>>> 
>>>>>> Looks like hibernating the e100 driver is unstable.
>>>>>
>>>>> Can you verify that by trying to hibernate without the e100 driver?
>>>> 
>>>> Not really, as I still can't reliable reproduce the issue.  Since I'm
>>>> running with suspend loglevel = 8, it's happened only twice (in a row),
>>>> with seemingly exact same console output.  Some earlier freezes also
>>>> happened in dpm_suspend_start, at least.  However, I can certainly add
>>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>>>> with that.
>>>
>>> That's what I'd do.
>> 
>> That worked out mosty OK (no freeze in quite some hibernation cycles),
>> but I'm continuing testing it.
>
> Great, please let me know how it works out.

Will do.  On the negative side, this tends to confuse NetworkManager.

>> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
>> module, and got the freeze during hibernation.  And that was the bulk of
>> the changes since 2.6.31...  I'll revert the rest and test again, but
>> that seems purely cosmetic, so no high hopes.
>> 
>>> In addition to that, you can run multiple hibernation/resume cycles in
>>> a tight loop using the RTC wakealarm.
>> 
>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>> passphrase... or even better, learn to hibernate to ramdisk from the
>> initramfs. :)
>
> Well, you don't need to use swap encryption for _testing_. :-)

I use partition encryption, everything except for /boot is encrypted.
Apropos: does s2disk perform encryption with a temporary key even if I
don't supply and RSA key, to protect mlocked application data from being
present in the swap after restore?
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-01 17:46                                       ` Ferenc Wagner
@ 2009-12-01 21:32                                         ` Rafael J. Wysocki
  2009-12-02  1:58                                           ` Ferenc Wagner
                                                             ` (2 more replies)
  2009-12-01 21:32                                         ` Rafael J. Wysocki
  1 sibling, 3 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-01 21:32 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML, netdev

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Sunday 29 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>>>
> >>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>> 
> >>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>>>> 
> >>>>>> The last message now was:
> >>>>>> 
> >>>>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>>>> 
> >>>>>> Looks like hibernating the e100 driver is unstable.
> >>>>>
> >>>>> Can you verify that by trying to hibernate without the e100 driver?
> >>>> 
> >>>> Not really, as I still can't reliable reproduce the issue.  Since I'm
> >>>> running with suspend loglevel = 8, it's happened only twice (in a row),
> >>>> with seemingly exact same console output.  Some earlier freezes also
> >>>> happened in dpm_suspend_start, at least.  However, I can certainly add
> >>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >>>> with that.
> >>>
> >>> That's what I'd do.
> >> 
> >> That worked out mosty OK (no freeze in quite some hibernation cycles),
> >> but I'm continuing testing it.
> >
> > Great, please let me know how it works out.
> 
> Will do.  On the negative side, this tends to confuse NetworkManager.
> 
> >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> >> module, and got the freeze during hibernation.  And that was the bulk of
> >> the changes since 2.6.31...  I'll revert the rest and test again, but
> >> that seems purely cosmetic, so no high hopes.
> >> 
> >>> In addition to that, you can run multiple hibernation/resume cycles in
> >>> a tight loop using the RTC wakealarm.
> >> 
> >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >> passphrase... or even better, learn to hibernate to ramdisk from the
> >> initramfs. :)
> >
> > Well, you don't need to use swap encryption for _testing_. :-)
> 
> I use partition encryption, everything except for /boot is encrypted.

If /boot is big enough, you could use a swap file in /boot for the testing.

> Apropos: does s2disk perform encryption with a temporary key even if I
> don't supply and RSA key, to protect mlocked application data from being
> present in the swap after restore?

It can do that, but you need to provide a key during suspend and resume.

Otherwise it doesn't use a random key, because it would have to store it in
the clear in the image header.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-01 17:46                                       ` Ferenc Wagner
  2009-12-01 21:32                                         ` Rafael J. Wysocki
@ 2009-12-01 21:32                                         ` Rafael J. Wysocki
  1 sibling, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-01 21:32 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: netdev, linux-pm, Andrew Morton, LKML

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >>> On Sunday 29 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>> 
> >>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>>>
> >>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>> 
> >>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>>>> 
> >>>>>> The last message now was:
> >>>>>> 
> >>>>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>>>> 
> >>>>>> Looks like hibernating the e100 driver is unstable.
> >>>>>
> >>>>> Can you verify that by trying to hibernate without the e100 driver?
> >>>> 
> >>>> Not really, as I still can't reliable reproduce the issue.  Since I'm
> >>>> running with suspend loglevel = 8, it's happened only twice (in a row),
> >>>> with seemingly exact same console output.  Some earlier freezes also
> >>>> happened in dpm_suspend_start, at least.  However, I can certainly add
> >>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >>>> with that.
> >>>
> >>> That's what I'd do.
> >> 
> >> That worked out mosty OK (no freeze in quite some hibernation cycles),
> >> but I'm continuing testing it.
> >
> > Great, please let me know how it works out.
> 
> Will do.  On the negative side, this tends to confuse NetworkManager.
> 
> >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> >> module, and got the freeze during hibernation.  And that was the bulk of
> >> the changes since 2.6.31...  I'll revert the rest and test again, but
> >> that seems purely cosmetic, so no high hopes.
> >> 
> >>> In addition to that, you can run multiple hibernation/resume cycles in
> >>> a tight loop using the RTC wakealarm.
> >> 
> >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >> passphrase... or even better, learn to hibernate to ramdisk from the
> >> initramfs. :)
> >
> > Well, you don't need to use swap encryption for _testing_. :-)
> 
> I use partition encryption, everything except for /boot is encrypted.

If /boot is big enough, you could use a swap file in /boot for the testing.

> Apropos: does s2disk perform encryption with a temporary key even if I
> don't supply and RSA key, to protect mlocked application data from being
> present in the swap after restore?

It can do that, but you need to provide a key during suspend and resume.

Otherwise it doesn't use a random key, because it would have to store it in
the clear in the image header.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-01 21:32                                         ` Rafael J. Wysocki
@ 2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-02 10:55                                             ` Ferenc Wagner
                                                               ` (3 more replies)
  2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-12 19:27                                             ` s2disk encryption was " Pavel Machek
  2 siblings, 4 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-02  1:58 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>
>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>> a tight loop using the RTC wakealarm.
>>>> 
>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>> initramfs. :)
>>>
>>> Well, you don't need to use swap encryption for _testing_. :-)
>> 
>> I use partition encryption, everything except for /boot is encrypted.
>
> If /boot is big enough, you could use a swap file in /boot for the testing.

Ramdisk worked good.  Maybe too good, because I left the machine doing
s2disks while I was having dinner, and it achieved some 120 suspends
without a freeze.  Only the e100 and the mii modules were loaded.

After some script munging I got the machine automatically boot with an
alternate passphrase, so in vivo testing is possible now.  I mean,
tomorrow.

Btw. s2disk has a strange effect of simulating enters during suspend.
It looks like this in a terminal:

$ sudo s2disk



















$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ <cursor is here>

Can you also see this?

>> Apropos: does s2disk perform encryption with a temporary key even if I
>> don't supply and RSA key, to protect mlocked application data from being
>> present in the swap after restore?
>
> It can do that, but you need to provide a key during suspend and resume.
>
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

So you don't feel like the "What is this 'Encrypt suspend image' for?"
Q&A in Documentation/swsusp.txt describes a real threat, do you?  If
an "application" has direct access to swap, then it's game over anyway.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-01 21:32                                         ` Rafael J. Wysocki
  2009-12-02  1:58                                           ` Ferenc Wagner
@ 2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-12 19:27                                             ` s2disk encryption was " Pavel Machek
  2 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-02  1:58 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>
>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>> a tight loop using the RTC wakealarm.
>>>> 
>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>> initramfs. :)
>>>
>>> Well, you don't need to use swap encryption for _testing_. :-)
>> 
>> I use partition encryption, everything except for /boot is encrypted.
>
> If /boot is big enough, you could use a swap file in /boot for the testing.

Ramdisk worked good.  Maybe too good, because I left the machine doing
s2disks while I was having dinner, and it achieved some 120 suspends
without a freeze.  Only the e100 and the mii modules were loaded.

After some script munging I got the machine automatically boot with an
alternate passphrase, so in vivo testing is possible now.  I mean,
tomorrow.

Btw. s2disk has a strange effect of simulating enters during suspend.
It looks like this in a terminal:

$ sudo s2disk



















$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ 
$ <cursor is here>

Can you also see this?

>> Apropos: does s2disk perform encryption with a temporary key even if I
>> don't supply and RSA key, to protect mlocked application data from being
>> present in the swap after restore?
>
> It can do that, but you need to provide a key during suspend and resume.
>
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

So you don't feel like the "What is this 'Encrypt suspend image' for?"
Q&A in Documentation/swsusp.txt describes a real threat, do you?  If
an "application" has direct access to swap, then it's game over anyway.
-- 
Thanks,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02  1:58                                           ` Ferenc Wagner
@ 2009-12-02 10:55                                             ` Ferenc Wagner
  2009-12-02 21:33                                               ` Rafael J. Wysocki
                                                                 ` (4 more replies)
  2009-12-02 10:55                                             ` Ferenc Wagner
                                                               ` (2 subsequent siblings)
  3 siblings, 5 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-02 10:55 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>
>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>> 
>>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>>
>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>>
>>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>>> a tight loop using the RTC wakealarm.
>>>>> 
>>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>>> initramfs. :)
>>>>
>>>> Well, you don't need to use swap encryption for _testing_. :-)
>>> 
>>> I use partition encryption, everything except for /boot is encrypted.
>>
>> If /boot is big enough, you could use a swap file in /boot for the testing.
>
> Ramdisk worked good.  Maybe too good, because I left the machine doing
> s2disks while I was having dinner, and it achieved some 120 suspends
> without a freeze.  Only the e100 and the mii modules were loaded.
>
> After some script munging I got the machine automatically boot with an
> alternate passphrase, so in vivo testing is possible now.  I mean,
> tomorrow.

After almost 100 hibernate/resume cycles, I have to say that this issue
can't be reproduced by suspending in a tight loop.  I tried that while
flood pinging my gateway and also with no network activity.  The rc8
e100 module was loaded all the time.

> Btw. s2disk has a strange effect of simulating enters during suspend.
> [...]
> Can you also see this?

It can't be seen from the "sleep 1; s2disk" command, so it's probably an
artifact from X, when s2disk starts before Enter is released.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-02 10:55                                             ` Ferenc Wagner
@ 2009-12-02 10:55                                             ` Ferenc Wagner
  2009-12-02 12:27                                             ` [linux-pm] " Stefan Seyfried
  2009-12-02 12:27                                             ` Stefan Seyfried
  3 siblings, 0 replies; 88+ messages in thread
From: Ferenc Wagner @ 2009-12-02 10:55 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Andrew Morton, LKML

Ferenc Wagner <wferi@niif.hu> writes:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>
>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>> 
>>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>>
>>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>>>>
>>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>>> a tight loop using the RTC wakealarm.
>>>>> 
>>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>>> initramfs. :)
>>>>
>>>> Well, you don't need to use swap encryption for _testing_. :-)
>>> 
>>> I use partition encryption, everything except for /boot is encrypted.
>>
>> If /boot is big enough, you could use a swap file in /boot for the testing.
>
> Ramdisk worked good.  Maybe too good, because I left the machine doing
> s2disks while I was having dinner, and it achieved some 120 suspends
> without a freeze.  Only the e100 and the mii modules were loaded.
>
> After some script munging I got the machine automatically boot with an
> alternate passphrase, so in vivo testing is possible now.  I mean,
> tomorrow.

After almost 100 hibernate/resume cycles, I have to say that this issue
can't be reproduced by suspending in a tight loop.  I tried that while
flood pinging my gateway and also with no network activity.  The rc8
e100 module was loaded all the time.

> Btw. s2disk has a strange effect of simulating enters during suspend.
> [...]
> Can you also see this?

It can't be seen from the "sleep 1; s2disk" command, so it's probably an
artifact from X, when s2disk starts before Enter is released.
-- 
Regards,
Feri.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-02 10:55                                             ` Ferenc Wagner
  2009-12-02 10:55                                             ` Ferenc Wagner
@ 2009-12-02 12:27                                             ` Stefan Seyfried
  2009-12-02 12:27                                             ` Stefan Seyfried
  3 siblings, 0 replies; 88+ messages in thread
From: Stefan Seyfried @ 2009-12-02 12:27 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: Rafael J. Wysocki, linux-pm, Andrew Morton, LKML

On Wed, 02 Dec 2009 02:58:41 +0100
Ferenc Wagner <wferi@niif.hu> wrote:

> Btw. s2disk has a strange effect of simulating enters during suspend.
> It looks like this in a terminal:
> 
> $ sudo s2disk

...

> 
> 
> 
> 
> 
> 
> $ 
> $ 
> $ 

...

> $ 
> $ 
> $ 
> $ 
> $ <cursor is here>
> 
> Can you also see this?

That's an old "bug" in X (but fixed for me since quite some time) IIUC:
you type s2disk <enter>, s2disk switches to console 1, while enter is
still pressed.
=> X does not know that enter got "released" and will only notice after
resume finished. The X internal key autorepeat does the rest.

Does not happen for me since at least one year, I'd guess, but I am
always running the latest and greatest bleeding edge of everything ;)

HTH,

	seife
-- 
Stefan Seyfried

"Any ideas, John?"
"Well, surrounding them's out."

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-02  1:58                                           ` Ferenc Wagner
                                                               ` (2 preceding siblings ...)
  2009-12-02 12:27                                             ` [linux-pm] " Stefan Seyfried
@ 2009-12-02 12:27                                             ` Stefan Seyfried
  3 siblings, 0 replies; 88+ messages in thread
From: Stefan Seyfried @ 2009-12-02 12:27 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

On Wed, 02 Dec 2009 02:58:41 +0100
Ferenc Wagner <wferi@niif.hu> wrote:

> Btw. s2disk has a strange effect of simulating enters during suspend.
> It looks like this in a terminal:
> 
> $ sudo s2disk

...

> 
> 
> 
> 
> 
> 
> $ 
> $ 
> $ 

...

> $ 
> $ 
> $ 
> $ 
> $ <cursor is here>
> 
> Can you also see this?

That's an old "bug" in X (but fixed for me since quite some time) IIUC:
you type s2disk <enter>, s2disk switches to console 1, while enter is
still pressed.
=> X does not know that enter got "released" and will only notice after
resume finished. The X internal key autorepeat does the rest.

Does not happen for me since at least one year, I'd guess, but I am
always running the latest and greatest bleeding edge of everything ;)

HTH,

	seife
-- 
Stefan Seyfried

"Any ideas, John?"
"Well, surrounding them's out."

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02 10:55                                             ` Ferenc Wagner
  2009-12-02 21:33                                               ` Rafael J. Wysocki
@ 2009-12-02 21:33                                               ` Rafael J. Wysocki
  2009-12-02 21:41                                                 ` Mikael Abrahamsson
  2009-12-02 21:41                                                 ` [linux-pm] " Mikael Abrahamsson
  2009-12-02 21:49                                               ` Rafael J. Wysocki
                                                                 ` (2 subsequent siblings)
  4 siblings, 2 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-02 21:33 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>> 
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>> 
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>> 
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good.  Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze.  Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now.  I mean,
> > tomorrow.
> 
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop.  I tried that while
> flood pinging my gateway and also with no network activity.  The rc8
> e100 module was loaded all the time.

Then I guess we do something that confuses your machine's BIOS or it's a
timing-related issue (ie. there has to be a substantial delay between the
hibernation and restore to trigger the problem).  Or both.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-02 10:55                                             ` Ferenc Wagner
@ 2009-12-02 21:33                                               ` Rafael J. Wysocki
  2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
                                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-02 21:33 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>> 
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>> 
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>> 
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good.  Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze.  Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now.  I mean,
> > tomorrow.
> 
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop.  I tried that while
> flood pinging my gateway and also with no network activity.  The rc8
> e100 module was loaded all the time.

Then I guess we do something that confuses your machine's BIOS or it's a
timing-related issue (ie. there has to be a substantial delay between the
hibernation and restore to trigger the problem).  Or both.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
  2009-12-02 21:41                                                 ` Mikael Abrahamsson
@ 2009-12-02 21:41                                                 ` Mikael Abrahamsson
  1 sibling, 0 replies; 88+ messages in thread
From: Mikael Abrahamsson @ 2009-12-02 21:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Ferenc Wagner, linux-pm, Andrew Morton, LKML

On Wed, 2 Dec 2009, Rafael J. Wysocki wrote:

> Then I guess we do something that confuses your machine's BIOS or it's a 
> timing-related issue (ie. there has to be a substantial delay between 
> the hibernation and restore to trigger the problem).  Or both.

I don't know if it's related, but on 2.6.31.something (ubuntu 9.10 stock 
kernel) I have intermittent suspend/resume problems on my Thinkpad X200, 
and I get the problems much more frequent when running on batteries than 
when I'm running with the power cable plugged in. Just saying that this 
might be something to test as well.

Link to "my" bug on launchpad:

<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/473876>

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
@ 2009-12-02 21:41                                                 ` Mikael Abrahamsson
  2009-12-02 21:41                                                 ` [linux-pm] " Mikael Abrahamsson
  1 sibling, 0 replies; 88+ messages in thread
From: Mikael Abrahamsson @ 2009-12-02 21:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Ferenc Wagner, Andrew Morton, LKML

On Wed, 2 Dec 2009, Rafael J. Wysocki wrote:

> Then I guess we do something that confuses your machine's BIOS or it's a 
> timing-related issue (ie. there has to be a substantial delay between 
> the hibernation and restore to trigger the problem).  Or both.

I don't know if it's related, but on 2.6.31.something (ubuntu 9.10 stock 
kernel) I have intermittent suspend/resume problems on my Thinkpad X200, 
and I get the problems much more frequent when running on batteries than 
when I'm running with the power cable plugged in. Just saying that this 
might be something to test as well.

Link to "my" bug on launchpad:

<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/473876>

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02 10:55                                             ` Ferenc Wagner
  2009-12-02 21:33                                               ` Rafael J. Wysocki
  2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
@ 2009-12-02 21:49                                               ` Rafael J. Wysocki
  2009-12-02 21:49                                               ` Rafael J. Wysocki
  2009-12-12 19:31                                                 ` Pavel Machek
  4 siblings, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-02 21:49 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>> 
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>> 
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>> 
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good.  Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze.  Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now.  I mean,
> > tomorrow.
> 
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop.  I tried that while
> flood pinging my gateway and also with no network activity.  The rc8
> e100 module was loaded all the time.

I wonder if this patch:

http://patchwork.kernel.org/patch/64276/

helps in your case.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
  2009-12-02 10:55                                             ` Ferenc Wagner
                                                                 ` (2 preceding siblings ...)
  2009-12-02 21:49                                               ` Rafael J. Wysocki
@ 2009-12-02 21:49                                               ` Rafael J. Wysocki
  2009-12-12 19:31                                                 ` Pavel Machek
  4 siblings, 0 replies; 88+ messages in thread
From: Rafael J. Wysocki @ 2009-12-02 21:49 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <wferi@niif.hu> writes:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>> 
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>> 
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>> 
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good.  Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze.  Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now.  I mean,
> > tomorrow.
> 
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop.  I tried that while
> flood pinging my gateway and also with no network activity.  The rc8
> e100 module was loaded all the time.

I wonder if this patch:

http://patchwork.kernel.org/patch/64276/

helps in your case.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 88+ messages in thread

* s2disk encryption was Re: [linux-pm] intermittent suspend problem again
  2009-12-01 21:32                                         ` Rafael J. Wysocki
@ 2009-12-12 19:27                                             ` Pavel Machek
  2009-12-02  1:58                                           ` Ferenc Wagner
  2009-12-12 19:27                                             ` s2disk encryption was " Pavel Machek
  2 siblings, 0 replies; 88+ messages in thread
From: Pavel Machek @ 2009-12-12 19:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Ferenc Wagner, linux-pm, Andrew Morton, LKML, netdev

Hi!

> > >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> > >> module, and got the freeze during hibernation.  And that was the bulk of
> > >> the changes since 2.6.31...  I'll revert the rest and test again, but
> > >> that seems purely cosmetic, so no high hopes.
> > >> 
> > >>> In addition to that, you can run multiple hibernation/resume cycles in
> > >>> a tight loop using the RTC wakealarm.
> > >> 
> > >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> > >> passphrase... or even better, learn to hibernate to ramdisk from the
> > >> initramfs. :)
> > >
> > > Well, you don't need to use swap encryption for _testing_. :-)
> > 
> > I use partition encryption, everything except for /boot is encrypted.
> 
> If /boot is big enough, you could use a swap file in /boot for the testing.
> 
> > Apropos: does s2disk perform encryption with a temporary key even if I
> > don't supply and RSA key, to protect mlocked application data from being
> > present in the swap after restore?
> 
> It can do that, but you need to provide a key during suspend and resume.
> 
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

I believe it can use random key, stored in clear in image
header. Reason is... image header is easier to overwrite than removing
whole image.

That was original motivation for encryption... not having to overwrite
swap data with zeros.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 88+ messages in thread

* s2disk encryption was Re: intermittent suspend problem again
@ 2009-12-12 19:27                                             ` Pavel Machek
  0 siblings, 0 replies; 88+ messages in thread
From: Pavel Machek @ 2009-12-12 19:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: netdev, linux-pm, Ferenc Wagner, Andrew Morton, LKML

Hi!

> > >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> > >> module, and got the freeze during hibernation.  And that was the bulk of
> > >> the changes since 2.6.31...  I'll revert the rest and test again, but
> > >> that seems purely cosmetic, so no high hopes.
> > >> 
> > >>> In addition to that, you can run multiple hibernation/resume cycles in
> > >>> a tight loop using the RTC wakealarm.
> > >> 
> > >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> > >> passphrase... or even better, learn to hibernate to ramdisk from the
> > >> initramfs. :)
> > >
> > > Well, you don't need to use swap encryption for _testing_. :-)
> > 
> > I use partition encryption, everything except for /boot is encrypted.
> 
> If /boot is big enough, you could use a swap file in /boot for the testing.
> 
> > Apropos: does s2disk perform encryption with a temporary key even if I
> > don't supply and RSA key, to protect mlocked application data from being
> > present in the swap after restore?
> 
> It can do that, but you need to provide a key during suspend and resume.
> 
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

I believe it can use random key, stored in clear in image
header. Reason is... image header is easier to overwrite than removing
whole image.

That was original motivation for encryption... not having to overwrite
swap data with zeros.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [linux-pm] intermittent suspend problem again
  2009-12-02 10:55                                             ` Ferenc Wagner
@ 2009-12-12 19:31                                                 ` Pavel Machek
  2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
                                                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 88+ messages in thread
From: Pavel Machek @ 2009-12-12 19:31 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: Rafael J. Wysocki, linux-pm, Andrew Morton, LKML

Hi!

> > Btw. s2disk has a strange effect of simulating enters during suspend.
> > [...]
> > Can you also see this?
> 
> It can't be seen from the "sleep 1; s2disk" command, so it's probably an
> artifact from X, when s2disk starts before Enter is released.

Yes, I see that, too, and believe it is X artifact.

(X keyboard handling is really broken: it does its own
autorepeat. That does not really work when big latencies are around.)
								Pavel 
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: intermittent suspend problem again
@ 2009-12-12 19:31                                                 ` Pavel Machek
  0 siblings, 0 replies; 88+ messages in thread
From: Pavel Machek @ 2009-12-12 19:31 UTC (permalink / raw)
  To: Ferenc Wagner; +Cc: linux-pm, Andrew Morton, LKML

Hi!

> > Btw. s2disk has a strange effect of simulating enters during suspend.
> > [...]
> > Can you also see this?
> 
> It can't be seen from the "sleep 1; s2disk" command, so it's probably an
> artifact from X, when s2disk starts before Enter is released.

Yes, I see that, too, and believe it is X artifact.

(X keyboard handling is really broken: it does its own
autorepeat. That does not really work when big latencies are around.)
								Pavel 
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2009-12-13  1:41 UTC | newest]

Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-28 11:43 intermittent suspend problem again Ferenc Wagner
2009-10-28 18:56 ` Rafael J. Wysocki
2009-10-29  0:11   ` Ferenc Wagner
2009-10-29 18:36     ` Rafael J. Wysocki
2009-10-29 18:36     ` [linux-pm] " Rafael J. Wysocki
2009-10-29 22:31       ` Ferenc Wagner
2009-10-29 22:31       ` [linux-pm] " Ferenc Wagner
2009-10-30 18:18         ` Rafael J. Wysocki
2009-10-30 18:18         ` [linux-pm] " Rafael J. Wysocki
2009-10-30 19:03           ` Ferenc Wagner
2009-10-30 19:03             ` [linux-pm] " Ferenc Wagner
2009-10-30 20:38             ` Rafael J. Wysocki
2009-10-30 20:38             ` [linux-pm] " Rafael J. Wysocki
2009-10-31 12:02               ` Alan Jenkins
2009-10-31 12:02               ` [linux-pm] " Alan Jenkins
2009-10-31 14:06                 ` Ferenc Wagner
2009-10-31 14:06                 ` [linux-pm] " Ferenc Wagner
2009-10-31 19:11                   ` Rafael J. Wysocki
2009-10-31 19:11                   ` [linux-pm] " Rafael J. Wysocki
2009-11-01 21:53                     ` Ferenc Wagner
2009-11-03 11:02                       ` Ferenc Wagner
2009-11-03 11:02                       ` Ferenc Wagner
2009-11-01 21:53                     ` Ferenc Wagner
2009-11-11 11:29       ` Ferenc Wagner
2009-11-11 11:29       ` [linux-pm] " Ferenc Wagner
2009-11-11 11:38         ` Rafael J. Wysocki
2009-11-11 11:38         ` [linux-pm] " Rafael J. Wysocki
2009-11-11 13:29           ` Ferenc Wagner
2009-11-11 13:29           ` [linux-pm] " Ferenc Wagner
2009-11-11 14:47             ` Rafael J. Wysocki
2009-11-11 14:47             ` [linux-pm] " Rafael J. Wysocki
2009-11-13 16:35               ` Ferenc Wagner
2009-11-13 16:35               ` [linux-pm] " Ferenc Wagner
2009-11-13 19:59                 ` Rafael J. Wysocki
2009-11-13 19:59                 ` [linux-pm] " Rafael J. Wysocki
2009-11-14  1:50                   ` Ferenc Wagner
2009-11-14 18:52                     ` Rafael J. Wysocki
2009-11-18  1:12                       ` Ferenc Wagner
2009-11-18  1:12                       ` [linux-pm] " Ferenc Wagner
2009-11-18 14:05                       ` Ferenc Wagner
2009-11-18 14:05                       ` [linux-pm] " Ferenc Wagner
2009-11-18 22:13                         ` Rafael J. Wysocki
2009-11-18 22:54                           ` Ferenc Wagner
2009-11-18 22:54                           ` Ferenc Wagner
2009-11-19 12:00                           ` Ferenc Wagner
2009-11-19 12:00                             ` [linux-pm] " Ferenc Wagner
2009-11-19 13:02                             ` Ferenc Wagner
2009-11-19 19:42                               ` Rafael J. Wysocki
2009-11-19 19:42                               ` Rafael J. Wysocki
2009-11-19 13:02                             ` Ferenc Wagner
2009-11-21 23:59                           ` Ferenc Wagner
2009-11-21 23:59                           ` [linux-pm] " Ferenc Wagner
2009-11-28 19:01                           ` Ferenc Wagner
2009-11-29  0:29                             ` Rafael J. Wysocki
2009-11-29  0:29                             ` [linux-pm] " Rafael J. Wysocki
2009-11-29 10:12                               ` Ferenc Wagner
2009-11-29 15:07                                 ` Rafael J. Wysocki
2009-11-29 15:07                                 ` [linux-pm] " Rafael J. Wysocki
2009-12-01 10:29                                   ` Ferenc Wagner
2009-12-01 12:28                                     ` Rafael J. Wysocki
2009-12-01 12:28                                     ` [linux-pm] " Rafael J. Wysocki
2009-12-01 17:46                                       ` Ferenc Wagner
2009-12-01 21:32                                         ` Rafael J. Wysocki
2009-12-02  1:58                                           ` Ferenc Wagner
2009-12-02 10:55                                             ` Ferenc Wagner
2009-12-02 21:33                                               ` Rafael J. Wysocki
2009-12-02 21:33                                               ` [linux-pm] " Rafael J. Wysocki
2009-12-02 21:41                                                 ` Mikael Abrahamsson
2009-12-02 21:41                                                 ` [linux-pm] " Mikael Abrahamsson
2009-12-02 21:49                                               ` Rafael J. Wysocki
2009-12-02 21:49                                               ` Rafael J. Wysocki
2009-12-12 19:31                                               ` [linux-pm] " Pavel Machek
2009-12-12 19:31                                                 ` Pavel Machek
2009-12-02 10:55                                             ` Ferenc Wagner
2009-12-02 12:27                                             ` [linux-pm] " Stefan Seyfried
2009-12-02 12:27                                             ` Stefan Seyfried
2009-12-02  1:58                                           ` Ferenc Wagner
2009-12-12 19:27                                           ` s2disk encryption was Re: [linux-pm] " Pavel Machek
2009-12-12 19:27                                             ` s2disk encryption was " Pavel Machek
2009-12-01 21:32                                         ` Rafael J. Wysocki
2009-12-01 17:46                                       ` Ferenc Wagner
2009-12-01 10:29                                   ` Ferenc Wagner
2009-11-29 10:12                               ` [linux-pm] " Ferenc Wagner
2009-11-29 10:12                               ` Ferenc Wagner
2009-11-28 19:01                           ` Ferenc Wagner
2009-11-18 22:13                         ` Rafael J. Wysocki
2009-11-14 18:52                     ` Rafael J. Wysocki
2009-11-14  1:50                   ` Ferenc Wagner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.