linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
@ 2007-04-26 17:19 Josh Triplett
  2007-04-26 17:42 ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Josh Triplett @ 2007-04-26 17:19 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel, torvalds

Pavel Machek wrote:
>> For suspend to ram, in contrast, since you *know* that nobody will be 
>> touching the hardware, and since the timings are very different anyway 
>> (you'd hope that you can resume in a second or two), you'd generally want 
>> to keep the DMA engine tables right where they are, and just literally 
>> suspend the PCI chip itself.
> 
> I'd actually prefer resume to be similar to module insert, too... Do
> you think that resume is _that_ time critical?

http://www.gettysfamily.org/wordpress/?p=32
http://www.gettysfamily.org/wordpress/?p=33

You cannot resume "too fast"; suspend to RAM should become so fast that you
use it without even thinking about it.

- Josh Triplett


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 17:19 suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Josh Triplett
@ 2007-04-26 17:42 ` Linus Torvalds
  2007-04-28 19:18   ` Bill Davidsen
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 17:42 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Pavel Machek, linux-kernel



On Thu, 26 Apr 2007, Josh Triplett wrote:
> 
> http://www.gettysfamily.org/wordpress/?p=32
> http://www.gettysfamily.org/wordpress/?p=33
> 
> You cannot resume "too fast"; suspend to RAM should become so fast that you
> use it without even thinking about it.

Yes. I think the OLPC is a prime example of why suspend-to-ram should 
basically try to not do any setup at all: leave all the DMA command 
structures etc in memory, leave even things like USB URB's pending, just 
"stop the actual controllers and suspend it".

If you want to just save power by shutting down the CPU, it's a great idea 
to leave the NIC powered up etc, so that you don't have to renegotiate the 
link speed etc, and that you can go into s2ram and come back in less than 
a second. 

Then, if you really want to *sleep*, you can have a s2ram script that 
actually turns off devices for real and requires more tear-down and setup, 
but having something that tries to get in and out of the processor sleep 
modes as fast as possible is definitely interesting.

Obviously, it's *more* interesting on the OLPC than on many other machines 
(since that one can keep its display refreshed while it's sleeping), so it 
may well be that the OLPC is one extreme example that isn't very realistic 
for some other uses, but I think it's one of the things we want to 
support.

(And for me personally, I'd love to have all my machines "sleep" by 
default, but wake up by eithernet and keyboard - I'd love for my screen 
saver to literally put the machine to sleep, but not have to worry about 
touching a keyboard - just ssh'ing into them should still wake up. It's 
*technically* doable, but it's just a pain to do right now)

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-28 19:18   ` Bill Davidsen
@ 2007-04-28 19:15     ` David Lang
  2007-04-29  1:04       ` Matthew Garrett
  0 siblings, 1 reply; 158+ messages in thread
From: David Lang @ 2007-04-28 19:15 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linus Torvalds, Josh Triplett, Pavel Machek, linux-kernel

On Sat, 28 Apr 2007, Bill Davidsen wrote:

> Linus Torvalds wrote:
>
>> (And for me personally, I'd love to have all my machines "sleep" by 
>> default, but wake up by eithernet and keyboard - I'd love for my screen 
>> saver to literally put the machine to sleep, but not have to worry about 
>> touching a keyboard - just ssh'ing into them should still wake up. It's 
>> *technically* doable, but it's just a pain to do right now)
>> 
> And timer somehow so cron jobs could still run. Ideal for critical but rarely 
> used machines like fallover servers, the user documentation download site, or 
> similar.

with dynaticks now in the kernel it may even be possible to have the idle 
process decide that the next event is far enough away that it should 
suspend-to-ram until that point.

David Lang

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 17:42 ` Linus Torvalds
@ 2007-04-28 19:18   ` Bill Davidsen
  2007-04-28 19:15     ` David Lang
  0 siblings, 1 reply; 158+ messages in thread
From: Bill Davidsen @ 2007-04-28 19:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh Triplett, Pavel Machek, linux-kernel

Linus Torvalds wrote:

> (And for me personally, I'd love to have all my machines "sleep" by 
> default, but wake up by eithernet and keyboard - I'd love for my screen 
> saver to literally put the machine to sleep, but not have to worry about 
> touching a keyboard - just ssh'ing into them should still wake up. It's 
> *technically* doable, but it's just a pain to do right now)
> 
And timer somehow so cron jobs could still run. Ideal for critical but 
rarely used machines like fallover servers, the user documentation 
download site, or similar.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-28 19:15     ` David Lang
@ 2007-04-29  1:04       ` Matthew Garrett
  2007-04-29  1:34         ` Josh Triplett
  0 siblings, 1 reply; 158+ messages in thread
From: Matthew Garrett @ 2007-04-29  1:04 UTC (permalink / raw)
  To: David Lang
  Cc: Bill Davidsen, Linus Torvalds, Josh Triplett, Pavel Machek, linux-kernel

On Sat, Apr 28, 2007 at 12:15:50PM -0700, David Lang wrote:

> with dynaticks now in the kernel it may even be possible to have the idle 
> process decide that the next event is far enough away that it should 
> suspend-to-ram until that point.

This would be ideal (and it's broadly what the OLPC guys are aiming for, 
I think), but on most platforms you're looking at at least a second or 
so to resume. As far as I know, we're still looking at ~60 ticks a 
second at best for an average desktop, so that's not going to be a win.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-29  1:04       ` Matthew Garrett
@ 2007-04-29  1:34         ` Josh Triplett
  0 siblings, 0 replies; 158+ messages in thread
From: Josh Triplett @ 2007-04-29  1:34 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: David Lang, Bill Davidsen, Linus Torvalds, Pavel Machek, linux-kernel

Matthew Garrett wrote:
> On Sat, Apr 28, 2007 at 12:15:50PM -0700, David Lang wrote:
>> with dynaticks now in the kernel it may even be possible to have the idle 
>> process decide that the next event is far enough away that it should 
>> suspend-to-ram until that point.
> 
> This would be ideal (and it's broadly what the OLPC guys are aiming for, 
> I think), but on most platforms you're looking at at least a second or 
> so to resume. As far as I know, we're still looking at ~60 ticks a 
> second at best for an average desktop, so that's not going to be a win.

As long as the system has a clear idea of how long it will take to resume, it
can schedule a wakeup for a reasonable amount of time before that.  Ideally, a
completely unused system might not need a tick for several seconds.  However,
I agree that it doesn't make sense to add such functionality to the kernel
until someone can show a system that can actually wait that long between
ticks.

- Josh Triplett

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-05-26 20:35                         ` Rafael J. Wysocki
@ 2007-05-26 22:23                           ` Martin Steigerwald
  0 siblings, 0 replies; 158+ messages in thread
From: Martin Steigerwald @ 2007-05-26 22:23 UTC (permalink / raw)
  To: suspend2-devel
  Cc: Rafael J. Wysocki, Pavel Machek, Linus Torvalds, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Pekka J Enberg

[-- Attachment #1: Type: text/plain, Size: 1277 bytes --]

Am Samstag 26 Mai 2007 schrieb Rafael J. Wysocki:

Hi Rafael!

> The outcome was, more-or-less, that we'll work on merging suspend2 or
> at least some parts of it.
>
> However, in the meantime there have been some discussions implying that
> we have some important problems with suspend/hibernation that suspend2
> doesn't solve and that IMHO are more urgent than the merging of
> suspend2 right not.
>
> So, as far as I'm concerned, the plan is to fix the more urgent
> problems first and to work on merging suspend2 as far as there's time
> to do this.
>
> The problem is there are only a few people working on it and there's a
> lot to do, so I can only ask you to be patient. ;-)

Thats fine with me - I understand that. I just thought that there has been 
no outcome at all.

I will try to be patient as long as I do not dig into kernel hacking 
myself deeply enough to be able to help with that - did not do more than 
to put together two conflicting patches to compile my own kernels till 
now and forward port a patch for a sundance network card.  I can help 
with testing once there is something testable tough.

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-05-26 17:37                       ` Martin Steigerwald
@ 2007-05-26 20:35                         ` Rafael J. Wysocki
  2007-05-26 22:23                           ` Martin Steigerwald
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-05-26 20:35 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: suspend2-devel, Pavel Machek, Linus Torvalds, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Pekka J Enberg

On Saturday, 26 May 2007 19:37, Martin Steigerwald wrote:
> Am Mittwoch 25 April 2007 schrieb Pavel Machek:
> > Hi!
> >
> > > This is why there's a lot to be said for
> > >
> > > 	echo mem > /sys/power/state
> > >
> > > and being able to follow the path through _one_ object (the kernel)
> > > over trying to figure out the interaction between many different
> > > parts with different versions.
> >
> > The 'promise' is 'if you can get echo disk > /sys/power/state working,
> > uswsusp will work. too'. IOW it should be ok to debug the in-kernel
> > parts, only.
> 
> Hello Nigel, Pavel, Rafael and everyone else who is involved,
> 
> I would like to ask what come out of the suspend2 merge discussion. Nigel 
> just told that suspend2 likely won't be merged anytime soon and thats its 
> business as usual:
> 
> ---------------------------------------------------------------------
> It's pretty much business as usual. Linus doesn't want another
> implementation merged, and he wants the three of us (Pavel, Rafael and
> myself) to agree on a way forward. He also believes that we're
> approaching things from the wrong direction at the moment. Funnily
> enough, this is the one area on which we do all agree.
> ---------------------------------------------------------------------
> http://lists.suspend2.net/lurker/message/20070510.021641.fe306add.en.html
> 
> 
> Has there been any further discussion and preferably agreement on the way 
> to go forward?

The outcome was, more-or-less, that we'll work on merging suspend2 or at least
some parts of it.

However, in the meantime there have been some discussions implying that we have
some important problems with suspend/hibernation that suspend2 doesn't solve
and that IMHO are more urgent than the merging of suspend2 right not.

So, as far as I'm concerned, the plan is to fix the more urgent problems first
and to work on merging suspend2 as far as there's time to do this.

The problem is there are only a few people working on it and there's a lot to
do, so I can only ask you to be patient. ;-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:23                     ` Pavel Machek
                                         ` (3 preceding siblings ...)
  2007-04-25 19:43                       ` Kenneth Crudup
@ 2007-05-26 17:37                       ` Martin Steigerwald
  2007-05-26 20:35                         ` Rafael J. Wysocki
  4 siblings, 1 reply; 158+ messages in thread
From: Martin Steigerwald @ 2007-05-26 17:37 UTC (permalink / raw)
  To: suspend2-devel
  Cc: Pavel Machek, Linus Torvalds, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 4902 bytes --]

Am Mittwoch 25 April 2007 schrieb Pavel Machek:
> Hi!
>
> > This is why there's a lot to be said for
> >
> > 	echo mem > /sys/power/state
> >
> > and being able to follow the path through _one_ object (the kernel)
> > over trying to figure out the interaction between many different
> > parts with different versions.
>
> The 'promise' is 'if you can get echo disk > /sys/power/state working,
> uswsusp will work. too'. IOW it should be ok to debug the in-kernel
> parts, only.

Hello Nigel, Pavel, Rafael and everyone else who is involved,

I would like to ask what come out of the suspend2 merge discussion. Nigel 
just told that suspend2 likely won't be merged anytime soon and thats its 
business as usual:

---------------------------------------------------------------------
It's pretty much business as usual. Linus doesn't want another
implementation merged, and he wants the three of us (Pavel, Rafael and
myself) to agree on a way forward. He also believes that we're
approaching things from the wrong direction at the moment. Funnily
enough, this is the one area on which we do all agree.
---------------------------------------------------------------------
http://lists.suspend2.net/lurker/message/20070510.021641.fe306add.en.html


Has there been any further discussion and preferably agreement on the way 
to go forward?

Although you Linus, as I read from different mails only use suspend to 
RAM, there are many users out there who use suspend to disk daily. 

I used in kernel software suspend initially and it worked quite nice with 
starting from 2.6.10 or 2.6.11 where suspend2 didn't work for me before 
2.6.14 with the hibernate script. But from then on suspend2 worked better 
than in kernel software suspend for me and colleagues on:

- ThinkPad T23
- ThinkPad T42
- Possibly some other ThinkPads
- as well various Dell workstations we have at work

It was faster and more reliable, yielding uptimes up to 40 days on my 
workstation recently (with 2.6.17.7 still). And even that uptime was only 
ended by booting a newly build kernel (2.6.21 with sws 2.2.9.13). For me 
in the role of a user actually this is a really satisfying solution!

I tried userspace software suspend from time to time but then just was fed 
up with it, cause I could not get it to work within any sensible amount 
of time - even with some bog standard Debian kernel, I think it was some 
2.6.18 one. Maybe I am dumb, but so be it, it should not be that 
complicated to get it to work. Recently I didn't even bother to try 
anymore. Well and I read in the suspend2 merge discussion that even in 
kernel suspend does not work reliably anymore.

As long I cannot be convinced that the vanilla kernel contains a suspend 
to disk solution that works as good as suspend2 I will patch suspend2 
into all of the desktop kernels I build.

Thats quite bad IMHO for exactly the same reason than having drivers 
maintained out of the kernel. For the same reason I think swap prefetch 
should go in as soon as possible. It will never have the adoption and 
care taking of an in-kernel-tree solution.

I am convinced that a working suspend to RAM just is not enough - well it 
wasn't working correctly last time I checked. But I even don't bother 
about suspend to RAM anymore. I can wait those additional seconds for 
suspend to disk and it allows me to drive my laptops without batteries 
most of the time and have workstations switched off completely so that 
they do not consume standby power.

So please, pretty please consider working together on a reliable, fast, 
stable, easy to use and configurable in-kernel-tree snapshot solution! 
Actually I as a user I couldn't care less about the implementation 
details, but as someone who is interested in kernel technologies I like 
it to be a clean and well designed solution, too. ;-)

Maybe when the Linux Foundation organizes a meeting for Nigel, Pavel, 
Rafael and other kernel developers interested in creating such a solution 
it will help. To me it seems such a concentrated meeting in a good 
atmosphere could be more effective than endless mailing list discussions 
not leading to a clear result. When its not easy for the involved people 
to work together maybe a casual bystander who understands enough of 
kernel details should moderate the meeting and help finding an agreement.

It would just be such a pity to miss the chance to have a nicely working 
snapshot solution in the Linux kernel, that may even be interesting for 
virtualization (you could store a backup of a machine state permanently 
or even more of them - if not already available through other 
technologies like well suspend2 with filewriter for example).

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-27 10:19                                                       ` Johannes Berg
@ 2007-04-27 12:09                                                         ` Rafael J. Wysocki
  2007-04-27 12:07                                                           ` Johannes Berg
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-27 12:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

On Friday, 27 April 2007 12:19, Johannes Berg wrote:
> On Fri, 2007-04-27 at 12:18 +0200, Rafael J. Wysocki wrote:
> 
> > 1) We define platform_hibernation if CONFIG_ACPI is set.
> 
> Let's just define it always then in the common code so we don't have
> even more magic bits platforms need to define even if they don't care at
> all. And please don't put #ifdef CONFIG_ACPI into the common code ;)
> Maybe #ifdef CONFIG_ARCH_NEEDS_HIBERNATE_HOOKS or something.
> 
> > 2) In the ACPI code we do
> > 
> > if (can do S4)
> > 	platform_hibernation = 1;
> 
> Gotcha.
> 
> > 3) We have functions arch_platform_prepare()/finish()/enter() that are defined
> > to be noops for anything but ACPI systems and for ACPI systems they are
> > defined like this:
> > 
> > int arch_platform_enter(void)
> > {
> > 	if (!platform_hibernation)
> > 		return 0;
> > 
> > 	...
> > }
> > 
> > I think it should work.
> 
> You could reduce code churn in all other platforms by making these weak
> symbols like the irq hooks I did for pm_ops. It looks like it can work
> and possibly is even less intrusive than my hibernate_ops patch. Though
> then again my hibernate_ops patch removed a lot of stuff that is now no
> longer necessary, and also completely removed the PM_SUSPEND_DISK foo...
> we probably want that regardless of how we invoke ACPI.

Yes.  Still, I'd like to rework your patch to deal with ACPI without
introducing hibernate_ops .  I'm going to do this later today if you don't
mind. :-)

Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-27 12:09                                                         ` Rafael J. Wysocki
@ 2007-04-27 12:07                                                           ` Johannes Berg
  0 siblings, 0 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-27 12:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On Fri, 2007-04-27 at 14:09 +0200, Rafael J. Wysocki wrote:

> Yes.  Still, I'd like to rework your patch to deal with ACPI without
> introducing hibernate_ops .  I'm going to do this later today if you don't
> mind. :-)

Not at all :) That's why I actually sent it out instead of just saying
"well I give up it breaks user.c"

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-27 10:18                                                     ` Rafael J. Wysocki
@ 2007-04-27 10:19                                                       ` Johannes Berg
  2007-04-27 12:09                                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-27 10:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

[-- Attachment #1: Type: text/plain, Size: 1248 bytes --]

On Fri, 2007-04-27 at 12:18 +0200, Rafael J. Wysocki wrote:

> 1) We define platform_hibernation if CONFIG_ACPI is set.

Let's just define it always then in the common code so we don't have
even more magic bits platforms need to define even if they don't care at
all. And please don't put #ifdef CONFIG_ACPI into the common code ;)
Maybe #ifdef CONFIG_ARCH_NEEDS_HIBERNATE_HOOKS or something.

> 2) In the ACPI code we do
> 
> if (can do S4)
> 	platform_hibernation = 1;

Gotcha.

> 3) We have functions arch_platform_prepare()/finish()/enter() that are defined
> to be noops for anything but ACPI systems and for ACPI systems they are
> defined like this:
> 
> int arch_platform_enter(void)
> {
> 	if (!platform_hibernation)
> 		return 0;
> 
> 	...
> }
> 
> I think it should work.

You could reduce code churn in all other platforms by making these weak
symbols like the irq hooks I did for pm_ops. It looks like it can work
and possibly is even less intrusive than my hibernate_ops patch. Though
then again my hibernate_ops patch removed a lot of stuff that is now no
longer necessary, and also completely removed the PM_SUSPEND_DISK foo...
we probably want that regardless of how we invoke ACPI.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-27  9:41                                                   ` Johannes Berg
@ 2007-04-27 10:18                                                     ` Rafael J. Wysocki
  2007-04-27 10:19                                                       ` Johannes Berg
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-27 10:18 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

On Friday, 27 April 2007 11:41, Johannes Berg wrote:
> On Thu, 2007-04-26 at 21:02 +0200, Rafael J. Wysocki wrote:
> 
> > Yes.  That's because we want to be able to repeat creating the image
> > without closing the fd in some situations.
> 
> Oh yeah, I just checked and it's not in fact necessary. I'm just
> confused.
> 
> > Still, we could use a global var 'platform_hibernation' or something like this,
> > I think.  Then, we can do
> > 
> > #define platform_hibernation	0
> > 
> > on the architectures that don't need it and make ACPI use it instead of this
> > "dynamic linking".
> 
> No, because acpi doesn't know at build time whether it can actually do
> S4 or not.

That's not a problem, I think.

1) We define platform_hibernation if CONFIG_ACPI is set.

2) In the ACPI code we do

if (can do S4)
	platform_hibernation = 1;

3) We have functions arch_platform_prepare()/finish()/enter() that are defined
to be noops for anything but ACPI systems and for ACPI systems they are
defined like this:

int arch_platform_enter(void)
{
	if (!platform_hibernation)
		return 0;

	...
}

I think it should work.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 19:02                                                 ` Rafael J. Wysocki
@ 2007-04-27  9:41                                                   ` Johannes Berg
  2007-04-27 10:18                                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-27  9:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

[-- Attachment #1: Type: text/plain, Size: 616 bytes --]

On Thu, 2007-04-26 at 21:02 +0200, Rafael J. Wysocki wrote:

> Yes.  That's because we want to be able to repeat creating the image
> without closing the fd in some situations.

Oh yeah, I just checked and it's not in fact necessary. I'm just
confused.

> Still, we could use a global var 'platform_hibernation' or something like this,
> I think.  Then, we can do
> 
> #define platform_hibernation	0
> 
> on the architectures that don't need it and make ACPI use it instead of this
> "dynamic linking".

No, because acpi doesn't know at build time whether it can actually do
S4 or not.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 21:22                                           ` Rafael J. Wysocki
@ 2007-04-26 22:08                                             ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26 22:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar,
	Linus Torvalds, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 4505 bytes --]

Hi.

On Thu, 2007-04-26 at 23:22 +0200, Rafael J. Wysocki wrote:
> On Thursday, 26 April 2007 22:55, Nigel Cunningham wrote:
> > Hi.
> > 
> > On Thu, 2007-04-26 at 22:37 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, 26 April 2007 22:16, Nigel Cunningham wrote:
> > > > Hi.
> > > > 
> > > > On Thu, 2007-04-26 at 21:28 +0200, Rafael J. Wysocki wrote:
> > > > > On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> > > > > > 
> > > > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > > > > > > schedulers, for example), but I think for this purpose it should be reviewed
> > > > > > > properly.
> > > > > > 
> > > > > > Yeah, this makes sense.
> > > > > > 
> > > > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > > > > > > work, but at least to me it seems to be potentially dangerous.
> > > > > > 
> > > > > > I am new to suspend2 so can you please explain what exactly is dangerous
> > > > > > about it?
> > > > > 
> > > > > After freezing tasks, it first saves the contents of the LRU pages, freezes
> > > > > devices and then uses the LRU pages for storing the suspend image (if more
> > > > > memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
> > > > > warranty that the LRU pages are not updated after we've saved their contents
> > > > > (first potential problem here).
> > > > > 
> > > > > After the image has been created, we have to unfreeze devices and save the
> > > > > image.  Now, we have no warranty that no one will be writing to the LRU pages
> > > > > that we have used to store the image, for whatever reasons known to him, so the
> > > > > image can potentially get corrupted while it's being saved.
> > > > > 
> > > > > In principle, device drivers can do this and there are some kernel threads that
> > > > > also can do this (we don't freeze them, because they're needed for the image
> > > > > saving).
> > > > > 
> > > > > The design is conceptually really really complicated and it makes strong
> > > > > assumptions about the behavior of different subsystems.  While these
> > > > > assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
> > > > > of them in the future if suspend2 were merged.
> > > > 
> > > > That's a good description of the issue, although I think _may_ and
> > > > _seems_ are stating things a bit more pessimistically than is
> > > > necessary. 
> > > 
> > > I've used them to express my personal concerns.
> > > 
> > > > You see, we need to remember that the pages which are saved separately
> > > > are LRU pages. Because userspace is frozen, their contents are going to
> > > > be static. The only possibilities for modifying them come from timer
> > > > routines, improperly frozen filesystems and device drivers.
> > > 
> > > And kernel threads that we don't freeze deliberately.  Currently, these are
> > > all worker threads, dm-related kernel threads and some others.
> > > 
> > > > We have code to check that the LRU isn't changing, and I've only seen
> > > > one report of modifications to about 20 LRU pages. I haven't had the
> > > > time yet to chase down the cause, but hope to do so soon.
> > > 
> > > I didn't say that would be common.  If it had been, you'd have seen problems
> > > with it.  To me the problem is the lack of warranty that it won't happen.
> > > 
> > > > The general scheme has been working for four or five years - if there
> > > > was a fundamental issue, we would have found it by now.
> > > > 
> > > > The scheme isn't complicated.
> > > 
> > > Conceptually, it is complicated just because you're using the LRU.
> > 
> > Well, I'm willing to look at other ideas. I actually selected LRU
> > because it was simple. Prior to this, we did have a try at just
> > iterating over the pages of frozen processes, but it didn't yield enough
> > pages to be viable. I wouldn't be surprised if hunting down the cause of
> > these changing pages leads to doing the opposite - starting with LRU
> > pages and then removing all the ones owned by processes. (Am I right in
> > thinking the remainder would be anonymous pages? I must learn more mm
> > inards :>).
> 
> I think we can discuss that, and the other things too.  I'm open to
> cooperation. :-)

Great!

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 18:21                           ` Olivier Galibert
@ 2007-04-26 21:30                             ` Pavel Machek
  0 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 21:30 UTC (permalink / raw)
  To: Olivier Galibert, Johannes Berg, Linus Torvalds, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven,
	Rafael J. Wysocki

Hi!

> > #define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)
> 
> So I'm not supposed to be able to suspend the 16Gb-ram, 32bits servers
> I have here?

(You are right, this should have been u64)

Snapshot image is by design limited by ammount of lowmem. If you want
to change that, this unsigned long limit will be least of your
problems.

(And no, I'd not expect loaded 6GB box to suspend properly. It will
just realize it does not have enough lowmem and refuse to suspend).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 20:55                                         ` Nigel Cunningham
@ 2007-04-26 21:22                                           ` Rafael J. Wysocki
  2007-04-26 22:08                                             ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 21:22 UTC (permalink / raw)
  To: nigel
  Cc: Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar,
	Linus Torvalds, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

On Thursday, 26 April 2007 22:55, Nigel Cunningham wrote:
> Hi.
> 
> On Thu, 2007-04-26 at 22:37 +0200, Rafael J. Wysocki wrote:
> > On Thursday, 26 April 2007 22:16, Nigel Cunningham wrote:
> > > Hi.
> > > 
> > > On Thu, 2007-04-26 at 21:28 +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> > > > > 
> > > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > > > > > schedulers, for example), but I think for this purpose it should be reviewed
> > > > > > properly.
> > > > > 
> > > > > Yeah, this makes sense.
> > > > > 
> > > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > > > > > work, but at least to me it seems to be potentially dangerous.
> > > > > 
> > > > > I am new to suspend2 so can you please explain what exactly is dangerous
> > > > > about it?
> > > > 
> > > > After freezing tasks, it first saves the contents of the LRU pages, freezes
> > > > devices and then uses the LRU pages for storing the suspend image (if more
> > > > memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
> > > > warranty that the LRU pages are not updated after we've saved their contents
> > > > (first potential problem here).
> > > > 
> > > > After the image has been created, we have to unfreeze devices and save the
> > > > image.  Now, we have no warranty that no one will be writing to the LRU pages
> > > > that we have used to store the image, for whatever reasons known to him, so the
> > > > image can potentially get corrupted while it's being saved.
> > > > 
> > > > In principle, device drivers can do this and there are some kernel threads that
> > > > also can do this (we don't freeze them, because they're needed for the image
> > > > saving).
> > > > 
> > > > The design is conceptually really really complicated and it makes strong
> > > > assumptions about the behavior of different subsystems.  While these
> > > > assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
> > > > of them in the future if suspend2 were merged.
> > > 
> > > That's a good description of the issue, although I think _may_ and
> > > _seems_ are stating things a bit more pessimistically than is
> > > necessary. 
> > 
> > I've used them to express my personal concerns.
> > 
> > > You see, we need to remember that the pages which are saved separately
> > > are LRU pages. Because userspace is frozen, their contents are going to
> > > be static. The only possibilities for modifying them come from timer
> > > routines, improperly frozen filesystems and device drivers.
> > 
> > And kernel threads that we don't freeze deliberately.  Currently, these are
> > all worker threads, dm-related kernel threads and some others.
> > 
> > > We have code to check that the LRU isn't changing, and I've only seen
> > > one report of modifications to about 20 LRU pages. I haven't had the
> > > time yet to chase down the cause, but hope to do so soon.
> > 
> > I didn't say that would be common.  If it had been, you'd have seen problems
> > with it.  To me the problem is the lack of warranty that it won't happen.
> > 
> > > The general scheme has been working for four or five years - if there
> > > was a fundamental issue, we would have found it by now.
> > > 
> > > The scheme isn't complicated.
> > 
> > Conceptually, it is complicated just because you're using the LRU.
> 
> Well, I'm willing to look at other ideas. I actually selected LRU
> because it was simple. Prior to this, we did have a try at just
> iterating over the pages of frozen processes, but it didn't yield enough
> pages to be viable. I wouldn't be surprised if hunting down the cause of
> these changing pages leads to doing the opposite - starting with LRU
> pages and then removing all the ones owned by processes. (Am I right in
> thinking the remainder would be anonymous pages? I must learn more mm
> inards :>).

I think we can discuss that, and the other things too.  I'm open to
cooperation. :-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 21:06                                   ` Theodore Tso
@ 2007-04-26 21:12                                     ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26 21:12 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linus Torvalds, Pavel Machek, Johannes Berg, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven,
	Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 1372 bytes --]

Hi.

On Thu, 2007-04-26 at 17:06 -0400, Theodore Tso wrote:
> On Thu, Apr 26, 2007 at 08:56:48AM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Thu, 26 Apr 2007, Pavel Machek wrote:
> > > 
> > > Yes, probably will. The other option is to break existing 32-bit
> > > userspace, which is a bit more common AFAICT.
> > 
> > And *this* is why kernel/userspace things simply should not be done.
> > 
> > It's simply better to do things entirely in the kernel. Because you add 
> > bugs and complications otherwise, and doing it in the kernel allows you 
> > to just switch things around.
> > 
> > As it is, it appears that user-space suspend is just broken whichever way 
> > we turn.
> 
> Well, in that case maybe suspend2 should be very seriously considered,
> since it has the features of uswsusp --- basic features which every
> single Microsoft and MacOSX user are used to like, like progress bars
> --- and it's all done in the kernel.

Umm. I don't want to be picky, but that's not quite true. The progress
bar is done in userspace.

There's also the possibility of using a userspace app to manage storage
too (I did work on establishing/tearing down an NBD connection as
necessary but didn't quite get it finished and have never released it).
That said, this bit can be torn out by simply removing a file and the
Makefile line.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 15:56                                 ` Linus Torvalds
@ 2007-04-26 21:06                                   ` Theodore Tso
  2007-04-26 21:12                                     ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Theodore Tso @ 2007-04-26 21:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Johannes Berg, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

On Thu, Apr 26, 2007 at 08:56:48AM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 26 Apr 2007, Pavel Machek wrote:
> > 
> > Yes, probably will. The other option is to break existing 32-bit
> > userspace, which is a bit more common AFAICT.
> 
> And *this* is why kernel/userspace things simply should not be done.
> 
> It's simply better to do things entirely in the kernel. Because you add 
> bugs and complications otherwise, and doing it in the kernel allows you 
> to just switch things around.
> 
> As it is, it appears that user-space suspend is just broken whichever way 
> we turn.

Well, in that case maybe suspend2 should be very seriously considered,
since it has the features of uswsusp --- basic features which every
single Microsoft and MacOSX user are used to like, like progress bars
--- and it's all done in the kernel.

					- Ted


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:10                                             ` Linus Torvalds
@ 2007-04-26 21:00                                               ` Pavel Machek
  0 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 21:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mark Lord, Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Hi!

> > > See? Two *totally* different cases. They have *nothing* in common. Not the
> > > call sequence, not the logic, not *anything*.
> > 
> > Except that both methods cannot rely upon hot-pluggable devices
> > still being present on resume/restore.  It is exceptionally common
> > to unplug all USB/firewire cables, mouse, keyboard, docking cables etc..
> > after a machine is in S2R state.
> 
> Right, and that has nothing to do with suspend/resume. You'd better be 
> able to handle unexpected hotplugs _regardless_.

Actually, with suspend/resume it is quite easy to cheat, and just
"unplug" the hardware on suspend, then "plug it back" on resume. That
works very well for devices like keyboards and mice (where you can't
tell if you are talking to the same hw, anyway).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 20:37                                       ` Rafael J. Wysocki
  2007-04-26 20:49                                         ` David Lang
@ 2007-04-26 20:55                                         ` Nigel Cunningham
  2007-04-26 21:22                                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26 20:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar,
	Linus Torvalds, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 3955 bytes --]

Hi.

On Thu, 2007-04-26 at 22:37 +0200, Rafael J. Wysocki wrote:
> On Thursday, 26 April 2007 22:16, Nigel Cunningham wrote:
> > Hi.
> > 
> > On Thu, 2007-04-26 at 21:28 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> > > > 
> > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > > > > schedulers, for example), but I think for this purpose it should be reviewed
> > > > > properly.
> > > > 
> > > > Yeah, this makes sense.
> > > > 
> > > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > > > > work, but at least to me it seems to be potentially dangerous.
> > > > 
> > > > I am new to suspend2 so can you please explain what exactly is dangerous
> > > > about it?
> > > 
> > > After freezing tasks, it first saves the contents of the LRU pages, freezes
> > > devices and then uses the LRU pages for storing the suspend image (if more
> > > memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
> > > warranty that the LRU pages are not updated after we've saved their contents
> > > (first potential problem here).
> > > 
> > > After the image has been created, we have to unfreeze devices and save the
> > > image.  Now, we have no warranty that no one will be writing to the LRU pages
> > > that we have used to store the image, for whatever reasons known to him, so the
> > > image can potentially get corrupted while it's being saved.
> > > 
> > > In principle, device drivers can do this and there are some kernel threads that
> > > also can do this (we don't freeze them, because they're needed for the image
> > > saving).
> > > 
> > > The design is conceptually really really complicated and it makes strong
> > > assumptions about the behavior of different subsystems.  While these
> > > assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
> > > of them in the future if suspend2 were merged.
> > 
> > That's a good description of the issue, although I think _may_ and
> > _seems_ are stating things a bit more pessimistically than is
> > necessary. 
> 
> I've used them to express my personal concerns.
> 
> > You see, we need to remember that the pages which are saved separately
> > are LRU pages. Because userspace is frozen, their contents are going to
> > be static. The only possibilities for modifying them come from timer
> > routines, improperly frozen filesystems and device drivers.
> 
> And kernel threads that we don't freeze deliberately.  Currently, these are
> all worker threads, dm-related kernel threads and some others.
> 
> > We have code to check that the LRU isn't changing, and I've only seen
> > one report of modifications to about 20 LRU pages. I haven't had the
> > time yet to chase down the cause, but hope to do so soon.
> 
> I didn't say that would be common.  If it had been, you'd have seen problems
> with it.  To me the problem is the lack of warranty that it won't happen.
> 
> > The general scheme has been working for four or five years - if there
> > was a fundamental issue, we would have found it by now.
> > 
> > The scheme isn't complicated.
> 
> Conceptually, it is complicated just because you're using the LRU.

Well, I'm willing to look at other ideas. I actually selected LRU
because it was simple. Prior to this, we did have a try at just
iterating over the pages of frozen processes, but it didn't yield enough
pages to be viable. I wouldn't be surprised if hunting down the cause of
these changing pages leads to doing the opposite - starting with LRU
pages and then removing all the ones owned by processes. (Am I right in
thinking the remainder would be anonymous pages? I must learn more mm
inards :>).

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 20:37                                       ` Rafael J. Wysocki
@ 2007-04-26 20:49                                         ` David Lang
  2007-04-26 20:55                                         ` Nigel Cunningham
  1 sibling, 0 replies; 158+ messages in thread
From: David Lang @ 2007-04-26 20:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: nigel, Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu,
	Ingo Molnar, Linus Torvalds, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Thu, 26 Apr 2007, Rafael J. Wysocki wrote:

>> The general scheme has been working for four or five years - if there
>> was a fundamental issue, we would have found it by now.
>>
>> The scheme isn't complicated.
>
> Conceptually, it is complicated just because you're using the LRU.

I know that I've seen many projects that are working on or claim to have 
suceeded in being able to do live migration of processes from one system to 
another. has anyone looked at useing any of these mechanisms for snapshoting the 
user processes for the std situation? if you can do this a process at a time you 
may be able to avoid the massive blob of a write

instead of what linus was saying

buff = snapshot()
write(buff)

it would be
start_snapshot()   /* stops all userspace schedulers except for this process */
foreach(pid) {
   buff = snapshotpid(pid)
   write(buff)
}

David Lang

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 20:16                                     ` Nigel Cunningham
@ 2007-04-26 20:37                                       ` Rafael J. Wysocki
  2007-04-26 20:49                                         ` David Lang
  2007-04-26 20:55                                         ` Nigel Cunningham
  0 siblings, 2 replies; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 20:37 UTC (permalink / raw)
  To: nigel
  Cc: Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar,
	Linus Torvalds, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

On Thursday, 26 April 2007 22:16, Nigel Cunningham wrote:
> Hi.
> 
> On Thu, 2007-04-26 at 21:28 +0200, Rafael J. Wysocki wrote:
> > On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> > > 
> > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > > > schedulers, for example), but I think for this purpose it should be reviewed
> > > > properly.
> > > 
> > > Yeah, this makes sense.
> > > 
> > > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > > > work, but at least to me it seems to be potentially dangerous.
> > > 
> > > I am new to suspend2 so can you please explain what exactly is dangerous
> > > about it?
> > 
> > After freezing tasks, it first saves the contents of the LRU pages, freezes
> > devices and then uses the LRU pages for storing the suspend image (if more
> > memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
> > warranty that the LRU pages are not updated after we've saved their contents
> > (first potential problem here).
> > 
> > After the image has been created, we have to unfreeze devices and save the
> > image.  Now, we have no warranty that no one will be writing to the LRU pages
> > that we have used to store the image, for whatever reasons known to him, so the
> > image can potentially get corrupted while it's being saved.
> > 
> > In principle, device drivers can do this and there are some kernel threads that
> > also can do this (we don't freeze them, because they're needed for the image
> > saving).
> > 
> > The design is conceptually really really complicated and it makes strong
> > assumptions about the behavior of different subsystems.  While these
> > assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
> > of them in the future if suspend2 were merged.
> 
> That's a good description of the issue, although I think _may_ and
> _seems_ are stating things a bit more pessimistically than is
> necessary. 

I've used them to express my personal concerns.

> You see, we need to remember that the pages which are saved separately
> are LRU pages. Because userspace is frozen, their contents are going to
> be static. The only possibilities for modifying them come from timer
> routines, improperly frozen filesystems and device drivers.

And kernel threads that we don't freeze deliberately.  Currently, these are
all worker threads, dm-related kernel threads and some others.

> We have code to check that the LRU isn't changing, and I've only seen
> one report of modifications to about 20 LRU pages. I haven't had the
> time yet to chase down the cause, but hope to do so soon.

I didn't say that would be common.  If it had been, you'd have seen problems
with it.  To me the problem is the lack of warranty that it won't happen.

> The general scheme has been working for four or five years - if there
> was a fundamental issue, we would have found it by now.
> 
> The scheme isn't complicated.

Conceptually, it is complicated just because you're using the LRU.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 19:28                                   ` Rafael J. Wysocki
@ 2007-04-26 20:16                                     ` Nigel Cunningham
  2007-04-26 20:37                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26 20:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pekka Enberg, Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar,
	Linus Torvalds, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 2986 bytes --]

Hi.

On Thu, 2007-04-26 at 21:28 +0200, Rafael J. Wysocki wrote:
> On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> > 
> > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > > schedulers, for example), but I think for this purpose it should be reviewed
> > > properly.
> > 
> > Yeah, this makes sense.
> > 
> > On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > > work, but at least to me it seems to be potentially dangerous.
> > 
> > I am new to suspend2 so can you please explain what exactly is dangerous
> > about it?
> 
> After freezing tasks, it first saves the contents of the LRU pages, freezes
> devices and then uses the LRU pages for storing the suspend image (if more
> memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
> warranty that the LRU pages are not updated after we've saved their contents
> (first potential problem here).
> 
> After the image has been created, we have to unfreeze devices and save the
> image.  Now, we have no warranty that no one will be writing to the LRU pages
> that we have used to store the image, for whatever reasons known to him, so the
> image can potentially get corrupted while it's being saved.
> 
> In principle, device drivers can do this and there are some kernel threads that
> also can do this (we don't freeze them, because they're needed for the image
> saving).
> 
> The design is conceptually really really complicated and it makes strong
> assumptions about the behavior of different subsystems.  While these
> assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
> of them in the future if suspend2 were merged.

That's a good description of the issue, although I think _may_ and
_seems_ are stating things a bit more pessimistically than is
necessary. 

You see, we need to remember that the pages which are saved separately
are LRU pages. Because userspace is frozen, their contents are going to
be static. The only possibilities for modifying them come from timer
routines, improperly frozen filesystems and device drivers.

We have code to check that the LRU isn't changing, and I've only seen
one report of modifications to about 20 LRU pages. I haven't had the
time yet to chase down the cause, but hope to do so soon.

The general scheme has been working for four or five years - if there
was a fundamental issue, we would have found it by now.

The scheme isn't complicated. The algo for figuring out whether to save
the page in an atomic copy just says: Iterate through all LRU pages. For
each page, ask: Is this used by the thread suspending, or by userui? No?
Save separately. Yes? Save in the atomic copy.... oh, and save
everything else that needs to be saved in the atomic copy.

Regards,

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:34                                           ` Linus Torvalds
@ 2007-04-26 20:12                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 20:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Alan Cox, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven

On Thursday, 26 April 2007 02:34, Linus Torvalds wrote:
> 
> On Wed, 25 Apr 2007, Linus Torvalds wrote:
> > 
> > The *thaw* needs to happen with devices quiescent. 
> 
> Btw, I sure as hell hope you didn't use "suspend()" for that. You're 
> (again) much better off having a totally separate function that just 
> freezes stuff.
> 
> So in the "snapshot+shutdown" path, you should have:
> 
>  - prepare_to_snapshot() - allocate memory, and possibly return errors
> 
>    We can skip this, if we just make the rule be that any devices that 
>    want to support snapshotting must always have the memory required for 
>    snapshotting pre-allocated. Most devices really do allocate memory for 
>    their state anyway, and the only real reason for the "prepare" stage 
>    here is becasue the final snapshot has to happen with interrupts off, 
>    obviously. So *if* we don't need to allocate any memory, and if we 
>    don't expect to want to accept some early error case, this is likely 
>    useless.

I think we need this.  Apparently, some device drivers need as much as 30 meg
of RAM at later stages (I don't know why and what for).

>  - snapshot() - actually save device state that is consistent with the 
>    memory image at the time. Called with interrupts off, but the device 
>    has to be usable both before and afterwards!
> 
> And I would seriously suggest that "snapshot()" be documented to not rely 
> on any DMA memory, exactly because the device has to be accessible both 
> before and after (before - because we're running and allocating memory, 
> and after - because we'll be writing thigns out). But see later:

Please note that some drivers are compiled as modules and they may deal with
uninitialized hardware (or worse, with the hardware initialized by the BIOS in
a crazy way) after restart_snapshot().  It may be better for them to actually
quiesce the devices here too to avoid problems after restart_snapshot() .

> For the "resume snapshot" path, I would suggest having 
> 
>  - freeze(): quiesce the device. This literally just does the absolute 
>    minimum to make sure that the device doesn't do anything surprising (no 
>    interrupts, no DMA, no nothing). For many devices, it's a no-op, even 
>    if they can do DMA (eg most disk controllers will do DMA, but only as 
>    an actual result of a request, and upper layers will be quiescent 
>    anyway, so they do *not* need to disable DMA)
> 
>    NOTE! The "freeze()" gets called from the *old* kernel just _before_ a
>    snapshot unpacking!!

Yes, and usually the majority of modules is not loaded at that time.

>  - restart_snapshot() - actually restart the snapshot (and usually this 
>    would involve re-setting the device, not so much trying to restore all 
>    the saved state. IOW, it's easier to just re-initialize the DMA command 
>    queues than to try to make them "atomic" in the snapshot).
> 
>    NOTE! This gets called by the *new* kernel _after_ the snapshot resume!

I think devices _should_ be resetted in restart_snapshot(), unless it's
possible to check if they have already been initialized by the "old" kernel -
but this information would have to be available from the device itself.

> And if you *want* to, I can see that you might want to actually do a 
> "unfreeze()" thing too, and make the actual shapshotting be:

What unfreeze() would be needed for in that case?

> 	/* We may not even need this.. */
> 	for_each_device() {
> 		err = prepare_to_snapshot();
> 		if (err)
> 			return err;
> 	}

We need to free as much memory as we'll need for the image creation at this
point.

> 	/* This is the real work for snapshotting */
> 	cli();
> 	for_each_device()
> 		freeze(dev);

You've added freeze() here, but it's not on your list above?

> 	for_each_device()
> 		snapshot(dev);
> 	.. snapshot current memory image ..
> 	for_each_device_depth_first()
> 		unfreeze(dev);
> 	sti();
> 
> and maybe it's worth it, but I would almost suggest that you just make the 
> rule be that any DMA etc just *has* to be re-initialized by 
> "restart_snapshot()", in which case it's not even necessary to 
> freeze/unfreeze over the device, and "snapshot()" itself only needs to 
> make sure any non-DMA data is safe.
>
> But adding the freeze/unfreeze (which is a no-op for most hardware anyway) 
> might make things easier to think about, so I would certainly not *object* 
> to it, even if I suspect it's not necessary.

I think it's not necessary.

> Anyway, the restore_snapshot() sequence should be:
> 
> 	/* Old kernel.. Normal boot, load snapshot image */
> 	cli()
> 	for_each_device()
> 		freeze(dev);
> 	restore_snapshot_image();
> 	restore_regs_and_jump_to_image();
> 	/* noreturn */
> 
> 
> 	/* New kernel, gets called at the snapshot restore address
> 	 * with interrupts off and devices frozen, and memory image
> 	 * constsntent with what it was at "snapshot()" time
> 	 */
> 	for_each_dev_depth_first()
> 		restore_snapshot(dev);
> 	/* And if you want to, just to be "symmetric"
> 
> 		for_each_dev_depth_first()
> 			unfreeze(dev)
> 
> 	   although I think you could just make "restore_snapshot()" 
> 	   implicitly unfreeze it too..

Agreed.

> 	 */
> 	sti();
> 	/* We're up */
> 
> and notice how *different* this is from what happens for s2ram. There 
> really isn't anything in common here. Exactly because s2ram simply doesn't 
> _have_ any of the issues with atomic memory images.

Agreed again.

Moreover, in the s2ram case there are no problems with device drivers compiled
as modules.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:10                                 ` Pekka Enberg
@ 2007-04-26 19:28                                   ` Rafael J. Wysocki
  2007-04-26 20:16                                     ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 19:28 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar, Linus Torvalds,
	Nigel Cunningham, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

On Thursday, 26 April 2007 18:10, Pekka Enberg wrote:
> 
> On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > In principle, we could add suspend2 as an alternative (in analogy with the I/O
> > schedulers, for example), but I think for this purpose it should be reviewed
> > properly.
> 
> Yeah, this makes sense.
> 
> On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > There also is a real problem with how it uses the LRU pages.  It _seems_ to
> > work, but at least to me it seems to be potentially dangerous.
> 
> I am new to suspend2 so can you please explain what exactly is dangerous
> about it?

After freezing tasks, it first saves the contents of the LRU pages, freezes
devices and then uses the LRU pages for storing the suspend image (if more
memory is needed, it's allocated, but that's irrelevant here).  Now, we have no
warranty that the LRU pages are not updated after we've saved their contents
(first potential problem here).

After the image has been created, we have to unfreeze devices and save the
image.  Now, we have no warranty that no one will be writing to the LRU pages
that we have used to store the image, for whatever reasons known to him, so the
image can potentially get corrupted while it's being saved.

In principle, device drivers can do this and there are some kernel threads that
also can do this (we don't freeze them, because they're needed for the image
saving).

The design is conceptually really really complicated and it makes strong
assumptions about the behavior of different subsystems.  While these
assumptions _may_ be satisfied right now, we'd have to ensure the satisfaction
of them in the future if suspend2 were merged.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:41                     ` Matt Mackall
  2007-04-26 11:27                       ` Pavel Machek
@ 2007-04-26 19:04                       ` Bill Davidsen
  1 sibling, 0 replies; 158+ messages in thread
From: Bill Davidsen @ 2007-04-26 19:04 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Pavel Machek, Ingo Molnar, Linus Torvalds, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

Matt Mackall wrote:
> On Tue, Apr 24, 2007 at 11:29:56PM +0200, Pavel Machek wrote:
>> We do not want to fragment the testing base, and suspend2 does not
>> really have any interesting features over uswsusp.
> 
> The testing base is already fragmented!
> 
> What the current situation means is that you simply never hear from
> the people who get fed up with suspend but who manage to get suspend2
> working.
> 
I have to feel that having a *working resume* capability is "any 
interesting features" enough. What you say about "simply never hear 
from" is unfortunately true.

On 04/25/2007 05:30 PM EDT, Pavel Machek wrote:

 >It is not Rafael's fault. Actually it is quite hard to work with
 >Nigel, because he implements every feature someone asks for, and wants
 >to merge them all  :-( . I don't expect to ever agree with Nigel on
 >anything important, sorry.

The fact that Pavel thinks giving the users what they want is a problem 
certainly defines the difference between them, the populist "give them 
what they want" and the elitist "let's them make do with what I think 
they should have."

I do respect Pavel for all the stuff he has done and is doing, I wish I 
could have found a nicer way to say that.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 18:40                                               ` Johannes Berg
@ 2007-04-26 19:02                                                 ` Rafael J. Wysocki
  2007-04-27  9:41                                                   ` Johannes Berg
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 19:02 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

On Thursday, 26 April 2007 20:40, Johannes Berg wrote:
> On Thu, 2007-04-26 at 20:40 +0200, Rafael J. Wysocki wrote:
> 
> > >  * it surfaces kernel implementation details about pm_ops and thus makes
> > >    the whole thing very fragile
> > 
> > Can you elaborate?
> 
> Well it tells userspace about pm_ops->enter/prepare/finish etc.
> Also, it seems that it needs a "release memory now" operation instead of
> just releasing it when the fd is closed?

Yes.  That's because we want to be able to repeat creating the image
without closing the fd in some situations.

> > >  * it has yet another interface (yuck) to determine whether to reboot,
> > >    shut down etc, doesn't use /sys/power/disk
> > 
> > Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.
> 
> Heh.
> 
> > >  * I generally had no idea wtf it is doing in some places
> > 
> > I could have told you if you had asked. :-)
> 
> I was offline ;)
> 
> > Do we need hibernate_ops at all?  There's only one user anyway and I'm not
> > sure there will be more of them in the future.
> 
> I'm pretty sure there won't be, but there's no way to do it cleanly
> without pm_ops since even acpi doesn't do this all the time but only
> when some set of conditions is true. Hence, it needs to be able to
> determine the availability of the platform mode at run time rather than
> build time (build time => we could use weak symbols, arch hooks, ...)

Still, we could use a global var 'platform_hibernation' or something like this,
I think.  Then, we can do

#define platform_hibernation	0

on the architectures that don't need it and make ACPI use it instead of this
"dynamic linking".

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:31                                           ` Johannes Berg
@ 2007-04-26 18:40                                             ` Rafael J. Wysocki
  2007-04-26 18:40                                               ` Johannes Berg
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 18:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

On Thursday, 26 April 2007 18:31, Johannes Berg wrote:
> On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:
> 
> > > From looking at pm_ops which I was recently working with a lot, it seems
> > > that it was designed by somebody who was reading the ACPI documentation
> > > and was otherwise pretty clueless, even at that level std tries to look
> > > like suspend. IMHO that is one of the first things that should be ripped
> > > out, no pm_ops for STD, it's a pain to work with.
> > 
> > That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> > low-level enter is pretty similar).
> > 
> > Patches would be welcome
> 
> That was easier than I thought. This applies on top of a patch that
> makes kernel/power/user.c optional since I had no idea how to fix it,
> problems I see:
>  * it surfaces kernel implementation details about pm_ops and thus makes
>    the whole thing very fragile

Can you elaborate?

>  * it has yet another interface (yuck) to determine whether to reboot,
>    shut down etc, doesn't use /sys/power/disk

Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.

>  * I generally had no idea wtf it is doing in some places

I could have told you if you had asked. :-)

> Anyway, this patch is only compile tested, it
>  * introduces include/linux/hibernate.h with hibernate_ops and
>    a new hibernate() function to hibernate the system

Do we need hibernate_ops at all?  There's only one user anyway and I'm not
sure there will be more of them in the future.

>  * rips apart a lot of the suspend code and puts it back together using
>    the hibernate_ops
>  * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
>  * might apply/compile against -mm, I have all my and some of Rafael's
>    suspend/hibernate work in my tree.
>  * breaks user suspend as I noted above
>  * is incomplete, somewhere pm_suspend_disk() is still defined iirc

I think I can fix it up, just give me some time.

The idea is good, I think we should do someting like this.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 18:40                                             ` Rafael J. Wysocki
@ 2007-04-26 18:40                                               ` Johannes Berg
  2007-04-26 19:02                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 18:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Linus Torvalds, Nigel Cunningham, Nick Piggin,
	suspend2-devel, Mike Galbraith, linux-kernel, Con Kolivas,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Arjan van de Ven,
	linux-pm

[-- Attachment #1: Type: text/plain, Size: 1240 bytes --]

On Thu, 2007-04-26 at 20:40 +0200, Rafael J. Wysocki wrote:

> >  * it surfaces kernel implementation details about pm_ops and thus makes
> >    the whole thing very fragile
> 
> Can you elaborate?

Well it tells userspace about pm_ops->enter/prepare/finish etc.
Also, it seems that it needs a "release memory now" operation instead of
just releasing it when the fd is closed?

> >  * it has yet another interface (yuck) to determine whether to reboot,
> >    shut down etc, doesn't use /sys/power/disk
> 
> Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.

Heh.

> >  * I generally had no idea wtf it is doing in some places
> 
> I could have told you if you had asked. :-)

I was offline ;)

> Do we need hibernate_ops at all?  There's only one user anyway and I'm not
> sure there will be more of them in the future.

I'm pretty sure there won't be, but there's no way to do it cleanly
without pm_ops since even acpi doesn't do this all the time but only
when some set of conditions is true. Hence, it needs to be able to
determine the availability of the platform mode at run time rather than
build time (build time => we could use weak symbols, arch hooks, ...)

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:09                         ` Pavel Machek
  2007-04-26 15:53                           ` Linus Torvalds
@ 2007-04-26 18:21                           ` Olivier Galibert
  2007-04-26 21:30                             ` Pavel Machek
  1 sibling, 1 reply; 158+ messages in thread
From: Olivier Galibert @ 2007-04-26 18:21 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Johannes Berg, Linus Torvalds, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

On Thu, Apr 26, 2007 at 01:09:53PM +0200, Pavel Machek wrote:
> #define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)

So I'm not supposed to be able to suspend the 16Gb-ram, 32bits servers
I have here?

  OG.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  9:07                           ` Xavier Bestel
  2007-04-25  9:19                             ` Nigel Cunningham
@ 2007-04-26 18:18                             ` Bill Davidsen
  1 sibling, 0 replies; 158+ messages in thread
From: Bill Davidsen @ 2007-04-26 18:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: suspend2-devel

Xavier Bestel wrote:
> On Wed, 2007-04-25 at 18:50 +1000, Nigel Cunningham wrote:
>>> (And guess what, it uses APM and suspend is really faster and way more
>>> reliable than each kernel implementation I could try).
>> If you tried Suspend2 and had problems with reliability, please send me
>> logs. I'll do all I can to help. (I have to qualify it a bit, because
>> I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
>> I'll fix it).
> 
> Does suspend2 work with APM ? After much trying, I think now the ACPI
> implementation of my laptop (a vintage Compaq Armada 1700) is busted,
> only APM works.
> 
> AFAIR the problem with suspend2 was that it didn't poweroff some parts
> of the laptop (the led of the wifi pcmcia card was on, and the lcd light
> was on too), but that was last year. Kernel's suspend kind of worked but
> didn't resume (no reaction on button press). As I tried all this last
> year, I may have forgotten some things.
> Honestly, I like this laptop when it works flawlessly, so I don't see
> many reasons to try *susp* again. I'll do it when I'm bored, just not
> today.
> 
Actually on some old laptops I just use the apm command, with -s (or -S, 
I forget by now), and that works.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:14                                   ` Chris Friesen
  2007-04-26 16:27                                     ` Linus Torvalds
@ 2007-04-26 17:11                                     ` Johannes Berg
  1 sibling, 0 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 17:11 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Pavel Machek, Linus Torvalds, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 581 bytes --]

On Thu, 2007-04-26 at 10:14 -0600, Chris Friesen wrote:
> Johannes Berg wrote:
> 
> > Judging from experience with the wext 32/64 bit fiasco it seems to be
> > rather uncommon to use 32-bit userspace on 64-bit machines.
> 
> I disagree...it's quite common.  I think its the standard way of doing 
> things for ppc64, for instance.

I know. My only 64-bit machine is ppc64 :)

But still nobody noticed the wext 32/64 bit compat bug for like forever.
On the other hand, maybe that just means that most 64-bit machines are
desktop machines without wireless.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:30                                         ` Pavel Machek
  2007-04-26 11:41                                           ` Johannes Berg
@ 2007-04-26 16:31                                           ` Johannes Berg
  2007-04-26 18:40                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 16:31 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nigel Cunningham, Nick Piggin, suspend2-devel,
	Mike Galbraith, linux-kernel, Con Kolivas, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Arjan van de Ven, linux-pm

On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:

> > From looking at pm_ops which I was recently working with a lot, it seems
> > that it was designed by somebody who was reading the ACPI documentation
> > and was otherwise pretty clueless, even at that level std tries to look
> > like suspend. IMHO that is one of the first things that should be ripped
> > out, no pm_ops for STD, it's a pain to work with.
> 
> That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> low-level enter is pretty similar).
> 
> Patches would be welcome

That was easier than I thought. This applies on top of a patch that
makes kernel/power/user.c optional since I had no idea how to fix it,
problems I see:
 * it surfaces kernel implementation details about pm_ops and thus makes
   the whole thing very fragile
 * it has yet another interface (yuck) to determine whether to reboot,
   shut down etc, doesn't use /sys/power/disk
 * I generally had no idea wtf it is doing in some places

Anyway, this patch is only compile tested, it
 * introduces include/linux/hibernate.h with hibernate_ops and
   a new hibernate() function to hibernate the system
 * rips apart a lot of the suspend code and puts it back together using
   the hibernate_ops
 * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
 * might apply/compile against -mm, I have all my and some of Rafael's
   suspend/hibernate work in my tree.
 * breaks user suspend as I noted above
 * is incomplete, somewhere pm_suspend_disk() is still defined iirc

johannes
---
 Documentation/power/userland-swsusp.txt |   26 +++----
 drivers/acpi/sleep/main.c               |   89 ++++++++++++++++++++----
 drivers/acpi/sleep/proc.c               |    3 
 drivers/i2c/chips/tps65010.c            |    2 
 include/linux/hibernate.h               |   36 +++++++++
 include/linux/pm.h                      |   31 --------
 kernel/power/disk.c                     |  117 +++++++++++++++++++-------------
 kernel/power/main.c                     |   47 +++++-------
 kernel/power/power.h                    |   13 ---
 kernel/power/user.c                     |   28 +------
 kernel/sys.c                            |    3 
 11 files changed, 231 insertions(+), 164 deletions(-)

--- wireless-dev.orig/include/linux/pm.h	2007-04-26 18:15:00.440691185 +0200
+++ wireless-dev/include/linux/pm.h	2007-04-26 18:15:09.410691185 +0200
@@ -107,26 +107,11 @@ typedef int __bitwise suspend_state_t;
 #define PM_SUSPEND_ON		((__force suspend_state_t) 0)
 #define PM_SUSPEND_STANDBY	((__force suspend_state_t) 1)
 #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
-#define PM_SUSPEND_DISK		((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX		((__force suspend_state_t) 5)
-
-typedef int __bitwise suspend_disk_method_t;
-
-/* invalid must be 0 so struct pm_ops initialisers can leave it out */
-#define PM_DISK_INVALID		((__force suspend_disk_method_t) 0)
-#define	PM_DISK_PLATFORM	((__force suspend_disk_method_t) 1)
-#define	PM_DISK_SHUTDOWN	((__force suspend_disk_method_t) 2)
-#define	PM_DISK_REBOOT		((__force suspend_disk_method_t) 3)
-#define	PM_DISK_TEST		((__force suspend_disk_method_t) 4)
-#define	PM_DISK_TESTPROC	((__force suspend_disk_method_t) 5)
-#define	PM_DISK_MAX		((__force suspend_disk_method_t) 6)
+#define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
 
 /**
  * struct pm_ops - Callbacks for managing platform dependent suspend states.
  * @valid: Callback to determine whether the given state can be entered.
- * 	If %CONFIG_SOFTWARE_SUSPEND is set then %PM_SUSPEND_DISK is
- *	always valid and never passed to this call. If not assigned,
- *	no suspend states are valid.
  *	Valid states are advertised in /sys/power/state but can still
  *	be rejected by prepare or enter if the conditions aren't right.
  *	There is a %pm_valid_only_mem function available that can be assigned
@@ -140,24 +125,12 @@ typedef int __bitwise suspend_disk_metho
  *
  * @finish: Called when the system has left the given state and all devices
  *	are resumed. The return value is ignored.
- *
- * @pm_disk_mode: The generic code always allows one of the shutdown methods
- *	%PM_DISK_SHUTDOWN, %PM_DISK_REBOOT, %PM_DISK_TEST and
- *	%PM_DISK_TESTPROC. If this variable is set, the mode it is set
- *	to is allowed in addition to those modes and is also made default.
- *	When this mode is sent selected, the @prepare call will be called
- *	before suspending to disk (if present), the @enter call should be
- *	present and will be called after all state has been saved and the
- *	machine is ready to be powered off; the @finish callback is called
- *	after state has been restored. All these calls are called with
- *	%PM_SUSPEND_DISK as the state.
  */
 struct pm_ops {
 	int (*valid)(suspend_state_t state);
 	int (*prepare)(suspend_state_t state);
 	int (*enter)(suspend_state_t state);
 	int (*finish)(suspend_state_t state);
-	suspend_disk_method_t pm_disk_mode;
 };
 
 /**
@@ -276,8 +249,6 @@ extern void device_power_up(void);
 extern void device_resume(void);
 
 #ifdef CONFIG_PM
-extern suspend_disk_method_t pm_disk_mode;
-
 extern int device_suspend(pm_message_t state);
 extern int device_prepare_suspend(pm_message_t state);
 
--- wireless-dev.orig/kernel/power/main.c	2007-04-26 18:15:00.790691185 +0200
+++ wireless-dev/kernel/power/main.c	2007-04-26 18:15:09.410691185 +0200
@@ -21,6 +21,7 @@
 #include <linux/resume-trace.h>
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
+#include <linux/hibernate.h>
 
 #include "power.h"
 
@@ -30,7 +31,6 @@
 DEFINE_MUTEX(pm_mutex);
 
 struct pm_ops *pm_ops;
-suspend_disk_method_t pm_disk_mode = PM_DISK_SHUTDOWN;
 
 /**
  *	pm_set_ops - Set the global power method table. 
@@ -41,10 +41,6 @@ void pm_set_ops(struct pm_ops * ops)
 {
 	mutex_lock(&pm_mutex);
 	pm_ops = ops;
-	if (ops && ops->pm_disk_mode != PM_DISK_INVALID) {
-		pm_disk_mode = ops->pm_disk_mode;
-	} else
-		pm_disk_mode = PM_DISK_SHUTDOWN;
 	mutex_unlock(&pm_mutex);
 }
 
@@ -184,24 +180,12 @@ static void suspend_finish(suspend_state
 static const char * const pm_states[PM_SUSPEND_MAX] = {
 	[PM_SUSPEND_STANDBY]	= "standby",
 	[PM_SUSPEND_MEM]	= "mem",
-	[PM_SUSPEND_DISK]	= "disk",
 };
 
 static inline int valid_state(suspend_state_t state)
 {
-	/* Suspend-to-disk does not really need low-level support.
-	 * It can work with shutdown/reboot if needed. If it isn't
-	 * configured, then it cannot be supported.
-	 */
-	if (state == PM_SUSPEND_DISK)
-#ifdef CONFIG_SOFTWARE_SUSPEND
-		return 1;
-#else
-		return 0;
-#endif
-
-	/* all other states need lowlevel support and need to be
-	 * valid to the lowlevel implementation, no valid callback
+	/* All states need lowlevel support and need to be valid
+	 * to the lowlevel implementation, no valid callback
 	 * implies that none are valid. */
 	if (!pm_ops || !pm_ops->valid || !pm_ops->valid(state))
 		return 0;
@@ -229,11 +213,6 @@ static int enter_state(suspend_state_t s
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
-	if (state == PM_SUSPEND_DISK) {
-		error = pm_suspend_disk();
-		goto Unlock;
-	}
-
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare(state)))
 		goto Unlock;
@@ -251,7 +230,7 @@ static int enter_state(suspend_state_t s
 
 /**
  *	pm_suspend - Externally visible function for suspending system.
- *	@state:		Enumarted value of state to enter.
+ *	@state:		Enumerated value of state to enter.
  *
  *	Determine whether or not value is within range, get state 
  *	structure, and enter (above).
@@ -283,13 +262,19 @@ decl_subsys(power,NULL,NULL);
 static ssize_t state_show(struct subsystem * subsys, char * buf)
 {
 	int i;
-	char * s = buf;
+	char *s = buf;
 
 	for (i = 0; i < PM_SUSPEND_MAX; i++) {
 		if (pm_states[i] && valid_state(i))
-			s += sprintf(s,"%s ", pm_states[i]);
+			s += sprintf(s, "%s ", pm_states[i]);
 	}
-	s += sprintf(s,"\n");
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	s += sprintf(s, "%s\n", "disk");
+#else
+	if (s != buf)
+		/* convert the last space to a newline */
+		*(s-1) = "\n";
+#endif
 	return (s - buf);
 }
 
@@ -304,6 +289,12 @@ static ssize_t state_store(struct subsys
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
+	/* first check hibernate */
+	if (strncmp(buf, "disk", len)) {
+		error = hibernate();
+		return error ? error : n;
+	}
+
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && !strncmp(buf, *s, len))
 			break;
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ wireless-dev/include/linux/hibernate.h	2007-04-26 18:21:38.130691185 +0200
@@ -0,0 +1,36 @@
+#ifndef __LINUX_HIBERNATE
+#define __LINUX_HIBERNATE
+/*
+ * hibernate ('suspend to disk') functionality
+ */
+
+/**
+ * struct hibernate_ops - hibernate platform support
+ *
+ * The methods in this structure allow a platform to override what
+ * happens for shutting down the machine when going into hibernation.
+ *
+ * All three methods must be assigned.
+ *
+ * @prepare: prepare system for hibernation
+ * @enter: shut down system after state has been saved to disk
+ * @finish: finish/clean up after state has been reloaded
+ */
+struct hibernate_ops {
+	int (*prepare)(void);
+	int (*enter)(void);
+	void (*finish)(void);
+};
+
+/**
+ * hibernate_set_ops - set the global hibernate operations
+ * @ops: the hibernate operations to use from now on.
+ */
+void hibernate_set_ops(struct hibernate_ops *ops);
+
+/**
+ * hibernate - hibernate the system
+ */
+int hibernate(void);
+
+#endif /* __LINUX_HIBERNATE */
--- wireless-dev.orig/kernel/power/disk.c	2007-04-26 18:15:00.800691185 +0200
+++ wireless-dev/kernel/power/disk.c	2007-04-26 18:15:09.420691185 +0200
@@ -21,45 +21,72 @@
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/freezer.h>
+#include <linux/hibernate.h>
 
 #include "power.h"
 
 
-static int noresume = 0;
+static int noresume;
 char resume_file[256] = CONFIG_PM_STD_PARTITION;
 dev_t swsusp_resume_device;
 sector_t swsusp_resume_block;
 
+static struct hibernate_ops *hibernate_ops;
+static int pm_disk_mode;
+
+enum {
+	PM_DISK_INVALID,
+	PM_DISK_PLATFORM,
+	PM_DISK_TEST,
+	PM_DISK_TESTPROC,
+	PM_DISK_SHUTDOWN,
+	PM_DISK_REBOOT,
+	/* keep last */
+	__PM_DISK_AFTER_LAST
+};
+#define PM_DISK_MAX (__PM_DISK_AFTER_LAST-1)
+#define PM_DISK_FIRST (PM_DISK_INVALID + 1)
+
+void hibernate_set_ops(struct hibernate_ops *ops)
+{
+	BUG_ON(!hibernate_ops->prepare);
+	BUG_ON(!hibernate_ops->enter);
+	BUG_ON(!hibernate_ops->finish);
+	mutex_lock(&pm_mutex);
+	hibernate_ops = ops;
+	mutex_unlock(&pm_mutex);
+}
+
+
 /**
- *	platform_prepare - prepare the machine for hibernation using the
- *	platform driver if so configured and return an error code if it fails
+ *	hibernate_platform_prepare - prepare the machine for hibernation using
+ *	the platform driver if so configured and return an error code if it
+ *	fails.
  */
 
-static inline int platform_prepare(void)
+int hibernate_platform_prepare(void)
 {
-	int error = 0;
-
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
 	case PM_DISK_TESTPROC:
 	case PM_DISK_SHUTDOWN:
 	case PM_DISK_REBOOT:
 		break;
-	default:
-		if (pm_ops && pm_ops->prepare)
-			error = pm_ops->prepare(PM_SUSPEND_DISK);
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops)
+			return hibernate_ops->prepare();
 	}
-	return error;
+	return 0;
 }
 
 /**
- *	power_down - Shut machine down for hibernate.
+ *	hibernate_power_down - Shut machine down for hibernate.
  *
  *	Use the platform driver, if configured so; otherwise try
  *	to power off or reboot.
  */
 
-static void power_down(void)
+static void hibernate_power_down(void)
 {
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
@@ -70,11 +97,10 @@ static void power_down(void)
 	case PM_DISK_REBOOT:
 		kernel_restart(NULL);
 		break;
-	default:
-		if (pm_ops && pm_ops->enter) {
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops) {
 			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-			pm_ops->enter(PM_SUSPEND_DISK);
-			break;
+			hibernate_ops->enter();
 		}
 	}
 
@@ -85,7 +111,7 @@ static void power_down(void)
 	while(1);
 }
 
-static inline void platform_finish(void)
+void hibernate_platform_finish(void)
 {
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
@@ -93,9 +119,9 @@ static inline void platform_finish(void)
 	case PM_DISK_SHUTDOWN:
 	case PM_DISK_REBOOT:
 		break;
-	default:
-		if (pm_ops && pm_ops->finish)
-			pm_ops->finish(PM_SUSPEND_DISK);
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops)
+			hibernate_ops->finish();
 	}
 }
 
@@ -118,13 +144,13 @@ static int prepare_processes(void)
 }
 
 /**
- *	pm_suspend_disk - The granpappy of hibernation power management.
+ *	hibernate - The granpappy of hibernation power management.
  *
  *	If not, then call swsusp to do its thing, then figure out how
  *	to power down the system.
  */
 
-int pm_suspend_disk(void)
+int hibernate(void)
 {
 	int error;
 
@@ -147,7 +173,7 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Finish;
 
-	error = platform_prepare();
+	error = hibernate_platform_prepare();
 	if (error)
 		goto Finish;
 
@@ -175,13 +201,13 @@ int pm_suspend_disk(void)
 
 	if (in_suspend) {
 		enable_nonboot_cpus();
-		platform_finish();
+		hibernate_platform_finish();
 		device_resume();
 		resume_console();
 		pr_debug("PM: writing image.\n");
 		error = swsusp_write();
 		if (!error)
-			power_down();
+			hibernate_power_down();
 		else {
 			swsusp_free();
 			goto Finish;
@@ -194,7 +220,7 @@ int pm_suspend_disk(void)
  Enable_cpus:
 	enable_nonboot_cpus();
  Resume_devices:
-	platform_finish();
+	hibernate_platform_finish();
 	device_resume();
 	resume_console();
  Finish:
@@ -211,7 +237,7 @@ int pm_suspend_disk(void)
  *	Called as a late_initcall (so all devices are discovered and
  *	initialized), we call swsusp to see if we have a saved image or not.
  *	If so, we quiesce devices, the restore the saved image. We will
- *	return above (in pm_suspend_disk() ) if everything goes well.
+ *	return above (in hibernate() ) if everything goes well.
  *	Otherwise, we fail gracefully and return to the normally
  *	scheduled program.
  *
@@ -311,12 +337,13 @@ static const char * const pm_disk_modes[
  *
  *	Suspend-to-disk can be handled in several ways. We have a few options
  *	for putting the system to sleep - using the platform driver (e.g. ACPI
- *	or other pm_ops), powering off the system or rebooting the system
- *	(for testing) as well as the two test modes.
+ *	or other hibernate_ops), powering off the system or rebooting the
+ *	system (for testing) as well as the two test modes.
  *
  *	The system can support 'platform', and that is known a priori (and
- *	encoded in pm_ops). However, the user may choose 'shutdown' or 'reboot'
- *	as alternatives, as well as the test modes 'test' and 'testproc'.
+ *	encoded by the presence of hibernate_ops). However, the user may choose
+ *	'shutdown' or 'reboot' as alternatives, as well as the test modes 'test'
+ *	and 'testproc'.
  *
  *	show() will display what the mode is currently set to.
  *	store() will accept one of
@@ -328,7 +355,7 @@ static const char * const pm_disk_modes[
  *	'testproc'
  *
  *	It will only change to 'platform' if the system
- *	supports it (as determined from pm_ops->pm_disk_mode).
+ *	supports it (as determined by having hibernate_ops).
  */
 
 static ssize_t disk_show(struct subsystem * subsys, char * buf)
@@ -336,7 +363,7 @@ static ssize_t disk_show(struct subsyste
 	int i;
 	char *start = buf;
 
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
+	for (i = PM_DISK_FIRST; i <= PM_DISK_MAX; i++) {
 		if (!pm_disk_modes[i])
 			continue;
 		switch (i) {
@@ -345,9 +372,8 @@ static ssize_t disk_show(struct subsyste
 		case PM_DISK_TEST:
 		case PM_DISK_TESTPROC:
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (i == pm_ops->pm_disk_mode))
+		case PM_DISK_PLATFORM:
+			if (hibernate_ops)
 				break;
 			/* not a valid mode, continue with loop */
 			continue;
@@ -370,19 +396,19 @@ static ssize_t disk_store(struct subsyst
 	int i;
 	int len;
 	char *p;
-	suspend_disk_method_t mode = 0;
+	int mode = PM_DISK_INVALID;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
 	mutex_lock(&pm_mutex);
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
+	for (i = PM_DISK_FIRST; i < PM_DISK_MAX; i++) {
 		if (!strncmp(buf, pm_disk_modes[i], len)) {
 			mode = i;
 			break;
 		}
 	}
-	if (mode) {
+	if (mode != PM_DISK_INVALID) {
 		switch (mode) {
 		case PM_DISK_SHUTDOWN:
 		case PM_DISK_REBOOT:
@@ -390,19 +416,18 @@ static ssize_t disk_store(struct subsyst
 		case PM_DISK_TESTPROC:
 			pm_disk_mode = mode;
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (mode == pm_ops->pm_disk_mode))
+		case PM_DISK_PLATFORM:
+			if (hibernate_ops)
 				pm_disk_mode = mode;
 			else
 				error = -EINVAL;
 		}
-	} else {
+	} else
 		error = -EINVAL;
-	}
 
-	pr_debug("PM: suspend-to-disk mode set to '%s'\n",
-		 pm_disk_modes[mode]);
+	if (!error)
+		pr_debug("PM: suspend-to-disk mode set to '%s'\n",
+			 pm_disk_modes[mode]);
 	mutex_unlock(&pm_mutex);
 	return error ? error : n;
 }
--- wireless-dev.orig/kernel/power/user.c	2007-04-26 18:15:01.130691185 +0200
+++ wireless-dev/kernel/power/user.c	2007-04-26 18:15:09.420691185 +0200
@@ -128,22 +128,6 @@ static ssize_t snapshot_write(struct fil
 	return res;
 }
 
-static inline int platform_prepare(void)
-{
-	int error = 0;
-
-	if (pm_ops && pm_ops->prepare)
-		error = pm_ops->prepare(PM_SUSPEND_DISK);
-
-	return error;
-}
-
-static inline void platform_finish(void)
-{
-	if (pm_ops && pm_ops->finish)
-		pm_ops->finish(PM_SUSPEND_DISK);
-}
-
 static inline int snapshot_suspend(int platform_suspend)
 {
 	int error;
@@ -155,7 +139,7 @@ static inline int snapshot_suspend(int p
 		goto Finish;
 
 	if (platform_suspend) {
-		error = platform_prepare();
+		error = hibernate_platform_prepare();
 		if (error)
 			goto Finish;
 	}
@@ -172,7 +156,7 @@ static inline int snapshot_suspend(int p
 	enable_nonboot_cpus();
  Resume_devices:
 	if (platform_suspend)
-		platform_finish();
+		hibernate_platform_finish();
 
 	device_resume();
 	resume_console();
@@ -188,7 +172,7 @@ static inline int snapshot_restore(int p
 	mutex_lock(&pm_mutex);
 	pm_prepare_console();
 	if (platform_suspend) {
-		error = platform_prepare();
+		error = hibernate_platform_prepare();
 		if (error)
 			goto Finish;
 	}
@@ -204,7 +188,7 @@ static inline int snapshot_restore(int p
 	enable_nonboot_cpus();
  Resume_devices:
 	if (platform_suspend)
-		platform_finish();
+		hibernate_platform_finish();
 
 	device_resume();
 	resume_console();
@@ -406,13 +390,15 @@ static int snapshot_ioctl(struct inode *
 		case PMOPS_ENTER:
 			if (data->platform_suspend) {
 				kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-				error = pm_ops->enter(PM_SUSPEND_DISK);
+				error = hibernate_ops->enter();
+				/* how can this possibly do the right thing? */
 				error = 0;
 			}
 			break;
 
 		case PMOPS_FINISH:
 			if (data->platform_suspend)
+				/* and why doesn't this invoke anything??? */
 				error = 0;
 
 			break;
--- wireless-dev.orig/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:02.120691185 +0200
+++ wireless-dev/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:09.440691185 +0200
@@ -93,21 +93,23 @@ SNAPSHOT_S2RAM - suspend to RAM; using t
 	to resume the system from RAM if there's enough battery power or restore
 	its state on the basis of the saved suspend image otherwise)
 
-SNAPSHOT_PMOPS - enable the usage of the pmops->prepare, pmops->enter and
-	pmops->finish methods (the in-kernel swsusp knows these as the "platform
-	method") which are needed on many machines to (among others) speed up
-	the resume by letting the BIOS skip some steps or to let the system
-	recognise the correct state of the hardware after the resume (in
-	particular on many machines this ensures that unplugged AC
-	adapters get correctly detected and that kacpid does not run wild after
-	the resume).  The last ioctl() argument can take one of the three
-	values, defined in kernel/power/power.h:
+SNAPSHOT_PMOPS - enable the usage of the hibernate_ops->prepare,
+	hibernate_ops->enter and hibernate_ops->finish methods (the in-kernel
+	swsusp knows these as the "platform method") which are needed on many
+	machines to (among others) speed up the resume by letting the BIOS skip
+	some steps or to let the system recognise the correct state of the
+	hardware after the resume (in particular on many machines this ensures
+	that unplugged AC adapters get correctly detected and that kacpid does
+	not run wild after the resume).  The last ioctl() argument can take one
+	of the three values, defined in kernel/power/power.h:
 	PMOPS_PREPARE - make the kernel carry out the
-		pm_ops->prepare(PM_SUSPEND_DISK) operation
+		hibernate_ops->prepare() operation
 	PMOPS_ENTER - make the kernel power off the system by calling
-		pm_ops->enter(PM_SUSPEND_DISK)
+		hibernate_ops->enter()
 	PMOPS_FINISH - make the kernel carry out the
-		pm_ops->finish(PM_SUSPEND_DISK) operation
+		hibernate_ops->finish() operation
+	Note that the actual constants are misnamed because they surface
+	internal kernel implementation details that have changed.
 
 The device's read() operation can be used to transfer the snapshot image from
 the kernel.  It has the following limitations:
--- wireless-dev.orig/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:02.150691185 +0200
+++ wireless-dev/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:09.440691185 +0200
@@ -354,7 +354,7 @@ static void tps65010_interrupt(struct tp
 			 * also needs to get error handling and probably
 			 * an #ifdef CONFIG_SOFTWARE_SUSPEND
 			 */
-			pm_suspend(PM_SUSPEND_DISK);
+			hibernate();
 #endif
 			poll = 1;
 		}
--- wireless-dev.orig/kernel/sys.c	2007-04-26 18:15:01.310691185 +0200
+++ wireless-dev/kernel/sys.c	2007-04-26 18:15:09.450691185 +0200
@@ -25,6 +25,7 @@
 #include <linux/security.h>
 #include <linux/dcookies.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <linux/tty.h>
 #include <linux/signal.h>
 #include <linux/cn_proc.h>
@@ -881,7 +882,7 @@ asmlinkage long sys_reboot(int magic1, i
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	case LINUX_REBOOT_CMD_SW_SUSPEND:
 		{
-			int ret = pm_suspend(PM_SUSPEND_DISK);
+			int ret = hibernate();
 			unlock_kernel();
 			return ret;
 		}
--- wireless-dev.orig/drivers/acpi/sleep/main.c	2007-04-26 18:15:02.290691185 +0200
+++ wireless-dev/drivers/acpi/sleep/main.c	2007-04-26 18:15:09.630691185 +0200
@@ -15,6 +15,7 @@
 #include <linux/dmi.h>
 #include <linux/device.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <acpi/acpi_bus.h>
 #include <acpi/acpi_drivers.h>
 #include "sleep.h"
@@ -29,7 +30,6 @@ static u32 acpi_suspend_states[] = {
 	[PM_SUSPEND_ON] = ACPI_STATE_S0,
 	[PM_SUSPEND_STANDBY] = ACPI_STATE_S1,
 	[PM_SUSPEND_MEM] = ACPI_STATE_S3,
-	[PM_SUSPEND_DISK] = ACPI_STATE_S4,
 	[PM_SUSPEND_MAX] = ACPI_STATE_S5
 };
 
@@ -94,14 +94,6 @@ static int acpi_pm_enter(suspend_state_t
 		do_suspend_lowlevel();
 		break;
 
-	case PM_SUSPEND_DISK:
-		if (acpi_pm_ops.pm_disk_mode == PM_DISK_PLATFORM)
-			status = acpi_enter_sleep_state(acpi_state);
-		break;
-	case PM_SUSPEND_MAX:
-		acpi_power_off();
-		break;
-
 	default:
 		return -EINVAL;
 	}
@@ -157,12 +149,13 @@ int acpi_suspend(u32 acpi_state)
 	suspend_state_t states[] = {
 		[1] = PM_SUSPEND_STANDBY,
 		[3] = PM_SUSPEND_MEM,
-		[4] = PM_SUSPEND_DISK,
 		[5] = PM_SUSPEND_MAX
 	};
 
 	if (acpi_state < 6 && states[acpi_state])
 		return pm_suspend(states[acpi_state]);
+	if (acpi_state == 4)
+		return hibernate();
 	return -EINVAL;
 }
 
@@ -189,6 +182,71 @@ static struct pm_ops acpi_pm_ops = {
 	.finish = acpi_pm_finish,
 };
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static int acpi_hib_prepare(void)
+{
+	return acpi_sleep_prepare(ACPI_STATE_S4);
+}
+
+static int acpi_hib_enter(void)
+{
+	acpi_status status = AE_OK;
+	unsigned long flags = 0;
+	u32 acpi_state = acpi_suspend_states[pm_state];
+
+	ACPI_FLUSH_CPU_CACHE();
+
+	/* Do arch specific saving of state. */
+	int error = acpi_save_state_mem();
+	if (error)
+		return error;
+
+	local_irq_save(flags);
+	acpi_enable_wakeup_device(acpi_state);
+	status = acpi_enter_sleep_state(acpi_state);
+
+	/* ACPI 3.0 specs (P62) says that it's the responsabilty
+	 * of the OSPM to clear the status bit [ implying that the
+	 * POWER_BUTTON event should not reach userspace ]
+	 */
+	if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
+		acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
+
+	local_irq_restore(flags);
+	printk(KERN_DEBUG "Back to C!\n");
+
+	/* restore processor state
+	 * We should only be here if we're coming back from STR or STD.
+	 * And, in the case of the latter, the memory image should have already
+	 * been loaded from disk.
+	 */
+	acpi_restore_state_mem();
+
+	return ACPI_SUCCESS(status) ? 0 : -EFAULT;
+}
+
+static void acpi_hib_finish(void)
+{
+	acpi_leave_sleep_state(ACPI_STATE_S4);
+	acpi_disable_wakeup_device(ACPI_STATE_S4);
+
+	/* reset firmware waking vector */
+	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
+
+	if (init_8259A_after_S1) {
+		printk("Broken toshiba laptop -> kicking interrupts\n");
+		init_8259A(0);
+	}
+	return 0;
+}
+
+static struct hibernate_ops acpi_hib_ops = {
+	.prepare = acpi_hib_prepare,
+	.enter = acpi_hib_enter,
+	.finish = acpi_hib_finish,
+};
+#endif /* CONFIG_SOFTWARE_SUSPEND */
+
 /*
  * Toshiba fails to preserve interrupts over S1, reinitialization
  * of 8259 is needed after S1 resume.
@@ -227,13 +285,16 @@ int __init acpi_sleep_init(void)
 			sleep_states[i] = 1;
 			printk(" S%d", i);
 		}
-		if (i == ACPI_STATE_S4) {
-			if (sleep_states[i])
-				acpi_pm_ops.pm_disk_mode = PM_DISK_PLATFORM;
-		}
 	}
 	printk(")\n");
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	if (sleep_states[ACPI_STATE_S4])
+		hibernate_set_ops(&acpi_hib_ops);
+#else
+	sleep_states[ACPI_STATE_S4] = 0;
+#endif
+
 	pm_set_ops(&acpi_pm_ops);
 	return 0;
 }
--- wireless-dev.orig/kernel/power/power.h	2007-04-26 18:15:01.240691185 +0200
+++ wireless-dev/kernel/power/power.h	2007-04-26 18:15:09.630691185 +0200
@@ -13,16 +13,6 @@ struct swsusp_info {
 
 
 
-#ifdef CONFIG_SOFTWARE_SUSPEND
-extern int pm_suspend_disk(void);
-
-#else
-static inline int pm_suspend_disk(void)
-{
-	return -EPERM;
-}
-#endif
-
 extern struct mutex pm_mutex;
 
 #define power_attr(_name) \
@@ -179,3 +169,6 @@ extern int suspend_enter(suspend_state_t
 struct timeval;
 extern void swsusp_show_speed(struct timeval *, struct timeval *,
 				unsigned int, char *);
+
+extern int hibernate_platform_prepare(void);
+extern void hibernate_platform_finish(void);
--- wireless-dev.orig/drivers/acpi/sleep/proc.c	2007-04-26 18:15:02.720691185 +0200
+++ wireless-dev/drivers/acpi/sleep/proc.c	2007-04-26 18:15:09.630691185 +0200
@@ -1,6 +1,7 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <linux/bcd.h>
 #include <asm/uaccess.h>
 
@@ -60,7 +61,7 @@ acpi_system_write_sleep(struct file *fil
 	state = simple_strtoul(str, NULL, 0);
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	if (state == 4) {
-		error = pm_suspend(PM_SUSPEND_DISK);
+		error = hibernate();
 		goto Done;
 	}
 #endif



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:14                                   ` Chris Friesen
@ 2007-04-26 16:27                                     ` Linus Torvalds
  2007-04-26 17:11                                     ` Johannes Berg
  1 sibling, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 16:27 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Johannes Berg, Pavel Machek, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Rafael J. Wysocki



On Thu, 26 Apr 2007, Chris Friesen wrote:
> 
> I disagree...it's quite common.  I think its the standard way of doing things
> for ppc64, for instance.

It is, although most x86-64 installations seem to be 64-bit user space 
*if*you*install*from*scatch*.

Of course, at least some users (yeah, I've done it) started with a 32-bit 
CD they had lying around, and upgraded just the kernel. And I'm sure some 
distro out there just defaults to 32-bit binaries just because (in 
practice, you have to use a 32-bit firefox anyway if you want flash etc, 
so you need all the 32-bit libraries, so the argument might go that you 
might as well use 32-bit stuff for all the common stuff, and only 64-bit 
binaries when actually needed).

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:35                                 ` Johannes Berg
  2007-04-26 11:33                                   ` Pavel Machek
@ 2007-04-26 16:14                                   ` Chris Friesen
  2007-04-26 16:27                                     ` Linus Torvalds
  2007-04-26 17:11                                     ` Johannes Berg
  1 sibling, 2 replies; 158+ messages in thread
From: Chris Friesen @ 2007-04-26 16:14 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Johannes Berg wrote:

> Judging from experience with the wext 32/64 bit fiasco it seems to be
> rather uncommon to use 32-bit userspace on 64-bit machines.

I disagree...it's quite common.  I think its the standard way of doing 
things for ppc64, for instance.

Chris

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 14:48                               ` Rafael J. Wysocki
@ 2007-04-26 16:10                                 ` Pekka Enberg
  2007-04-26 19:28                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 158+ messages in thread
From: Pekka Enberg @ 2007-04-26 16:10 UTC (permalink / raw)
  To: rjw
  Cc: Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar, Linus Torvalds,
	Nigel Cunningham, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven


On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> In principle, we could add suspend2 as an alternative (in analogy with the I/O
> schedulers, for example), but I think for this purpose it should be reviewed
> properly.

Yeah, this makes sense.

On 4/26/2007, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> There also is a real problem with how it uses the LRU pages.  It _seems_ to
> work, but at least to me it seems to be potentially dangerous.

I am new to suspend2 so can you please explain what exactly is dangerous
about it?

                                     Pekka

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 14:04                                           ` Mark Lord
@ 2007-04-26 16:10                                             ` Linus Torvalds
  2007-04-26 21:00                                               ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 16:10 UTC (permalink / raw)
  To: Mark Lord
  Cc: Alan Cox, Pavel Machek, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Mark Lord wrote:
> Linus Torvalds wrote:
> > 
> > See? Two *totally* different cases. They have *nothing* in common. Not the
> > call sequence, not the logic, not *anything*.
> 
> Except that both methods cannot rely upon hot-pluggable devices
> still being present on resume/restore.  It is exceptionally common
> to unplug all USB/firewire cables, mouse, keyboard, docking cables etc..
> after a machine is in S2R state.

Right, and that has nothing to do with suspend/resume. You'd better be 
able to handle unexpected hotplugs _regardless_.

For example, it's quite common that people just "remove" the 
pcmcia/cardbus card while the driver is active. And in fact, when that 
happens, it's also quite common that the hardware raises the irq for that 
(active) driver (in fact, it's more than common: since the "card removal" 
interrupt for the Cardbus controller is generally always the same as the 
"card interrupt" interrupt for the low-level card driver, you can pretty 
much *guarantee* that you get that interrupt).

So the end result is that the interrupt handler and all normal IO routines 
for a hotpluggable piece of hardware baically _have_ to be able to 
gracefully handle the "oops, the hw simply isn't there any more" case!

The resume code isn't any different at all. It should run perfectly 
normally, but for hotpluggable devices, it has to follow all the same 
rules: handle the "oops, the hw is gone" case gracefully.  No different, 
and it's totally unrelated to suspend/resume: it's a *generic* issue.

In fact, suspend/resume is better off than a lot of the other code is, 
simply because it's easier to test that case and know you hit that 
particular sequence! It's much harder to verify that the "send packet" 
case is safe, because how are you going to know to remove the card at the 
right point to trigger it?

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 13:14                                                   ` Alan Cox
@ 2007-04-26 16:02                                                     ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 16:02 UTC (permalink / raw)
  To: Alan Cox
  Cc: H. Peter Anvin, Pavel Machek, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Alan Cox wrote:
>
> > The PCI spec for controlling DMA is really pretty nasty. You can disable 
> > it in the PCI config word, of course, but that usually just messes up the 
> > device entirely.
> 
> And some devices ignore it. Some of the older Cyrix stuff I have appears
> not to care how the master bit is set.

I'm not surprised. If the choice is between locking up the PCI bus by 
hanging the device in endless retries, or just ignoring the bit, I suspect 
"just ignore it" is actually the better choice.

Of course, in a perfect world you'd happily honor it, raise a PCI error, 
and all is good, but in practice the internal state machine of most 
non-trivial hardware is simply so complicated that the "abort gracefully" 
simply isn't an option.

The hw people have enough problems in getting things to work when 
everything is peachy and well, and a lot of companies end up releasing 
stuff with known errata for even the _normal_ cases, just because they 
expect software to work around them ("Doctor, doctor, it hurts when I do 
the documented access!" "You didn't read errata #317, did you? Don't do 
that, then!")

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:26                               ` Pavel Machek
  2007-04-26 11:35                                 ` Johannes Berg
@ 2007-04-26 15:56                                 ` Linus Torvalds
  2007-04-26 21:06                                   ` Theodore Tso
  1 sibling, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 15:56 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Johannes Berg, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> Yes, probably will. The other option is to break existing 32-bit
> userspace, which is a bit more common AFAICT.

And *this* is why kernel/userspace things simply should not be done.

It's simply better to do things entirely in the kernel. Because you add 
bugs and complications otherwise, and doing it in the kernel allows you 
to just switch things around.

As it is, it appears that user-space suspend is just broken whichever way 
we turn.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:09                         ` Pavel Machek
@ 2007-04-26 15:53                           ` Linus Torvalds
  2007-04-26 18:21                           ` Olivier Galibert
  1 sibling, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26 15:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Johannes Berg, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> #define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
> #define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)
> #define SNAPSHOT_AVAIL_SWAP		_IOR(SNAPSHOT_IOC_MAGIC, 7, void *)
> #define SNAPSHOT_GET_SWAP_PAGE		_IOR(SNAPSHOT_IOC_MAGIC, 8, void *)
> #define SNAPSHOT_SET_SWAP_FILE		_IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int)
> #define SNAPSHOT_PMOPS			_IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int)
> 
> Are these a problem? Do we need to just use u32 as a argument to keep
> ioctl numbers same between 32 and 64bit versions?

No, you need to use the *proper* type as an argument, and assuming that 
type has the same representation in both 32-bit and 64-bit world, the 
numbers will automatically match.

Using "void *" is totally bogus. It's supposed to be the actual argument 
you pass in, not the pointer to it. If your argument doesn't have a 
"struct xyz" kind of format, then you could use "int" (or u32 or 
something: but realistically int is 32-bit for the forseeable future), but 
it's always wrong to pass in "void *" or "unsigned long", since either of 
those are just a sign of the interface being either (a) misunderstood or 
(b) broken.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:12                             ` Pekka Enberg
@ 2007-04-26 14:48                               ` Rafael J. Wysocki
  2007-04-26 16:10                                 ` Pekka Enberg
  0 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 14:48 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Pavel Machek, Dumitru Ciobarcianu, Ingo Molnar, Linus Torvalds,
	Nigel Cunningham, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

On Thursday, 26 April 2007 13:12, Pekka Enberg wrote:
> On 4/25/07, Pavel Machek <pavel@ucw.cz> wrote:
> > > Please stop using FUD.
> > > Graphical progress it's not in the kernel, even with suspend2.
> >
> > It was ascii-art, but still 'graphical', last time I checked.
> 
> Suspend2 talks to an userspace client via netlink. While I find the
> name of the message ("redraw UI") rather appaling, there's nothing
> wrong in principle that userspace starts the suspend process and the
> kernel keeps feeding back progress information ("I froze all processes
> now") so it can display a graphical progress bar.
> 
> The real question here is what to do with compression and encryption.
> However, if you settle for one compression algorithm (such as LZF in
> the case of suspend2) and use the _existing_ in-kernel crypto API for
> encryption, suddenly the benefits of userspace suspend are not clear.
> 
> As you and Rafael seem to be mostly interested in uswsusp, why don't
> we replace the old in-kernel implementation with suspend2?

It has a lot of common code with uswsusp.  Practically, the saving of the image
is the only part of it that could be removed, but this is simple and _really_
helps with debugging.

In principle, we could add suspend2 as an alternative (in analogy with the I/O
schedulers, for example), but I think for this purpose it should be reviewed
properly.

There also is a real problem with how it uses the LRU pages.  It _seems_ to
work, but at least to me it seems to be potentially dangerous.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  1:10                                         ` Linus Torvalds
@ 2007-04-26 14:04                                           ` Mark Lord
  2007-04-26 16:10                                             ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Mark Lord @ 2007-04-26 14:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Pavel Machek, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven

Linus Torvalds wrote:
> 
> See? Two *totally* different cases. They have *nothing* in common. Not the 
> call sequence, not the logic, not *anything*.

Except that both methods cannot rely upon hot-pluggable devices
still being present on resume/restore.  It is exceptionally common
to unplug all USB/firewire cables, mouse, keyboard, docking cables etc..
after a machine is in S2R state.

Cheers

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:40                       ` Pavel Machek
  2007-04-26 11:11                         ` Johannes Berg
@ 2007-04-26 13:45                         ` Johannes Berg
  1 sibling, 0 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 13:45 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 324 bytes --]

By the way.

> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index eb461b8..dc13af5 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
        ^^^^^^^^^^^^^^^^^^^^

Don't these definitions need to be exported to userspace? That
definitely is not a header file for userspace.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  2:32                                                 ` Linus Torvalds
@ 2007-04-26 13:14                                                   ` Alan Cox
  2007-04-26 16:02                                                     ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Alan Cox @ 2007-04-26 13:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Pavel Machek, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven

> The PCI spec for controlling DMA is really pretty nasty. You can disable 
> it in the PCI config word, of course, but that usually just messes up the 
> device entirely.

And some devices ignore it. Some of the older Cyrix stuff I have appears
not to care how the master bit is set.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 12:15                       ` Ingo Molnar
@ 2007-04-26 12:41                         ` Pavel Machek
  0 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 12:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Hellwig, Johannes Berg, Linus Torvalds, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Andrew Morton, Arjan van de Ven

Hi!

> > > The interface isn't even 64/32-bit compatible...
> > 
> > It's not .  And it's one of the worst interface I've seen lately. Did 
> > anyone actually review this crap before it went in?  I completely 
> > agree with Linus that these kind of boundaries that lead to horribly 
> > complex ioctl interface are totally wrong.
> 
> it's a bit hard to see the point of it anyway: the resume binary (much 
> of the focus of the ioctls) fundamentally lives as an 'initrd binary' - 
> and most of the stuff that wants to execute in an initrd is 
> fundamentally tied to the kernel anyway.

Typically... yes, it needs to be in initrd.

And yes, klibc would help here.

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:35                     ` Christoph Hellwig
@ 2007-04-26 12:15                       ` Ingo Molnar
  2007-04-26 12:41                         ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Ingo Molnar @ 2007-04-26 12:15 UTC (permalink / raw)
  To: Christoph Hellwig, Johannes Berg, Pavel Machek, Linus Torvalds,
	Nick Piggin, Mike Galbraith, linux-kernel, Thomas Gleixner,
	Con Kolivas, suspend2-devel, Andrew Morton, Arjan van de Ven


* Christoph Hellwig <hch@infradead.org> wrote:

> > The interface isn't even 64/32-bit compatible...
> 
> It's not .  And it's one of the worst interface I've seen lately. Did 
> anyone actually review this crap before it went in?  I completely 
> agree with Linus that these kind of boundaries that lead to horribly 
> complex ioctl interface are totally wrong.

it's a bit hard to see the point of it anyway: the resume binary (much 
of the focus of the ioctls) fundamentally lives as an 'initrd binary' - 
and most of the stuff that wants to execute in an initrd is 
fundamentally tied to the kernel anyway.

Perhaps we should allow "in-kernel userspace" that would be allowed to 
grow ad-hoc interfaces and linking that would only be compatible with 
the kernel they are embedded into: e.g. the klibc stuff in linux/usr/* 
could link to the kernel (via whatever method) and just be in essence 
another type of kernel code - but happening to execute in user-space, 
having access to the normal user-space facilities and being able to link 
to (GPL) user-space libraries. Perhaps this would bridge the "i want to 
tinker in user-space because it's technically easier/cleaner there" and 
"fine but that needs formalized ABIs for your connection to 
kernel-space" gap.

> Now suspend2 wasn't exactly nice either when I last reviewed it, but 
> we should probably give it another attempt if we can sort out a proper 
> incremental merge.

yeah, it still has quite a bit of work left, but it looked fundamentally 
split-uppable.

	Ingo

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:30                                         ` Pavel Machek
@ 2007-04-26 11:41                                           ` Johannes Berg
  2007-04-26 16:31                                           ` Johannes Berg
  1 sibling, 0 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 11:41 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nigel Cunningham, Nick Piggin, suspend2-devel,
	Mike Galbraith, linux-kernel, Con Kolivas, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:

> That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> low-level enter is pretty similar).

But that doesn't excuse abusing the same interface, IMHO.

> Patches would be welcome

:) I'll see what I can do. Shouldn't be too hard to add an interface
just for ACPI here and get platform disk-mode into there from a
different angle.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:17                   ` Johannes Berg
  2007-04-26 10:30                     ` Pavel Machek
@ 2007-04-26 11:35                     ` Christoph Hellwig
  2007-04-26 12:15                       ` Ingo Molnar
  1 sibling, 1 reply; 158+ messages in thread
From: Christoph Hellwig @ 2007-04-26 11:35 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Linus Torvalds, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

On Thu, Apr 26, 2007 at 12:17:12PM +0200, Johannes Berg wrote:
> On Tue, 2007-04-24 at 23:24 +0200, Pavel Machek wrote:
> 
> > I believe uswsusp user/kernel separation is clean enough. Kernel
> > provides "snapshot image" and "resume image". (Thanks go to Rafael for
> > very clean interface).
> 
> The interface isn't even 64/32-bit compatible...

It's not .  And it's one of the worst interface I've seen lately.
Did anyone actually review this crap before it went in?  I completely
agree with Linus that these kind of boundaries that lead to horribly
complex ioctl interface are totally wrong.

Now suspend2 wasn't exactly nice either when I last reviewed it,
but we should probably give it another attempt if we can sort out
a proper incremental merge.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:26                               ` Pavel Machek
@ 2007-04-26 11:35                                 ` Johannes Berg
  2007-04-26 11:33                                   ` Pavel Machek
  2007-04-26 16:14                                   ` Chris Friesen
  2007-04-26 15:56                                 ` Linus Torvalds
  1 sibling, 2 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 11:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 480 bytes --]

On Thu, 2007-04-26 at 13:26 +0200, Pavel Machek wrote:

> Yes, probably will. The other option is to break existing 32-bit
> userspace, which is a bit more common AFAICT.

Judging from experience with the wext 32/64 bit fiasco it seems to be
rather uncommon to use 32-bit userspace on 64-bit machines.

Rafael hinted that we could just add these numbers, keep the existing
ones and then phase them out over time, but I haven't really given it
much thought.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:35                                 ` Johannes Berg
@ 2007-04-26 11:33                                   ` Pavel Machek
  2007-04-26 16:14                                   ` Chris Friesen
  1 sibling, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:33 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Hi!

> > Yes, probably will. The other option is to break existing 32-bit
> > userspace, which is a bit more common AFAICT.
> 
> Judging from experience with the wext 32/64 bit fiasco it seems to be
> rather uncommon to use 32-bit userspace on 64-bit machines.

Well, it would break 32-bit userspace on 32-bit kernel, which is the
most common version, AFAICT.

> Rafael hinted that we could just add these numbers, keep the existing
> ones and then phase them out over time, but I haven't really given it
> much thought.

We could probably do that... but it is slightly ugly.



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:39                                       ` Johannes Berg
@ 2007-04-26 11:30                                         ` Pavel Machek
  2007-04-26 11:41                                           ` Johannes Berg
  2007-04-26 16:31                                           ` Johannes Berg
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:30 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nigel Cunningham, Nick Piggin, suspend2-devel,
	Mike Galbraith, linux-kernel, Con Kolivas, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Arjan van de Ven

Hi!

> > That's where I started: whole "suspend to disk" thing actually has _more_ 
> > to do with "shutdown" than with "suspend". 
> 
> From looking at pm_ops which I was recently working with a lot, it seems
> that it was designed by somebody who was reading the ACPI documentation
> and was otherwise pretty clueless, even at that level std tries to look
> like suspend. IMHO that is one of the first things that should be ripped
> out, no pm_ops for STD, it's a pain to work with.

That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
low-level enter is pretty similar).

Patches would be welcome, as would be "suspend-to-ram maintainer".
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:41                     ` Matt Mackall
@ 2007-04-26 11:27                       ` Pavel Machek
  2007-04-26 19:04                       ` Bill Davidsen
  1 sibling, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:27 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > We do not want to fragment the testing base, and suspend2 does not
> > really have any interesting features over uswsusp.
> 
> The testing base is already fragmented!
> 
> What the current situation means is that you simply never hear from
> the people who get fed up with suspend but who manage to get suspend2
> working.

Well, and what can I do?

I can wait for suspend2 to slowly disappear on their own. Merging
suspend2 would just make testing base _more_ fragmented then it is today. 
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:16                           ` Pavel Machek
@ 2007-04-26 11:27                             ` Johannes Berg
  2007-04-26 11:26                               ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 11:27 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 369 bytes --]

On Thu, 2007-04-26 at 13:16 +0200, Pavel Machek wrote:

> This one should prevent ioctl numbers changing, too.

> -#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
> +#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, u32) /* void * */

Afaict that'll actually change ioctl numbers breaking existing 64-bit
userspace.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:27                             ` Johannes Berg
@ 2007-04-26 11:26                               ` Pavel Machek
  2007-04-26 11:35                                 ` Johannes Berg
  2007-04-26 15:56                                 ` Linus Torvalds
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:26 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Hi!

> > This one should prevent ioctl numbers changing, too.
> 
> > -#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
> > +#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, u32) /* void * */
> 
> Afaict that'll actually change ioctl numbers breaking existing 64-bit
> userspace.

Yes, probably will. The other option is to break existing 32-bit
userspace, which is a bit more common AFAICT.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:11                         ` Johannes Berg
@ 2007-04-26 11:16                           ` Pavel Machek
  2007-04-26 11:27                             ` Johannes Berg
  0 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:16 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Hi!

> > Does this seem to help?
> 
> No idea, I haven't actually tried it yet, last time I tried uswsusp on
> my 32/32 machine it didn't work due to endian problems that were
> supposed to be resolved but I haven't had a chance to pick all the bits
> together that you need.

This one should prevent ioctl numbers changing, too.

diff --git a/kernel/power/power.h b/kernel/power/power.h
index eb461b8..a18b85a 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -114,23 +114,23 @@ extern int snapshot_image_loaded(struct 
  * SNAPSHOT_SET_SWAP_AREA ioctl
  */
 struct resume_swap_area {
-	loff_t offset;
+	u_int64_t offset;
 	u_int32_t dev;
 } __attribute__((packed));
 
 #define SNAPSHOT_IOC_MAGIC	'3'
 #define SNAPSHOT_FREEZE			_IO(SNAPSHOT_IOC_MAGIC, 1)
 #define SNAPSHOT_UNFREEZE		_IO(SNAPSHOT_IOC_MAGIC, 2)
-#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
+#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, u32) /* void * */
 #define SNAPSHOT_ATOMIC_RESTORE		_IO(SNAPSHOT_IOC_MAGIC, 4)
 #define SNAPSHOT_FREE			_IO(SNAPSHOT_IOC_MAGIC, 5)
-#define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)
-#define SNAPSHOT_AVAIL_SWAP		_IOR(SNAPSHOT_IOC_MAGIC, 7, void *)
-#define SNAPSHOT_GET_SWAP_PAGE		_IOR(SNAPSHOT_IOC_MAGIC, 8, void *)
+#define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, u32) /* unsigned long */
+#define SNAPSHOT_AVAIL_SWAP		_IOR(SNAPSHOT_IOC_MAGIC, 7, u32) /* void * */
+#define SNAPSHOT_GET_SWAP_PAGE		_IOR(SNAPSHOT_IOC_MAGIC, 8, u32) /* void * */
 #define SNAPSHOT_FREE_SWAP_PAGES	_IO(SNAPSHOT_IOC_MAGIC, 9)
-#define SNAPSHOT_SET_SWAP_FILE		_IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int)
+#define SNAPSHOT_SET_SWAP_FILE		_IOW(SNAPSHOT_IOC_MAGIC, 10, u32) /* unsigned int */
 #define SNAPSHOT_S2RAM			_IO(SNAPSHOT_IOC_MAGIC, 11)
-#define SNAPSHOT_PMOPS			_IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int)
+#define SNAPSHOT_PMOPS			_IOW(SNAPSHOT_IOC_MAGIC, 12, u32) /* unsigned int */
 #define SNAPSHOT_SET_SWAP_AREA		_IOW(SNAPSHOT_IOC_MAGIC, 13, \
 							struct resume_swap_area)
 #define SNAPSHOT_IOC_MAXNR	13
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 558e18e..d0730c1 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -215,8 +215,7 @@ static int snapshot_ioctl(struct inode *
 {
 	int error = 0;
 	struct snapshot_data *data;
-	loff_t avail;
-	sector_t offset;
+	u64 avail, offset;
 
 	if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
 		return -ENOTTY;
@@ -286,7 +285,7 @@ static int snapshot_ioctl(struct inode *
 	case SNAPSHOT_AVAIL_SWAP:
 		avail = count_swap_pages(data->swap, 1);
 		avail <<= PAGE_SHIFT;
-		error = put_user(avail, (loff_t __user *)arg);
+		error = put_user(avail, (u64 __user *)arg);
 		break;
 
 	case SNAPSHOT_GET_SWAP_PAGE:
@@ -304,7 +303,7 @@ static int snapshot_ioctl(struct inode *
 		offset = alloc_swapdev_block(data->swap, data->bitmap);
 		if (offset) {
 			offset <<= PAGE_SHIFT;
-			error = put_user(offset, (sector_t __user *)arg);
+			error = put_user(offset, (u64 __user *)arg);
 		} else {
 			error = -ENOSPC;
 		}




-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  8:10                           ` Pavel Machek
  2007-04-25  8:22                             ` Dumitru Ciobarcianu
@ 2007-04-26 11:12                             ` Pekka Enberg
  2007-04-26 14:48                               ` Rafael J. Wysocki
  1 sibling, 1 reply; 158+ messages in thread
From: Pekka Enberg @ 2007-04-26 11:12 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Dumitru Ciobarcianu, Ingo Molnar, Linus Torvalds,
	Nigel Cunningham, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven, rjw

On 4/25/07, Pavel Machek <pavel@ucw.cz> wrote:
> > Please stop using FUD.
> > Graphical progress it's not in the kernel, even with suspend2.
>
> It was ascii-art, but still 'graphical', last time I checked.

Suspend2 talks to an userspace client via netlink. While I find the
name of the message ("redraw UI") rather appaling, there's nothing
wrong in principle that userspace starts the suspend process and the
kernel keeps feeding back progress information ("I froze all processes
now") so it can display a graphical progress bar.

The real question here is what to do with compression and encryption.
However, if you settle for one compression algorithm (such as LZF in
the case of suspend2) and use the _existing_ in-kernel crypto API for
encryption, suddenly the benefits of userspace suspend are not clear.

As you and Rafael seem to be mostly interested in uswsusp, why don't
we replace the old in-kernel implementation with suspend2?

                                         Pekka

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:40                       ` Pavel Machek
@ 2007-04-26 11:11                         ` Johannes Berg
  2007-04-26 11:16                           ` Pavel Machek
  2007-04-26 13:45                         ` Johannes Berg
  1 sibling, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 11:11 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 335 bytes --]

On Thu, 2007-04-26 at 12:40 +0200, Pavel Machek wrote:

> Does this seem to help?

No idea, I haven't actually tried it yet, last time I tried uswsusp on
my 32/32 machine it didn't work due to endian problems that were
supposed to be resolved but I haven't had a chance to pick all the bits
together that you need.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 11:04                       ` Johannes Berg
@ 2007-04-26 11:09                         ` Pavel Machek
  2007-04-26 15:53                           ` Linus Torvalds
  2007-04-26 18:21                           ` Olivier Galibert
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 11:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Hi!

> > > > I believe uswsusp user/kernel separation is clean enough. Kernel
> > > > provides "snapshot image" and "resume image". (Thanks go to Rafael for
> > > > very clean interface).
> > > 
> > > The interface isn't even 64/32-bit compatible...
> > 
> > Which parts?
> 
> ioctl numbers last time I talked about it with Rafael. No effort was
> made to fix it.

#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
#define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)
#define SNAPSHOT_AVAIL_SWAP		_IOR(SNAPSHOT_IOC_MAGIC, 7, void *)
#define SNAPSHOT_GET_SWAP_PAGE		_IOR(SNAPSHOT_IOC_MAGIC, 8, void *)
#define SNAPSHOT_SET_SWAP_FILE		_IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int)
#define SNAPSHOT_PMOPS			_IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int)

Are these a problem? Do we need to just use u32 as a argument to keep
ioctl numbers same between 32 and 64bit versions?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:30                     ` Pavel Machek
  2007-04-26 10:40                       ` Pavel Machek
@ 2007-04-26 11:04                       ` Johannes Berg
  2007-04-26 11:09                         ` Pavel Machek
  1 sibling, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 11:04 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 535 bytes --]

On Thu, 2007-04-26 at 12:30 +0200, Pavel Machek wrote:
> On Thu 2007-04-26 12:17:12, Johannes Berg wrote:
> > On Tue, 2007-04-24 at 23:24 +0200, Pavel Machek wrote:
> > 
> > > I believe uswsusp user/kernel separation is clean enough. Kernel
> > > provides "snapshot image" and "resume image". (Thanks go to Rafael for
> > > very clean interface).
> > 
> > The interface isn't even 64/32-bit compatible...
> 
> Which parts?

ioctl numbers last time I talked about it with Rafael. No effort was
made to fix it.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:30                     ` Pavel Machek
@ 2007-04-26 10:40                       ` Pavel Machek
  2007-04-26 11:11                         ` Johannes Berg
  2007-04-26 13:45                         ` Johannes Berg
  2007-04-26 11:04                       ` Johannes Berg
  1 sibling, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 10:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

Hi!

> > The interface isn't even 64/32-bit compatible...
> 
> Which parts?
> 
> ioctl(AVAIL_SWAP,
> 	...hmm, is this the one you are complaining about? It returns
> 	loff_t through a pointer.  Maybe there's another interface
> 	that can return available swap, and we should use that,
>	 instead?

loff_t is 64bit on i386, so I do not see immediate problem here, but
maybe we should just explicitely pass u64?

> ioctl(GET_SWAP_PAGE,
> 	returns sector_t through a pointer. NOt sure if that's good
> 	idea, either.

Ok, that's very bad idea, because sector_t can be 32-bit or 64-bit,
depending on CONFIG_LBD. We need to use u64 here.

> ioctl(SET_SWAP_FILE,
> 	does old_decode_dev(arg). Is that ok?
> 
> ioctl(SET_SWAP_AREA,
> 	shares struct resume_swap_area between user and kernel. I
> 	guess that's bad..?

struct resume_swap_area {
        loff_t offset;
        u_int32_t dev;
} __attribute__((packed));

...I guess we should change loff_t -> u64 and problem is solved?

Old_decode_dev takes u16. That sucks for majors/minors > 256, but
fortunately those are not common.

Does this seem to help?
									Pavel

diff --git a/kernel/power/power.h b/kernel/power/power.h
index eb461b8..dc13af5 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -114,7 +114,7 @@ extern int snapshot_image_loaded(struct 
  * SNAPSHOT_SET_SWAP_AREA ioctl
  */
 struct resume_swap_area {
-	loff_t offset;
+	u_int64_t offset;
 	u_int32_t dev;
 } __attribute__((packed));
 
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 558e18e..d0730c1 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -215,8 +215,7 @@ static int snapshot_ioctl(struct inode *
 {
 	int error = 0;
 	struct snapshot_data *data;
-	loff_t avail;
-	sector_t offset;
+	u64 avail, offset;
 
 	if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
 		return -ENOTTY;
@@ -286,7 +285,7 @@ static int snapshot_ioctl(struct inode *
 	case SNAPSHOT_AVAIL_SWAP:
 		avail = count_swap_pages(data->swap, 1);
 		avail <<= PAGE_SHIFT;
-		error = put_user(avail, (loff_t __user *)arg);
+		error = put_user(avail, (u64 __user *)arg);
 		break;
 
 	case SNAPSHOT_GET_SWAP_PAGE:
@@ -304,7 +303,7 @@ static int snapshot_ioctl(struct inode *
 		offset = alloc_swapdev_block(data->swap, data->bitmap);
 		if (offset) {
 			offset <<= PAGE_SHIFT;
-			error = put_user(offset, (sector_t __user *)arg);
+			error = put_user(offset, (u64 __user *)arg);
 		} else {
 			error = -ENOSPC;
 		}

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:55                                     ` Linus Torvalds
  2007-04-25 23:13                                       ` Pavel Machek
  2007-04-26  1:40                                       ` Nigel Cunningham
@ 2007-04-26 10:39                                       ` Johannes Berg
  2007-04-26 11:30                                         ` Pavel Machek
  2 siblings, 1 reply; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 10:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nigel Cunningham, Nick Piggin, suspend2-devel, Mike Galbraith,
	linux-kernel, Con Kolivas, Andrew Morton, Pavel Machek,
	Thomas Gleixner, Ingo Molnar, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 544 bytes --]

On Wed, 2007-04-25 at 15:55 -0700, Linus Torvalds wrote:

> That's where I started: whole "suspend to disk" thing actually has _more_ 
> to do with "shutdown" than with "suspend". 

From looking at pm_ops which I was recently working with a lot, it seems
that it was designed by somebody who was reading the ACPI documentation
and was otherwise pretty clueless, even at that level std tries to look
like suspend. IMHO that is one of the first things that should be ripped
out, no pm_ops for STD, it's a pain to work with.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 10:17                   ` Johannes Berg
@ 2007-04-26 10:30                     ` Pavel Machek
  2007-04-26 10:40                       ` Pavel Machek
  2007-04-26 11:04                       ` Johannes Berg
  2007-04-26 11:35                     ` Christoph Hellwig
  1 sibling, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26 10:30 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven, Rafael J. Wysocki

On Thu 2007-04-26 12:17:12, Johannes Berg wrote:
> On Tue, 2007-04-24 at 23:24 +0200, Pavel Machek wrote:
> 
> > I believe uswsusp user/kernel separation is clean enough. Kernel
> > provides "snapshot image" and "resume image". (Thanks go to Rafael for
> > very clean interface).
> 
> The interface isn't even 64/32-bit compatible...

Which parts?

read/write on /dev/snapshot looks ok.

ioctl(SNAPSHOT_FREEZE, UNFREEZE, ATOMIC_RESTORE, FREE, FREE_SWAP_PAGE,
	SNAPSHOT_S2RAM,
	is okay, because it does not pass any data.

ioctl(ATOMIC_SNAPSHOT,
	returns 0/1 through pointer. Should be ok. (Maybe we should do 

		if (!error)
			error = put_user(in_suspend, (u32 __user *)arg);

	...instead, to make it very explicit?

ioctl(SET_IMAGE_SIZE,
	is okay, because it just uses arg directly.

ioctl(PMOPS,
	is okay, because it just uses arg directly... and it is in
	range 0-3 or something.

ioctl(AVAIL_SWAP,
	...hmm, is this the one you are complaining about? It returns
	loff_t through a pointer.  Maybe there's another interface
	that can return available swap, and we should use that, instead?

ioctl(GET_SWAP_PAGE,
	returns sector_t through a pointer. NOt sure if that's good
	idea, either.

ioctl(SET_SWAP_FILE,
	does old_decode_dev(arg). Is that ok?

ioctl(SET_SWAP_AREA,
	shares struct resume_swap_area between user and kernel. I
	guess that's bad..?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 21:24                 ` Pavel Machek
  2007-04-24 23:41                   ` Linus Torvalds
@ 2007-04-26 10:17                   ` Johannes Berg
  2007-04-26 10:30                     ` Pavel Machek
  2007-04-26 11:35                     ` Christoph Hellwig
  1 sibling, 2 replies; 158+ messages in thread
From: Johannes Berg @ 2007-04-26 10:17 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 290 bytes --]

On Tue, 2007-04-24 at 23:24 +0200, Pavel Machek wrote:

> I believe uswsusp user/kernel separation is clean enough. Kernel
> provides "snapshot image" and "resume image". (Thanks go to Rafael for
> very clean interface).

The interface isn't even 64/32-bit compatible...

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  7:27                                   ` David Lang
@ 2007-04-26  9:45                                     ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  9:45 UTC (permalink / raw)
  To: David Lang
  Cc: Olivier Galibert, Linus Torvalds, Adrian Bunk, Pavel Machek,
	Ingo Molnar, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 2028 bytes --]

Hi.

On Thu, 2007-04-26 at 00:27 -0700, David Lang wrote:
> On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> 
> > Hi.
> >
> > On Thu, 2007-04-26 at 01:33 +0200, Olivier Galibert wrote:
> >> On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> >>> .. but if the alternative is a feature that just isn't worth it, and
> >>> likely to not only have its own bugs, but cause bugs elsewhere? (And yes,
> >>> I believe STD is both of those. There's a reason it's called "STD". Go
> >>> to google and type "STD" and press "I'm feeling lucky". Google is God).
> >>
> >> If it was correctly designed, it would be possible to change the
> >> hardware or even the kernel through a STD cycle.  And that would be
> >> damn interesting on servers.
> >
> > Those are different issues - hardware hot/cold plugging for the first.
> >
> > Changing the kernel through a cycle - that's not a design fault. The
> > problem there is that the kernel and it's associated data structures are
> > part of the state. Changing the kernel and keeping the image would
> > require exactly correspondence in data structures, memory map and so on.
> > That's why the same kernel is required.
> 
> that depends on exactly what you save in your snapshot.
> 
> one approach is to try and save absolutly everything in ram (this is the current 
> approach)
> 
> if you do this then you do need to use the same kernel for the reasons that you 
> list.
> 
> however, you could also decide to only save the information about processes on 
> the system (i.e. what you absolutly have to) and let the kernel re-initialize 
> itself (along with it's devices) then you could use a different kernel safely. 
> doing this should also save you a significant amount of storage when makeing 
> your snapshot

Well, there is cryopid for individual processes. I suppose you could
potentially try doing a mass cryopiding. That would make things a lot
more complicated though. I'm not saying it's not doable.

Regards,

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  1:56                                 ` Nigel Cunningham
@ 2007-04-26  7:27                                   ` David Lang
  2007-04-26  9:45                                     ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: David Lang @ 2007-04-26  7:27 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Olivier Galibert, Linus Torvalds, Adrian Bunk, Pavel Machek,
	Ingo Molnar, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Thomas Gleixner, Arjan van de Ven

On Thu, 26 Apr 2007, Nigel Cunningham wrote:

> Hi.
>
> On Thu, 2007-04-26 at 01:33 +0200, Olivier Galibert wrote:
>> On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
>>> .. but if the alternative is a feature that just isn't worth it, and
>>> likely to not only have its own bugs, but cause bugs elsewhere? (And yes,
>>> I believe STD is both of those. There's a reason it's called "STD". Go
>>> to google and type "STD" and press "I'm feeling lucky". Google is God).
>>
>> If it was correctly designed, it would be possible to change the
>> hardware or even the kernel through a STD cycle.  And that would be
>> damn interesting on servers.
>
> Those are different issues - hardware hot/cold plugging for the first.
>
> Changing the kernel through a cycle - that's not a design fault. The
> problem there is that the kernel and it's associated data structures are
> part of the state. Changing the kernel and keeping the image would
> require exactly correspondence in data structures, memory map and so on.
> That's why the same kernel is required.

that depends on exactly what you save in your snapshot.

one approach is to try and save absolutly everything in ram (this is the current 
approach)

if you do this then you do need to use the same kernel for the reasons that you 
list.

however, you could also decide to only save the information about processes on 
the system (i.e. what you absolutly have to) and let the kernel re-initialize 
itself (along with it's devices) then you could use a different kernel safely. 
doing this should also save you a significant amount of storage when makeing 
your snapshot

David Lang


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:24                                       ` Alan Cox
  2007-04-26  1:10                                         ` Linus Torvalds
@ 2007-04-26  7:08                                         ` Andy Grover
  1 sibling, 0 replies; 158+ messages in thread
From: Andy Grover @ 2007-04-26  7:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: suspend2-devel

Alan Cox wrote:

> Mind you some laptops think S2RAM is just a transition state on the way
> to disk, leave them in ACPI S2RAM and the firmware will magically turn it
> into a save to disk and back to ram if the battery runs low or you leave
> it idle too long.

The OS does this (or at least it's supposed to). STR with battery low,
it comes back on fully via a battery wake event, and then STD (aka
snapshot/poweroff). The ACPI state machine always goes through S0, FWIW.

OK, now back to the "should we have 2 function pointers or 4" debate...

-- Andy


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  3:03                                             ` Linus Torvalds
@ 2007-04-26  3:34                                               ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  3:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 5779 bytes --]

Hi.

On Wed, 2007-04-25 at 20:03 -0700, Linus Torvalds wrote:
> 
> On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> > 
> > Sorry. I wasn't clear. I wasn't saying that suspend to ram has a
> > snapshot point. I was trying to say it has a point where you're seeking
> > to save information (PCI state / SCSI transaction number or whatever)
> > that you'll need to get the hardware into the same state at a later
> > stage. That (saving information) is the point of similarity.
> 
> Yes, they do both save information, but I'm not actually convinced they 
> would necessarily even save the *same* information.
> 
> Let's just take an example of USB, and to make things more interesting, 
> say that the disk you want to suspend to is itself over USB (not 
> necessarily something you _want_ to do, but I think we can all agree that 
> it's something that should potentially work, no?)

Agreed - it would be nice.

> Now, USB devices actually have per-connection state (at a minimum, the 
> "toggle" bit or whatever), and that's obviously something that will 
> inevitably *change* as a result of the device being used after 
> snapshotting (and even if not used, by the rediscovery by the first kernel 
> to boot), and we fundamentally cannot put the final toggle state in the 
> snapshot.
> 
> So in the snapshot-to-disk scenario, there are some pieces of data that 
> simply fundamentally *cannot* be snapshotted, because they are not 
> controller state, they are "connection" state.
> 
> So in that case, you basically know that you *have* to rebuild the 
> connection when you do the "snapshot_resume()" thing. So there's no point 
> in even keeping these kinds of connection states (the same is true of 
> keyboards, mice, anything else - it's how USB works).

Sort of agree - you might want to record some serial number that might
let you recognise it as the same thing at resume time when everything is
re-hotplugged (assuming it's even there then). Nevertheless, I don't
think that diminishes what you're saying.

> In contrast, in suspend-to-RAM, USB connections might just be things you 
> actually want to keep open and active, and you *can* do so, in ways you 
> simply cannot do with "snapshot to disk". In fact, if you are something 
> like an OLPC and actually go to s2ram very aggressively, you might well 
> want to keep the connection established, because it's conceivable that you 
> might otherwise lose keypresses etc issues)
> 
> See? There are real *technical* reasons to believe that the two "save 
> state" operations are really fundamentally different. There are reasons to 
> believe that a s2ram can actually happen while keeping some connections 
> open that cannot be kept open over a disk snapshot.
> 
> Do they *have* to be different? Of course not. For many devices the "save" 
> and "freeze" operations will likely all be no-ops, and there would be 
> absolutely no difference between suspending and snapshotting, because the 
> driver state already natively contains all the information needed to get 
> the device going again.
> 
> Equally, I don't doubt that in many drivers you'll have very similar "save 
> state" logic, but in fact I believe that in many cases that "save state" 
> logic will often just be a simple
> 
> 	pci_save_state(dev);
> 
> call, so it's literally the case that they will not be just shared between 
> the "suspend" and "snapshot" case, they'll be shared across all simple PCI 
> devices too!
> 
> But that doesn't mean that the functions to do so should be the same. You 
> might have
> 
> 	static int mypcidevice_suspend(struct pci_dev *dev)
> 	{
> 		pci_save_state(dev);
> 		pci_set_power_state(dev, PCI_D3);
> 		return 0;
> 	}
> 
> 	static int mupcidevice_snapshot(struct pci_dev *dev)
> 	{
> 		pci_save_state(dev);
> 		return 0;
> 	}
> 
> and who cares if they both have that same call to a shared "save state" 
> function? They're still totally different operations, and the fact that 
> *some* devices may save the same things doesn't make them any more 
> similar! See above why some devices might save totally *different* things 
> for a "snapshot" vs a "suspend" event.

No disagreement here.

> > I suppose that's another point of similarity - for snapshotting, the
> > same ordering is probably needed?
> 
> I agree that you're likely to walk the device list in the same order. The 
> whole "shut down leaf devices first", "start up root devices first" is 
> pretty fundamental.
> 
> But that's true of reboot and device discovery too. Should that ordering 
> mean that we should use the "discovery()" function and pass it a flag and 
> say "you shouldn't discover, you should snapshot or suspend now"? No. 
> Everybody agrees that device discovery is something different from device 
> suspend. The fact that it's done in a topological order and thus they bear 
> some kind of inverse relationship to each other doesn't make them "the 
> same".
> 
> > > And yes, the _individual_ "save-and-suspend" events obviously needs to be 
> > > "atomic", but it's purely about that particular individual device, so 
> > > there's never any cross-device issues about that.
> > 
> > No interdependencies? I'm not sure.
> 
> Well, we pretty much count on it, since we will *suspend* the devices at 
> the same time. So if they had interdependencies that aren't described by 
> the ordering we enforce, they are pretty much screwed anyway ;)
> 
> So yes, the device list needs to be topologically sorted (and you need to 
> walk it in the right direction), but apart from that we'd *better* not 
> have any interdependencies, or we simply cannot suspend at all.

Thanks for your reply.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  2:13                                           ` Nigel Cunningham
@ 2007-04-26  3:03                                             ` Linus Torvalds
  2007-04-26  3:34                                               ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  3:03 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> 
> Sorry. I wasn't clear. I wasn't saying that suspend to ram has a
> snapshot point. I was trying to say it has a point where you're seeking
> to save information (PCI state / SCSI transaction number or whatever)
> that you'll need to get the hardware into the same state at a later
> stage. That (saving information) is the point of similarity.

Yes, they do both save information, but I'm not actually convinced they 
would necessarily even save the *same* information.

Let's just take an example of USB, and to make things more interesting, 
say that the disk you want to suspend to is itself over USB (not 
necessarily something you _want_ to do, but I think we can all agree that 
it's something that should potentially work, no?)

Now, USB devices actually have per-connection state (at a minimum, the 
"toggle" bit or whatever), and that's obviously something that will 
inevitably *change* as a result of the device being used after 
snapshotting (and even if not used, by the rediscovery by the first kernel 
to boot), and we fundamentally cannot put the final toggle state in the 
snapshot.

So in the snapshot-to-disk scenario, there are some pieces of data that 
simply fundamentally *cannot* be snapshotted, because they are not 
controller state, they are "connection" state.

So in that case, you basically know that you *have* to rebuild the 
connection when you do the "snapshot_resume()" thing. So there's no point 
in even keeping these kinds of connection states (the same is true of 
keyboards, mice, anything else - it's how USB works).

In contrast, in suspend-to-RAM, USB connections might just be things you 
actually want to keep open and active, and you *can* do so, in ways you 
simply cannot do with "snapshot to disk". In fact, if you are something 
like an OLPC and actually go to s2ram very aggressively, you might well 
want to keep the connection established, because it's conceivable that you 
might otherwise lose keypresses etc issues)

See? There are real *technical* reasons to believe that the two "save 
state" operations are really fundamentally different. There are reasons to 
believe that a s2ram can actually happen while keeping some connections 
open that cannot be kept open over a disk snapshot.

Do they *have* to be different? Of course not. For many devices the "save" 
and "freeze" operations will likely all be no-ops, and there would be 
absolutely no difference between suspending and snapshotting, because the 
driver state already natively contains all the information needed to get 
the device going again.

Equally, I don't doubt that in many drivers you'll have very similar "save 
state" logic, but in fact I believe that in many cases that "save state" 
logic will often just be a simple

	pci_save_state(dev);

call, so it's literally the case that they will not be just shared between 
the "suspend" and "snapshot" case, they'll be shared across all simple PCI 
devices too!

But that doesn't mean that the functions to do so should be the same. You 
might have

	static int mypcidevice_suspend(struct pci_dev *dev)
	{
		pci_save_state(dev);
		pci_set_power_state(dev, PCI_D3);
		return 0;
	}

	static int mupcidevice_snapshot(struct pci_dev *dev)
	{
		pci_save_state(dev);
		return 0;
	}

and who cares if they both have that same call to a shared "save state" 
function? They're still totally different operations, and the fact that 
*some* devices may save the same things doesn't make them any more 
similar! See above why some devices might save totally *different* things 
for a "snapshot" vs a "suspend" event.

> I suppose that's another point of similarity - for snapshotting, the
> same ordering is probably needed?

I agree that you're likely to walk the device list in the same order. The 
whole "shut down leaf devices first", "start up root devices first" is 
pretty fundamental.

But that's true of reboot and device discovery too. Should that ordering 
mean that we should use the "discovery()" function and pass it a flag and 
say "you shouldn't discover, you should snapshot or suspend now"? No. 
Everybody agrees that device discovery is something different from device 
suspend. The fact that it's done in a topological order and thus they bear 
some kind of inverse relationship to each other doesn't make them "the 
same".

> > And yes, the _individual_ "save-and-suspend" events obviously needs to be 
> > "atomic", but it's purely about that particular individual device, so 
> > there's never any cross-device issues about that.
> 
> No interdependencies? I'm not sure.

Well, we pretty much count on it, since we will *suspend* the devices at 
the same time. So if they had interdependencies that aren't described by 
the ordering we enforce, they are pretty much screwed anyway ;)

So yes, the device list needs to be topologically sorted (and you need to 
walk it in the right direction), but apart from that we'd *better* not 
have any interdependencies, or we simply cannot suspend at all.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  2:04                                               ` H. Peter Anvin
@ 2007-04-26  2:32                                                 ` Linus Torvalds
  2007-04-26 13:14                                                   ` Alan Cox
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  2:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Pavel Machek, Alan Cox, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, H. Peter Anvin wrote:
> 
> That was the 1990s.  On a brand new server system:
> 
> 00:08.0 System peripheral: Intel Corporation 5000 Series Chipset DMA
> Engine (rev b1)
> 
> For better or worse, slave DMA seems to be making a comeback of sorts.
> Not to mention all kinds of embedded crap^Whardware with optimized DMA
> engines which look nothing like PCI at all.

Well, the solution to that tends to be to just leave them be, and hold 
them on until the very end - and just ignore them (and just make-believe 
that it's actually the device itself that does the DMA transfer).

The PCI spec for controlling DMA is really pretty nasty. You can disable 
it in the PCI config word, of course, but that usually just messes up the 
device entirely.

So in practice, the way to shut up DMA (regardless of whether it's an 
internal DMA engine or an external one) is that you just tell the device 
not to listen any more (for example, for a network controller, the way to 
make sure it doesn't do DMA is just to make sure that you're not sending 
any frames, but also that it's not listening to any either)!

So whether it's internal to the device, or some "system DMA controller", 
the sequence for shutting down DMA always ends up being the same:

 - make sure the host itself doesn't generate any new traffic (eg shut 
   down the send-queue). This is generally a higher-level thing anyway, ie 
   not really a driver decision.
 - the driver needs to tell the hardware to stop listening (ie "stop 
   scanning the command mailboxes" or "stop walking USB command 
   structures" or "stop receiving data")
 - the driver then needs to wait for the controller to say "ok, I'm idle".

because regardless of whether it's the system DMA controller or some 
on-chip DMA controller, you generally can *not* just say "stop 
transferring DMA data", because that will generally just lock the chip up 
or cause other major unhappiness.

So I don't think an external DMA controller (like the i8237, ugh!) really 
_changes_ anything. Except for just the horrible pain of serializing 
access to them for programming etc horrible resource handling issues, of 
course (but that's not specific to suspend/resume).

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  2:04                                         ` Linus Torvalds
  2007-04-26  2:13                                           ` Nigel Cunningham
@ 2007-04-26  2:31                                           ` Nigel Cunningham
  1 sibling, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  2:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 539 bytes --]

Hi.

Hmm. Perhaps I should have added to that last reply that recognising
that they store similar information doesn't mean I think they need the
same high-level routine for both state transitions.

I'd really like to see each driver have some sort of state machine
controlling its power management, into which these calls were just
another input (an important one, but just another alongside information
about policy, whether we're on battery (UPS or laptop) or AC, whether
the device is actually being used and so on.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  2:04                                         ` Linus Torvalds
@ 2007-04-26  2:13                                           ` Nigel Cunningham
  2007-04-26  3:03                                             ` Linus Torvalds
  2007-04-26  2:31                                           ` Nigel Cunningham
  1 sibling, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  2:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 3837 bytes --]

Hi.

On Wed, 2007-04-25 at 19:04 -0700, Linus Torvalds wrote:
> 
> On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> >
> > That's where I think you're overstretching the argument. Like suspend 
> >(to ram), we're concerned at the snapshot point with getting the hardware 
> >in the same state at a later stage.
> 
> Really, no.
> 
> "suspend to ram" doesn't _have_ a "snapshot point".

Sorry. I wasn't clear. I wasn't saying that suspend to ram has a
snapshot point. I was trying to say it has a point where you're seeking
to save information (PCI state / SCSI transaction number or whatever)
that you'll need to get the hardware into the same state at a later
stage. That (saving information) is the point of similarity.

> I've tried to explain this multiple times, I don't know why it's not 
> apparently sinking in. This is much more fundamental than the fact that 
> you don't want to stop disks for snapshotting, although it really boils 
> down to all the same issues: the operations are simply not at all the 
> same!

Miscommunication, I think. Does the above help?

> I agree 100% that "snapshot to disk" is a "snapshot event". You have to 
> create a single point in time when everything is stable. And I'd much 
> rather call it "snapshot to disk" than "suspend to disk" to make it clear 
> that it's something _totally_ different from "suspend".
> 
> Because the thing is, "suspend to ram" is *not* a snapshot event. At no 
> point do you actually need to "snapshot" the system at all. You can just 
> gradually shut more and more things down, and equally gradually bring them 
> back up. There simply is *never* any "snapshot" time from a device 
> standpoint, because you can just shut down devices in the right order AND 
> YOU ARE DONE.
> 
> Really. 

I suppose that's another point of similarity - for snapshotting, the
same ordering is probably needed?

> [ Obviously s2ram does have one "magic moment", namely the time when the 
>   CPU does the magic read from the northbridge that actually turns off 
>   power for the CPU. But that's really a total non-event from a device 
>   standpoint, so while it's undoubtedly a very interesting moment in the 
>   suspend sequence, it's not really relevant in any way for device 
>   drivers in general. Not at all like the "snapshot moment" that requires 
>   the whole system to be totally quiescent in a "snapshot to disk"! ]
> 
> And the reason s2ram doesn't have a that "snapshot" moment is exactly that 
> the RAM contents are just always there, so there's no need to have a 
> "synchronization event" when ram and devices match. The RAM will *always* 
> match whatever any particular device has done to it, and the proper way to 
> handle things is to just do a simple per-device "save-and-suspend" event.

Yeah.

> And yes, the _individual_ "save-and-suspend" events obviously needs to be 
> "atomic", but it's purely about that particular individual device, so 
> there's never any cross-device issues about that.

No interdependencies? I'm not sure.

> For example, if you're a USB hub controller, which is just about the most 
> complex issue you can have, you obviously want to "save the state" with 
> the controller in a STOPPED state, but that should just go without saying: 
> if the controller isn't stopped, you simply *cannot* save the state, since 
> the state is changing under you. 
> 
> The difference is, that the USB driver needs to just "stop, save, and 
> suspend" as one simple operation for s2ram. In contrast, when doing 
> snapshot to disk, it cannot do that, because while it does want to do the 
> "stop" part, it needs to do so _separately_ from the "save" part because 
> you need to stop everything else *too* before you "save" anythng at all.

Agree.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  1:40                                       ` Nigel Cunningham
@ 2007-04-26  2:04                                         ` Linus Torvalds
  2007-04-26  2:13                                           ` Nigel Cunningham
  2007-04-26  2:31                                           ` Nigel Cunningham
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  2:04 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Nigel Cunningham wrote:
>
> That's where I think you're overstretching the argument. Like suspend 
>(to ram), we're concerned at the snapshot point with getting the hardware 
>in the same state at a later stage.

Really, no.

"suspend to ram" doesn't _have_ a "snapshot point".

I've tried to explain this multiple times, I don't know why it's not 
apparently sinking in. This is much more fundamental than the fact that 
you don't want to stop disks for snapshotting, although it really boils 
down to all the same issues: the operations are simply not at all the 
same!

I agree 100% that "snapshot to disk" is a "snapshot event". You have to 
create a single point in time when everything is stable. And I'd much 
rather call it "snapshot to disk" than "suspend to disk" to make it clear 
that it's something _totally_ different from "suspend".

Because the thing is, "suspend to ram" is *not* a snapshot event. At no 
point do you actually need to "snapshot" the system at all. You can just 
gradually shut more and more things down, and equally gradually bring them 
back up. There simply is *never* any "snapshot" time from a device 
standpoint, because you can just shut down devices in the right order AND 
YOU ARE DONE.

Really. 

[ Obviously s2ram does have one "magic moment", namely the time when the 
  CPU does the magic read from the northbridge that actually turns off 
  power for the CPU. But that's really a total non-event from a device 
  standpoint, so while it's undoubtedly a very interesting moment in the 
  suspend sequence, it's not really relevant in any way for device 
  drivers in general. Not at all like the "snapshot moment" that requires 
  the whole system to be totally quiescent in a "snapshot to disk"! ]

And the reason s2ram doesn't have a that "snapshot" moment is exactly that 
the RAM contents are just always there, so there's no need to have a 
"synchronization event" when ram and devices match. The RAM will *always* 
match whatever any particular device has done to it, and the proper way to 
handle things is to just do a simple per-device "save-and-suspend" event.

And yes, the _individual_ "save-and-suspend" events obviously needs to be 
"atomic", but it's purely about that particular individual device, so 
there's never any cross-device issues about that.

For example, if you're a USB hub controller, which is just about the most 
complex issue you can have, you obviously want to "save the state" with 
the controller in a STOPPED state, but that should just go without saying: 
if the controller isn't stopped, you simply *cannot* save the state, since 
the state is changing under you. 

The difference is, that the USB driver needs to just "stop, save, and 
suspend" as one simple operation for s2ram. In contrast, when doing 
snapshot to disk, it cannot do that, because while it does want to do the 
"stop" part, it needs to do so _separately_ from the "save" part because 
you need to stop everything else *too* before you "save" anythng at all.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:38                                             ` Linus Torvalds
@ 2007-04-26  2:04                                               ` H. Peter Anvin
  2007-04-26  2:32                                                 ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: H. Peter Anvin @ 2007-04-26  2:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Alan Cox, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven

Linus Torvalds wrote:
> 
> On Thu, 26 Apr 2007, Pavel Machek wrote:
>> Ok, I guess I'll have nightmares of DMA controllers doing DMAs from
>> chips that are no longer there tonight.
> 
> Umm. Welcome to the 21st century: we don't do that "separate DMA 
> controller" thing any more. All devices do their own DMA.
> 

That was the 1990s.  On a brand new server system:

00:08.0 System peripheral: Intel Corporation 5000 Series Chipset DMA
Engine (rev b1)

For better or worse, slave DMA seems to be making a comeback of sorts.
Not to mention all kinds of embedded crap^Whardware with optimized DMA
engines which look nothing like PCI at all.

	-hpa

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:33                               ` Olivier Galibert
@ 2007-04-26  1:56                                 ` Nigel Cunningham
  2007-04-26  7:27                                   ` David Lang
  0 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  1:56 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Linus Torvalds, Adrian Bunk, Pavel Machek, Ingo Molnar,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]

Hi.

On Thu, 2007-04-26 at 01:33 +0200, Olivier Galibert wrote:
> On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> > .. but if the alternative is a feature that just isn't worth it, and 
> > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > I believe STD is both of those. There's a reason it's called "STD". Go 
> > to google and type "STD" and press "I'm feeling lucky". Google is God).
> 
> If it was correctly designed, it would be possible to change the
> hardware or even the kernel through a STD cycle.  And that would be
> damn interesting on servers.

Those are different issues - hardware hot/cold plugging for the first.

Changing the kernel through a cycle - that's not a design fault. The
problem there is that the kernel and it's associated data structures are
part of the state. Changing the kernel and keeping the image would
require exactly correspondence in data structures, memory map and so on.
That's why the same kernel is required.

> In any case, if I could trust it, I'd use it when I need to move
> servers around and I don't want to lose what is running.  Riding power
> cuts that way would be nice.

That's what Rafael and I working on.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:45                                           ` Pavel Machek
@ 2007-04-26  1:48                                             ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  1:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1623 bytes --]

Hi.

On Thu, 2007-04-26 at 01:45 +0200, Pavel Machek wrote:
> > What's your argument? Your argument seems to be that it's not stupid, 
> > because it can work. Can't you see that that simply isn't an
> > argument at 
> 
> I tried keeping module_init/thaw/resume similar code, so that driver
> authors can debug suspend-to-disk, cross their fingers, and have
> suspend-to-ram work, too.



> Now, perhaps enough people do std/str these days so this is not
> important any longer... lets hope so.

Noooo! It's important and getting more important. More and more, people
are going to be demanding better power saving (climate change and all
that stuff). The best power saving is to have the thing completely off,
so STD is more important. The second best power saving is STR, so that's
important too. But even more important is good power saving all the
time.

For that reason, I agree completely with Linus. The current model is far
too limited. It shouldn't be so suspend-to-ram/disk centric, and should
instead focus on run-time power management, with suspend to ram and disk
as particular instances of run-time power management. It should make
appropriate differentiation between snapshotting and suspending to ram.

I do disagree that the current suspend-to-disk algorithm is broken. We
do need a point at which we say "Ok, drivers, record your state." - the
current device_suspend and device_resume calls. But that doesn't mean
the need to be called device_suspend/resume or do what they do now.

I'd love to help make this happen, but I'm afraid I just don't have the
time.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:55                                     ` Linus Torvalds
  2007-04-25 23:13                                       ` Pavel Machek
@ 2007-04-26  1:40                                       ` Nigel Cunningham
  2007-04-26  2:04                                         ` Linus Torvalds
  2007-04-26 10:39                                       ` Johannes Berg
  2 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-26  1:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 2410 bytes --]

Hi.

On Wed, 2007-04-25 at 15:55 -0700, Linus Torvalds wrote:
> 
> On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> > > 
> > > And name *one* thing that have in common.
> > 
> > Set/reset the scsi transaction id thingy? Hibernation didn't work with
> > SCSI for a long time precisely because that support was missing.
> 
> And by "hibernation", you mean what? You mean "snapshot + shutdown", 
> right?
> 
> Think about it for five seconds, and then ask yourself: at which point in 
> the "snapshot + shutdown" sequence would you actually tell a disk to shut 
> down?
> 
> If you said "snapshot", then you'd be *wrong*. 

No, I didn't. I agree with you that they should be separate and
distinct. I'm just pointing out that you're overstretching your argument
a little. There are some similiarities in that in both cases we want to
get the driver into some quiet state and out of it again. The difference
comes from the fact that the quite states don't have to be the same and
shouldn't be the same. I won't insult your intelligence by describing
the differences in more detail.

> That's my _point_. The snapshot() function should not (and MUST NOT) tell 
> disks to shut down, because unlike suspend(), we're still going to _use_ 
> those disks afterwards (why? To write out the snapshot image!).
> 
> In other words, the act of creating a snapshot has *nothing* to do with 
> suspend.

Absolutely. It's about getting the data we need to restore it to the
same state post-hibernation-cycle (or, more correctly)
post-atomic-restore.

> Now, after you've created (and written out) the snapshot, what do you 
> actually end up doing?
> 
> That's right - you end up _shutting down_ the machine, and yes, as part 
> of the _shutdown_ sequence you may actually end up doing a lot of the 
> things that a suspend would do. But that's long *after* you've actually 
> done the "snapshot" part, and has absolutely nothing to do with it.
> 
> That's where I started: whole "suspend to disk" thing actually has _more_ 
> to do with "shutdown" than with "suspend". 

That's where I think you're overstretching the argument. Like suspend
(to ram), we're concerned at the snapshot point with getting the
hardware in the same state at a later stage. Unlike suspend, we don't
necessarily want it to enter a low power-usage state as part of that
state preservation.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:25                               ` Adrian Bunk
                                                   ` (3 preceding siblings ...)
  2007-04-25 22:13                                 ` Kenneth Crudup
@ 2007-04-26  1:25                                 ` Antonino A. Daplas
  4 siblings, 0 replies; 158+ messages in thread
From: Antonino A. Daplas @ 2007-04-26  1:25 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Pavel Machek, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On Wed, 2007-04-25 at 21:25 +0200, Adrian Bunk wrote:
> On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Wed, 25 Apr 2007, Adrian Bunk wrote:
> > > 
> > > 3W for the complete system? In CPU state S1? [1]
> > 
> > In STR, 3W is quite realistic. The CPU is off, all (or most - up to you) 
> > the devices are off, but the motherboard and memory is powered.
> 
> As far as I understand it, the CPU isn't off in S1.
> 
> > > And even 3W would still be a waste of energy.

It is, especially if you're living in a place where power infrastructure
is unreliable (such as where I live). Currently, because of the summer
heat, power demand exceeds power supply so we experience practically
daily rotating 4-hour power interruption. 

That 3W saved multiplied by the total number of computers is a lot.
In this perspective, S2D (or shutdown) is preferred over S2RAM.

Tony



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:24                                       ` Alan Cox
@ 2007-04-26  1:10                                         ` Linus Torvalds
  2007-04-26 14:04                                           ` Mark Lord
  2007-04-26  7:08                                         ` Andy Grover
  1 sibling, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  1:10 UTC (permalink / raw)
  To: Alan Cox
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Alan Cox wrote:
> 
> You bet there is. We need to know if data arrived or not, because there
> is no guarantee that the data retrieved if we inadvertently re-execute a
> command will be the same. The hardware state itself isn't the problem,
> its the combination of hardware state and internal state which need to
> match in some cases.

... which is why "suspend()" suspends the hardware.

Is that so hard to understand?

Once the hardware is suspended, it's not doing anything.

But STR doesn't have any need for atomicity guarantees _between_devices_.

That's a really *fundamental* difference. 

The reason s2ram is *so* different from snapshot-to-disk is exactly the 
fact that s2ram can (and does) work on one device at a time. 

In contrast, snapshot-to-disk needs to snapshot all the devices 
*together*, since it has a separate disk image.

See? Two *totally* different cases. They have *nothing* in common. Not the 
call sequence, not the logic, not *anything*.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:08                         ` Linus Torvalds
  2007-04-25 20:27                           ` Pavel Machek
@ 2007-04-26  0:41                           ` Thomas Orgis
  1 sibling, 0 replies; 158+ messages in thread
From: Thomas Orgis @ 2007-04-26  0:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, suspend2-devel, Mike Galbraith,
	linux-kernel, Con Kolivas, Andrew Morton, Pavel Machek,
	Thomas Gleixner, Ingo Molnar, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 4979 bytes --]

Sort of my 2-many-cents story on why I need "snapshot/restore"...

Am Wed, 25 Apr 2007 13:08:09 -0700 (PDT)
schrieb Linus Torvalds <torvalds@linux-foundation.org>: 

> 
> 
> On Wed, 25 Apr 2007, Kenneth Crudup wrote:
> > 
> > Any working suspend-to-disk method takes care of that for me.  (I'm
> > really not sure why Linus hates S2D so much, though. Back in the day
> > there was a lot more BIOS support, but that's been years now.)
> 
> The really sad part is that APM actually did this better.. 

This really triggers a nerve in me. My laptops (always used models from
some years ago, even) didn't necessarily get easier with respect to power
management (suspend) over time.

My first laptop (Siemens Scenic Mobile 710, 200Mhz Pentium, maxed to
192MB RAM) worked just fine with APM, be it s2ram or s2disk.
Everything handled by the BIOS.
Admittedly, S2disk was quite slow as it stored all ram and didn't write
to the disk as fast as possible, but it worked.
S2ram was also a viable option because I was even able to easily swap
batteries because the thing had two bays to put batteries in.

The next one was a Toshiba Portege 7020 CT (366MHz Pentium2 with dynamic
clock, 192MB), supporting both APM and ACPI.
Installing Linux was not that easy, I think I remember that APM in kernel
froze the box (early 2.6 kernel), while ACPI needed some headache to set
up (compiling a fixed DSDT into the kernel, for example)... I needed
experimental toshiba_acpi to get functions and the acpi_pm_timer to
get something like continuous system clock (special cpu throttling has
funny effects).
Well, I got it together after some time.
Used suspend2 for "snapshot/restore" and actually was able to use ACPI
S3 with the glitch of having to unload/load psmouse driver ... until I
realized that it only resumed in about 80% of cases (BIOS ....).
So suspend2 was a badly needed "hack" around the hardware/BIOS to get
some sane workflow.
I remember dealing with swsusp / pmdisk before... but I really ended
up with suspend2 as the thing that works (and I wouldn't have bothered
finding this patch if the in-kernel stuff worked for me).
Of course this was a long time ago and recently I have seen that
in-kernel swsusp works ok, just this unresponsiveness after "restore"
due to missing page cache...

Now I have an IBM ThinkPad X31 (600-1.4GHz Pentium M, 512MB).
ACPI. SpeedStep.
The machine generally works fine, hardware config via ACPI seems to
be fine.
But doing S3/STR? Well... this machine has the odd idea that turning the
system off but the screen backlight back on after a second is a good idea.
Of course just now S3 worked fine... you cannot even depend on the
malfunction -- could have something to do with changing bootup video
from LCD to VGA output for some other reason recently.
Hm. Perhaps it even may work (after tricking the BIOS!?). But I doubt
I'll suddenly develop trust in that.
I _had_ trust in APM STR and STD.
I am quite confident in suspend2 being able to correctly resume (restore)
after a successful suspend (snapshot/restore).

And then, STR doesn't help me on the road when I need to exchange the
battery (I'd need this special extra battery to put under the ThinkPad
for that).
Another thing is that the old Siemens has a nice auxilliary monochrome
LCD that shows the charge status of the batteries in 5 levels, so you
have some means to predict the time you have in STR. The Thinkpad has
greed LED for "battery level OK" and red for "battery level low".
Well, but the Linux kernel won't change that...

Perhaps at some time ACPI implementations in BIOS get to something
reliable (hm, should I get a PowerBook instead?) and can be a good partner
for Linux which struggles for many years now to get into the post-APM era.
Remember reading desktop PC test reports in the c't magazine in the last
years, S3 usually did _not_ work; with Windows, even.
Well, there must be a reason Microsoft chose to implement the "hibernate"
(it _is_ in software, right?).

The APM->ACPI transition made me use the software STD
(snapshot/restore...;-) and I think I will stay with it for the
forseeable future, and be it because I can do fancy things like image
encryption.
ACPI S3 / STR is a nice addition when it works, for the smaller pauses
(changing a train at the station, leaving office for half an hour...),
but I consider STD really to be the more important feature that enables
me to _never_ close my applications unless I want to do a kernel update.

I really must say that some sort of STD is a total must for a laptop for me.
On the other hand I once had a Psion 5MX, which basically was on STR all
the (non-working) time -- and enabled well over 20h of working time on two AAs.
When laptops enter that range of battery life, I guess I could arrange with
just doing STR and won't have to worry about changing batteries without AC
connection;-)


Alrighty then,

Thomas.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:14                                           ` Pavel Machek
  2007-04-25 23:51                                             ` David Lang
@ 2007-04-26  0:38                                             ` Linus Torvalds
  2007-04-26  2:04                                               ` H. Peter Anvin
  1 sibling, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  0:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> Ok, I guess I'll have nightmares of DMA controllers doing DMAs from
> chips that are no longer there tonight.

Umm. Welcome to the 21st century: we don't do that "separate DMA 
controller" thing any more. All devices do their own DMA.

> Only the fact that we are currently using same device call during
> snapshot() and during restore(). We obviously could do _5_ device
> calls
> 
> (suspend/resume/freeze/quiesce_disable_dma/thaw)
> 
> ...but that looks like too many calls to me.

I'd much rather have five or even more functions that each do *one* 
obvious thing. 

Think like a device driver writer: would you prefer to just implement five 
functions that do something very specific that you know trivially how to 
do ("I know how to disable interrupts and DMA") or would you want to do 
some high-level opertion that you don't even know why the caller asks you 
to suspend? What does "suspend()" even mean when the caller is just going 
to wake up up immediately again? Is it performance-critical? Should I tear 
down all my DMA's? I dunno!

In other words, splitting things up actually makes things simpler. That's 
*doubly* true if you can then give each specific function some really 
clear goals.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:05                                         ` Linus Torvalds
  2007-04-26  0:14                                           ` Pavel Machek
@ 2007-04-26  0:34                                           ` Linus Torvalds
  2007-04-26 20:12                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  0:34 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Linus Torvalds wrote:
> 
> The *thaw* needs to happen with devices quiescent. 

Btw, I sure as hell hope you didn't use "suspend()" for that. You're 
(again) much better off having a totally separate function that just 
freezes stuff.

So in the "snapshot+shutdown" path, you should have:

 - prepare_to_snapshot() - allocate memory, and possibly return errors

   We can skip this, if we just make the rule be that any devices that 
   want to support snapshotting must always have the memory required for 
   snapshotting pre-allocated. Most devices really do allocate memory for 
   their state anyway, and the only real reason for the "prepare" stage 
   here is becasue the final snapshot has to happen with interrupts off, 
   obviously. So *if* we don't need to allocate any memory, and if we 
   don't expect to want to accept some early error case, this is likely 
   useless.

 - snapshot() - actually save device state that is consistent with the 
   memory image at the time. Called with interrupts off, but the device 
   has to be usable both before and afterwards!

And I would seriously suggest that "snapshot()" be documented to not rely 
on any DMA memory, exactly because the device has to be accessible both 
before and after (before - because we're running and allocating memory, 
and after - because we'll be writing thigns out). But see later:

For the "resume snapshot" path, I would suggest having 

 - freeze(): quiesce the device. This literally just does the absolute 
   minimum to make sure that the device doesn't do anything surprising (no 
   interrupts, no DMA, no nothing). For many devices, it's a no-op, even 
   if they can do DMA (eg most disk controllers will do DMA, but only as 
   an actual result of a request, and upper layers will be quiescent 
   anyway, so they do *not* need to disable DMA)

   NOTE! The "freeze()" gets called from the *old* kernel just _before_ a
   snapshot unpacking!!

 - restart_snapshot() - actually restart the snapshot (and usually this 
   would involve re-setting the device, not so much trying to restore all 
   the saved state. IOW, it's easier to just re-initialize the DMA command 
   queues than to try to make them "atomic" in the snapshot).

   NOTE! This gets called by the *new* kernel _after_ the snapshot resume!

And if you *want* to, I can see that you might want to actually do a 
"unfreeze()" thing too, and make the actual shapshotting be:

	/* We may not even need this.. */
	for_each_device() {
		err = prepare_to_snapshot();
		if (err)
			return err;
	}

	/* This is the real work for snapshotting */
	cli();
	for_each_device()
		freeze(dev);
	for_each_device()
		snapshot(dev);
	.. snapshot current memory image ..
	for_each_device_depth_first()
		unfreeze(dev);
	sti();

and maybe it's worth it, but I would almost suggest that you just make the 
rule be that any DMA etc just *has* to be re-initialized by 
"restart_snapshot()", in which case it's not even necessary to 
freeze/unfreeze over the device, and "snapshot()" itself only needs to 
make sure any non-DMA data is safe.

But adding the freeze/unfreeze (which is a no-op for most hardware anyway) 
might make things easier to think about, so I would certainly not *object* 
to it, even if I suspect it's not necessary.

Anyway, the restore_snapshot() sequence should be:

	/* Old kernel.. Normal boot, load snapshot image */
	cli()
	for_each_device()
		freeze(dev);
	restore_snapshot_image();
	restore_regs_and_jump_to_image();
	/* noreturn */


	/* New kernel, gets called at the snapshot restore address
	 * with interrupts off and devices frozen, and memory image
	 * constsntent with what it was at "snapshot()" time
	 */
	for_each_dev_depth_first()
		restore_snapshot(dev);
	/* And if you want to, just to be "symmetric"

		for_each_dev_depth_first()
			unfreeze(dev)

	   although I think you could just make "restore_snapshot()" 
	   implicitly unfreeze it too..
	 */
	sti();
	/* We're up */

and notice how *different* this is from what happens for s2ram. There 
really isn't anything in common here. Exactly because s2ram simply doesn't 
_have_ any of the issues with atomic memory images.

So s2ram is just

	for_each_dev()
		suspend(dev);
	cli();
	for_each_dev()
		late_suspend(dev);
	.. go to sleep ..
	for_each_dev_depth_first()
		early_resume(dev);
	sti();
	for_each_dev_depth_first()
		resume(dev);

and has none of the "freeze" issues at all.

Doesn't that seem a lot more straightforward? Yes, it's more functions, 
but each function is a lot more obvious. This follows the unix rule of "do 
one thing, and do that thing well", instead of trying to make one function 
do many very different things depending on what you actually want done..

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:20                                     ` Linus Torvalds
  2007-04-25 23:52                                       ` Pavel Machek
@ 2007-04-26  0:24                                       ` Alan Cox
  2007-04-26  1:10                                         ` Linus Torvalds
  2007-04-26  7:08                                         ` Andy Grover
  1 sibling, 2 replies; 158+ messages in thread
From: Alan Cox @ 2007-04-26  0:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

> STR does not need to "ensure that you have a consistent snapshot".

Linus I think someone's been spiking your guinness again...

> Why? Becuase there is no _room_ for inconsistency. There's nothing to be 
> "inconsistent with", since any changes to memory (by things like DMA or 
> other setup that happens while the suspend process is going on) is by 
> _definition_ consistent with the resume image (becasue there is no 
> separate image).

You bet there is. We need to know if data arrived or not, because there
is no guarantee that the data retrieved if we inadvertently re-execute a
command will be the same. The hardware state itself isn't the problem,
its the combination of hardware state and internal state which need to
match in some cases.

> off DMA and try to make the hardware be wevy wevy quiet while it's hunting 
> wabbits, it's a lot easier to just do nothing at all on "freeze", and just 
> make sure that "thaw" will re-initialze the DMA tables entirely! All 

Who cares about DMA mapping tables, those are easy to deal with, not even
that bad with an IOMMU to restore. More problematic is the users data
because if we have a device where re-executing a command is not
repeatable (eg O_DIRECT SCSI on a shared bus) then we risk being
inconsistent in our S2RAM.  If we re-run the command on resume having
allowed it to prattle on while doing S2anything then we'll get the wrong
answer.

Now there are lots of devices we don't care about as they don't have
state in the form that causes problems - network cards, TV capture etc,
but there are cases where it matters that every operation is either
finished or not started and there is no doubt about them getting done
during the S2RAM/S2DISK

S2DISK/S2RAM both need to synchronize the state of a device so it can
build a valid snapshot. That bit is a shared requirement just like you
said didn't exist. Doesn't even need to involve turning DMA off, just
getting a consistent state.

The rest can be quite different.

Mind you some laptops think S2RAM is just a transition state on the way
to disk, leave them in ACPI S2RAM and the firmware will magically turn it
into a save to disk and back to ram if the battery runs low or you leave
it idle too long.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:05                                         ` Linus Torvalds
@ 2007-04-26  0:14                                           ` Pavel Machek
  2007-04-25 23:51                                             ` David Lang
  2007-04-26  0:38                                             ` Linus Torvalds
  2007-04-26  0:34                                           ` Linus Torvalds
  1 sibling, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-26  0:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Hi!

> > > Why? Becuase there is no _room_ for inconsistency. There's nothing to be 
> > > "inconsistent with", since any changes to memory (by things like DMA or 
> > > other setup that happens while the suspend process is going on) is by 
> > > _definition_ consistent with the resume image (becasue there is no 
> > > separate image).
> > 
> > Do you propose to keep DMAs running while suspending-to-RAM?
> 
> What part of "suspend a chip" do you have trouble with?
> 
> DMA obviously does *not* happen with a suspended device. There's no need 
> to turn DMA even off - it just doesn't happen!

Ok, I guess I'll have nightmares of DMA controllers doing DMAs from
chips that are no longer there tonight.

> > > For example, the whole myth that "freeze" needs to shut off DMA is a total 
> > > and utter *myth*. It needs nothing of the sort at all. Rather than shut 
> > > off DMA and try to make the hardware be wevy wevy quiet while it's hunting 
> > > wabbits, it's a lot easier to just do nothing at all on "freeze",
> > 
> > No. Sorry, you are wrong here. 
> > 
> > Remember that during resume we run
> > 
> > freeze()
> > copy old data into memory
> > thaw()
> > 
> > Now, if the old kernel left DMAs running, it could be overwriting
> > the data we are copying in.
> 
> The *thaw* needs to happen with devices quiescent. 
> 
> But that sure doesn't have anythign to do with the "snapshot()" path. In 
> fact, you'll have rebooted the machine in between.

Only the fact that we are currently using same device call during
snapshot() and during restore(). We obviously could do _5_ device
calls

(suspend/resume/freeze/quiesce_disable_dma/thaw)

...but that looks like too many calls to me.

> So what does that have to do with "snapshotting"?

I'm not comfortable with memory I'm copying changing under my hands
because of some DMA. It just looks like asking for trouble. I _think_
we can get away with DMA running during snapshot, because driver may
not assume anything about the DMA result before it got completion
interrupt, but... 

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:52                                       ` Pavel Machek
@ 2007-04-26  0:05                                         ` Linus Torvalds
  2007-04-26  0:14                                           ` Pavel Machek
  2007-04-26  0:34                                           ` Linus Torvalds
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-26  0:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> > 
> > Why? Becuase there is no _room_ for inconsistency. There's nothing to be 
> > "inconsistent with", since any changes to memory (by things like DMA or 
> > other setup that happens while the suspend process is going on) is by 
> > _definition_ consistent with the resume image (becasue there is no 
> > separate image).
> 
> Do you propose to keep DMAs running while suspending-to-RAM?

What part of "suspend a chip" do you have trouble with?

DMA obviously does *not* happen with a suspended device. There's no need 
to turn DMA even off - it just doesn't happen!

> > For example, the whole myth that "freeze" needs to shut off DMA is a total 
> > and utter *myth*. It needs nothing of the sort at all. Rather than shut 
> > off DMA and try to make the hardware be wevy wevy quiet while it's hunting 
> > wabbits, it's a lot easier to just do nothing at all on "freeze",
> 
> No. Sorry, you are wrong here. 
> 
> Remember that during resume we run
> 
> freeze()
> copy old data into memory
> thaw()
> 
> Now, if the old kernel left DMAs running, it could be overwriting
> the data we are copying in.

The *thaw* needs to happen with devices quiescent. 

But that sure doesn't have anythign to do with the "snapshot()" path. In 
fact, you'll have rebooted the machine in between.

So what does that have to do with "snapshotting"?

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:28                                       ` Pavel Machek
@ 2007-04-25 23:57                                         ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 23:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> > For suspend to ram, in contrast, since you *know* that nobody will be 
> > touching the hardware, and since the timings are very different anyway 
> > (you'd hope that you can resume in a second or two), you'd generally want 
> > to keep the DMA engine tables right where they are, and just literally 
> > suspend the PCI chip itself.
> 
> I'd actually prefer resume to be similar to module insert, too... Do
> you think that resume is _that_ time critical?

I think it probably depends on the device, and it should depend on the 
driver writer how he wants to do it.

My _point_ is that there is absolutely zero reason to think that the two 
events are the same. We *know* that for snapshot+shutdown, we need to 
actually keep the DMA tables intact *over* the snapshot (because writing 
out the snapshot may _need_ them). But exactly because we keep them 
intact, a driver writer may sanely say "I didn't even bother shutting them 
down, so on thaw, I cannot trust them, so I'll just re-initialize them 
entirely".

In contrast, over suspend-to-ram, it's entirely reasonable to just leave 
them in memory, and just keep them. There's no *reason* not to.

And that's my whole point in this argument: the two paths are 
fundamentally totally different. You *claim* that "snapshot()" needs to 
stop DMA etc, but that's simply not true.

So I claim:
 - for a lot of devices, it's actually a *lot* easier to just have 
   snapshot not do anythign at all, and re-initialze on thaw
 - for those same devices, for s2ram, since the tables are *safe* and 
   don't get touched by anything else, it's probably easier to just let 
   them be.

See? The "it's easier to do X" is a _different_ X for the two cases. 

So the whole "suspend is a superset of freeze" is simply not true.

> [I'd like you to drop me a line saying you understand current design
> and that it works -- even if it is very inelegant]

I _do_ understand the current design. I just think that it's totally 
and seriously broken. I've ranted against it before. I think it's stupid 
to play like you're "suspending" something just to save some state, 
especially since most users probably don't even *want* to suspend the 
state, and would quite happily re-initialize the chip instead.

And I think it's horrible to have a dynamic flag to tell the difference 
between two or more state changes that the devices should statically be 
able to determine. _If_ some driver really does have the same routine, 
just use the same routine. There are no downsides to splitting them up.

> Now, we can separate suspend/freeze and resume/thaw (with some common
> helpers). It will speed the code up by avoiding unneccessary
> operations. It also needs attetion from driver writers (ouch).
> 
> Do we want to do that?

I'd personally certainly want to do that. But I want to split up the 
callers too. Right now we mix those a lot as well. I suspect that would 
automatically be fixed by just forcing them to separate out (since they 
now call different functions of the devices), but I'm not 100% sure. There 
might be other issues.

Just as an example: one of the most painful things there is in the suspend 
sequence is that we shut off the console (because the console device will 
be suspended in hw, and it's thus not safe to use it over a suspend/resume 
sequence). That should just go away entirely for "snapshot()", because 
there is *never* any excuse for actually turning off the console during a 
snapshot: even a network console should be entirely functional.  Things 
like that - things that sound like small issues, but that really aren't.

(Right now you can enable the "don't disable the console" config option, 
but since network drivers will actually shut down etc, it just means that 
you'll have oopses etc if you do, and you have netconsole enabled)

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:20                                     ` Linus Torvalds
@ 2007-04-25 23:52                                       ` Pavel Machek
  2007-04-26  0:05                                         ` Linus Torvalds
  2007-04-26  0:24                                       ` Alan Cox
  1 sibling, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 23:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Hi!

> > Both of them have to ensure you can make a consistent snapshot.
> 
> Bzzt. Wrong again. Very much so.
> 
> STR does not need to "ensure that you have a consistent snapshot".
> 
> Why? Becuase there is no _room_ for inconsistency. There's nothing to be 
> "inconsistent with", since any changes to memory (by things like DMA or 
> other setup that happens while the suspend process is going on) is by 
> _definition_ consistent with the resume image (becasue there is no 
> separate image).

Do you propose to keep DMAs running while suspending-to-RAM? That
sounds really unsafe; we are shutting down our PCI controllers at that
time; doing that while DMAs are running sounds bad.

> That's TOTALLY DIFFERENT from "suspend to disk". In suspend to disk, you 
> need a completely different kind of mindset, namely you need a single 
> consistent image, where the image is consistent not only with memory, but 
> with all the devices.
> 
> For example, the whole myth that "freeze" needs to shut off DMA is a total 
> and utter *myth*. It needs nothing of the sort at all. Rather than shut 
> off DMA and try to make the hardware be wevy wevy quiet while it's hunting 
> wabbits, it's a lot easier to just do nothing at all on "freeze",

No. Sorry, you are wrong here. 

Remember that during resume we run

freeze()
copy old data into memory
thaw()

. Now, if the old kernel left DMAs running, it could be overwriting
the data we are copying in. It is not about DMA tables. While
resuming, CPU needs to be alone, without interference from DMA engines
(or other CPUs), because copying back old image means writing to
memory that was not properly alocated.

(Now, we could add one more hook, turn_off_dmas_for_copyback(), but
that looks like way too many hooks to me. And I'm not comfortable with
DMA engines running while I'm trying to copy image. They may be
overwriting data I'm trying to copy...) 

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26  0:14                                           ` Pavel Machek
@ 2007-04-25 23:51                                             ` David Lang
  2007-04-26  0:38                                             ` Linus Torvalds
  1 sibling, 0 replies; 158+ messages in thread
From: David Lang @ 2007-04-25 23:51 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Alan Cox, Kenneth Crudup, Nick Piggin,
	Mike Galbraith, linux-kernel, Thomas Gleixner, Con Kolivas,
	suspend2-devel, Ingo Molnar, Andrew Morton, Arjan van de Ven

On Thu, 26 Apr 2007, Pavel Machek wrote:

>>> Now, if the old kernel left DMAs running, it could be overwriting
>>> the data we are copying in.
>>
>> The *thaw* needs to happen with devices quiescent.
>>
>> But that sure doesn't have anythign to do with the "snapshot()" path. In
>> fact, you'll have rebooted the machine in between.
>
> Only the fact that we are currently using same device call during
> snapshot() and during restore(). We obviously could do _5_ device
> calls
>
> (suspend/resume/freeze/quiesce_disable_dma/thaw)
>
> ...but that looks like too many calls to me.
>
>> So what does that have to do with "snapshotting"?
>
> I'm not comfortable with memory I'm copying changing under my hands
> because of some DMA. It just looks like asking for trouble. I _think_
> we can get away with DMA running during snapshot, because driver may
> not assume anything about the DMA result before it got completion
> interrupt, but...

the key is that with STR you don't need to copy the memory (it's staying where 
it is)

for STD you need to copy the memory, and there you halt DMA becouse you need to 
make an atomic snapshot.

David Lang

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:29                                         ` Linus Torvalds
@ 2007-04-25 23:45                                           ` Pavel Machek
  2007-04-26  1:48                                             ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 23:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nigel Cunningham, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Hi!

> > Current design is:
> 
> Broken. Yes. I've tried to tell you.

Ok.

...

> It's worse than just confusing, it's *idiotic*.
> 
> It _can_ work in practice, but
>  - we have pretty damn solid evidence that it doesn't work all that often 
>    in practice
>  - the fact that something *can* be done the stupid way is in no way an 
>    argument that it *should* be done the stupid way.
> 
> I claim that the current STD is *stupid*. Yes, it can work. But that 
> doesn't make it less stupid.

Good. So you understand how it works.

> What's your argument? Your argument seems to be that it's not stupid, 
> because it can work. Can't you see that that simply isn't an
> argument at 

I tried keeping module_init/thaw/resume similar code, so that driver
authors can debug suspend-to-disk, cross their fingers, and have
suspend-to-ram work, too.

Now, perhaps enough people do std/str these days so this is not
important any longer... lets hope so.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 18:50                             ` Linus Torvalds
  2007-04-25 19:02                               ` Hua Zhong
  2007-04-25 19:25                               ` Adrian Bunk
@ 2007-04-25 23:33                               ` Olivier Galibert
  2007-04-26  1:56                                 ` Nigel Cunningham
  2 siblings, 1 reply; 158+ messages in thread
From: Olivier Galibert @ 2007-04-25 23:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Adrian Bunk, Pavel Machek, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> .. but if the alternative is a feature that just isn't worth it, and 
> likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> I believe STD is both of those. There's a reason it's called "STD". Go 
> to google and type "STD" and press "I'm feeling lucky". Google is God).

If it was correctly designed, it would be possible to change the
hardware or even the kernel through a STD cycle.  And that would be
damn interesting on servers.

In any case, if I could trust it, I'd use it when I need to move
servers around and I don't want to lose what is running.  Riding power
cuts that way would be nice.

  OG.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:13                                       ` Pavel Machek
@ 2007-04-25 23:29                                         ` Linus Torvalds
  2007-04-25 23:45                                           ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 23:29 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> Current design is:

Broken. Yes. I've tried to tell you.

> Twice. Once during snapshot (then we spin it up when the snapshot is
> done), and once during shutdown.

And nobody can possibly say that is "sane". But it's a direct result of 
the incorrect thinking that "suspend()" and "snapshot()" have anything 
what-so-ever to do with each other.

> Yep, we optimize away spindown, because it takes too long, so SCSI
> disks are actually very bad example.

No. SCSI disks are a *good* example. It's an example of how you 
(incorrectly) call the same function for two totally different things, and 
then that function is smart enough that it *understands* that they are 
totally different.

But the *confusion* remains. It remains in your head, and you've poisoned 
people like Alan too, that usually are not confused. And THAT is the main 
problem (although there are also indirect problems like "fixing one may 
break the other", but I actually think that the fundamental problem is the 
confusion it creates, which in turn causes bugs to happen because people 
are confused and think that they should do the same thing for suspend and 
for snapshot).

> No, I'd like you to understand that we actually CAN tell the disks to
> spin down, because we'll call resume and spin them back again before
> writing the image. We used to do it. We still can do it, but it is
> slow.
> 
> Yes, it is quite confusing.

It's worse than just confusing, it's *idiotic*.

It _can_ work in practice, but
 - we have pretty damn solid evidence that it doesn't work all that often 
   in practice
 - the fact that something *can* be done the stupid way is in no way an 
   argument that it *should* be done the stupid way.

I claim that the current STD is *stupid*. Yes, it can work. But that 
doesn't make it less stupid.

What's your argument? Your argument seems to be that it's not stupid, 
because it can work. Can't you see that that simply isn't an argument at 
all? "stupid and wrong" doesn't mean "cannot work in theory". But it 
*does* mean that people get confused, and it *does* mean that there are 
likely more bugs, because confused people do not tend to write very good 
code.

I'm not claiming that the current code cannot work. It clearly *does* 
work for a lot of people. But I'm claiming that it's STUPID.

So don't argue that "it works". Windows works, kind of. That doesn't make 
it less stupid and badly designed!

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 23:10                                     ` Linus Torvalds
@ 2007-04-25 23:28                                       ` Pavel Machek
  2007-04-25 23:57                                         ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 23:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

Hi!

> > > I don't understand how you can even *claim* something like that.
> > 
> > BTW most problems are in thaw/resume functions.
> 
> And do you realize that the thaw/resume functions are totally different 
> too?
> 
> Or rather, they *would* be, if you allowed them to.
> 
> For example, for "snapshot + thaw", the _sane_ thing is to actually make 
> the snapshot just throw away all the DMA tables etc, and let the thawing 
> just do a full initialization (as it did on boot). It basically needs to 
> do that anyway, and it simplifies the whole thing (ie you don't even 
> *want* to save things like the DMA command queues etc - the ones that will 
> quite often be stepped on by the final "write snapshot to disk" stuff 
> anyway).

I'd prefer thaw to be similar to module insert, yes.

> For suspend to ram, in contrast, since you *know* that nobody will be 
> touching the hardware, and since the timings are very different anyway 
> (you'd hope that you can resume in a second or two), you'd generally want 
> to keep the DMA engine tables right where they are, and just literally 
> suspend the PCI chip itself.

I'd actually prefer resume to be similar to module insert, too... Do
you think that resume is _that_ time critical?

> You think they have things in common just because your whole (incorrect) 
> mindset has _forced_ them to have things in common, becasue your setup 
> stupidly thinks that "resume" is the same as "thaw", the same way you 
> think "freeze" is the same as "suspend".
> 
> NEITHER is true. You've _made_ them true in your mind, but there's 
> absolutely zero reason that they *should* be true.

[I'd like you to drop me a line saying you understand current design
and that it works -- even if it is very inelegant]

Now, we can separate suspend/freeze and resume/thaw (with some common
helpers). It will speed the code up by avoiding unneccessary
operations. It also needs attetion from driver writers (ouch).

Do we want to do that?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:57                                   ` Alan Cox
@ 2007-04-25 23:20                                     ` Linus Torvalds
  2007-04-25 23:52                                       ` Pavel Machek
  2007-04-26  0:24                                       ` Alan Cox
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 23:20 UTC (permalink / raw)
  To: Alan Cox
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Alan Cox wrote:
> 
> Both of them have to ensure you can make a consistent snapshot.

Bzzt. Wrong again. Very much so.

STR does not need to "ensure that you have a consistent snapshot".

Why? Becuase there is no _room_ for inconsistency. There's nothing to be 
"inconsistent with", since any changes to memory (by things like DMA or 
other setup that happens while the suspend process is going on) is by 
_definition_ consistent with the resume image (becasue there is no 
separate image).

> Doing that means you've got to be able to define a single "point" at 
> which the snapshot is made and is internally self-consistent. That in 
> both cases tends to mean you've got to ensure nothing occurs which pees 
> on the image while you are making that snapshot (such as outstanding 
> O_DIRECT I/O to user pages).

Get off the drugs, Alan. There *is* no snapshot with suspend-to-ram.

Which is the whole point I'm trying to make! A _lot_ of people are 
confused about this.

With suspend-to-ram, you don't need to do a damn thing to the chip, except 
suspend it and resume it. There are _zero_ consistency issues. There is no 
need to freeze anything at any point. You can suspend each device totally 
independently of all other devices (taking into account things like bus 
topology, of course), and there is no "atomic" snapshot that needs to ever 
exist.

That's TOTALLY DIFFERENT from "suspend to disk". In suspend to disk, you 
need a completely different kind of mindset, namely you need a single 
consistent image, where the image is consistent not only with memory, but 
with all the devices.

For example, the whole myth that "freeze" needs to shut off DMA is a total 
and utter *myth*. It needs nothing of the sort at all. Rather than shut 
off DMA and try to make the hardware be wevy wevy quiet while it's hunting 
wabbits, it's a lot easier to just do nothing at all on "freeze", and just 
make sure that "thaw" will re-initialze the DMA tables entirely! All 
drivers have code to do that anyway, since that's what you need to do at 
boot.

Notice?  Totally different. Absolutely NOTHING in common. Not on a 
practical plane, and not even conceptually.  The current (broken!) 
implementation has forced a totally idiotic model on things, where instead 
of snapshotting doing the sane and simple thing, it ends up doing extra 
work that is totally unnecessary, but *becomes* necessary just because it 
*also* shares the "resume" path (which should _not_ be the same either!)

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:55                                     ` Linus Torvalds
@ 2007-04-25 23:13                                       ` Pavel Machek
  2007-04-25 23:29                                         ` Linus Torvalds
  2007-04-26  1:40                                       ` Nigel Cunningham
  2007-04-26 10:39                                       ` Johannes Berg
  2 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 23:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nigel Cunningham, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Hi!

> > > And name *one* thing that have in common.
> > 
> > Set/reset the scsi transaction id thingy? Hibernation didn't work with
> > SCSI for a long time precisely because that support was missing.
> 
> And by "hibernation", you mean what? You mean "snapshot + shutdown", 
> right?
> 
> Think about it for five seconds, and then ask yourself: at which point in 
> the "snapshot + shutdown" sequence would you actually tell a disk to shut 
> down?

Current design is:

Twice. Once during snapshot (then we spin it up when the snapshot is
done), and once during shutdown.

Yep, we optimize away spindown, because it takes too long, so SCSI
disks are actually very bad example.

> If you said "snapshot", then you'd be *wrong*. 
> 
> That's my _point_. The snapshot() function should not (and MUST NOT) tell 
> disks to shut down, because unlike suspend(), we're still going to _use_ 
> those disks afterwards (why? To write out the snapshot image!).

No, I'd like you to understand that we actually CAN tell the disks to
spin down, because we'll call resume and spin them back again before
writing the image. We used to do it. We still can do it, but it is
slow.

Yes, it is quite confusing.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:49                                   ` Pavel Machek
@ 2007-04-25 23:10                                     ` Linus Torvalds
  2007-04-25 23:28                                       ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 23:10 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> >
> > I don't understand how you can even *claim* something like that.
> 
> BTW most problems are in thaw/resume functions.

And do you realize that the thaw/resume functions are totally different 
too?

Or rather, they *would* be, if you allowed them to.

For example, for "snapshot + thaw", the _sane_ thing is to actually make 
the snapshot just throw away all the DMA tables etc, and let the thawing 
just do a full initialization (as it did on boot). It basically needs to 
do that anyway, and it simplifies the whole thing (ie you don't even 
*want* to save things like the DMA command queues etc - the ones that will 
quite often be stepped on by the final "write snapshot to disk" stuff 
anyway).

For suspend to ram, in contrast, since you *know* that nobody will be 
touching the hardware, and since the timings are very different anyway 
(you'd hope that you can resume in a second or two), you'd generally want 
to keep the DMA engine tables right where they are, and just literally 
suspend the PCI chip itself.

See? Again, *nothing* in common.

You think they have things in common just because your whole (incorrect) 
mindset has _forced_ them to have things in common, becasue your setup 
stupidly thinks that "resume" is the same as "thaw", the same way you 
think "freeze" is the same as "suspend".

NEITHER is true. You've _made_ them true in your mind, but there's 
absolutely zero reason that they *should* be true.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:43                                   ` Chuck Ebbert
@ 2007-04-25 23:00                                     ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 23:00 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Chuck Ebbert wrote:
> 
> Freeze is a subset of suspend, isn't it? (It might be an empty subset
> in some cases.)

NO IT IS NOT!

Yes, you are parroting Pavel, but he can say it a million times, and it's 
*still* not true.

That's like saying "read() is a subset of write(), isn't it?" On many 
devices, they share some of the setup, like writing the same sector 
registers with the same values.

Does that make them subsets of each other?

Or does it mean that they *may* use some of the same common helper 
functions for some devices?

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:42                                   ` Pavel Machek
@ 2007-04-25 22:58                                     ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 22:58 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Pavel Machek wrote:
> 
> Suspend syncs caches/spins down. Freeze does not do anything.
> 
> That's okay, I keep claiming "freeze" is subset of "suspend". Can you
> name device where that is not true?

Sure. Like just about any PCI device that doesn't do things on its own.

A "freeze" does nothing at all, or perhaps shuts down the reader side 
(for something like a network controller).

A "suspend" does "write D3 to the suspend register". Absolutely zero in 
common.

> Remember we do
> 
> suspend(PMSG_FREEZE)
> atomic snapshot
> resume()
> write snapshot.

AND THAT IS STUPID. It mixes up "suspend()" and creating a snapshot in 
ways that are totally idiotic. There is nothing in common!

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:18                                 ` Linus Torvalds
                                                     ` (3 preceding siblings ...)
  2007-04-25 22:49                                   ` Pavel Machek
@ 2007-04-25 22:57                                   ` Alan Cox
  2007-04-25 23:20                                     ` Linus Torvalds
  4 siblings, 1 reply; 158+ messages in thread
From: Alan Cox @ 2007-04-25 22:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

> Tell me, what does "suspend" do, and what does "freeze" (snapshot) do?
> 
> And name *one* thing that have in common.

Both of them have to ensure you can make a consistent snapshot. Doing
that means you've got to be able to define a single "point" at which the
snapshot is made and is internally self-consistent. That in both cases
tends to mean you've got to ensure nothing occurs which pees on the image
while you are making that snapshot (such as outstanding O_DIRECT I/O to
user pages).

Alan

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:27                                   ` Nigel Cunningham
@ 2007-04-25 22:55                                     ` Linus Torvalds
  2007-04-25 23:13                                       ` Pavel Machek
                                                         ` (2 more replies)
  0 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 22:55 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven



On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> > 
> > And name *one* thing that have in common.
> 
> Set/reset the scsi transaction id thingy? Hibernation didn't work with
> SCSI for a long time precisely because that support was missing.

And by "hibernation", you mean what? You mean "snapshot + shutdown", 
right?

Think about it for five seconds, and then ask yourself: at which point in 
the "snapshot + shutdown" sequence would you actually tell a disk to shut 
down?

If you said "snapshot", then you'd be *wrong*. 

That's my _point_. The snapshot() function should not (and MUST NOT) tell 
disks to shut down, because unlike suspend(), we're still going to _use_ 
those disks afterwards (why? To write out the snapshot image!).

In other words, the act of creating a snapshot has *nothing* to do with 
suspend.

Now, after you've created (and written out) the snapshot, what do you 
actually end up doing?

That's right - you end up _shutting down_ the machine, and yes, as part 
of the _shutdown_ sequence you may actually end up doing a lot of the 
things that a suspend would do. But that's long *after* you've actually 
done the "snapshot" part, and has absolutely nothing to do with it.

That's where I started: whole "suspend to disk" thing actually has _more_ 
to do with "shutdown" than with "suspend". 

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:18                                 ` Linus Torvalds
                                                     ` (2 preceding siblings ...)
  2007-04-25 22:43                                   ` Chuck Ebbert
@ 2007-04-25 22:49                                   ` Pavel Machek
  2007-04-25 23:10                                     ` Linus Torvalds
  2007-04-25 22:57                                   ` Alan Cox
  4 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 22:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

Hi!

> > Not the same... but they are still related. "freeze" (for atomic
> > snapshot) is actually subset of "suspend"... freeze needs DMAs off and
> > saved state, and you need DMAs off and saved state for "suspend".
> 
> THEY HAVE ABSOLUTELY NOTHING IN COMMON!
> 
> Nobody in their right mind thinks that "disable DMA" and "suspend" are 
> similar operations. 
> 
> > So it is actually correct to do "suspend" when you want "freeze"; it
> > is just slow. That's why they only differ in parameter these days.
> 
> It is *not* correct to "suspend" when you want "freeze".
> 
> I don't understand how you can even *claim* something like that.

BTW most problems are in thaw/resume functions. Both suspend and
freeze are pretty simple, and they both need to save device state. In
SCSI disk, it would be nice to save options set by sdparm. And both
thaw and resume need to be able to restore the device from both
"powered down" and "some state preserved".
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:18                                 ` Linus Torvalds
  2007-04-25 22:27                                   ` Nigel Cunningham
  2007-04-25 22:42                                   ` Pavel Machek
@ 2007-04-25 22:43                                   ` Chuck Ebbert
  2007-04-25 23:00                                     ` Linus Torvalds
  2007-04-25 22:49                                   ` Pavel Machek
  2007-04-25 22:57                                   ` Alan Cox
  4 siblings, 1 reply; 158+ messages in thread
From: Chuck Ebbert @ 2007-04-25 22:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

Linus Torvalds wrote:
> Tell me, what does "suspend" do, and what does "freeze" (snapshot) do?
> 
> And name *one* thing that have in common.
> 
> I'll tell you: Nada. Zero. Zilch. Nothing.
> 
> "Freeze" for a disk is a total no-op. There is no DMA, there is no 
> nothing. In contrast, "suspend" for a disk is a totally valid operation.
> 

Freeze is a subset of suspend, isn't it? (It might be an empty subset
in some cases.)


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:18                                 ` Linus Torvalds
  2007-04-25 22:27                                   ` Nigel Cunningham
@ 2007-04-25 22:42                                   ` Pavel Machek
  2007-04-25 22:58                                     ` Linus Torvalds
  2007-04-25 22:43                                   ` Chuck Ebbert
                                                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 22:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

Hi!

> > Not the same... but they are still related. "freeze" (for atomic
> > snapshot) is actually subset of "suspend"... freeze needs DMAs off and
> > saved state, and you need DMAs off and saved state for "suspend".
> 
> THEY HAVE ABSOLUTELY NOTHING IN COMMON!
> 
> Nobody in their right mind thinks that "disable DMA" and "suspend" are 
> similar operations. 
> 
> > So it is actually correct to do "suspend" when you want "freeze"; it
> > is just slow. That's why they only differ in parameter these days.
> 
> It is *not* correct to "suspend" when you want "freeze".

Example?

> I don't understand how you can even *claim* something like that.
> 
> Here's a trivial example:
>  - SCSI disk
> 
> Tell me, what does "suspend" do, and what does "freeze" (snapshot) do?

Suspend syncs caches/spins down. Freeze does not do anything.

That's okay, I keep claiming "freeze" is subset of "suspend". Can you
name device where that is not true?

Remember we do

suspend(PMSG_FREEZE)
atomic snapshot
resume()
write snapshot.

So if we do spin the scsi disk down, nothing really bad happens, we'll
just spin it up. (So scsi disk is not example I want. Spining down
scsi disk on freeze is slow and stupid, but it is not incorrect).

Yes, If I'd knew what I know now, drivers would have
suspend/freeze/thaw/resume methods. We probably still can do that
change. Unfortunately, it needs driver authors to understand 4 hooks
(not 2) and do the right thing.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:08                                   ` Pavel Machek
  2007-04-25 20:33                                     ` Rafael J. Wysocki
@ 2007-04-25 22:36                                     ` Manu Abraham
  1 sibling, 0 replies; 158+ messages in thread
From: Manu Abraham @ 2007-04-25 22:36 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

Pavel Machek wrote:

> STD needs to snapshot system, and then it needs devices to be
> suspended so that snapshot is consistent.


One question though, there are devices that can be suspended (broken
suspend) and restore in such a case wouldn't work at all. The only
possible way would be then to reinitialize the device instead of restore ?

Manu


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 22:18                                 ` Linus Torvalds
@ 2007-04-25 22:27                                   ` Nigel Cunningham
  2007-04-25 22:55                                     ` Linus Torvalds
  2007-04-25 22:42                                   ` Pavel Machek
                                                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-25 22:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]

Hi.

On Wed, 2007-04-25 at 15:18 -0700, Linus Torvalds wrote:
> 
> On Wed, 25 Apr 2007, Pavel Machek wrote:
> > 
> > Not the same... but they are still related. "freeze" (for atomic
> > snapshot) is actually subset of "suspend"... freeze needs DMAs off and
> > saved state, and you need DMAs off and saved state for "suspend".
> 
> THEY HAVE ABSOLUTELY NOTHING IN COMMON!
> 
> Nobody in their right mind thinks that "disable DMA" and "suspend" are 
> similar operations. 
> 
> > So it is actually correct to do "suspend" when you want "freeze"; it
> > is just slow. That's why they only differ in parameter these days.
> 
> It is *not* correct to "suspend" when you want "freeze".
> 
> I don't understand how you can even *claim* something like that.
> 
> Here's a trivial example:
>  - SCSI disk
> 
> Tell me, what does "suspend" do, and what does "freeze" (snapshot) do?
> 
> And name *one* thing that have in common.

Set/reset the scsi transaction id thingy? Hibernation didn't work with
SCSI for a long time precisely because that support was missing.

Don't get me wrong, I agree on the whole - Suspend2 worked fine on the
whole under 2.4 without a driver model. But they do have a bit in
common.

Regards,

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:30                                       ` Pavel Machek
  2007-04-25 21:40                                         ` Rafael J. Wysocki
@ 2007-04-25 22:22                                         ` Nigel Cunningham
  1 sibling, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-25 22:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Rafael J. Wysocki, Adrian Bunk, Ingo Molnar,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1671 bytes --]

Hello.

On Wed, 2007-04-25 at 23:30 +0200, Pavel Machek wrote:
> Hi!
> 
> > > Please ask anyone who's worked with me if he's had any problem with that.
> > > If anyone say I'm unable to work with anybody else, I'd say you're right.  Till
> > > then, I feel offended.
> > 
> > I'll apologise (and virtually kiss your hairy feet) if you could actually 
> > show me a single implementation that people can agree on.
> > 
> > But until then, I claim that the suspend-to-disk people cannot work with 
> > each other.
> 
> It is not Rafael's fault. Actually it is quite hard to work with
> Nigel, because he implements every feature someone asks for, and wants
> to merge them all :-(. I don't expect to ever agree with Nigel on
> anything important, sorry.

I'm sorry that you feel that way, Pavel.

I can agree that I implement features that people ask for, but I think
saying "every feature someone asks for" is going a bit far (I won't ask
you to prove that). My desire is to provide Linux with hibernation
support that does more than just the bare minimum. Different people have
different usage scenarios, and this has led to me implementing more and
different features.

As to wanting to merge them all, this is true. No one wants to put time
into something only to have it left out. But I don't see why you think
this is a bad thing. Many kernel guys claim the thing follows an
evolutionary model. Well, here's software that has been developed out of
tree - evolved if you like - and which many people would consider more
mature ('evolved'?) than [u]swsusp. If evolutionary theory is to be
followed, let the fittest survive!

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:23                                   ` Adrian Bunk
@ 2007-04-25 22:19                                     ` Kenneth Crudup
  0 siblings, 0 replies; 158+ messages in thread
From: Kenneth Crudup @ 2007-04-25 22:19 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Nick Piggin, suspend2-devel, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, Pavel Machek,
	Ingo Molnar, Andrew Morton, Arjan van de Ven


On Wed, 25 Apr 2007, Adrian Bunk wrote:

> For me it was a serious regression if STD was removed without any
> replacement.

Amen. I have even made material donations to the SS2 effort to give the
developer what he'd needed to fix an issue with a certain configuration
and will do so again if need be, as that expense would be minor compared
to the productivity (== billing) loss that arises from having to start
over from scratch from each of the 3+ times per day I have to shutdown
and physically relocate my machine from place-to-place or client-to-client.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Los Angeles
O: 3630 S. Sepulveda Blvd. #138, L.A., CA 90034-6809      (888) 454-8181

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:44                               ` Pavel Machek
@ 2007-04-25 22:18                                 ` Linus Torvalds
  2007-04-25 22:27                                   ` Nigel Cunningham
                                                     ` (4 more replies)
  0 siblings, 5 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 22:18 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Pavel Machek wrote:
> 
> Not the same... but they are still related. "freeze" (for atomic
> snapshot) is actually subset of "suspend"... freeze needs DMAs off and
> saved state, and you need DMAs off and saved state for "suspend".

THEY HAVE ABSOLUTELY NOTHING IN COMMON!

Nobody in their right mind thinks that "disable DMA" and "suspend" are 
similar operations. 

> So it is actually correct to do "suspend" when you want "freeze"; it
> is just slow. That's why they only differ in parameter these days.

It is *not* correct to "suspend" when you want "freeze".

I don't understand how you can even *claim* something like that.

Here's a trivial example:
 - SCSI disk

Tell me, what does "suspend" do, and what does "freeze" (snapshot) do?

And name *one* thing that have in common.

I'll tell you: Nada. Zero. Zilch. Nothing.

"Freeze" for a disk is a total no-op. There is no DMA, there is no 
nothing. In contrast, "suspend" for a disk is a totally valid operation.

Anybody who claims that these two operations are "related" is a moron.

I'm sorry Pavel, but that's exactly how it is.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:25                               ` Adrian Bunk
                                                   ` (2 preceding siblings ...)
  2007-04-25 19:55                                 ` Pavel Machek
@ 2007-04-25 22:13                                 ` Kenneth Crudup
  2007-04-26  1:25                                 ` Antonino A. Daplas
  4 siblings, 0 replies; 158+ messages in thread
From: Kenneth Crudup @ 2007-04-25 22:13 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Nick Piggin, suspend2-devel, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, Pavel Machek,
	Ingo Molnar, Andrew Morton, Arjan van de Ven


On Wed, 25 Apr 2007, Adrian Bunk wrote:

> Some people might boot Windows between suspending and resuming.

Oh yeah- that, too. Since iTunes doesn't work well with VMWare, I do this
all the time.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Los Angeles
O: 3630 S. Sepulveda Blvd. #138, L.A., CA 90034-6809      (888) 454-8181

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 17:34                         ` Pavel Machek
  2007-04-25 18:39                           ` Adrian Bunk
  2007-04-25 18:52                           ` Alon Bar-Lev
@ 2007-04-25 22:11                           ` Kenneth Crudup
  2 siblings, 0 replies; 158+ messages in thread
From: Kenneth Crudup @ 2007-04-25 22:11 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Adrian Bunk, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Linus Torvalds, Andrew Morton, Arjan van de Ven


On Wed, 25 Apr 2007, Pavel Machek wrote:

> But ... you are really using suspend-to-disk as a workaround for "my
> desktop takes too much power when idle".

While rare is the day admittedly, that my machine isn't on, there are
days I take a break from loooong days and won't work for 2-5 days at
a time.

My main revenue machine is a laptop with a fast, but last-generation
mobile processor and 2GB of DDR2 SDRAM.

I think it's ridiculous to expect that I could resume off battery (and
this thing is a behemoth, with a 17" screen and backlight and a lot of
little juice-eating peripherals that'll go thru a 4.4A/Hr battery in a
little over 90 mins, even with conservative power settings) after that
kind of delay. I don't even like "suspend to RAM, then suspend to disk
on battery low" 'cause that means when I turn it on again I have a low
battery for an hour and a half.

The only acceptable power usage when (completely) idle, IMO, is *zero*.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Los Angeles
O: 3630 S. Sepulveda Blvd. #138, L.A., CA 90034-6809      (888) 454-8181

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:40                                         ` Rafael J. Wysocki
@ 2007-04-25 21:46                                           ` Pavel Machek
  0 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 21:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

Hi!

> > > And no, "three different implementations" doesn't cut it. Even _two_ is 
> > > too much. We need to get *rid* of something, not add more.
> > 
> > swsusp can be dropped. It is nice -- self contained, extremely easy to
> > setup, Andrew likes it. uswsusp has all the features, and pretty
> > elegant design. With klibc (or some way to ship userland code with
> > kernel, and put it into initramfs or something) we can reasonably drop
> > swsusp.
> 
> Well, I think we still need it and will need it in the future, at least for
> debugging.  Moreover, I think there are many users of it.
> 
> Let's not drop things that are helping us. :-)

Yes, it is very nice for debugging. But if I _had_ to choose, I'd
rather remove swsusp than uswsusp.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:44                             ` Linus Torvalds
  2007-04-25 21:07                               ` Rafael J. Wysocki
@ 2007-04-25 21:44                               ` Pavel Machek
  2007-04-25 22:18                                 ` Linus Torvalds
  1 sibling, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 21:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

Hi!

> > Can I get you on IRC somewhere? No, I do not think I'm a moron, and
> > yes, I need to suspend^Wsnapshot the devices before, so I have that in
> > the snapshot. Of course, I'll need to resume^Wrestore the devices
> > before writing snapshot. That's okay, it does not take long.
> 
> You do NOT need to "suspend" the devices, and that's the whole point.
> 
> You may want to save the device info somewhere, BUT THAT IS SOMETHING 
> TOTALLY DIFFERENT!
> 
> This is *exactly* the confusion I'm talking about. The STD and STR 
> codepaths try to use the same function for two TOTALLY DIFFERENT things.
> 
> STR actually wants to "suspend".
> 
> STD actually wants to "atomic snapshot", and it must not allow allocations 
> or anything like that, because the whole snapshot image should be done 
> atomically as one event. But it should *not* suspend, because that device 
> may actually be needed afterwards. 
> 
> So not the same thing at all.

Not the same... but they are still related. "freeze" (for atomic
snapshot) is actually subset of "suspend"... freeze needs DMAs off and
saved state, and you need DMAs off and saved state for "suspend".

So it is actually correct to do "suspend" when you want "freeze"; it
is just slow. That's why they only differ in parameter these days.

> So here's what "suspend()" wants:
>  - suspend() - preparatory work, can error our, can delay, can park the 
>    disk, etc etc.
>  - suspend_late() - called late, with interrupts disabled, should actually
>    suspend if the early suspend didn't do it already
> 
> And here is what "snapshot()" wants:
>  - prepare_to_snapshot() (for memory allocation)

Lets call this "freeze"?

>  - snapshot() - called late, with interrupts disabled, save state.

> and there is absolutely _zero_ overlap between them. There just isn't 
> anything in common. Yes, both are two-phase (for the simple reason that 
> both want an "atomic" part), but there's really no real overlap.

As I tried to explain, you can use suspend() to stop DMAs and save
state, and that's enough to get sane snapshot. (You do cli() before
doing snapshot, that helps with irqs).

> Just trying to *make* them be the same operations is just going to 
> introduce flags that then cause them to be totally different *and* 
> confusing and generate bugs. It also means that people do one of them, and 
> "it works" for that case, and the other case is totally broken, but it's 
> not obvious, because doing one means that the system _thinks_ that you did 
> both!
> 
> In the very unlikely case that some driver actually *wants* to use the 
> same function for snapshots and suspending, that driver could just go 
> ahead and _use_ the same function pointer. But now, as things are set up, 
> we force a total confusion on drivers by calling them through the same 
> interface for two totally different things.

Ok ok, we can do

suspend(PMSG_SUSPEND) -> suspend()
suspend(PMSG_FREEZE) -> freeze()

. We'll need to do big search&replace over the kernel etc. But if you
think it helps with confusion...

I'd still like to keep people using same method for both. It means
suspend path gets more testing, even when some stuff it does is not
strictly neccessary for freeze.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 21:29                   ` Pavel Machek
  2007-04-24 22:24                     ` Ray Lee
@ 2007-04-25 21:41                     ` Matt Mackall
  2007-04-26 11:27                       ` Pavel Machek
  2007-04-26 19:04                       ` Bill Davidsen
  1 sibling, 2 replies; 158+ messages in thread
From: Matt Mackall @ 2007-04-25 21:41 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Tue, Apr 24, 2007 at 11:29:56PM +0200, Pavel Machek wrote:
> We do not want to fragment the testing base, and suspend2 does not
> really have any interesting features over uswsusp.

The testing base is already fragmented!

What the current situation means is that you simply never hear from
the people who get fed up with suspend but who manage to get suspend2
working.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 21:30                                       ` Pavel Machek
@ 2007-04-25 21:40                                         ` Rafael J. Wysocki
  2007-04-25 21:46                                           ` Pavel Machek
  2007-04-25 22:22                                         ` Nigel Cunningham
  1 sibling, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-25 21:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On Wednesday, 25 April 2007 23:30, Pavel Machek wrote:
> Hi!
> 
> > > Please ask anyone who's worked with me if he's had any problem with that.
> > > If anyone say I'm unable to work with anybody else, I'd say you're right.  Till
> > > then, I feel offended.
> > 
> > I'll apologise (and virtually kiss your hairy feet) if you could actually 
> > show me a single implementation that people can agree on.
> > 
> > But until then, I claim that the suspend-to-disk people cannot work with 
> > each other.
> 
> It is not Rafael's fault. Actually it is quite hard to work with
> Nigel, because he implements every feature someone asks for, and wants
> to merge them all :-(. I don't expect to ever agree with Nigel on
> anything important, sorry.
> 
> > And no, "three different implementations" doesn't cut it. Even _two_ is 
> > too much. We need to get *rid* of something, not add more.
> 
> swsusp can be dropped. It is nice -- self contained, extremely easy to
> setup, Andrew likes it. uswsusp has all the features, and pretty
> elegant design. With klibc (or some way to ship userland code with
> kernel, and put it into initramfs or something) we can reasonably drop
> swsusp.

Well, I think we still need it and will need it in the future, at least for
debugging.  Moreover, I think there are many users of it.

Let's not drop things that are helping us. :-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:24                                     ` Linus Torvalds
@ 2007-04-25 21:30                                       ` Pavel Machek
  2007-04-25 21:40                                         ` Rafael J. Wysocki
  2007-04-25 22:22                                         ` Nigel Cunningham
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 21:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

Hi!

> > Please ask anyone who's worked with me if he's had any problem with that.
> > If anyone say I'm unable to work with anybody else, I'd say you're right.  Till
> > then, I feel offended.
> 
> I'll apologise (and virtually kiss your hairy feet) if you could actually 
> show me a single implementation that people can agree on.
> 
> But until then, I claim that the suspend-to-disk people cannot work with 
> each other.

It is not Rafael's fault. Actually it is quite hard to work with
Nigel, because he implements every feature someone asks for, and wants
to merge them all :-(. I don't expect to ever agree with Nigel on
anything important, sorry.

> And no, "three different implementations" doesn't cut it. Even _two_ is 
> too much. We need to get *rid* of something, not add more.

swsusp can be dropped. It is nice -- self contained, extremely easy to
setup, Andrew likes it. uswsusp has all the features, and pretty
elegant design. With klibc (or some way to ship userland code with
kernel, and put it into initramfs or something) we can reasonably drop
swsusp.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:44                             ` Linus Torvalds
@ 2007-04-25 21:07                               ` Rafael J. Wysocki
  2007-04-25 21:44                               ` Pavel Machek
  1 sibling, 0 replies; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-25 21:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Kenneth Crudup, Nick Piggin, Mike Galbraith,
	linux-kernel, Thomas Gleixner, Con Kolivas, suspend2-devel,
	Ingo Molnar, Andrew Morton, Arjan van de Ven

On Wednesday, 25 April 2007 22:44, Linus Torvalds wrote:
> 
> On Wed, 25 Apr 2007, Pavel Machek wrote:
> > 
> > Can I get you on IRC somewhere? No, I do not think I'm a moron, and
> > yes, I need to suspend^Wsnapshot the devices before, so I have that in
> > the snapshot. Of course, I'll need to resume^Wrestore the devices
> > before writing snapshot. That's okay, it does not take long.
> 
> You do NOT need to "suspend" the devices, and that's the whole point.
> 
> You may want to save the device info somewhere, BUT THAT IS SOMETHING 
> TOTALLY DIFFERENT!
> 
> This is *exactly* the confusion I'm talking about. The STD and STR 
> codepaths try to use the same function for two TOTALLY DIFFERENT things.
> 
> STR actually wants to "suspend".
> 
> STD actually wants to "atomic snapshot", and it must not allow allocations 
> or anything like that, because the whole snapshot image should be done 
> atomically as one event. But it should *not* suspend, because that device 
> may actually be needed afterwards. 
> 
> So not the same thing at all.
> 
> So here's what "suspend()" wants:
>  - suspend() - preparatory work, can error our, can delay, can park the 
>    disk, etc etc.
>  - suspend_late() - called late, with interrupts disabled, should actually
>    suspend if the early suspend didn't do it already
> 
> And here is what "snapshot()" wants:
>  - prepare_to_snapshot() (for memory allocation)
>  - snapshot() - called late, with interrupts disabled, save state.
> 
> and there is absolutely _zero_ overlap between them. There just isn't 
> anything in common. Yes, both are two-phase (for the simple reason that 
> both want an "atomic" part), but there's really no real overlap.
> 
> Just trying to *make* them be the same operations is just going to 
> introduce flags that then cause them to be totally different *and* 
> confusing and generate bugs. It also means that people do one of them, and 
> "it works" for that case, and the other case is totally broken, but it's 
> not obvious, because doing one means that the system _thinks_ that you did 
> both!
> 
> In the very unlikely case that some driver actually *wants* to use the 
> same function for snapshots and suspending, that driver could just go 
> ahead and _use_ the same function pointer. But now, as things are set up, 
> we force a total confusion on drivers by calling them through the same 
> interface for two totally different things.

I agree, except there are surprisingly many drivers like that.

You're right, we should be doing all of it in a different way, but this means
a lot of changes and we can't do them overnight.

As I wrote in the reply to Pavel, I think we can introduce .freeze(), .thaw()
(and .prethaw() for that matter) callbacks for hibernation and make drivers
use them, but that will be a long series of patches.  Still, I think it's
doable.

Greetings,
Rafael


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:27                           ` Pavel Machek
@ 2007-04-25 20:44                             ` Linus Torvalds
  2007-04-25 21:07                               ` Rafael J. Wysocki
  2007-04-25 21:44                               ` Pavel Machek
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 20:44 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Pavel Machek wrote:
> 
> Can I get you on IRC somewhere? No, I do not think I'm a moron, and
> yes, I need to suspend^Wsnapshot the devices before, so I have that in
> the snapshot. Of course, I'll need to resume^Wrestore the devices
> before writing snapshot. That's okay, it does not take long.

You do NOT need to "suspend" the devices, and that's the whole point.

You may want to save the device info somewhere, BUT THAT IS SOMETHING 
TOTALLY DIFFERENT!

This is *exactly* the confusion I'm talking about. The STD and STR 
codepaths try to use the same function for two TOTALLY DIFFERENT things.

STR actually wants to "suspend".

STD actually wants to "atomic snapshot", and it must not allow allocations 
or anything like that, because the whole snapshot image should be done 
atomically as one event. But it should *not* suspend, because that device 
may actually be needed afterwards. 

So not the same thing at all.

So here's what "suspend()" wants:
 - suspend() - preparatory work, can error our, can delay, can park the 
   disk, etc etc.
 - suspend_late() - called late, with interrupts disabled, should actually
   suspend if the early suspend didn't do it already

And here is what "snapshot()" wants:
 - prepare_to_snapshot() (for memory allocation)
 - snapshot() - called late, with interrupts disabled, save state.

and there is absolutely _zero_ overlap between them. There just isn't 
anything in common. Yes, both are two-phase (for the simple reason that 
both want an "atomic" part), but there's really no real overlap.

Just trying to *make* them be the same operations is just going to 
introduce flags that then cause them to be totally different *and* 
confusing and generate bugs. It also means that people do one of them, and 
"it works" for that case, and the other case is totally broken, but it's 
not obvious, because doing one means that the system _thinks_ that you did 
both!

In the very unlikely case that some driver actually *wants* to use the 
same function for snapshots and suspending, that driver could just go 
ahead and _use_ the same function pointer. But now, as things are set up, 
we force a total confusion on drivers by calling them through the same 
interface for two totally different things.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:08                                   ` Pavel Machek
@ 2007-04-25 20:33                                     ` Rafael J. Wysocki
  2007-04-25 20:31                                       ` Pavel Machek
  2007-04-25 22:36                                     ` Manu Abraham
  1 sibling, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-25 20:33 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On Wednesday, 25 April 2007 22:08, Pavel Machek wrote:
> Hi!
> 
> > > > .. but if the alternative is a feature that just isn't worth it, and 
> > > > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > > > I believe STD is both of those. There's a reason it's called "STD". Go 
> > > > to google and type "STD" and press "I'm feeling lucky". Google is God).
> > > 
> > > Is there really no use case for STD?

[--snip--]
> > So my objections to STD have nothing to do with saving state and shutting 
> > down. They have everything to do with the fact that it is not - and will 
> > never be - a "suspend", and it shouldn't affect suspend.
> 
> STD needs to snapshot system, and then it needs devices to be
> suspended so that snapshot is consistent.

Not suspended.  _Frozen_.  In fact don't want any DMA transfers or interrupts
to take place when we're creating the image.  That's all and that's what we're
doing (or rather, trying to do).

So, the "suspend" and "resume" for the functions being called for that are
wrong, but then we call them with PMSG_FREEZE. ;-)  Still, we could add
.freeze() and .thaw() callbacks for hibernation just fine.  This wouldn't even
be that difficult ...

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:33                                     ` Rafael J. Wysocki
@ 2007-04-25 20:31                                       ` Pavel Machek
  0 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 20:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Adrian Bunk, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

Hi!

> > > > > .. but if the alternative is a feature that just isn't worth it, and 
> > > > > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > > > > I believe STD is both of those. There's a reason it's called "STD". Go 
> > > > > to google and type "STD" and press "I'm feeling lucky". Google is God).
> > > > 
> > > > Is there really no use case for STD?
> 
> [--snip--]
> > > So my objections to STD have nothing to do with saving state and shutting 
> > > down. They have everything to do with the fact that it is not - and will 
> > > never be - a "suspend", and it shouldn't affect suspend.
> > 
> > STD needs to snapshot system, and then it needs devices to be
> > suspended so that snapshot is consistent.
> 
> Not suspended.  _Frozen_.  In fact don't want any DMA transfers or interrupts
> to take place when we're creating the image.  That's all and that's what we're
> doing (or rather, trying to do).

Yep, _frozen_. That's the right word.

> So, the "suspend" and "resume" for the functions being called for that are
> wrong, but then we call them with PMSG_FREEZE. ;-)  Still, we could add
> .freeze() and .thaw() callbacks for hibernation just fine.  This wouldn't even
> be that difficult ...

It would be ugly big patch I'm afraid.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:08                         ` Linus Torvalds
@ 2007-04-25 20:27                           ` Pavel Machek
  2007-04-25 20:44                             ` Linus Torvalds
  2007-04-26  0:41                           ` Thomas Orgis
  1 sibling, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 20:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kenneth Crudup, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven

Hi!

_Can we get a suspend-to-RAM maintainer_?

Noone cares about s2ram these days. I do care a little, seife
maintains whitelist, you care for mac mini, Len/Andrew/Intel acpi team
helps sometimes... But I feel we should have someone listed in the
MAINTAINERS file. Patrick was close, but... 

> > Any working suspend-to-disk method takes care of that for me.  (I'm
> > really not sure why Linus hates S2D so much, though. Back in the day
> > there was a lot more BIOS support, but that's been years now.)
> 
> The really sad part is that APM actually did this better.. 

I agree that APM STR worked better than current ACPI STR. I think
swsusp already works better than APM STD, but...

> I think you could do STD right too, but:
> 
>  - if you think it's about suspending devices, you are immediately 
>    disqualified. If you call the device driver "suspend" or "resume" 
>    functions, you are doing something wrong.
> 
>  - "suspend" is: snapshot memory, and anything you do *after* the snapshot 
>    is totally irrelevant. You MUST NOT suspend devices before, since 
>    devices are what that snapshot should be written out to, and you MUST 
>    NOT suspend devices afterwards either, because that shows that you are 
>    a moron who didn't understand the "machine will be turned off" part.

Can I get you on IRC somewhere? No, I do not think I'm a moron, and
yes, I need to suspend^Wsnapshot the devices before, so I have that in
the snapshot. Of course, I'll need to resume^Wrestore the devices
before writing snapshot. That's okay, it does not take long.

>  - "resume" is basically: get image into memory, turn *off* every device, 

Exactly. I need to turn devices *off* before restoring image, and I
need them *off* before saving image, too -- DMAs are dangerous.

I currently do that using "suspend" and "resume" hooks, before they
turn DMAs / IRQs off as a sideeffect.

>    put image into its proper location, and call the "startup" function. If 
>    you call a device "resume()" function, you again show that you are a 
>    moron, because you're not resuming anything at all, you're resetting 
>    the device from scratch. You _reinitialize_ the device. You don't 
>    resume it, and somebody may hve (and indeed, *WILL HAVE* used the 
>    device in between). There should be absolutely zero shared code, and 
>    the *last* thing you should do is to call the device with the same 
>    function, and give it a flag to tell it to do one thing or the other.

Well, "startup" function or how you want to call it has to deal with
device not initialized (s2disk driver is module, s2ram with hw powered
off) and has to deal with device initialized (s2disk driver in kernel
case, s2ram device was powered). 

I fear that "resume"/"restore" functions need to be pretty robust,
anyway... 

> And THAT is why I hate the kernel STD. It is fundamentally confused. In 
> ways that APM was not, I'd like to point out.

Ok, yes, it is confused/confusing.

> I'd love to get it fixed. But the first fix is to not call it "suspend", 
> because language *does* matter, and using that term is what I'm convinced 
> has confused so many people.

> If it had been called "snapshot + restore", I suspect a lot of
> people 

snapshot/restore sounds okay to me.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 20:20                                   ` Rafael J. Wysocki
@ 2007-04-25 20:24                                     ` Linus Torvalds
  2007-04-25 21:30                                       ` Pavel Machek
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 20:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Adrian Bunk, Pavel Machek, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven



On Wed, 25 Apr 2007, Rafael J. Wysocki wrote:
> 
> Please ask anyone who's worked with me if he's had any problem with that.
> If anyone say I'm unable to work with anybody else, I'd say you're right.  Till
> then, I feel offended.

I'll apologise (and virtually kiss your hairy feet) if you could actually 
show me a single implementation that people can agree on.

But until then, I claim that the suspend-to-disk people cannot work with 
each other.

And no, "three different implementations" doesn't cut it. Even _two_ is 
too much. We need to get *rid* of something, not add more.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:38                                 ` Linus Torvalds
  2007-04-25 20:08                                   ` Pavel Machek
  2007-04-25 20:20                                   ` Rafael J. Wysocki
@ 2007-04-25 20:23                                   ` Adrian Bunk
  2007-04-25 22:19                                     ` Kenneth Crudup
  2 siblings, 1 reply; 158+ messages in thread
From: Adrian Bunk @ 2007-04-25 20:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, Apr 25, 2007 at 12:38:47PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 25 Apr 2007, Adrian Bunk wrote:
> > > 
> > > .. but if the alternative is a feature that just isn't worth it, and 
> > > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > > I believe STD is both of those. There's a reason it's called "STD". Go 
> > > to google and type "STD" and press "I'm feeling lucky". Google is God).
> > 
> > Is there really no use case for STD?
>...
> I'd actually be happier *removing* STD support in the sense it is now: 
> it's way too closely integrated with STR, even though it has absolutely 
> nothing in common with it. When you STD, you'e actually much closer to a 
> *shutdown* than to STR, yet the STD code continually seems to want to be 
> in the "suspend" path, as shown even by its name.
> 
> So my objections to STD have nothing to do with saving state and shutting 
> down. They have everything to do with the fact that it is not - and will 
> never be - a "suspend", and it shouldn't affect suspend.
>...

There are two completely different points:
- I say that the feature STD has use cases where STR is not a replacement
- you say you dislike the current implementation of STD

For me it was a serious regression if STD was removed without any 
replacement.

If someone would replace the STD implementation with what you want it to 
be I wouldn't care and you were happy.

> 		Linus

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:38                                 ` Linus Torvalds
  2007-04-25 20:08                                   ` Pavel Machek
@ 2007-04-25 20:20                                   ` Rafael J. Wysocki
  2007-04-25 20:24                                     ` Linus Torvalds
  2007-04-25 20:23                                   ` Adrian Bunk
  2 siblings, 1 reply; 158+ messages in thread
From: Rafael J. Wysocki @ 2007-04-25 20:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Adrian Bunk, Pavel Machek, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On Wednesday, 25 April 2007 21:38, Linus Torvalds wrote:
> 
> On Wed, 25 Apr 2007, Adrian Bunk wrote:

Well, I told Pavel that I wouldn't take part in this thread, but since you're
making some rude and unfounded personal remarks, I feel I have to speak.

[--snip--]
> And that's a *fundamental* problem. If the STD people cannot even realize 
> that they have less to do with "suspend" than to "reboot", how do you ever 
> expect them to get anything to work, and not affect other things 
> negatively?

That's not true.

> Yeah, I'm down on it. I'm down on it because every person involved with 
> the whole STD thing seems to have basically zero taste, and a total 
> inability to work with anybody else.

Please ask anyone who's worked with me if he's had any problem with that.
If anyone say I'm unable to work with anybody else, I'd say you're right.  Till
then, I feel offended.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:38                                 ` Linus Torvalds
@ 2007-04-25 20:08                                   ` Pavel Machek
  2007-04-25 20:33                                     ` Rafael J. Wysocki
  2007-04-25 22:36                                     ` Manu Abraham
  2007-04-25 20:20                                   ` Rafael J. Wysocki
  2007-04-25 20:23                                   ` Adrian Bunk
  2 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 20:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Adrian Bunk, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > > .. but if the alternative is a feature that just isn't worth it, and 
> > > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > > I believe STD is both of those. There's a reason it's called "STD". Go 
> > > to google and type "STD" and press "I'm feeling lucky". Google is God).
> > 
> > Is there really no use case for STD?
> 
> People seem to have reading comprehension problems.
> 
> The STD code is buggy, and has introduced bugs in STR too, largely thanks 
> to bad design. Some of them have happily gotten fixed. Others did not, and 
> now we have three totally different versions (two of which share some 
> infrastructure), all of which are broken (ie the "suspend2" people will 
> swear up-and-down that swsusp doesn't work for them, but anybody who 
> thinks that "suspend2" will work for everybody is just being a total 
> idiot, and I have a bridge to sell to them).

Well, lets get some credit to STD... it worked before STR, and it
allowed debugging basic driver infrastructure. 

> So my objections to STD have nothing to do with saving state and shutting 
> down. They have everything to do with the fact that it is not - and will 
> never be - a "suspend", and it shouldn't affect suspend.

STD needs to snapshot system, and then it needs devices to be
suspended so that snapshot is consistent.

> And that's a *fundamental* problem. If the STD people cannot even realize 
> that they have less to do with "suspend" than to "reboot", how do you ever 
> expect them to get anything to work, and not affect other things 
> negatively?

STD worked first ;-). Yes, these days it has little to do with
"suspend", it was mostly separated to "snapshot" and "restore".

We still keep swsusp in kernel for compatibility (and because it makes
debugging very easy).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:43                       ` Kenneth Crudup
@ 2007-04-25 20:08                         ` Linus Torvalds
  2007-04-25 20:27                           ` Pavel Machek
  2007-04-26  0:41                           ` Thomas Orgis
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 20:08 UTC (permalink / raw)
  To: Kenneth Crudup
  Cc: Pavel Machek, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven



On Wed, 25 Apr 2007, Kenneth Crudup wrote:
> 
> Any working suspend-to-disk method takes care of that for me.  (I'm
> really not sure why Linus hates S2D so much, though. Back in the day
> there was a lot more BIOS support, but that's been years now.)

The really sad part is that APM actually did this better.. 

It's not often you can say that, and APM didn't do diddly-squat for 
run-time power management, but when it comes to suspend-to-disk, APM 
actually did ok.

I think you could do STD right too, but:

 - if you think it's about suspending devices, you are immediately 
   disqualified. If you call the device driver "suspend" or "resume" 
   functions, you are doing something wrong.

 - "suspend" is: snapshot memory, and anything you do *after* the snapshot 
   is totally irrelevant. You MUST NOT suspend devices before, since 
   devices are what that snapshot should be written out to, and you MUST 
   NOT suspend devices afterwards either, because that shows that you are 
   a moron who didn't understand the "machine will be turned off" part.

 - "resume" is basically: get image into memory, turn *off* every device, 
   put image into its proper location, and call the "startup" function. If 
   you call a device "resume()" function, you again show that you are a 
   moron, because you're not resuming anything at all, you're resetting 
   the device from scratch. You _reinitialize_ the device. You don't 
   resume it, and somebody may hve (and indeed, *WILL HAVE* used the 
   device in between). There should be absolutely zero shared code, and 
   the *last* thing you should do is to call the device with the same 
   function, and give it a flag to tell it to do one thing or the other.

The important thing to take away from this is that it has nothing to do 
with "suspend" or "resume" at any level what-so-ever. Not at a device 
level, not at a system level, and not even when it comes to hardware. But 
for completely idiotic and wrong reasons, it is currently intimately 
involved in suspend/resume, and calls the same device management functions 
as a suspend/resume thing does, and shares a lot of the code.

And THAT is why I hate the kernel STD. It is fundamentally confused. In 
ways that APM was not, I'd like to point out.

I'd love to get it fixed. But the first fix is to not call it "suspend", 
because language *does* matter, and using that term is what I'm convinced 
has confused so many people.

If it had been called "snapshot + restore", I suspect a lot of people 
wouldn't have been so confused about what it does and how it needs to do 
it, and wouldn't have tried to shoehorn it into the same corner of the 
kernel as "suspend-to-ram" (where you really *can* do things like 
"suspend" devices, and while they might certainly lose power in between, 
they also really might not, and they certainly won't be *doing* things in 
between).

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:25                               ` Adrian Bunk
  2007-04-25 19:38                                 ` Linus Torvalds
  2007-04-25 19:41                                 ` Andrew Morton
@ 2007-04-25 19:55                                 ` Pavel Machek
  2007-04-25 22:13                                 ` Kenneth Crudup
  2007-04-26  1:25                                 ` Antonino A. Daplas
  4 siblings, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 19:55 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > > 3W for the complete system? In CPU state S1? [1]
> > 
> > In STR, 3W is quite realistic. The CPU is off, all (or most - up to you) 
> > the devices are off, but the motherboard and memory is powered.
> 
> As far as I understand it, the CPU isn't off in S1.
> 
> > > And even 3W would still be a waste of energy.
> > 
> > .. but if the alternative is a feature that just isn't worth it, and 
> > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > I believe STD is both of those. There's a reason it's called "STD". Go 
> > to google and type "STD" and press "I'm feeling lucky". Google is God).
> 
> Is there really no use case for STD?

Of course there are use cases for STD... lots of them... that's why
I'm maintaining it.

It has some "interesting" use cases, like suspend on one machine,
transfer image to identical one, resume there, dual-boot to windows;
there are "normal" use cases, like machines not capable of S3.

I hope we are not dropping STD just now...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:23                     ` Pavel Machek
                                         ` (2 preceding siblings ...)
  2007-04-25 15:18                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang " Adrian Bunk
@ 2007-04-25 19:43                       ` Kenneth Crudup
  2007-04-25 20:08                         ` Linus Torvalds
  2007-05-26 17:37                       ` Martin Steigerwald
  4 siblings, 1 reply; 158+ messages in thread
From: Kenneth Crudup @ 2007-04-25 19:43 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Nick Piggin, Mike Galbraith, linux-kernel,
	Thomas Gleixner, Con Kolivas, suspend2-devel, Ingo Molnar,
	Andrew Morton, Arjan van de Ven


On Wed, 25 Apr 2007, Pavel Machek wrote:

> I'm starting to think that we should fix the idle power consumption
> problem. Cell phones do it right. They pretend to be ready/idle all
> the time, yet they have _days_ of standby.

My laptop goes nearly everywhere I do; I DO NOT want it on when I'm
travelling around to clients or between home and office or on a plane,
and I lose a lot of productivity the times I have to restart from a
cold boot as when I'm working I tend to have up ~10 xterms and while
my browsers have "restart", that's not infallible.

Any working suspend-to-disk method takes care of that for me.  (I'm
really not sure why Linus hates S2D so much, though. Back in the day
there was a lot more BIOS support, but that's been years now.)

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Los Angeles
O: 3630 S. Sepulveda Blvd. #138, L.A., CA 90034-6809      (888) 454-8181

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:25                               ` Adrian Bunk
  2007-04-25 19:38                                 ` Linus Torvalds
@ 2007-04-25 19:41                                 ` Andrew Morton
  2007-04-25 19:55                                 ` Pavel Machek
                                                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 158+ messages in thread
From: Andrew Morton @ 2007-04-25 19:41 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Pavel Machek, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Thomas Gleixner, Arjan van de Ven

On Wed, 25 Apr 2007 21:25:12 +0200 Adrian Bunk <bunk@stusta.de> wrote:

> > > And even 3W would still be a waste of energy.
> > 
> > .. but if the alternative is a feature that just isn't worth it, and 
> > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > I believe STD is both of those. There's a reason it's called "STD". Go 
> > to google and type "STD" and press "I'm feeling lucky". Google is God).
> 
> Is there really no use case for STD?

I use it all the time.  The batteries only seem to last a day or so in STR.

Plus one is supposed to power off all electrical equipment during takeoff
and landing.

> No worries if power is completely lost.

To change batteries.

> Some people might boot Windows between suspending and resuming.

I use that often too.  (But I won't when I get around to upgrading the X
driver to get the VaioOfDeath's external video output working under
Linux)

I don't think I need a fancy splash screen tho.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 19:25                               ` Adrian Bunk
@ 2007-04-25 19:38                                 ` Linus Torvalds
  2007-04-25 20:08                                   ` Pavel Machek
                                                     ` (2 more replies)
  2007-04-25 19:41                                 ` Andrew Morton
                                                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 19:38 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Pavel Machek, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven



On Wed, 25 Apr 2007, Adrian Bunk wrote:
> > 
> > .. but if the alternative is a feature that just isn't worth it, and 
> > likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> > I believe STD is both of those. There's a reason it's called "STD". Go 
> > to google and type "STD" and press "I'm feeling lucky". Google is God).
> 
> Is there really no use case for STD?

People seem to have reading comprehension problems.

The STD code is buggy, and has introduced bugs in STR too, largely thanks 
to bad design. Some of them have happily gotten fixed. Others did not, and 
now we have three totally different versions (two of which share some 
infrastructure), all of which are broken (ie the "suspend2" people will 
swear up-and-down that swsusp doesn't work for them, but anybody who 
thinks that "suspend2" will work for everybody is just being a total 
idiot, and I have a bridge to sell to them).

I'd actually be happier *removing* STD support in the sense it is now: 
it's way too closely integrated with STR, even though it has absolutely 
nothing in common with it. When you STD, you'e actually much closer to a 
*shutdown* than to STR, yet the STD code continually seems to want to be 
in the "suspend" path, as shown even by its name.

So my objections to STD have nothing to do with saving state and shutting 
down. They have everything to do with the fact that it is not - and will 
never be - a "suspend", and it shouldn't affect suspend.

And that's a *fundamental* problem. If the STD people cannot even realize 
that they have less to do with "suspend" than to "reboot", how do you ever 
expect them to get anything to work, and not affect other things 
negatively?

Yeah, I'm down on it. I'm down on it because every person involved with 
the whole STD thing seems to have basically zero taste, and a total 
inability to work with anybody else. 

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:29                       ` Pavel Machek
                                           ` (2 preceding siblings ...)
  2007-04-25 13:07                         ` Federico Heinz
@ 2007-04-25 19:38                         ` Kenneth Crudup
  3 siblings, 0 replies; 158+ messages in thread
From: Kenneth Crudup @ 2007-04-25 19:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Linus Torvalds,
	Thomas Gleixner, Arjan van de Ven


On Wed, 25 Apr 2007, Pavel Machek wrote:

> You'll miss compression part, but that provides only small speedup.

Not here:

----
fgrep -h Compressed /var/log/rawlog*
Apr 22 13:41:34 vaio kernel:   Compressed 85655552 bytes into 46779248 (45 percent compression).
Apr 22 16:09:13 vaio kernel:   Compressed 1380552704 bytes into 435656971 (68 percent compression).
Apr 22 17:06:11 vaio kernel:   Compressed 1488437248 bytes into 437400026 (70 percent compression).
Apr 22 22:55:41 vaio kernel:   Compressed 1875677184 bytes into 623450953 (66 percent compression).
Apr 23 12:30:33 vaio kernel:   Compressed 1731796992 bytes into 528194347 (69 percent compression).
Apr 23 18:13:32 vaio kernel:   Compressed 1883869184 bytes into 691016832 (63 percent compression).
Apr 24 11:55:07 vaio kernel:   Compressed 1795903488 bytes into 703370960 (60 percent compression).
<snip>
----

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Los Angeles
O: 3630 S. Sepulveda Blvd. #138, L.A., CA 90034-6809      (888) 454-8181

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 18:50                             ` Linus Torvalds
  2007-04-25 19:02                               ` Hua Zhong
@ 2007-04-25 19:25                               ` Adrian Bunk
  2007-04-25 19:38                                 ` Linus Torvalds
                                                   ` (4 more replies)
  2007-04-25 23:33                               ` Olivier Galibert
  2 siblings, 5 replies; 158+ messages in thread
From: Adrian Bunk @ 2007-04-25 19:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 25 Apr 2007, Adrian Bunk wrote:
> > 
> > 3W for the complete system? In CPU state S1? [1]
> 
> In STR, 3W is quite realistic. The CPU is off, all (or most - up to you) 
> the devices are off, but the motherboard and memory is powered.

As far as I understand it, the CPU isn't off in S1.

> > And even 3W would still be a waste of energy.
> 
> .. but if the alternative is a feature that just isn't worth it, and 
> likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> I believe STD is both of those. There's a reason it's called "STD". Go 
> to google and type "STD" and press "I'm feeling lucky". Google is God).

Is there really no use case for STD?

No worries if power is completely lost.
Some people might boot Windows between suspending and resuming.
...

> 		Linus

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 158+ messages in thread

* RE: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 18:50                             ` Linus Torvalds
@ 2007-04-25 19:02                               ` Hua Zhong
  2007-04-25 19:25                               ` Adrian Bunk
  2007-04-25 23:33                               ` Olivier Galibert
  2 siblings, 0 replies; 158+ messages in thread
From: Hua Zhong @ 2007-04-25 19:02 UTC (permalink / raw)
  To: 'Linus Torvalds', 'Adrian Bunk'
  Cc: 'Pavel Machek', 'Ingo Molnar',
	'Nigel Cunningham', 'Christian Hesse',
	'Nick Piggin', 'Mike Galbraith',
	linux-kernel, 'Con Kolivas',
	suspend2-devel, 'Andrew Morton',
	'Thomas Gleixner', 'Arjan van de Ven'

> In STR, 3W is quite realistic. The CPU is off, all (or most - up to you)
> the devices are off, but the motherboard and memory is powered.
> 
> > And even 3W would still be a waste of energy.
> 
> .. but if the alternative is a feature that just isn't worth it, and
> likely to not only have its own bugs, but cause bugs elsewhere? (And
> yes,
> I believe STD is both of those. There's a reason it's called "STD". Go
> to google and type "STD" and press "I'm feeling lucky". Google is God).

Linus, the fact that you personally don't use S2D does not mean it's not
useful for other people. I've been using solely laptop for six years and
until recently (when my commute is now only two miles) I'd always used
hibernate when I go to or leave form work. And even now if I take my laptop
to somewhere farther away (like on a vacation) I need hibernation.

This is one area where Windows has been doing great for many years, and it's
not like Linux has not had a mature implementation for many years either. So
don't you think your comments are a bit odd at this point?

Hua



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 17:34                         ` Pavel Machek
  2007-04-25 18:39                           ` Adrian Bunk
@ 2007-04-25 18:52                           ` Alon Bar-Lev
  2007-04-25 22:11                           ` Kenneth Crudup
  2 siblings, 0 replies; 158+ messages in thread
From: Alon Bar-Lev @ 2007-04-25 18:52 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Adrian Bunk, Linus Torvalds, Ingo Molnar, Nigel Cunningham,
	Christian Hesse, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven

On 4/25/07, Pavel Machek <pavel@ucw.cz> wrote:
> Ok ok ok, suspend-to-disk has some other uses, too.
>
> But ... you are really using suspend-to-disk as a workaround for "my
> desktop takes too much power when idle". Imagine pressing "lock
> screensaver" combination, and your machine going to low power mode
> (3W?), immediately. (Quiet, too; you can't generate much noise for
> 3W). In the morning, you'd just press any key, machine would power up,
> immediately... ok, you'd have to ifconfig eth0 down, so that spurious
> packets on the local net would wake your machine, with all its fans
> etc.
>                                                                 Pavel

You are assuming that:
1. You have battery backup, or external power never fail.
2. You don't disconnect the filesystem from the device.
3. The security level of turned on device equals to a turned off one.
4. You turn on the same device that turned off.
5. You do not wish to boot another OS on this machine.

None of the above are always true... but why assume?
Just make this work... If Nigel wish to maintain this please let him,
you can be in charge of the s2ram.

Best Regards,
Alon Bar-Lev.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 18:39                           ` Adrian Bunk
@ 2007-04-25 18:50                             ` Linus Torvalds
  2007-04-25 19:02                               ` Hua Zhong
                                                 ` (2 more replies)
  0 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-25 18:50 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Pavel Machek, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven



On Wed, 25 Apr 2007, Adrian Bunk wrote:
> 
> 3W for the complete system? In CPU state S1? [1]

In STR, 3W is quite realistic. The CPU is off, all (or most - up to you) 
the devices are off, but the motherboard and memory is powered.

> And even 3W would still be a waste of energy.

.. but if the alternative is a feature that just isn't worth it, and 
likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
I believe STD is both of those. There's a reason it's called "STD". Go 
to google and type "STD" and press "I'm feeling lucky". Google is God).

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 17:34                         ` Pavel Machek
@ 2007-04-25 18:39                           ` Adrian Bunk
  2007-04-25 18:50                             ` Linus Torvalds
  2007-04-25 18:52                           ` Alon Bar-Lev
  2007-04-25 22:11                           ` Kenneth Crudup
  2 siblings, 1 reply; 158+ messages in thread
From: Adrian Bunk @ 2007-04-25 18:39 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, Apr 25, 2007 at 07:34:05PM +0200, Pavel Machek wrote:
> Hi!
> 
> > > Even I am running in-kernel swsusp, but my managers were pretty clear
> > > they want graphical progress bar hiding all the 'ugly' swsusp
> > > messages... and in the end the same uswsusp enables compression, too.
> > > 
> > > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> > > > whole thing. I think they've _all_ caused problems for the "true" suspend 
> > > > (suspend-to-ram), and the last thing I want to see is three or four 
> > > 
> > > Well, it is a bit more complex than that.
> > > 
> > > suspend-to-disk is a workaround for
> > > 
> > > 	'suspend-to-ram eats too much power' (plus some details like
> > > 	being able to replace battery).
> > >...
> > 
> > Why does everyone think suspend-to-disk was a laptop-only thing?
> > 
> > My personal usage of suspend-to-disk is for turning the computer off in 
> > the evening and getting the complete FVWM with all programs running, 
> > open browser tabs,... back the next morning.
> 
> Ok ok ok, suspend-to-disk has some other uses, too.
> 
> But ... you are really using suspend-to-disk as a workaround for "my
> desktop takes too much power when idle". Imagine pressing "lock
> screensaver" combination, and your machine going to low power mode
> (3W?), immediately. (Quiet, too; you can't generate much noise for
> 3W). In the morning, you'd just press any key, machine would power up,
> immediately... ok, you'd have to ifconfig eth0 down, so that spurious
> packets on the local net would wake your machine, with all its fans
> etc.

3W for the complete system? In CPU state S1? [1]
And even 3W would still be a waste of energy.

And what would be the advantage? The socket my computer is connected at 
is located below my bed so I can turn the power on while still lying in 
bed (the computer is not reachable from my bed). OK, I could create an 
external power button for the computer using longer cables connected to 
the motherboard, but I still haven't understood why this should be 
better for my use case than suspend-to-disk.

> 								Pavel

cu
Adrian

[1] unless I'm misunderstanding [2], page 9, that's the highest state
    my processor supports
[2] http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24309.pdf

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25 15:18                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang " Adrian Bunk
@ 2007-04-25 17:34                         ` Pavel Machek
  2007-04-25 18:39                           ` Adrian Bunk
                                             ` (2 more replies)
  0 siblings, 3 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25 17:34 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > Even I am running in-kernel swsusp, but my managers were pretty clear
> > they want graphical progress bar hiding all the 'ugly' swsusp
> > messages... and in the end the same uswsusp enables compression, too.
> > 
> > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> > > whole thing. I think they've _all_ caused problems for the "true" suspend 
> > > (suspend-to-ram), and the last thing I want to see is three or four 
> > 
> > Well, it is a bit more complex than that.
> > 
> > suspend-to-disk is a workaround for
> > 
> > 	'suspend-to-ram eats too much power' (plus some details like
> > 	being able to replace battery).
> >...
> 
> Why does everyone think suspend-to-disk was a laptop-only thing?
> 
> My personal usage of suspend-to-disk is for turning the computer off in 
> the evening and getting the complete FVWM with all programs running, 
> open browser tabs,... back the next morning.

Ok ok ok, suspend-to-disk has some other uses, too.

But ... you are really using suspend-to-disk as a workaround for "my
desktop takes too much power when idle". Imagine pressing "lock
screensaver" combination, and your machine going to low power mode
(3W?), immediately. (Quiet, too; you can't generate much noise for
3W). In the morning, you'd just press any key, machine would power up,
immediately... ok, you'd have to ifconfig eth0 down, so that spurious
packets on the local net would wake your machine, with all its fans
etc.
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:23                     ` Pavel Machek
  2007-04-25  8:48                       ` Xavier Bestel
  2007-04-25  9:02                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2:hang " Romano Giannetti
@ 2007-04-25 15:18                       ` Adrian Bunk
  2007-04-25 17:34                         ` Pavel Machek
  2007-04-25 19:43                       ` Kenneth Crudup
  2007-05-26 17:37                       ` Martin Steigerwald
  4 siblings, 1 reply; 158+ messages in thread
From: Adrian Bunk @ 2007-04-25 15:18 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, Apr 25, 2007 at 07:23:50AM +0000, Pavel Machek wrote:
> Hi!
> 
> > This is why there's a lot to be said for
> > 
> > 	echo mem > /sys/power/state
> > 
> > and being able to follow the path through _one_ object (the kernel) over 
> > trying to figure out the interaction between many different parts with 
> > different versions.
> 
> The 'promise' is 'if you can get echo disk > /sys/power/state working,
> uswsusp will work. too'. IOW it should be ok to debug the in-kernel
> parts, only.
> 
> Even I am running in-kernel swsusp, but my managers were pretty clear
> they want graphical progress bar hiding all the 'ugly' swsusp
> messages... and in the end the same uswsusp enables compression, too.
> 
> > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> > whole thing. I think they've _all_ caused problems for the "true" suspend 
> > (suspend-to-ram), and the last thing I want to see is three or four 
> 
> Well, it is a bit more complex than that.
> 
> suspend-to-disk is a workaround for
> 
> 	'suspend-to-ram eats too much power' (plus some details like
> 	being able to replace battery).
>...

Why does everyone think suspend-to-disk was a laptop-only thing?

My personal usage of suspend-to-disk is for turning the computer off in 
the evening and getting the complete FVWM with all programs running, 
open browser tabs,... back the next morning.

All I need for suspending is:
- echo "disk" > /sys/power/state

All I need for getting a running and usable system back is:
- turn on the power at the socket my computer is connected at
- swapoff -a; swapon -a       [1]
- wait a bit until the above commands finished

That's much more convenient than a cold boot.

And it's working with a plain 2.6.16 kernel and zero userspace support.

> 							Pavel

cu
Adrian

[1] required step: working with 1 or 2 GB swapped out is horrible

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:29                       ` Pavel Machek
  2007-04-25  7:48                         ` Dumitru Ciobarcianu
  2007-04-25  8:48                         ` Nigel Cunningham
@ 2007-04-25 13:07                         ` Federico Heinz
  2007-04-25 19:38                         ` Kenneth Crudup
  3 siblings, 0 replies; 158+ messages in thread
From: Federico Heinz @ 2007-04-25 13:07 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Nick Piggin, Mike Galbraith, linux-kernel,
	Con Kolivas, suspend2-devel, Andrew Morton, Linus Torvalds,
	Thomas Gleixner, Arjan van de Ven

Pavel Machek wrote:
> ..and it means that 'echo disk > ...' should work w/o suspend2 patch,
> too. (Just try it). You'll miss compression part, but that provides
> only small speedup.
>   

In my experience, the speedup is significant, both in hibernating and in 
waking up, and since the full image is written to disk, the system wakes 
up *usable*. It takes forever for a system that wakes up from uswsusp to 
be usable again, it keeps tripping over page faults for *minutes*.

    Fede


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  9:07                           ` Xavier Bestel
@ 2007-04-25  9:19                             ` Nigel Cunningham
  2007-04-26 18:18                             ` Bill Davidsen
  1 sibling, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-25  9:19 UTC (permalink / raw)
  To: Xavier Bestel
  Cc: Pavel Machek, Linus Torvalds, Ingo Molnar, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]

Hi.

On Wed, 2007-04-25 at 11:07 +0200, Xavier Bestel wrote:
> On Wed, 2007-04-25 at 18:50 +1000, Nigel Cunningham wrote:
> > > (And guess what, it uses APM and suspend is really faster and way more
> > > reliable than each kernel implementation I could try).
> > 
> > If you tried Suspend2 and had problems with reliability, please send me
> > logs. I'll do all I can to help. (I have to qualify it a bit, because
> > I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
> > I'll fix it).
> 
> Does suspend2 work with APM ? After much trying, I think now the ACPI
> implementation of my laptop (a vintage Compaq Armada 1700) is busted,
> only APM works.

It should do. If you set the powerdown method to 0, it will use
machine_power_off() instead of trying to use acpi, fall back to
machine_halt() if that fails and lastly (should not be needed) a
while(1) cpu_relax() loop.

> AFAIR the problem with suspend2 was that it didn't poweroff some parts
> of the laptop (the led of the wifi pcmcia card was on, and the lcd light
> was on too), but that was last year. Kernel's suspend kind of worked but
> didn't resume (no reaction on button press). As I tried all this last
> year, I may have forgotten some things.

The code to poweroff those parts will be dependent on the drivers
(assuming I'm making the right calls). If it's something where swsusp
works and suspend2 doesn't, it will be because I'm doing something
wrong. If they both don't do the right thing, then it's probably the
driver.

> Honestly, I like this laptop when it works flawlessly, so I don't see
> many reasons to try *susp* again. I'll do it when I'm bored, just not
> today.

Okay :) Just let me know if I can help.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  8:50                         ` Nigel Cunningham
@ 2007-04-25  9:07                           ` Xavier Bestel
  2007-04-25  9:19                             ` Nigel Cunningham
  2007-04-26 18:18                             ` Bill Davidsen
  0 siblings, 2 replies; 158+ messages in thread
From: Xavier Bestel @ 2007-04-25  9:07 UTC (permalink / raw)
  To: nigel
  Cc: Pavel Machek, Linus Torvalds, Ingo Molnar, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, 2007-04-25 at 18:50 +1000, Nigel Cunningham wrote:
> > (And guess what, it uses APM and suspend is really faster and way more
> > reliable than each kernel implementation I could try).
> 
> If you tried Suspend2 and had problems with reliability, please send me
> logs. I'll do all I can to help. (I have to qualify it a bit, because
> I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
> I'll fix it).

Does suspend2 work with APM ? After much trying, I think now the ACPI
implementation of my laptop (a vintage Compaq Armada 1700) is busted,
only APM works.

AFAIR the problem with suspend2 was that it didn't poweroff some parts
of the laptop (the led of the wifi pcmcia card was on, and the lcd light
was on too), but that was last year. Kernel's suspend kind of worked but
didn't resume (no reaction on button press). As I tried all this last
year, I may have forgotten some things.
Honestly, I like this laptop when it works flawlessly, so I don't see
many reasons to try *susp* again. I'll do it when I'm bored, just not
today.

Thanks,
	Xav



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and  suspend2:hang in atomic copy)
  2007-04-25  7:23                     ` Pavel Machek
  2007-04-25  8:48                       ` Xavier Bestel
@ 2007-04-25  9:02                       ` Romano Giannetti
  2007-04-25 15:18                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang " Adrian Bunk
                                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 158+ messages in thread
From: Romano Giannetti @ 2007-04-25  9:02 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, 2007-04-25 at 07:23 +0000, Pavel Machek wrote:
> suspend-to-disk is a workaround for
>
>         'suspend-to-ram eats too much power' (plus some details like
>         being able to replace battery).
>

...and let me add 'suspend-to-disk' is a workaround for when s2ram does
not work for a gazillion interacting reasons (ACPI, vga bios, drm/dri,
you name it).
I am quite happy with s2ram now on my AMD-based vaio, but it started to
work with 2.6.17 kernels (Ubuntu Edgy, really), and the three years
before that suspend-to-disk (sometime Pavel's, sometime Nigel's) 
saved the day (yes, it's quite faster to use suspend-to-disk that doing
shutdown, reboot, and re-open all the applications).

So, please do not dismiss suspend-to-disk as "crap". It has its place
under the sun.

Romano

--
Romano Giannetti --- romano.giannetti@gmail.com
Sorry for the following disclaimer, it's attached by our outgoing server
and I cannot shut it up.



--
La presente comunicación tiene carácter confidencial y es para el exclusivo uso del destinatario indicado en la misma. Si Ud. no es el destinatario indicado, le informamos que cualquier forma de distribución, reproducción o uso de esta comunicación y/o de la información contenida en la misma están estrictamente prohibidos por la ley. Si Ud. ha recibido esta comunicación por error, por favor, notifíquelo inmediatamente al remitente contestando a este mensaje y proceda a continuación a destruirlo. Gracias por su colaboración.

This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  8:48                       ` Xavier Bestel
@ 2007-04-25  8:50                         ` Nigel Cunningham
  2007-04-25  9:07                           ` Xavier Bestel
  0 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-25  8:50 UTC (permalink / raw)
  To: Xavier Bestel
  Cc: Pavel Machek, Linus Torvalds, Ingo Molnar, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1366 bytes --]

Hi.

On Wed, 2007-04-25 at 10:48 +0200, Xavier Bestel wrote:
> On Wed, 2007-04-25 at 07:23 +0000, Pavel Machek wrote:
> > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> > > whole thing. I think they've _all_ caused problems for the "true" suspend 
> > > (suspend-to-ram), and the last thing I want to see is three or four 
> > 
> > Well, it is a bit more complex than that.
> > 
> > suspend-to-disk is a workaround for
> > 
> >         'suspend-to-ram eats too much power' (plus some details like
> >         being able to replace battery).
> > 
> > suspend-to-ram is a workaround for
> > 
> >         'idle machine takes way too much power' (plus some details
> >         like don't spin the disk so that machine is safe to carry).
> 
> I think it depends on who you ask. I personally think that suspend-to-
> $youchoose is a workaround for the slowness of system startup. I never
> turn off my laptop, I just suspend it.
> 
> (And guess what, it uses APM and suspend is really faster and way more
> reliable than each kernel implementation I could try).

If you tried Suspend2 and had problems with reliability, please send me
logs. I'll do all I can to help. (I have to qualify it a bit, because
I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
I'll fix it).

Regards,

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:29                       ` Pavel Machek
  2007-04-25  7:48                         ` Dumitru Ciobarcianu
@ 2007-04-25  8:48                         ` Nigel Cunningham
  2007-04-25 13:07                         ` Federico Heinz
  2007-04-25 19:38                         ` Kenneth Crudup
  3 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2007-04-25  8:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Linus Torvalds, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

[-- Attachment #1: Type: text/plain, Size: 1911 bytes --]

Hi.

On Wed, 2007-04-25 at 07:29 +0000, Pavel Machek wrote:
> Hi!
> 
> > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate 
> > > the whole thing. I think they've _all_ caused problems for the "true" 
> > > suspend (suspend-to-ram), and the last thing I want to see is three or 
> > > four different suspend-to-disk implementations.  So unlike Ingo, I 
> > > don't think "let's just integrate them all side-by-side and maintain 
> > > them and look who wins" is really a good idea.
> > >
> > > How many different magic ioctl's does the thing introduce? Is it 
> > > really just *two* entry-points (and how simple are they, 
> > > interface-wise), and nothing else?
> > 
> > userspace-driven-suspend is already in the kernel, today. So it's not 
> > really "two versions side by side doing the same thing", but more of:
> > 
> >            A B C + D E F G H
> > 
> > where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
> > suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 
> 
> Actually, we have 'D H' in kernel, today. It is called swsusp...
> (Encryption, swapFile support and Graphical progress are missing from
> today's kernel.)

Along with a lot of other things (see my "Reasons to merge Suspend2"
email from earlier in the day).

> > My original mail was about the following thing: i tried the suspend2 
> > patch (which just makes "echo disk > /sys/power/state" work as expected, 
> > as long as you give the booting up kernel image an idea about where the 
> 
> ..and it means that 'echo disk > ...' should work w/o suspend2 patch,
> too. (Just try it). You'll miss compression part, but that provides
> only small speedup.

Please don't spread misinformation to support your case. LZF compression
(which is what all Suspend2 users use AFAIK) generally doubles the speed
of your cycle.

Nigel

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:23                     ` Pavel Machek
@ 2007-04-25  8:48                       ` Xavier Bestel
  2007-04-25  8:50                         ` Nigel Cunningham
  2007-04-25  9:02                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2:hang " Romano Giannetti
                                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 158+ messages in thread
From: Xavier Bestel @ 2007-04-25  8:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, 2007-04-25 at 07:23 +0000, Pavel Machek wrote:
> > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> > whole thing. I think they've _all_ caused problems for the "true" suspend 
> > (suspend-to-ram), and the last thing I want to see is three or four 
> 
> Well, it is a bit more complex than that.
> 
> suspend-to-disk is a workaround for
> 
>         'suspend-to-ram eats too much power' (plus some details like
>         being able to replace battery).
> 
> suspend-to-ram is a workaround for
> 
>         'idle machine takes way too much power' (plus some details
>         like don't spin the disk so that machine is safe to carry).

I think it depends on who you ask. I personally think that suspend-to-
$youchoose is a workaround for the slowness of system startup. I never
turn off my laptop, I just suspend it.

(And guess what, it uses APM and suspend is really faster and way more
reliable than each kernel implementation I could try).

	Xav



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  8:10                           ` Pavel Machek
@ 2007-04-25  8:22                             ` Dumitru Ciobarcianu
  2007-04-26 11:12                             ` Pekka Enberg
  1 sibling, 0 replies; 158+ messages in thread
From: Dumitru Ciobarcianu @ 2007-04-25  8:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, 2007-04-25 at 08:10 +0000, Pavel Machek wrote:
> Hi!
> 
> > > > userspace-driven-suspend is already in the kernel, today. So it's not 
> > > > really "two versions side by side doing the same thing", but more of:
> > > > 
> > > >            A B C + D E F G H
> > > > 
> > > > where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
> > > > suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 
> > > 
> > > Actually, we have 'D H' in kernel, today. It is called swsusp...
> > > (Encryption, swapFile support and Graphical progress are missing from
> > > today's kernel.)
> > 
> > Please stop using FUD. 
> > Graphical progress it's not in the kernel, even with suspend2.
> 
> It was ascii-art, but still 'graphical', last time I checked.

That would be suspend2ui_text , an userspace app.
It also works without it.

-- 
Cioby



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:48                         ` Dumitru Ciobarcianu
@ 2007-04-25  8:10                           ` Pavel Machek
  2007-04-25  8:22                             ` Dumitru Ciobarcianu
  2007-04-26 11:12                             ` Pekka Enberg
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25  8:10 UTC (permalink / raw)
  To: Dumitru Ciobarcianu
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > > userspace-driven-suspend is already in the kernel, today. So it's not 
> > > really "two versions side by side doing the same thing", but more of:
> > > 
> > >            A B C + D E F G H
> > > 
> > > where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
> > > suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 
> > 
> > Actually, we have 'D H' in kernel, today. It is called swsusp...
> > (Encryption, swapFile support and Graphical progress are missing from
> > today's kernel.)
> 
> Please stop using FUD. 
> Graphical progress it's not in the kernel, even with suspend2.

It was ascii-art, but still 'graphical', last time I checked.

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  7:29                       ` Pavel Machek
@ 2007-04-25  7:48                         ` Dumitru Ciobarcianu
  2007-04-25  8:10                           ` Pavel Machek
  2007-04-25  8:48                         ` Nigel Cunningham
                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 158+ messages in thread
From: Dumitru Ciobarcianu @ 2007-04-25  7:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Wed, 2007-04-25 at 07:29 +0000, Pavel Machek wrote:
> > userspace-driven-suspend is already in the kernel, today. So it's not 
> > really "two versions side by side doing the same thing", but more of:
> > 
> >            A B C + D E F G H
> > 
> > where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
> > suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 
> 
> Actually, we have 'D H' in kernel, today. It is called swsusp...
> (Encryption, swapFile support and Graphical progress are missing from
> today's kernel.)

Please stop using FUD. 
Graphical progress it's not in the kernel, even with suspend2.

> 
> > My original mail was about the following thing: i tried the suspend2 
> > patch (which just makes "echo disk > /sys/power/state" work as expected, 
> > as long as you give the booting up kernel image an idea about where the 
> 
> ..and it means that 'echo disk > ...' should work w/o suspend2 patch,
> too. (Just try it). You'll miss compression part, but that provides
> only small speedup.

I beg to differ:
  Compressed 904687616 bytes into 418828687 (53 percent compression).

Almost 500mb less to write (did I mention it writes the full image?).
Now imagine the time it takes to write that with those pesky 4200rpm
laptop hdds.

-- 
Cioby

"Mr Linus, how do you debug the kernel, what tools do you use?"
"Ever heard of prinf ?"
(From an presentation at the "Politehnica" University of Bucharest, 1995)


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-25  6:41                     ` Ingo Molnar
@ 2007-04-25  7:29                       ` Pavel Machek
  2007-04-25  7:48                         ` Dumitru Ciobarcianu
                                           ` (3 more replies)
  0 siblings, 4 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25  7:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate 
> > the whole thing. I think they've _all_ caused problems for the "true" 
> > suspend (suspend-to-ram), and the last thing I want to see is three or 
> > four different suspend-to-disk implementations.  So unlike Ingo, I 
> > don't think "let's just integrate them all side-by-side and maintain 
> > them and look who wins" is really a good idea.
> >
> > How many different magic ioctl's does the thing introduce? Is it 
> > really just *two* entry-points (and how simple are they, 
> > interface-wise), and nothing else?
> 
> userspace-driven-suspend is already in the kernel, today. So it's not 
> really "two versions side by side doing the same thing", but more of:
> 
>            A B C + D E F G H
> 
> where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
> suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 

Actually, we have 'D H' in kernel, today. It is called swsusp...
(Encryption, swapFile support and Graphical progress are missing from
today's kernel.)

> My original mail was about the following thing: i tried the suspend2 
> patch (which just makes "echo disk > /sys/power/state" work as expected, 
> as long as you give the booting up kernel image an idea about where the 

..and it means that 'echo disk > ...' should work w/o suspend2 patch,
too. (Just try it). You'll miss compression part, but that provides
only small speedup.
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 23:41                   ` Linus Torvalds
  2007-04-25  1:06                     ` Olivier Galibert
  2007-04-25  6:41                     ` Ingo Molnar
@ 2007-04-25  7:23                     ` Pavel Machek
  2007-04-25  8:48                       ` Xavier Bestel
                                         ` (4 more replies)
  2 siblings, 5 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-25  7:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> This is why there's a lot to be said for
> 
> 	echo mem > /sys/power/state
> 
> and being able to follow the path through _one_ object (the kernel) over 
> trying to figure out the interaction between many different parts with 
> different versions.

The 'promise' is 'if you can get echo disk > /sys/power/state working,
uswsusp will work. too'. IOW it should be ok to debug the in-kernel
parts, only.

Even I am running in-kernel swsusp, but my managers were pretty clear
they want graphical progress bar hiding all the 'ugly' swsusp
messages... and in the end the same uswsusp enables compression, too.

> I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
> whole thing. I think they've _all_ caused problems for the "true" suspend 
> (suspend-to-ram), and the last thing I want to see is three or four 

Well, it is a bit more complex than that.

suspend-to-disk is a workaround for

	'suspend-to-ram eats too much power' (plus some details like
	being able to replace battery).

suspend-to-ram is a workaround for

	'idle machine takes way too much power' (plus some details
	like don't spin the disk so that machine is safe to carry).

I'm starting to think that we should fix the idle power consumption
problem. Cell phones do it right. They pretend to be ready/idle all
the time, yet they have _days_ of standby.

OLPC can do something like that, too: it is capable of entering
suspend-to-ram with screen on and input devices ready to wake the
system up.

And with right network card (and right userland) ... I think normal
PCs could enter suspend-to-ram during screensaver, too. When you are
about to turn off the screen, machine should enable WOL on any packet,
arm RTC wakeup for the next packet, and s2ram happily.

(Obviously we are far away from that on PC.)
							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 23:41                   ` Linus Torvalds
  2007-04-25  1:06                     ` Olivier Galibert
@ 2007-04-25  6:41                     ` Ingo Molnar
  2007-04-25  7:29                       ` Pavel Machek
  2007-04-25  7:23                     ` Pavel Machek
  2 siblings, 1 reply; 158+ messages in thread
From: Ingo Molnar @ 2007-04-25  6:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> I absolutely detest all suspend-to-disk crap. Quite frankly, I hate 
> the whole thing. I think they've _all_ caused problems for the "true" 
> suspend (suspend-to-ram), and the last thing I want to see is three or 
> four different suspend-to-disk implementations.  So unlike Ingo, I 
> don't think "let's just integrate them all side-by-side and maintain 
> them and look who wins" is really a good idea.
>
> How many different magic ioctl's does the thing introduce? Is it 
> really just *two* entry-points (and how simple are they, 
> interface-wise), and nothing else?

userspace-driven-suspend is already in the kernel, today. So it's not 
really "two versions side by side doing the same thing", but more of:

           A B C + D E F G H

where "ABC" is used by the uswsusp code today, and "ABCDEFGH" is used by 
suspend2. So any "suspend2 merge" would largely be about adding "DEFGH". 
(uswsusp of course redoes 'DEFGH' in user-space its own way, and there 
is the inevitable "+" glue code as well, but it's at least not two 
parallel versions of the same thing in the kernel, which would be 
gross.)

My original mail was about the following thing: i tried the suspend2 
patch (which just makes "echo disk > /sys/power/state" work as expected, 
as long as you give the booting up kernel image an idea about where the 
swap partition we suspended to is, via a single boot option) and that it 
was pretty straightforward and worked well, and that i think its way of 
reusing the existing suspend infrastructure and doing the add-ons 
cleanly while keeping the existing user-hibernate code intact looked 
viable to me.

I.e. to me it looked like while there are apparent conflicts of 
personalities suspend2 did not really seem to be a hostile 
reimplementation of 'A B C', but that it tries to build upon 'A B C' and 
just has a different technical opinion about whether 'DEFGH' should be 
in the kernel or outside of it.

	Ingo

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 23:41                   ` Linus Torvalds
@ 2007-04-25  1:06                     ` Olivier Galibert
  2007-04-25  6:41                     ` Ingo Molnar
  2007-04-25  7:23                     ` Pavel Machek
  2 siblings, 0 replies; 158+ messages in thread
From: Olivier Galibert @ 2007-04-25  1:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Ingo Molnar, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On Tue, Apr 24, 2007 at 04:41:58PM -0700, Linus Torvalds wrote:
> How many different magic ioctl's does the thing introduce? Is it really 
> just *two* entry-points (and how simple are they, interface-wise), and 
> nothing else?

Aren't you a little late to the party here?  The userland version is
the one that currently is in the kernel, after all the people who said
"doing it in userland is not necessarily a good idea" got happily
ignored.  Suspend2 which is the continuity of the fully-in-kernel one
is the one that has been constantly rejected by Pavel, lately by
saying "it should be done in userspace", and hence never merged.

Incidentally, it's 13 ioctls, and it's documented in
Documentation/power/userland-swsusp.txt in a hard drive near you.  I
especially like the "get the available swap space in bytes" one that
can only handle 32 bits.

  OG.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 21:24                 ` Pavel Machek
@ 2007-04-24 23:41                   ` Linus Torvalds
  2007-04-25  1:06                     ` Olivier Galibert
                                       ` (2 more replies)
  2007-04-26 10:17                   ` Johannes Berg
  1 sibling, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-24 23:41 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven



On Tue, 24 Apr 2007, Pavel Machek wrote:
> > 
> > If the code just moved somewhere else, it's not "less code".
> 
> It is not "just moved". It is in userspace, where we can use liblzf /
> gcrypt / ( and vbetool for s2ram/s2both) as libraries. We have about
> 7000 LoC of userland code (that is not libraries).

If it's in user land, we also have 

 - communication difficulties between two parts, and all the *crap* that 
   tends to entail (ie legacy interfaces forever, and upgrading one 
   without the other etc)

 - people who work on the kernel part are working "blind" (ie they are at 
   the mercy of whatever userland does, and it's not a "contained" 
   subsystem). This just ends up becoming worse when you then interact 
   with ten different versions of the user-land stuff, thanks to small 
   tweaks by five different vendors, and a hundred random people.

And don't tell me that doesn't happen. Maybe it doesn't happen _now_, 
because people who use it all get the patches from one place, but the 
moment we start talking about integration into the standard kernel, that 
means that the kernel needs to work regardless of whether somebody uses 
SuSE, RH, Fedora, Ubuntu or cooked his own distro entirely using some 
development version of the suspend user-space tools.

This is why I don't believe in the whole kernel-line-counting thing. I'm 
personally 100% convinced that it's better to have ten times as many lines 
in the kernel, if it means that you can just forget about version skew and 
bad user-space interfaces etc.

So if you want to enumerate "good" points, you'd damn well also face the 
_problems_.

This is why there's a lot to be said for

	echo mem > /sys/power/state

and being able to follow the path through _one_ object (the kernel) over 
trying to figure out the interaction between many different parts with 
different versions.

> I believe uswsusp user/kernel separation is clean enough. Kernel
> provides "snapshot image" and "resume image". (Thanks go to Rafael for
> very clean interface).

Now, *that* is the kind of argument that matters.

Quite frankly, if you want to convince me, it's not by "lines of kernel 
code", but by talking about easy-to-understand interfaces that actuually 
do one thing and do it well (and by "one thing", I mean "one _whole_ 
thing"). Because I care a lot less about lines of code than about 
"maintainable interfaces that people can think about and debug".

I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
whole thing. I think they've _all_ caused problems for the "true" suspend 
(suspend-to-ram), and the last thing I want to see is three or four 
different suspend-to-disk implementations.  So unlike Ingo, I don't think 
"let's just integrate them all side-by-side and maintain them and look who 
wins" is really a good idea.

How many different magic ioctl's does the thing introduce? Is it really 
just *two* entry-points (and how simple are they, interface-wise), and 
nothing else?

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 21:29                   ` Pavel Machek
@ 2007-04-24 22:24                     ` Ray Lee
  2007-04-25 21:41                     ` Matt Mackall
  1 sibling, 0 replies; 158+ messages in thread
From: Ray Lee @ 2007-04-24 22:24 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Linus Torvalds, Nigel Cunningham, Christian Hesse,
	Nick Piggin, Mike Galbraith, linux-kernel, Con Kolivas,
	suspend2-devel, Andrew Morton, Thomas Gleixner, Arjan van de Ven

On 4/24/07, Pavel Machek <pavel@ucw.cz> wrote:
> > so, while it's probably apples to oranges, uswsusp seems to be larger,
> > while there's at least one feature that it is missing.
>
> (We are talking "save 100% memory" here).
>
> As I said, that one feature is doable in uswsusp, too. It is 200
> lines. It also makes mm <-> swsusp interaction _way_ more complex, and

Sounds like the perfect reason to put that in kernel space.

Ray

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 20:54                 ` Ingo Molnar
@ 2007-04-24 21:29                   ` Pavel Machek
  2007-04-24 22:24                     ` Ray Lee
  2007-04-25 21:41                     ` Matt Mackall
  0 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-24 21:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > I have not a frigging clue whether that is the case in suspend2 vs 
> > uswsusp, but I object to this idiotic argument of counting "kernel 
> > code". That's simply not a valid argument. It never was.
> 
> the raw linecount appears to be the following:
> 
>  suspend2-2.2.9.12-for-2.6.21-rc6.patch
> 
>    89 files changed, 16452 insertions(+), 69 deletions(-)
> 
>  $ suspend-0.5> countlines
>  32260

(I'm getting very different numbers here. Userland part:)

pavel@amd:~/sf/suspend$ wc -l *.c
   125 bootsplash.c
    12 breakit.c
   119 config.c
   136 delme.c
   207 dmidecode.c
    92 encrypt.c
   222 keygen.c
   447 md5.c
   286 radeontool.c
   870 resume.c
   434 s2ram.c
    78 splash.c
    73 splashy_funcs.c
  1481 suspend.c
   117 swap-offset.c
    11 vfork_test.c
   123 vt.c
   413 whitelist.c
   136 whitelist2.c
  5382 total
pavel@amd:~/sf/suspend$ wc -l *.h
    23 bootsplash.h
    26 config.h
    62 encrypt.h
   106 md5.h
  1764 radeon_reg.h
    20 s2ram.h
    26 splash.h
    25 splashy_funcs.h
   217 swsusp.h
    10 vt.h
  2279 total
pavel@amd:~/sf/suspend$

> so, while it's probably apples to oranges, uswsusp seems to be larger, 
> while there's at least one feature that it is missing.

(We are talking "save 100% memory" here).

As I said, that one feature is doable in uswsusp, too. It is 200
lines. It also makes mm <-> swsusp interaction _way_ more complex, and
noone was able to review it. It will corrupt memory if we got it
wrong.

(Suspend2 has the same problem. It includes that same feature, and
noone is able to review it. It has few more problems.).

> also, from the structure of the suspend2 patch it seemed to me that they 
> could peacefully coexist in the kernel without stepping on each other's 
> toes - why not do that? Users will then pick the winner.

We do not want to fragment the testing base, and suspend2 does not
really have any interesting features over uswsusp.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 20:41               ` Linus Torvalds
  2007-04-24 20:51                 ` Hua Zhong
  2007-04-24 20:54                 ` Ingo Molnar
@ 2007-04-24 21:24                 ` Pavel Machek
  2007-04-24 23:41                   ` Linus Torvalds
  2007-04-26 10:17                   ` Johannes Berg
  2 siblings, 2 replies; 158+ messages in thread
From: Pavel Machek @ 2007-04-24 21:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven

Hi!

> > Well, current uswsusp code can do most of stuff suspend2 can do, with
> > 20% (or so) of kernel code. 
> 
> Btw, this is a totally inane argument.
> 
> If the code just moved somewhere else, it's not "less code".

It is not "just moved". It is in userspace, where we can use liblzf /
gcrypt / ( and vbetool for s2ram/s2both) as libraries. We have about
7000 LoC of userland code (that is not libraries).

> You compare complete subsystems against complete subsystems, OR YOU DON'T 
> COMPARE THEM AT ALL!

Ok, I do not know how big suspend2 user code is, but kernel uswsusp
(4 kLoC) + userland support (7 kLoC) is still smaller than suspend2
kernel code (+ ? kLoC suspend2 userland support).

> This whole notion that "kernel lines of code" is somehow different is a 
> stupid and idiotic _disease_ that is spread by microkernel people and 
> people who have been brainwashed by them.

Yep, sorry about that.

> Separation of code often makes things *harder* to understand and debug. A 
> few prime examples of this f*cking idiotic stupid disease of discounting 
> user level code because it somehow "doesn't matter" is:

I believe uswsusp user/kernel separation is clean enough. Kernel
provides "snapshot image" and "resume image". (Thanks go to Rafael for
very clean interface).

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 20:41               ` Linus Torvalds
  2007-04-24 20:51                 ` Hua Zhong
@ 2007-04-24 20:54                 ` Ingo Molnar
  2007-04-24 21:29                   ` Pavel Machek
  2007-04-24 21:24                 ` Pavel Machek
  2 siblings, 1 reply; 158+ messages in thread
From: Ingo Molnar @ 2007-04-24 20:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> I have not a frigging clue whether that is the case in suspend2 vs 
> uswsusp, but I object to this idiotic argument of counting "kernel 
> code". That's simply not a valid argument. It never was.

the raw linecount appears to be the following:

 suspend2-2.2.9.12-for-2.6.21-rc6.patch

   89 files changed, 16452 insertions(+), 69 deletions(-)

 $ suspend-0.5> countlines
 32260

so, while it's probably apples to oranges, uswsusp seems to be larger, 
while there's at least one feature that it is missing.

also, from the structure of the suspend2 patch it seemed to me that they 
could peacefully coexist in the kernel without stepping on each other's 
toes - why not do that? Users will then pick the winner.

	Ingo

^ permalink raw reply	[flat|nested] 158+ messages in thread

* RE: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 20:41               ` Linus Torvalds
@ 2007-04-24 20:51                 ` Hua Zhong
  2007-04-24 20:54                 ` Ingo Molnar
  2007-04-24 21:24                 ` Pavel Machek
  2 siblings, 0 replies; 158+ messages in thread
From: Hua Zhong @ 2007-04-24 20:51 UTC (permalink / raw)
  To: 'Linus Torvalds', 'Pavel Machek'
  Cc: 'Ingo Molnar', 'Nigel Cunningham',
	'Christian Hesse', 'Nick Piggin',
	'Mike Galbraith', linux-kernel, 'Con Kolivas',
	suspend2-devel, 'Andrew Morton',
	'Thomas Gleixner', 'Arjan van de Ven'

> This whole notion that "kernel lines of code" is somehow different is a
> stupid and idiotic _disease_ that is spread by microkernel people and
> people who have been brainwashed by them.

I think a lot of people are tired of this argument, but I am glad you speak
up (as you did last year wrt s2ram).

> The only thing that matters is the end result

Amen to that. The end result is not just code size, but quality and whether
it actually *works reliably*.

Cheers,

Hua


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-24 20:23             ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Pavel Machek
@ 2007-04-24 20:41               ` Linus Torvalds
  2007-04-24 20:51                 ` Hua Zhong
                                   ` (2 more replies)
  0 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2007-04-24 20:41 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Nigel Cunningham, Christian Hesse, Nick Piggin,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	Andrew Morton, Thomas Gleixner, Arjan van de Ven



On Tue, 24 Apr 2007, Pavel Machek wrote:
> 
> Well, current uswsusp code can do most of stuff suspend2 can do, with
> 20% (or so) of kernel code. 

Btw, this is a totally inane argument.

If the code just moved somewhere else, it's not "less code".

You compare complete subsystems against complete subsystems, OR YOU DON'T 
COMPARE THEM AT ALL!

This whole notion that "kernel lines of code" is somehow different is a 
stupid and idiotic _disease_ that is spread by microkernel people and 
people who have been brainwashed by them.

Code is code, and sometimes it's better in the kernel, sometimes it's 
better in user space, and you cannot say "we only have 10 lines of kernel 
code", if that is then combined with a million lines of user space code 
that actually is the only reason for the 10 lines of code in the first 
place.

Separation of code often makes things *harder* to understand and debug. A 
few prime examples of this f*cking idiotic stupid disease of discounting 
user level code because it somehow "doesn't matter" is:

 - the old 16-bit pcmcia layer that did all the "policy" in user space, 
   and only the "device access" in kernel space, and as a result _neither_ 
   actually knew what the hell they were actually doing, and debugging was 
   a nightmare.

   We've become a *lot* better off with a device layer that actually knows 
   and understands what it is doing, and having the code in one place, 
   rather than having two broken pieces.

   And we became better exactly by doing *more* in the kernel, and havign 
   a *higher* level of abstraction. This is a BIG ISSUE. Abstraction is 
   good, but abstraction is good only if it is at a high enough level to 
   make sense and matter, and give the abstraction level a choice in how 
   to implement the lower layers.

 - the old module loader was also split into user/kernel space, and yes, 
   we made the kernel part "larger" by moving some parts into the kernel, 
   but in doing so, we actually made the *combined* code smaller, and a 
   hell of a lot more maintainable.

   It also automatically (again, because of a higher level of abstraction) 
   meant that the new module loader infrastructure was not only more 
   maintainable, but could actually *do* more. Suddenly you can do things
   like check for cryptographic signatures etc, because you know what 
   you're actually doing, as opposed to getting a ready-made "binary blob" 
   that you don't know anything about, that has been pre-linked etc.

So stop blathering about "less kernel code". That's the *least* of any 
worries. The only thing that matters is the end result, and trying to say 
that magically only one part counts is just dishonest and stupid.

In general, the kernel should be self-sufficient and *understand* what it 
is doing. If the kernel cannot understand the bigger picture, nobody can 
ever maintain the kernel, because the kernel is just a broken piece 
bobbing around in a mindless churning sea during a thunderstorm. You 
cannot have purpose, and you cannot improve yourself if you don't actyally 
understand your lot in life. That's as true of kernels as it is of people.

User-space should set high-level policy, but if the kernel doesn't know 
what it's all about, the kernel can never do anything smarter and can 
never *fix* itself. That was the case in both PCMCIA and in kernel module 
loaders.

I have not a frigging clue whether that is the case in suspend2 vs 
uswsusp, but I object to this idiotic argument of counting "kernel code". 
That's simply not a valid argument. It never was.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-19  7:04           ` Ingo Molnar
@ 2007-04-24 20:23             ` Pavel Machek
  2007-04-24 20:41               ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Pavel Machek @ 2007-04-24 20:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nigel Cunningham, Christian Hesse, Nick Piggin, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, Andrew Morton,
	Linus Torvalds, Thomas Gleixner, Arjan van de Ven

Hi!

> > From subsequent emails, I think you already got your answer, but just 
> > in case...
> > 
> > Yes, if you enabled "Replace swsusp by default" and you already had it 
> > set up for getting swsusp to resume. If not, and you're using an 
> > initrd/ramfs, you'll need to modify it to echo
> > > /sys/power/suspend2/do_resume after /sys and /proc are mounted but
> > prior to mounting / and so on.
> 
> yeah, went with the default suggested by your patch:
> 
>    CONFIG_SUSPEND2_REPLACE_SWSUSP=y
> 
> and it was pretty easy to set things up. I used "echo disk > 
> /sys/power/state" to trigger it.
> 
> In hindsight it was all pretty straightforward and suspend2 worked 
> beautifully on an UP and on an SMP system i tried. So in exchange for 
> suspend2 folks debugging a bug in CFS here's some suspend2 review 
> feedback ;) Any plans about moving suspend2 to the upstream kernel? It 
> should be pretty easy for it to co-exist with the current swsuspend 
> code.

Well, current uswsusp code can do most of stuff suspend2 can do, with
20% (or so) of kernel code. 

"Major feature" that is missing is ability to save 100% of memory if
it is all the pagecache. I think that is not that important; we have
200 line patch to do that, but noone was able to verify it is correct.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

end of thread, other threads:[~2007-05-26 22:23 UTC | newest]

Thread overview: 158+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-26 17:19 suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Josh Triplett
2007-04-26 17:42 ` Linus Torvalds
2007-04-28 19:18   ` Bill Davidsen
2007-04-28 19:15     ` David Lang
2007-04-29  1:04       ` Matthew Garrett
2007-04-29  1:34         ` Josh Triplett
  -- strict thread matches above, loose matches on Subject: below --
2007-04-13 20:21 [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] Ingo Molnar
2007-04-18 20:45 ` CFS and suspend2: hang in atomic copy Christian Hesse
2007-04-18 21:16   ` Ingo Molnar
2007-04-18 21:57     ` Christian Hesse
2007-04-18 22:02       ` Ingo Molnar
2007-04-19  1:52         ` [Suspend2-devel] " Nigel Cunningham
2007-04-19  7:04           ` Ingo Molnar
2007-04-24 20:23             ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Pavel Machek
2007-04-24 20:41               ` Linus Torvalds
2007-04-24 20:51                 ` Hua Zhong
2007-04-24 20:54                 ` Ingo Molnar
2007-04-24 21:29                   ` Pavel Machek
2007-04-24 22:24                     ` Ray Lee
2007-04-25 21:41                     ` Matt Mackall
2007-04-26 11:27                       ` Pavel Machek
2007-04-26 19:04                       ` Bill Davidsen
2007-04-24 21:24                 ` Pavel Machek
2007-04-24 23:41                   ` Linus Torvalds
2007-04-25  1:06                     ` Olivier Galibert
2007-04-25  6:41                     ` Ingo Molnar
2007-04-25  7:29                       ` Pavel Machek
2007-04-25  7:48                         ` Dumitru Ciobarcianu
2007-04-25  8:10                           ` Pavel Machek
2007-04-25  8:22                             ` Dumitru Ciobarcianu
2007-04-26 11:12                             ` Pekka Enberg
2007-04-26 14:48                               ` Rafael J. Wysocki
2007-04-26 16:10                                 ` Pekka Enberg
2007-04-26 19:28                                   ` Rafael J. Wysocki
2007-04-26 20:16                                     ` Nigel Cunningham
2007-04-26 20:37                                       ` Rafael J. Wysocki
2007-04-26 20:49                                         ` David Lang
2007-04-26 20:55                                         ` Nigel Cunningham
2007-04-26 21:22                                           ` Rafael J. Wysocki
2007-04-26 22:08                                             ` Nigel Cunningham
2007-04-25  8:48                         ` Nigel Cunningham
2007-04-25 13:07                         ` Federico Heinz
2007-04-25 19:38                         ` Kenneth Crudup
2007-04-25  7:23                     ` Pavel Machek
2007-04-25  8:48                       ` Xavier Bestel
2007-04-25  8:50                         ` Nigel Cunningham
2007-04-25  9:07                           ` Xavier Bestel
2007-04-25  9:19                             ` Nigel Cunningham
2007-04-26 18:18                             ` Bill Davidsen
2007-04-25  9:02                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2:hang " Romano Giannetti
2007-04-25 15:18                       ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang " Adrian Bunk
2007-04-25 17:34                         ` Pavel Machek
2007-04-25 18:39                           ` Adrian Bunk
2007-04-25 18:50                             ` Linus Torvalds
2007-04-25 19:02                               ` Hua Zhong
2007-04-25 19:25                               ` Adrian Bunk
2007-04-25 19:38                                 ` Linus Torvalds
2007-04-25 20:08                                   ` Pavel Machek
2007-04-25 20:33                                     ` Rafael J. Wysocki
2007-04-25 20:31                                       ` Pavel Machek
2007-04-25 22:36                                     ` Manu Abraham
2007-04-25 20:20                                   ` Rafael J. Wysocki
2007-04-25 20:24                                     ` Linus Torvalds
2007-04-25 21:30                                       ` Pavel Machek
2007-04-25 21:40                                         ` Rafael J. Wysocki
2007-04-25 21:46                                           ` Pavel Machek
2007-04-25 22:22                                         ` Nigel Cunningham
2007-04-25 20:23                                   ` Adrian Bunk
2007-04-25 22:19                                     ` Kenneth Crudup
2007-04-25 19:41                                 ` Andrew Morton
2007-04-25 19:55                                 ` Pavel Machek
2007-04-25 22:13                                 ` Kenneth Crudup
2007-04-26  1:25                                 ` Antonino A. Daplas
2007-04-25 23:33                               ` Olivier Galibert
2007-04-26  1:56                                 ` Nigel Cunningham
2007-04-26  7:27                                   ` David Lang
2007-04-26  9:45                                     ` Nigel Cunningham
2007-04-25 18:52                           ` Alon Bar-Lev
2007-04-25 22:11                           ` Kenneth Crudup
2007-04-25 19:43                       ` Kenneth Crudup
2007-04-25 20:08                         ` Linus Torvalds
2007-04-25 20:27                           ` Pavel Machek
2007-04-25 20:44                             ` Linus Torvalds
2007-04-25 21:07                               ` Rafael J. Wysocki
2007-04-25 21:44                               ` Pavel Machek
2007-04-25 22:18                                 ` Linus Torvalds
2007-04-25 22:27                                   ` Nigel Cunningham
2007-04-25 22:55                                     ` Linus Torvalds
2007-04-25 23:13                                       ` Pavel Machek
2007-04-25 23:29                                         ` Linus Torvalds
2007-04-25 23:45                                           ` Pavel Machek
2007-04-26  1:48                                             ` Nigel Cunningham
2007-04-26  1:40                                       ` Nigel Cunningham
2007-04-26  2:04                                         ` Linus Torvalds
2007-04-26  2:13                                           ` Nigel Cunningham
2007-04-26  3:03                                             ` Linus Torvalds
2007-04-26  3:34                                               ` Nigel Cunningham
2007-04-26  2:31                                           ` Nigel Cunningham
2007-04-26 10:39                                       ` Johannes Berg
2007-04-26 11:30                                         ` Pavel Machek
2007-04-26 11:41                                           ` Johannes Berg
2007-04-26 16:31                                           ` Johannes Berg
2007-04-26 18:40                                             ` Rafael J. Wysocki
2007-04-26 18:40                                               ` Johannes Berg
2007-04-26 19:02                                                 ` Rafael J. Wysocki
2007-04-27  9:41                                                   ` Johannes Berg
2007-04-27 10:18                                                     ` Rafael J. Wysocki
2007-04-27 10:19                                                       ` Johannes Berg
2007-04-27 12:09                                                         ` Rafael J. Wysocki
2007-04-27 12:07                                                           ` Johannes Berg
2007-04-25 22:42                                   ` Pavel Machek
2007-04-25 22:58                                     ` Linus Torvalds
2007-04-25 22:43                                   ` Chuck Ebbert
2007-04-25 23:00                                     ` Linus Torvalds
2007-04-25 22:49                                   ` Pavel Machek
2007-04-25 23:10                                     ` Linus Torvalds
2007-04-25 23:28                                       ` Pavel Machek
2007-04-25 23:57                                         ` Linus Torvalds
2007-04-25 22:57                                   ` Alan Cox
2007-04-25 23:20                                     ` Linus Torvalds
2007-04-25 23:52                                       ` Pavel Machek
2007-04-26  0:05                                         ` Linus Torvalds
2007-04-26  0:14                                           ` Pavel Machek
2007-04-25 23:51                                             ` David Lang
2007-04-26  0:38                                             ` Linus Torvalds
2007-04-26  2:04                                               ` H. Peter Anvin
2007-04-26  2:32                                                 ` Linus Torvalds
2007-04-26 13:14                                                   ` Alan Cox
2007-04-26 16:02                                                     ` Linus Torvalds
2007-04-26  0:34                                           ` Linus Torvalds
2007-04-26 20:12                                             ` Rafael J. Wysocki
2007-04-26  0:24                                       ` Alan Cox
2007-04-26  1:10                                         ` Linus Torvalds
2007-04-26 14:04                                           ` Mark Lord
2007-04-26 16:10                                             ` Linus Torvalds
2007-04-26 21:00                                               ` Pavel Machek
2007-04-26  7:08                                         ` Andy Grover
2007-04-26  0:41                           ` Thomas Orgis
2007-05-26 17:37                       ` Martin Steigerwald
2007-05-26 20:35                         ` Rafael J. Wysocki
2007-05-26 22:23                           ` Martin Steigerwald
2007-04-26 10:17                   ` Johannes Berg
2007-04-26 10:30                     ` Pavel Machek
2007-04-26 10:40                       ` Pavel Machek
2007-04-26 11:11                         ` Johannes Berg
2007-04-26 11:16                           ` Pavel Machek
2007-04-26 11:27                             ` Johannes Berg
2007-04-26 11:26                               ` Pavel Machek
2007-04-26 11:35                                 ` Johannes Berg
2007-04-26 11:33                                   ` Pavel Machek
2007-04-26 16:14                                   ` Chris Friesen
2007-04-26 16:27                                     ` Linus Torvalds
2007-04-26 17:11                                     ` Johannes Berg
2007-04-26 15:56                                 ` Linus Torvalds
2007-04-26 21:06                                   ` Theodore Tso
2007-04-26 21:12                                     ` Nigel Cunningham
2007-04-26 13:45                         ` Johannes Berg
2007-04-26 11:04                       ` Johannes Berg
2007-04-26 11:09                         ` Pavel Machek
2007-04-26 15:53                           ` Linus Torvalds
2007-04-26 18:21                           ` Olivier Galibert
2007-04-26 21:30                             ` Pavel Machek
2007-04-26 11:35                     ` Christoph Hellwig
2007-04-26 12:15                       ` Ingo Molnar
2007-04-26 12:41                         ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).