linux-kernel.vger.kernel.org archive mirror
* RE: Fix SWSUSP & !SWAP
@ 2003-04-23 23:47 Grover, Andrew
  2003-04-24  0:03 ` Pavel Machek
  0 siblings, 1 reply; 68+ messages in thread
From: Grover, Andrew @ 2003-04-23 23:47 UTC (permalink / raw)
  To: Martin J. Bligh, Nigel Cunningham, Marc Giger
  Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development

> From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
> Can't you just create a pre-reserved separate swsusp area on 
> disk the size 
> of RAM (maybe a partition rather than a file to make things 
> easier), and 
> then you know you're safe (basically what Marc was 
> suggesting, except pre-allocated)? Or does that make me the 
> prince of all evil? ;-)
> 
> However much swap space you allocate, it can always all be 
> used, so that seems futile ...

This is what Other OSes do, and I believe this is the correct path.
Using swap for swsusp is a clever hack but not a 100% solution.

-- Andy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:03 ` Pavel Machek
@ 2003-04-23 23:57   ` Martin J. Bligh
  2003-04-24  0:25     ` Pavel Machek
  2003-04-24  0:02   ` Nigel Cunningham
  2003-04-24  3:49   ` David Ford
  2 siblings, 1 reply; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-23 23:57 UTC (permalink / raw)
  To: Pavel Machek, Grover, Andrew
  Cc: Nigel Cunningham, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

>> > From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
>> > Can't you just create a pre-reserved separate swsusp area on 
>> > disk the size 
>> > of RAM (maybe a partition rather than a file to make things 
>> > easier), and 
>> > then you know you're safe (basically what Marc was 
>> > suggesting, except pre-allocated)? Or does that make me the 
>> > prince of all evil? ;-)
>> > 
>> > However much swap space you allocate, it can always all be 
>> > used, so that seems futile ...
>> 
>> This is what Other OSes do, and I believe this is the correct path.
>> Using swap for swsusp is a clever hack but not a 100% solution.
> 
> Well, for normal use it's clearly inferior -- suspend partition is unused
> when it could be used for speeding system up by swapping out unused
> stuff.
> 
> OtherOS approach is better because it can guarantee suspend-to-disk
> for critical situations like overheat or battery-critical.
> 
> But we can get best of both worlds if we OOM-kill during critical
> suspend. [If suspend partition was not used for swapping, machine
> would *already* have OOM-killed someone, so we are only improving stuff].

OK ... but at least having the *option* to have a separate reserved
area would be nice, no? For most people, RAM is just a tiny amount
of their disk space ... and damn, does it make the code simpler ;-)

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:03 ` Pavel Machek
  2003-04-23 23:57   ` Martin J. Bligh
@ 2003-04-24  0:02   ` Nigel Cunningham
  2003-04-24  0:23     ` Pavel Machek
  2003-04-24  0:45     ` Martin J. Bligh
  2003-04-24  3:49   ` David Ford
  2 siblings, 2 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-24  0:02 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Grover, Andrew, Martin J. Bligh, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

I don't believe I've ever seen things get OOM killed. Instead, page
cache is discarded until things do fit.

Regards,

Nigel

On Thu, 2003-04-24 at 12:03, Pavel Machek wrote:
> Hi!
> 
> > > From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
> > > Can't you just create a pre-reserved separate swsusp area on 
> > > disk the size 
> > > of RAM (maybe a partition rather than a file to make things 
> > > easier), and 
> > > then you know you're safe (basically what Marc was 
> > > suggesting, except pre-allocated)? Or does that make me the 
> > > prince of all evil? ;-)
> > > 
> > > However much swap space you allocate, it can always all be 
> > > used, so that seems futile ...
> > 
> > This is what Other OSes do, and I believe this is the correct path.
> > Using swap for swsusp is a clever hack but not a 100% solution.
> 
> Well, for normal use it's clearly inferior -- suspend partition is unused
> when it could be used for speeding system up by swapping out unused
> stuff.
> 
> OtherOS approach is better because it can guarantee suspend-to-disk
> for critical situations like overheat or battery-critical.
> 
> But we can get best of both worlds if we OOM-kill during critical
> suspend. [If suspend partition was not used for swapping, machine
> would *already* have OOM-killed someone, so we are only improving stuff].
> 
> 						Pavel  
-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 23:47 Fix SWSUSP & !SWAP Grover, Andrew
@ 2003-04-24  0:03 ` Pavel Machek
  2003-04-23 23:57   ` Martin J. Bligh
                     ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  0:03 UTC (permalink / raw)
  To: Grover, Andrew
  Cc: Martin J. Bligh, Nigel Cunningham, Marc Giger, Pavel Machek,
	Geert Uytterhoeven, Linux Kernel Development

Hi!

> > From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
> > Can't you just create a pre-reserved separate swsusp area on 
> > disk the size 
> > of RAM (maybe a partition rather than a file to make things 
> > easier), and 
> > then you know you're safe (basically what Marc was 
> > suggesting, except pre-allocated)? Or does that make me the 
> > prince of all evil? ;-)
> > 
> > However much swap space you allocate, it can always all be 
> > used, so that seems futile ...
> 
> This is what Other OSes do, and I believe this is the correct path.
> Using swap for swsusp is a clever hack but not a 100% solution.

Well, for normal use it's clearly inferior -- suspend partition is unused
when it could be used for speeding system up by swapping out unused
stuff.

OtherOS approach is better because it can guarantee suspend-to-disk
for critical situations like overheat or battery-critical.

But we can get best of both worlds if we OOM-kill during critical
suspend. [If suspend partition was not used for swapping, machine
would *already* have OOM-killed someone, so we are only improving stuff].

						Pavel  

-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:02   ` Nigel Cunningham
@ 2003-04-24  0:23     ` Pavel Machek
  2003-04-24  0:45     ` Martin J. Bligh
  1 sibling, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  0:23 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Grover, Andrew, Martin J. Bligh, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

Hi!

> I don't believe I've ever seen things get OOM killed. Instead, page
> cache is discarded until things do fit.

In extreme cases, things just may not fit. It does not happen too
often in real life.

								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-23 23:57   ` Martin J. Bligh
@ 2003-04-24  0:25     ` Pavel Machek
  2003-04-24  0:37       ` CaT
  2003-04-24  0:49       ` Martin J. Bligh
  0 siblings, 2 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  0:25 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Grover, Andrew, Nigel Cunningham, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

Hi!

> >> > From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
> >> > Can't you just create a pre-reserved separate swsusp area on 
> >> > disk the size 
> >> > of RAM (maybe a partition rather than a file to make things 
> >> > easier), and 
> >> > then you know you're safe (basically what Marc was 
> >> > suggesting, except pre-allocated)? Or does that make me the 
> >> > prince of all evil? ;-)
> >> > 
> >> > However much swap space you allocate, it can always all be 
> >> > used, so that seems futile ...
> >> 
> >> This is what Other OSes do, and I believe this is the correct path.
> >> Using swap for swsusp is a clever hack but not a 100% solution.
> > 
> > Well, for normal use it's clearly inferior -- suspend partition is unused
> > when it could be used for speeding system up by swapping out unused
> > stuff.
> > 
> > OtherOS approach is better because it can guarantee suspend-to-disk
> > for critical situations like overheat or battery-critical.
> > 
> > But we can get best of both worlds if we OOM-kill during critical
> > suspend. [If suspend partition was not used for swapping, machine
> > would *already* have OOM-killed someone, so we are only improving stuff].
> 
> OK ... but at least having the *option* to have a separate reserved
> area would be nice, no? For most people, RAM is just a tiny amount
> of their disk space ... and damn, does it make the code simpler ;-)

If it is an *option*, it does not make code simpler.

And OOM-killing during suspend is just what you want. It makes suspend
deterministic but it might kill someone. [Well, your solution would
kill him sooner than that...]
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:25     ` Pavel Machek
@ 2003-04-24  0:37       ` CaT
  2003-04-24  0:49       ` Martin J. Bligh
  1 sibling, 0 replies; 68+ messages in thread
From: CaT @ 2003-04-24  0:37 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Martin J. Bligh, Grover, Andrew, Nigel Cunningham, Marc Giger,
	Geert Uytterhoeven, Linux Kernel Development

On Thu, Apr 24, 2003 at 02:25:52AM +0200, Pavel Machek wrote:
> If it is an *option*, it does not make code simpler.
> 
> And OOM-killing during suspend is just what you want. It makes suspend
> deterministic but it might kill someone. [Well, your solution would
> kill him sooner than that...]

I wouldn't say it's what you want. I'd really rather not have OpenOffice
(e.g.) killed just because, while I could have had the space to suspend
fully, I wasn't able to. And if it's my only reason for using the laptop at
the time then it just outright dying might be just as useful.

Have the OOM killing if you want but being able to tack on an extra swap
partition when suspending would be most useful, either out of the kernel
or out of userspace, whichever would be the most reliable.

-- 
Martin's distress was in contrast to the bitter satisfaction of some
of his fellow marines as they surveyed the scene. "The Iraqis are sick
people and we are the chemotherapy," said Corporal Ryan Dupre. "I am
starting to hate this country. Wait till I get hold of a friggin' Iraqi.
No, I won't get hold of one. I'll just kill him."
	- http://www.informationclearinghouse.info/article2479.htm


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:02   ` Nigel Cunningham
  2003-04-24  0:23     ` Pavel Machek
@ 2003-04-24  0:45     ` Martin J. Bligh
  2003-04-24  3:17       ` Nigel Cunningham
  1 sibling, 1 reply; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-24  0:45 UTC (permalink / raw)
  To: Nigel Cunningham, Pavel Machek
  Cc: Grover, Andrew, Marc Giger, Geert Uytterhoeven, Linux Kernel Development

> I don't believe I've ever seen things get OOM killed. Instead, page
> cache is discarded until things do fit.

What happens if user allocated pages are filling up all the space,
not page cache? Trust me, it happens ;-)

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:25     ` Pavel Machek
  2003-04-24  0:37       ` CaT
@ 2003-04-24  0:49       ` Martin J. Bligh
  2003-04-24  9:16         ` Pavel Machek
  1 sibling, 1 reply; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-24  0:49 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Grover, Andrew, Nigel Cunningham, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

>> OK ... but at least having the *option* to have a separate reserved
>> area would be nice, no? For most people, RAM is just a tiny amount
>> of their disk space ... and damn, does it make the code simpler ;-)
> 
> If it is an *option*, it does not make code simpler.
> 
> And OOM-killing during suspend is just what you want. It makes suspend
> deterministic but it might kill someone. [Well, your solution would
> kill him sooner than that...]

OK, fair enough. I like the "activate the spare swap blob" plan.
Seems like the best of both worlds ... people can use files or
partitions, and all the code is already there.

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:45     ` Martin J. Bligh
@ 2003-04-24  3:17       ` Nigel Cunningham
  2003-04-24  4:37         ` Martin J. Bligh
  0 siblings, 1 reply; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-24  3:17 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Pavel Machek, Grover, Andrew, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

On Thu, 2003-04-24 at 12:45, Martin J. Bligh wrote:
> > I don't believe I've ever seen things get OOM killed. Instead, page
> > cache is discarded until things do fit.
> 
> What happens if user allocated pages are filling up all the space,
> not page cache? Trust me, it happens ;-)

Yep, just because I haven't seen it, doesn't mean a thing. :>. In that
case, there are two issues: memory to work in to start with and how to
save the image without corrupting it.

Regarding #1, there must still be some memory available, mustn't there?
swsusp only needs approx (nrpages in use/256) pages to do its work. Surely
we'd always be able to get .4% of the number of pages? Even if we can't
get that many, we should be able to adjust the algorithm to be able to
suspend a machine with only 10 or so pages available to start with (no,
I'm not volunteering to do it! I want to merge with 2.5 and get on to
other projects!).
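
As a rough illustration of the arithmetic above (a sketch only: the 4 KiB
page size and the 512 MiB machine are assumptions for the example, and
working_pages() is a hypothetical helper, not a swsusp function):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def working_pages(pages_in_use: int) -> int:
    """Approximate pages swsusp needs: pages_in_use / 256, rounded up."""
    return (pages_in_use + 255) // 256


# Illustrative machine: 512 MiB of RAM, all of it in use.
pages = (512 * 1024 * 1024) // PAGE_SIZE
needed = working_pages(pages)

# Prints the page count, the working set, and the ratio (about .4%).
print(pages, needed, f"{needed / pages:.2%}")
```

So the working set scales with memory in use, but stays at 1/256 of it.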

Regarding #2, my algorithm (ie not the version in 2.5 at the mo)
separates pages to be saved into 2 types. Pageset1 are pages we expect
to be needed during suspend. Pageset2 is those that will definitely not
be needed. My algorithm for saving the data goes: Save pageset2 pages to
disk then (as per the original/current method) make a copy of pageset1
pages (using the pageset2 locations + extra allocated memory if need be)
and save the copy. Loading the image is the reverse process. Pageset 2
currently only consists of all highmem pages + active and inactive list
pages. If we refined the algorithm, perhaps that would address your
issue. The other point here is that since we have to be able to make a
copy of pageset1 pages, and since I haven't inlined kmap/unmap in the
routine to copy pageset1 pages back on resume (Pavel will say whew to
that, I'm sure!), pageset1 has a maximum size of half normal memory. I
reckon refining the algorithm so that pageset1 can be [nearly] guaranteed
to always be smaller is the better area to focus on, and I'm perfectly
happy to try suggestions, particularly when they come in the form of a
code fragment that includes a call to SetPagePageset2(struct page * page)
for the relevant targets :>
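
To make the ordering concrete, here is a toy model of the two-pageset
save described above. It is a simplified sketch, not the kernel code:
Page, split_pagesets() and plan_suspend() are hypothetical names, and
treating only lowmem pageset2 frames (plus free lowmem) as room for the
pageset1 copy is my reading of the "half normal memory" limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    pfn: int
    highmem: bool = False
    on_lru: bool = False  # on the active or inactive list


def split_pagesets(pages):
    """Pageset2 = highmem + active/inactive list pages; pageset1 = the rest."""
    pageset2 = [p for p in pages if p.highmem or p.on_lru]
    pageset1 = [p for p in pages if not (p.highmem or p.on_lru)]
    return pageset1, pageset2


def plan_suspend(pages, free_lowmem=0):
    """Return the write order, or raise if pageset1 cannot be copied."""
    pageset1, pageset2 = split_pagesets(pages)
    # Once pageset2 is on disk, its lowmem frames can hold the pageset1 copy.
    copy_room = sum(1 for p in pageset2 if not p.highmem) + free_lowmem
    if len(pageset1) > copy_room:
        raise MemoryError("not enough room to copy pageset1")
    plan = [("write", p.pfn) for p in pageset2]
    plan += [("copy+write", p.pfn) for p in pageset1]
    return plan
```

In this model the suspend fails exactly when pageset1 is larger than the
lowmem left over for its copy, which is why shrinking pageset1 is the
interesting direction.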

Regards,

Nigel
-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.



* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:03 ` Pavel Machek
  2003-04-23 23:57   ` Martin J. Bligh
  2003-04-24  0:02   ` Nigel Cunningham
@ 2003-04-24  3:49   ` David Ford
  2003-04-24  6:54     ` Jörn Engel
  2003-04-24  7:01     ` Elladan
  2 siblings, 2 replies; 68+ messages in thread
From: David Ford @ 2003-04-24  3:49 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Grover, Andrew, Martin J. Bligh, Nigel Cunningham, Marc Giger,
	Geert Uytterhoeven, Linux Kernel Development

I honestly don't see OOMing as an acceptable practice.  If I wanted to 
kill a bunch of stuff just to suspend, I would have simply shut the 
system down.  That isn't my intent or desire.  I want to suspend the 
system just as it is without OOMing a bunch of programs.

David

Pavel Machek wrote:

>Hi!
>
>  
>
>>>From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
>>>Can't you just create a pre-reserved separate swsusp area on 
>>>disk the size 
>>>of RAM (maybe a partition rather than a file to make things 
>>>easier), and 
>>>then you know you're safe (basically what Marc was 
>>>suggesting, except pre-allocated)? Or does that make me the 
>>>prince of all evil? ;-)
>>>
>>>However much swap space you allocate, it can always all be 
>>>used, so that seems futile ...
>>>      
>>>
>>This is what Other OSes do, and I believe this is the correct path.
>>Using swap for swsusp is a clever hack but not a 100% solution.
>>    
>>
>
>Well, for normal use it's clearly inferior -- suspend partition is unused
>when it could be used for speeding system up by swapping out unused
>stuff.
>
>OtherOS approach is better because it can guarantee suspend-to-disk
>for critical situations like overheat or battery-critical.
>
>But we can get best of both worlds if we OOM-kill during critical
>suspend. [If suspend partition was not used for swapping, machine
>would *already* have OOM-killed someone, so we are only improving stuff].
>
>						Pavel  
>
>  
>



* Re: Fix SWSUSP & !SWAP
  2003-04-24  3:17       ` Nigel Cunningham
@ 2003-04-24  4:37         ` Martin J. Bligh
  2003-04-24  7:49           ` Marc Giger
  2003-04-24  9:27           ` Pavel Machek
  0 siblings, 2 replies; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-24  4:37 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Grover, Andrew, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

>> > I don't believe I've ever seen things get OOM killed. Instead, page
>> > cache is discarded until things do fit.
>> 
>> What happens if user allocated pages are filling up all the space,
>> not page cache? Trust me, it happens ;-)
> 
> Yep, just because I haven't seen it, doesn't mean a thing. :>. In that
> case, there are two issues: memory to work in to start with and how to
> save the image without corrupting it.
> 
> Regarding #1, there must still be some memory available, mustn't there?
> swsusp only needs approx (nrpages in use/256) pages to do its work. Surely
> we'd always be able to get .4% of the number of pages? Even if we can't
> get that many, we should be able to adjust the algorithm to be able to
> suspend a machine with only 10 or so pages available to start with (no,
> I'm not volunteering to do it! I want to merge with 2.5 and get on to
> other projects!).

Well, if you have the pre-reserved area, you just start swapping pages out
into it. If we need a few pages (like 10) to manage data structures, etc,
we can just allocate those at boot, and keep them ready.
 
> Regarding #2, my algorithm (ie not the version in 2.5 at the mo)
> separates pages to be saved into 2 types. Pageset1 are pages we expect
> to be needed during suspend. Pageset2 is those that will definitely not
> be needed. My algorithm for saving the data goes: Save pageset2 pages to
> disk then (as per the original/current method) make a copy of pageset1
> pages (using the pageset2 locations + extra allocated memory if need be)
> and save the copy. Loading the image is the reverse process. Pageset 2
> currently only consists of all highmem pages + active and inactive list
> pages. 

Not sure that quite works ... don't you need PTEs to know what to swap
out? Those can be in highmem. However, all user pages would fit in pageset
2, I think.

> If we refined the algorithm, perhaps that would address your
> issue. The other point here is that since we have to be able to make a
> copy of pageset1 pages, and since I haven't inlined kmap/unmap in the
> routine to copy pageset1 pages back on resume (Pavel will say whew to
> that, I'm sure!), pageset1 has a maximum size of half normal memory. I
> reckon refining the algorithm so that pageset1 can be [nearly] guaranteed
> to always be smaller is the better area to focus on, and I'm perfectly
> happy to try suggestions, particularly when they come in the form of a
> code fragment that includes a call to SetPagePageset2(struct page * page)
> for the relevant targets :>

The more I think about this, the more it seems so much simpler to just
require a reserved swap area the size of your RAM to suspend into. Would
make the code so much simpler ... forget the option bit I suggested earlier
;-)

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-24  3:49   ` David Ford
@ 2003-04-24  6:54     ` Jörn Engel
  2003-04-24  7:01     ` Elladan
  1 sibling, 0 replies; 68+ messages in thread
From: Jörn Engel @ 2003-04-24  6:54 UTC (permalink / raw)
  To: David Ford
  Cc: Pavel Machek, Grover, Andrew, Martin J. Bligh, Nigel Cunningham,
	Marc Giger, Geert Uytterhoeven, Linux Kernel Development

On Wed, 23 April 2003 23:49:49 -0400, David Ford wrote:
> 
> I honestly don't see OOMing as an acceptable practice.  If I wanted to 
> kill a bunch of stuff just to suspend, I would have simply shut the 
> system down.  That isn't my intent or desire.  I want to suspend the 
> system just as it is without OOMing a bunch of programs.

Seconded.

Joe user would see the OOM either way, sooner with a separate suspend
partition, which would be missing as swap. But he would be seeing it
at the moment he caused it.  With suspend to swap, the OOM happens
during suspend, every time. So who is to blame?

Jörn

-- 
The cost of changing business rules is much more expensive for software
than for a secretary.
-- unknown


* Re: Fix SWSUSP & !SWAP
  2003-04-24  3:49   ` David Ford
  2003-04-24  6:54     ` Jörn Engel
@ 2003-04-24  7:01     ` Elladan
  1 sibling, 0 replies; 68+ messages in thread
From: Elladan @ 2003-04-24  7:01 UTC (permalink / raw)
  To: David Ford
  Cc: Pavel Machek, Grover, Andrew, Martin J. Bligh, Nigel Cunningham,
	Marc Giger, Geert Uytterhoeven, Linux Kernel Development

You would have already OOM-killed them if you pre-reserved space, since
instead of using that space for swap like you should have, you reserved
it for swsusp.  This means that if swsusp had to OOM-kill in the swap
scenario, the system would already have run out of memory and had to
oom-kill in the reserved partition scenario.

It's just a matter of whether the system lets you start some tasks it 
can't suspend, in which case it has to shut them down during the suspend
stage, or it doesn't let you run those tasks, in which case you've
already run out of memory and had to terminate them.

It seems to me the real thing you want here is growable swap files in
the kernel.  This would let both cases succeed, provided there's enough
disk space sitting around in some filesystem which is marked for swap.

Of course, if you run out of disk, you'd still hit OOM in either case,
and disk full too.  And of course, if swap is allowed to grow without
bound, the system will thrash itself to death instead of OOM-ing
whenever a process goes wild.
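
A back-of-the-envelope model of the equivalence argued above, as a
sketch only. It assumes the reserved area is exactly the size of RAM and
ignores page cache, kernel memory and image metadata; both function
names are made up for the illustration, and all sizes are in MiB.

```python
def runtime_oom_with_reserved_area(demand, ram, swap):
    """Reserved case: the suspend area is never available as swap."""
    return demand > ram + swap


def suspend_kill_with_shared_swap(demand, swap, extra):
    """Shared case: `extra` is used as swap, and at suspend time
    everything the workload holds must fit on disk."""
    return demand > swap + extra


ram, swap = 512, 1024
for demand in (256, 1536, 1537, 2048):
    # With extra == ram the two conditions are the same inequality.
    print(demand,
          runtime_oom_with_reserved_area(demand, ram, swap),
          suspend_kill_with_shared_swap(demand, swap, extra=ram))
```

Under those assumptions, any workload that gets killed at suspend time
with shared swap would already have hit OOM at run time with a reserved
area of the same size.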

-J

On Wed, Apr 23, 2003 at 11:49:49PM -0400, David Ford wrote:
> I honestly don't see OOMing as an acceptable practice.  If I wanted to 
> kill a bunch of stuff just to suspend, I would have simply shut the 
> system down.  That isn't my intent or desire.  I want to suspend the 
> system just as it is without OOMing a bunch of programs.
> 
> David
> 
> Pavel Machek wrote:
> 
> >Hi!
> >
> > 
> >
> >>>From: Martin J. Bligh [mailto:mbligh@aracnet.com] 
> >>>Can't you just create a pre-reserved separate swsusp area on 
> >>>disk the size 
> >>>of RAM (maybe a partition rather than a file to make things 
> >>>easier), and 
> >>>then you know you're safe (basically what Marc was 
> >>>suggesting, except pre-allocated)? Or does that make me the 
> >>>prince of all evil? ;-)
> >>>
> >>>However much swap space you allocate, it can always all be 
> >>>used, so that seems futile ...
> >>>     
> >>>
> >>This is what Other OSes do, and I believe this is the correct path.
> >>Using swap for swsusp is a clever hack but not a 100% solution.
> >>   
> >>
> >
> >Well, for normal use it's clearly inferior -- suspend partition is unused
> >when it could be used for speeding system up by swapping out unused
> >stuff.
> >
> >OtherOS approach is better because it can guarantee suspend-to-disk
> >for critical situations like overheat or battery-critical.
> >
> >But we can get best of both worlds if we OOM-kill during critical
> >suspend. [If suspend partition was not used for swapping, machine
> >would *already* have OOM-killed someone, so we are only improving stuff].
> >
> >						Pavel  
> >
> > 
> >
> 


* Re: Fix SWSUSP & !SWAP
  2003-04-24  4:37         ` Martin J. Bligh
@ 2003-04-24  7:49           ` Marc Giger
  2003-04-24  9:27           ` Pavel Machek
  1 sibling, 0 replies; 68+ messages in thread
From: Marc Giger @ 2003-04-24  7:49 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Nigel Cunningham, Pavel Machek, Grover, Andrew,
	Geert Uytterhoeven, Linux Kernel Development

Good Morning All:-)

On Wed, 23 Apr 2003 21:37:41 -0700
"Martin J. Bligh" <mbligh@aracnet.com> wrote:

> >> > I don't believe I've ever seen things get OOM killed. Instead, page
> >> > cache is discarded until things do fit.
> >> 
> >> What happens if user allocated pages are filling up all the space,
> >> not page cache? Trust me, it happens ;-)
> > 

Yes, it happens...:-)

> > Yep, just because I haven't seen it, doesn't mean a thing. :>. In that

<snip>

</snip>

> The more I think about this, the more it seems so much simpler to just
> require a reserved swap area the size of your RAM to suspend into. Would
> make the code so much simpler ... forget the option bit I suggested earlier
> ;-)

I think most people aren't interested in whether the code is simple or not... it must work!

What I'd like to see is that we can define in /etc/fstab an entry for a separate suspend partition or suspend file.

e.g. for a partition:
/dev/sda4	none	swsusp	swsp	0 0

or for a file:
/tmp/swsusp	none	swsusp	swsp	0 0

Is it possible? It's also just an idea...

Marc

Also, I wouldn't like it if my programs got OOM-killed just because swap and memory are full! The reasons I use standby and hibernation are:

1. I hate booting... (If I liked booting I would use Windows ;-))
2. I want to continue working on the last work without opening the programs again
3. I can go home faster:-)


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:49       ` Martin J. Bligh
@ 2003-04-24  9:16         ` Pavel Machek
  0 siblings, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:16 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Grover, Andrew, Nigel Cunningham, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

Hi!

> >> OK ... but at least having the *option* to have a separate reserved
> >> area would be nice, no? For most people, RAM is just a tiny amount
> >> of their disk space ... and damn, does it make the code simpler ;-)
> > 
> > If it is an *option*, it does not make code simpler.
> > 
> > And OOM-killing during suspend is just what you want. It makes suspend
> > deterministic but it might kill someone. [Well, your solution would
> > kill him sooner than that...]
> 
> OK, fair enough. I like the "activate the spare swap blob" plan.
> Seems like the best of both worlds ... people can use files or
> partitions, and all the code is already there.

Except you can not use swapfiles ;-).
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-24  4:37         ` Martin J. Bligh
  2003-04-24  7:49           ` Marc Giger
@ 2003-04-24  9:27           ` Pavel Machek
  1 sibling, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:27 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Nigel Cunningham, Grover, Andrew, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

Hi!

> > Regarding #2, my algorithm (ie not the version in 2.5 at the mo)
> > separates pages to be saved into 2 types. Pageset1 are pages we expect
> > to be needed during suspend. Pageset2 is those that will definitely not
> > be needed. My algorithm for saving the data goes: Save pageset2 pages to
> > disk then (as per the original/current method) make a copy of pageset1
> > pages (using the pageset2 locations + extra allocated memory if need be)
> > and save the copy. Loading the image is the reverse process. Pageset 2
> > currently only consists of all highmem pages + active and inactive list
> > pages. 
> 
> Not sure that quite works ... don't you need PTEs to know what to swap
> out? Those can be in highmem. However, all user pages would fit in pageset
> 2, I think.
> 
> > If we refined the algorithm, perhaps that would address your
> > issue. The other point here is that since we have to be able to make a
> > copy of pageset1 pages, and since I haven't inlined kmap/unmap in the
> > routine to copy pageset1 pages back on resume (Pavel will say whew to
> > that, I'm sure!), pageset1 has a maximum size of half normal memory. I
> > reckon refining the algorithm so that pageset1 can be [nearly] guaranteed
> > to always be smaller is the better area to focus on, and I'm perfectly
> > happy to try suggestions, particularly when they come in the form of a
> > code fragment that includes a call to SetPagePageset2(struct page * page)
> > for the relevant targets :>
> 
> The more I think about this, the more it seems so much simpler to just
> require a reserved swap area the size of your RAM to suspend into. Would
> make the code so much simpler ... forget the option bit I suggested earlier
> ;-)

Reserved area does not make anything any simpler... Remember, you want
to avoid having a second set of poll-driven disk drivers just for
suspend-to-disk. It's quite hard to design with that in mind.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-25 16:20                                                 ` Andreas Dilger
  2003-04-25 18:28                                                   ` Nigel Cunningham
@ 2003-04-27 18:59                                                   ` Pavel Machek
  1 sibling, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-27 18:59 UTC (permalink / raw)
  To: Nigel Cunningham, Andrew Morton, cat, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

Hi!

> > > Sounds like a good idea to me.
> > 
> > When I do sys_sync(), will it trigger that?
> 
> No, having sys_sync() do journal purging would really hurt journal fs
> performance.  That's why I said you need to call sync_supers_lockfs(),
> which is unfortunately not in 2.4 kernels (available in the LVM CVS
> as a patch), but it does appear to be in 2.5 kernels.  For
> journaling

(Second thought).

It's okay. We do sys_sync(). If the machine fails to resume, normal journal
recovery happens; nothing bad. It would be nice to call
sync_supers_lockfs(), but if we need to have a list of block devices for
that, it's just too much trouble. The data are safe, anyway.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25 18:28                                                   ` Nigel Cunningham
  2003-04-25 19:32                                                     ` Jamie Lokier
@ 2003-04-25 19:58                                                     ` Andreas Dilger
  1 sibling, 0 replies; 68+ messages in thread
From: Andreas Dilger @ 2003-04-25 19:58 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Andrew Morton, cat, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

On Apr 26, 2003  06:28 +1200, Nigel Cunningham wrote:
> On Sat, 2003-04-26 at 04:20, Andreas Dilger wrote:
> > On Apr 25, 2003  14:59 +0200, Pavel Machek wrote:
> > > > I don't believe Pavel was saying the image would be corrupted. Rather,
> > > > the rest of the disk contents are corrupted by replaying the journal and
> > > > then resuming back to a memory state that has been made inconsistent
> > > > with the disk state because of the journal replay.
> > > 
> > > Right.
> > 
> > But that is happening regardless of whether a swapfile is in use or not.
> > It is a problem whether the filesystem is journaling or not.  Basically,
> > if you are entering into a normal boot sequence and mounting filesystems
> > and then resuming from your saved state you risk filesystem corruption.
> > 
> > The only way to avoid that would be for the kernel to detect the swsusp
> > magic data in the swap partition _before_ any filesystems are mounted
> > (probably via initramfs), and then resume from that image (which will
> > implicitly "mount" the filesystem because it was never unmounted in
> > that image).  Then, swsusp becomes a special case of the 2-kernel-monte
> > (or add your other favourite kernel-booting-kernel method here), where
> > most of your kernel state is swapped out and only a limited recovery
> > state is loaded into RAM before doing the kernel dance).
> 
> And that's precisely what we do. SWSUSP runs before any file system is
> mounted. It checks the designated partition's swap header, loads the image
> and then completely replaces the booting kernel with the saved one.
> That's how we avoid corruption at the moment.

OK, then journal replay is again out of the picture...  It's just hard to
follow because one person says "journal replay will corrupt things", and
the other says "we aren't mounting filesystems".  No mounting == no journal
replay == no corruption.  This has nothing to do with swapfiles.

> But if we have a swapfile, we need to do some initialisation of the
> filesystem code in order to get access to our swapfile. Even if we
> record in the swapfile - while suspending - information that gives us
> the locations of blocks, we still need to find the start of the swapfile,
> or store it somewhere. If you have a suggestion in that regard, I might
> be able to see a swapfile as a possibility.

Two options:
1) You already have to store _some_ information about where to get the
   resume image from.  It sounds like you store the swap partition name
   on the kernel command-line or similar.  It should be equally possible
   to store the swap file partition name PLUS a relative offset from the
   start of the partition to where the swapfile starts.  You can easily
   get this via fstat+bmap on the swapfile.

2) You could require that the critical resume information be on a swap
   partition, but allow non-kernel data to be saved to the swapfile.
   If you can read the initial resume data from the swap partition, I'd
   have to imagine that you can read devno+block data from a swapfile
   without needing any of the filesystem code involved.  After all, the
   swap code doesn't depend on the filesystem either - it just makes a
   list of block numbers at swapfile activation time and does direct
   read/write to the device after that.
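
As a rough user-space illustration of the fstat+bmap idea in option 1, the
sketch below asks the filesystem where logical block 0 of a file lives. It
uses the FIBMAP and FIGETBSZ ioctls from <linux/fs.h>; the function names
first_block() and block_to_sector() are made up for this example, and the
FIBMAP call normally needs root (CAP_SYS_RAWIO). This is only a sketch of how
the offset could be obtained, not what swsusp does.

```python
import fcntl
import os
import struct

FIBMAP = 1    # _IO(0x00, 1) in <linux/fs.h>: logical block -> physical block
FIGETBSZ = 2  # _IO(0x00, 2) in <linux/fs.h>: filesystem block size in bytes


def first_block(path):
    """Return (st_dev, fs_block_size, physical_block) for logical block 0.

    A physical block of 0 means the block is not mapped (a hole).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)
        buf = bytearray(struct.pack("i", 0))
        fcntl.ioctl(fd, FIGETBSZ, buf)          # kernel fills in block size
        block_size = struct.unpack("i", buf)[0]
        buf = bytearray(struct.pack("i", 0))    # ask about logical block 0
        fcntl.ioctl(fd, FIBMAP, buf)            # usually requires root
        return st.st_dev, block_size, struct.unpack("i", buf)[0]
    finally:
        os.close(fd)


def block_to_sector(block, block_size, sector_size=512):
    """Convert a filesystem block number to a sector offset in the partition."""
    return block * (block_size // sector_size)
```

The device number plus the sector offset is the kind of pair that could be
handed to the resume code alongside the partition name; how it would be
passed is left open here.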

The lilo suggestion for doing the bmap could also be done, although
I don't see the need if the swap code has already mapped the entire
file in memory.  Also note - while GRUB _examines_ the
filesystem (via internal fs-aware code), I don't think it actually
mounts it or does journal replay, so it should be safe also if you
really wanted to go that way (I don't think you need to, however).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25  1:31                                       ` Hua Zhong
@ 2003-04-25 19:41                                         ` H. Peter Anvin
  0 siblings, 0 replies; 68+ messages in thread
From: H. Peter Anvin @ 2003-04-25 19:41 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <CDEDIMAGFBEBKHDJPCLDMEBCDMAA.hzhong@cisco.com>
By author:    "Hua Zhong" <hzhong@cisco.com>
In newsgroup: linux.dev.kernel
>
> That would be nice; GRUB would then no longer have problems with a
> journaling filesystem on the boot partition.
> 

Not just GRUB, but just about any boot loader which doesn't use
bmap().  bmap() is apparently special-cased in ext3, but still...

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25 18:28                                                   ` Nigel Cunningham
@ 2003-04-25 19:32                                                     ` Jamie Lokier
  2003-04-25 19:58                                                     ` Andreas Dilger
  1 sibling, 0 replies; 68+ messages in thread
From: Jamie Lokier @ 2003-04-25 19:32 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Andreas Dilger, Pavel Machek, Andrew Morton, cat, mbligh,
	gigerstyle, geert, Linux Kernel Mailing List

Nigel Cunningham wrote:
> Even if we record in the swapfile - while
> suspending - information that gives us the locations of blocks, we still
> need to find the start of the swapfile, or store it somewhere.

If LILO can do it...

The starting block number does not change unless you delete and
recreate the swapfile.  It has the same requirement for location
stability as the kernel image and initrd for LILO.

So, a patched LILO could record the start of the swapfile when it is
run.  Note that you do this as usual when installing a kernel or when
installing a new swapfile.  There's no need to do it when suspending.

If you're booting with Grub or some other fs-aware bootloader, then
the bootloader can presumably find the start of the swapfile just like
it finds the kernel and initrd.
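
A small sketch of the "record it at install time" idea, assuming a made-up
record file and made-up helper names (identity, record_swapfile,
recorded_start). The start block is passed in as a plain number; how it is
obtained is outside this sketch. The record is treated as stale whenever the
swapfile's device, inode, or size no longer match what was seen at install
time.

```python
import json
import os


def identity(path):
    """Cheap fingerprint of a file: device, inode, and size."""
    st = os.stat(path)
    return {"dev": st.st_dev, "ino": st.st_ino, "size": st.st_size}


def record_swapfile(path, start_block, record_path):
    """Run at install time: remember where the swapfile starts."""
    entry = {"path": path, "start_block": start_block, "id": identity(path)}
    with open(record_path, "w") as f:
        json.dump(entry, f)


def recorded_start(record_path):
    """Return the recorded start block, or None if the record looks stale."""
    with open(record_path) as f:
        entry = json.load(f)
    try:
        current = identity(entry["path"])
    except FileNotFoundError:
        return None
    return entry["start_block"] if current == entry["id"] else None
```

The fingerprint is only a heuristic: a swapfile recreated with the same inode
number and size would not be noticed, so the installer still has to be re-run
after recreating the file, just as with a new kernel image under LILO.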

-- Jamie

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25 16:20                                                 ` Andreas Dilger
@ 2003-04-25 18:28                                                   ` Nigel Cunningham
  2003-04-25 19:32                                                     ` Jamie Lokier
  2003-04-25 19:58                                                     ` Andreas Dilger
  2003-04-27 18:59                                                   ` Pavel Machek
  1 sibling, 2 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-25 18:28 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Pavel Machek, Andrew Morton, cat, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

On Sat, 2003-04-26 at 04:20, Andreas Dilger wrote:
> On Apr 25, 2003  14:59 +0200, Pavel Machek wrote:
> > > I don't believe Pavel was saying the image would be corrupted. Rather,
> > > the rest of the disk contents are corrupted by replaying the journal and
> > > then resuming back to a memory state that has been made inconsistent
> > > with the disk state because of the journal replay.
> > 
> > Right.
> 
> But that is happening regardless of whether a swapfile is in use or not.
> It is a problem whether the filesystem is journaling or not.  Basically,
> if you are entering into a normal boot sequence and mounting filesystems
> and then resuming from your saved state you risk filesystem corruption.
> 
> The only way to avoid that would be for the kernel to detect the swsusp
> magic data in the swap partition _before_ any filesystems are mounted
> (probably via initramfs), and then resume from that image (which will
> implicitly "mount" the filesystem because it was never unmounted in
> that image).  Then, swsusp becomes a special case of the 2-kernel-monte
> (or add your other favourite kernel-booting-kernel method here), where
> most of your kernel state is swapped out and only a limited recovery
> state is loaded into RAM before doing the kernel dance).

And that's precisely what we do. SWSUSP runs before any file system is
mounted. It checks the designated partition's swap header, loads the image
and then completely replaces the booting kernel with the saved one.
That's how we avoid corruption at the moment. But if we have a swapfile,
we need to do some initialisation of the filesystem code in order to get
access to our swapfile. Even if we record in the swapfile - while
suspending - information that gives us the locations of blocks, we still
need to find the start of the swapfile, or store it somewhere. If you
have a suggestion in that regard, I might be able to see a swapfile as a
possibility.
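
To make the "checks the swap header" step concrete, here is a minimal sketch
of classifying the first page of a swap area by its signature. The ordinary
swap signatures ("SWAP-SPACE" and "SWAPSPACE2", stored in the last 10 bytes
of the first page) are the standard ones; the suspend signature SUSPEND_SIG
below is a hypothetical placeholder, the 4096-byte page size is an
assumption, and classify_header() is a name invented for this example.

```python
PAGE_SIZE = 4096                       # assumption: 4 KiB pages
SWAP_SIGS = (b"SWAP-SPACE", b"SWAPSPACE2")
SUSPEND_SIG = b"SUSPENDIMG"            # hypothetical placeholder signature


def classify_header(page):
    """Decide what the first page of a swap area describes.

    Returns "resume" if a suspend image is flagged, "swap" for a normal
    swap area, and "unknown" otherwise.
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    signature = page[-10:]             # signature sits at the end of the page
    if signature == SUSPEND_SIG:
        return "resume"
    if signature in SWAP_SIGS:
        return "swap"
    return "unknown"
```

Only the "resume" case would go on to load the image; in the other two cases
the boot simply continues as normal.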

Regards,

Nigel
-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25 12:59                                               ` Pavel Machek
@ 2003-04-25 16:20                                                 ` Andreas Dilger
  2003-04-25 18:28                                                   ` Nigel Cunningham
  2003-04-27 18:59                                                   ` Pavel Machek
  0 siblings, 2 replies; 68+ messages in thread
From: Andreas Dilger @ 2003-04-25 16:20 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andrew Morton, cat, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

On Apr 25, 2003  14:59 +0200, Pavel Machek wrote:
> > I don't believe Pavel was saying the image would be corrupted. Rather,
> > the rest of the disk contents are corrupted by replaying the journal and
> > then resuming back to a memory state that has been made inconsistent
> > with the disk state because of the journal replay.
> 
> Right.

But that is happening regardless of whether a swapfile is in use or not.
It is a problem whether the filesystem is journaling or not.  Basically,
if you are entering into a normal boot sequence and mounting filesystems
and then resuming from your saved state you risk filesystem corruption.

The only way to avoid that would be for the kernel to detect the swsusp
magic data in the swap partition _before_ any filesystems are mounted
(probably via initramfs), and then resume from that image (which will
implicitly "mount" the filesystem because it was never unmounted in
that image).  Then, swsusp becomes a special case of the 2-kernel-monte
(or add your other favourite kernel-booting-kernel method here), where
most of your kernel state is swapped out and only a limited recovery
state is loaded into RAM before doing the kernel dance).

> > > If that is the case, then the only way to avoid this would be to call
> > > sync_supers_lockfs() on each filesystem before the suspend, which will
> > > force the journal to be empty when it returns.  That API is supported
> > > by all of the journaling filesystems, and is probably a good thing to
> > > do anyways, as it will potentially free a lot of dirty data from RAM,
> > > and also ensure that the on-disk data is consistent in case the resume
> > > isn't handled gracefully.
> > 
> > Sounds like a good idea to me.
> 
> When I do sys_sync(), will it trigger that?

No, having sys_sync() do journal purging would really hurt journal fs
performance.  That's why I said you need to call sync_supers_lockfs(),
which is unfortunately not in 2.4 kernels (available in the LVM CVS
as a patch), but it does appear to be in 2.5 kernels.  For journaling
filesystems this is the equivalent of temporarily unmounting the
filesystem and then remounting it when unlockfs() is called.
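
A toy model of what that lock/unlock pair means for a journaling filesystem,
purely for illustration. ToyJournaledFs and its methods are invented here and
do not correspond to any kernel interface; in particular, a real filesystem
would make writers wait while locked rather than raise an error.

```python
class ToyJournaledFs:
    """Writes go to a journal first and reach the main area on checkpoint."""

    def __init__(self):
        self.disk = {}        # main on-disk area
        self.journal = []     # pending (key, value) updates
        self.locked = False

    def write(self, key, value):
        if self.locked:
            # Simplification: a real filesystem would block the writer.
            raise RuntimeError("filesystem is locked")
        self.journal.append((key, value))

    def checkpoint(self):
        """Apply every journal entry to the main area and empty the journal."""
        for key, value in self.journal:
            self.disk[key] = value
        self.journal.clear()

    def lockfs(self):
        """Empty the journal and refuse new writes until unlockfs()."""
        self.checkpoint()
        self.locked = True

    def unlockfs(self):
        self.locked = False
```

After lockfs() the journal is empty and the main area is self-consistent, so
a later mount of the same disk would have nothing to replay.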

Note that you must explicitly pass a block device to lock, so that you
do this before shutting down the disk device.  I've never tested, but
locking the filesystem probably does not prevent writing to an existing
swapfile on that filesystem, since the swap code bypasses the filesystem
entirely.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25  1:09                                             ` Nigel Cunningham
@ 2003-04-25 12:59                                               ` Pavel Machek
  2003-04-25 16:20                                                 ` Andreas Dilger
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-25 12:59 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Andreas Dilger, Pavel Machek, Andrew Morton, cat, mbligh,
	gigerstyle, geert, Linux Kernel Mailing List

Hi!

> > On Apr 24, 2003  22:48 +0200, Pavel Machek wrote:
> > OK, then why all of the talk earlier saying that journal recovery will
> > corrupt a swapfile?  That was the reason journaling was brought into the
> > discussion in the first place:
> > 
> > 	"And now you have kernel which expects data still in journal (that was
> > 	 state before suspend), but reality on disk is quite different (journal
> > 	 was replayed). Data corruption." -- Pavel
> 
> I don't believe Pavel was saying the image would be corrupted. Rather,
> the rest of the disk contents are corrupted by replaying the journal and
> then resuming back to a memory state that has been made inconsistent
> with the disk state because of the journal replay.

Right.

> > If that is the case, then the only way to avoid this would be to call
> > sync_super_lockfs() on each filesystem before the suspend, which will
> > force the journal to be empty when it returns.  That API is supported
> > by all of the journaling filesystems, and is probably a good thing to
> > do anyways, as it will potentially free a lot of dirty data from RAM,
> > and also ensure that the on-disk data is consistent in case the resume
> > isn't handled gracefully.
> 
> Sounds like a good idea to me.

When I do sys_sync(), will it trigger that?
								Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25  4:27                                       ` Andreas Dilger
@ 2003-04-25  4:33                                         ` H. Peter Anvin
  0 siblings, 0 replies; 68+ messages in thread
From: H. Peter Anvin @ 2003-04-25  4:33 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-kernel

Andreas Dilger wrote:
> 
> This "supersync" already exists, and it is supported by all of the
> journaling filesystems for LVM snapshots.  This is the VFS method
> write_super_lockfs in the ext3/reiserfs/XFS/JFS super_operations.
> Not only does it sync the dirty data to disk, but it also forces
> the journal to be empty and marks the filesystem clean, so that it
> can be snapshotted and read-only mounted (basically equivalent to
> unmounting the filesystem).
> 
> Unfortunately, even though the filesystems themselves have supported
> this VFS method for a long time, the actual code that calls these
> methods (sync_super_lockfs() and unlockfs()) is still only available
> as a patch from LVM.  The LVM/reiserfs folks have talked about submitting
> it to Marcelo for a long time now, but apparently still haven't done so.
> 

I really think this should be made available.  Perhaps we should have a 
sync1() system call which takes a flag set.  Then we could have 
sync1(SYNC_FLUSH_JOURNALS);

	-hpa


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25  1:22                                     ` H. Peter Anvin
  2003-04-25  1:19                                       ` Nigel Cunningham
  2003-04-25  1:31                                       ` Hua Zhong
@ 2003-04-25  4:27                                       ` Andreas Dilger
  2003-04-25  4:33                                         ` H. Peter Anvin
  2 siblings, 1 reply; 68+ messages in thread
From: Andreas Dilger @ 2003-04-25  4:27 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

On Apr 24, 2003  18:22 -0700, H. Peter Anvin wrote:
> By author:    Geert Uytterhoeven <geert@linux-m68k.org>
> > This has nothing to do with using a swapfile.
> > 
> > But if you resume from swsusp, you don't really `mount' all file systems.
> > They are implicitly mounted because they were mounted before the suspend
> > operation.
> 
> Shouldn't we be syncing them all before the suspend anyway, to
> minimize corruption in case the user chooses to mount the filesystem
> *without* resuming (think a dual-boot configuration.)  This would be
> another application for the "supersync" operation that was discussed
> at OLS 2002 -- a need for an operation which not only flushes all
> blocks to disk but also forces the journal to be replayed and
> truncated.

This "supersync" already exists, and it is supported by all of the
journaling filesystems for LVM snapshots.  This is the VFS method
write_super_lockfs in the ext3/reiserfs/XFS/JFS super_operations.
Not only does it sync the dirty data to disk, but it also forces
the journal to be empty and marks the filesystem clean, so that it
can be snapshotted and read-only mounted (basically equivalent to
unmounting the filesystem).

Unfortunately, even though the filesystems themselves have supported
this VFS method for a long time, the actual code that calls these
methods (sync_super_lockfs() and unlockfs()) is still only available
as a patch from LVM.  The LVM/reiserfs folks have talked about submitting
it to Marcelo for a long time now, but apparently still haven't done so.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: Fix SWSUSP & !SWAP
  2003-04-25  1:22                                     ` H. Peter Anvin
  2003-04-25  1:19                                       ` Nigel Cunningham
@ 2003-04-25  1:31                                       ` Hua Zhong
  2003-04-25 19:41                                         ` H. Peter Anvin
  2003-04-25  4:27                                       ` Andreas Dilger
  2 siblings, 1 reply; 68+ messages in thread
From: Hua Zhong @ 2003-04-25  1:31 UTC (permalink / raw)
  To: H. Peter Anvin, linux-kernel

That would be nice; GRUB would then no longer have problems with a
journaling filesystem on the boot partition.

> Shouldn't we be syncing them all before the suspend anyway, to
> minimize corruption in case the user chooses to mount the filesystem
> *without* resuming (think a dual-boot configuration.)  This would be
> another application for the "supersync" operation that was discussed
> at OLS 2002 -- a need for an operation which not only flushes all
> blocks to disk but also forces the journal to be replayed and
> truncated.
>
> 	-hpa
>
> --
> <hpa@transmeta.com> at work, <hpa@zytor.com> in private!
> "Unix gives you enough rope to shoot yourself in the foot."
> Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 11:36                                   ` Geert Uytterhoeven
@ 2003-04-25  1:22                                     ` H. Peter Anvin
  2003-04-25  1:19                                       ` Nigel Cunningham
                                                         ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: H. Peter Anvin @ 2003-04-25  1:22 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <Pine.GSO.4.21.0304241335210.19942-100000@vervain.sonytel.be>
By author:    Geert Uytterhoeven <geert@linux-m68k.org>
In newsgroup: linux.dev.kernel
>
> On Thu, 24 Apr 2003, Nigel Cunningham wrote:
> > On Thu, 2003-04-24 at 21:46, Andrew Morton wrote:
> > > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > > 
> > > > > 1) suspend writes pages to disk
> > > > > 
> > > > > 2) machine is shutdown
> > > > > 
> > > > > 3) restart, journal replay
> > 
> > Corruption comes here. The journal replay tidies things up that shouldn't
> > be tidied up. They shouldn't be tidied up because once we reload the
> > image, things should be in the same state as prior to suspend. If replay
> > frees a block (thinking it wasn't properly linked or something similar),
> > it introduces corruption.
> 
> This has nothing to do with using a swapfile.
> 
> But if you resume from swsusp, you don't really `mount' all file systems. They
> are implicitly mounted because they were mounted before the suspend operation.
> 

Shouldn't we be syncing them all before the suspend anyway, to
minimize corruption in case the user chooses to mount the filesystem
*without* resuming (think a dual-boot configuration.)  This would be
another application for the "supersync" operation that was discussed
at OLS 2002 -- a need for an operation which not only flushes all
blocks to disk but also forces the journal to be replayed and
truncated.

	-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-25  1:22                                     ` H. Peter Anvin
@ 2003-04-25  1:19                                       ` Nigel Cunningham
  2003-04-25  1:31                                       ` Hua Zhong
  2003-04-25  4:27                                       ` Andreas Dilger
  2 siblings, 0 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-25  1:19 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linux Kernel Mailing List

They are synced, the journal is just not truncated.

Regards,

Nigel

On Fri, 2003-04-25 at 13:22, H. Peter Anvin wrote: 
> Shouldn't we be syncing them all before the suspend anyway, to
> minimize corruption in case the user chooses to mount the filesystem
> *without* resuming (think a dual-boot configuration.)  This would be
> another application for the "supersync" operation that was discussed
> at OLS 2002 -- a need for an operation which not only flushes all
> blocks to disk but also forces the journal to be replayed and
> truncated.
> 
> 	-hpa



-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 21:46                                           ` Andreas Dilger
@ 2003-04-25  1:09                                             ` Nigel Cunningham
  2003-04-25 12:59                                               ` Pavel Machek
  0 siblings, 1 reply; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-25  1:09 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Pavel Machek, Andrew Morton, cat, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

On Fri, 2003-04-25 at 09:46, Andreas Dilger wrote:
> On Apr 24, 2003  22:48 +0200, Pavel Machek wrote:
> OK, then why all of the talk earlier saying that journal recovery will
> corrupt a swapfile?  That was the reason journaling was brought into the
> discussion in the first place:
> 
> 	"And now you have kernel which expects data still in journal (that was
> 	 state before suspend), but reality on disk is quite different (journal
> 	 was replayed). Data corruption." -- Pavel

I don't believe Pavel was saying the image would be corrupted. Rather,
the rest of the disk contents are corrupted by replaying the journal and
then resuming back to a memory state that has been made inconsistent
with the disk state because of the journal replay.

> If the filesystem was not unmounted and remounted, then no replay will happen.  
> End of story.  If the suspend code is doing something like:
> 	
> 	1) save memory contents to disk
> 	2) suspend/power off
> 	3) reboot kernel, mount filesystem(s), etc

Step 3 is just a kernel reboot. Filesystems aren't mounted before (4).

> 	4) check for presence of suspend image
> 	5) replace currently-running kernel with suspended kernel
> 
> Then you are in for a world of hurt regardless of whether the data is in a
> swap file or a swap partition.  The data in the swapfile isn't touched by
> journal replay at all (so that is safe regardless), but the rest of the
> filesystem is, which could cause strange disk corruption since the in-memory
> data doesn't match the on-disk data.

On the second part, "Precisely."

> If that is the case, then the only way to avoid this would be to call
> sync_super_lockfs() on each filesystem before the suspend, which will
> force the journal to be empty when it returns.  That API is supported
> by all of the journaling filesystems, and is probably a good thing to
> do anyways, as it will potentially free a lot of dirty data from RAM,
> and also ensure that the on-disk data is consistent in case the resume
> isn't handled gracefully.

Sounds like a good idea to me.

Regards,

Nigel



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 20:48                                         ` Pavel Machek
@ 2003-04-24 21:46                                           ` Andreas Dilger
  2003-04-25  1:09                                             ` Nigel Cunningham
  0 siblings, 1 reply; 68+ messages in thread
From: Andreas Dilger @ 2003-04-24 21:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Andrew Morton, Nigel Cunningham, cat, mbligh, gigerstyle, geert,
	linux-kernel

On Apr 24, 2003  22:48 +0200, Pavel Machek wrote:
> Andreas Dilger wrote:
> > Wouldn't that be true in all cases where you have a journaled filesystem
> > and you suspend?  
> 
> No. In normal state you restore without mounting anything (thus no
> journal replay).
> 
> > Are you talking about someone restarting system (without
> > doing a resume), mounting the file system (causing journal replay), and
> > then shutting down and going back to the suspended image?
> 
> No, I'm not.
> 
> > I think the important thing to note is that if you don't unmount the
> > filesystem during suspend, then no journal recovery will take place
> > at resume time, because you are not really mounting the filesystem
> > at all.  And I can't see how you could be unmounting the filesystems
> > without killing all of the applications, at which point it would make
> > suspend pretty useless.
> 
> No, I'm not unmounting it.

OK, then why all of the talk earlier saying that journal recovery will
corrupt a swapfile?  That was the reason journaling was brought into the
discussion in the first place:

	"And now you have kernel which expects data still in journal (that was
	 state before suspend), but reality on disk is quite different (journal
	 was replayed). Data corruption." -- Pavel

If the filesystem was not unmounted and remounted, then no replay will happen.  
End of story.  If the suspend code is doing something like:
	
	1) save memory contents to disk
	2) suspend/power off
	3) reboot kernel, mount filesystem(s), etc
	4) check for presence of suspend image
	5) replace currently-running kernel with suspended kernel

Then you are in for a world of hurt regardless of whether the data is in a
swap file or a swap partition.  The data in the swapfile isn't touched by
journal replay at all (so that is safe regardless), but the rest of the
filesystem is, which could cause strange disk corruption since the in-memory
data doesn't match the on-disk data.
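
The failure mode in that sequence can be shown with a toy model. Everything
here is invented for illustration (a dict standing in for the disk, a block
bitmap as a set, a tiny file table); it is not how any real filesystem or
swsusp stores its state. The point is only that a resumed kernel writes back
what it had cached, without knowing the disk changed underneath it.

```python
import copy


def mount_with_replay(disk):
    """A normal mount after an unclean shutdown: apply and empty the journal."""
    for block, contents in disk["journal"]:
        disk["blocks"][block] = contents
    disk["journal"] = []


def resume(disk, image):
    """The saved kernel comes back and writes out what it had cached."""
    for block, contents in image["cache"].items():
        disk["blocks"][block] = copy.deepcopy(contents)


def make_suspended_system():
    """Disk plus memory image as they look right after suspend."""
    disk = {
        "blocks": {"bitmap": {1, 2, 3}, "files": {"a": 3}},
        "journal": [("bitmap", {1, 2, 3, 4}), ("files", {"a": 3, "b": 4})],
    }
    image = {"cache": {"bitmap": {1, 2, 3, 4}}}   # kernel's cached bitmap
    return disk, image


def dangling_files(disk):
    """Files whose block the on-disk bitmap considers free."""
    used = disk["blocks"]["bitmap"]
    return {name for name, blk in disk["blocks"]["files"].items()
            if blk not in used}


def safe_path():
    """Resume before anything is mounted: the journal is left alone."""
    disk, image = make_suspended_system()
    resume(disk, image)
    return disk


def unsafe_path():
    """Mount (replaying the journal), write something, then resume anyway."""
    disk, image = make_suspended_system()
    mount_with_replay(disk)
    disk["blocks"]["bitmap"].add(5)      # intermediate boot allocates block 5
    disk["blocks"]["files"]["c"] = 5     # ... for a new file "c"
    resume(disk, image)                  # stale bitmap overwrites the new one
    return disk
```

In the unsafe path the file "c" ends up pointing at a block the bitmap calls
free, so a later allocation could hand the same block to another file. In
the safe path the journal is still exactly what the resumed kernel expects.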

If that is the case, then the only way to avoid this would be to call
sync_super_lockfs() on each filesystem before the suspend, which will
force the journal to be empty when it returns.  That API is supported
by all of the journaling filesystems, and is probably a good thing to
do anyways, as it will potentially free a lot of dirty data from RAM,
and also ensure that the on-disk data is consistent in case the resume
isn't handled gracefully.

> > What is also important to note is that during normal filesystem operation,
> > the ext3 journaling code never reads back any data from the journal, with
> > the exception of a couple of fields in the journal superblock.  I would
> > hazard a guess that if you did a suspend, wiped the journal, and then
> > resumed the journaling code wouldn't know the difference.
> 
> That's possible, but I do not want suspend to know about filesystem
> specifics.

This was just given as a counterexample saying that the in-kernel ext3
journal code does not actually care what is in the journal on disk, since
it is never read unless there is a crash.  So, the suspend code does not
need to know anything about the filesystem specifics at all in this regard.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 16:37                                       ` Andreas Dilger
@ 2003-04-24 20:48                                         ` Pavel Machek
  2003-04-24 21:46                                           ` Andreas Dilger
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-24 20:48 UTC (permalink / raw)
  To: Andrew Morton, Nigel Cunningham, cat, mbligh, gigerstyle, geert,
	linux-kernel

Hi!

> > > > > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > > > > 
> > > > > > > 1) suspend writes pages to disk
> > > > > > > 
> > > > > > > 2) machine is shutdown
> > > > > > > 
> > > > > > > 3) restart, journal replay
> > > > 
> > > > Corruption comes here. The journal replay tidies things up that shouldn't
> > > > be tidied up. They shouldn't be tidied up because once we reload the
> > > > image, things should be in the same state as prior to suspend. If replay
> > > > frees a block (thinking it wasn't properly linked or something similar),
> > > > it introduces corruption.
> > > 
> > > No, this will not happen.  All swapfile blocks must be allocated by swapon
> > > time.  It is just a chunk of disk and replay will not touch it.
> > > 
> > > That's for ext3, and no other filesystems journal data anyway...
> > 
> > It's not about data.
> > 
> > Corruption is not in suspended image. Imagine you have running system
> > (X open, applications running, gcc compiling) and someone runs journal
> > replay. Bye bye data. And that's what happens there. When you restore,
> > restored kernel no longer knows you did replay.
> 
> Wouldn't that be true in all cases where you have a journaled filesystem
> and you suspend?  

No. In normal state you restore without mounting anything (thus no
journal replay).

> Are you talking about someone restarting system (without
> doing a resume), mounting the file system (causing journal replay), and
> then shutting down and going back to the suspended image?

No, I'm not.

> I think the important thing to note is that if you don't unmount the
> filesystem during suspend, then no journal recovery will take place
> at resume time, because you are not really mounting the filesystem
> at all.  And I can't see how you could be unmounting the filesystems
> without killing all of the applications, at which point it would make
> suspend pretty useless.

No, I'm not unmounting it.

> What is also important to note is that during normal filesystem operation,
> the ext3 journaling code never reads back any data from the journal, with
> the exception of a couple of fields in the journal superblock.  I would
> hazard a guess that if you did a suspend, wiped the journal, and then
> resumed the journaling code wouldn't know the difference.

That's possible, but I do not want suspend to know about filesystem
specifics.

								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 14:26                                     ` Pavel Machek
@ 2003-04-24 16:37                                       ` Andreas Dilger
  2003-04-24 20:48                                         ` Pavel Machek
  0 siblings, 1 reply; 68+ messages in thread
From: Andreas Dilger @ 2003-04-24 16:37 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Andrew Morton, Nigel Cunningham, cat, mbligh, gigerstyle, geert,
	linux-kernel

On Apr 24, 2003  16:26 +0200, Pavel Machek wrote:
> > > > > Andrew Morton wrote:
> > > > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > > > 
> > > > > > 1) suspend writes pages to disk
> > > > > > 
> > > > > > 2) machine is shutdown
> > > > > > 
> > > > > > 3) restart, journal replay
> > > 
> > > Corruption comes here. The journal replay tidies things up that shouldn't
> > > be tidied up. They shouldn't be tidied up because once we reload the
> > > image, things should be in the same state as prior to suspend. If replay
> > > frees a block (thinking it wasn't properly linked or something similar),
> > > it introduces corruption.
> > 
> > No, this will not happen.  All swapfile blocks must be allocated by swapon
> > time.  It is just a chunk of disk and replay will not touch it.
> > 
> > That's for ext3, and no other filesystems journal data anyway...
> 
> It's not about data.
> 
> Corruption is not in suspended image. Imagine you have running system
> (X open, applications running, gcc compiling) and someone runs journal
> replay. Bye bye data. And that's what happens there. When you restore,
> restored kernel no longer knows you did replay.

Wouldn't that be true in all cases where you have a journaled filesystem
and you suspend?  Are you talking about someone restarting system (without
doing a resume), mounting the file system (causing journal replay), and
then shutting down and going back to the suspended image?

I think the important thing to note is that if you don't unmount the
filesystem during suspend, then no journal recovery will take place
at resume time, because you are not really mounting the filesystem
at all.  And I can't see how you could be unmounting the filesystems
without killing all of the applications, at which point it would make
suspend pretty useless.

What is also important to note is that during normal filesystem operation,
the ext3 journaling code never reads back any data from the journal, with
the exception of a couple of fields in the journal superblock.  I would
hazard a guess that if you did a suspend, wiped the journal, and then
resumed the journaling code wouldn't know the difference.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:25                   ` Pavel Machek
  2003-04-24  9:01                     ` Andrew Morton
  2003-04-24  9:05                     ` Jamie Lokier
@ 2003-04-24 15:22                     ` Carl-Daniel Hailfinger
  2 siblings, 0 replies; 68+ messages in thread
From: Carl-Daniel Hailfinger @ 2003-04-24 15:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Andrew Morton, mbligh, ncunningham, gigerstyle, geert, linux-kernel

Pavel Machek wrote:
> Hi!
>
>
>>>If you really want to "solve" it reliably, you can always
>>>
>>>swapon /dev/hdfoo666
>>>
>>
>>Seems that using a swapfile instead of a swapdev would fix that neatly.
>>
>>But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
>>what has to be done to get it working?
>
>
> Swapfile does not work, because even readonly mount wants to replay
> logs, and that'd be disk corruption.
>
> It could be doable with modifications to the filesystems, but it would
> be hard (and I do not think it is worth it).
> 								Pavel
>
Any non-journaling filesystem should work out fine. And if you don't
trust ext2, you can still use xiafs once I've finished porting it to 2.5

Carl-Daniel
-- 
http://www.hailfinger.org/


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 11:36                                   ` Andrew Morton
@ 2003-04-24 14:26                                     ` Pavel Machek
  2003-04-24 16:37                                       ` Andreas Dilger
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-24 14:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nigel Cunningham, pavel, cat, mbligh, gigerstyle, geert, linux-kernel

Hi!

> > > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > > 
> > > > > 1) suspend writes pages to disk
> > > > > 
> > > > > 2) machine is shutdown
> > > > > 
> > > > > 3) restart, journal replay
> > 
> > Corruption comes here. The journal replay tidies things up that shouldn't
> > be tidied up. They shouldn't be tidied up because once we reload the
> > image, things should be in the same state as prior to suspend. If replay
> > frees a block (thinking it wasn't properly linked or something similar),
> > it introduces corruption.
> 
> No, this will not happen.  All swapfile blocks must be allocated by swapon
> time.  It is just a chunk of disk and replay will not touch it.
> 
> That's for ext3, and no other filesystems journal data anyway...

It's not about data.

Corruption is not in suspended image. Imagine you have running system
(X open, applications running, gcc compiling) and someone runs journal
replay. Bye bye data. And that's what happens there. When you restore,
restored kernel no longer knows you did replay.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 11:13                                 ` Nigel Cunningham
  2003-04-24 11:36                                   ` Andrew Morton
@ 2003-04-24 11:36                                   ` Geert Uytterhoeven
  2003-04-25  1:22                                     ` H. Peter Anvin
  1 sibling, 1 reply; 68+ messages in thread
From: Geert Uytterhoeven @ 2003-04-24 11:36 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Andrew Morton, Pavel Machek, cat, mbligh, gigerstyle,
	Linux Kernel Mailing List

On Thu, 24 Apr 2003, Nigel Cunningham wrote:
> On Thu, 2003-04-24 at 21:46, Andrew Morton wrote:
> > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > 
> > > > 1) suspend writes pages to disk
> > > > 
> > > > 2) machine is shutdown
> > > > 
> > > > 3) restart, journal replay
> 
> Corruption comes here. The journal replay tidies things up that shouldn't
> be tidied up. They shouldn't be tidied up because once we reload the
> image, things should be in the same state as prior to suspend. If replay
> frees a block (thinking it wasn't properly linked or something similar),
> it introduces corruption.

This has nothing to do with using a swapfile.

But if you resume from swsusp, you don't really `mount' all file systems. They
are implicitly mounted because they were mounted before the suspend operation.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24 11:13                                 ` Nigel Cunningham
@ 2003-04-24 11:36                                   ` Andrew Morton
  2003-04-24 14:26                                     ` Pavel Machek
  2003-04-24 11:36                                   ` Geert Uytterhoeven
  1 sibling, 1 reply; 68+ messages in thread
From: Andrew Morton @ 2003-04-24 11:36 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: pavel, cat, mbligh, gigerstyle, geert, linux-kernel

Nigel Cunningham <ncunningham@clear.net.nz> wrote:
>
> On Thu, 2003-04-24 at 21:46, Andrew Morton wrote:
> > > > Sorry, I still don't get it.  Go through the steps for me:
> > > > 
> > > > 1) suspend writes pages to disk
> > > > 
> > > > 2) machine is shutdown
> > > > 
> > > > 3) restart, journal replay
> 
> Corruption comes here. The journal replay tidies things up that shouldn't
> be tidied up. They shouldn't be tidied up because once we reload the
> image, things should be in the same state as prior to suspend. If replay
> frees a block (thinking it wasn't properly linked or something similar),
> it introduces corruption.

No, this will not happen.  All swapfile blocks must be allocated by swapon
time.  It is just a chunk of disk and replay will not touch it.

That's for ext3, and no other filesystems journal data anyway...

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:46                               ` Andrew Morton
@ 2003-04-24 11:13                                 ` Nigel Cunningham
  2003-04-24 11:36                                   ` Andrew Morton
  2003-04-24 11:36                                   ` Geert Uytterhoeven
  0 siblings, 2 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-24 11:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pavel Machek, cat, mbligh, gigerstyle, geert, Linux Kernel Mailing List

On Thu, 2003-04-24 at 21:46, Andrew Morton wrote:
> > > Sorry, I still don't get it.  Go through the steps for me:
> > > 
> > > 1) suspend writes pages to disk
> > > 
> > > 2) machine is shutdown
> > > 
> > > 3) restart, journal replay

Corruption comes here. The journal replay tidies things up that shouldn't
be tidied up. They shouldn't be tidied up because once we reload the
image, things should be in the same state as prior to suspend. If replay
frees a block (thinking it wasn't properly linked or something similar),
it introduces corruption.

> swapfiles are not journalled - the swap a_ops write direct to the swapfile's
> blocks with submit_bio().  Journal replay wouldn't touch the swapfile.
> 
> I can see that locating the swapfile for the resume-time swapon could be a
> problem, but the corruption thing still escapes me.

Regards,

Nigel

-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:35                             ` Pavel Machek
@ 2003-04-24  9:46                               ` Andrew Morton
  2003-04-24 11:13                                 ` Nigel Cunningham
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  9:46 UTC (permalink / raw)
  To: Pavel Machek; +Cc: ncunningham, cat, mbligh, gigerstyle, geert, linux-kernel

Pavel Machek <pavel@suse.cz> wrote:
>
> Hi!
> 
> > > No, ext3 will be "unclean" during resume (you can't really unmount it
> > > during suspend!) and r-o mounting of ext3 will replay journal and
> > > cause data corruption.
> > 
> > Sorry, I still don't get it.  Go through the steps for me:
> > 
> > 1) suspend writes pages to disk
> > 
> > 2) machine is shutdown
> > 
> > 3) restart, journal replay
> > 
> > 4) resume reads pages from disk.
> 
> And now you have kernel which expects data still in journal (that was
> state before suspend), but reality on disk is quite different (journal
> was replayed). Data corruption.

swapfiles are not journalled - the swap a_ops write direct to the swapfile's
blocks with submit_bio().  Journal replay wouldn't touch the swapfile.

I can see that locating the swapfile for the resume-time swapon could be a
problem, but the corruption thing still escapes me.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:25                           ` Andrew Morton
@ 2003-04-24  9:35                             ` Pavel Machek
  2003-04-24  9:46                               ` Andrew Morton
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ncunningham, cat, mbligh, gigerstyle, geert, linux-kernel

Hi!

> > No, ext3 will be "unclean" during resume (you can't really unmount it
> > during suspend!) and r-o mounting of ext3 will replay journal and
> > cause data corruption.
> 
> Sorry, I still don't get it.  Go through the steps for me:
> 
> 1) suspend writes pages to disk
> 
> 2) machine is shutdown
> 
> 3) restart, journal replay
> 
> 4) resume reads pages from disk.

And now you have kernel which expects data still in journal (that was
state before suspend), but reality on disk is quite different (journal
was replayed). Data corruption.

Simple enough steps?
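
A toy model of the four steps above, as a minimal sketch only. The
`disk` dict, the `journal` list and the `replay` helper are all
hypothetical and stand in for the real ext3 structures only loosely.
It shows the resumed kernel holding a view of the journal that no
longer matches what is on disk:

```python
import copy


def replay(disk, journal):
    """Apply every pending journal record to the disk, then empty the journal."""
    for block, value in journal:
        disk[block] = value
    journal.clear()


# State at suspend time: one transaction is in the journal but has not
# yet been written to its final location.
disk = {"inode_7": "old"}
journal = [("inode_7", "new")]

# 1) suspend: the kernel's in-memory view of the journal goes into the image.
image = {"pending": copy.deepcopy(journal), "journal_len": len(journal)}

# 2) shutdown, 3) restart with a mount that replays the journal.
replay(disk, journal)

# 4) resume: the restored kernel still believes the journal holds one record.
stale_view = image["journal_len"]
actual = len(journal)
mismatch = stale_view != actual
print(f"kernel thinks journal has {stale_view} record(s), disk has {actual}")
```

Nothing in the image tells the restored kernel that step 3 happened,
which is the whole problem.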
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:05                     ` Jamie Lokier
@ 2003-04-24  9:34                       ` Pavel Machek
  0 siblings, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:34 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Andrew Morton, mbligh, ncunningham, gigerstyle, geert, linux-kernel

Hi!

> > Swapfile does not work, because even readonly mount wants to replay
> > logs, and that'd be disk corruption.
> 
> I don't understand.  During suspend, you just need a list of blocks to
> write to from the swapfile.  You can get that list before starting the
> actual suspend, so that writing doesn't imply any filesystem activity.
> 
> When you're resuming, you just need a list of which disk blocks to
> resume from.  Can't that list be stored in a few blocks of the
> swapfile itself, with the only critical parameter being the first
> block number to resume from?

And how do you pass that first number? Please user could you write
this down to paper and enter it on commandline during suspend?

Okay, okay, this could be made to work. You could store pointer in
swapspace or in reserved block somewhere...
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:12                         ` Pavel Machek
@ 2003-04-24  9:25                           ` Andrew Morton
  2003-04-24  9:35                             ` Pavel Machek
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  9:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: ncunningham, cat, mbligh, gigerstyle, geert, linux-kernel

Pavel Machek <pavel@ucw.cz> wrote:
>
> No, ext3 will be "unclean" during resume (you can't really unmount it
> during suspend!) and r-o mounting of ext3 will replay journal and
> cause data corruption.

Sorry, I still don't get it.  Go through the steps for me:

1) suspend writes pages to disk

2) machine is shutdown

3) restart, journal replay

4) resume reads pages from disk.

Where did the corruption happen?

Please bear in mind that I don't really know how swsusp works, so tell it to
me in nice simple steps.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  9:01                     ` Andrew Morton
@ 2003-04-24  9:14                       ` Pavel Machek
  0 siblings, 0 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mbligh, ncunningham, gigerstyle, geert, linux-kernel

Hi!

> > > > If you really want to "solve" it reliably, you can always
> > > > 
> > > > swapon /dev/hdfoo666
> > > > 
> > > 
> > > Seems that using a swapfile instead of a swapdev would fix that neatly.
> > > 
> > > But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
> > > what has to be done to get it working?
> > 
> > Swapfile does not work, because even readonly mount wants to replay
> > logs, and that'd be disk corruption.
> > 
> 
> I don't get it.   Can you explain some more?

See other mail about 3 minutes ago from me... I hope it's clear there.

								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:37                       ` Andrew Morton
@ 2003-04-24  9:12                         ` Pavel Machek
  2003-04-24  9:25                           ` Andrew Morton
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  9:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nigel Cunningham, cat, mbligh, gigerstyle, geert, linux-kernel

Hi!

> > > I'm curious. What does a swapfile solve that a swapdev does not? Either
> > > way you need to prealloc the case (either have a chunky file in a
> > > partition or a partition set aside) or you need to keep enough room
> > > avail to fit the file when it's needed.
> > 
> > Nothing but further bloat in swsusp :> With a swapfile, we need to know
> > the location of the file (and be able to find it again when it changes,
> > and know how to find the next block in the file system - it might be
> > fragmented).
> 
> That's because swsusp is using the mm/page_io.c functions for suspend, but
> is using the fs/buffer.c functions direct to the blockdev for resume.
> 
> If you can use the swapper_space a_ops for both suspend and resume (say:
> "cleanup") then it will just work.

No, ext3 will be "unclean" during resume (you can't really unmount it
during suspend!) and r-o mounting of ext3 will replay journal and
cause data corruption.
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:25                   ` Pavel Machek
  2003-04-24  9:01                     ` Andrew Morton
@ 2003-04-24  9:05                     ` Jamie Lokier
  2003-04-24  9:34                       ` Pavel Machek
  2003-04-24 15:22                     ` Carl-Daniel Hailfinger
  2 siblings, 1 reply; 68+ messages in thread
From: Jamie Lokier @ 2003-04-24  9:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Andrew Morton, mbligh, ncunningham, gigerstyle, geert, linux-kernel

Pavel Machek wrote:
> Swapfile does not work, because even readonly mount wants to replay
> logs, and that'd be disk corruption.

I don't understand.  During suspend, you just need a list of blocks to
write to from the swapfile.  You can get that list before starting the
actual suspend, so that writing doesn't imply any filesystem activity.

When you're resuming, you just need a list of which disk blocks to
resume from.  Can't that list be stored in a few blocks of the
swapfile itself, with the only critical parameter being the first
block number to resume from?

(A bit like LILO has a map file containing the list of blocks which
contain the kernel to boot.)
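
The block-list idea above could be sketched roughly as follows. This
is an illustration under assumptions, not swsusp code: `write_image`,
`read_image`, the dict-based `disk` and the `per_map` limit are all
made-up names. Map blocks are chained, so only the number of the
first map block has to survive the reboot:

```python
def write_image(disk, free_blocks, pages, per_map=3):
    """Store pages in free_blocks; return the number of the first map block.

    Each map block holds up to per_map data block numbers plus the number
    of the next map block (None at the end of the chain).
    """
    free = list(free_blocks)
    data_blocks = []
    for page in pages:
        blk = free.pop(0)
        disk[blk] = page
        data_blocks.append(blk)

    # Build the chain back to front so each map block knows its successor.
    chunks = [data_blocks[i:i + per_map]
              for i in range(0, len(data_blocks), per_map)]
    next_map = None
    for chunk in reversed(chunks):
        blk = free.pop(0)
        disk[blk] = {"data": chunk, "next": next_map}
        next_map = blk
    return next_map


def read_image(disk, first_map):
    """Follow the map chain from first_map and return the pages in order."""
    pages = []
    blk = first_map
    while blk is not None:
        entry = disk[blk]
        pages.extend(disk[b] for b in entry["data"])
        blk = entry["next"]
    return pages


# The swapfile's blocks may be fragmented; the list is fixed before suspend.
disk = {}
first = write_image(disk, [40, 7, 93, 12, 55, 3, 68, 21],
                    ["p0", "p1", "p2", "p3", "p4"])
restored = read_image(disk, first)
```

Neither function goes through the filesystem; both work on raw block
numbers obtained before the suspend started.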

-- Jamie

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:25                   ` Pavel Machek
@ 2003-04-24  9:01                     ` Andrew Morton
  2003-04-24  9:14                       ` Pavel Machek
  2003-04-24  9:05                     ` Jamie Lokier
  2003-04-24 15:22                     ` Carl-Daniel Hailfinger
  2 siblings, 1 reply; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  9:01 UTC (permalink / raw)
  To: Pavel Machek; +Cc: mbligh, ncunningham, gigerstyle, geert, linux-kernel

Pavel Machek <pavel@suse.cz> wrote:
>
> Hi!
> 
> > > If you really want to "solve" it reliably, you can always
> > > 
> > > swapon /dev/hdfoo666
> > > 
> > 
> > Seems that using a swapfile instead of a swapdev would fix that neatly.
> > 
> > But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
> > what has to be done to get it working?
> 
> Swapfile does not work, because even readonly mount wants to replay
> logs, and that'd be disk corruption.
> 

I don't get it.   Can you explain some more?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:38                         ` Andrew Morton
  2003-04-24  0:54                           ` CaT
@ 2003-04-24  8:48                           ` John Bradford
  1 sibling, 0 replies; 68+ messages in thread
From: John Bradford @ 2003-04-24  8:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Randy.Dunlap, ncunningham, cat, pavel, mbligh, gigerstyle, geert,
	linux-kernel

> > That may be simple for you, but for lots of users, adding a partition
> > (to a usually full disk drive) isn't simple.  It means backups,
> > shrink a filesystem, shrink a partition, add a partition, and run
> > mkswap on it.   Yes, the latter 2 are simple, but the former ones
> > are not.
> 
> Yeah.  swsusp is pretty much the only reason why you would want to have a
> swap partition at all in a 2.5/2.6 kernel.

A lot of users still follow the 'swap space twice the size of physical
RAM' rule of thumb.

Now that physical RAM sizes have increased by an order of magnitude
since that advice was given, there are a lot of systems with far more
swap than they need, which provides the solution to our problem - if
there is a single swap partition which is twice the size of physical
RAM, just split it into SWAP and SWSUSP slices, (not separate
partitions, slices in the same way BSD subdivides a partition).

2.4 kernels will see a big swap area, and use all of it as swap, 2.5
kernels will see a smaller swap area, (equal to physical RAM size, and
therefore sufficient in a lot of cases), and a SWSUSP area the same
size as physical RAM.
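
The arithmetic of that split, as a small sketch. `split_swap` is a
hypothetical helper, not an existing tool, and the 512/256 figures
are only example numbers for a "twice RAM" partition:

```python
def split_swap(partition_mb, ram_mb):
    """Return (swap_mb, swsusp_mb) for one partition split into two slices."""
    if partition_mb < ram_mb:
        raise ValueError("partition is smaller than RAM; no room for an image")
    swsusp_mb = ram_mb                  # image area: same size as physical RAM
    swap_mb = partition_mb - swsusp_mb  # whatever is left stays ordinary swap
    return swap_mb, swsusp_mb


# A partition twice the size of RAM splits into two equal slices.
layout = split_swap(partition_mb=512, ram_mb=256)
```

With less than twice RAM the swap slice simply shrinks; with less
than RAM there is no room for the SWSUSP slice at all.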

John.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-23 22:25           ` Nigel Cunningham
  2003-04-23 23:28             ` Martin J. Bligh
@ 2003-04-24  8:00             ` Marc Giger
  1 sibling, 0 replies; 68+ messages in thread
From: Marc Giger @ 2003-04-24  8:00 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development

Good Morning,

On Thu, 24 Apr 2003 10:25:25 +1200
Nigel Cunningham <ncunningham@clear.net.nz> wrote:

> 
> 
> On Thu, 2003-04-24 at 08:36, Marc Giger wrote:
> > Ok! I see the advantages / disadvantages of each version. But what
> > happens if the memory AND swap space are full and nothing can be
> > freed? When I watch the memory and swap consumption on my laptop, I
> > think that's the case most of the time...
> 
> If you're getting yourself in that situation, you should be increasing
> your swap space (and memory if possible) anyway.

Yeah, you are right, but it was always enough. It was rarely the case that something got killed... I can't install more RAM, 256MB is the maximum :-( and I don't want to repartition my hard disk (~250MB swap).

> 
> > Another question:
> > Is it a big problem to save the memory in a separate file on the file
> > system, and save a pointer to it somewhere (for example in swap. Also
> > we could set a flag in swap so that we know that the last shutdown was
> > a hibernation). One problem will be that we don't know the filesystem
> > type on resume... (We could save the module in swap...)
> > All that is just theoretical. It's only an idea.
> 
> I guess the simplest answer is would it be worth the pain? Since disk
> space is cheap, it just requires a little forethought when installing
> Linux, to ensure enough swap is allocated. I certainly understand that
> using a file rather than swap makes adjusting the amount of space
> available easier, but as you rightly acknowledge, it does complicate
> things a fair bit more.

Yes, I see, but there are a lot of users (like me) who have an already installed Linux and don't want to repartition the disk and reinstall it. Success with repartitioning programs is not always guaranteed...

Marc

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:54                           ` CaT
@ 2003-04-24  1:06                             ` Andrew Morton
  0 siblings, 0 replies; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  1:06 UTC (permalink / raw)
  To: CaT; +Cc: linux-kernel

CaT <cat@zip.com.au> wrote:
>
> On Wed, Apr 23, 2003 at 05:38:37PM -0700, Andrew Morton wrote:
> > "Randy.Dunlap" <rddunlap@osdl.org> wrote:
> > >
> > > That may be simple for you, but for lots of users, adding a partition
> > > > (to a usually full disk drive) isn't simple.  It means backups,
> > > shrink a filesystem, shrink a partition, add a partition, and run
> > > mkswap on it.   Yes, the latter 2 are simple, but the former ones
> > > are not.
> > 
> > Yeah.  swsusp is pretty much the only reason why you would want to have a
> > swap partition at all in a 2.5/2.6 kernel.
> 
> Is there really no difference any longer in terms of speed?

Nope.  Not unless the swapfile which you created is splattered all over the
disk, in which case you already have serious problems with that filesystem.

There will be some performance differences with resume, because it is doing
page-at-a-time synchronous IO.  The device-level caching should make that
acceptable though.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:38                         ` Andrew Morton
@ 2003-04-24  0:54                           ` CaT
  2003-04-24  1:06                             ` Andrew Morton
  2003-04-24  8:48                           ` John Bradford
  1 sibling, 1 reply; 68+ messages in thread
From: CaT @ 2003-04-24  0:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Wed, Apr 23, 2003 at 05:38:37PM -0700, Andrew Morton wrote:
> "Randy.Dunlap" <rddunlap@osdl.org> wrote:
> >
> > That may be simple for you, but for lots of users, adding a partition
> > > (to a usually full disk drive) isn't simple.  It means backups,
> > shrink a filesystem, shrink a partition, add a partition, and run
> > mkswap on it.   Yes, the latter 2 are simple, but the former ones
> > are not.
> 
> Yeah.  swsusp is pretty much the only reason why you would want to have a
> swap partition at all in a 2.5/2.6 kernel.

Is there really no difference any longer in terms of speed?

-- 
Martin's distress was in contrast to the bitter satisfaction of some
of his fellow marines as they surveyed the scene. "The Iraqis are sick
people and we are the chemotherapy," said Corporal Ryan Dupre. "I am
starting to hate this country. Wait till I get hold of a friggin' Iraqi.
No, I won't get hold of one. I'll just kill him."
	- http://www.informationclearinghouse.info/article2479.htm

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:26                       ` Randy.Dunlap
  2003-04-24  0:31                         ` CaT
@ 2003-04-24  0:38                         ` Andrew Morton
  2003-04-24  0:54                           ` CaT
  2003-04-24  8:48                           ` John Bradford
  1 sibling, 2 replies; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  0:38 UTC (permalink / raw)
  To: Randy.Dunlap
  Cc: ncunningham, cat, pavel, mbligh, gigerstyle, geert, linux-kernel

"Randy.Dunlap" <rddunlap@osdl.org> wrote:
>
> That may be simple for you, but for lots of users, adding a partition
> (to a usually full disk drive) isn't simple.  It means backups,
> shrink a filesystem, shrink a partition, add a partition, and run
> mkswap on it.   Yes, the latter 2 are simple, but the former ones
> are not.

Yeah.  swsusp is pretty much the only reason why you would want to have a
swap partition at all in a 2.5/2.6 kernel.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:16                     ` Nigel Cunningham
  2003-04-24  0:26                       ` Randy.Dunlap
@ 2003-04-24  0:37                       ` Andrew Morton
  2003-04-24  9:12                         ` Pavel Machek
  1 sibling, 1 reply; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  0:37 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: cat, pavel, mbligh, gigerstyle, geert, linux-kernel

Nigel Cunningham <ncunningham@clear.net.nz> wrote:
>
> On Thu, 2003-04-24 at 12:17, CaT wrote:
> > I'm curious. What does a swapfile solve that a swapdev does not? Either
> > way you need to prealloc the case (either have a chunky file in a
> > partition or a partition set aside) or you need to keep enough room
> > avail to fit the file when it's needed.
> 
> Nothing but further bloat in swsusp :> With a swapfile, we need to know
> the location of the file (and be able to find it again when it changes,
> and know how to find the next block in the file system - it might be
> fragmented).

That's because swsusp is using the mm/page_io.c functions for suspend, but
is using the fs/buffer.c functions direct to the blockdev for resume.

If you can use the swapper_space a_ops for both suspend and resume (say:
"cleanup") then it will just work.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:26                       ` Randy.Dunlap
@ 2003-04-24  0:31                         ` CaT
  2003-04-24  0:38                         ` Andrew Morton
  1 sibling, 0 replies; 68+ messages in thread
From: CaT @ 2003-04-24  0:31 UTC (permalink / raw)
  To: Randy.Dunlap
  Cc: Nigel Cunningham, akpm, pavel, mbligh, gigerstyle, geert, linux-kernel

On Wed, Apr 23, 2003 at 05:26:28PM -0700, Randy.Dunlap wrote:
> | fragmented). The simplest solution is to keep using the current method
> | and create a separate swap partition if you really feel you need to,
> | only turning it on before swap and turning it off afterwards. As Pavel
> | said, code could be added to get swsusp to do it itself.
> 
> That may be simple for you, but for lots of users, adding a partition
> (to a usually full disk drive) isn't simple.  It means backups,
> shrink a filesystem, shrink a partition, add a partition, and run
> mkswap on it.   Yes, the latter 2 are simple, but the former ones
> are not.
> 
> Oh, and then just start over and install everything from backups. :(

parted should help with this. Dunno if it can move the start of a
partition yet but you can move the end down and put in your suspend
partition in the space you just made.

-- 
Martin's distress was in contrast to the bitter satisfaction of some
of his fellow marines as they surveyed the scene. "The Iraqis are sick
people and we are the chemotherapy," said Corporal Ryan Dupre. "I am
starting to hate this country. Wait till I get hold of a friggin' Iraqi.
No, I won't get hold of one. I'll just kill him."
	- http://www.informationclearinghouse.info/article2479.htm


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:16                     ` Nigel Cunningham
@ 2003-04-24  0:26                       ` Randy.Dunlap
  2003-04-24  0:31                         ` CaT
  2003-04-24  0:38                         ` Andrew Morton
  2003-04-24  0:37                       ` Andrew Morton
  1 sibling, 2 replies; 68+ messages in thread
From: Randy.Dunlap @ 2003-04-24  0:26 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: cat, akpm, pavel, mbligh, gigerstyle, geert, linux-kernel

On Thu, 24 Apr 2003 12:16:48 +1200 Nigel Cunningham <ncunningham@clear.net.nz> wrote:

| On Thu, 2003-04-24 at 12:17, CaT wrote:
| > I'm curious. What does a swapfile solve that a swapdev does not? Either
| > way you need to prealloc the space (either have a chunky file in a
| > partition or a partition set aside) or you need to keep enough room
| > avail to fit the file when it's needed.
| 
| Nothing but further bloat in swsusp :> With a swapfile, we need to know
| the location of the file (and be able to find it again when it changes,
| and know how to find the next block in the file system - it might be
| fragmented). The simplest solution is to keep using the current method
| and create a separate swap partition if you really feel you need to,
| only turning it on before suspend and turning it off afterwards. As Pavel
| said, code could be added to get swsusp to do it itself.

That may be simple for you, but for lots of users, adding a partition
(to a usually full disk drive) isn't simple.  It means backups,
shrink a filesystem, shrink a partition, add a partition, and run
mkswap on it.   Yes, the latter 2 are simple, but the former ones
are not.

Oh, and then just start over and install everything from backups. :(

--
~Randy


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:07                 ` Andrew Morton
  2003-04-24  0:17                   ` CaT
@ 2003-04-24  0:25                   ` Pavel Machek
  2003-04-24  9:01                     ` Andrew Morton
                                       ` (2 more replies)
  1 sibling, 3 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-24  0:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mbligh, ncunningham, gigerstyle, geert, linux-kernel

Hi!

> > If you really want to "solve" it reliably, you can always
> > 
> > swapon /dev/hdfoo666
> > 
> 
> Seems that using a swapfile instead of a swapdev would fix that neatly.
> 
> But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
> what has to be done to get it working?

A swapfile does not work, because even a read-only mount wants to replay
the journal, and that would mean disk corruption.

It could be doable with modifications to the filesystems, but it would
be hard (and I do not think it is worth it).
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:07                 ` Andrew Morton
@ 2003-04-24  0:17                   ` CaT
  2003-04-24  0:16                     ` Nigel Cunningham
  2003-04-24  0:25                   ` Pavel Machek
  1 sibling, 1 reply; 68+ messages in thread
From: CaT @ 2003-04-24  0:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pavel Machek, mbligh, ncunningham, gigerstyle, geert, linux-kernel

On Wed, Apr 23, 2003 at 05:07:59PM -0700, Andrew Morton wrote:
> Pavel Machek <pavel@ucw.cz> wrote:
> >
> > If you really want to "solve" it reliably, you can always
> > 
> > swapon /dev/hdfoo666
> 
> Seems that using a swapfile instead of a swapdev would fix that neatly.
> 
> But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
> what has to be done to get it working?

I'm curious. What does a swapfile solve that a swapdev does not? Either
way you need to prealloc the space (either have a chunky file in a
partition or a partition set aside) or you need to keep enough room
avail to fit the file when it's needed.

-- 
Martin's distress was in contrast to the bitter satisfaction of some
of his fellow marines as they surveyed the scene. "The Iraqis are sick
people and we are the chemotherapy," said Corporal Ryan Dupre. "I am
starting to hate this country. Wait till I get hold of a friggin' Iraqi.
No, I won't get hold of one. I'll just kill him."
	- http://www.informationclearinghouse.info/article2479.htm


* Re: Fix SWSUSP & !SWAP
  2003-04-24  0:17                   ` CaT
@ 2003-04-24  0:16                     ` Nigel Cunningham
  2003-04-24  0:26                       ` Randy.Dunlap
  2003-04-24  0:37                       ` Andrew Morton
  0 siblings, 2 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-24  0:16 UTC (permalink / raw)
  To: CaT
  Cc: Andrew Morton, Pavel Machek, mbligh, gigerstyle, geert,
	Linux Kernel Mailing List

On Thu, 2003-04-24 at 12:17, CaT wrote:
> I'm curious. What does a swapfile solve that a swapdev does not? Either
> way you need to prealloc the space (either have a chunky file in a
> partition or a partition set aside) or you need to keep enough room
> avail to fit the file when it's needed.

Nothing but further bloat in swsusp :> With a swapfile, we need to know
the location of the file (and be able to find it again when it changes,
and know how to find the next block in the file system - it might be
fragmented). The simplest solution is to keep using the current method
and create a separate swap partition if you really feel you need to,
only turning it on before suspend and turning it off afterwards. As Pavel
said, code could be added to get swsusp to do it itself.
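
A rough sketch of why a fragmented swapfile is more work than a partition:
on a partition, block N of the image sits at a fixed offset, while in a file
each logical block has to be looked up in a map of extents. The `Extent`
type, the `physical_block` helper and the example layout below are
hypothetical illustrations, not swsusp code.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    """A contiguous run of file blocks (hypothetical, for illustration)."""
    logical: int   # first block number within the file
    physical: int  # first block number on the disk
    length: int    # number of blocks in this run


def physical_block(extents: List[Extent], logical_block: int) -> int:
    """Translate a block offset in the file to a block number on disk."""
    for extent in extents:
        if extent.logical <= logical_block < extent.logical + extent.length:
            return extent.physical + (logical_block - extent.logical)
    raise ValueError(f"block {logical_block} is not mapped")


if __name__ == "__main__":
    # A file split into two fragments: blocks 0-3 and blocks 4-5.
    layout = [Extent(0, 1000, 4), Extent(4, 5000, 2)]
    print(physical_block(layout, 3))  # last block of the first fragment
    print(physical_block(layout, 4))  # first block of the second fragment
```

A partition corresponds to a single extent, so the lookup collapses to one
addition; with a file, the map itself has to be found again at resume time.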

Regards,

Nigel

-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 23:58               ` Pavel Machek
  2003-04-23 23:55                 ` Martin J. Bligh
@ 2003-04-24  0:07                 ` Andrew Morton
  2003-04-24  0:17                   ` CaT
  2003-04-24  0:25                   ` Pavel Machek
  1 sibling, 2 replies; 68+ messages in thread
From: Andrew Morton @ 2003-04-24  0:07 UTC (permalink / raw)
  To: Pavel Machek; +Cc: mbligh, ncunningham, gigerstyle, pavel, geert, linux-kernel

Pavel Machek <pavel@ucw.cz> wrote:
>
> If you really want to "solve" it reliably, you can always
> 
> swapon /dev/hdfoo666
> 

Seems that using a swapfile instead of a swapdev would fix that neatly.

But iirc, suspend doesn't work with swapfiles.  Is that correct?  If so,
what has to be done to get it working?



* Re: Fix SWSUSP & !SWAP
  2003-04-23 23:28             ` Martin J. Bligh
@ 2003-04-23 23:58               ` Pavel Machek
  2003-04-23 23:55                 ` Martin J. Bligh
  2003-04-24  0:07                 ` Andrew Morton
  0 siblings, 2 replies; 68+ messages in thread
From: Pavel Machek @ 2003-04-23 23:58 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Nigel Cunningham, Marc Giger, Pavel Machek, Geert Uytterhoeven,
	Linux Kernel Development

Hi!

> >> Ok! I see the advantages / disadvantages of each version. But what
> >> happens if the memory AND swap space are full and nothing can be
> >> freed? When I watch the memory and swap consumption on my laptop, I
> >> think that is the case most of the time...
> > 
> > If you're getting yourself in that situation, you should be increasing
> > your swap space (and memory if possible) anyway.
> > 
> >> Another question:
> >> Is it a big problem to save the memory in a separate file on the file
> >> system, and save a pointer to it somewhere (for example in swap; we
> >> could also set a flag in swap so that we know that the last shutdown
> >> was a hibernation)? One problem is that we don't know the filesystem
> >> type on resume... (We could save the module in swap...)
> >> All that is just theoretical. It's only an idea.
> > 
> > I guess the simplest answer is would it be worth the pain? Since disk
> > space is cheap, it just requires a little forethought when installing
> > Linux, to ensure enough swap is allocated. I certainly understand that
> > using a file rather than swap makes adjusting the amount of space
> > available easier, but as you rightly acknowledge, it does complicate
> > things a fair bit more.
> 
> Can't you just create a pre-reserved separate swsusp area on disk the size
> of RAM (maybe a partition rather than a file to make things easier), and
> then you know you're safe (basically what Marc was suggesting, except
> pre-allocated)? Or does that make me the prince of all evil? ;-)
> 
> However much swap space you allocate, it can always all be used, so that
> seems futile ...

Well, but if all the swap gets used, you go OOM and randomly kill
processes. That means that machines have way more swap than they need.

If you really want to "solve" it reliably, you can always

swapon /dev/hdfoo666

where hdfoo666 is as big as RAM, just before starting swsusp. We could
even make swapon part of swsusp, where it's locked to kill races.

But I believe it's not needed. The problem just is not there in practice.
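
A minimal user-space sketch of the "swapon just before suspending" idea,
assuming a dedicated device as large as RAM. The function name, the
injectable `run` parameter and the caller-supplied suspend command are
assumptions made for illustration; this is not swsusp code, and it does not
close the race that doing the swapon inside swsusp under a lock would.

```python
import subprocess
from typing import Callable, Sequence


def suspend_with_dedicated_swap(
    device: str,
    suspend_cmd: Sequence[str],
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Enable a reserved swap device, suspend, then disable it again."""
    # Make the reserved space available only for the duration of the suspend.
    run(["swapon", device], check=True)
    try:
        # The suspend trigger is supplied by the caller (assumption).
        run(list(suspend_cmd), check=True)
    finally:
        # Take the device out of use again so normal swapping never fills it.
        run(["swapoff", device], check=True)
```

The reserved device is only swapped on if the first command succeeds, and it
is swapped off again even when the suspend command fails. Running this needs
root, and other processes can still allocate swap between the two steps.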


							Pavel 

-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.


* Re: Fix SWSUSP & !SWAP
  2003-04-23 23:58               ` Pavel Machek
@ 2003-04-23 23:55                 ` Martin J. Bligh
  2003-04-24  0:07                 ` Andrew Morton
  1 sibling, 0 replies; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-23 23:55 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Marc Giger, Geert Uytterhoeven,
	Linux Kernel Development

> Well, but if all the swap gets used, you go OOM and randomly kill
> processes. That means that machines have way more swap than they need.
> 
> If you really want to "solve" it reliably, you can always
> 
> swapon /dev/hdfoo666
> 
> where hdfoo666 is as big as RAM, just before starting swsusp. We could
> even make swapon part of swsusp, where it's locked to kill races.
> 
> But I believe it's not needed. The problem just is not there in practice.

OK, well suppose RAM is full, and swap is almost full (less free swap
than used RAM). The system is still running fine, no OOM. But suspend
can't work, AFAICS.

Reserving a separate area to save RAM into seems like the only
deterministic method to me.

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 22:25           ` Nigel Cunningham
@ 2003-04-23 23:28             ` Martin J. Bligh
  2003-04-23 23:58               ` Pavel Machek
  2003-04-24  8:00             ` Marc Giger
  1 sibling, 1 reply; 68+ messages in thread
From: Martin J. Bligh @ 2003-04-23 23:28 UTC (permalink / raw)
  To: Nigel Cunningham, Marc Giger
  Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development

> On Thu, 2003-04-24 at 08:36, Marc Giger wrote:
>> Ok! I see the advantages / disadvantages of each version. But what
>> happens if the memory AND swap space are full and nothing can be
>> freed? When I watch the memory and swap consumption on my laptop, I
>> think that is the case most of the time...
> 
> If you're getting yourself in that situation, you should be increasing
> your swap space (and memory if possible) anyway.
> 
>> Another question:
>> Is it a big problem to save the memory in a separate file on the file
>> system, and save a pointer to it somewhere (for example in swap; we
>> could also set a flag in swap so that we know that the last shutdown
>> was a hibernation)? One problem is that we don't know the filesystem
>> type on resume... (We could save the module in swap...)
>> All that is just theoretical. It's only an idea.
> 
> I guess the simplest answer is would it be worth the pain? Since disk
> space is cheap, it just requires a little forethought when installing
> Linux, to ensure enough swap is allocated. I certainly understand that
> using a file rather than swap makes adjusting the amount of space
> available easier, but as you rightly acknowledge, it does complicate
> things a fair bit more.

Can't you just create a pre-reserved separate swsusp area on disk the size
of RAM (maybe a partition rather than a file to make things easier), and
then you know you're safe (basically what Marc was suggesting, except
pre-allocated)? Or does that make me the prince of all evil? ;-)

However much swap space you allocate, it can always all be used, so that
seems futile ...

M.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 20:36         ` Marc Giger
@ 2003-04-23 22:25           ` Nigel Cunningham
  2003-04-23 23:28             ` Martin J. Bligh
  2003-04-24  8:00             ` Marc Giger
  0 siblings, 2 replies; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-23 22:25 UTC (permalink / raw)
  To: Marc Giger; +Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development



On Thu, 2003-04-24 at 08:36, Marc Giger wrote:
> Ok! I see the advantages / disadvantages of each version. But what
> happens if the memory AND swap space are full and nothing can be
> freed? When I watch the memory and swap consumption on my laptop, I
> think that is the case most of the time...

If you're getting yourself in that situation, you should be increasing
your swap space (and memory if possible) anyway.

> Another question:
> Is it a big problem to save the memory in a separate file on the file
> system, and save a pointer to it somewhere (for example in swap; we
> could also set a flag in swap so that we know that the last shutdown
> was a hibernation)? One problem is that we don't know the filesystem
> type on resume... (We could save the module in swap...)
> All that is just theoretical. It's only an idea.

I guess the simplest answer is would it be worth the pain? Since disk
space is cheap, it just requires a little forethought when installing
Linux, to ensure enough swap is allocated. I certainly understand that
using a file rather than swap makes adjusting the amount of space
available easier, but as you rightly acknowledge, it does complicate
things a fair bit more.

Regards,

Nigel
-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Be diligent to present yourself approved to God as a workman who does
not need to be ashamed, handling accurately the word of truth.
	-- 2 Timothy 2:14, NASB.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 19:41       ` Nigel Cunningham
@ 2003-04-23 20:36         ` Marc Giger
  2003-04-23 22:25           ` Nigel Cunningham
  0 siblings, 1 reply; 68+ messages in thread
From: Marc Giger @ 2003-04-23 20:36 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development

Hi Nigel, Hi All

On Thu, 24 Apr 2003 07:41:11 +1200
Nigel Cunningham <ncunningham@clear.net.nz> wrote:

> Swsusp will use the portion of your swap partition that is unused when
> you start to suspend. The version currently in the 2.5 tree frees most
> of your memory before suspending, and so doesn't need that much swap at
> all. The version that I'm working on merging only frees memory if it is
> necessary to fit the image in the available swap or to have enough
> memory to be able to save the image. Thus, you need a lot more swap for
> my version. (eg. I have 640MB ram on my laptop and a ~700MB swap
> partition).
> 

Ok! I see the advantages / disadvantages of each version. But what
happens if the memory AND swap space are full and nothing can be
freed? When I watch the memory and swap consumption on my laptop, I
think that is the case most of the time...

Another question:
Is it a big problem to save the memory in a separate file on the file
system, and save a pointer to it somewhere (for example in swap; we
could also set a flag in swap so that we know that the last shutdown
was a hibernation)? One problem is that we don't know the filesystem
type on resume... (We could save the module in swap...)
All that is just theoretical. It's only an idea.

Thank you for your statements.

greets

Marc


* Re: Fix SWSUSP & !SWAP
  2003-04-23 15:56     ` gigerstyle
@ 2003-04-23 19:41       ` Nigel Cunningham
  2003-04-23 20:36         ` Marc Giger
  0 siblings, 1 reply; 68+ messages in thread
From: Nigel Cunningham @ 2003-04-23 19:41 UTC (permalink / raw)
  To: gigerstyle; +Cc: Pavel Machek, Geert Uytterhoeven, Linux Kernel Development

Swsusp will use the portion of your swap partition that is unused when
you start to suspend. The version currently in the 2.5 tree frees most
of your memory before suspending, and so doesn't need that much swap at
all. The version that I'm working on merging only frees memory if it is
necessary to fit the image in the available swap or to have enough
memory to be able to save the image. Thus, you need a lot more swap for
my version. (eg. I have 640MB ram on my laptop and a ~700MB swap
partition).
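
The swap arithmetic described above can be sketched roughly as follows. The
function names and the megabyte figures other than the 640 MB / ~700 MB
example are assumptions for illustration; the second condition (keeping
enough memory free to be able to write the image) is not modelled.

```python
def memory_to_free_mb(image_mb: int, free_swap_mb: int) -> int:
    """MB that must be freed before the image fits in the unused swap."""
    return max(0, image_mb - free_swap_mb)


def fits_in_swap(image_mb: int, free_swap_mb: int, freeable_mb: int) -> bool:
    """True if freeing at most `freeable_mb` lets the image fit."""
    return memory_to_free_mb(image_mb, free_swap_mb) <= freeable_mb


if __name__ == "__main__":
    # 640 MB of RAM in use and a ~700 MB swap partition that is still empty:
    # nothing has to be freed.
    print(memory_to_free_mb(640, 700))
    # Same image, but only 500 MB of swap is unused: 140 MB must go first.
    print(memory_to_free_mb(640, 500))
    # If only 100 MB can be freed, the suspend cannot proceed.
    print(fits_in_swap(640, 500, 100))
```

A version that frees most memory up front keeps `image_mb` small, which is
why it needs much less swap than one that frees memory only when it must.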

Hope this helps.

Regards,

Nigel

On Thu, 2003-04-24 at 03:56, gigerstyle@gmx.ch wrote:
> Hi All,
> 
> Just a quick question:
> 
> As far as I know, swsusp is for hibernation (S4), right? The memory
> contents are written to the swap partition. What happens if the swap
> space is already in use by programs? Does it abort? Or do I have to
> reserve swap space that programs never use?
> 
> Thank you!
> 
> Marc
> 
> 
> -- 
> Nigel Cunningham
> 495 St Georges Road South, Hastings 4201, New Zealand
> 
> Be diligent to present yourself approved to God as a workman who does
> not need to be ashamed, handling accurately the word of truth.
> 	-- 2 Timothy 2:14, NASB.



* Re: Fix SWSUSP & !SWAP
  2003-04-23 14:47   ` Pavel Machek
@ 2003-04-23 15:56     ` gigerstyle
  2003-04-23 19:41       ` Nigel Cunningham
  0 siblings, 1 reply; 68+ messages in thread
From: gigerstyle @ 2003-04-23 15:56 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Geert Uytterhoeven, Linux Kernel Development

Hi All,

Just a quick question:

As far as I know, swsusp is for hibernation (S4), right? The memory
contents are written to the swap partition. What happens if the swap
space is already in use by programs? Does it abort? Or do I have to
reserve swap space that programs never use?

Thank you!

Marc


On Wed, 23 Apr 2003 16:47:05 +0200
Pavel Machek <pavel@ucw.cz> wrote:

> Hi!
> 
> > > Swsusp without swap makes no sense, but leads to compilation
> > > failure. This fixes it. Please apply,
> > 
> > Just wondering, what about MMU-less machines?
> 
> Ugh... Currently: no we can't do that. We are happy to suspend/resume
> on i386 ;-).
> 								Pavel


* Re: Fix SWSUSP & !SWAP
  2003-04-23 14:32 ` Geert Uytterhoeven
@ 2003-04-23 14:47   ` Pavel Machek
  2003-04-23 15:56     ` gigerstyle
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-23 14:47 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Linux Kernel Development

Hi!

> > Swsusp without swap makes no sense, but leads to compilation
> > failure. This fixes it. Please apply,
> 
> Just wondering, what about MMU-less machines?

Ugh... Currently: no we can't do that. We are happy to suspend/resume
on i386 ;-).
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: Fix SWSUSP & !SWAP
  2003-04-23 13:51 Pavel Machek
@ 2003-04-23 14:32 ` Geert Uytterhoeven
  2003-04-23 14:47   ` Pavel Machek
  0 siblings, 1 reply; 68+ messages in thread
From: Geert Uytterhoeven @ 2003-04-23 14:32 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel Development

On Wed, 23 Apr 2003, Pavel Machek wrote:
> Swsusp without swap makes no sense, but leads to compilation
> failure. This fixes it. Please apply,

Just wondering, what about MMU-less machines?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Fix SWSUSP & !SWAP
@ 2003-04-23 13:51 Pavel Machek
  2003-04-23 14:32 ` Geert Uytterhoeven
  0 siblings, 1 reply; 68+ messages in thread
From: Pavel Machek @ 2003-04-23 13:51 UTC (permalink / raw)
  To: Rusty trivial patch monkey Russell, kernel list, torvalds

Hi!

Swsusp without swap makes no sense, but leads to compilation
failure. This fixes it. Please apply,
							Pavel

Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig	2003-04-22 00:04:32.000000000 +0200
+++ linux/arch/i386/Kconfig	2003-04-22 00:02:32.000000000 +0200
@@ -798,7 +798,7 @@
 
 config SOFTWARE_SUSPEND
 	bool "Software Suspend (EXPERIMENTAL)"
-	depends on EXPERIMENTAL && PM
+	depends on EXPERIMENTAL && PM && SWAP
 	---help---
 	  Enable the possibilty of suspendig machine. It doesn't need APM.
 	  You may suspend your machine by 'swsusp' or 'shutdown -z <time>' 
Index: linux/arch/x86_64/Kconfig
===================================================================
--- linux.orig/arch/x86_64/Kconfig	2003-04-22 00:04:32.000000000 +0200
+++ linux/arch/x86_64/Kconfig	2003-04-22 00:02:37.000000000 +0200
@@ -286,7 +286,7 @@
 
 config SOFTWARE_SUSPEND
 	bool "Software Suspend (EXPERIMENTAL)"
-	depends on EXPERIMENTAL && PM
+	depends on EXPERIMENTAL && PM && SWAP
 	---help---
 	  Enable the possibilty of suspending the machine. It doesn't need APM.
 	  You may suspend your machine by 'swsusp' or 'shutdown -z <time>' 

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


end of thread, other threads:[~2003-04-27 18:51 UTC | newest]

Thread overview: 68+ messages
2003-04-23 23:47 Fix SWSUSP & !SWAP Grover, Andrew
2003-04-24  0:03 ` Pavel Machek
2003-04-23 23:57   ` Martin J. Bligh
2003-04-24  0:25     ` Pavel Machek
2003-04-24  0:37       ` CaT
2003-04-24  0:49       ` Martin J. Bligh
2003-04-24  9:16         ` Pavel Machek
2003-04-24  0:02   ` Nigel Cunningham
2003-04-24  0:23     ` Pavel Machek
2003-04-24  0:45     ` Martin J. Bligh
2003-04-24  3:17       ` Nigel Cunningham
2003-04-24  4:37         ` Martin J. Bligh
2003-04-24  7:49           ` Marc Giger
2003-04-24  9:27           ` Pavel Machek
2003-04-24  3:49   ` David Ford
2003-04-24  6:54     ` Jörn Engel
2003-04-24  7:01     ` Elladan
  -- strict thread matches above, loose matches on Subject: below --
2003-04-23 13:51 Pavel Machek
2003-04-23 14:32 ` Geert Uytterhoeven
2003-04-23 14:47   ` Pavel Machek
2003-04-23 15:56     ` gigerstyle
2003-04-23 19:41       ` Nigel Cunningham
2003-04-23 20:36         ` Marc Giger
2003-04-23 22:25           ` Nigel Cunningham
2003-04-23 23:28             ` Martin J. Bligh
2003-04-23 23:58               ` Pavel Machek
2003-04-23 23:55                 ` Martin J. Bligh
2003-04-24  0:07                 ` Andrew Morton
2003-04-24  0:17                   ` CaT
2003-04-24  0:16                     ` Nigel Cunningham
2003-04-24  0:26                       ` Randy.Dunlap
2003-04-24  0:31                         ` CaT
2003-04-24  0:38                         ` Andrew Morton
2003-04-24  0:54                           ` CaT
2003-04-24  1:06                             ` Andrew Morton
2003-04-24  8:48                           ` John Bradford
2003-04-24  0:37                       ` Andrew Morton
2003-04-24  9:12                         ` Pavel Machek
2003-04-24  9:25                           ` Andrew Morton
2003-04-24  9:35                             ` Pavel Machek
2003-04-24  9:46                               ` Andrew Morton
2003-04-24 11:13                                 ` Nigel Cunningham
2003-04-24 11:36                                   ` Andrew Morton
2003-04-24 14:26                                     ` Pavel Machek
2003-04-24 16:37                                       ` Andreas Dilger
2003-04-24 20:48                                         ` Pavel Machek
2003-04-24 21:46                                           ` Andreas Dilger
2003-04-25  1:09                                             ` Nigel Cunningham
2003-04-25 12:59                                               ` Pavel Machek
2003-04-25 16:20                                                 ` Andreas Dilger
2003-04-25 18:28                                                   ` Nigel Cunningham
2003-04-25 19:32                                                     ` Jamie Lokier
2003-04-25 19:58                                                     ` Andreas Dilger
2003-04-27 18:59                                                   ` Pavel Machek
2003-04-24 11:36                                   ` Geert Uytterhoeven
2003-04-25  1:22                                     ` H. Peter Anvin
2003-04-25  1:19                                       ` Nigel Cunningham
2003-04-25  1:31                                       ` Hua Zhong
2003-04-25 19:41                                         ` H. Peter Anvin
2003-04-25  4:27                                       ` Andreas Dilger
2003-04-25  4:33                                         ` H. Peter Anvin
2003-04-24  0:25                   ` Pavel Machek
2003-04-24  9:01                     ` Andrew Morton
2003-04-24  9:14                       ` Pavel Machek
2003-04-24  9:05                     ` Jamie Lokier
2003-04-24  9:34                       ` Pavel Machek
2003-04-24 15:22                     ` Carl-Daniel Hailfinger
2003-04-24  8:00             ` Marc Giger
