* Re: [Bug #13058] First hibernation attempt fails
From: Pavel Machek @ 2009-04-07  8:06 UTC
  To: Linus Torvalds
  Cc: Jens Axboe, Alan Jenkins, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Andrew Morton

Hi!

> And the thing is, swsusp_save() really does do odd things. For example, to 
> get rid of unnecessary memory, it does "drain_local_pages()", where the 
> "local" is "local cpu". Why does it do that? Likely nobody knows.

I do :-). Atomic image copying needs to copy any and all used pages,
and needs to know beforehand how many to copy. Local (per-CPU) pages are
strange in this area, so we just get rid of them to simplify stuff.

> For example, there is a magic "PAGES_FOR_IO" #define, which is somewhat 
> arbitrarily set to 4MB worth of pages. Where did that number come from? 
> Who knows? But that's the number the code uses for the _initial_
> check of 

I picked that out of thin air. The intent there is to make sure small
(<100K, let's say) allocations will work during suspend.

> And the thing is, that "swsusp_shrink_memory()" is just full of 
> heuristics. There's no hard numbers there. It doesn't seem to wait for 
> writeout, it just does the equivalent of "shrink_list()" and 
> "shrink_slab()", but it seems to have been basically cribbed half-way 
> from the regular "try to free memory", without really doing it all.

akpm designed shrink_memory(). A long time ago it was just a while (1)
kmalloc() loop. It should be waiting. Andrew?

> Just as an example: it does that "zone_is_all_unreclaimable()" logic that 
> expects kswapd to mark things reclaimable again, but it doesn't seem to 
> actually ever wait for kswapd or pdflush. It also seems to set 
> "swappiness" to zero etc. Maybe it's all intentional, but it does mean 
> that it uses some shared heuristics with the "real" VM, but uses them 
> differently.

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* 2.6.30-rc2-git2: Reported regressions from 2.6.29
From: Rafael J. Wysocki @ 2009-04-16 21:42 UTC
  To: Linux Kernel Mailing List
  Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List

This message contains a list of some regressions from 2.6.29 for which there
are no fixes in the mainline that I know of.  If any of them have been fixed
already, please let me know.

If you know of any other unresolved regressions from 2.6.29, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-04-17       37       35          28


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13126
Subject		: BUG: MAX_LOCKDEP_ENTRIES too low! when mounting rootfs
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-15 12:43 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123979949820538&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13125
Subject		: active uvcvideo breaks over suspend
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-15 10:12 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123979009508840&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13124
Subject		: ioatdma: DMA-API: device driver frees DMA memory with wrong function
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-09 12:36 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=123928064322503&w=4
Handled-By	: Dan Williams <dan.j.williams@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13122
Subject		: reiserfs_delete_xattrs: Couldn't delete all xattrs (-13)
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-16 19:23 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=123990989515105&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13121
Subject		: commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video
Submitter	: Maxim Levitsky <maximlevitsky@gmail.com>
Date		: 2009-04-16 11:37 (1 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1a7c618a3f7bef1a20ae740df512eeba21397fa5
References	: http://marc.info/?l=linux-kernel&m=123988189401913&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13119
Subject		: Trouble with make-install from a NFS mount
Submitter	: Gregory Haskins <ghaskins@novell.com>
Date		: 2009-04-14 21:32 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=123974482327044&w=4
Handled-By	: H. Peter Anvin <hpa@zytor.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13118
Subject		: iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49
Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
Date		: 2009-04-10 16:05 (7 days old)
References	: http://lkml.org/lkml/2009/4/10/111
Handled-By	: Eric Dumazet <dada1@cosmosbay.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13116
Subject		: Can't boot with nosmp
Submitter	: Stephen Hemminger <shemminger@vyatta.com>
Date		: 2009-04-15 4:18 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123976917817920&w=4
Handled-By	: Dan Williams <dan.j.williams@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13114
Subject		: USB storage (usbstick) automount woes
Submitter	: Mike Galbraith <efault@gmx.de>
Date		: 2009-04-09 9:26 (8 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e6e244b6cb1f70e7109381626293cd40a8334ed3
References	: http://marc.info/?l=linux-kernel&m=123926928907568&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13112
Subject		: Oops in drain_array
Submitter	: Bart <mmx@riz.pl>
Date		: 2009-04-14 10:21 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=123970493224628&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13111
Subject		: Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
Submitter	: Robin Holt <holt@sgi.com>
Date		: 2009-04-08 7:12 (9 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e4f341103e4a2b35f56a0f89802f1b1448e8d04b
References	: http://marc.info/?l=linux-kernel&m=123917477312823&w=4
Handled-By	: Matt Carlson <mcarlson@broadcom.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13110
Subject		: 2.6.30-rc1 problems with firmware loading
Submitter	: Ben Castricum <mail0904@bencastricum.nl>
Date		: 2009-04-12 6:20 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=123951774919978&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13109
Subject		: High latency on /sys/class/thermal
Submitter	: Tiago Simões Batista <tiagosbatista@gmail.com>
Date		: 2009-04-11 14:56 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=123946182301248&w=4
Handled-By	: Zhang Rui <rui.zhang@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13108
Subject		: 2.6.30-rc1: white screen during boot (regression) on spitz
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-04-10 10:34 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=123935954223418&w=4
Handled-By	: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13107
Subject		: LTP 20080131 causes defunct processes w/2.6.30-rc1
Submitter	: Kumar Gala <galak@kernel.crashing.org>
Date		: 2009-04-09 15:43 (8 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b3bfa0cba867f23365b81658b47efd906830879b
References	: http://marc.info/?l=linux-kernel&m=123929187208953&w=4
Handled-By	: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13106
Subject		: 2.6.30-rc1: intel 3945 no wireless
Submitter	: 2.6.30-rc1: intel 3945 no wireless
Date		: 2009-04-08 5:36 (9 days old)
References	: http://marc.info/?l=linux-kernel&m=123916905605534&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13101
Subject		: BUG: scheduling while atomic: swapper/0/0x10000100
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2009-04-07 7:37 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=123908995822195&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
Submitter	: Ingo Molnar <mingo@elte.hu>
Date		: 2009-04-06 9:03 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
Handled-By	: Stephen Hemminger <shemminger@vyatta.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13098
Subject		: 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
Submitter	: Andi Kleen <andi@firstfloor.org>
Date		: 2009-04-06 01:14 (11 days old)
References	: http://lkml.org/lkml/2009/4/5/200
Handled-By	: H. Peter Anvin <hpa@zytor.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13097
Subject		: Kernel will freeze network after using a tun/tap device
Submitter	: Dâniel Fraga <fragabr@gmail.com>
Date		: 2009-04-15 22:19 (2 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13087
Subject		: boot hang due to commit ff69f2bba67bd45514923aaedbf40fe351787c59
Submitter	: Bruno <bonbons67@internet.lu>
Date		: 2009-04-14 17:51 (3 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ff69f2bba67bd45514923aaedbf40fe351787c59


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13069
Subject		: regression in 2.6.29-git3 on SH/Dreamcast
Submitter	: Adrian McMenamin <adrian@newgolddream.dyndns.info>
Date		: 2009-03-29 19:04 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=123835353115372&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13068
Subject		: Lockdep warining in inotify_dev_queue_event
Submitter	: Sachin Sant <sachinp@in.ibm.com>
Date		: 2009-04-05 12:37 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=123893439229272&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13067
Subject		: iwl3945: wlan0: beacon loss from AP - sending probe request
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2009-04-05 9:11 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=123892272218266&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13066
Subject		: Intel HD Audio oops
Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
Date		: 2009-04-01 8:28 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=123857454625829&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
Subject		: First hibernation attempt fails
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-10 10:58 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13044
Subject		: 2.6.30-rc1 can't find the root fs
Submitter	: Heinz Diehl <htd@fancy-poultry.org>
Date		: 2009-04-08 13:35 (9 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13031
Subject		: Deadlock/hang in SATA probe
Submitter	: Petr Vandrovec <petr@vandrovec.name>
Date		: 2009-04-06 23:33 (11 days old)


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13123
Subject		: 20 ACPI interrupts per second on EEEPC 4G
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-12 15:54 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=123955169317870&w=4
Handled-By	: Matthew Garrett <mjg59@srcf.ucam.org>
Patch		: http://marc.info/?l=linux-kernel&m=123973665713690&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13120
Subject		: BUG: using rootfstype=ext4 causes oops
Submitter	: Andrew Price <andy@andrewprice.me.uk>
Date		: 2009-04-15 20:59 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123982932807371&w=4
Handled-By	: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Patch		: http://marc.info/?l=linux-kernel&m=123991090816794&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13115
Subject		: microcode driver newly spews warnings
Submitter	: Jeff Garzik <jeff@garzik.org>
Date		: 2009-04-13 18:23 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=123964711725007&w=4
Handled-By	: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Patch		: http://marc.info/?l=linux-kernel&m=123980715900884&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13113
Subject		: tiobench read 50% regression with 2.6.30-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-04-09 8:29 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=123926576802992&w=4
Handled-By	: Jens Axboe <jens.axboe@oracle.com>
Patch		: http://marc.info/?l=linux-kernel&m=123971130800697&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13096
Subject		: 2.6.30-rc2 hangs in get_measured_perf on tigerton
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-04-15 14:01 (2 days old)
References	: http://lkml.org/lkml/2009/4/15/34
Handled-By	: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>
Patch		: http://lkml.org/lkml/2009/4/15/355


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13095
Subject		: thinkpad-acpi: cannot control brightness with hotkeys
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-04-11 23:07 (6 days old)
References	: http://lkml.org/lkml/2009/4/11/160
Handled-By	: Matthew Garrett <mjg59@srcf.ucam.org>
Patch		: http://lkml.org/lkml/2009/4/15/339


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13048
Subject		: /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
Submitter	: Rodrigo L. Batista <rodrigo@gus-mg.org>
Date		: 2009-04-09 04:57 (8 days old)
Handled-By	: yakui_zhao <yakui.zhao@intel.com>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=20967
		  http://bugzilla.kernel.org/attachment.cgi?id=20959


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.29,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13070

Please let me know if there are any Bugzilla entries that should be added to
that list.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [Bug #13031] Deadlock/hang in SATA probe
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:42   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:42 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Petr Vandrovec

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13031
Subject		: Deadlock/hang in SATA probe
Submitter	: Petr Vandrovec <petr@vandrovec.name>
Date		: 2009-04-06 23:33 (11 days old)




* [Bug #13048] /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Eric Anholt, Len Brown, Matthew Garrett,
	Matthew Garrett, Rodrigo L. Batista, yakui_zhao

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13048
Subject		: /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
Submitter	: Rodrigo L. Batista <rodrigo@gus-mg.org>
Date		: 2009-04-09 04:57 (8 days old)
Handled-By	: yakui_zhao <yakui.zhao@intel.com>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=20967
		  http://bugzilla.kernel.org/attachment.cgi?id=20959




* [Bug #13044] 2.6.30-rc1 can't find the root fs
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Heinz Diehl, Ingo Molnar

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13044
Subject		: 2.6.30-rc1 can't find the root fs
Submitter	: Heinz Diehl <htd@fancy-poultry.org>
Date		: 2009-04-08 13:35 (9 days old)




* [Bug #13058] First hibernation attempt fails
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, Jens Axboe, Linus Torvalds

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
Subject		: First hibernation attempt fails
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-10 10:58 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2




* [Bug #13066] Intel HD Audio oops
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jeff Chua

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13066
Subject		: Intel HD Audio oops
Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
Date		: 2009-04-01 8:28 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=123857454625829&w=4




* [Bug #13067] iwl3945: wlan0: beacon loss from AP - sending probe request
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Maciej Rutecki

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13067
Subject		: iwl3945: wlan0: beacon loss from AP - sending probe request
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2009-04-05 9:11 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=123892272218266&w=4




* [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Adrian McMenamin, Manuel Lauss

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13069
Subject		: regression in 2.6.29-git3 on SH/Dreamcast
Submitter	: Adrian McMenamin <adrian@newgolddream.dyndns.info>
Date		: 2009-03-29 19:04 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=123835353115372&w=4




* [Bug #13068] Lockdep warining in inotify_dev_queue_event
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Sachin Sant

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13068
Subject		: Lockdep warining in inotify_dev_queue_event
Submitter	: Sachin Sant <sachinp@in.ibm.com>
Date		: 2009-04-05 12:37 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=123893439229272&w=4




* [Bug #13095] thinkpad-acpi: cannot control brightness with hotkeys
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Henrique de Moraes Holschuh,
	Matthew Garrett, Maxim Levitsky, Niel Lambrechts, Zhang Rui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13095
Subject		: thinkpad-acpi: cannot control brightness with hotkeys
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-04-11 23:07 (6 days old)
References	: http://lkml.org/lkml/2009/4/11/160
Handled-By	: Matthew Garrett <mjg59@srcf.ucam.org>
Patch		: http://lkml.org/lkml/2009/4/15/339




* [Bug #13087] boot hang due to commit ff69f2bba67bd45514923aaedbf40fe351787c59
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Bruno

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13087
Subject		: boot hang due to commit ff69f2bba67bd45514923aaedbf40fe351787c59
Submitter	: Bruno <bonbons67@internet.lu>
Date		: 2009-04-14 17:51 (3 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ff69f2bba67bd45514923aaedbf40fe351787c59




* [Bug #13097] Kernel will freeze network after using a tun/tap device
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Dâniel Fraga

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13097
Subject		: Kernel will freeze network after using a tun/tap device
Submitter	: Dâniel Fraga <fragabr@gmail.com>
Date		: 2009-04-15 22:19 (2 days old)




* [Bug #13096] 2.6.30-rc2 hangs in get_measured_perf on tigerton
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Pallipadi, Venkatesh, Zhang, Yanmin

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13096
Subject		: 2.6.30-rc2 hangs in get_measured_perf on tigerton
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-04-15 14:01 (2 days old)
References	: http://lkml.org/lkml/2009/4/15/34
Handled-By	: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>
Patch		: http://lkml.org/lkml/2009/4/15/355




* [Bug #13106] 2.6.30-rc1: intel 3945 no wireless
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, 2.6.30-rc1: intel 3945 no wireless, Larry Finger

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13106
Subject		: 2.6.30-rc1: intel 3945 no wireless
Submitter	: 2.6.30-rc1: intel 3945 no wireless
Date		: 2009-04-08 5:36 (9 days old)
References	: http://marc.info/?l=linux-kernel&m=123916905605534&w=4




* [Bug #13099] net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Ingo Molnar, Stephen Hemminger

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
Submitter	: Ingo Molnar <mingo@elte.hu>
Date		: 2009-04-06 9:03 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
Handled-By	: Stephen Hemminger <shemminger@vyatta.com>




* [Bug #13098] 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Andi Kleen, H. Peter Anvin

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13098
Subject		: 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
Submitter	: Andi Kleen <andi@firstfloor.org>
Date		: 2009-04-06 01:14 (11 days old)
References	: http://lkml.org/lkml/2009/4/5/200
Handled-By	: H. Peter Anvin <hpa@zytor.com>




* [Bug #13101] BUG: scheduling while atomic: swapper/0/0x10000100
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Maciej Rutecki, Marcel Holtmann, Thomas Gleixner

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13101
Subject		: BUG: scheduling while atomic: swapper/0/0x10000100
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2009-04-07 7:37 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=123908995822195&w=4




* [Bug #13108] 2.6.30-rc1: white screen during boot (regression) on spitz
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Dmitry Eremin-Solenikov, Pavel Machek,
	Peter Zijlstra

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13108
Subject		: 2.6.30-rc1: white screen during boot (regression) on spitz
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-04-10 10:34 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=123935954223418&w=4
Handled-By	: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>




* [Bug #13107] LTP 20080131 causes defunct processes w/2.6.30-rc1
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Morton, Kumar Gala, Linus Torvalds,
	Sukadev Bhattiprolu

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13107
Subject		: LTP 20080131 causes defunct processes w/2.6.30-rc1
Submitter	: Kumar Gala <galak@kernel.crashing.org>
Date		: 2009-04-09 15:43 (8 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b3bfa0cba867f23365b81658b47efd906830879b
References	: http://marc.info/?l=linux-kernel&m=123929187208953&w=4
Handled-By	: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>




* [Bug #13109] High latency on /sys/class/thermal
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Tiago Simões Batista, Zhang Rui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13109
Subject		: High latency on /sys/class/thermal
Submitter	: Tiago Simões Batista <tiagosbatista@gmail.com>
Date		: 2009-04-11 14:56 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=123946182301248&w=4
Handled-By	: Zhang Rui <rui.zhang@intel.com>




* [Bug #13110] 2.6.30-rc1 problems with firmware loading
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Ben Castricum

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13110
Subject		: 2.6.30-rc1 problems with firmware loading
Submitter	: Ben Castricum <mail0904@bencastricum.nl>
Date		: 2009-04-12 6:20 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=123951774919978&w=4




* [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Benjamin Li, David S. Miller, Matt Carlson,
	Michael Chan, Robin Holt

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13111
Subject		: Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
Submitter	: Robin Holt <holt@sgi.com>
Date		: 2009-04-08 7:12 (9 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e4f341103e4a2b35f56a0f89802f1b1448e8d04b
References	: http://marc.info/?l=linux-kernel&m=123917477312823&w=4
Handled-By	: Matt Carlson <mcarlson@broadcom.com>




* [Bug #13113] tiobench read 50% regression with 2.6.30-rc1
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jens Axboe, Zhang, Yanmin

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13113
Subject		: tiobench read 50% regression with 2.6.30-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-04-09 8:29 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=123926576802992&w=4
Handled-By	: Jens Axboe <jens.axboe@oracle.com>
Patch		: http://marc.info/?l=linux-kernel&m=123971130800697&w=4




* [Bug #13112] Oops in drain_array
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Bart

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13112
Subject		: Oops in drain_array
Submitter	: Bart <mmx@riz.pl>
Date		: 2009-04-14 10:21 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=123970493224628&w=4




* [Bug #13114] USB storage (usbstick) automount woes
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Stern, Greg Kroah-Hartman, Mike Galbraith

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13114
Subject		: USB storage (usbstick) automount woes
Submitter	: Mike Galbraith <efault@gmx.de>
Date		: 2009-04-09 9:26 (8 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e6e244b6cb1f70e7109381626293cd40a8334ed3
References	: http://marc.info/?l=linux-kernel&m=123926928907568&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>




* [Bug #13115] microcode driver newly spews warnings
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Dmitry Adamushko, Jeff Garzik

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13115
Subject		: microcode driver newly spews warnings
Submitter	: Jeff Garzik <jeff@garzik.org>
Date		: 2009-04-13 18:23 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=123964711725007&w=4
Handled-By	: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Patch		: http://marc.info/?l=linux-kernel&m=123980715900884&w=4




* [Bug #13116] Can't boot with nosmp
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Dan Williams, Stephen Hemminger

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13116
Subject		: Can't boot with nosmp
Submitter	: Stephen Hemminger <shemminger@vyatta.com>
Date		: 2009-04-15 4:18 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123976917817920&w=4
Handled-By	: Dan Williams <dan.j.williams@intel.com>




* [Bug #13119] Trouble with make-install from a NFS mount
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Gregory Haskins, H. Peter Anvin, Sam Ravnborg

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13119
Subject		: Trouble with make-install from a NFS mount
Submitter	: Gregory Haskins <ghaskins@novell.com>
Date		: 2009-04-14 21:32 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=123974482327044&w=4
Handled-By	: H. Peter Anvin <hpa@zytor.com>




* [Bug #13118] iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Eric Dumazet, Jeff Chua, Patrick McHardy,
	Stephen Hemminger

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13118
Subject		: iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49
Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
Date		: 2009-04-10 16:05 (7 days old)
References	: http://lkml.org/lkml/2009/4/10/111
Handled-By	: Eric Dumazet <dada1@cosmosbay.com>




* [Bug #13120] BUG: using rootfstype=ext4 causes oops
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Price, Bartlomiej Zolnierkiewicz

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13120
Subject		: BUG: using rootfstype=ext4 causes oops
Submitter	: Andrew Price <andy@andrewprice.me.uk>
Date		: 2009-04-15 20:59 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123982932807371&w=4
Handled-By	: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Patch		: http://marc.info/?l=linux-kernel&m=123991090816794&w=4




* [Bug #13121] commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Len Brown, Matthew Garrett, Maxim Levitsky,
	Thomas Renninger, Zhang Rui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13121
Subject		: commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video
Submitter	: Maxim Levitsky <maximlevitsky@gmail.com>
Date		: 2009-04-16 11:37 (1 day old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1a7c618a3f7bef1a20ae740df512eeba21397fa5
References	: http://marc.info/?l=linux-kernel&m=123988189401913&w=4




* [Bug #13123] 20 ACPI interrupts per second on EEEPC 4G
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, Matthew Garrett

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13123
Subject		: 20 ACPI interrupts per second on EEEPC 4G
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-12 15:54 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=123955169317870&w=4
Handled-By	: Matthew Garrett <mjg59@srcf.ucam.org>
Patch		: http://marc.info/?l=linux-kernel&m=123973665713690&w=4




* [Bug #13122] reiserfs_delete_xattrs: Couldn't delete all xattrs (-13)
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Alexander Beregalov

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13122
Subject		: reiserfs_delete_xattrs: Couldn't delete all xattrs (-13)
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-16 19:23 (1 day old)
References	: http://marc.info/?l=linux-kernel&m=123990989515105&w=4




* [Bug #13124] ioatdma: DMA-API: device driver frees DMA memory with wrong function
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alexander Beregalov, Dan Williams

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13124
Subject		: ioatdma: DMA-API: device driver frees DMA memory with wrong function
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-09 12:36 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=123928064322503&w=4
Handled-By	: Dan Williams <dan.j.williams@intel.com>




* [Bug #13121] commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Len Brown, Matthew Garrett, Maxim Levitsky,
	Thomas Renninger, Zhang Rui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13121
Subject		: commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video
Submitter	: Maxim Levitsky <maximlevitsky-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-04-16 11:37 (1 day old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1a7c618a3f7bef1a20ae740df512eeba21397fa5
References	: http://marc.info/?l=linux-kernel&m=123988189401913&w=4


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [Bug #13123] 20 ACPI interrupts per second on EEEPC 4G
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, Matthew Garrett

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13123
Subject		: 20 ACPI interrupts per second on EEEPC 4G
Submitter	: Alan Jenkins <alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA@public.gmane.org>
Date		: 2009-04-12 15:54 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=123955169317870&w=4
Handled-By	: Matthew Garrett <mjg59-1xO5oi07KQx4cg9Nei1l7Q@public.gmane.org>
Patch		: http://marc.info/?l=linux-kernel&m=123973665713690&w=4


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [Bug #13125] active uvcvideo breaks over suspend
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Alan Jenkins

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13125
Subject		: active uvcvideo breaks over suspend
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2009-04-15 10:12 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123979009508840&w=4



^ permalink raw reply	[flat|nested] 580+ messages in thread

* [Bug #13126] BUG: MAX_LOCKDEP_ENTRIES too low! when mounting rootfs
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-16 21:45   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-16 21:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Alexander Beregalov

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.29.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13126
Subject		: BUG: MAX_LOCKDEP_ENTRIES too low! when mounting rootfs
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-04-15 12:43 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=123979949820538&w=4



^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-17  0:40   ` Linus Torvalds
  -1 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-17  0:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Adrian Bunk, Linux SCSI List, Network Development,
	Linux Kernel Mailing List, Natalie Protasevich, Linux ACPI,
	Andrew Morton, Kernel Testers List, Linux PM List



I think you put this in the wrong regression pile:

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13112
> Subject		: Oops in drain_array
> Submitter	: Bart <mmx@riz.pl>
> Date		: 2009-04-14 10:21 (3 days old)
> References	: http://marc.info/?l=linux-kernel&m=123970493224628&w=4

Hmm. This one seems like it should be in the "since 2.6.28" camp, since if 
I read that one right, it happens with 2.6.29.1.

(I mean sure, it might be new since 2.6.29, but it sounds more likely that 
it's already in 2.6.29)

		Linus

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
                   ` (36 preceding siblings ...)
  (?)
@ 2009-04-17  0:41 ` David Miller
  2009-04-17 21:27   ` Rafael J. Wysocki
  2009-04-17 21:27   ` Rafael J. Wysocki
  -1 siblings, 2 replies; 580+ messages in thread
From: David Miller @ 2009-04-17  0:41 UTC (permalink / raw)
  To: rjw
  Cc: linux-kernel, bunk, akpm, torvalds, protasnb, kernel-testers,
	netdev, linux-acpi, linux-pm, linux-scsi

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Thu, 16 Apr 2009 23:42:31 +0200 (CEST)

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
> Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
> Submitter	: Ingo Molnar <mingo@elte.hu>
> Date		: 2009-04-06 9:03 (11 days old)
> References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
> Handled-By	: Stephen Hemminger <shemminger@vyatta.com>

Fixed by:

commit d407e32efe060afa2b9a797a91376ebc65b4ce11
Author: Anton Vorontsov <avorontsov@ru.mvista.com>
Date:   Wed Apr 1 02:23:41 2009 +0400

    PCI: Fix oops in pci_vpd_truncate
    
    pci_vpd_truncate() should check for dev->vpd->attr, otherwise this might
    happen:
    
      sky2 driver version 1.22
      Unable to handle kernel paging request for data at address 0x0000000c
      Faulting instruction address: 0xc01836fc
      Oops: Kernel access of bad area, sig: 11 [#1]
      [...]
      NIP [c01836fc] pci_vpd_truncate+0x38/0x40
      LR [c029be18] sky2_probe+0x14c/0x518
      Call Trace:
      [ef82bde0] [c029bda4] sky2_probe+0xd8/0x518 (unreliable)
      [ef82be20] [c018a11c] local_pci_probe+0x24/0x34
      [ef82be30] [c018a14c] pci_call_probe+0x20/0x30
      [ef82be50] [c018a330] __pci_device_probe+0x64/0x78
      [ef82be60] [c018a44c] pci_device_probe+0x30/0x58
      [ef82be80] [c01aa270] really_probe+0x78/0x1a0
      [ef82bea0] [c01aa460] __driver_attach+0xa4/0xa8
      [ef82bec0] [c01a96ac] bus_for_each_dev+0x60/0x9c
      [ef82bef0] [c01aa0b4] driver_attach+0x24/0x34
      [ef82bf00] [c01a9e08] bus_add_driver+0x12c/0x1cc
      [ef82bf20] [c01aa87c] driver_register+0x6c/0x110
      [ef82bf30] [c018a770] __pci_register_driver+0x4c/0x9c
      [ef82bf50] [c03782c8] sky2_init_module+0x30/0x40
      [ef82bf60] [c0001dbc] do_one_initcall+0x34/0x1a0
      [ef82bfd0] [c0362240] do_initcalls+0x38/0x58
    
    This happens with CONFIG_SKY2=y, and "ip=on" kernel command line, so
    pci_vpd_truncate() is called before late_initcall(pci_sysfs_init),
    therefore ->attr isn't yet initialized.
    
    Acked-by: Stephen Hemminger <shemminger@vyatta.com>
    Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
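
The check described in the commit message above can be sketched in
isolation.  This is a minimal user-space illustration, not the kernel
source: the struct and function names (vpd_attr, vpd_info, fake_pci_dev,
vpd_truncate_sketch) are hypothetical stand-ins, and only the idea of
testing ->attr before dereferencing it comes from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct vpd_attr {
	size_t size;            /* size exposed through sysfs */
};

struct vpd_info {
	size_t len;             /* usable VPD length */
	struct vpd_attr *attr;  /* NULL until the sysfs attribute is set up */
};

struct fake_pci_dev {
	struct vpd_info *vpd;   /* NULL if the device has no VPD */
};

/* Shrink the VPD length; tolerate a not-yet-created sysfs attribute. */
static int vpd_truncate_sketch(struct fake_pci_dev *dev, size_t size)
{
	assert(dev != NULL);

	if (!dev->vpd)
		return -EINVAL;
	if (size > dev->vpd->len)   /* sketch assumption: may only shrink */
		return -EINVAL;

	dev->vpd->len = size;
	if (dev->vpd->attr)         /* the guard the commit message asks for */
		dev->vpd->attr->size = size;

	return 0;
}
```

With attr left NULL, as it would be for a built-in driver probing before
late_initcall(pci_sysfs_init) has run, the call updates len and returns 0
instead of faulting.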


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  0:43     ` David Miller
  -1 siblings, 0 replies; 580+ messages in thread
From: David Miller @ 2009-04-17  0:43 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, kernel-testers, benli, mcarlson, mchan, holt

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Thu, 16 Apr 2009 23:45:05 +0200 (CEST)

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13111
> Subject		: Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
> Submitter	: Robin Holt <holt@sgi.com>
> Date		: 2009-04-08 7:12 (9 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e4f341103e4a2b35f56a0f89802f1b1448e8d04b
> References	: http://marc.info/?l=linux-kernel&m=123917477312823&w=4
> Handled-By	: Matt Carlson <mcarlson@broadcom.com>

We're half-way to a fix for this, see the commit below.

But we're not completely finished, so keep this entry open.

commit 0d489ffb76de0fe804cf06a9d4d11fa7342d74b9
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Mon Apr 13 14:31:51 2009 -0700

    tg3: fix big endian MAC address collection failure
    
    We noticed on parisc that our broadcoms all swapped MAC addresses going
    from 2.6.29 to 2.6.30-rc1:
    
    Apr 11 07:48:24 ion kernel: eth0: Tigon3 [partno(BCM95700A6) rev 0105] (PCI:66MHz:64-bit) MAC address 00:30:6e:4b:15:59
    Apr 13 07:34:34 ion kernel: eth0: Tigon3 [partno(BCM95700A6) rev 0105] (PCI:66MHz:64-bit) MAC address 00:00:59:15:4b:6e
    
    The problem patch is:
    
    commit 6d348f2c1e0bb1cf7a494b51fc921095ead3f6ae
    Author: Matt Carlson <mcarlson@broadcom.com>
    Date:   Wed Feb 25 14:25:52 2009 +0000
    
        tg3: Eliminate tg3_nvram_read_swab()
    
    With the root cause being the use of memcpy to set the mac address:
    
       memcpy(&dev->dev_addr[0], ((char *)&hi) + 2, 2);
       memcpy(&dev->dev_addr[2], (char *)&lo, sizeof(lo));
    
    This might work on little endian machines, but it can't on big endian
    ones.  You have to use the original setting mechanism to be correct on
    all architectures.
    
    The attached patch fixes parisc.
    
    Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
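
The endianness dependence described in the commit message above can be
reproduced in user space.  A minimal sketch, assuming hi and lo are 32-bit
words as in the quoted memcpy lines; the function names are hypothetical,
and the shift-based variant is only one host-independent way to write the
extraction, not necessarily the code the driver went back to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies raw memory, so the resulting byte order follows the way the
 * host stores a uint32_t (little endian vs. big endian).
 */
static void mac_from_words_memcpy(uint32_t hi, uint32_t lo, uint8_t addr[6])
{
	assert(addr != NULL);
	memcpy(&addr[0], ((char *)&hi) + 2, 2);
	memcpy(&addr[2], (char *)&lo, sizeof(lo));
}

/*
 * Shifts operate on values rather than on memory layout, so this yields
 * the same bytes on every host: exactly what the memcpy version produces
 * on a little-endian machine.
 */
static void mac_from_words_shift(uint32_t hi, uint32_t lo, uint8_t addr[6])
{
	assert(addr != NULL);
	addr[0] = (hi >> 16) & 0xff;
	addr[1] = (hi >> 24) & 0xff;
	addr[2] = (lo >>  0) & 0xff;
	addr[3] = (lo >>  8) & 0xff;
	addr[4] = (lo >> 16) & 0xff;
	addr[5] = (lo >> 24) & 0xff;
}
```

With hi = 0x30000000 and lo = 0x59154b6e (values chosen here for
illustration), the shift version gives 00:30:6e:4b:15:59 on any host,
while the memcpy version gives that address on a little-endian host and
00:00:59:15:4b:6e on a big-endian one, matching the two log lines above.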

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13097] Kernel will freeze network after using a tun/tap device
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  0:44     ` David Miller
  -1 siblings, 0 replies; 580+ messages in thread
From: David Miller @ 2009-04-17  0:44 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, kernel-testers, fragabr, herbert

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Thu, 16 Apr 2009 23:45:02 +0200 (CEST)

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13097
> Subject		: Kernel will freeze network after using a tun/tap device
> Submitter	: Dâniel Fraga <fragabr@gmail.com>
> Date		: 2009-04-15 22:19 (2 days old)

Herbert, what's the state of this?  Is this fixed already as of what I
pushed earlier today to Linus, or will the two patches you are working
on cure this?

Thanks.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13099] net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  0:45     ` Ingo Molnar
  -1 siblings, 0 replies; 580+ messages in thread
From: Ingo Molnar @ 2009-04-17  0:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Stephen Hemminger


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
> Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
> Submitter	: Ingo Molnar <mingo@elte.hu>
> Date		: 2009-04-06 9:03 (11 days old)
> References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
> Handled-By	: Stephen Hemminger <shemminger@vyatta.com>

I think this can be closed as the fix has been merged upstream 
already (via the PCI tree):

 commit d407e32efe060afa2b9a797a91376ebc65b4ce11
 Author: Anton Vorontsov <avorontsov@ru.mvista.com>
 Date:   Wed Apr 1 02:23:41 2009 +0400

    PCI: Fix oops in pci_vpd_truncate

	Ingo

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-17  0:46   ` Linus Torvalds
  -1 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-17  0:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List



On Thu, 16 Apr 2009, Rafael J. Wysocki wrote:
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13098
> Subject		: 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
> Submitter	: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
> Date		: 2009-04-06 01:14 (11 days old)
> References	: http://lkml.org/lkml/2009/4/5/200
> Handled-By	: H. Peter Anvin <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>

I think this got fixed already. The VGA modesetting was reverted back to
the old order.

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13044
> Subject		: 2.6.30-rc1 can't find the root fs
> Submitter	: Heinz Diehl <htd-HjJ2MNWy62to6+H+lsi3Gti2O/JbrIOy@public.gmane.org>
> Date		: 2009-04-08 13:35 (9 days old)

This was one of the async things that got fixed by just waiting for module 
async work to finish.

		Linus

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13106] 2.6.30-rc1: intel 3945 no wireless
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  0:53     ` Larry Finger
  -1 siblings, 0 replies; 580+ messages in thread
From: Larry Finger @ 2009-04-17  0:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	2.6.30-rc1: intel 3945 no wireless

Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13106
> Subject		: 2.6.30-rc1: intel 3945 no wireless
> Submitter	: 2.6.30-rc1: intel 3945 no wireless
> Date		: 2009-04-08 5:36 (9 days old)
> References	: http://marc.info/?l=linux-kernel&m=123916905605534&w=4

That regression was fixed by Herbert Xu's commit 97c18e2c. It should no longer
be listed.

Larry

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13097] Kernel will freeze network after using a tun/tap device
@ 2009-04-17  0:54       ` Herbert Xu
  0 siblings, 0 replies; 580+ messages in thread
From: Herbert Xu @ 2009-04-17  0:54 UTC (permalink / raw)
  To: David Miller; +Cc: rjw, linux-kernel, kernel-testers, fragabr

On Thu, Apr 16, 2009 at 05:44:06PM -0700, David Miller wrote:
> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Thu, 16 Apr 2009 23:45:02 +0200 (CEST)
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13097
> > Subject		: Kernel will freeze network after using a tun/tap device
> > Submitter	: Dâniel Fraga <fragabr@gmail.com>
> > Date		: 2009-04-15 22:19 (2 days old)
> 
> Herbert, what's the state of this?  Is this fixed already as of what I
> pushed earlier today to Linus, or will the two patches you are working
> on cure this?

This bug sounds nothing like the one I've been trying to fix.

I'll take a look at it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
@ 2009-04-17  0:58       ` Matt Carlson
  0 siblings, 0 replies; 580+ messages in thread
From: Matt Carlson @ 2009-04-17  0:58 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, Benjamin Li, Matthew Carlson,
	Michael Chan, holt, James.Bottomley

On Thu, Apr 16, 2009 at 05:43:10PM -0700, David Miller wrote:
> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Thu, 16 Apr 2009 23:45:05 +0200 (CEST)
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13111
> > Subject		: Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
> > Submitter	: Robin Holt <holt@sgi.com>
> > Date		: 2009-04-08 7:12 (9 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e4f341103e4a2b35f56a0f89802f1b1448e8d04b
> > References	: http://marc.info/?l=linux-kernel&m=123917477312823&w=4
> > Handled-By	: Matt Carlson <mcarlson@broadcom.com>
> 
> We're half-way to a fix for this, see the commit below.
> 
> But we're not completely finished, so keep this entry open.

Actually, I think we do have a fix for this.  James and Robin both
reported that the test patch I sent out worked for them.  I'm preparing
a patchset for submission now.

James, Robin, can you confirm that you performed your tests with David's
patch reverted?

> commit 0d489ffb76de0fe804cf06a9d4d11fa7342d74b9
> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date:   Mon Apr 13 14:31:51 2009 -0700
> 
>     tg3: fix big endian MAC address collection failure
>     
>     We noticed on parisc that our broadcoms all swapped MAC addresses going
>     from 2.6.29 to 2.6.30-rc1:
>     
>     Apr 11 07:48:24 ion kernel: eth0: Tigon3 [partno(BCM95700A6) rev 0105] (PCI:66MHz:64-bit) MAC address 00:30:6e:4b:15:59
>     Apr 13 07:34:34 ion kernel: eth0: Tigon3 [partno(BCM95700A6) rev 0105] (PCI:66MHz:64-bit) MAC address 00:00:59:15:4b:6e
>     
>     The problem patch is:
>     
>     commit 6d348f2c1e0bb1cf7a494b51fc921095ead3f6ae
>     Author: Matt Carlson <mcarlson@broadcom.com>
>     Date:   Wed Feb 25 14:25:52 2009 +0000
>     
>         tg3: Eliminate tg3_nvram_read_swab()
>     
>     With the root cause being the use of memcpy to set the mac address:
>     
>        memcpy(&dev->dev_addr[0], ((char *)&hi) + 2, 2);
>        memcpy(&dev->dev_addr[2], (char *)&lo, sizeof(lo));
>     
>     This might work on little endian machines, but it can't on big endian
>     ones.  You have to use the original setting mechanism to be correct on
>     all architectures.
>     
>     The attached patch fixes parisc.
>     
>     Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
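
The memcpy point in the quoted commit message can be illustrated outside the driver. The following is a minimal sketch, not the tg3 code: the function names and the sample values are illustrative assumptions. It shows that copying the in-memory bytes of a 32-bit word gives a result that depends on the host byte order, while extracting bytes with shifts does not.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the least significant byte of a word is stored first. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Host-dependent: copies bytes in whatever order the CPU stores them. */
void mac_from_words_memcpy(uint32_t hi, uint32_t lo, uint8_t mac[6])
{
    memcpy(&mac[0], ((const uint8_t *)&hi) + 2, 2);
    memcpy(&mac[2], &lo, sizeof(lo));
}

/* Host-independent: shifts select bytes by numeric significance. */
void mac_from_words_shift(uint32_t hi, uint32_t lo, uint8_t mac[6])
{
    mac[0] = (uint8_t)((hi >> 8) & 0xff);
    mac[1] = (uint8_t)(hi & 0xff);
    mac[2] = (uint8_t)((lo >> 24) & 0xff);
    mac[3] = (uint8_t)((lo >> 16) & 0xff);
    mac[4] = (uint8_t)((lo >> 8) & 0xff);
    mac[5] = (uint8_t)(lo & 0xff);
}
```

With hi = 0x00000030 and lo = 0x6e4b1559 held as ordinary host-order values, the shift version yields 00:30:6e:4b:15:59 on any machine, whereas the memcpy version yields the same bytes only on a big-endian host. Which host ends up with the wrong address in a real driver depends on how the words were read and swapped beforehand, so the failing side there need not match this sketch.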


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  0:40   ` Linus Torvalds
  (?)
  (?)
@ 2009-04-17  1:25   ` Ingo Molnar
  2009-04-17 21:25     ` Rafael J. Wysocki
       [not found]     ` <20090417012544.GB16126-X9Un+BFzKDI@public.gmane.org>
  -1 siblings, 2 replies; 580+ messages in thread
From: Ingo Molnar @ 2009-04-17  1:25 UTC (permalink / raw)
  To: Linus Torvalds, Arjan van de Ven
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> I think you put this in the wrong regression pile:
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13112
> > Subject		: Oops in drain_array
> > Submitter	: Bart <mmx@riz.pl>
> > Date		: 2009-04-14 10:21 (3 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123970493224628&w=4
> 
> Hmm. This one seems like it should be in the "since 2.6.28" camp, since if 
> I read that one right, it happens with 2.6.29.1.
> 
> (I mean sure, it might be new since 2.6.29, but it sounds more likely that 
> it's already in 2.6.29)

I'd suspect it's possibly hardware related:

  http://www.kerneloops.org/search.php?search=free_block&btnG=Function+Search

Look at the very similar call signatures - spanning almost all 
kernels back to v2.6.16. There's one spike at .27 - perhaps the same 
box trying up hard and crashing several times - or a popular distro 
kernel?

Or it's a really ancient bug going back to v2.6.16.

	Ingo

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-17  1:28   ` Jeff Chua
  -1 siblings, 0 replies; 580+ messages in thread
From: Jeff Chua @ 2009-04-17  1:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List

On Fri, Apr 17, 2009 at 5:42 AM, Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org> wrote:

> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13118
> Subject         : iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49
> Submitter       : Jeff Chua <jeff.chua.linux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date            : 2009-04-10 16:05 (7 days old)
> References      : http://lkml.org/lkml/2009/4/10/111
> Handled-By      : Eric Dumazet <dada1-fPLkHRcR87vqlBn2x/YWAg@public.gmane.org>

Several iterations of patches in progress. See
[PATCH] netfilter: per-cpu spin-lock with recursion (v0.8)


> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13066
> Subject         : Intel HD Audio oops
> Submitter       : Jeff Chua <jeff.chua.linux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date            : 2009-04-01 8:28 (16 days old)
> References      : http://marc.info/?l=linux-kernel&m=123857454625829&w=4

Fixed as of April 09 2009 git pull.


Thanks,
Jeff

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
                   ` (42 preceding siblings ...)
  (?)
@ 2009-04-17  1:30 ` Zhang Rui
  2009-04-17  2:34     ` yakui_zhao
                     ` (3 more replies)
  -1 siblings, 4 replies; 580+ messages in thread
From: Zhang Rui @ 2009-04-17  1:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List

On Fri, 2009-04-17 at 05:42 +0800, Rafael J. Wysocki wrote:
> 
> 
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13095
> Subject         : thinkpad-acpi: cannot control brightness with hotkeys
> Submitter       : Niel Lambrechts <niel.lambrechts@gmail.com>
> Date            : 2009-04-11 23:07 (6 days old)
> References      : http://lkml.org/lkml/2009/4/11/160
> Handled-By      : Matthew Garrett <mjg59@srcf.ucam.org>
> Patch           : http://lkml.org/lkml/2009/4/15/339
> 
> 
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13048
> Subject         : /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
> Submitter       : Rodrigo L. Batista <rodrigo@gus-mg.org>
> Date            : 2009-04-09 04:57 (8 days old)
> Handled-By      : yakui_zhao <yakui.zhao@intel.com>
> Patch           : http://bugzilla.kernel.org/attachment.cgi?id=20967
>                   http://bugzilla.kernel.org/attachment.cgi?id=20959
> 
> 
Bug 13095 is a duplicate of bug 13048.
The patches from Matthew and Yakui address the same issue.

Yakui, could you please verify which patch should be taken?

thanks,
rui

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
@ 2009-04-17  1:37   ` Ming Lei
  -1 siblings, 0 replies; 580+ messages in thread
From: Ming Lei @ 2009-04-17  1:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

2009/4/17 Rafael J. Wysocki <rjw@sisk.pl>:
>
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13125
> Subject         : active uvcvideo breaks over suspend
> Submitter       : Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> Date            : 2009-04-15 10:12 (2 days old)
> References      : http://marc.info/?l=linux-kernel&m=123979009508840&w=4
>

It is a bug in the resume path of the uvcvideo driver. I have sent a patch
to laurent.pinchart@skynet.be, mchehab@infradead.org and
video4linux-list@redhat.com to fix it, but have had no response from them yet.

The patch title is "V4L/DVB:usbvideo:fix uvc resume failed".

Rafael,
        if you would like to apply it, I can resend it to you.  Thanks!

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  1:30 ` Zhang Rui
@ 2009-04-17  2:34     ` yakui_zhao
  2009-04-17  2:34   ` yakui_zhao
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 580+ messages in thread
From: yakui_zhao @ 2009-04-17  2:34 UTC (permalink / raw)
  To: Zhang Rui, mjq59-1xO5oi07KQx4cg9Nei1l7Q
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List

On Fri, 2009-04-17 at 09:30 +0800, Zhang Rui wrote:
> On Fri, 2009-04-17 at 05:42 +0800, Rafael J. Wysocki wrote:
> > 
> > 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13095
> > Subject         : thinkpad-acpi: cannot control brightness with hotkeys
> > Submitter       : Niel Lambrechts <niel.lambrechts-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Date            : 2009-04-11 23:07 (6 days old)
> > References      : http://lkml.org/lkml/2009/4/11/160
> > Handled-By      : Matthew Garrett <mjg59-1xO5oi07KQx4cg9Nei1l7Q@public.gmane.org>
> > Patch           : http://lkml.org/lkml/2009/4/15/339
> > 
> > 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13048
> > Subject         : /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
> > Submitter       : Rodrigo L. Batista <rodrigo-1dof46nAmC8dnm+yROfE0A@public.gmane.org>
> > Date            : 2009-04-09 04:57 (8 days old)
> > Handled-By      : yakui_zhao <yakui.zhao-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Patch           : http://bugzilla.kernel.org/attachment.cgi?id=20967
> >                   http://bugzilla.kernel.org/attachment.cgi?id=20959
> > 
> > 
> bug 13095 is a duplicate of bug 13048.
> patches from Matthew and Yakui are for the same issue.
> 
> Yakui, could you verify which patch should be taken please?
The patch from Matthew is better.
It still works even when KMS is disabled with the boot option
"i915.modeset=0".

Hi Matthew,
    Will you please push the patch?

Thanks.

> 
> thanks,
> rui
> 

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13106] 2.6.30-rc1: intel 3945 no wireless
@ 2009-04-17  3:21       ` Justin Madru
  0 siblings, 0 replies; 580+ messages in thread
From: Justin Madru @ 2009-04-17  3:21 UTC (permalink / raw)
  To: Larry Finger
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Kernel Testers List

Larry Finger wrote:
> Rafael J. Wysocki wrote:
>   
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.29.  Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13106
>> Subject		: 2.6.30-rc1: intel 3945 no wireless
>> Submitter	: 2.6.30-rc1: intel 3945 no wireless
>> Date		: 2009-04-08 5:36 (9 days old)
>> References	: http://marc.info/?l=linux-kernel&m=123916905605534&w=4
>>     
>
> That regression was fixed by Herbert Xu's commit 97c18e2c. It should no longer
> be listed.
>
> Larry
I'm the original submitter. I confirm that it's been fixed -- close bug.

Justin Madru

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13067] iwl3945: wlan0: beacon loss from AP - sending probe request
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  3:38     ` Justin Madru
  -1 siblings, 0 replies; 580+ messages in thread
From: Justin Madru @ 2009-04-17  3:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Maciej Rutecki

Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13067
> Subject		: iwl3945: wlan0: beacon loss from AP - sending probe request
> Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
> Date		: 2009-04-05 9:11 (12 days old)
> References	: http://marc.info/?l=linux-kernel&m=123892272218266&w=4
>
>
>   
I'm seeing this on 2.6.30-rc2, so I can confirm that it's still an issue (I'm
not the original submitter). It's really annoying because it's flooding my
logs. Below are 15 minutes of logs; the pattern just keeps repeating.

dhclient: DHCPREQUEST of 192.168.1.5 on wlan0 to 192.168.1.254 port 67
dhclient: DHCPACK of 192.168.1.5 from 192.168.1.254
dhclient: bound to 192.168.1.5 -- renewal in 823 seconds.
NetworkManager: <debug> [1239937333.003788] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937339.001004] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937453.002911] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937459.000974] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937573.002956] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937579.000971] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937693.003787] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937699.000981] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937813.002710] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937819.000753] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937933.003806] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239937939.000724] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239938053.002711] periodic_update(): Roamed from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
NetworkManager: <debug> [1239938059.001749] periodic_update(): Roamed from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
dhclient: DHCPREQUEST of 192.168.1.5 on wlan0 to 192.168.1.254 port 67
dhclient: DHCPACK of 192.168.1.5 from 192.168.1.254
dhclient: bound to 192.168.1.5 -- renewal in 803 seconds.
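
The NetworkManager timestamps in the log above look evenly spaced. A minimal sketch for checking that, assuming the log has been saved as text; the `loss_intervals` helper and the `SAMPLE` excerpt are illustrative and not part of the original report:

```python
import re

# Matches a NetworkManager roam entry. \s+ tolerates the mail client's
# line wrap between "Roamed" and "from BSSID".
ROAM = re.compile(
    r"\[(\d+\.\d+)\] periodic_update\(\): Roamed\s+from BSSID (\S+)"
)


def loss_intervals(log_text):
    """Return the seconds between consecutive roams away from a real BSSID."""
    losses = [
        float(timestamp)
        for timestamp, bssid in ROAM.findall(log_text)
        if bssid != "(none)"  # skip the "reconnected" entries
    ]
    return [later - earlier for earlier, later in zip(losses, losses[1:])]


# First three loss/reconnect pairs copied from the log above.
SAMPLE = """\
NetworkManager: <debug> [1239937333.003788] periodic_update(): Roamed
from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
NetworkManager: <debug> [1239937339.001004] periodic_update(): Roamed
from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937453.002911] periodic_update(): Roamed
from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
NetworkManager: <debug> [1239937459.000974] periodic_update(): Roamed
from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
NetworkManager: <debug> [1239937573.002956] periodic_update(): Roamed
from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
"""

print([round(seconds) for seconds in loss_intervals(SAMPLE)])  # [120, 120]
```

On this excerpt the loss events are about 120 seconds apart, and each reconnect follows roughly 6 seconds later.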

Justin Madru

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13114] USB storage (usbstick) automount woes
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  4:01     ` Mike Galbraith
  -1 siblings, 0 replies; 580+ messages in thread
From: Mike Galbraith @ 2009-04-17  4:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Alan Stern,
	Greg Kroah-Hartman

On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13114
> Subject		: USB storage (usbstick) automount woes
> Submitter	: Mike Galbraith <efault@gmx.de>
> Date		: 2009-04-09 9:26 (8 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e6e244b6cb1f70e7109381626293cd40a8334ed3
> References	: http://marc.info/?l=linux-kernel&m=123926928907568&w=4
> Handled-By	: Alan Stern <stern@rowland.harvard.edu>

The fix is in the pipe.

	-Mike


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13098] 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  5:24     ` Andi Kleen
  -1 siblings, 0 replies; 580+ messages in thread
From: Andi Kleen @ 2009-04-17  5:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andi Kleen,
	H. Peter Anvin

On Thu, Apr 16, 2009 at 11:45:03PM +0200, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13098
> Subject		: 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
> Submitter	: Andi Kleen <andi@firstfloor.org>
> Date		: 2009-04-06 01:14 (11 days old)
> References	: http://lkml.org/lkml/2009/4/5/200
> Handled-By	: H. Peter Anvin <hpa@zytor.com>

That's already fixed with 5f641356127712fbdce0eee120e5ce115860c17f I think.
At least it works again on my test box with the rcs.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13113] tiobench read 50% regression with 2.6.30-rc1
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  6:29     ` Jens Axboe
  -1 siblings, 0 replies; 580+ messages in thread
From: Jens Axboe @ 2009-04-17  6:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Zhang, Yanmin

On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13113
> Subject		: tiobench read 50% regression with 2.6.30-rc1
> Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
> Date		: 2009-04-09 8:29 (8 days old)
> References	: http://marc.info/?l=linux-kernel&m=123926576802992&w=4
> Handled-By	: Jens Axboe <jens.axboe@oracle.com>
> Patch		: http://marc.info/?l=linux-kernel&m=123971130800697&w=4

It's fixed by d6ceb25e8d8bccf826848c2621a50d02c0a7f4ae, which is already
merged.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17  6:30     ` Jens Axboe
  -1 siblings, 0 replies; 580+ messages in thread
From: Jens Axboe @ 2009-04-17  6:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Alan Jenkins,
	Linus Torvalds

On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
> Subject		: First hibernation attempt fails
> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> Date		: 2009-04-10 10:58 (7 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2

Alan, is this still a problem?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17  8:28       ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-17  8:28 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

Jens Axboe wrote:
> On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
>   
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.29.  Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
>> Subject		: First hibernation attempt fails
>> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
>> Date		: 2009-04-10 10:58 (7 days old)
>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
>> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
>>     
>
> Alan, is this still a problem?
>   

Yup.  Still present in v2.6.30-rc2-195-g9f76208.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17  9:13         ` Jens Axboe
  0 siblings, 0 replies; 580+ messages in thread
From: Jens Axboe @ 2009-04-17  9:13 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

On Fri, Apr 17 2009, Alan Jenkins wrote:
> Jens Axboe wrote:
> > On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> >   
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.29.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
> >> Subject		: First hibernation attempt fails
> >> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> >> Date		: 2009-04-10 10:58 (7 days old)
> >> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
> >> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
> >>     
> >
> > Alan, is this still a problem?
> >   
> 
> Yup.  Still present in v2.6.30-rc2-195-g9f76208.

Given the somewhat odd nature of the bug and the requirements to trigger
it, how confident are you in the bisection results?

I'll try and reproduce it here.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17  9:34           ` Jens Axboe
  0 siblings, 0 replies; 580+ messages in thread
From: Jens Axboe @ 2009-04-17  9:34 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

On Fri, Apr 17 2009, Jens Axboe wrote:
> On Fri, Apr 17 2009, Alan Jenkins wrote:
> > Jens Axboe wrote:
> > > On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> > >   
> > >> This message has been generated automatically as a part of a report
> > >> of recent regressions.
> > >>
> > >> The following bug entry is on the current list of known regressions
> > >> from 2.6.29.  Please verify if it still should be listed and let me know
> > >> (either way).
> > >>
> > >>
> > >> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
> > >> Subject		: First hibernation attempt fails
> > >> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> > >> Date		: 2009-04-10 10:58 (7 days old)
> > >> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
> > >> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
> > >>     
> > >
> > > Alan, is this still a problem?
> > >   
> > 
> > Yup.  Still present in v2.6.30-rc2-195-g9f76208.
> 
> Given the somewhat odd nature of the bug and the requirements to trigger
> it, how confident are you in the bisection results?
> 
> I'll try and reproduce it here.

I can't reproduce it here. It seems very odd that an ENOMEM would happen
as a consequence of the rq allocation change, it doesn't really change
the allocation at all (and it'll never return -ENOMEM).

Can you please recheck the git bisect results? It'd be nice if the
hibernation failure would actually log where the problem occurred...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
  2009-04-17  9:34           ` Jens Axboe
  (?)
@ 2009-04-17  9:38           ` Alan Jenkins
  2009-04-17  9:45               ` Jens Axboe
  -1 siblings, 1 reply; 580+ messages in thread
From: Alan Jenkins @ 2009-04-17  9:38 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

Jens Axboe wrote:
> On Fri, Apr 17 2009, Jens Axboe wrote:
>   
>> On Fri, Apr 17 2009, Alan Jenkins wrote:
>>     
>>> Jens Axboe wrote:
>>>       
>>>> On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
>>>>   
>>>>         
>>>>> This message has been generated automatically as a part of a report
>>>>> of recent regressions.
>>>>>
>>>>> The following bug entry is on the current list of known regressions
>>>>> from 2.6.29.  Please verify if it still should be listed and let me know
>>>>> (either way).
>>>>>
>>>>>
>>>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
>>>>> Subject		: First hibernation attempt fails
>>>>> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
>>>>> Date		: 2009-04-10 10:58 (7 days old)
>>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
>>>>> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
>>>>>     
>>>>>           
>>>> Alan, is this still a problem?
>>>>   
>>>>         
>>> Yup.  Still present in v2.6.30-rc2-195-g9f76208.
>>>       
>> Given the somewhat odd nature of the bug and the requirements to trigger
>> it, how confident are you in the bisection results?
>>
>> I'll try and reproduce it here.
>>     
>
> I can't reproduce it here. It seems very odd that an ENOMEM would happen
> as a consequence of the rq allocation change, it doesn't really change
> the allocation at all (and it'll never return -ENOMEM).
>
> Can you please recheck the git bisect results? It'd be nice if the
> hibernation failure would actually log where the problem occurred...
>   

Once I found the right conditions (wireless disabled and a specific KDE
session), it was 100% reproducible. 

Reverting your commit fixed the problem.  I can do another test of that
if you like.

My _bisection_ was not absolute, rock-solid certain because I only found
the right conditions half-way through.  There's always the possibility I
would get different results if I redid it properly, from the start.  But
I have some experience of this and took care to re-validate my upper &
lower bounds.
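
To illustrate why a bisection whose test conditions only became reliable half-way through can land on the wrong commit, and why re-validating the bounds helps, here is a toy model. The commit numbers, the `bisect` helper and both test callbacks are hypothetical; this is not git's implementation:

```python
def bisect(good, bad, is_bad):
    """Binary-search for the first bad commit index, given good < bad.

    is_bad(commit, step) reports whether the bug was observed at `commit`
    on the `step`-th test of the run.
    """
    step = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid, step):
            bad = mid
        else:
            good = mid
        step += 1
    return bad


FIRST_BAD = 700  # hypothetical culprit among commits 0..1000


def reliable(commit, step):
    # The bug always triggers on commits at or after the culprit.
    return commit >= FIRST_BAD


def flaky_early(commit, step):
    # Assumption: for the first three tests the trigger conditions were not
    # yet known, so every tested commit looked good.
    return step >= 3 and commit >= FIRST_BAD


print(bisect(0, 1000, reliable))     # 700
print(bisect(0, 1000, flaky_early))  # 876
```

In the flaky run the commit just before the reported culprit (875) was marked good during the unreliable phase. Retesting that lower bound under the final conditions would show it is actually bad, which is the kind of error that re-validating the bounds is meant to catch.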

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17  9:45               ` Jens Axboe
  0 siblings, 0 replies; 580+ messages in thread
From: Jens Axboe @ 2009-04-17  9:45 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

On Fri, Apr 17 2009, Alan Jenkins wrote:
> Jens Axboe wrote:
> > On Fri, Apr 17 2009, Jens Axboe wrote:
> >   
> >> On Fri, Apr 17 2009, Alan Jenkins wrote:
> >>     
> >>> Jens Axboe wrote:
> >>>       
> >>>> On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> >>>>   
> >>>>         
> >>>>> This message has been generated automatically as a part of a report
> >>>>> of recent regressions.
> >>>>>
> >>>>> The following bug entry is on the current list of known regressions
> >>>>> from 2.6.29.  Please verify if it still should be listed and let me know
> >>>>> (either way).
> >>>>>
> >>>>>
> >>>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
> >>>>> Subject		: First hibernation attempt fails
> >>>>> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> >>>>> Date		: 2009-04-10 10:58 (7 days old)
> >>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
> >>>>> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
> >>>>>     
> >>>>>           
> >>>> Alan, is this still a problem?
> >>>>   
> >>>>         
> >>> Yup.  Still present in v2.6.30-rc2-195-g9f76208.
> >>>       
> >> Given the somewhat odd nature of the bug and the requirements to trigger
> >> it, how confident are you in the bisection results?
> >>
> >> I'll try and reproduce it here.
> >>     
> >
> > I can't reproduce it here. It seems very odd that an ENOMEM would happen
> > as a consequence of the rq allocation change, it doesn't really change
> > the allocation at all (and it'll never return -ENOMEM).
> >
> > Can you please recheck the git bisect results? It'd be nice if the
> > hibernation failure would actually log where the problem occurred...
> >   
> 
> Once I found the right conditions (wireless disabled and a specific KDE
> session), it was 100% reproducible. 
> 
> Reverting your commit fixed the problem.  I can do another test of that
> if you like.
> 
> My _bisection_ was not absolute, rock-solid certain because I only found
> the right conditions half-way through.  There's always the possibility I
> would get different results if I redid it properly, from the start.  But
> I have some experience of this and took care to re-validate my upper &
> lower bounds.

Well, if you can and have the time, reproducing the bisect results with
the same conditions all the way through would definitely help.

Or perhaps Rafael can suggest adding some printk()s to catch where that
ENOMEM is coming from. That would help; right now I have basically zero
clue where this might be.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 10:46                 ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-17 10:46 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Linus Torvalds

Jens Axboe wrote:
> On Fri, Apr 17 2009, Alan Jenkins wrote:
>   
>> Jens Axboe wrote:
>>     
>>> On Fri, Apr 17 2009, Jens Axboe wrote:
>>>   
>>>       
>>>> On Fri, Apr 17 2009, Alan Jenkins wrote:
>>>>     
>>>>         
>>>>> Jens Axboe wrote:
>>>>>       
>>>>>           
>>>>>> On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
>>>>>>   
>>>>>>         
>>>>>>             
>>>>>>> This message has been generated automatically as a part of a report
>>>>>>> of recent regressions.
>>>>>>>
>>>>>>> The following bug entry is on the current list of known regressions
>>>>>>> from 2.6.29.  Please verify if it still should be listed and let me know
>>>>>>> (either way).
>>>>>>>
>>>>>>>
>>>>>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13058
>>>>>>> Subject		: First hibernation attempt fails
>>>>>>> Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
>>>>>>> Date		: 2009-04-10 10:58 (7 days old)
>>>>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1faa16d22877f4839bd433547d770c676d1d964c
>>>>>>> References	: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
>>>>>>>     
>>>>>>>           
>>>>>>>               
>>>>>> Alan, is this still a problem?
>>>>>>   
>>>>>>         
>>>>>>             
>>>>> Yup.  Still present in v2.6.30-rc2-195-g9f76208.
>>>>>       
>>>>>           
>>>> Given the somewhat odd nature of the bug and the requirements to trigger
>>>> it, how confident are you in the bisection results?
>>>>
>>>> I'll try and reproduce it here.
>>>>     
>>>>         
>>> I can't reproduce it here. It seems very odd that an ENOMEM would happen
>>> as a consequence of the rq allocation change, it doesn't really change
>>> the allocation at all (and it'll never return -ENOMEM).
>>>
>>> Can you please recheck the git bisect results. It'd be nice if the
>>> hibernation failure would actually log where the problem occurred...
>>>   
>>>       
>> Once I found the right conditions (wireless disabled and a specific KDE
>> session), it was 100% reproducible. 
>>
>> Reverting your commit fixed the problem.  I can do another test of that
>> if you like.
>>
>> My _bisection_ was not absolute, rock-solid certain because I only found
>> the right conditions half-way through.  There's always the possibility I
>> would get different results if I redid it properly, from the start.  But
>> I have some experience of this and took care to re-validate my upper &
>> lower bounds.
>>     
>
> Well, if you can and have the time, reproducing the bisect results with
> the same conditions all the way through would definitely help.
>   

I can do that, yes.

As another datapoint:  I tried blindly applying the commit to 2.6.29. 
The resulting kernel was able to hibernate fine the first time.

I'm going to be annoying and try something slightly different.  In
theory, I should be able to find the "first bad commit" where
cherry-picking 1faa16d22 causes a problem.

> Or perhaps Rafael can suggest adding some printk()'s to catch where that
> ENOMEM is coming from. That would help, right now I basically have zero
> clue on where this might be.
>   



^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
@ 2009-04-17 12:21         ` Robin Holt
  0 siblings, 0 replies; 580+ messages in thread
From: Robin Holt @ 2009-04-17 12:21 UTC (permalink / raw)
  To: Matt Carlson
  Cc: David Miller, rjw, linux-kernel, kernel-testers, Benjamin Li,
	Michael Chan, holt, James.Bottomley

> Actually, I think we do have a fix for this.  James and Robin both
> reported that the test patch I sent out worked for them.  I'm preparing
> a patchset for submission now.
> 
> James, Robin, can you confirm that you performed your tests with David's
> patch reverted?

My test was done with the 2.6.28-rc1 kernel plus your patch and no
others.

Robin

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 15:55           ` Linus Torvalds
  0 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-17 15:55 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Alan Jenkins, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List



On Fri, 17 Apr 2009, Jens Axboe wrote:
> 
> Given the somewhat odd nature of the bug and the requirements to trigger
> it, how confident are you in the bisection results?

I suspect it's timing-dependent. 

The failure case is an ENOMEM returned from the "echo disk > /sys/power/state",
and sadly there are a _lot_ of potential sources of ENOMEM's in the path.
And a number of them come from GFP_ATOMIC allocations etc.

Now, that explains why it only happens while in X (more memory being 
used), and also why it succeeds the second time (the first try will have 
triggered VM activity and then free'd the pages it allocated up to that 
point).

IOW, I bet it would work on the first try if you were to just run 
something like

	ptr = malloc(BIGNUM);
	memset(ptr, 0, BIGNUM);
	exit(0);

first - just to make room for stuff.
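
A minimal, self-contained C sketch of the allocate-and-touch idea above.
BIGNUM is not given in the message, so the size is left to the caller; the
name touch_memory() and the use of sysconf() to find the page size are
assumptions made for illustration only. The writes go through a volatile
pointer so that an optimizing compiler cannot drop them.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of memory, write one byte in every page so the kernel
 * has to back the whole range with real pages, then free it again.
 * Returns 0 on success and -1 if the allocation failed.
 */
int touch_memory(size_t bytes)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile char *ptr;
	size_t step, i;

	if (bytes == 0)
		return 0;

	/* Fall back to 4 KiB if the page size is unavailable (assumption). */
	step = page > 0 ? (size_t)page : 4096;

	ptr = malloc(bytes);
	if (ptr == NULL)
		return -1;

	/* Touch the first byte of each page, plus the very last byte. */
	for (i = 0; i < bytes; i += step)
		ptr[i] = 0;
	ptr[bytes - 1] = 0;

	free((void *)ptr);
	return 0;
}
```

Calling touch_memory() from a short-lived process with a size close to the
amount of free RAM, and then exiting, is the experiment suggested above.
Whether it actually avoids the first-attempt failure is not verified here.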

And the thing is, swsusp_save() really does do odd things. For example, to 
get rid of unnecessary memory, it does "drain_local_pages()", where the 
"local" is "local cpu". Why does it do that? Likely nobody knows.

Now, that won't matter in Alan's case (he is UP), but the point is, the 
swsuspend code does these random things to try to free up memory, and I 
suspect it's mostly been a trial-and-error thing. And then subtle changes 
in memory usage when allocating or writing things out will change things.

For example, there is a magic "PAGES_FOR_IO" #define, which is somewhat 
arbitrarily set to 4MB worth of pages. Where did that number come from? 
Who knows? But that's the number the code uses for the _initial_ check of 
"do we have enough memory" (the one that must have passed, since it 
actually started doing things and didn't print out a warning message).
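
To put "4MB worth of pages" into numbers, here is a small C sketch of the
conversion. The helper name bytes_to_pages() is hypothetical, and the page
shift is passed in explicitly because the page size depends on the
architecture; a shift of 12 (4 KiB pages) is assumed in the note below.

```c
/*
 * Convert a byte count into a number of pages for a page size of
 * (1UL << page_shift) bytes. Any partial page is rounded down.
 */
unsigned long bytes_to_pages(unsigned long bytes, unsigned int page_shift)
{
	return bytes >> page_shift;
}
```

With 4 KiB pages, bytes_to_pages(4UL * 1024 * 1024, 12) is 1024, so under
that assumption the initial check reserves 1024 pages.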

Anyway, from the dmesg, we can see:

	[   41.873619] PM: Shrinking memory...  Restarting tasks ... done.

and this is a clear indication that it's "swsusp_shrink_memory()" that 
failed. If it had succeeded, you'd have seen

	PM: Shrinking memory... done (xyz pages freed)

but it returned an error case, and then the suspend fails and starts 
restarting tasks.
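
The reasoning above (a "Shrinking memory..." line that is not followed
directly by "done" means the shrink step bailed out) can be sketched as a
small C check on one log line. The function name shrink_result() is
hypothetical, and the sketch assumes the text appears exactly as quoted in
this message; real console output may carry extra progress characters.

```c
#include <string.h>

/*
 * Classify one log line, using the two message shapes quoted above:
 *   returns  1 if the line reports a completed shrink,
 *   returns  0 if the shrink started but "done" does not follow it directly,
 *   returns -1 if the line does not mention the shrink step at all.
 */
int shrink_result(const char *line)
{
	static const char start[] = "PM: Shrinking memory...";
	const char *p = strstr(line, start);

	if (p == NULL)
		return -1;

	/* Skip the marker and any spaces that follow it. */
	p += sizeof(start) - 1;
	while (*p == ' ')
		p++;

	return strncmp(p, "done", 4) == 0 ? 1 : 0;
}
```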

And the thing is, that "swsusp_shrink_memory()" is just full of 
heuristics. There's no hard numbers there. It doesn't seem to wait for 
writeout, it just does the equivalent of "shrink_list()" and 
"shrink_slab()", but it seems to have been basically cribbed half-way 
from the regular "try to free memory", without really doing it all.

Just as an example: it does that "zone_is_all_unreclaimable()" logic that 
expects kswapd to mark things reclaimable again, but it doesn't seem to 
actually ever wait for kswapd or pdflush. It also seems to set 
"swappiness" to zero etc. Maybe it's all intentional, but it does mean 
that it uses some shared heuristics with the "real" VM, but uses them 
differently.

		Linus

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 16:00                   ` Linus Torvalds
  0 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-17 16:00 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Jens Axboe, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List



On Fri, 17 Apr 2009, Alan Jenkins wrote:
> 
> As another datapoint:  I tried blindly applying the commit to 2.6.29. 
> The resulting kernel was able to hibernate fine the first time.

Yeah, so it's not that commit per se that causes it. I bet it needs all 
the IO scheduler changes too - and even when it does that, the end result 
probably is really just a timing change.

> I'm going to be annoying and try something slightly different.  In
> theory, I should be able to find the "first bad commit" where
> cherry-picking 1faa16d22 causes a problem.

Just for fun, try this one first and see if it makes any difference.

Maybe the whole "swappiness=0" part was intentional. And maybe it wasn't. 
This is one trivial patch. Maybe it makes your machine blow up. Who knows?

There are other differences in the shrink_all_memory() path wrt the normal
memory freeing paths, but they are way more subtle. So I'm suggesting
trying this not because I think it's "The Bug(tm)", but because it's an
easy test to make, and maybe it makes a difference.

		Linus
---
 mm/vmscan.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 39fdfb1..d3595ed 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2113,6 +2113,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
 		.may_unmap = 0,
+		.swap_cluster_max = SWAP_CLUSTER_MAX,
+		.swappiness = vm_swappiness,
 		.may_writepage = 1,
 		.isolate_pages = isolate_pages_global,
 	};
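
For readers wondering where "swappiness=0" comes from when the initializer
never mentions it: in C, members left out of a designated initializer are
zero-initialized. The sketch below shows that with a cut-down, hypothetical
stand-in for struct scan_control; the struct name, the field types and both
function names are assumptions, and only fields visible in the diff above
are modelled.

```c
/*
 * Hypothetical, cut-down stand-in for the kernel's struct scan_control.
 * Only fields that appear in the diff are modelled; types are guesses.
 */
struct scan_control_sketch {
	unsigned int gfp_mask;
	int may_unmap;
	int may_writepage;
	unsigned long swap_cluster_max;
	int swappiness;
};

/* Initializer in the style of the unpatched code: swappiness is not named. */
int swappiness_when_omitted(void)
{
	struct scan_control_sketch sc = {
		.may_unmap = 0,
		.may_writepage = 1,
	};

	return sc.swappiness;	/* unnamed members are zero-initialized */
}

/* Initializer in the style of the patched code: swappiness is set explicitly. */
int swappiness_when_set(int vm_swappiness_value)
{
	struct scan_control_sketch sc = {
		.may_unmap = 0,
		.swappiness = vm_swappiness_value,
		.may_writepage = 1,
	};

	return sc.swappiness;
}
```

So the unpatched initializer runs with swappiness 0 and swap_cluster_max 0
simply because neither field is named, while the patched one picks up
whatever value is passed in for vm_swappiness.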

^ permalink raw reply related	[flat|nested] 580+ messages in thread

* Re: [Bug #13107] LTP 20080131 causes defunct processes w/2.6.30-rc1
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-17 16:55     ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 580+ messages in thread
From: Sukadev Bhattiprolu @ 2009-04-17 16:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andrew Morton,
	Kumar Gala, Linus Torvalds

Rafael J. Wysocki [rjw@sisk.pl] wrote:
| This message has been generated automatically as a part of a report
| of recent regressions.
| 
| The following bug entry is on the current list of known regressions
| from 2.6.29.  Please verify if it still should be listed and let me know
| (either way).
| 
| 
| Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13107
| Subject		: LTP 20080131 causes defunct processes w/2.6.30-rc1
| Submitter	: Kumar Gala <galak@kernel.crashing.org>
| Date		: 2009-04-09 15:43 (8 days old)
| First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b3bfa0cba867f23365b81658b47efd906830879b
| References	: http://marc.info/?l=linux-kernel&m=123929187208953&w=4
| Handled-By	: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
| 

The last response to this was: http://lkml.org/lkml/2009/4/10/193.
So it was not clear if the init was being ptraced or it got the SIGSTOP
from parent namespace.

Kumar, any update on this ?

Sukadev

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13066] Intel HD Audio oops
  2009-04-16 21:45   ` Rafael J. Wysocki
  (?)
@ 2009-04-17 16:57   ` Takashi Iwai
  2009-04-17 21:07       ` Rafael J. Wysocki
  -1 siblings, 1 reply; 580+ messages in thread
From: Takashi Iwai @ 2009-04-17 16:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Jeff Chua

At Thu, 16 Apr 2009 23:45:01 +0200 (CEST),
Rafael J. Wysocki wrote:
> 
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13066
> Subject		: Intel HD Audio oops
> Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
> Date		: 2009-04-01 8:28 (16 days old)
> References	: http://marc.info/?l=linux-kernel&m=123857454625829&w=4

The fix patch was merged to the upstream as commit
95c0909961bc5ff18c78b2ab0d093cddc0a8b0b5.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
                   ` (46 preceding siblings ...)
  (?)
@ 2009-04-17 17:09 ` Thomas Meyer
  2009-04-17 21:38     ` Rafael J. Wysocki
  -1 siblings, 1 reply; 580+ messages in thread
From: Thomas Meyer @ 2009-04-17 17:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List


Quoting "Rafael J. Wysocki" <rjw@sisk.pl>:

> If you know of any other unresolved regressions from 2.6.29, please let me
> know either and I'll add them to the list.  Also, please let me know if any
> of the entries below are invalid.

Two things on 2.6.30-rc2:

1) Kernel panic while shutting down the system:
http://m3y3r.de/wordpress/?p=67

2) Backlight daemon dies (hald-addon-macbookpro-backlight). Don't know why.

Config is:
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y
# CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_32_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y

#
# RCU Subsystem
#
CONFIG_CLASSIC_RCU=y
# CONFIG_TREE_RCU is not set
# CONFIG_PREEMPT_RCU is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_PREEMPT_RCU_TRACE is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
# CONFIG_GROUP_SCHED is not set
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_NS=y
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CGROUP_DEVICE is not set
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_RESOURCE_COUNTERS=y
# CONFIG_CGROUP_MEM_RES_CTLR is not set
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
# CONFIG_NET_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_STRIP_ASM_SYMS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_MARKERS=y
CONFIG_OPROFILE=y
# CONFIG_OPROFILE_IBS is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_KRETPROBES=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_API_DEBUG=y
# CONFIG_SLOW_WORK is not set
CONFIG_HAVE_GENERIC_DMA_COHERENT=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_BLK_DEV_INTEGRITY is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
# CONFIG_SPARSE_IRQ is not set
# CONFIG_X86_MPPARSE is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CPU=y
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_CYRIX_32=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_CPU_SUP_TRANSMETA_32=y
CONFIG_CPU_SUP_UMC_32=y
CONFIG_X86_DS=y
CONFIG_X86_PTRACE_BTS=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
# CONFIG_IOMMU_HELPER is not set
# CONFIG_IOMMU_API is not set
CONFIG_NR_CPUS=4
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_X86_CPU_DEBUG is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
# CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_PHYS_ADDR_T_64BIT is not set
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_UNEVICTABLE_LRU=y
CONFIG_HAVE_MLOCK=y
CONFIG_HAVE_MLOCKED_PAGE_BIT=y
CONFIG_MMU_NOTIFIER=y
CONFIG_HIGHPTE=y
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_X86_RESERVE_LOW_64K=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_EFI=y
CONFIG_SECCOMP=y
CONFIG_CC_STACKPROTECTOR_ALL=y
CONFIG_CC_STACKPROTECTOR=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
# CONFIG_KEXEC_JUMP is not set
CONFIG_PHYSICAL_START=0x400000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x400000
CONFIG_HOTPLUG_CPU=y
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management and ACPI options
#
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_VERBOSE is not set
CONFIG_CAN_PM_TRACE=y
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND=y
CONFIG_PM_TEST_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS is not set
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_ACPI_PROC_EVENT=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=1999
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_SBS=y
# CONFIG_APM is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_DEBUG=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=y
# CONFIG_X86_POWERNOW_K6 is not set
# CONFIG_X86_POWERNOW_K7 is not set
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_GX_SUSPMOD is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_SPEEDSTEP_ICH=y
CONFIG_X86_SPEEDSTEP_SMI=y
# CONFIG_X86_P4_CLOCKMOD is not set
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
# CONFIG_X86_LONGRUN is not set
# CONFIG_X86_LONGHAUL is not set
# CONFIG_X86_E_POWERSAVER is not set

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=y
# CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
# CONFIG_PCI_GOOLPC is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_DMAR is not set
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_STUB is not set
CONFIG_HT_IRQ=y
# CONFIG_PCI_IOV is not set
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
# CONFIG_OLPC is not set
# CONFIG_PCCARD is not set
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_FAKE=y
# CONFIG_HOTPLUG_PCI_COMPAQ is not set
# CONFIG_HOTPLUG_PCI_IBM is not set
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=y
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_HAVE_AOUT=y
# CONFIG_BINFMT_AOUT is not set
CONFIG_BINFMT_MISC=y
CONFIG_HAVE_ATOMIC_IOMAP=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_IPCOMP=y
CONFIG_NET_KEY=y
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE=y
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=y
CONFIG_INET_ESP=y
CONFIG_INET_IPCOMP=y
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_XFRM_MODE_TRANSPORT=y
CONFIG_INET_XFRM_MODE_TUNNEL=y
CONFIG_INET_XFRM_MODE_BEET=y
CONFIG_INET_LRO=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
# CONFIG_DEFAULT_BIC is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_HTCP is not set
# CONFIG_DEFAULT_VEGAS is not set
# CONFIG_DEFAULT_WESTWOOD is not set
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_PRIVACY=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=m
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=m
CONFIG_INET6_XFRM_MODE_TUNNEL=m
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
# CONFIG_IPV6_MROUTE is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CT_ACCT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CT_PROTO_DCCP=m
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=m
CONFIG_NF_CT_PROTO_UDPLITE=m
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NETFILTER_TPROXY=m
CONFIG_NETFILTER_XTABLES=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_LED=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
# CONFIG_NETFILTER_XT_MATCH_RECENT_PROC_COMPAT is not set
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_VS=m
# CONFIG_IP_VS_IPV6 is not set
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_CONNTRACK_PROC_COMPAT is not set
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_DCCP=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_PROTO_UDPLITE=m
CONFIG_NF_NAT_PROTO_SCTP=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_NF_NAT_SIP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# IPv6: Netfilter Configuration
#
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_TARGET_LOG=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_SECURITY=m
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_IP6=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m
# CONFIG_BRIDGE_EBT_NFLOG is not set
CONFIG_IP_DCCP=y
CONFIG_INET_DCCP_DIAG=y

#
# DCCP CCIDs Configuration (EXPERIMENTAL)
#
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
CONFIG_IP_DCCP_CCID3=y
# CONFIG_IP_DCCP_CCID3_DEBUG is not set
CONFIG_IP_DCCP_CCID3_RTO=100
CONFIG_IP_DCCP_TFRC_LIB=y
CONFIG_IP_SCTP=y
# CONFIG_SCTP_DBG_MSG is not set
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_HMAC_NONE is not set
# CONFIG_SCTP_HMAC_SHA1 is not set
CONFIG_SCTP_HMAC_MD5=y
CONFIG_TIPC=y
# CONFIG_TIPC_ADVANCED is not set
# CONFIG_TIPC_DEBUG is not set
# CONFIG_ATM is not set
CONFIG_STP=y
CONFIG_BRIDGE=y
# CONFIG_NET_DSA is not set
CONFIG_VLAN_8021Q=y
# CONFIG_VLAN_8021Q_GVRP is not set
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=y
CONFIG_NET_SCH_HTB=y
CONFIG_NET_SCH_HFSC=y
CONFIG_NET_SCH_PRIO=y
CONFIG_NET_SCH_MULTIQ=y
CONFIG_NET_SCH_RED=y
CONFIG_NET_SCH_SFQ=y
CONFIG_NET_SCH_TEQL=y
CONFIG_NET_SCH_TBF=y
CONFIG_NET_SCH_GRED=y
CONFIG_NET_SCH_DSMARK=y
CONFIG_NET_SCH_NETEM=y
CONFIG_NET_SCH_DRR=y
CONFIG_NET_SCH_INGRESS=y

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=y
CONFIG_NET_CLS_TCINDEX=y
CONFIG_NET_CLS_ROUTE4=y
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=y
CONFIG_NET_CLS_U32=y
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=y
CONFIG_NET_CLS_RSVP6=y
CONFIG_NET_CLS_FLOW=y
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=y
CONFIG_NET_EMATCH_NBYTE=y
CONFIG_NET_EMATCH_U32=y
CONFIG_NET_EMATCH_META=y
CONFIG_NET_EMATCH_TEXT=y
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=y
CONFIG_NET_ACT_GACT=y
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=y
CONFIG_NET_ACT_IPT=m
CONFIG_NET_ACT_NAT=y
CONFIG_NET_ACT_PEDIT=y
CONFIG_NET_ACT_SIMP=y
CONFIG_NET_ACT_SKBEDIT=y
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
# CONFIG_NET_DROP_MONITOR is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
CONFIG_IRDA=y

#
# IrDA protocols
#
CONFIG_IRLAN=y
# CONFIG_IRNET is not set
CONFIG_IRCOMM=y
# CONFIG_IRDA_ULTRA is not set

#
# IrDA options
#
CONFIG_IRDA_CACHE_LAST_LSAP=y
CONFIG_IRDA_FAST_RR=y
# CONFIG_IRDA_DEBUG is not set

#
# Infrared-port device drivers
#

#
# SIR device drivers
#
# CONFIG_IRTTY_SIR is not set

#
# Dongle support
#
# CONFIG_KINGSUN_DONGLE is not set
# CONFIG_KSDAZZLE_DONGLE is not set
# CONFIG_KS959_DONGLE is not set

#
# FIR device drivers
#
# CONFIG_USB_IRDA is not set
# CONFIG_SIGMATEL_FIR is not set
# CONFIG_NSC_FIR is not set
# CONFIG_WINBOND_FIR is not set
# CONFIG_TOSHIBA_FIR is not set
# CONFIG_SMC_IRCC_FIR is not set
# CONFIG_ALI_FIR is not set
# CONFIG_VLSI_FIR is not set
# CONFIG_VIA_FIR is not set
# CONFIG_MCS_FIR is not set
CONFIG_BT=y
CONFIG_BT_L2CAP=y
CONFIG_BT_SCO=y
CONFIG_BT_RFCOMM=y
CONFIG_BT_RFCOMM_TTY=y
CONFIG_BT_BNEP=y
CONFIG_BT_BNEP_MC_FILTER=y
CONFIG_BT_BNEP_PROTO_FILTER=y
CONFIG_BT_HIDP=y

#
# Bluetooth device drivers
#
CONFIG_BT_HCIBTUSB=y
# CONFIG_BT_HCIUART is not set
# CONFIG_BT_HCIBCM203X is not set
# CONFIG_BT_HCIBPA10X is not set
# CONFIG_BT_HCIBFUSB is not set
# CONFIG_BT_HCIVHCI is not set
# CONFIG_AF_RXRPC is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
CONFIG_CFG80211=m
# CONFIG_CFG80211_REG_DEBUG is not set
# CONFIG_WIRELESS_OLD_REGULATORY is not set
CONFIG_WIRELESS_EXT=y
CONFIG_WIRELESS_EXT_SYSFS=y
CONFIG_LIB80211=y
# CONFIG_LIB80211_DEBUG is not set
CONFIG_MAC80211=m

#
# Rate control algorithm selection
#
CONFIG_MAC80211_RC_MINSTREL=y
# CONFIG_MAC80211_RC_DEFAULT_PID is not set
CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT="minstrel"
CONFIG_MAC80211_MESH=y
CONFIG_MAC80211_LEDS=y
CONFIG_MAC80211_DEBUGFS=y
# CONFIG_MAC80211_DEBUG_MENU is not set
# CONFIG_WIMAX is not set
CONFIG_RFKILL=y
CONFIG_RFKILL_INPUT=m
CONFIG_RFKILL_LEDS=y
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_MTD=y
# CONFIG_MTD_DEBUG is not set
# CONFIG_MTD_CONCAT is not set
CONFIG_MTD_PARTITIONS=y
# CONFIG_MTD_TESTS is not set
# CONFIG_MTD_REDBOOT_PARTS is not set
# CONFIG_MTD_CMDLINE_PARTS is not set
# CONFIG_MTD_AR7_PARTS is not set

#
# User Modules And Translation Layers
#
# CONFIG_MTD_CHAR is not set
# CONFIG_MTD_BLKDEVS is not set
# CONFIG_MTD_BLOCK is not set
# CONFIG_MTD_BLOCK_RO is not set
# CONFIG_FTL is not set
# CONFIG_NFTL is not set
# CONFIG_INFTL is not set
# CONFIG_RFD_FTL is not set
# CONFIG_SSFDC is not set
# CONFIG_MTD_OOPS is not set

#
# RAM/ROM/Flash chip drivers
#
# CONFIG_MTD_CFI is not set
# CONFIG_MTD_JEDECPROBE is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
# CONFIG_MTD_RAM is not set
# CONFIG_MTD_ROM is not set
# CONFIG_MTD_ABSENT is not set

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
# CONFIG_MTD_TS5500 is not set
# CONFIG_MTD_PCI is not set
# CONFIG_MTD_INTEL_VR_NOR is not set
# CONFIG_MTD_PLATRAM is not set

#
# Self-contained MTD device drivers
#
# CONFIG_MTD_PMC551 is not set
CONFIG_MTD_SLRAM=m
CONFIG_MTD_PHRAM=m
# CONFIG_MTD_MTDRAM is not set
CONFIG_MTD_BLOCK2MTD=m

#
# Disk-On-Chip Device Drivers
#
# CONFIG_MTD_DOC2000 is not set
# CONFIG_MTD_DOC2001 is not set
# CONFIG_MTD_DOC2001PLUS is not set
# CONFIG_MTD_NAND is not set
# CONFIG_MTD_ONENAND is not set

#
# LPDDR flash memory drivers
#
# CONFIG_MTD_LPDDR is not set

#
# UBI - Unsorted block images
#
# CONFIG_MTD_UBI is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
# CONFIG_BLK_DEV_XIP is not set
CONFIG_CDROM_PKTCDVD=y
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=m
# CONFIG_BLK_DEV_HD is not set
# CONFIG_MISC_DEVICES is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_TGT=y
# CONFIG_SCSI_NETLINK is not set
# CONFIG_SCSI_PROC_FS is not set

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_CHR_DEV_SCH=y

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
# CONFIG_SCSI_CONSTANTS is not set
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_LIBFC is not set
# CONFIG_LIBFCOE is not set
# CONFIG_FCOE is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
CONFIG_SATA_PMP=y
CONFIG_SATA_AHCI=y
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
# CONFIG_SATA_SVW is not set
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ACPI is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_ATA_GENERIC=y
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_SCH is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=m
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_RAID6_PQ=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_BLK_DEV_DM=y
CONFIG_DM_DEBUG=y
CONFIG_DM_CRYPT=y
CONFIG_DM_SNAPSHOT=y
CONFIG_DM_MIRROR=y
CONFIG_DM_ZERO=y
CONFIG_DM_MULTIPATH=y
CONFIG_DM_DELAY=y
CONFIG_DM_UEVENT=y
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#

#
# Enable only one of the two stacks, unless you know what you are doing
#
CONFIG_FIREWIRE=y
CONFIG_FIREWIRE_OHCI=y
CONFIG_FIREWIRE_OHCI_DEBUG=y
CONFIG_FIREWIRE_SBP2=y
# CONFIG_IEEE1394 is not set
CONFIG_I2O=y
# CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set
CONFIG_I2O_EXT_ADAPTEC=y
CONFIG_I2O_CONFIG=y
CONFIG_I2O_CONFIG_OLD_IOCTL=y
CONFIG_I2O_BUS=y
CONFIG_I2O_BLOCK=y
CONFIG_I2O_SCSI=y
CONFIG_I2O_PROC=y
CONFIG_MACINTOSH_DRIVERS=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_NETDEVICES=y
CONFIG_COMPAT_NET_DEV_OPS=y
CONFIG_IFB=y
CONFIG_DUMMY=y
CONFIG_BONDING=m
CONFIG_MACVLAN=m
CONFIG_EQUALIZER=m
CONFIG_TUN=y
CONFIG_VETH=m
# CONFIG_NET_SB1000 is not set
# CONFIG_ARCNET is not set
# CONFIG_NET_ETHERNET is not set
CONFIG_MII=y
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_E1000E is not set
# CONFIG_IP1000 is not set
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
CONFIG_SKY2=y
# CONFIG_SKY2_DEBUG is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_JME is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
CONFIG_WLAN_80211=y
# CONFIG_LIBERTAS is not set
# CONFIG_LIBERTAS_THINFIRM is not set
# CONFIG_AIRO is not set
# CONFIG_ATMEL is not set
# CONFIG_AT76C50X_USB is not set
# CONFIG_PRISM54 is not set
# CONFIG_USB_ZD1201 is not set
CONFIG_USB_NET_RNDIS_WLAN=y
# CONFIG_RTL8180 is not set
# CONFIG_RTL8187 is not set
# CONFIG_ADM8211 is not set
# CONFIG_MAC80211_HWSIM is not set
# CONFIG_MWL8K is not set
# CONFIG_P54_COMMON is not set
CONFIG_ATH5K=m
# CONFIG_ATH5K_DEBUG is not set
# CONFIG_ATH9K is not set
# CONFIG_AR9170_USB is not set
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_IWLWIFI is not set
# CONFIG_HOSTAP is not set
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_ZD1211RW is not set
# CONFIG_RT2X00 is not set
# CONFIG_HERMES is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
CONFIG_USB_USBNET=y
# CONFIG_USB_NET_AX8817X is not set
CONFIG_USB_NET_CDCETHER=y
# CONFIG_USB_NET_DM9601 is not set
# CONFIG_USB_NET_SMSC95XX is not set
# CONFIG_USB_NET_GL620A is not set
# CONFIG_USB_NET_NET1080 is not set
# CONFIG_USB_NET_PLUSB is not set
# CONFIG_USB_NET_MCS7830 is not set
CONFIG_USB_NET_RNDIS_HOST=y
# CONFIG_USB_NET_CDC_SUBSET is not set
# CONFIG_USB_NET_ZAURUS is not set
# CONFIG_USB_HSO is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=y
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=y
CONFIG_PPP_SYNC_TTY=y
CONFIG_PPP_DEFLATE=y
# CONFIG_PPP_BSDCOMP is not set
CONFIG_PPP_MPPE=y
CONFIG_PPPOE=y
CONFIG_PPPOL2TP=y
# CONFIG_SLIP is not set
CONFIG_SLHC=y
# CONFIG_NET_FC is not set
CONFIG_NETCONSOLE=y
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NETPOLL_TRAP=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_VIRTIO_NET=m
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
CONFIG_INPUT_POLLDEV=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=y
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
CONFIG_INPUT_MOUSE=y
# CONFIG_MOUSE_PS2 is not set
# CONFIG_MOUSE_SERIAL is not set
CONFIG_MOUSE_APPLETOUCH=y
CONFIG_MOUSE_BCM5974=y
# CONFIG_MOUSE_VSXXXAA is not set
CONFIG_INPUT_JOYSTICK=y
# CONFIG_JOYSTICK_ANALOG is not set
# CONFIG_JOYSTICK_A3D is not set
# CONFIG_JOYSTICK_ADI is not set
# CONFIG_JOYSTICK_COBRA is not set
# CONFIG_JOYSTICK_GF2K is not set
# CONFIG_JOYSTICK_GRIP is not set
# CONFIG_JOYSTICK_GRIP_MP is not set
# CONFIG_JOYSTICK_GUILLEMOT is not set
# CONFIG_JOYSTICK_INTERACT is not set
# CONFIG_JOYSTICK_SIDEWINDER is not set
# CONFIG_JOYSTICK_TMDC is not set
# CONFIG_JOYSTICK_IFORCE is not set
# CONFIG_JOYSTICK_WARRIOR is not set
# CONFIG_JOYSTICK_MAGELLAN is not set
# CONFIG_JOYSTICK_SPACEORB is not set
# CONFIG_JOYSTICK_SPACEBALL is not set
# CONFIG_JOYSTICK_STINGER is not set
# CONFIG_JOYSTICK_TWIDJOY is not set
# CONFIG_JOYSTICK_ZHENHUA is not set
# CONFIG_JOYSTICK_JOYDUMP is not set
# CONFIG_JOYSTICK_XPAD is not set
CONFIG_INPUT_TABLET=y
# CONFIG_TABLET_USB_ACECAD is not set
# CONFIG_TABLET_USB_AIPTEK is not set
# CONFIG_TABLET_USB_GTCO is not set
# CONFIG_TABLET_USB_KBTAB is not set
# CONFIG_TABLET_USB_WACOM is not set
CONFIG_INPUT_TOUCHSCREEN=y
# CONFIG_TOUCHSCREEN_AD7879_I2C is not set
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_ELO is not set
# CONFIG_TOUCHSCREEN_WACOM_W8001 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
# CONFIG_INPUT_APANEL is not set
# CONFIG_INPUT_WISTRON_BTNS is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
# CONFIG_INPUT_UINPUT is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
# CONFIG_DEVKMEM is not set
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
# CONFIG_SERIAL_8250 is not set
CONFIG_FIX_EARLYCON_MEM=y

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
# CONFIG_LEGACY_PTYS is not set
CONFIG_HVC_DRIVER=y
CONFIG_VIRTIO_CONSOLE=m
CONFIG_IPMI_HANDLER=y
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=y
CONFIG_IPMI_SI=y
CONFIG_IPMI_WATCHDOG=y
CONFIG_IPMI_POWEROFF=y
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
CONFIG_HW_RANDOM_INTEL=y
# CONFIG_HW_RANDOM_AMD is not set
# CONFIG_HW_RANDOM_GEODE is not set
# CONFIG_HW_RANDOM_VIA is not set
# CONFIG_HW_RANDOM_VIRTIO is not set
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_NSC_GPIO is not set
# CONFIG_CS5535_GPIO is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_MMAP is not set
CONFIG_HANGCHECK_TIMER=y
CONFIG_TCG_TPM=y
# CONFIG_TCG_TIS is not set
# CONFIG_TCG_NSC is not set
# CONFIG_TCG_ATMEL is not set
CONFIG_TCG_INFINEON=y
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=y
CONFIG_I2C_ISCH=y
CONFIG_I2C_PIIX4=y
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_SIMTEC is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Graphics adapter I2C/DDC channel drivers
#
# CONFIG_I2C_VOODOO3 is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_STUB is not set
# CONFIG_SCx200_ACB is not set

#
# Miscellaneous I2C Chip support
#
# CONFIG_DS1682 is not set
# CONFIG_SENSORS_PCF8574 is not set
# CONFIG_PCF8575 is not set
# CONFIG_SENSORS_PCA9539 is not set
# CONFIG_SENSORS_MAX6875 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
# CONFIG_SPI is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
# CONFIG_BATTERY_BQ27x00 is not set
CONFIG_HWMON=y
# CONFIG_HWMON_VID is not set
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7473 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ATK0110 is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHER is not set
# CONFIG_SENSORS_FSCPOS is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
CONFIG_SENSORS_CORETEMP=y
# CONFIG_SENSORS_IBMAEM is not set
# CONFIG_SENSORS_IBMPEX is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_SENSORS_LIS3LV02D is not set
CONFIG_SENSORS_APPLESMC=y
# CONFIG_HWMON_DEBUG_CHIP is not set
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_SC520_WDT is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_IBMASR is not set
# CONFIG_WAFER_WDT is not set
# CONFIG_I6300ESB_WDT is not set
CONFIG_ITCO_WDT=y
# CONFIG_ITCO_VENDOR_SUPPORT is not set
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_60XX_WDT is not set
# CONFIG_SBC8360_WDT is not set
# CONFIG_SBC7240_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC_SCH311X_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83697HF_WDT is not set
# CONFIG_W83697UG_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
# CONFIG_MACHZ_WDT is not set
# CONFIG_SBC_EPX_C3_WATCHDOG is not set

#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_REGULATOR is not set

#
# Multimedia devices
#

#
# Multimedia core support
#
CONFIG_VIDEO_DEV=y
CONFIG_VIDEO_V4L2_COMMON=y
# CONFIG_VIDEO_ALLOW_V4L1 is not set
CONFIG_VIDEO_V4L1_COMPAT=y
# CONFIG_DVB_CORE is not set
CONFIG_VIDEO_MEDIA=y

#
# Multimedia drivers
#
# CONFIG_MEDIA_ATTACH is not set
CONFIG_MEDIA_TUNER=y
# CONFIG_MEDIA_TUNER_CUSTOMISE is not set
CONFIG_MEDIA_TUNER_SIMPLE=y
CONFIG_MEDIA_TUNER_TDA8290=y
CONFIG_MEDIA_TUNER_TDA9887=y
CONFIG_MEDIA_TUNER_TEA5761=y
CONFIG_MEDIA_TUNER_TEA5767=y
CONFIG_MEDIA_TUNER_MT20XX=y
CONFIG_MEDIA_TUNER_XC2028=y
CONFIG_MEDIA_TUNER_XC5000=y
CONFIG_MEDIA_TUNER_MC44S803=y
CONFIG_VIDEO_V4L2=y
# CONFIG_VIDEO_CAPTURE_DRIVERS is not set
# CONFIG_RADIO_ADAPTERS is not set
# CONFIG_DAB is not set

#
# Graphics support
#
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=y
# CONFIG_DRM_I810 is not set
# CONFIG_DRM_I830 is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
# CONFIG_VGASTATE is not set
CONFIG_VIDEO_OUTPUT_CONTROL=y
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
# CONFIG_FB_DDC is not set
# CONFIG_FB_BOOT_VESA_SUPPORT is not set
# CONFIG_FB_CFB_FILLRECT is not set
# CONFIG_FB_CFB_COPYAREA is not set
# CONFIG_FB_CFB_IMAGEBLIT is not set
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=y
# CONFIG_LCD_ILI9320 is not set
CONFIG_LCD_PLATFORM=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=y
# CONFIG_BACKLIGHT_PROGEAR is not set
# CONFIG_BACKLIGHT_MBP_NVIDIA is not set
# CONFIG_BACKLIGHT_SAHARA is not set

#
# Display device support
#
CONFIG_DISPLAY_SUPPORT=y

#
# Display hardware drivers
#

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
CONFIG_SOUND=y
# CONFIG_SOUND_OSS_CORE is not set
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_RAWMIDI=y
CONFIG_SND_JACK=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=y
# CONFIG_SND_MIXER_OSS is not set
# CONFIG_SND_PCM_OSS is not set
# CONFIG_SND_SEQUENCER_OSS is not set
CONFIG_SND_HRTIMER=y
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
# CONFIG_SND_SUPPORT_OLD_API is not set
# CONFIG_SND_VERBOSE_PROCFS is not set
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_DRIVERS=y
# CONFIG_SND_PCSP is not set
CONFIG_SND_DUMMY=y
CONFIG_SND_VIRMIDI=y
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS5530 is not set
# CONFIG_SND_CS5535AUDIO is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_HWDEP=y
# CONFIG_SND_HDA_RECONFIG is not set
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_CODEC_REALTEK=y
CONFIG_SND_HDA_CODEC_ANALOG=y
CONFIG_SND_HDA_CODEC_SIGMATEL=y
CONFIG_SND_HDA_CODEC_VIA=y
CONFIG_SND_HDA_CODEC_ATIHDMI=y
CONFIG_SND_HDA_CODEC_NVHDMI=y
CONFIG_SND_HDA_CODEC_INTELHDMI=y
CONFIG_SND_HDA_ELD=y
CONFIG_SND_HDA_CODEC_CONEXANT=y
CONFIG_SND_HDA_CODEC_CMEDIA=y
CONFIG_SND_HDA_CODEC_SI3054=y
CONFIG_SND_HDA_GENERIC=y
CONFIG_SND_HDA_POWER_SAVE=y
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_HIFIER is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SIS7019 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_USB is not set
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
# CONFIG_HID_DEBUG is not set
CONFIG_HIDRAW=y

#
# USB Input Devices
#
CONFIG_USB_HID=y
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
CONFIG_HID_APPLE=y
CONFIG_HID_BELKIN=y
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
CONFIG_HID_CYPRESS=y
# CONFIG_DRAGONRISE_FF is not set
CONFIG_HID_EZKEY=y
CONFIG_HID_KYE=y
CONFIG_HID_GYRATION=y
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LOGITECH=y
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_NTRIG=y
CONFIG_HID_PANTHERLORD=y
# CONFIG_PANTHERLORD_FF is not set
CONFIG_HID_PETALYNX=y
CONFIG_HID_SAMSUNG=y
CONFIG_HID_SONY=y
CONFIG_HID_SUNPLUS=y
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_TOPSEED=y
# CONFIG_THRUSTMASTER_FF is not set
# CONFIG_ZEROPLUS_FF is not set
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_DEVICE_CLASS is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
CONFIG_USB_SUSPEND=y
# CONFIG_USB_OTG is not set
CONFIG_USB_MON=y
# CONFIG_USB_WUSB is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
# CONFIG_USB_HWA_HCD is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=y
CONFIG_USB_PRINTER=y
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
CONFIG_USB_STORAGE_ONETOUCH=y
CONFIG_USB_STORAGE_KARMA=y
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set

#
# USB port drivers
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_BERRY_CHARGE is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
CONFIG_USB_APPLEDISPLAY=y
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_VST is not set
# CONFIG_USB_GADGET is not set

#
# OTG and related infrastructure
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y

#
# LED drivers
#
# CONFIG_LEDS_ALIX2 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_LP5521 is not set
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_BD2802 is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8581 is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
# CONFIG_DMADEVICES is not set
# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
# CONFIG_ACER_WMI is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_TC1100_WMI is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_COMPAL_LAPTOP is not set
# CONFIG_SONY_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_EEEPC_LAPTOP is not set
# CONFIG_ACPI_WMI is not set
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set

#
# Firmware Drivers
#
CONFIG_EDD=y
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_EFI_VARS=y
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_ISCSI_IBFT_FIND is not set

#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT2_FS_XIP=y
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_EXT4_FS=y
# CONFIG_EXT4DEV_COMPAT is not set
CONFIG_EXT4_FS_XATTR=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
CONFIG_FS_XIP=y
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
CONFIG_JFS_FS=y
CONFIG_JFS_POSIX_ACL=y
CONFIG_JFS_SECURITY=y
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_FILE_LOCKING=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
# CONFIG_PRINT_QUOTA_WARNING is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=y
CONFIG_GENERIC_ACL=y

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
CONFIG_NTFS_FS=y
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
CONFIG_HFSPLUS_FS=y
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_NILFS2_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_SMB_FS is not set
CONFIG_CIFS=y
# CONFIG_CIFS_STATS is not set
CONFIG_CIFS_WEAK_PW_HASH=y
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
# CONFIG_CIFS_DEBUG2 is not set
CONFIG_CIFS_DFS_UPCALL=y
CONFIG_CIFS_EXPERIMENTAL=y
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_BSD_DISKLABEL is not set
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=y
CONFIG_NLS_CODEPAGE_775=y
CONFIG_NLS_CODEPAGE_850=y
CONFIG_NLS_CODEPAGE_852=y
CONFIG_NLS_CODEPAGE_855=y
CONFIG_NLS_CODEPAGE_857=y
CONFIG_NLS_CODEPAGE_860=y
CONFIG_NLS_CODEPAGE_861=y
CONFIG_NLS_CODEPAGE_862=y
CONFIG_NLS_CODEPAGE_863=y
CONFIG_NLS_CODEPAGE_864=y
CONFIG_NLS_CODEPAGE_865=y
CONFIG_NLS_CODEPAGE_866=y
CONFIG_NLS_CODEPAGE_869=y
CONFIG_NLS_CODEPAGE_936=y
CONFIG_NLS_CODEPAGE_950=y
CONFIG_NLS_CODEPAGE_932=y
CONFIG_NLS_CODEPAGE_949=y
CONFIG_NLS_CODEPAGE_874=y
CONFIG_NLS_ISO8859_8=y
CONFIG_NLS_CODEPAGE_1250=y
CONFIG_NLS_CODEPAGE_1251=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_2=y
CONFIG_NLS_ISO8859_3=y
CONFIG_NLS_ISO8859_4=y
CONFIG_NLS_ISO8859_5=y
CONFIG_NLS_ISO8859_6=y
CONFIG_NLS_ISO8859_7=y
CONFIG_NLS_ISO8859_9=y
CONFIG_NLS_ISO8859_13=y
CONFIG_NLS_ISO8859_14=y
CONFIG_NLS_ISO8859_15=y
CONFIG_NLS_KOI8_R=y
CONFIG_NLS_KOI8_U=y
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_PRINTK_TIME=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_UNUSED_SYMBOLS is not set
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
# CONFIG_DEBUG_KERNEL is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_STACKTRACE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
# CONFIG_FRAME_POINTER is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_LATENCYTOP is not set
# CONFIG_SYSCTL_SYSCALL_CHECK is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_HW_BRANCH_TRACER=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_RING_BUFFER=y
CONFIG_TRACING=y
CONFIG_TRACING_SUPPORT=y

#
# Tracers
#
# CONFIG_FUNCTION_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_SYSPROF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_CONTEXT_SWITCH_TRACER is not set
# CONFIG_EVENT_TRACER is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_BOOT_TRACER is not set
# CONFIG_TRACE_BRANCH_PROFILING is not set
# CONFIG_POWER_TRACER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_HW_BRANCH_TRACER is not set
# CONFIG_KMEMTRACE is not set
# CONFIG_WORKQUEUE_TRACER is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_FIREWIRE_OHCI_REMOTE_DMA is not set
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
CONFIG_4KSTACKS=y
CONFIG_DOUBLEFAULT=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_OPTIMIZE_INLINING is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
# CONFIG_SECURITY_PATH is not set
CONFIG_SECURITY_FILE_CAPABILITIES=y
# CONFIG_SECURITY_ROOTPLUG is not set
CONFIG_SECURITY_DEFAULT_MMAP_MIN_ADDR=65536
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_IMA is not set
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_SEQIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=y
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=y
CONFIG_CRYPTO_PCBC=y
CONFIG_CRYPTO_XTS=y

#
# Hash modes
#
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=y

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=y
CONFIG_CRYPTO_RMD128=y
CONFIG_CRYPTO_RMD160=y
CONFIG_CRYPTO_RMD256=y
CONFIG_CRYPTO_RMD320=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_TGR192=y
CONFIG_CRYPTO_WP512=y

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_586=y
CONFIG_CRYPTO_ANUBIS=y
CONFIG_CRYPTO_ARC4=y
CONFIG_CRYPTO_BLOWFISH=y
CONFIG_CRYPTO_CAMELLIA=y
CONFIG_CRYPTO_CAST5=y
CONFIG_CRYPTO_CAST6=y
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_FCRYPT=y
CONFIG_CRYPTO_KHAZAD=y
CONFIG_CRYPTO_SALSA20=y
CONFIG_CRYPTO_SALSA20_586=y
CONFIG_CRYPTO_SEED=y
CONFIG_CRYPTO_SERPENT=y
CONFIG_CRYPTO_TEA=y
CONFIG_CRYPTO_TWOFISH=y
CONFIG_CRYPTO_TWOFISH_COMMON=y
CONFIG_CRYPTO_TWOFISH_586=y

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_ZLIB=y
CONFIG_CRYPTO_LZO=y

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_HW is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_KVM_INTEL=y
# CONFIG_KVM_AMD is not set
# CONFIG_KVM_TRACE is not set
# CONFIG_LGUEST is not set
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_CRC7=y
CONFIG_LIBCRC32C=y
CONFIG_AUDIT_GENERIC=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=y
CONFIG_TEXTSEARCH_BM=y
CONFIG_TEXTSEARCH_FSM=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y

greets
thomas




* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 17:46                     ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-17 17:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List

Linus Torvalds wrote:
> On Fri, 17 Apr 2009, Alan Jenkins wrote:
>   
>> As another datapoint:  I tried blindly applying the commit to 2.6.29. 
>> The resulting kernel was able to hibernate fine the first time.
>>     
>
> Yeah, so it's not that commit per se that causes it. I bet it needs all 
> the IO scheduler changes too - and even when it does that, the end result 
> probably is really just a timing change.
>
>   
>> I'm going to be annoying and try something slightly different.  In
>> theory, I should be able to find the "first bad commit" where
>> cherry-picking 1faa16d22 causes a problem.
>>     
>
> Just for fun, try this one first and see if it makes any difference.
>
> Maybe the whole "swappiness=0" part was intentional. And maybe it wasn't. 
> This is one trivial patch. Maybe it makes your machine blow up. Who knows?
>
> There are other differences in the shrink_all_memory() path wrt the normal 
> memory freeing paths, but they are way more subtle. So I'm suggesting 
> trying this not because I think it's "The Bug(tm)", but because it's an 
> easy test to make, and maybe it makes a difference.
>
> 		Linus
> ---
>  mm/vmscan.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 39fdfb1..d3595ed 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2113,6 +2113,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>  	struct scan_control sc = {
>  		.gfp_mask = GFP_KERNEL,
>  		.may_unmap = 0,
> +		.swap_cluster_max = SWAP_CLUSTER_MAX,
> +		.swappiness = vm_swappiness,
>  		.may_writepage = 1,
>  		.isolate_pages = isolate_pages_global,
>  	};
>   


No, that doesn't seem to affect it.

Thanks
Alan

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 20:34             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 20:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Alan Jenkins, Linux Kernel Mailing List, Kernel Testers List

On Friday 17 April 2009, Linus Torvalds wrote:
> 
> On Fri, 17 Apr 2009, Jens Axboe wrote:
> > 
> > Given the somewhat odd nature of the bug and the requirements to trigger
> > it, how confident are you in the bisection results?
> 
> I suspect it's timing-dependent. 
> 
> The failure case is an ENOMEM returned from the "echo disk > /sys/power/state", 
> and sadly there are a _lot_ of potential sources of ENOMEM's in the path. 
> And a number of them come from GFP_ATOMIC allocations etc.
> 
> Now, that explains why it only happens while in X (more memory being 
> used), and also why it succeeds the second time (the first try will have 
> triggered VM activity and then free'd the pages it allocated up to that 
> point).
> 
> IOW, I bet it would work on the first try if you were to just run 
> something like
> 
> 	ptr = malloc(BIGNUM);
> 	memset(ptr, 0, BIGNUM);
> 	exit(0);
> 
> first - just to make room for stuff.
> 
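
For anyone who wants to try the "make room" experiment quoted above, here is
a minimal userspace sketch of it. The function name make_room() and the
4096-byte page-size fallback are illustrative assumptions, not something from
this thread. It writes one byte per page through a volatile pointer so that
the compiler cannot drop the stores and the kernel has to back the whole
allocation with real pages.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of memory, dirty every page of it so the kernel must
 * actually provide the pages (pushing other data out if needed), then
 * free it again.  Returns 0 on success, -1 if malloc() failed.
 */
int make_room(size_t bytes)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *ptr;
	size_t off;

	if (bytes == 0)
		return 0;
	if (page <= 0)
		page = 4096;	/* assumed fallback if sysconf() fails */

	ptr = malloc(bytes);
	if (!ptr)
		return -1;

	/* One write per page is enough to fault each page in. */
	for (off = 0; off < bytes; off += (size_t)page)
		ptr[off] = 1;

	free((void *)ptr);
	return 0;
}
```

Calling make_room() with a size close to the amount of free RAM just before
"echo disk > /sys/power/state" should approximate the experiment. The right
size is machine-specific and has to be found by trial.
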
> And the thing is, swsusp_save() really does do odd things. For example, to 
> get rid of unnecessary memory, it does "drain_local_pages()", where the 
> "local" is "local cpu". Why does it do that? Likely nobody knows.
> 
> Now, that won't matter in Alan's case (he is UP), but the point is, the 
> swsuspend code does these random things to try to free up memory, and I 
> suspect it's mostly been a trial-and-error thing. And then subtle changes 
> in memory usage when allocating or writing things out will change things.
> 
> For example, there is a magic "PAGES_FOR_IO" #define, which is somewhat 
> arbitrarily set to 4MB worth of pages. Where did that number come from? 
> Who knows? But that's the number the code uses for the _initial_ check of 
> "do we have enough memory" (the one that must have passed, since it 
> actually started doing things and didn't print out a warning message).
> 
> Anyway, from the dmesg, we can see:
> 
> 	[   41.873619] PM: Shrinking memory...  Restarting tasks ... done.

Ah, thanks for pointing this out to me!
 
> and this is a clear indication that it's "swsusp_shrink_memory()" that 
> failed. If it had succeeded, you'd have seen
> 
> 	PM: Shrinking memory... done (xyz pages freed)
> 
> but it returned an error case, and then the suspend fails and starts 
> restarting tasks.

AFAICS, there's only one possible situation in which that can happen,
which is when shrink_all_memory() returns 0 and there was the assumption
that this could not happen unless there _really_ was no memory to free.
Apparently, that has recently changed and it is now possible that
shrink_all_memory() returns 0, even though there still is some memory to free.

At the moment I don't see what change caused that to happen, but shouldn't we
put .nr_reclaimed = 0 in the definition of sc in shrink_all_memory()?

Rafael
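
The failure mode described above can be modelled in plain C. This is only a
simplified sketch, not the code in kernel/power/swsusp.c: shrink_until_fits()
and the two toy shrinkers are hypothetical names, and the function pointer
stands in for shrink_all_memory(). The one thing it keeps from the real code
is the rule under discussion: a return value of 0 from the shrinker is read
as "nothing left to free" and turned into -ENOMEM.

```c
#include <errno.h>

#define SHRINK_BITE 10000	/* chunk size used by __shrink_memory() */

/* Hypothetical stand-in for shrink_all_memory(): returns pages freed. */
typedef unsigned long (*shrink_fn)(unsigned long nr_pages);

/*
 * Free pages in chunks of at most SHRINK_BITE until `excess` pages are
 * gone.  If the shrinker ever reports 0 pages freed, give up with
 * -ENOMEM, even though memory might still be reclaimable.
 */
int shrink_until_fits(long excess, shrink_fn shrink, unsigned long *freed)
{
	*freed = 0;
	while (excess > 0) {
		unsigned long want = excess > SHRINK_BITE ?
				     SHRINK_BITE : (unsigned long)excess;
		unsigned long got = shrink(want);

		if (!got)
			return -ENOMEM;
		*freed += got;
		excess -= (long)got;
	}
	return 0;
}

/* Toy shrinker that always frees what it is asked to free. */
unsigned long shrink_ok(unsigned long nr_pages)
{
	return nr_pages;
}

/* Toy shrinker that makes no progress, like the suspected regression. */
unsigned long shrink_stuck(unsigned long nr_pages)
{
	(void)nr_pages;
	return 0;
}
```

With shrink_ok the loop finishes normally; with shrink_stuck it fails on the
first chunk, which would match the "Shrinking memory...  Restarting tasks"
line in the dmesg output.
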

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 20:58                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 20:58 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Linus Torvalds, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

On Friday 17 April 2009, Alan Jenkins wrote:
> Linus Torvalds wrote:
> > On Fri, 17 Apr 2009, Alan Jenkins wrote:
> >   
> >> As another datapoint:  I tried blindly applying the commit to 2.6.29. 
> >> The resulting kernel was able to hibernate fine the first time.
> >>     
> >
> > Yeah, so it's not that commit per se that causes it. I bet it needs all 
> > the IO scheduler changes too - and even when it does that, the end result 
> > probably is really just a timing change.
> >
> >   
> >> I'm going to be annoying and try something slightly different.  In
> >> theory, I should be able to find the "first bad commit" where
> >> cherry-picking 1faa16d22 causes a problem.
> >>     
> >
> > Just for fun, try this one first and see if it makes any difference.
> >
> > Maybe the whole "swappiness=0" part was intentional. And maybe it wasn't. 
> > This is one trivial patch. Maybe it makes your machine blow up. Who knows?
> >
> > There are other differences in the shrink_all_memory() path wrt the normal 
> > memory freeing paths, but they are way more subtle. So I'm suggesting 
> > > trying this not because I think it's "The Bug(tm)", but because it's an 
> > easy test to make, and maybe it makes a difference.
> >
> > 		Linus
> > ---
> >  mm/vmscan.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 39fdfb1..d3595ed 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2113,6 +2113,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> >  	struct scan_control sc = {
> >  		.gfp_mask = GFP_KERNEL,
> >  		.may_unmap = 0,
> > +		.swap_cluster_max = SWAP_CLUSTER_MAX,
> > +		.swappiness = vm_swappiness,
> >  		.may_writepage = 1,
> >  		.isolate_pages = isolate_pages_global,
> >  	};
> >   
> 
> 
> No, that doesn't seem to affect it.

Can you please try to reproduce the problem with the appended debug patch
applied and send the output of dmesg to me?

Rafael

---
 mm/vmscan.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2115,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned
 		.may_unmap = 0,
 		.may_writepage = 1,
 		.isolate_pages = isolate_pages_global,
+		.nr_reclaimed = 0,
 	};
 
 	current->reclaim_state = &reclaim_state;
@@ -2135,6 +2136,8 @@ unsigned long shrink_all_memory(unsigned
 		nr_slab -= reclaim_state.reclaimed_slab;
 	}
 
+	printk(KERN_INFO "before: sc.nr_reclaimed = %lu\n", sc.nr_reclaimed);
+
 	/*
 	 * We try to shrink LRUs in 5 passes:
 	 * 0 = Reclaim from inactive_list only
@@ -2168,6 +2171,10 @@ unsigned long shrink_all_memory(unsigned
 
 			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
 				congestion_wait(WRITE, HZ / 10);
+
+			printk(KERN_INFO "pass = %d, prio = %d, "
+				"sc.nr_reclaimed = %lu\n", pass, prio,
+				sc.nr_reclaimed);
 		}
 	}
 
@@ -2184,6 +2191,7 @@ unsigned long shrink_all_memory(unsigned
 				reclaim_state.reclaimed_slab > 0);
 	}
 
+	printk(KERN_INFO "after: sc.nr_reclaimed = %lu\n", sc.nr_reclaimed);
 
 out:
 	current->reclaim_state = NULL;

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13066] Intel HD Audio oops
@ 2009-04-17 21:07       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:07 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Linux Kernel Mailing List, Kernel Testers List, Jeff Chua

On Friday 17 April 2009, Takashi Iwai wrote:
> At Thu, 16 Apr 2009 23:45:01 +0200 (CEST),
> Rafael J. Wysocki wrote:
> > 
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13066
> > Subject		: Intel HD Audio oops
> > Submitter	: Jeff Chua <jeff.chua.linux@gmail.com>
> > Date		: 2009-04-01 8:28 (16 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123857454625829&w=4
> 
> The fix patch was merged to the upstream as commit
> 95c0909961bc5ff18c78b2ab0d093cddc0a8b0b5.

Closed.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13067] iwl3945: wlan0: beacon loss from AP - sending probe request
@ 2009-04-17 21:09       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:09 UTC (permalink / raw)
  To: Justin Madru
  Cc: Linux Kernel Mailing List, Kernel Testers List, Maciej Rutecki

On Friday 17 April 2009, Justin Madru wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13067
> > Subject		: iwl3945: wlan0: beacon loss from AP - sending probe request
> > Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
> > Date		: 2009-04-05 9:11 (12 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123892272218266&w=4
> >
> >
> >   
> I'm getting this on .30rc2, so I confirm that it's still an issue (I'm 
> not the original submitter).
> It's really annoying because it's filling my logs like crazy. Below is 
> 15mins of logs, and it just repeats like this.
> 
> dhclient: DHCPREQUEST of 192.168.1.5 on wlan0 to 192.168.1.254 port 67
> dhclient: DHCPACK of 192.168.1.5 from 192.168.1.254
> dhclient: bound to 192.168.1.5 -- renewal in 823 seconds.
> NetworkManager: <debug> [1239937333.003788] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937339.001004] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239937453.002911] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937459.000974] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239937573.002956] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937579.000971] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239937693.003787] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937699.000981] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239937813.002710] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937819.000753] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239937933.003806] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239937939.000724] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> NetworkManager: <debug> [1239938053.002711] periodic_update(): Roamed 
> from BSSID 00:1C:DF:67:1F:AC (madru) to (none) ((none))
> kernel: wlan0: beacon loss from AP 00:1c:df:67:1f:ac - sending probe request
> NetworkManager: <debug> [1239938059.001749] periodic_update(): Roamed 
> from BSSID (none) ((none)) to 00:1C:DF:67:1F:AC (madru)
> dhclient: DHCPREQUEST of 192.168.1.5 on wlan0 to 192.168.1.254 port 67
> dhclient: DHCPACK of 192.168.1.5 from 192.168.1.254
> dhclient: bound to 192.168.1.5 -- renewal in 803 seconds.

Thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-17 21:12                         ` Linus Torvalds
  0 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-17 21:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Jenkins, Jens Axboe, Linux Kernel Mailing List, Kernel Testers List



On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
> 
> Can you please try to reproduce the problem with the appended debug patch
> applied and send the output of dmesg to me?

Maybe something like this instead (or in addition to).

It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
data.

Untested, but trivial.

		Linus
---
 kernel/power/swsusp.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/kernel/power/swsusp.c b/kernel/power/swsusp.c
index 78c3504..6e70efd 100644
--- a/kernel/power/swsusp.c
+++ b/kernel/power/swsusp.c
@@ -207,9 +207,16 @@ void swsusp_show_speed(struct timeval *start, struct timeval *stop,
 #define SHRINK_BITE	10000
 static inline unsigned long __shrink_memory(long tmp)
 {
+	unsigned long ret;
+
 	if (tmp > SHRINK_BITE)
 		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	ret = shrink_all_memory(tmp);
+	if (!ret) {
+		printk("shrink_all_memory(%ld) failed\n", tmp);
+		show_mem();
+	}
+	return ret;
 }
 
 int swsusp_shrink_memory(void)

^ permalink raw reply related	[flat|nested] 580+ messages in thread

* Re: [Bug #13099] net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
  2009-04-17  0:45     ` Ingo Molnar
  (?)
@ 2009-04-17 21:14     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linux Kernel Mailing List, Kernel Testers List, Stephen Hemminger

On Friday 17 April 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
> > Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
> > Submitter	: Ingo Molnar <mingo@elte.hu>
> > Date		: 2009-04-06 9:03 (11 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
> > Handled-By	: Stephen Hemminger <shemminger@vyatta.com>
> 
> I think this can be closed as the fix has been merged upstream 
> already (via the PCI tree):
> 
>  commit d407e32efe060afa2b9a797a91376ebc65b4ce11
>  Author: Anton Vorontsov <avorontsov@ru.mvista.com>
>  Date:   Wed Apr 1 02:23:41 2009 +0400
> 
>     PCI: Fix oops in pci_vpd_truncate

Thanks, closed.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13106] 2.6.30-rc1: intel 3945 no wireless
@ 2009-04-17 21:16         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:16 UTC (permalink / raw)
  To: Justin Madru; +Cc: Larry Finger, Linux Kernel Mailing List, Kernel Testers List

On Friday 17 April 2009, Justin Madru wrote:
> Larry Finger wrote:
> > Rafael J. Wysocki wrote:
> >   
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.29.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13106
> >> Subject		: 2.6.30-rc1: intel 3945 no wireless
> >> Submitter	: 2.6.30-rc1: intel 3945 no wireless
> >> Date		: 2009-04-08 5:36 (9 days old)
> >> References	: http://marc.info/?l=linux-kernel&m=123916905605534&w=4
> >>     
> >
> > That regression was fixed by Herbert Xu's commit 97c18e2c. It should no longer
> > be listed.
> >
> > Larry
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> >
> >   
> I'm the original submitter. I confirm that it's been fixed -- close bug.

Thanks for the confirmation, the bug has been closed already.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13113] tiobench read 50% regression with 2.6.30-rc1
@ 2009-04-17 21:22       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel Mailing List, Kernel Testers List, Zhang, Yanmin

On Friday 17 April 2009, Jens Axboe wrote:
> On Thu, Apr 16 2009, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13113
> > Subject		: tiobench read 50% regression with 2.6.30-rc1
> > Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
> > Date		: 2009-04-09 8:29 (8 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123926576802992&w=4
> > Handled-By	: Jens Axboe <jens.axboe@oracle.com>
> > Patch		: http://marc.info/?l=linux-kernel&m=123971130800697&w=4
> 
> It's fixed by d6ceb25e8d8bccf826848c2621a50d02c0a7f4ae, which is already
> merged.

Thanks, closed.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  1:25   ` Ingo Molnar
@ 2009-04-17 21:25         ` Rafael J. Wysocki
       [not found]     ` <20090417012544.GB16126-X9Un+BFzKDI@public.gmane.org>
  1 sibling, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Arjan van de Ven, Linux Kernel Mailing List,
	Adrian Bunk, Andrew Morton, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List

On Friday 17 April 2009, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > 
> > 
> > I think you put this in the wrong regression pile:
> > 
> > > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13112
> > > Subject		: Oops in drain_array
> > > Submitter	: Bart <mmx@riz.pl>
> > > Date		: 2009-04-14 10:21 (3 days old)
> > > References	: http://marc.info/?l=linux-kernel&m=123970493224628&w=4
> > 
> > Hmm. This one seems like it should be in the "since 2.6.28" camp, since if 
> > I read that one right, it happens with 2.6.29.1.
> > 
> > (I mean sure, it might be new since 2.6.29, but it sounds more likely that 
> > it's already in 2.6.29)
> 
> I'd suspect it's possibly hardware related:
> 
>   http://www.kerneloops.org/search.php?search=free_block&btnG=Function+Search
> 
> Look at the very similar call signatures - spanning almost all 
> kernels back to v2.6.16. There's one spike at .27 - perhaps the same 
> box trying up hard and crashing several times - or a popular distro 
> kernel?
> 
> Or it's a really ancient bug going back to v2.6.16.

I have moved this one onto the list of regressions from 2.6.28.  When it is
confirmed that the bug is older, I'll drop it from there.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  0:41 ` David Miller
@ 2009-04-17 21:27   ` Rafael J. Wysocki
  2009-04-17 21:27   ` Rafael J. Wysocki
  1 sibling, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:27 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, bunk, akpm, torvalds, protasnb, kernel-testers,
	netdev, linux-acpi, linux-pm, linux-scsi

On Friday 17 April 2009, David Miller wrote:
> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Thu, 16 Apr 2009 23:42:31 +0200 (CEST)
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13099
> > Subject		: net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate()
> > Submitter	: Ingo Molnar <mingo@elte.hu>
> > Date		: 2009-04-06 9:03 (11 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123900867611321&w=4
> > Handled-By	: Stephen Hemminger <shemminger@vyatta.com>
> 
> Fixed by:
> 
> commit d407e32efe060afa2b9a797a91376ebc65b4ce11
> Author: Anton Vorontsov <avorontsov@ru.mvista.com>
> Date:   Wed Apr 1 02:23:41 2009 +0400
> 
>     PCI: Fix oops in pci_vpd_truncate
>     
>     pci_vpd_truncate() should check for dev->vpd->attr, otherwise this might
>     happen:
>     
>       sky2 driver version 1.22
>       Unable to handle kernel paging request for data at address 0x0000000c
>       Faulting instruction address: 0xc01836fc
>       Oops: Kernel access of bad area, sig: 11 [#1]
>       [...]
>       NIP [c01836fc] pci_vpd_truncate+0x38/0x40
>       LR [c029be18] sky2_probe+0x14c/0x518
>       Call Trace:
>       [ef82bde0] [c029bda4] sky2_probe+0xd8/0x518 (unreliable)
>       [ef82be20] [c018a11c] local_pci_probe+0x24/0x34
>       [ef82be30] [c018a14c] pci_call_probe+0x20/0x30
>       [ef82be50] [c018a330] __pci_device_probe+0x64/0x78
>       [ef82be60] [c018a44c] pci_device_probe+0x30/0x58
>       [ef82be80] [c01aa270] really_probe+0x78/0x1a0
>       [ef82bea0] [c01aa460] __driver_attach+0xa4/0xa8
>       [ef82bec0] [c01a96ac] bus_for_each_dev+0x60/0x9c
>       [ef82bef0] [c01aa0b4] driver_attach+0x24/0x34
>       [ef82bf00] [c01a9e08] bus_add_driver+0x12c/0x1cc
>       [ef82bf20] [c01aa87c] driver_register+0x6c/0x110
>       [ef82bf30] [c018a770] __pci_register_driver+0x4c/0x9c
>       [ef82bf50] [c03782c8] sky2_init_module+0x30/0x40
>       [ef82bf60] [c0001dbc] do_one_initcall+0x34/0x1a0
>       [ef82bfd0] [c0362240] do_initcalls+0x38/0x58
>     
>     This happens with CONFIG_SKY2=y, and "ip=on" kernel command line, so
>     pci_vpd_truncate() is called before late_initcall(pci_sysfs_init),
>     therefore ->attr isn't yet initialized.
>     
>     Acked-by: Stephen Hemminger <shemminger@vyatta.com>
>     Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
>     Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Thanks, closed.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread
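
The commit message quoted above says that pci_vpd_truncate() must check
dev->vpd->attr before using it, because with a built-in sky2 driver the probe
can run before the late initcall that sets up the sysfs attribute. The
following is only a user-space sketch of that kind of guard, not the kernel
code itself: the *_model structures and vpd_truncate_model() are hypothetical
stand-ins invented for illustration, and the exact checks in the merged patch
may differ.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the sysfs binary attribute of the VPD. */
struct bin_attribute_model {
    size_t size;                        /* size exposed to user space */
};

/* Hypothetical stand-in for the per-device VPD state. */
struct pci_vpd_model {
    size_t len;                         /* usable VPD length */
    struct bin_attribute_model *attr;   /* NULL until sysfs setup has run */
};

/* Hypothetical stand-in for the PCI device. */
struct pci_dev_model {
    struct pci_vpd_model *vpd;          /* NULL if the device has no VPD */
};

/*
 * Shrink the VPD length. The attribute size is only updated when the
 * attribute exists, so an early caller no longer dereferences NULL.
 */
int vpd_truncate_model(struct pci_dev_model *dev, size_t size)
{
    if (!dev || !dev->vpd)
        return -EINVAL;

    /* Truncation may only shrink the area, never grow it. */
    if (size > dev->vpd->len)
        return -EINVAL;

    dev->vpd->len = size;

    /* The guard described in the commit message: attr may not exist yet. */
    if (dev->vpd->attr)
        dev->vpd->attr->size = size;

    return 0;
}
```

In this model, a call made while attr is still NULL only updates len, which
corresponds to the early-probe case from the oops report; once attr has been
assigned, a later call updates both fields.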

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  0:46   ` Linus Torvalds
  (?)
  (?)
@ 2009-04-17 21:31   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List

On Friday 17 April 2009, Linus Torvalds wrote:
> 
> On Thu, 16 Apr 2009, Rafael J. Wysocki wrote:
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13098
> > Subject		: 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU
> > Submitter	: Andi Kleen <andi@firstfloor.org>
> > Date		: 2009-04-06 01:14 (11 days old)
> > References	: http://lkml.org/lkml/2009/4/5/200
> > Handled-By	: H. Peter Anvin <hpa@zytor.com>
> 
> I think this got fixed already. The VGA mode setting was reverted back to 
> the old order.

Closed.

> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13044
> > Subject		: 2.6.30-rc1 can't find the root fs
> > Submitter	: Heinz Diehl <htd@fancy-poultry.org>
> > Date		: 2009-04-08 13:35 (9 days old)
> 
> This was one of the async things that got fixed by just waiting for module 
> async work to finish.

Closed.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  1:30 ` Zhang Rui
@ 2009-04-17 21:35     ` Rafael J. Wysocki
  2009-04-17  2:34   ` yakui_zhao
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:35 UTC (permalink / raw)
  To: Zhang Rui
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List

On Friday 17 April 2009, Zhang Rui wrote:
> On Fri, 2009-04-17 at 05:42 +0800, Rafael J. Wysocki wrote:
> > 
> > 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13095
> > Subject         : thinkpad-acpi: cannot control brightness with hotkeys
> > Submitter       : Niel Lambrechts <niel.lambrechts@gmail.com>
> > Date            : 2009-04-11 23:07 (6 days old)
> > References      : http://lkml.org/lkml/2009/4/11/160
> > Handled-By      : Matthew Garrett <mjg59@srcf.ucam.org>
> > Patch           : http://lkml.org/lkml/2009/4/15/339
> > 
> > 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13048
> > Subject         : /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45.
> > Submitter       : Rodrigo L. Batista <rodrigo@gus-mg.org>
> > Date            : 2009-04-09 04:57 (8 days old)
> > Handled-By      : yakui_zhao <yakui.zhao@intel.com>
> > Patch           : http://bugzilla.kernel.org/attachment.cgi?id=20967
> >                   http://bugzilla.kernel.org/attachment.cgi?id=20959
> > 
> > 
> bug 13095 is a duplicate of bug 13048.
> patches from Matthew and Yakui are for the same issue.

OK, closed 13095 as a duplicate.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17  1:37   ` Ming Lei
  (?)
  (?)
@ 2009-04-17 21:36   ` Rafael J. Wysocki
  2009-04-17 23:56     ` Laurent Pinchart
                       ` (3 more replies)
  -1 siblings, 4 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:36 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

On Friday 17 April 2009, Ming Lei wrote:
> 2009/4/17 Rafael J. Wysocki <rjw@sisk.pl>:
> >
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13125
> > Subject         : active uvcvideo breaks over suspend
> > Submitter       : Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> > Date            : 2009-04-15 10:12 (2 days old)
> > References      : http://marc.info/?l=linux-kernel&m=123979009508840&w=4
> >
> 
> It is a bug in resume path of uvcvideo driver, and I have sent a patch
> to laurent.pinchart@skynet.be,
> mchehab@infradead.org  and video4linux-list@redhat.com to fix it, but
> still no echo from them.
> 
> The patch title is V4L/DVB:usbvideo:fix uvc resume failed.
> 
> Rafael J.
>         If you would like to apply it, I can resend it to you.  Thanks!

Please resend.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
@ 2009-04-17 21:38     ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-17 21:38 UTC (permalink / raw)
  To: Thomas Meyer
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List

On Friday 17 April 2009, Thomas Meyer wrote:
> 
> Quoting "Rafael J. Wysocki" <rjw@sisk.pl>:
> 
> > If you know of any other unresolved regressions from 2.6.29, please let me know
> > either and I'll add them to the list.  Also, please let me know if any of the
> > entries below are invalid.
> 
> Two things on 2.6.30-rc2:
> 
> 1) Kernel panic while shutting down the system:
> http://m3y3r.de/wordpress/?p=67

Can you please open a Bugzilla entry for this one (please add my address to the
CC list of the bug entry)?

> 2) Backlight daemon dies (hald-addon-macbookpro-backlight). Don't know why.

Isn't that bug #13048?

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17 21:36   ` Rafael J. Wysocki
@ 2009-04-17 23:56     ` Laurent Pinchart
  2009-04-18 12:29       ` Rafael J. Wysocki
       [not found]       ` <200904180156.24366.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
  2009-04-17 23:56     ` Laurent Pinchart
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 580+ messages in thread
From: Laurent Pinchart @ 2009-04-17 23:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ming Lei, Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, mchehab

Hi,

On Friday 17 April 2009 23:36:11 Rafael J. Wysocki wrote:
> On Friday 17 April 2009, Ming Lei wrote:
> > 2009/4/17 Rafael J. Wysocki <rjw@sisk.pl>:
> > > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13125
> > > Subject         : active uvcvideo breaks over suspend
> > > Submitter       : Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> > > Date            : 2009-04-15 10:12 (2 days old)
> > > References      :
> > > http://marc.info/?l=linux-kernel&m=123979009508840&w=4
> >
> > It is a bug in resume path of uvcvideo driver, and I have sent a patch
> > to laurent.pinchart@skynet.be,
> > mchehab@infradead.org  and video4linux-list@redhat.com to fix it, but
> > still no echo from them.
> >
> > The patch title is V4L/DVB:usbvideo:fix uvc resume failed.
> >
> > Rafael J.
> > >         If you would like to apply it, I can resend it to you.  Thanks!
>
> Please resend.

I'm reviewing the patch and I'll push it through my tree during the weekend. 
Sorry for the delay, I'm currently traveling.

Best regards,

Laurent Pinchart


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17 21:36   ` Rafael J. Wysocki
  2009-04-17 23:56     ` Laurent Pinchart
@ 2009-04-18  2:32       ` leiming
  2009-04-18  2:32     ` leiming
  2009-04-18  2:32       ` leiming
  3 siblings, 0 replies; 580+ messages in thread
From: leiming @ 2009-04-18  2:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

On Fri, 17 Apr 2009 23:36:11 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Friday 17 April 2009, Ming Lei wrote:
> > 2009/4/17 Rafael J. Wysocki <rjw@sisk.pl>:
> > >
> > > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13125
> > > Subject         : active uvcvideo breaks over suspend
> > > Submitter       : Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> > > Date            : 2009-04-15 10:12 (2 days old)
> > > References      :
> > > http://marc.info/?l=linux-kernel&m=123979009508840&w=4
> > >
> > 
> > It is a bug in resume path of uvcvideo driver, and I have sent a
> > patch to laurent.pinchart@skynet.be,
> > mchehab@infradead.org  and video4linux-list@redhat.com to fix it,
> > but still no echo from them.
> > 
> > The patch title is V4L/DVB:usbvideo:fix uvc resume failed.
> > 
> > Rafael J.
> >         If you would like to apply it ,I can resend to you.  Thanks!
> 
> Please resend.
> 
> Rafael

From 5715e310a939f3f7cd3e88eae8f25fedbb28def4 Mon Sep 17 00:00:00 2001
From: Ming Lei <tom.leiming@gmail.com>
Date: Wed, 15 Apr 2009 22:32:51 +0800
Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed

URB buffers are no longer freed before suspend, so during resume
uvc_alloc_urb_buffers() should return the packet count that was
originally allocated instead of zero.

This patch is against v2.6.30-rc2.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/media/video/uvc/uvc_video.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/media/video/uvc/uvc_video.c b/drivers/media/video/uvc/uvc_video.c
index a95e173..c050b22 100644
--- a/drivers/media/video/uvc/uvc_video.c
+++ b/drivers/media/video/uvc/uvc_video.c
@@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct uvc_video_device *video,
 
 	/* Buffers are already allocated, bail out. */
 	if (video->urb_size)
-		return 0;
+		return DIV_ROUND_UP(video->urb_size, psize);
 
 	/* Compute the number of packets. Bulk endpoints might transfer UVC
 	 * payloads accross multiple URBs.
-- 
1.6.0.GIT



-- 
Lei Ming
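
A minimal sketch of the failure the patch description refers to, assuming
a caller that treats a packet count of zero as an allocation failure. The
struct, the function names and the sizes below are illustrative stand-ins,
not the driver's actual code:

```c
#include <assert.h>

/* Simplified stand-in for the driver state (hypothetical). */
struct video_model {
	unsigned int urb_size;	/* non-zero once URB buffers are allocated */
};

/* Pre-patch behaviour: an already allocated buffer reports 0 packets. */
static unsigned int alloc_urb_buffers_old(struct video_model *v,
					  unsigned int size,
					  unsigned int psize)
{
	unsigned int npackets;

	/* Buffers are already allocated, bail out. */
	if (v->urb_size)
		return 0;

	npackets = (size + psize - 1) / psize;
	v->urb_size = psize * npackets;
	return npackets;
}

/* Hypothetical caller: 0 packets is taken to mean "allocation failed". */
static int init_video_old(struct video_model *v, unsigned int size,
			  unsigned int psize)
{
	return alloc_urb_buffers_old(v, size, psize) == 0 ? -1 : 0;
}

/* First start succeeds; resume fails because the buffers were kept. */
void check_resume_failure(void)
{
	struct video_model v = { 0 };

	assert(init_video_old(&v, 98304, 3072) == 0);	/* first start */
	assert(init_video_old(&v, 98304, 3072) == -1);	/* resume */
}
```

Calling check_resume_failure() from a main() exercises both paths. Under
these assumptions the second call fails only because urb_size is still set
from before suspend.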

^ permalink raw reply related	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-18  2:32       ` leiming
@ 2009-04-18  2:55         ` Linus Torvalds
  -1 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-04-18  2:55 UTC (permalink / raw)
  To: leiming
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab



On Sat, 18 Apr 2009, leiming wrote:
> 
> From 5715e310a939f3f7cd3e88eae8f25fedbb28def4 Mon Sep 17 00:00:00 2001
> From: Ming Lei <tom.leiming@gmail.com>
> Date: Wed, 15 Apr 2009 22:32:51 +0800
> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed
> 
> Now urb buffers is not freed before suspend, so uvc_alloc_urb_buffers
> should return packet counts allocated originally during uvc resume
> , instead of zero.
> 
> This patch is against v2.6.30-rc2.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/media/video/uvc/uvc_video.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/media/video/uvc/uvc_video.c b/drivers/media/video/uvc/uvc_video.c
> index a95e173..c050b22 100644
> --- a/drivers/media/video/uvc/uvc_video.c
> +++ b/drivers/media/video/uvc/uvc_video.c
> @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct uvc_video_device *video,
>  
>  	/* Buffers are already allocated, bail out. */
>  	if (video->urb_size)
> -		return 0;
> +		return DIV_ROUND_UP(video->urb_size, psize);

I don't think this is right. It should round _down_.

It's supposed to return 'npackets', but if you pass it a different packet 
size than it was passed originally, it can now return a potentially bigger 
number than the already allocated buffer, no?

So I think it should round down (ie use a regular divide). No?
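
To make the concern concrete, here is a small sketch with made-up numbers:
a buffer allocated for 32 packets of 3072 bytes, then queried with a
packet size of 1000. The helper names are hypothetical and DIV_ROUND_UP
is a local copy of the usual kernel definition:

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bytes that npackets packets of psize bytes would occupy. */
static unsigned int bytes_needed(unsigned int npackets, unsigned int psize)
{
	return npackets * psize;
}

static unsigned int packets_round_up(unsigned int urb_size,
				     unsigned int psize)
{
	return DIV_ROUND_UP(urb_size, psize);
}

static unsigned int packets_round_down(unsigned int urb_size,
				       unsigned int psize)
{
	return urb_size / psize;
}

void check_rounding(void)
{
	const unsigned int urb_size = 3072 * 32;	/* 98304 bytes */
	const unsigned int new_psize = 1000;		/* illustrative */

	/* Rounding up: 99 packets need 99000 bytes, more than allocated. */
	assert(packets_round_up(urb_size, new_psize) == 99);
	assert(bytes_needed(99, new_psize) > urb_size);

	/* Rounding down: 98 packets need 98000 bytes, which fits. */
	assert(packets_round_down(urb_size, new_psize) == 98);
	assert(bytes_needed(98, new_psize) <= urb_size);
}
```

With these numbers the rounded-up count describes 696 bytes more than the
buffer holds, while the plain divide stays within it.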

		Linus

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-18  2:55         ` Linus Torvalds
  (?)
@ 2009-04-18  3:50         ` leiming
  -1 siblings, 0 replies; 580+ messages in thread
From: leiming @ 2009-04-18  3:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
> > uvc_video_device *video, 
> >  	/* Buffers are already allocated, bail out. */
> >  	if (video->urb_size)
> > -		return 0;
> > +		return DIV_ROUND_UP(video->urb_size, psize);
> 
> I don't think this is right. It should round _down_.
> 
> It's supposed to return 'npackets', but if you pass it a different
> packet size than it was passed originally, it can now return a

Currently uvc reuses the previously allocated buffer only in the
suspend/resume path, so the packet size doesn't change in this path.

> potentially bigger number than the already allocated buffer, no?

If this case does exist, the URBs need to be updated and the patch is 
not enough. 

> 
> So I think it should round down (ie use a regular divide). No?

Because of the following fact:

	uvc_alloc_urb_buffers()
	{
		...
		video->urb_size = psize * npackets; 
		...
	}

DIV_ROUND_UP still works correctly.
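
This holds only while psize is the value that was used for the
allocation: urb_size is then an exact multiple of psize, so rounding up
and rounding down recover the same npackets. A small sketch that checks
this over an illustrative range of sizes and counts (the ranges and the
helper names are assumptions, and DIV_ROUND_UP is a local copy of the
usual kernel definition):

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* 1 if both roundings recover npackets from urb_size = psize * npackets. */
static int roundings_agree(unsigned int psize, unsigned int npackets)
{
	unsigned int urb_size = psize * npackets;

	return DIV_ROUND_UP(urb_size, psize) == npackets &&
	       urb_size / psize == npackets;
}

/* Check every combination in the illustrative range. */
void check_exact_multiple(void)
{
	unsigned int psize, npackets;

	for (psize = 1; psize <= 3072; psize++)
		for (npackets = 1; npackets <= 32; npackets++)
			assert(roundings_agree(psize, npackets));
}
```

The two roundings differ only when urb_size is not a multiple of the
psize passed in, which is the case of a changed packet size.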

Thanks!

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-18  2:55         ` Linus Torvalds
@ 2009-04-18  4:51           ` leiming
  -1 siblings, 0 replies; 580+ messages in thread
From: leiming @ 2009-04-18  4:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
> > uvc_video_device *video, 
> >  	/* Buffers are already allocated, bail out. */
> >  	if (video->urb_size)
> > -		return 0;
> > +		return DIV_ROUND_UP(video->urb_size, psize);
> 
> I don't think this is right. It should round _down_.
> 
> It's supposed to return 'npackets', but if you pass it a different
> packet size than it was passed originally, it can now return a
> potentially bigger number than the already allocated buffer, no?
> 
> So I think it should round down (ie use a regular divide). No?

Yes, you are correct. Please ignore my last reply; the fixed patch
follows.

Thanks.

From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17 00:00:00 2001
From: Ming Lei <tom.leiming@gmail.com>
Date: Wed, 15 Apr 2009 22:32:51 +0800
Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)

URB buffers are no longer freed before suspend, so during resume
uvc_alloc_urb_buffers() should return the packet count that was
originally allocated instead of zero.

This version rounds down when computing the packet count, as Linus
suggested. Rounding up could corrupt the buffer if the packet size
is changed before uvc_alloc_urb_buffers() is called again.

This patch is against v2.6.30-rc2.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/media/video/uvc/uvc_video.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/media/video/uvc/uvc_video.c b/drivers/media/video/uvc/uvc_video.c
index a95e173..6ce974d 100644
--- a/drivers/media/video/uvc/uvc_video.c
+++ b/drivers/media/video/uvc/uvc_video.c
@@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct uvc_video_device *video,
 
 	/* Buffers are already allocated, bail out. */
 	if (video->urb_size)
-		return 0;
+		return video->urb_size / psize;
 
 	/* Compute the number of packets. Bulk endpoints might transfer UVC
 	 * payloads accross multiple URBs.
-- 
1.6.0.GIT





-- 
Lei Ming

^ permalink raw reply related	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-18  8:16                           ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-18  8:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

Linus Torvalds wrote:
> On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
>   
>> Can you please try to reproduce the problem with the appended debug patch
>> applied and send the output of dmesg to me?
>>     
>
> Maybe something like this instead (or in addition to).
>
> It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
> data.
>
> Untested, but trivial.
>
> 		Linus
> ---
>   

Ok, I applied both your and Rafael's debug patches.  dmesg attached.

After the failed hibernation, I noticed my touchpad wasn't working.  But 
I think that's something else.  I had another go and couldn't reproduce 
that.  It's happened to me once before while testing 2.6.30-; I've also 
had the keyboard stop working at least once.  I'm hoping it's the same 
bug as the "20 ACPI interrupts per second on EEEPC" bug.  It could be 
overloading my bug-ridden EC, which also acts as the keyboard controller.

Thanks
Alan

[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 53281 bytes --]

[    0.000000] Linux version 2.6.30-rc2eeepc (alan@alan-desktop) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #240 Sat Apr 18 08:47:04 BST 2009
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   NSC Geode by NSC
[    0.000000]   Cyrix CyrixInstead
[    0.000000]   Centaur CentaurHauls
[    0.000000]   Transmeta GenuineTMx86
[    0.000000]   Transmeta TransmetaCPU
[    0.000000]   UMC UMC UMC UMC
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000001f780000 (usable)
[    0.000000]  BIOS-e820: 000000001f780000 - 000000001f790000 (ACPI data)
[    0.000000]  BIOS-e820: 000000001f790000 - 000000001f7d0000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000001f7d0000 - 000000001f7de000 (reserved)
[    0.000000]  BIOS-e820: 000000001f7e0000 - 000000001f800000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[    0.000000] DMI present.
[    0.000000] AMI BIOS detected: BIOS may corrupt low RAM, working around it.
[    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[    0.000000] last_pfn = 0x1f780 max_arch_pfn = 0x1000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-DFFFF uncachable
[    0.000000]   E0000-EFFFF write-through
[    0.000000]   F0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 000000000 mask FE0000000 write-back
[    0.000000]   1 base 01F800000 mask FFF800000 uncachable
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] init_memory_mapping: 0000000000000000-000000001f780000
[    0.000000] NX (Execute Disable) protection: active
[    0.000000]  0000000000 - 0000200000 page 4k
[    0.000000]  0000200000 - 001f600000 page 2M
[    0.000000]  001f600000 - 001f780000 page 4k
[    0.000000] kernel direct mapping tables up to 1f780000 @ 10000-16000
[    0.000000] RAMDISK: 17724000 - 179df533
[    0.000000] ACPI: RSDP 000fbe50 00014 (v00 ACPIAM)
[    0.000000] ACPI: RSDT 1f780000 00034 (v01 A M I  OEMRSDT  03000911 MSFT 00000097)
[    0.000000] ACPI: FACP 1f780200 00081 (v01 A M I  OEMFACP  03000911 MSFT 00000097)
[    0.000000] ACPI: DSDT 1f780400 06069 (v01  A0797 A0797000 00000000 INTL 20060113)
[    0.000000] ACPI: FACS 1f790000 00040
[    0.000000] ACPI: APIC 1f780390 00068 (v01 A M I  OEMAPIC  03000911 MSFT 00000097)
[    0.000000] ACPI: OEMB 1f790040 00046 (v01 A M I  AMI_OEM  03000911 MSFT 00000097)
[    0.000000] ACPI: MCFG 1f786470 0003C (v01 A M I  OEMMCFG  03000911 MSFT 00000097)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] 503MB LOWMEM available.
[    0.000000]   mapped low ram: 0 - 1f780000
[    0.000000]   low ram: 0 - 1f780000
[    0.000000]   node 0 low ram: 00000000 - 1f780000
[    0.000000]   node 0 bootmap 00012000 - 00015ef0
[    0.000000] (7 early reservations) ==> bootmem [0000000000 - 001f780000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
[    0.000000]   #1 [0000100000 - 00008ec0ac]    TEXT DATA BSS ==> [0000100000 - 00008ec0ac]
[    0.000000]   #2 [0017724000 - 00179df533]          RAMDISK ==> [0017724000 - 00179df533]
[    0.000000]   #3 [000009fc00 - 0000100000]    BIOS reserved ==> [000009fc00 - 0000100000]
[    0.000000]   #4 [00008ed000 - 00008f41f4]              BRK ==> [00008ed000 - 00008f41f4]
[    0.000000]   #5 [0000010000 - 0000012000]          PGTABLE ==> [0000010000 - 0000012000]
[    0.000000]   #6 [0000012000 - 0000016000]          BOOTMAP ==> [0000012000 - 0000016000]
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000010 -> 0x00001000
[    0.000000]   Normal   0x00001000 -> 0x0001f780
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[2] active PFN ranges
[    0.000000]     0: 0x00000010 -> 0x0000009f
[    0.000000]     0: 0x00000100 -> 0x0001f780
[    0.000000] On node 0 totalpages: 128783
[    0.000000] free_area_init_node: node 0, pgdat c043836c, node_mem_map c1000200
[    0.000000]   DMA zone: 32 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 3951 pages, LIFO batch:0
[    0.000000]   Normal zone: 975 pages used for memmap
[    0.000000]   Normal zone: 123825 pages, LIFO batch:31
[    0.000000] Using APIC driver default
[    0.000000] ACPI: PM-Timer IO Port: 0x808
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ2 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] Enabling APIC mode:  Flat.  Using 1 I/O APICs
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] nr_irqs_gsi: 24
[    0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e4000
[    0.000000] PM: Registered nosave memory: 00000000000e4000 - 0000000000100000
[    0.000000] Allocating PCI resources starting at 20000000 (gap: 1f800000:df600000)
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 127776
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.30-rc2eeepc root=/dev/sda2 ro root=/dev/sda2 rootfstype=ext4 resume=/dev/sda2 resume_offset=9732
[    0.000000] Enabling fast FPU save and restore... done.
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
[    0.000000] Initializing CPU#0
[    0.000000] NR_IRQS:288
[    0.000000] PID hash table entries: 2048 (order: 11, 8192 bytes)
[    0.000000] Fast TSC calibration using PIT
[    0.000000] Detected 630.041 MHz processor.
[    0.003333] Console: colour VGA+ 80x25
[    0.003333] console [tty0] enabled
[    0.003333] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.003333] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.003333] ... MAX_LOCK_DEPTH:          48
[    0.003333] ... MAX_LOCKDEP_KEYS:        8191
[    0.003333] ... CLASSHASH_SIZE:          4096
[    0.003333] ... MAX_LOCKDEP_ENTRIES:     8192
[    0.003333] ... MAX_LOCKDEP_CHAINS:      16384
[    0.003333] ... CHAINHASH_SIZE:          8192
[    0.003333]  memory used by lock dependency info: 2847 kB
[    0.003333]  per task-struct memory footprint: 1152 bytes
[    0.003333] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.003333] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.003333] Memory: 499536k/515584k available (2045k kernel code, 15424k reserved, 1290k data, 248k init, 0k highmem)
[    0.003333] virtual kernel memory layout:
[    0.003333]     fixmap  : 0xfffaa000 - 0xfffff000   ( 340 kB)
[    0.003333]     vmalloc : 0xdff80000 - 0xfffa8000   ( 512 MB)
[    0.003333]     lowmem  : 0xc0000000 - 0xdf780000   ( 503 MB)
[    0.003333]       .init : 0xc0446000 - 0xc0484000   ( 248 kB)
[    0.003333]       .data : 0xc02ff503 - 0xc04420a8   (1290 kB)
[    0.003333]       .text : 0xc0100000 - 0xc02ff503   (2045 kB)
[    0.003333] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[    0.003333] SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.003383] Calibrating delay loop (skipped), value calculated using timer frequency.. 1260.58 BogoMIPS (lpj=2100136)
[    0.003680] Mount-cache hash table entries: 512
[    0.006328] CPU: L1 I cache: 32K, L1 D cache: 32K
[    0.006474] CPU: L2 cache: 512K
[    0.006681] [ds] using Pentium M configuration
[    0.006778] [ds] pebs not available
[    0.006874] Intel machine check architecture supported.
[    0.006978] Intel machine check reporting enabled on CPU#0.
[    0.007093] CPU: Intel(R) Celeron(R) M processor          900MHz stepping 08
[    0.007298] Checking 'hlt' instruction... OK.
[    0.020678] ACPI: Core revision 20090320
[    0.054619] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.089999] PM: Adding info for No Bus:platform
[    0.089999] net_namespace: 1136 bytes
[    0.091819] NET: Registered protocol family 16
[    0.092936] PM: Adding info for No Bus:vtcon0
[    0.093174] ACPI: bus type pci registered
[    0.093492] PM: Adding info for No Bus:id
[    0.093715] PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
[    0.093827] PCI: Not using MMCONFIG.
[    0.094216] PCI: PCI BIOS revision 3.00 entry at 0xf0031, last bus=5
[    0.094321] PCI: Using configuration type 1 for base access
[    0.097919] PM: Adding info for No Bus:default
[    0.098189] bio: create slab <bio-0> at 0
[    0.120455] ACPI: EC: Look up EC in DSDT
[    0.157798] ACPI: Interpreter enabled
[    0.157918] ACPI: (supports S0 S1 S3 S4 S5)
[    0.158404] ACPI: Using IOAPIC for interrupt routing
[    0.158727] PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
[    0.168936] PCI: MCFG area at e0000000 reserved in ACPI motherboard resources
[    0.169048] PCI: Using MMCONFIG for extended config space
[    0.169477] PM: Adding info for acpi:LNXSYSTM:00
[    0.169626] PM: Adding info for acpi:LNXPWRBN:00
[    0.169822] PM: Adding info for acpi:ACPI_CPU:00
[    0.170102] PM: Adding info for acpi:device:00
[    0.170883] PM: Adding info for acpi:PNP0A08:00
[    0.171147] PM: Adding info for acpi:PNP0C01:00
[    0.172036] PM: Adding info for acpi:device:01
[    0.172887] PM: Adding info for acpi:device:02
[    0.173746] PM: Adding info for acpi:device:03
[    0.174591] PM: Adding info for acpi:device:04
[    0.175430] PM: Adding info for acpi:device:05
[    0.175645] PM: Adding info for acpi:device:06
[    0.175889] PM: Adding info for acpi:PNP0000:00
[    0.176126] PM: Adding info for acpi:PNP0200:00
[    0.176365] PM: Adding info for acpi:PNP0100:00
[    0.176595] PM: Adding info for acpi:PNP0B00:00
[    0.177248] PM: Adding info for acpi:PNP0303:00
[    0.177887] PM: Adding info for acpi:SYN0A00:00
[    0.178138] PM: Adding info for acpi:PNP0800:00
[    0.178369] PM: Adding info for acpi:PNP0C04:00
[    0.178642] PM: Adding info for acpi:PNP0C09:00
[    0.179538] PM: Adding info for acpi:PNP0C02:00
[    0.180313] PM: Adding info for acpi:PNP0C02:01
[    0.180635] PM: Adding info for acpi:PNP0C02:02
[    0.181419] PM: Adding info for acpi:PNP0C0A:00
[    0.181728] PM: Adding info for acpi:ACPI0003:00
[    0.181973] PM: Adding info for acpi:device:07
[    0.182314] PM: Adding info for acpi:device:08
[    0.182516] PM: Adding info for acpi:device:09
[    0.182729] PM: Adding info for acpi:device:0a
[    0.182938] PM: Adding info for acpi:device:0b
[    0.183138] PM: Adding info for acpi:device:0c
[    0.183359] PM: Adding info for acpi:device:0d
[    0.184311] PM: Adding info for acpi:device:0e
[    0.185157] PM: Adding info for acpi:device:0f
[    0.186004] PM: Adding info for acpi:device:10
[    0.186868] PM: Adding info for acpi:device:11
[    0.187718] PM: Adding info for acpi:device:12
[    0.188563] PM: Adding info for acpi:device:13
[    0.188768] PM: Adding info for acpi:device:14
[    0.188975] PM: Adding info for acpi:device:15
[    0.189218] PM: Adding info for acpi:device:16
[    0.189482] PM: Adding info for acpi:device:17
[    0.189692] PM: Adding info for acpi:device:18
[    0.189901] PM: Adding info for acpi:device:19
[    0.190214] PM: Adding info for acpi:PNP0C02:03
[    0.190461] PM: Adding info for acpi:PNP0C01:01
[    0.190717] PM: Adding info for acpi:ASUS010:00
[    0.191022] PM: Adding info for acpi:PNP0C0D:00
[    0.191261] PM: Adding info for acpi:PNP0C0E:00
[    0.191524] PM: Adding info for acpi:PNP0C0C:00
[    0.192134] PM: Adding info for acpi:PNP0C0F:00
[    0.192742] PM: Adding info for acpi:PNP0C0F:01
[    0.193381] PM: Adding info for acpi:PNP0C0F:02
[    0.193995] PM: Adding info for acpi:PNP0C0F:03
[    0.194603] PM: Adding info for acpi:PNP0C0F:04
[    0.195217] PM: Adding info for acpi:PNP0C0F:05
[    0.195822] PM: Adding info for acpi:PNP0C0F:06
[    0.196441] PM: Adding info for acpi:PNP0C0F:07
[    0.196750] PM: Adding info for acpi:LNXTHERM:00
[    0.196918] PM: Adding info for acpi:LNXTHERM:01
[    0.198033] ACPI: EC: GPE = 0x18, I/O: command/status = 0x66, data = 0x62
[    0.198143] ACPI: EC: driver started in poll mode
[    0.199451] ACPI: No dock devices found.
[    0.199837] ACPI: PCI Root Bridge [PCI0] (0000:00)
[    0.200126] PM: Adding info for No Bus:pci0000:00
[    0.200185] PM: Adding info for No Bus:0000:00
[    0.200727] pci 0000:00:02.0: reg 10 32bit mmio: [0xf7f00000-0xf7f7ffff]
[    0.200750] pci 0000:00:02.0: reg 14 io port: [0xec00-0xec07]
[    0.200771] pci 0000:00:02.0: reg 18 32bit mmio: [0xd0000000-0xdfffffff]
[    0.200792] pci 0000:00:02.0: reg 1c 32bit mmio: [0xf7ec0000-0xf7efffff]
[    0.200940] pci 0000:00:02.1: reg 10 32bit mmio: [0xf7f80000-0xf7ffffff]
[    0.201254] pci 0000:00:1b.0: reg 10 64bit mmio: [0xf7eb8000-0xf7ebbfff]
[    0.201385] pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
[    0.201498] pci 0000:00:1b.0: PME# disabled
[    0.201765] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[    0.201875] pci 0000:00:1c.0: PME# disabled
[    0.202149] pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
[    0.202258] pci 0000:00:1c.1: PME# disabled
[    0.202530] pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
[    0.202640] pci 0000:00:1c.2: PME# disabled
[    0.202874] pci 0000:00:1d.0: reg 20 io port: [0xe400-0xe41f]
[    0.203025] pci 0000:00:1d.1: reg 20 io port: [0xe480-0xe49f]
[    0.203171] pci 0000:00:1d.2: reg 20 io port: [0xe800-0xe81f]
[    0.203357] pci 0000:00:1d.3: reg 20 io port: [0xe880-0xe89f]
[    0.203514] pci 0000:00:1d.7: reg 10 32bit mmio: [0xf7eb7c00-0xf7eb7fff]
[    0.203656] pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
[    0.203766] pci 0000:00:1d.7: PME# disabled
[    0.204179] pci 0000:00:1f.0: Force enabled HPET at 0xfed00000
[    0.204200] pci 0000:00:1f.0: quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
[    0.204345] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH6 GPIO
[    0.204456] pci 0000:00:1f.0: LPC Generic IO decode 1 PIO at 0380-03ff
[    0.204654] pci 0000:00:1f.2: reg 10 io port: [0x00-0x07]
[    0.204676] pci 0000:00:1f.2: reg 14 io port: [0x00-0x03]
[    0.204697] pci 0000:00:1f.2: reg 18 io port: [0x00-0x07]
[    0.204718] pci 0000:00:1f.2: reg 1c io port: [0x00-0x03]
[    0.204739] pci 0000:00:1f.2: reg 20 io port: [0xffa0-0xffaf]
[    0.204818] pci 0000:00:1f.2: PME# supported from D3hot
[    0.204923] pci 0000:00:1f.2: PME# disabled
[    0.205133] pci 0000:00:1f.3: reg 20 io port: [0x400-0x41f]
[    0.205489] pci 0000:03:00.0: reg 10 64bit mmio: [0xfbfc0000-0xfbffffff]
[    0.205568] pci 0000:03:00.0: reg 30 32bit mmio: [0xfbfa0000-0xfbfbffff]
[    0.205660] pci 0000:03:00.0: PME# supported from D3hot D3cold
[    0.205769] pci 0000:03:00.0: PME# disabled
[    0.205986] pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.206602] pci 0000:00:1c.1: bridge 32bit mmio: [0xfbf00000-0xfbffffff]
[    0.206781] pci 0000:00:1c.2: bridge 32bit mmio: [0xf8000000-0xfbefffff]
[    0.206803] pci 0000:00:1c.2: bridge 64bit mmio pref: [0xf0000000-0xf6ffffff]
[    0.206953] pci 0000:00:1e.0: transparent bridge
[    0.207127] pci_bus 0000:00: on NUMA node 0
[    0.207163] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[    0.207966] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P3._PRT]
[    0.208205] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P5._PRT]
[    0.208423] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P6._PRT]
[    0.210386] PM: Adding info for pci:0000:00:00.0
[    0.211916] PM: Adding info for pci:0000:00:02.0
[    0.213433] PM: Adding info for pci:0000:00:02.1
[    0.214949] PM: Adding info for pci:0000:00:1b.0
[    0.216467] PM: Adding info for pci:0000:00:1c.0
[    0.218010] PM: Adding info for pci:0000:00:1c.1
[    0.219556] PM: Adding info for pci:0000:00:1c.2
[    0.221113] PM: Adding info for pci:0000:00:1d.0
[    0.222645] PM: Adding info for pci:0000:00:1d.1
[    0.224187] PM: Adding info for pci:0000:00:1d.2
[    0.225707] PM: Adding info for pci:0000:00:1d.3
[    0.227255] PM: Adding info for pci:0000:00:1d.7
[    0.228776] PM: Adding info for pci:0000:00:1e.0
[    0.230327] PM: Adding info for pci:0000:00:1f.0
[    0.231851] PM: Adding info for pci:0000:00:1f.2
[    0.233389] PM: Adding info for pci:0000:00:1f.3
[    0.233495] PM: Adding info for No Bus:0000:04
[    0.233670] PM: Adding info for pci:0000:03:00.0
[    0.233766] PM: Adding info for No Bus:0000:03
[    0.233860] PM: Adding info for No Bus:0000:01
[    0.233954] PM: Adding info for No Bus:0000:05
[    0.234819] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
[    0.236014] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12 14 15)
[    0.237213] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 *10 11 12 14 15)
[    0.238398] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *7 10 11 12 14 15)
[    0.239594] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.240922] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.242230] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.243558] ACPI: PCI Interrupt Link [LNKH] (IRQs *3 4 5 6 7 10 11 12 14 15)
[    0.245027] SCSI subsystem initialized
[    0.261558] libata version 3.00 loaded.
[    0.262173] PCI: Using ACPI for IRQ routing
[    0.263385] PM: Adding info for No Bus:lo
[    0.264418] hpet clockevent registered
[    0.264428] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    0.264545] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    0.264828] hpet0: 3 comparators, 64-bit 14.318180 MHz counter
[    0.271687] pnp: PnP ACPI init
[    0.271877] PM: Adding info for No Bus:pnp0
[    0.271893] ACPI: bus type pnp registered
[    0.272885] PM: Adding info for pnp:00:00
[    0.273212] PM: Adding info for pnp:00:01
[    0.273611] PM: Adding info for pnp:00:02
[    0.273858] PM: Adding info for pnp:00:03
[    0.274294] PM: Adding info for pnp:00:04
[    0.274900] PM: Adding info for pnp:00:05
[    0.275125] PM: Adding info for pnp:00:06
[    0.275354] PM: Adding info for pnp:00:07
[    0.276366] PM: Adding info for pnp:00:08
[    0.277396] PM: Adding info for pnp:00:09
[    0.277961] PM: Adding info for pnp:00:0a
[    0.278869] PM: Adding info for pnp:00:0b
[    0.280118] PM: Adding info for pnp:00:0c
[    0.281660] pnp: PnP ACPI: found 13 devices
[    0.281760] ACPI: ACPI bus type pnp unregistered
[    0.281908] system 00:01: iomem range 0xfed13000-0xfed19fff has been reserved
[    0.282081] system 00:08: ioport range 0x380-0x383 has been reserved
[    0.282191] system 00:08: ioport range 0x4d0-0x4d1 has been reserved
[    0.284081] system 00:08: ioport range 0x800-0x87f has been reserved
[    0.284193] system 00:08: ioport range 0x480-0x4bf has been reserved
[    0.284303] system 00:08: iomem range 0xfed1c000-0xfed1ffff has been reserved
[    0.284415] system 00:08: iomem range 0xfed20000-0xfed8ffff has been reserved
[    0.284532] system 00:08: iomem range 0xfff00000-0xffffffff could not be reserved
[    0.284697] system 00:09: iomem range 0xfec00000-0xfec00fff has been reserved
[    0.284809] system 00:09: iomem range 0xfee00000-0xfee00fff has been reserved
[    0.284941] system 00:0a: iomem range 0xe0000000-0xefffffff has been reserved
[    0.285082] system 00:0b: iomem range 0xe0000000-0xefffffff has been reserved
[    0.285216] system 00:0c: iomem range 0x0-0x9ffff could not be reserved
[    0.285327] system 00:0c: iomem range 0xc0000-0xcffff could not be reserved
[    0.285439] system 00:0c: iomem range 0xe0000-0xfffff could not be reserved
[    0.285552] system 00:0c: iomem range 0x100000-0x1f7fffff could not be reserved
[    0.285848] PM: Adding info for No Bus:mem
[    0.286031] PM: Adding info for No Bus:kmem
[    0.286113] PM: Adding info for No Bus:null
[    0.286192] PM: Adding info for No Bus:port
[    0.286272] PM: Adding info for No Bus:zero
[    0.286353] PM: Adding info for No Bus:full
[    0.286432] PM: Adding info for No Bus:random
[    0.286515] PM: Adding info for No Bus:urandom
[    0.286596] PM: Adding info for No Bus:kmsg
[    0.322170] pci 0000:00:1c.0: PCI bridge, secondary bus 0000:04
[    0.322276] pci 0000:00:1c.0:   IO window: disabled
[    0.322384] pci 0000:00:1c.0:   MEM window: disabled
[    0.322488] pci 0000:00:1c.0:   PREFETCH window: disabled
[    0.322601] pci 0000:00:1c.1: PCI bridge, secondary bus 0000:03
[    0.322703] pci 0000:00:1c.1:   IO window: disabled
[    0.322811] pci 0000:00:1c.1:   MEM window: 0xfbf00000-0xfbffffff
[    0.322920] pci 0000:00:1c.1:   PREFETCH window: disabled
[    0.323032] pci 0000:00:1c.2: PCI bridge, secondary bus 0000:01
[    0.323135] pci 0000:00:1c.2:   IO window: disabled
[    0.323242] pci 0000:00:1c.2:   MEM window: 0xf8000000-0xfbefffff
[    0.323365] pci 0000:00:1c.2:   PREFETCH window: 0x000000f0000000-0x000000f6ffffff
[    0.323516] pci 0000:00:1e.0: PCI bridge, secondary bus 0000:05
[    0.323620] pci 0000:00:1e.0:   IO window: disabled
[    0.323727] pci 0000:00:1e.0:   MEM window: disabled
[    0.323831] pci 0000:00:1e.0:   PREFETCH window: disabled
[    0.323975] pci 0000:00:1c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    0.324093] pci 0000:00:1c.0: setting latency timer to 64
[    0.324122] pci 0000:00:1c.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[    0.324234] pci 0000:00:1c.1: setting latency timer to 64
[    0.324265] pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    0.324377] pci 0000:00:1c.2: setting latency timer to 64
[    0.324401] pci 0000:00:1e.0: setting latency timer to 64
[    0.324417] pci_bus 0000:00: resource 0 io:  [0x00-0xffff]
[    0.324428] pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
[    0.324440] pci_bus 0000:04: resource 0 mem: [0x0-0x0]
[    0.324449] pci_bus 0000:04: resource 1 mem: [0x0-0x0]
[    0.324459] pci_bus 0000:04: resource 2 mem: [0x0-0x0]
[    0.324468] pci_bus 0000:04: resource 3 mem: [0x0-0x0]
[    0.324478] pci_bus 0000:03: resource 0 mem: [0x0-0x0]
[    0.324488] pci_bus 0000:03: resource 1 mem: [0xfbf00000-0xfbffffff]
[    0.324498] pci_bus 0000:03: resource 2 mem: [0x0-0x0]
[    0.324508] pci_bus 0000:03: resource 3 mem: [0x0-0x0]
[    0.324518] pci_bus 0000:01: resource 0 mem: [0x0-0x0]
[    0.324528] pci_bus 0000:01: resource 1 mem: [0xf8000000-0xfbefffff]
[    0.324538] pci_bus 0000:01: resource 2 mem: [0xf0000000-0xf6ffffff]
[    0.324548] pci_bus 0000:01: resource 3 mem: [0x0-0x0]
[    0.324558] pci_bus 0000:05: resource 0 mem: [0x0-0x0]
[    0.324568] pci_bus 0000:05: resource 1 mem: [0x0-0x0]
[    0.324577] pci_bus 0000:05: resource 2 mem: [0x0-0x0]
[    0.324587] pci_bus 0000:05: resource 3 io:  [0x00-0xffff]
[    0.324597] pci_bus 0000:05: resource 4 mem: [0x000000-0xffffffffffffffff]
[    0.324808] NET: Registered protocol family 2
[    0.325624] IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
[    0.328069] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[    0.328653] TCP bind hash table entries: 16384 (order: 7, 524288 bytes)
[    0.334218] TCP: Hash tables configured (established 16384 bind 16384)
[    0.334466] TCP reno registered
[    0.335018] NET: Registered protocol family 1
[    0.335898] checking if image is initramfs...
[    0.619267] rootfs image is initramfs; unpacking...
[    0.619490] Freeing initrd memory: 2797k freed
[    0.625139] PM: Adding info for platform:pcspkr
[    0.625980] PM: Adding info for No Bus:snapshot
[    0.626432] audit: initializing netlink socket (disabled)
[    0.626714] type=2000 audit(1240041427.623:1): initialized
[    0.669232] msgmni has been set to 981
[    0.681678] alg: No test for stdrng (krng)
[    0.681904] io scheduler noop registered
[    0.682001] io scheduler anticipatory registered
[    0.682098] io scheduler deadline registered
[    0.682290] io scheduler cfq registered (default)
[    0.682435] pci 0000:00:02.0: Boot video device
[    0.683544] pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
[    0.683588] pcieport-driver 0000:00:1c.0: setting latency timer to 64
[    0.683665] PM: Adding info for pci_express:0000:00:1c.0:pcie01
[    0.683764] PM: Adding info for pci_express:0000:00:1c.0:pcie04
[    0.683853] PM: Adding info for pci_express:0000:00:1c.0:pcie08
[    0.684182] pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
[    0.684223] pcieport-driver 0000:00:1c.1: setting latency timer to 64
[    0.684286] PM: Adding info for pci_express:0000:00:1c.1:pcie01
[    0.684376] PM: Adding info for pci_express:0000:00:1c.1:pcie04
[    0.684465] PM: Adding info for pci_express:0000:00:1c.1:pcie08
[    0.684788] pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
[    0.684828] pcieport-driver 0000:00:1c.2: setting latency timer to 64
[    0.684901] PM: Adding info for pci_express:0000:00:1c.2:pcie01
[    0.685002] PM: Adding info for pci_express:0000:00:1c.2:pcie04
[    0.685092] PM: Adding info for pci_express:0000:00:1c.2:pcie08
[    0.685625] PM: Adding info for No Bus:tty
[    0.685804] PM: Adding info for No Bus:console
[    0.685893] PM: Adding info for No Bus:tty0
[    0.686028] PM: Adding info for No Bus:vcs
[    0.686199] PM: Adding info for No Bus:vcsa
[    0.686340] PM: Adding info for No Bus:tty1
[    0.686422] PM: Adding info for No Bus:tty2
[    0.686503] PM: Adding info for No Bus:tty3
[    0.686591] PM: Adding info for No Bus:tty4
[    0.686708] PM: Adding info for No Bus:tty5
[    0.686790] PM: Adding info for No Bus:tty6
[    0.686872] PM: Adding info for No Bus:tty7
[    0.686952] PM: Adding info for No Bus:tty8
[    0.687033] PM: Adding info for No Bus:tty9
[    0.687115] PM: Adding info for No Bus:tty10
[    0.687197] PM: Adding info for No Bus:tty11
[    0.687279] PM: Adding info for No Bus:tty12
[    0.687362] PM: Adding info for No Bus:tty13
[    0.687445] PM: Adding info for No Bus:tty14
[    0.687532] PM: Adding info for No Bus:tty15
[    0.687621] PM: Adding info for No Bus:tty16
[    0.687705] PM: Adding info for No Bus:tty17
[    0.687788] PM: Adding info for No Bus:tty18
[    0.687871] PM: Adding info for No Bus:tty19
[    0.687956] PM: Adding info for No Bus:tty20
[    0.688040] PM: Adding info for No Bus:tty21
[    0.688125] PM: Adding info for No Bus:tty22
[    0.688210] PM: Adding info for No Bus:tty23
[    0.688294] PM: Adding info for No Bus:tty24
[    0.688379] PM: Adding info for No Bus:tty25
[    0.688464] PM: Adding info for No Bus:tty26
[    0.688551] PM: Adding info for No Bus:tty27
[    0.688643] PM: Adding info for No Bus:tty28
[    0.688730] PM: Adding info for No Bus:tty29
[    0.688814] PM: Adding info for No Bus:tty30
[    0.688910] PM: Adding info for No Bus:tty31
[    0.688997] PM: Adding info for No Bus:tty32
[    0.689083] PM: Adding info for No Bus:tty33
[    0.689168] PM: Adding info for No Bus:tty34
[    0.689257] PM: Adding info for No Bus:tty35
[    0.689346] PM: Adding info for No Bus:tty36
[    0.689432] PM: Adding info for No Bus:tty37
[    0.689521] PM: Adding info for No Bus:tty38
[    0.689610] PM: Adding info for No Bus:tty39
[    0.689702] PM: Adding info for No Bus:tty40
[    0.689789] PM: Adding info for No Bus:tty41
[    0.689875] PM: Adding info for No Bus:tty42
[    0.689979] PM: Adding info for No Bus:tty43
[    0.690073] PM: Adding info for No Bus:tty44
[    0.690161] PM: Adding info for No Bus:tty45
[    0.690249] PM: Adding info for No Bus:tty46
[    0.690340] PM: Adding info for No Bus:tty47
[    0.690427] PM: Adding info for No Bus:tty48
[    0.690514] PM: Adding info for No Bus:tty49
[    0.690601] PM: Adding info for No Bus:tty50
[    0.690699] PM: Adding info for No Bus:tty51
[    0.690795] PM: Adding info for No Bus:tty52
[    0.690884] PM: Adding info for No Bus:tty53
[    0.690972] PM: Adding info for No Bus:tty54
[    0.691062] PM: Adding info for No Bus:tty55
[    0.691151] PM: Adding info for No Bus:tty56
[    0.691241] PM: Adding info for No Bus:tty57
[    0.691329] PM: Adding info for No Bus:tty58
[    0.691419] PM: Adding info for No Bus:tty59
[    0.691509] PM: Adding info for No Bus:tty60
[    0.691598] PM: Adding info for No Bus:tty61
[    0.691690] PM: Adding info for No Bus:tty62
[    0.691783] PM: Adding info for No Bus:tty63
[    0.692242] PM: Adding info for No Bus:ptmx
[    0.692376] PM: Adding info for No Bus:hpet
[    0.692860] PM: Adding info for No Bus:ram0
[    0.693080] PM: Adding info for No Bus:1:0
[    0.693328] PM: Adding info for No Bus:ram1
[    0.693504] PM: Adding info for No Bus:1:1
[    0.693673] PM: Adding info for No Bus:ram2
[    0.693776] PM: Adding info for No Bus:1:2
[    0.693931] PM: Adding info for No Bus:ram3
[    0.694031] PM: Adding info for No Bus:1:3
[    0.694179] PM: Adding info for No Bus:ram4
[    0.694279] PM: Adding info for No Bus:1:4
[    0.694428] PM: Adding info for No Bus:ram5
[    0.694528] PM: Adding info for No Bus:1:5
[    0.694683] PM: Adding info for No Bus:ram6
[    0.694786] PM: Adding info for No Bus:1:6
[    0.694934] PM: Adding info for No Bus:ram7
[    0.695034] PM: Adding info for No Bus:1:7
[    0.695254] PM: Adding info for No Bus:ram8
[    0.695356] PM: Adding info for No Bus:1:8
[    0.695516] PM: Adding info for No Bus:ram9
[    0.695624] PM: Adding info for No Bus:1:9
[    0.695776] PM: Adding info for No Bus:ram10
[    0.695879] PM: Adding info for No Bus:1:10
[    0.696029] PM: Adding info for No Bus:ram11
[    0.696132] PM: Adding info for No Bus:1:11
[    0.696281] PM: Adding info for No Bus:ram12
[    0.696382] PM: Adding info for No Bus:1:12
[    0.696539] PM: Adding info for No Bus:ram13
[    0.696663] PM: Adding info for No Bus:1:13
[    0.696819] PM: Adding info for No Bus:ram14
[    0.696922] PM: Adding info for No Bus:1:14
[    0.697131] PM: Adding info for No Bus:ram15
[    0.697235] PM: Adding info for No Bus:1:15
[    0.697306] brd: module loaded
[    0.697483] Driver 'sd' needs updating - please use bus_type methods
[    0.697721] ahci 0000:00:1f.2: version 3.0
[    0.697830] ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    0.698002] ahci 0000:00:1f.2: PCI INT B disabled
[    0.698120] ahci: probe of 0000:00:1f.2 failed with error -22
[    0.698392] ata_piix 0000:00:1f.2: version 2.12
[    0.698417] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    0.698534] ata_piix 0000:00:1f.2: MAP [ P0 P2 IDE IDE ]
[    0.698996] ata_piix 0000:00:1f.2: setting latency timer to 64
[    0.730030] scsi0 : ata_piix
[    0.730680] PM: Adding info for No Bus:host0
[    0.730811] PM: Adding info for No Bus:host0
[    0.731401] scsi1 : ata_piix
[    0.731533] PM: Adding info for No Bus:host1
[    0.731652] PM: Adding info for No Bus:host1
[    0.731718] ata1: SATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[    0.731827] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[    0.732397] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[    0.732844] PM: Adding info for platform:i8042
[    0.744856] serio: i8042 KBD port at 0x60,0x64 irq 1
[    0.745179] PM: Adding info for No Bus:mice
[    0.745381] PM: Adding info for No Bus:psaux
[    0.745425] mice: PS/2 mouse device common for all mice
[    0.745683] rtc_cmos 00:03: RTC can wake from S4
[    0.746094] PM: Adding info for No Bus:rtc0
[    0.746297] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
[    0.746455] rtc0: alarms up to one month, 114 bytes nvram, hpet irqs
[    0.746645] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.05
[    0.746861] PM: Adding info for platform:iTCO_wdt
[    0.747067] iTCO_wdt: Found a ICH6-M TCO device (Version=2, TCOBASE=0x0860)
[    0.747329] PM: Adding info for No Bus:watchdog
[    0.747373] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[    0.747486] iTCO_vendor_support: vendor-support=0
[    0.747634] cpuidle: using governor ladder
[    0.747730] cpuidle: using governor menu
[    0.748124] Advanced Linux Sound Architecture Driver Version 1.0.19.
[    0.748229] ALSA device list:
[    0.748317]   No soundcards found.
[    0.750861] TCP cubic registered
[    0.750960] Using IPI Shortcut mode
[    0.751178] PM: Adding info for No Bus:cpu_dma_latency
[    0.751285] PM: Adding info for No Bus:network_latency
[    0.751430] PM: Adding info for No Bus:network_throughput
[    0.751877] PM: Adding info for serio:serio0
[    0.766668] Switched to NOHz mode on CPU #0
[    0.785428] PM: Adding info for No Bus:input0
[    0.785904] input: AT Translated Set 2 keyboard as /class/input/input0
[    1.077108] ata2.00: CFA: SILICONMOTION SM223AC, , max UDMA/66
[    1.077218] ata2.00: 7815024 sectors, multi 0: LBA 
[    1.090333] ata2.00: configured for UDMA/66
[    1.091809] scsi 1:0:0:0: Direct-Access     ATA      SILICONMOTION SM n/a  PQ: 0 ANSI: 5
[    1.092182] PM: Adding info for No Bus:target1:0:0
[    1.092455] PM: Adding info for scsi:1:0:0:0
[    1.093328] PM: Adding info for No Bus:1:0:0:0
[    1.093676] PM: Adding info for No Bus:1:0:0:0
[    1.094012] sd 1:0:0:0: [sda] 7815024 512-byte hardware sectors: (4.00 GB/3.72 GiB)
[    1.094233] sd 1:0:0:0: [sda] Write Protect is off
[    1.094335] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.094481] sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    1.094751] PM: Adding info for No Bus:sda
[    1.095712]  sda: sda1 sda2
[    1.097080] PM: Adding info for No Bus:sda1
[    1.097597] PM: Adding info for No Bus:sda2
[    1.098014] PM: Adding info for No Bus:8:0
[    1.098101] sd 1:0:0:0: [sda] Attached SCSI disk
[    1.098628] PM: Resume from partition /dev/sda2
[    1.098635] PM: Checking hibernation image.
[    1.100964] PM: Resume from disk failed.
[    1.101675] rtc_cmos 00:03: setting system clock to 2009-04-18 07:57:09 UTC (1240041429)
[    1.101828] BIOS EDD facility v0.16 2004-Jun-25, 2 devices found
[    1.102623] Freeing unused kernel memory: 248k freed
[    1.354542] ACPI: EC: non-query interrupt received, switching to interrupt mode
[    1.366312] PM: Adding info for No Bus:thermal_zone0
[    1.374534] thermal LNXTHERM:01: registered as thermal_zone0
[    1.374695] ACPI: Thermal Zone [TZ00] (32 C)
[    2.907760] PM: Starting manual resume from disk
[    2.907929] PM: Resume from partition 8:2
[    2.907936] PM: Checking hibernation image.
[    2.909243] PM: Resume from disk failed.
[    2.933167] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[    2.933184] PM: Basic memory bitmaps created
[    2.959910] PM: Basic memory bitmaps freed
[    2.983990] EXT4-fs: delayed allocation enabled
[    2.984099] EXT4-fs: file extents enabled
[    2.984438] EXT4-fs: mballoc enabled
[    2.984810] EXT4-fs: mounted filesystem sda2 without journal
[    3.756901] udev: starting version 140
[    3.757127] udev: deprecated sysfs layout; update the kernel or disable CONFIG_SYSFS_DEPRECATED; some udev features will not work correctly
[    4.367545] PM: Adding info for No Bus:event0
[    4.931001] PM: Adding info for No Bus:input1
[    4.931103] input: Power Button (FF) as /class/input/input1
[    4.931455] PM: Adding info for No Bus:event1
[    4.931506] ACPI: Power Button (FF) [PWRF]
[    4.931906] PM: Adding info for No Bus:input2
[    4.931998] input: Lid Switch as /class/input/input2
[    4.932204] PM: Adding info for No Bus:event2
[    4.972490] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
[    4.973043] PM: Adding info for No Bus:cooling_device0
[    4.973110] processor ACPI_CPU:00: registered as cooling_device0
[    4.973236] ACPI: Processor [CPU1] (supports 8 throttling states)
[    4.974976] ACPI: Lid Switch [LID]
[    4.975331] PM: Adding info for No Bus:input3
[    4.975435] input: Sleep Button (CM) as /class/input/input3
[    4.975672] PM: Adding info for No Bus:event3
[    4.975724] ACPI: Sleep Button (CM) [SLPB]
[    4.976022] PM: Adding info for No Bus:input4
[    4.976111] input: Power Button (CM) as /class/input/input4
[    4.976329] PM: Adding info for No Bus:event4
[    4.976378] ACPI: Power Button (CM) [PWRB]
[    4.978561] PM: Adding info for No Bus:AC0
[    4.979193] ACPI: AC Adapter [AC0] (on-line)
[    4.985710] ACPI: Battery Slot [BAT0] (battery absent)
[    5.004861] eeepc: Eee PC Hotkey Driver
[    5.069203] eeepc: Hotkey init flags 0x41
[    5.070976] eeepc: Get control methods supported: 0x101711
[    5.071300] PM: Adding info for No Bus:input5
[    5.071416] input: Asus EeePC extra buttons as /class/input/input5
[    5.071662] PM: Adding info for No Bus:event5
[    5.074524] PM: Adding info for No Bus:rfkill0
[    5.075359] PM: Adding info for No Bus:eeepc
[    5.132746] Linux agpgart interface v0.103
[    5.151614] PM: Adding info for No Bus:hwmon0
[    5.151977] PM: Adding info for platform:eeepc
[    5.153770] PM: Adding info for No Bus:acpi_video0
[    5.154276] PM: Adding info for No Bus:acpi_video1
[    5.154446] PM: Adding info for No Bus:acpi_video2
[    5.154876] PM: Adding info for No Bus:input6
[    5.154985] input: Video Bus as /class/input/input6
[    5.155248] PM: Adding info for No Bus:event6
[    5.155302] ACPI: Video Device [VGA] (multi-head: yes  rom: no  post: no)
[    5.223386] PM: Adding info for No Bus:timer
[    5.233463] Atheros(R) L2 Ethernet Driver - version 2.2.3
[    5.233573] Copyright (c) 2007 Atheros Corporation.
[    5.233791] atl2 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[    5.233921] atl2 0000:03:00.0: setting latency timer to 64
[    5.305094] agpgart-intel 0000:00:00.0: Intel 915GM Chipset
[    5.305772] agpgart-intel 0000:00:00.0: detected 7932K stolen memory
[    5.310942] PM: Adding info for No Bus:agpgart
[    5.311077] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
[    5.319675] intel_rng: FWH not detected
[    5.411157] PM: Adding info for No Bus:eth0
[    5.523056] usbcore: registered new interface driver usbfs
[    5.523545] usbcore: registered new interface driver hub
[    5.523849] usbcore: registered new device driver usb
[    5.530701] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    5.530935] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[    5.531122] ehci_hcd 0000:00:1d.7: setting latency timer to 64
[    5.531136] ehci_hcd 0000:00:1d.7: EHCI Host Controller
[    5.532185] PM: Adding info for No Bus:usb_host1
[    5.534328] ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
[    5.538507] ehci_hcd 0000:00:1d.7: debug port 1
[    5.538619] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
[    5.538676] ehci_hcd 0000:00:1d.7: irq 23, io mem 0xf7eb7c00
[    5.546610] PM: Adding info for No Bus:input7
[    5.546771] input: PC Speaker as /class/input/input7
[    5.547022] PM: Adding info for No Bus:event7
[    5.550078] ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
[    5.551704] PM: Adding info for usb:usb1
[    5.552178] usb usb1: configuration #1 chosen from 1 choice
[    5.552926] PM: Adding info for usb:1-0:1.0
[    5.553232] hub 1-0:1.0: USB hub found
[    5.553602] hub 1-0:1.0: 8 ports detected
[    5.555191] PM: Adding info for No Bus:usbdev1.1_ep81
[    5.555871] PM: Adding info for No Bus:usbdev1.1_ep00
[    5.647899] uhci_hcd: USB Universal Host Controller Interface driver
[    5.648189] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[    5.648326] uhci_hcd 0000:00:1d.0: setting latency timer to 64
[    5.648340] uhci_hcd 0000:00:1d.0: UHCI Host Controller
[    5.648543] PM: Adding info for No Bus:usb_host2
[    5.648859] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
[    5.649061] uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000e400
[    5.649595] PM: Adding info for usb:usb2
[    5.649718] usb usb2: configuration #1 chosen from 1 choice
[    5.651724] PM: Adding info for usb:2-0:1.0
[    5.651801] hub 2-0:1.0: USB hub found
[    5.652197] hub 2-0:1.0: 2 ports detected
[    5.652608] PM: Adding info for No Bus:usbdev2.1_ep81
[    5.652841] PM: Adding info for No Bus:usbdev2.1_ep00
[    5.652974] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    5.653101] uhci_hcd 0000:00:1d.1: setting latency timer to 64
[    5.653113] uhci_hcd 0000:00:1d.1: UHCI Host Controller
[    5.653303] PM: Adding info for No Bus:usb_host3
[    5.653810] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
[    5.654041] uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000e480
[    5.654534] PM: Adding info for usb:usb3
[    5.654653] usb usb3: configuration #1 chosen from 1 choice
[    5.654880] PM: Adding info for usb:3-0:1.0
[    5.654951] hub 3-0:1.0: USB hub found
[    5.655074] hub 3-0:1.0: 2 ports detected
[    5.655364] PM: Adding info for No Bus:usbdev3.1_ep81
[    5.655576] PM: Adding info for No Bus:usbdev3.1_ep00
[    5.655707] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    5.655833] uhci_hcd 0000:00:1d.2: setting latency timer to 64
[    5.655845] uhci_hcd 0000:00:1d.2: UHCI Host Controller
[    5.656038] PM: Adding info for No Bus:usb_host4
[    5.656106] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
[    5.656322] uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000e800
[    5.656829] PM: Adding info for usb:usb4
[    5.656953] usb usb4: configuration #1 chosen from 1 choice
[    5.657185] PM: Adding info for usb:4-0:1.0
[    5.657257] hub 4-0:1.0: USB hub found
[    5.657380] hub 4-0:1.0: 2 ports detected
[    5.657667] PM: Adding info for No Bus:usbdev4.1_ep81
[    5.657884] PM: Adding info for No Bus:usbdev4.1_ep00
[    5.658007] uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[    5.658134] uhci_hcd 0000:00:1d.3: setting latency timer to 64
[    5.658146] uhci_hcd 0000:00:1d.3: UHCI Host Controller
[    5.658338] PM: Adding info for No Bus:usb_host5
[    5.658405] uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
[    5.658616] uhci_hcd 0000:00:1d.3: irq 16, io base 0x0000e880
[    5.659096] PM: Adding info for usb:usb5
[    5.659222] usb usb5: configuration #1 chosen from 1 choice
[    5.659443] PM: Adding info for usb:5-0:1.0
[    5.659521] hub 5-0:1.0: USB hub found
[    5.659642] hub 5-0:1.0: 2 ports detected
[    5.659926] PM: Adding info for No Bus:usbdev5.1_ep81
[    5.660167] PM: Adding info for No Bus:usbdev5.1_ep00
[    5.695102] i801_smbus 0000:00:1f.3: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    5.696043] PM: Adding info for No Bus:i2c-0
[    5.850185] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    5.850463] HDA Intel 0000:00:1b.0: setting latency timer to 64
[    5.886850] usb 1-5: new high speed USB device using ehci_hcd and address 2
[    6.012032] PM: Adding info for usb:1-5
[    6.012250] usb 1-5: configuration #1 chosen from 1 choice
[    6.013122] PM: Adding info for usb:1-5:1.0
[    6.013417] PM: Adding info for No Bus:usbdev1.2_ep01
[    6.013598] PM: Adding info for No Bus:usbdev1.2_ep82
[    6.013816] PM: Adding info for No Bus:usbdev1.2_ep00
[    6.130124] usb 1-8: new high speed USB device using ehci_hcd and address 3
[    6.251446] PM: Adding info for No Bus:pcmC0D0p
[    6.251926] PM: Adding info for No Bus:pcmC0D0c
[    6.253233] PM: Adding info for No Bus:dsp
[    6.253995] PM: Adding info for No Bus:audio
[    6.254687] PM: Adding info for No Bus:hwC0D0
[    6.254874] PM: Adding info for No Bus:controlC0
[    6.255245] PM: Adding info for No Bus:mixer
[    6.259028] PM: Adding info for usb:1-8
[    6.259171] usb 1-8: configuration #1 chosen from 1 choice
[    6.259540] PM: Adding info for usb:1-8:1.0
[    6.259746] PM: Adding info for No Bus:usbdev1.3_ep81
[    6.259907] PM: Adding info for usb:1-8:1.1
[    6.260184] PM: Adding info for No Bus:usbdev1.3_ep82
[    6.260408] PM: Adding info for No Bus:usbdev1.3_ep00
[    6.386655] usual_tables: module license 'unspecified' taints kernel.
[    6.386816] Disabling lockdep due to kernel taint
[    6.439651] Linux video capture interface: v2.00
[    6.446770] Marking TSC unstable due to TSC halts in idle
[    6.456417] uvcvideo: Found UVC 1.00 device <unnamed> (eb1a:2761)
[    6.614887] PM: Adding info for No Bus:video0
[    6.615051] usbcore: registered new interface driver uvcvideo
[    6.615165] USB Video Class driver (v0.1.0)
[    6.763383] Clocksource tsc unstable (delta = -96153146 ns)
[    6.853199] PM: Adding info for No Bus:vcs2
[    6.853297] PM: Adding info for No Bus:vcsa2
[    6.861768] PM: Adding info for No Bus:vcs3
[    6.861864] PM: Adding info for No Bus:vcsa3
[    6.868203] PM: Adding info for No Bus:vcs4
[    6.868299] PM: Adding info for No Bus:vcsa4
[    6.870832] PM: Adding info for No Bus:vcs5
[    6.870920] PM: Adding info for No Bus:vcsa5
[    6.875213] PM: Adding info for No Bus:vcs6
[    6.875298] PM: Adding info for No Bus:vcsa6
[    7.854286] EXT4 FS on sda2, no journal
[    8.280115] Adding 524280k swap on /swapfile.  Priority:-1 extents:1683 across:3168868k 
[    9.510557] NET: Registered protocol family 10
[    9.512717] lo: Disabled Privacy Extensions
[   12.962439] PM: Adding info for No Bus:vcs7
[   12.962543] PM: Adding info for No Bus:vcsa7
[   15.819409] [drm] Initialized drm 1.1.0 20060810
[   16.163239] pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   16.163261] pci 0000:00:02.0: setting latency timer to 64
[   16.169550] PM: Adding info for No Bus:card0
[   16.170451] [drm:i915_gem_detect_bit_6_swizzle] *ERROR* Couldn't read from MCHBAR.  Disabling tiling.
[   16.170513] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[   17.740069] atl2 0000:03:00.0: irq 27 for MSI/MSI-X
[   17.740537] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   17.945294] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
[   17.945411] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   18.075778] NET: Registered protocol family 17
[   27.973369] eth0: no IPv6 routers present
[   57.549665] PM: Adding info for No Bus:vcs63
[   57.549769] PM: Adding info for No Bus:vcsa63
[   57.551117] [drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
[   60.580288] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[   60.580304] PM: Basic memory bitmaps created
[   60.580627] PM: Adding info for No Bus:vcs8
[   60.580727] PM: Adding info for No Bus:vcsa8
[   61.125471] Syncing filesystems ... done.
[   61.132711] Freezing user space processes ... (elapsed 0.00 seconds) done.
[   61.134847] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[   61.135207] PM: Shrinking memory...  <6>before: sc.nr_reclaimed = 0
[   61.180993] pass = 0, prio = 12, sc.nr_reclaimed = 0
[   61.181004] pass = 0, prio = 11, sc.nr_reclaimed = 0
[   61.181014] pass = 0, prio = 10, sc.nr_reclaimed = 0
[   61.181024] pass = 0, prio = 9, sc.nr_reclaimed = 0
[   61.181033] pass = 0, prio = 8, sc.nr_reclaimed = 0
[   61.181043] pass = 0, prio = 7, sc.nr_reclaimed = 0
[   61.186509] pass = 0, prio = 6, sc.nr_reclaimed = 0
[   61.186525] pass = 0, prio = 5, sc.nr_reclaimed = 0
[   61.186534] pass = 0, prio = 4, sc.nr_reclaimed = 0
[   61.186544] pass = 0, prio = 3, sc.nr_reclaimed = 0
[   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
[   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
[   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
[   61.556729] pass = 1, prio = 12, sc.nr_reclaimed = 0
[   61.556739] pass = 1, prio = 11, sc.nr_reclaimed = 0
[   61.556749] pass = 1, prio = 10, sc.nr_reclaimed = 0
[   61.556759] pass = 1, prio = 9, sc.nr_reclaimed = 0
[   61.556768] pass = 1, prio = 8, sc.nr_reclaimed = 0
[   61.556778] pass = 1, prio = 7, sc.nr_reclaimed = 0
[   61.556787] pass = 1, prio = 6, sc.nr_reclaimed = 0
[   61.556797] pass = 1, prio = 5, sc.nr_reclaimed = 0
[   61.556806] pass = 1, prio = 4, sc.nr_reclaimed = 0
[   61.556816] pass = 1, prio = 3, sc.nr_reclaimed = 0
[   61.556825] pass = 1, prio = 2, sc.nr_reclaimed = 0
[   61.556835] pass = 1, prio = 1, sc.nr_reclaimed = 0
[   61.595841] pass = 1, prio = 0, sc.nr_reclaimed = 0
[   61.595854] pass = 2, prio = 12, sc.nr_reclaimed = 0
[   61.595864] pass = 2, prio = 11, sc.nr_reclaimed = 0
[   61.595873] pass = 2, prio = 10, sc.nr_reclaimed = 0
[   61.595883] pass = 2, prio = 9, sc.nr_reclaimed = 0
[   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
[   61.710062] pass = 2, prio = 7, sc.nr_reclaimed = 0
[   61.710072] pass = 2, prio = 6, sc.nr_reclaimed = 0
[   61.710081] pass = 2, prio = 5, sc.nr_reclaimed = 0
[   61.710091] pass = 2, prio = 4, sc.nr_reclaimed = 0
[   61.710100] pass = 2, prio = 3, sc.nr_reclaimed = 0
[   61.710110] pass = 2, prio = 2, sc.nr_reclaimed = 0
[   61.710119] pass = 2, prio = 1, sc.nr_reclaimed = 0
[   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
[   61.823396] pass = 3, prio = 12, sc.nr_reclaimed = 0
[   61.823406] pass = 3, prio = 11, sc.nr_reclaimed = 0
[   61.823416] pass = 3, prio = 10, sc.nr_reclaimed = 0
[   61.823425] pass = 3, prio = 9, sc.nr_reclaimed = 0
[   61.823435] pass = 3, prio = 8, sc.nr_reclaimed = 0
[   61.823444] pass = 3, prio = 7, sc.nr_reclaimed = 0
[   61.823454] pass = 3, prio = 6, sc.nr_reclaimed = 0
[   61.823464] pass = 3, prio = 5, sc.nr_reclaimed = 0
[   61.823473] pass = 3, prio = 4, sc.nr_reclaimed = 0
[   61.823483] pass = 3, prio = 3, sc.nr_reclaimed = 0
[   61.823492] pass = 3, prio = 2, sc.nr_reclaimed = 0
[   61.823502] pass = 3, prio = 1, sc.nr_reclaimed = 0
[   62.586707] pass = 3, prio = 0, sc.nr_reclaimed = 2716
[   62.608037] pass = 4, prio = 12, sc.nr_reclaimed = 3070
[   62.634644] pass = 4, prio = 11, sc.nr_reclaimed = 5048
[   62.638782] pass = 4, prio = 10, sc.nr_reclaimed = 192
[   62.740052] pass = 4, prio = 9, sc.nr_reclaimed = 0
[   62.843385] pass = 4, prio = 8, sc.nr_reclaimed = 640
[   62.946726] pass = 4, prio = 7, sc.nr_reclaimed = 640
[   63.046711] pass = 4, prio = 6, sc.nr_reclaimed = 640
[   63.146717] pass = 4, prio = 5, sc.nr_reclaimed = 608
[   63.246712] pass = 4, prio = 4, sc.nr_reclaimed = 600
[   63.346704] pass = 4, prio = 3, sc.nr_reclaimed = 128
[   63.446698] pass = 4, prio = 2, sc.nr_reclaimed = 0
[   63.546705] pass = 4, prio = 1, sc.nr_reclaimed = 0
[   63.646698] pass = 4, prio = 0, sc.nr_reclaimed = 0
[   63.646708] after: sc.nr_reclaimed = 0
[   63.646715] shrink_all_memory(10000) failed
[   63.646722] Mem-Info:
[   63.646729] DMA per-cpu:
[   63.646738] CPU    0: hi:    0, btch:   1 usd:   0
[   63.646745] Normal per-cpu:
[   63.646753] CPU    0: hi:  186, btch:  31 usd: 182
[   63.646769] Active_anon:0 active_file:0 inactive_anon:0
[   63.646773]  inactive_file:22 unevictable:1098 dirty:0 writeback:0 unstable:0
[   63.646779]  free:91278 slab:1984 mapped:953 pagetables:529 bounce:0
[   63.646796] DMA free:8008kB min:88kB low:108kB high:132kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15804kB pages_scanned:0 all_unreclaimable? no
[   63.646809] lowmem_reserve[]: 0 483 483
[   63.646833] Normal free:357104kB min:2764kB low:3452kB high:4144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:88kB unevictable:4392kB present:495300kB pages_scanned:176 all_unreclaimable? no
[   63.646847] lowmem_reserve[]: 0 0 0
[   63.646862] DMA: 2*4kB 6*8kB 3*16kB 3*32kB 2*64kB 2*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 8008kB
[   63.646906] Normal: 298*4kB 269*8kB 214*16kB 154*32kB 119*64kB 75*128kB 66*256kB 44*512kB 26*1024kB 14*2048kB 57*4096kB = 357104kB
[   63.646951] 972 total pagecache pages
[   63.646958] 0 pages in swap cache
[   63.646966] Swap cache stats: add 6615, delete 6615, find 0/0
[   63.646974] Free swap  = 497820kB
[   63.646980] Total swap = 524280kB
[   63.652111] 128880 pages RAM
[   63.652119] 3192 pages reserved
[   63.652126] 37178 pages shared
[   63.652132] 4437 pages non-shared
[   63.652177] Restarting tasks ... done.
[   63.671646] PM: Basic memory bitmaps freed
[   64.521472] PM: Removing info for No Bus:vcs63
[   64.521642] PM: Removing info for No Bus:vcsa63
[   65.267700] atl2 0000:03:00.0: irq 27 for MSI/MSI-X
[   65.268233] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   65.270982] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
[   65.271111] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   75.913359] eth0: no IPv6 routers present



* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-18  8:16                           ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-18  8:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

Linus Torvalds wrote:
> On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
>   
>> Can you please try to reproduce the problem with the appended debug patch
>> applied and send the output of dmesg to me?
>>     
>
> Maybe something like this instead (or in addition to).
>
> It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
> data.
>
> Untested, but trivial.
>
> 		Linus
> ---
>   

Ok, I applied both your and Rafael's debug patches.  dmesg attached.

After the failed hibernation, I noticed my touchpad wasn't working.  But 
I think that's something else.  I had another go and couldn't reproduce 
that.  It's happened to me once before while testing 2.6.30-; I've also 
had the keyboard stop working at least once.  I'm hoping it's the same 
bug as the "20 ACPI interrupts per second on EEEPC" bug.  That one could 
be overloading my bug-ridden EC, which also acts as the keyboard controller.

Thanks
Alan
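
For anyone reading the attached output: a minimal sketch of tallying the
per-pass debug lines, assuming they keep the form
"pass = N, prio = M, sc.nr_reclaimed = K" seen in the dmesg output in this
thread. The function name and the sample are illustrative only and are not
part of either debug patch. Whether the per-line values can meaningfully be
summed depends on how the debug patch reports sc.nr_reclaimed, so treat the
totals as a rough guide.

```python
import re
from collections import defaultdict

# Matches the debug lines printed during "PM: Shrinking memory...".
# The format is assumed from the dmesg output quoted in this thread.
LINE_RE = re.compile(
    r"pass = (?P<pass>\d+), prio = (?P<prio>\d+), "
    r"sc\.nr_reclaimed = (?P<n>\d+)"
)


def reclaimed_per_pass(dmesg_text):
    """Return {pass number: sum of the sc.nr_reclaimed values printed}."""
    totals = defaultdict(int)
    for m in LINE_RE.finditer(dmesg_text):
        totals[int(m.group("pass"))] += int(m.group("n"))
    return dict(totals)


# A few lines copied from the dmesg output in this thread.
SAMPLE = """\
[   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
[   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
[   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
[   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
[   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
"""

if __name__ == "__main__":
    for p, total in sorted(reclaimed_per_pass(SAMPLE).items()):
        print(f"pass {p}: {total} pages")
```

To run it over a full log, pass the text of a saved dmesg file to
reclaimed_per_pass() instead of SAMPLE.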

[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 53281 bytes --]

[    0.000000] Linux version 2.6.30-rc2eeepc (alan@alan-desktop) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #240 Sat Apr 18 08:47:04 BST 2009
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   NSC Geode by NSC
[    0.000000]   Cyrix CyrixInstead
[    0.000000]   Centaur CentaurHauls
[    0.000000]   Transmeta GenuineTMx86
[    0.000000]   Transmeta TransmetaCPU
[    0.000000]   UMC UMC UMC UMC
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000001f780000 (usable)
[    0.000000]  BIOS-e820: 000000001f780000 - 000000001f790000 (ACPI data)
[    0.000000]  BIOS-e820: 000000001f790000 - 000000001f7d0000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000001f7d0000 - 000000001f7de000 (reserved)
[    0.000000]  BIOS-e820: 000000001f7e0000 - 000000001f800000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[    0.000000] DMI present.
[    0.000000] AMI BIOS detected: BIOS may corrupt low RAM, working around it.
[    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[    0.000000] last_pfn = 0x1f780 max_arch_pfn = 0x1000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-DFFFF uncachable
[    0.000000]   E0000-EFFFF write-through
[    0.000000]   F0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 000000000 mask FE0000000 write-back
[    0.000000]   1 base 01F800000 mask FFF800000 uncachable
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] init_memory_mapping: 0000000000000000-000000001f780000
[    0.000000] NX (Execute Disable) protection: active
[    0.000000]  0000000000 - 0000200000 page 4k
[    0.000000]  0000200000 - 001f600000 page 2M
[    0.000000]  001f600000 - 001f780000 page 4k
[    0.000000] kernel direct mapping tables up to 1f780000 @ 10000-16000
[    0.000000] RAMDISK: 17724000 - 179df533
[    0.000000] ACPI: RSDP 000fbe50 00014 (v00 ACPIAM)
[    0.000000] ACPI: RSDT 1f780000 00034 (v01 A M I  OEMRSDT  03000911 MSFT 00000097)
[    0.000000] ACPI: FACP 1f780200 00081 (v01 A M I  OEMFACP  03000911 MSFT 00000097)
[    0.000000] ACPI: DSDT 1f780400 06069 (v01  A0797 A0797000 00000000 INTL 20060113)
[    0.000000] ACPI: FACS 1f790000 00040
[    0.000000] ACPI: APIC 1f780390 00068 (v01 A M I  OEMAPIC  03000911 MSFT 00000097)
[    0.000000] ACPI: OEMB 1f790040 00046 (v01 A M I  AMI_OEM  03000911 MSFT 00000097)
[    0.000000] ACPI: MCFG 1f786470 0003C (v01 A M I  OEMMCFG  03000911 MSFT 00000097)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] 503MB LOWMEM available.
[    0.000000]   mapped low ram: 0 - 1f780000
[    0.000000]   low ram: 0 - 1f780000
[    0.000000]   node 0 low ram: 00000000 - 1f780000
[    0.000000]   node 0 bootmap 00012000 - 00015ef0
[    0.000000] (7 early reservations) ==> bootmem [0000000000 - 001f780000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
[    0.000000]   #1 [0000100000 - 00008ec0ac]    TEXT DATA BSS ==> [0000100000 - 00008ec0ac]
[    0.000000]   #2 [0017724000 - 00179df533]          RAMDISK ==> [0017724000 - 00179df533]
[    0.000000]   #3 [000009fc00 - 0000100000]    BIOS reserved ==> [000009fc00 - 0000100000]
[    0.000000]   #4 [00008ed000 - 00008f41f4]              BRK ==> [00008ed000 - 00008f41f4]
[    0.000000]   #5 [0000010000 - 0000012000]          PGTABLE ==> [0000010000 - 0000012000]
[    0.000000]   #6 [0000012000 - 0000016000]          BOOTMAP ==> [0000012000 - 0000016000]
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000010 -> 0x00001000
[    0.000000]   Normal   0x00001000 -> 0x0001f780
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[2] active PFN ranges
[    0.000000]     0: 0x00000010 -> 0x0000009f
[    0.000000]     0: 0x00000100 -> 0x0001f780
[    0.000000] On node 0 totalpages: 128783
[    0.000000] free_area_init_node: node 0, pgdat c043836c, node_mem_map c1000200
[    0.000000]   DMA zone: 32 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 3951 pages, LIFO batch:0
[    0.000000]   Normal zone: 975 pages used for memmap
[    0.000000]   Normal zone: 123825 pages, LIFO batch:31
[    0.000000] Using APIC driver default
[    0.000000] ACPI: PM-Timer IO Port: 0x808
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ2 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] Enabling APIC mode:  Flat.  Using 1 I/O APICs
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] nr_irqs_gsi: 24
[    0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e4000
[    0.000000] PM: Registered nosave memory: 00000000000e4000 - 0000000000100000
[    0.000000] Allocating PCI resources starting at 20000000 (gap: 1f800000:df600000)
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 127776
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.30-rc2eeepc root=/dev/sda2 ro root=/dev/sda2 rootfstype=ext4 resume=/dev/sda2 resume_offset=9732
[    0.000000] Enabling fast FPU save and restore... done.
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
[    0.000000] Initializing CPU#0
[    0.000000] NR_IRQS:288
[    0.000000] PID hash table entries: 2048 (order: 11, 8192 bytes)
[    0.000000] Fast TSC calibration using PIT
[    0.000000] Detected 630.041 MHz processor.
[    0.003333] Console: colour VGA+ 80x25
[    0.003333] console [tty0] enabled
[    0.003333] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.003333] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.003333] ... MAX_LOCK_DEPTH:          48
[    0.003333] ... MAX_LOCKDEP_KEYS:        8191
[    0.003333] ... CLASSHASH_SIZE:          4096
[    0.003333] ... MAX_LOCKDEP_ENTRIES:     8192
[    0.003333] ... MAX_LOCKDEP_CHAINS:      16384
[    0.003333] ... CHAINHASH_SIZE:          8192
[    0.003333]  memory used by lock dependency info: 2847 kB
[    0.003333]  per task-struct memory footprint: 1152 bytes
[    0.003333] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.003333] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.003333] Memory: 499536k/515584k available (2045k kernel code, 15424k reserved, 1290k data, 248k init, 0k highmem)
[    0.003333] virtual kernel memory layout:
[    0.003333]     fixmap  : 0xfffaa000 - 0xfffff000   ( 340 kB)
[    0.003333]     vmalloc : 0xdff80000 - 0xfffa8000   ( 512 MB)
[    0.003333]     lowmem  : 0xc0000000 - 0xdf780000   ( 503 MB)
[    0.003333]       .init : 0xc0446000 - 0xc0484000   ( 248 kB)
[    0.003333]       .data : 0xc02ff503 - 0xc04420a8   (1290 kB)
[    0.003333]       .text : 0xc0100000 - 0xc02ff503   (2045 kB)
[    0.003333] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[    0.003333] SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.003383] Calibrating delay loop (skipped), value calculated using timer frequency.. 1260.58 BogoMIPS (lpj=2100136)
[    0.003680] Mount-cache hash table entries: 512
[    0.006328] CPU: L1 I cache: 32K, L1 D cache: 32K
[    0.006474] CPU: L2 cache: 512K
[    0.006681] [ds] using Pentium M configuration
[    0.006778] [ds] pebs not available
[    0.006874] Intel machine check architecture supported.
[    0.006978] Intel machine check reporting enabled on CPU#0.
[    0.007093] CPU: Intel(R) Celeron(R) M processor          900MHz stepping 08
[    0.007298] Checking 'hlt' instruction... OK.
[    0.020678] ACPI: Core revision 20090320
[    0.054619] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.089999] PM: Adding info for No Bus:platform
[    0.089999] net_namespace: 1136 bytes
[    0.091819] NET: Registered protocol family 16
[    0.092936] PM: Adding info for No Bus:vtcon0
[    0.093174] ACPI: bus type pci registered
[    0.093492] PM: Adding info for No Bus:id
[    0.093715] PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
[    0.093827] PCI: Not using MMCONFIG.
[    0.094216] PCI: PCI BIOS revision 3.00 entry at 0xf0031, last bus=5
[    0.094321] PCI: Using configuration type 1 for base access
[    0.097919] PM: Adding info for No Bus:default
[    0.098189] bio: create slab <bio-0> at 0
[    0.120455] ACPI: EC: Look up EC in DSDT
[    0.157798] ACPI: Interpreter enabled
[    0.157918] ACPI: (supports S0 S1 S3 S4 S5)
[    0.158404] ACPI: Using IOAPIC for interrupt routing
[    0.158727] PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
[    0.168936] PCI: MCFG area at e0000000 reserved in ACPI motherboard resources
[    0.169048] PCI: Using MMCONFIG for extended config space
[    0.169477] PM: Adding info for acpi:LNXSYSTM:00
[    0.169626] PM: Adding info for acpi:LNXPWRBN:00
[    0.169822] PM: Adding info for acpi:ACPI_CPU:00
[    0.170102] PM: Adding info for acpi:device:00
[    0.170883] PM: Adding info for acpi:PNP0A08:00
[    0.171147] PM: Adding info for acpi:PNP0C01:00
[    0.172036] PM: Adding info for acpi:device:01
[    0.172887] PM: Adding info for acpi:device:02
[    0.173746] PM: Adding info for acpi:device:03
[    0.174591] PM: Adding info for acpi:device:04
[    0.175430] PM: Adding info for acpi:device:05
[    0.175645] PM: Adding info for acpi:device:06
[    0.175889] PM: Adding info for acpi:PNP0000:00
[    0.176126] PM: Adding info for acpi:PNP0200:00
[    0.176365] PM: Adding info for acpi:PNP0100:00
[    0.176595] PM: Adding info for acpi:PNP0B00:00
[    0.177248] PM: Adding info for acpi:PNP0303:00
[    0.177887] PM: Adding info for acpi:SYN0A00:00
[    0.178138] PM: Adding info for acpi:PNP0800:00
[    0.178369] PM: Adding info for acpi:PNP0C04:00
[    0.178642] PM: Adding info for acpi:PNP0C09:00
[    0.179538] PM: Adding info for acpi:PNP0C02:00
[    0.180313] PM: Adding info for acpi:PNP0C02:01
[    0.180635] PM: Adding info for acpi:PNP0C02:02
[    0.181419] PM: Adding info for acpi:PNP0C0A:00
[    0.181728] PM: Adding info for acpi:ACPI0003:00
[    0.181973] PM: Adding info for acpi:device:07
[    0.182314] PM: Adding info for acpi:device:08
[    0.182516] PM: Adding info for acpi:device:09
[    0.182729] PM: Adding info for acpi:device:0a
[    0.182938] PM: Adding info for acpi:device:0b
[    0.183138] PM: Adding info for acpi:device:0c
[    0.183359] PM: Adding info for acpi:device:0d
[    0.184311] PM: Adding info for acpi:device:0e
[    0.185157] PM: Adding info for acpi:device:0f
[    0.186004] PM: Adding info for acpi:device:10
[    0.186868] PM: Adding info for acpi:device:11
[    0.187718] PM: Adding info for acpi:device:12
[    0.188563] PM: Adding info for acpi:device:13
[    0.188768] PM: Adding info for acpi:device:14
[    0.188975] PM: Adding info for acpi:device:15
[    0.189218] PM: Adding info for acpi:device:16
[    0.189482] PM: Adding info for acpi:device:17
[    0.189692] PM: Adding info for acpi:device:18
[    0.189901] PM: Adding info for acpi:device:19
[    0.190214] PM: Adding info for acpi:PNP0C02:03
[    0.190461] PM: Adding info for acpi:PNP0C01:01
[    0.190717] PM: Adding info for acpi:ASUS010:00
[    0.191022] PM: Adding info for acpi:PNP0C0D:00
[    0.191261] PM: Adding info for acpi:PNP0C0E:00
[    0.191524] PM: Adding info for acpi:PNP0C0C:00
[    0.192134] PM: Adding info for acpi:PNP0C0F:00
[    0.192742] PM: Adding info for acpi:PNP0C0F:01
[    0.193381] PM: Adding info for acpi:PNP0C0F:02
[    0.193995] PM: Adding info for acpi:PNP0C0F:03
[    0.194603] PM: Adding info for acpi:PNP0C0F:04
[    0.195217] PM: Adding info for acpi:PNP0C0F:05
[    0.195822] PM: Adding info for acpi:PNP0C0F:06
[    0.196441] PM: Adding info for acpi:PNP0C0F:07
[    0.196750] PM: Adding info for acpi:LNXTHERM:00
[    0.196918] PM: Adding info for acpi:LNXTHERM:01
[    0.198033] ACPI: EC: GPE = 0x18, I/O: command/status = 0x66, data = 0x62
[    0.198143] ACPI: EC: driver started in poll mode
[    0.199451] ACPI: No dock devices found.
[    0.199837] ACPI: PCI Root Bridge [PCI0] (0000:00)
[    0.200126] PM: Adding info for No Bus:pci0000:00
[    0.200185] PM: Adding info for No Bus:0000:00
[    0.200727] pci 0000:00:02.0: reg 10 32bit mmio: [0xf7f00000-0xf7f7ffff]
[    0.200750] pci 0000:00:02.0: reg 14 io port: [0xec00-0xec07]
[    0.200771] pci 0000:00:02.0: reg 18 32bit mmio: [0xd0000000-0xdfffffff]
[    0.200792] pci 0000:00:02.0: reg 1c 32bit mmio: [0xf7ec0000-0xf7efffff]
[    0.200940] pci 0000:00:02.1: reg 10 32bit mmio: [0xf7f80000-0xf7ffffff]
[    0.201254] pci 0000:00:1b.0: reg 10 64bit mmio: [0xf7eb8000-0xf7ebbfff]
[    0.201385] pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
[    0.201498] pci 0000:00:1b.0: PME# disabled
[    0.201765] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[    0.201875] pci 0000:00:1c.0: PME# disabled
[    0.202149] pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
[    0.202258] pci 0000:00:1c.1: PME# disabled
[    0.202530] pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
[    0.202640] pci 0000:00:1c.2: PME# disabled
[    0.202874] pci 0000:00:1d.0: reg 20 io port: [0xe400-0xe41f]
[    0.203025] pci 0000:00:1d.1: reg 20 io port: [0xe480-0xe49f]
[    0.203171] pci 0000:00:1d.2: reg 20 io port: [0xe800-0xe81f]
[    0.203357] pci 0000:00:1d.3: reg 20 io port: [0xe880-0xe89f]
[    0.203514] pci 0000:00:1d.7: reg 10 32bit mmio: [0xf7eb7c00-0xf7eb7fff]
[    0.203656] pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
[    0.203766] pci 0000:00:1d.7: PME# disabled
[    0.204179] pci 0000:00:1f.0: Force enabled HPET at 0xfed00000
[    0.204200] pci 0000:00:1f.0: quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
[    0.204345] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH6 GPIO
[    0.204456] pci 0000:00:1f.0: LPC Generic IO decode 1 PIO at 0380-03ff
[    0.204654] pci 0000:00:1f.2: reg 10 io port: [0x00-0x07]
[    0.204676] pci 0000:00:1f.2: reg 14 io port: [0x00-0x03]
[    0.204697] pci 0000:00:1f.2: reg 18 io port: [0x00-0x07]
[    0.204718] pci 0000:00:1f.2: reg 1c io port: [0x00-0x03]
[    0.204739] pci 0000:00:1f.2: reg 20 io port: [0xffa0-0xffaf]
[    0.204818] pci 0000:00:1f.2: PME# supported from D3hot
[    0.204923] pci 0000:00:1f.2: PME# disabled
[    0.205133] pci 0000:00:1f.3: reg 20 io port: [0x400-0x41f]
[    0.205489] pci 0000:03:00.0: reg 10 64bit mmio: [0xfbfc0000-0xfbffffff]
[    0.205568] pci 0000:03:00.0: reg 30 32bit mmio: [0xfbfa0000-0xfbfbffff]
[    0.205660] pci 0000:03:00.0: PME# supported from D3hot D3cold
[    0.205769] pci 0000:03:00.0: PME# disabled
[    0.205986] pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.206602] pci 0000:00:1c.1: bridge 32bit mmio: [0xfbf00000-0xfbffffff]
[    0.206781] pci 0000:00:1c.2: bridge 32bit mmio: [0xf8000000-0xfbefffff]
[    0.206803] pci 0000:00:1c.2: bridge 64bit mmio pref: [0xf0000000-0xf6ffffff]
[    0.206953] pci 0000:00:1e.0: transparent bridge
[    0.207127] pci_bus 0000:00: on NUMA node 0
[    0.207163] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[    0.207966] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P3._PRT]
[    0.208205] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P5._PRT]
[    0.208423] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P6._PRT]
[    0.210386] PM: Adding info for pci:0000:00:00.0
[    0.211916] PM: Adding info for pci:0000:00:02.0
[    0.213433] PM: Adding info for pci:0000:00:02.1
[    0.214949] PM: Adding info for pci:0000:00:1b.0
[    0.216467] PM: Adding info for pci:0000:00:1c.0
[    0.218010] PM: Adding info for pci:0000:00:1c.1
[    0.219556] PM: Adding info for pci:0000:00:1c.2
[    0.221113] PM: Adding info for pci:0000:00:1d.0
[    0.222645] PM: Adding info for pci:0000:00:1d.1
[    0.224187] PM: Adding info for pci:0000:00:1d.2
[    0.225707] PM: Adding info for pci:0000:00:1d.3
[    0.227255] PM: Adding info for pci:0000:00:1d.7
[    0.228776] PM: Adding info for pci:0000:00:1e.0
[    0.230327] PM: Adding info for pci:0000:00:1f.0
[    0.231851] PM: Adding info for pci:0000:00:1f.2
[    0.233389] PM: Adding info for pci:0000:00:1f.3
[    0.233495] PM: Adding info for No Bus:0000:04
[    0.233670] PM: Adding info for pci:0000:03:00.0
[    0.233766] PM: Adding info for No Bus:0000:03
[    0.233860] PM: Adding info for No Bus:0000:01
[    0.233954] PM: Adding info for No Bus:0000:05
[    0.234819] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
[    0.236014] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12 14 15)
[    0.237213] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 *10 11 12 14 15)
[    0.238398] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *7 10 11 12 14 15)
[    0.239594] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.240922] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.242230] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[    0.243558] ACPI: PCI Interrupt Link [LNKH] (IRQs *3 4 5 6 7 10 11 12 14 15)
[    0.245027] SCSI subsystem initialized
[    0.261558] libata version 3.00 loaded.
[    0.262173] PCI: Using ACPI for IRQ routing
[    0.263385] PM: Adding info for No Bus:lo
[    0.264418] hpet clockevent registered
[    0.264428] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    0.264545] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    0.264828] hpet0: 3 comparators, 64-bit 14.318180 MHz counter
[    0.271687] pnp: PnP ACPI init
[    0.271877] PM: Adding info for No Bus:pnp0
[    0.271893] ACPI: bus type pnp registered
[    0.272885] PM: Adding info for pnp:00:00
[    0.273212] PM: Adding info for pnp:00:01
[    0.273611] PM: Adding info for pnp:00:02
[    0.273858] PM: Adding info for pnp:00:03
[    0.274294] PM: Adding info for pnp:00:04
[    0.274900] PM: Adding info for pnp:00:05
[    0.275125] PM: Adding info for pnp:00:06
[    0.275354] PM: Adding info for pnp:00:07
[    0.276366] PM: Adding info for pnp:00:08
[    0.277396] PM: Adding info for pnp:00:09
[    0.277961] PM: Adding info for pnp:00:0a
[    0.278869] PM: Adding info for pnp:00:0b
[    0.280118] PM: Adding info for pnp:00:0c
[    0.281660] pnp: PnP ACPI: found 13 devices
[    0.281760] ACPI: ACPI bus type pnp unregistered
[    0.281908] system 00:01: iomem range 0xfed13000-0xfed19fff has been reserved
[    0.282081] system 00:08: ioport range 0x380-0x383 has been reserved
[    0.282191] system 00:08: ioport range 0x4d0-0x4d1 has been reserved
[    0.284081] system 00:08: ioport range 0x800-0x87f has been reserved
[    0.284193] system 00:08: ioport range 0x480-0x4bf has been reserved
[    0.284303] system 00:08: iomem range 0xfed1c000-0xfed1ffff has been reserved
[    0.284415] system 00:08: iomem range 0xfed20000-0xfed8ffff has been reserved
[    0.284532] system 00:08: iomem range 0xfff00000-0xffffffff could not be reserved
[    0.284697] system 00:09: iomem range 0xfec00000-0xfec00fff has been reserved
[    0.284809] system 00:09: iomem range 0xfee00000-0xfee00fff has been reserved
[    0.284941] system 00:0a: iomem range 0xe0000000-0xefffffff has been reserved
[    0.285082] system 00:0b: iomem range 0xe0000000-0xefffffff has been reserved
[    0.285216] system 00:0c: iomem range 0x0-0x9ffff could not be reserved
[    0.285327] system 00:0c: iomem range 0xc0000-0xcffff could not be reserved
[    0.285439] system 00:0c: iomem range 0xe0000-0xfffff could not be reserved
[    0.285552] system 00:0c: iomem range 0x100000-0x1f7fffff could not be reserved
[    0.285848] PM: Adding info for No Bus:mem
[    0.286031] PM: Adding info for No Bus:kmem
[    0.286113] PM: Adding info for No Bus:null
[    0.286192] PM: Adding info for No Bus:port
[    0.286272] PM: Adding info for No Bus:zero
[    0.286353] PM: Adding info for No Bus:full
[    0.286432] PM: Adding info for No Bus:random
[    0.286515] PM: Adding info for No Bus:urandom
[    0.286596] PM: Adding info for No Bus:kmsg
[    0.322170] pci 0000:00:1c.0: PCI bridge, secondary bus 0000:04
[    0.322276] pci 0000:00:1c.0:   IO window: disabled
[    0.322384] pci 0000:00:1c.0:   MEM window: disabled
[    0.322488] pci 0000:00:1c.0:   PREFETCH window: disabled
[    0.322601] pci 0000:00:1c.1: PCI bridge, secondary bus 0000:03
[    0.322703] pci 0000:00:1c.1:   IO window: disabled
[    0.322811] pci 0000:00:1c.1:   MEM window: 0xfbf00000-0xfbffffff
[    0.322920] pci 0000:00:1c.1:   PREFETCH window: disabled
[    0.323032] pci 0000:00:1c.2: PCI bridge, secondary bus 0000:01
[    0.323135] pci 0000:00:1c.2:   IO window: disabled
[    0.323242] pci 0000:00:1c.2:   MEM window: 0xf8000000-0xfbefffff
[    0.323365] pci 0000:00:1c.2:   PREFETCH window: 0x000000f0000000-0x000000f6ffffff
[    0.323516] pci 0000:00:1e.0: PCI bridge, secondary bus 0000:05
[    0.323620] pci 0000:00:1e.0:   IO window: disabled
[    0.323727] pci 0000:00:1e.0:   MEM window: disabled
[    0.323831] pci 0000:00:1e.0:   PREFETCH window: disabled
[    0.323975] pci 0000:00:1c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    0.324093] pci 0000:00:1c.0: setting latency timer to 64
[    0.324122] pci 0000:00:1c.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[    0.324234] pci 0000:00:1c.1: setting latency timer to 64
[    0.324265] pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    0.324377] pci 0000:00:1c.2: setting latency timer to 64
[    0.324401] pci 0000:00:1e.0: setting latency timer to 64
[    0.324417] pci_bus 0000:00: resource 0 io:  [0x00-0xffff]
[    0.324428] pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
[    0.324440] pci_bus 0000:04: resource 0 mem: [0x0-0x0]
[    0.324449] pci_bus 0000:04: resource 1 mem: [0x0-0x0]
[    0.324459] pci_bus 0000:04: resource 2 mem: [0x0-0x0]
[    0.324468] pci_bus 0000:04: resource 3 mem: [0x0-0x0]
[    0.324478] pci_bus 0000:03: resource 0 mem: [0x0-0x0]
[    0.324488] pci_bus 0000:03: resource 1 mem: [0xfbf00000-0xfbffffff]
[    0.324498] pci_bus 0000:03: resource 2 mem: [0x0-0x0]
[    0.324508] pci_bus 0000:03: resource 3 mem: [0x0-0x0]
[    0.324518] pci_bus 0000:01: resource 0 mem: [0x0-0x0]
[    0.324528] pci_bus 0000:01: resource 1 mem: [0xf8000000-0xfbefffff]
[    0.324538] pci_bus 0000:01: resource 2 mem: [0xf0000000-0xf6ffffff]
[    0.324548] pci_bus 0000:01: resource 3 mem: [0x0-0x0]
[    0.324558] pci_bus 0000:05: resource 0 mem: [0x0-0x0]
[    0.324568] pci_bus 0000:05: resource 1 mem: [0x0-0x0]
[    0.324577] pci_bus 0000:05: resource 2 mem: [0x0-0x0]
[    0.324587] pci_bus 0000:05: resource 3 io:  [0x00-0xffff]
[    0.324597] pci_bus 0000:05: resource 4 mem: [0x000000-0xffffffffffffffff]
[    0.324808] NET: Registered protocol family 2
[    0.325624] IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
[    0.328069] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[    0.328653] TCP bind hash table entries: 16384 (order: 7, 524288 bytes)
[    0.334218] TCP: Hash tables configured (established 16384 bind 16384)
[    0.334466] TCP reno registered
[    0.335018] NET: Registered protocol family 1
[    0.335898] checking if image is initramfs...
[    0.619267] rootfs image is initramfs; unpacking...
[    0.619490] Freeing initrd memory: 2797k freed
[    0.625139] PM: Adding info for platform:pcspkr
[    0.625980] PM: Adding info for No Bus:snapshot
[    0.626432] audit: initializing netlink socket (disabled)
[    0.626714] type=2000 audit(1240041427.623:1): initialized
[    0.669232] msgmni has been set to 981
[    0.681678] alg: No test for stdrng (krng)
[    0.681904] io scheduler noop registered
[    0.682001] io scheduler anticipatory registered
[    0.682098] io scheduler deadline registered
[    0.682290] io scheduler cfq registered (default)
[    0.682435] pci 0000:00:02.0: Boot video device
[    0.683544] pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
[    0.683588] pcieport-driver 0000:00:1c.0: setting latency timer to 64
[    0.683665] PM: Adding info for pci_express:0000:00:1c.0:pcie01
[    0.683764] PM: Adding info for pci_express:0000:00:1c.0:pcie04
[    0.683853] PM: Adding info for pci_express:0000:00:1c.0:pcie08
[    0.684182] pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
[    0.684223] pcieport-driver 0000:00:1c.1: setting latency timer to 64
[    0.684286] PM: Adding info for pci_express:0000:00:1c.1:pcie01
[    0.684376] PM: Adding info for pci_express:0000:00:1c.1:pcie04
[    0.684465] PM: Adding info for pci_express:0000:00:1c.1:pcie08
[    0.684788] pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
[    0.684828] pcieport-driver 0000:00:1c.2: setting latency timer to 64
[    0.684901] PM: Adding info for pci_express:0000:00:1c.2:pcie01
[    0.685002] PM: Adding info for pci_express:0000:00:1c.2:pcie04
[    0.685092] PM: Adding info for pci_express:0000:00:1c.2:pcie08
[    0.685625] PM: Adding info for No Bus:tty
[    0.685804] PM: Adding info for No Bus:console
[    0.685893] PM: Adding info for No Bus:tty0
[    0.686028] PM: Adding info for No Bus:vcs
[    0.686199] PM: Adding info for No Bus:vcsa
[    0.686340] PM: Adding info for No Bus:tty1
[    0.686422] PM: Adding info for No Bus:tty2
[    0.686503] PM: Adding info for No Bus:tty3
[    0.686591] PM: Adding info for No Bus:tty4
[    0.686708] PM: Adding info for No Bus:tty5
[    0.686790] PM: Adding info for No Bus:tty6
[    0.686872] PM: Adding info for No Bus:tty7
[    0.686952] PM: Adding info for No Bus:tty8
[    0.687033] PM: Adding info for No Bus:tty9
[    0.687115] PM: Adding info for No Bus:tty10
[    0.687197] PM: Adding info for No Bus:tty11
[    0.687279] PM: Adding info for No Bus:tty12
[    0.687362] PM: Adding info for No Bus:tty13
[    0.687445] PM: Adding info for No Bus:tty14
[    0.687532] PM: Adding info for No Bus:tty15
[    0.687621] PM: Adding info for No Bus:tty16
[    0.687705] PM: Adding info for No Bus:tty17
[    0.687788] PM: Adding info for No Bus:tty18
[    0.687871] PM: Adding info for No Bus:tty19
[    0.687956] PM: Adding info for No Bus:tty20
[    0.688040] PM: Adding info for No Bus:tty21
[    0.688125] PM: Adding info for No Bus:tty22
[    0.688210] PM: Adding info for No Bus:tty23
[    0.688294] PM: Adding info for No Bus:tty24
[    0.688379] PM: Adding info for No Bus:tty25
[    0.688464] PM: Adding info for No Bus:tty26
[    0.688551] PM: Adding info for No Bus:tty27
[    0.688643] PM: Adding info for No Bus:tty28
[    0.688730] PM: Adding info for No Bus:tty29
[    0.688814] PM: Adding info for No Bus:tty30
[    0.688910] PM: Adding info for No Bus:tty31
[    0.688997] PM: Adding info for No Bus:tty32
[    0.689083] PM: Adding info for No Bus:tty33
[    0.689168] PM: Adding info for No Bus:tty34
[    0.689257] PM: Adding info for No Bus:tty35
[    0.689346] PM: Adding info for No Bus:tty36
[    0.689432] PM: Adding info for No Bus:tty37
[    0.689521] PM: Adding info for No Bus:tty38
[    0.689610] PM: Adding info for No Bus:tty39
[    0.689702] PM: Adding info for No Bus:tty40
[    0.689789] PM: Adding info for No Bus:tty41
[    0.689875] PM: Adding info for No Bus:tty42
[    0.689979] PM: Adding info for No Bus:tty43
[    0.690073] PM: Adding info for No Bus:tty44
[    0.690161] PM: Adding info for No Bus:tty45
[    0.690249] PM: Adding info for No Bus:tty46
[    0.690340] PM: Adding info for No Bus:tty47
[    0.690427] PM: Adding info for No Bus:tty48
[    0.690514] PM: Adding info for No Bus:tty49
[    0.690601] PM: Adding info for No Bus:tty50
[    0.690699] PM: Adding info for No Bus:tty51
[    0.690795] PM: Adding info for No Bus:tty52
[    0.690884] PM: Adding info for No Bus:tty53
[    0.690972] PM: Adding info for No Bus:tty54
[    0.691062] PM: Adding info for No Bus:tty55
[    0.691151] PM: Adding info for No Bus:tty56
[    0.691241] PM: Adding info for No Bus:tty57
[    0.691329] PM: Adding info for No Bus:tty58
[    0.691419] PM: Adding info for No Bus:tty59
[    0.691509] PM: Adding info for No Bus:tty60
[    0.691598] PM: Adding info for No Bus:tty61
[    0.691690] PM: Adding info for No Bus:tty62
[    0.691783] PM: Adding info for No Bus:tty63
[    0.692242] PM: Adding info for No Bus:ptmx
[    0.692376] PM: Adding info for No Bus:hpet
[    0.692860] PM: Adding info for No Bus:ram0
[    0.693080] PM: Adding info for No Bus:1:0
[    0.693328] PM: Adding info for No Bus:ram1
[    0.693504] PM: Adding info for No Bus:1:1
[    0.693673] PM: Adding info for No Bus:ram2
[    0.693776] PM: Adding info for No Bus:1:2
[    0.693931] PM: Adding info for No Bus:ram3
[    0.694031] PM: Adding info for No Bus:1:3
[    0.694179] PM: Adding info for No Bus:ram4
[    0.694279] PM: Adding info for No Bus:1:4
[    0.694428] PM: Adding info for No Bus:ram5
[    0.694528] PM: Adding info for No Bus:1:5
[    0.694683] PM: Adding info for No Bus:ram6
[    0.694786] PM: Adding info for No Bus:1:6
[    0.694934] PM: Adding info for No Bus:ram7
[    0.695034] PM: Adding info for No Bus:1:7
[    0.695254] PM: Adding info for No Bus:ram8
[    0.695356] PM: Adding info for No Bus:1:8
[    0.695516] PM: Adding info for No Bus:ram9
[    0.695624] PM: Adding info for No Bus:1:9
[    0.695776] PM: Adding info for No Bus:ram10
[    0.695879] PM: Adding info for No Bus:1:10
[    0.696029] PM: Adding info for No Bus:ram11
[    0.696132] PM: Adding info for No Bus:1:11
[    0.696281] PM: Adding info for No Bus:ram12
[    0.696382] PM: Adding info for No Bus:1:12
[    0.696539] PM: Adding info for No Bus:ram13
[    0.696663] PM: Adding info for No Bus:1:13
[    0.696819] PM: Adding info for No Bus:ram14
[    0.696922] PM: Adding info for No Bus:1:14
[    0.697131] PM: Adding info for No Bus:ram15
[    0.697235] PM: Adding info for No Bus:1:15
[    0.697306] brd: module loaded
[    0.697483] Driver 'sd' needs updating - please use bus_type methods
[    0.697721] ahci 0000:00:1f.2: version 3.0
[    0.697830] ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    0.698002] ahci 0000:00:1f.2: PCI INT B disabled
[    0.698120] ahci: probe of 0000:00:1f.2 failed with error -22
[    0.698392] ata_piix 0000:00:1f.2: version 2.12
[    0.698417] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    0.698534] ata_piix 0000:00:1f.2: MAP [ P0 P2 IDE IDE ]
[    0.698996] ata_piix 0000:00:1f.2: setting latency timer to 64
[    0.730030] scsi0 : ata_piix
[    0.730680] PM: Adding info for No Bus:host0
[    0.730811] PM: Adding info for No Bus:host0
[    0.731401] scsi1 : ata_piix
[    0.731533] PM: Adding info for No Bus:host1
[    0.731652] PM: Adding info for No Bus:host1
[    0.731718] ata1: SATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[    0.731827] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[    0.732397] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[    0.732844] PM: Adding info for platform:i8042
[    0.744856] serio: i8042 KBD port at 0x60,0x64 irq 1
[    0.745179] PM: Adding info for No Bus:mice
[    0.745381] PM: Adding info for No Bus:psaux
[    0.745425] mice: PS/2 mouse device common for all mice
[    0.745683] rtc_cmos 00:03: RTC can wake from S4
[    0.746094] PM: Adding info for No Bus:rtc0
[    0.746297] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
[    0.746455] rtc0: alarms up to one month, 114 bytes nvram, hpet irqs
[    0.746645] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.05
[    0.746861] PM: Adding info for platform:iTCO_wdt
[    0.747067] iTCO_wdt: Found a ICH6-M TCO device (Version=2, TCOBASE=0x0860)
[    0.747329] PM: Adding info for No Bus:watchdog
[    0.747373] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[    0.747486] iTCO_vendor_support: vendor-support=0
[    0.747634] cpuidle: using governor ladder
[    0.747730] cpuidle: using governor menu
[    0.748124] Advanced Linux Sound Architecture Driver Version 1.0.19.
[    0.748229] ALSA device list:
[    0.748317]   No soundcards found.
[    0.750861] TCP cubic registered
[    0.750960] Using IPI Shortcut mode
[    0.751178] PM: Adding info for No Bus:cpu_dma_latency
[    0.751285] PM: Adding info for No Bus:network_latency
[    0.751430] PM: Adding info for No Bus:network_throughput
[    0.751877] PM: Adding info for serio:serio0
[    0.766668] Switched to NOHz mode on CPU #0
[    0.785428] PM: Adding info for No Bus:input0
[    0.785904] input: AT Translated Set 2 keyboard as /class/input/input0
[    1.077108] ata2.00: CFA: SILICONMOTION SM223AC, , max UDMA/66
[    1.077218] ata2.00: 7815024 sectors, multi 0: LBA 
[    1.090333] ata2.00: configured for UDMA/66
[    1.091809] scsi 1:0:0:0: Direct-Access     ATA      SILICONMOTION SM n/a  PQ: 0 ANSI: 5
[    1.092182] PM: Adding info for No Bus:target1:0:0
[    1.092455] PM: Adding info for scsi:1:0:0:0
[    1.093328] PM: Adding info for No Bus:1:0:0:0
[    1.093676] PM: Adding info for No Bus:1:0:0:0
[    1.094012] sd 1:0:0:0: [sda] 7815024 512-byte hardware sectors: (4.00 GB/3.72 GiB)
[    1.094233] sd 1:0:0:0: [sda] Write Protect is off
[    1.094335] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.094481] sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    1.094751] PM: Adding info for No Bus:sda
[    1.095712]  sda: sda1 sda2
[    1.097080] PM: Adding info for No Bus:sda1
[    1.097597] PM: Adding info for No Bus:sda2
[    1.098014] PM: Adding info for No Bus:8:0
[    1.098101] sd 1:0:0:0: [sda] Attached SCSI disk
[    1.098628] PM: Resume from partition /dev/sda2
[    1.098635] PM: Checking hibernation image.
[    1.100964] PM: Resume from disk failed.
[    1.101675] rtc_cmos 00:03: setting system clock to 2009-04-18 07:57:09 UTC (1240041429)
[    1.101828] BIOS EDD facility v0.16 2004-Jun-25, 2 devices found
[    1.102623] Freeing unused kernel memory: 248k freed
[    1.354542] ACPI: EC: non-query interrupt received, switching to interrupt mode
[    1.366312] PM: Adding info for No Bus:thermal_zone0
[    1.374534] thermal LNXTHERM:01: registered as thermal_zone0
[    1.374695] ACPI: Thermal Zone [TZ00] (32 C)
[    2.907760] PM: Starting manual resume from disk
[    2.907929] PM: Resume from partition 8:2
[    2.907936] PM: Checking hibernation image.
[    2.909243] PM: Resume from disk failed.
[    2.933167] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[    2.933184] PM: Basic memory bitmaps created
[    2.959910] PM: Basic memory bitmaps freed
[    2.983990] EXT4-fs: delayed allocation enabled
[    2.984099] EXT4-fs: file extents enabled
[    2.984438] EXT4-fs: mballoc enabled
[    2.984810] EXT4-fs: mounted filesystem sda2 without journal
[    3.756901] udev: starting version 140
[    3.757127] udev: deprecated sysfs layout; update the kernel or disable CONFIG_SYSFS_DEPRECATED; some udev features will not work correctly
[    4.367545] PM: Adding info for No Bus:event0
[    4.931001] PM: Adding info for No Bus:input1
[    4.931103] input: Power Button (FF) as /class/input/input1
[    4.931455] PM: Adding info for No Bus:event1
[    4.931506] ACPI: Power Button (FF) [PWRF]
[    4.931906] PM: Adding info for No Bus:input2
[    4.931998] input: Lid Switch as /class/input/input2
[    4.932204] PM: Adding info for No Bus:event2
[    4.972490] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
[    4.973043] PM: Adding info for No Bus:cooling_device0
[    4.973110] processor ACPI_CPU:00: registered as cooling_device0
[    4.973236] ACPI: Processor [CPU1] (supports 8 throttling states)
[    4.974976] ACPI: Lid Switch [LID]
[    4.975331] PM: Adding info for No Bus:input3
[    4.975435] input: Sleep Button (CM) as /class/input/input3
[    4.975672] PM: Adding info for No Bus:event3
[    4.975724] ACPI: Sleep Button (CM) [SLPB]
[    4.976022] PM: Adding info for No Bus:input4
[    4.976111] input: Power Button (CM) as /class/input/input4
[    4.976329] PM: Adding info for No Bus:event4
[    4.976378] ACPI: Power Button (CM) [PWRB]
[    4.978561] PM: Adding info for No Bus:AC0
[    4.979193] ACPI: AC Adapter [AC0] (on-line)
[    4.985710] ACPI: Battery Slot [BAT0] (battery absent)
[    5.004861] eeepc: Eee PC Hotkey Driver
[    5.069203] eeepc: Hotkey init flags 0x41
[    5.070976] eeepc: Get control methods supported: 0x101711
[    5.071300] PM: Adding info for No Bus:input5
[    5.071416] input: Asus EeePC extra buttons as /class/input/input5
[    5.071662] PM: Adding info for No Bus:event5
[    5.074524] PM: Adding info for No Bus:rfkill0
[    5.075359] PM: Adding info for No Bus:eeepc
[    5.132746] Linux agpgart interface v0.103
[    5.151614] PM: Adding info for No Bus:hwmon0
[    5.151977] PM: Adding info for platform:eeepc
[    5.153770] PM: Adding info for No Bus:acpi_video0
[    5.154276] PM: Adding info for No Bus:acpi_video1
[    5.154446] PM: Adding info for No Bus:acpi_video2
[    5.154876] PM: Adding info for No Bus:input6
[    5.154985] input: Video Bus as /class/input/input6
[    5.155248] PM: Adding info for No Bus:event6
[    5.155302] ACPI: Video Device [VGA] (multi-head: yes  rom: no  post: no)
[    5.223386] PM: Adding info for No Bus:timer
[    5.233463] Atheros(R) L2 Ethernet Driver - version 2.2.3
[    5.233573] Copyright (c) 2007 Atheros Corporation.
[    5.233791] atl2 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[    5.233921] atl2 0000:03:00.0: setting latency timer to 64
[    5.305094] agpgart-intel 0000:00:00.0: Intel 915GM Chipset
[    5.305772] agpgart-intel 0000:00:00.0: detected 7932K stolen memory
[    5.310942] PM: Adding info for No Bus:agpgart
[    5.311077] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
[    5.319675] intel_rng: FWH not detected
[    5.411157] PM: Adding info for No Bus:eth0
[    5.523056] usbcore: registered new interface driver usbfs
[    5.523545] usbcore: registered new interface driver hub
[    5.523849] usbcore: registered new device driver usb
[    5.530701] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    5.530935] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[    5.531122] ehci_hcd 0000:00:1d.7: setting latency timer to 64
[    5.531136] ehci_hcd 0000:00:1d.7: EHCI Host Controller
[    5.532185] PM: Adding info for No Bus:usb_host1
[    5.534328] ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
[    5.538507] ehci_hcd 0000:00:1d.7: debug port 1
[    5.538619] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
[    5.538676] ehci_hcd 0000:00:1d.7: irq 23, io mem 0xf7eb7c00
[    5.546610] PM: Adding info for No Bus:input7
[    5.546771] input: PC Speaker as /class/input/input7
[    5.547022] PM: Adding info for No Bus:event7
[    5.550078] ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
[    5.551704] PM: Adding info for usb:usb1
[    5.552178] usb usb1: configuration #1 chosen from 1 choice
[    5.552926] PM: Adding info for usb:1-0:1.0
[    5.553232] hub 1-0:1.0: USB hub found
[    5.553602] hub 1-0:1.0: 8 ports detected
[    5.555191] PM: Adding info for No Bus:usbdev1.1_ep81
[    5.555871] PM: Adding info for No Bus:usbdev1.1_ep00
[    5.647899] uhci_hcd: USB Universal Host Controller Interface driver
[    5.648189] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[    5.648326] uhci_hcd 0000:00:1d.0: setting latency timer to 64
[    5.648340] uhci_hcd 0000:00:1d.0: UHCI Host Controller
[    5.648543] PM: Adding info for No Bus:usb_host2
[    5.648859] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
[    5.649061] uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000e400
[    5.649595] PM: Adding info for usb:usb2
[    5.649718] usb usb2: configuration #1 chosen from 1 choice
[    5.651724] PM: Adding info for usb:2-0:1.0
[    5.651801] hub 2-0:1.0: USB hub found
[    5.652197] hub 2-0:1.0: 2 ports detected
[    5.652608] PM: Adding info for No Bus:usbdev2.1_ep81
[    5.652841] PM: Adding info for No Bus:usbdev2.1_ep00
[    5.652974] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    5.653101] uhci_hcd 0000:00:1d.1: setting latency timer to 64
[    5.653113] uhci_hcd 0000:00:1d.1: UHCI Host Controller
[    5.653303] PM: Adding info for No Bus:usb_host3
[    5.653810] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
[    5.654041] uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000e480
[    5.654534] PM: Adding info for usb:usb3
[    5.654653] usb usb3: configuration #1 chosen from 1 choice
[    5.654880] PM: Adding info for usb:3-0:1.0
[    5.654951] hub 3-0:1.0: USB hub found
[    5.655074] hub 3-0:1.0: 2 ports detected
[    5.655364] PM: Adding info for No Bus:usbdev3.1_ep81
[    5.655576] PM: Adding info for No Bus:usbdev3.1_ep00
[    5.655707] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    5.655833] uhci_hcd 0000:00:1d.2: setting latency timer to 64
[    5.655845] uhci_hcd 0000:00:1d.2: UHCI Host Controller
[    5.656038] PM: Adding info for No Bus:usb_host4
[    5.656106] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
[    5.656322] uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000e800
[    5.656829] PM: Adding info for usb:usb4
[    5.656953] usb usb4: configuration #1 chosen from 1 choice
[    5.657185] PM: Adding info for usb:4-0:1.0
[    5.657257] hub 4-0:1.0: USB hub found
[    5.657380] hub 4-0:1.0: 2 ports detected
[    5.657667] PM: Adding info for No Bus:usbdev4.1_ep81
[    5.657884] PM: Adding info for No Bus:usbdev4.1_ep00
[    5.658007] uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[    5.658134] uhci_hcd 0000:00:1d.3: setting latency timer to 64
[    5.658146] uhci_hcd 0000:00:1d.3: UHCI Host Controller
[    5.658338] PM: Adding info for No Bus:usb_host5
[    5.658405] uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
[    5.658616] uhci_hcd 0000:00:1d.3: irq 16, io base 0x0000e880
[    5.659096] PM: Adding info for usb:usb5
[    5.659222] usb usb5: configuration #1 chosen from 1 choice
[    5.659443] PM: Adding info for usb:5-0:1.0
[    5.659521] hub 5-0:1.0: USB hub found
[    5.659642] hub 5-0:1.0: 2 ports detected
[    5.659926] PM: Adding info for No Bus:usbdev5.1_ep81
[    5.660167] PM: Adding info for No Bus:usbdev5.1_ep00
[    5.695102] i801_smbus 0000:00:1f.3: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    5.696043] PM: Adding info for No Bus:i2c-0
[    5.850185] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    5.850463] HDA Intel 0000:00:1b.0: setting latency timer to 64
[    5.886850] usb 1-5: new high speed USB device using ehci_hcd and address 2
[    6.012032] PM: Adding info for usb:1-5
[    6.012250] usb 1-5: configuration #1 chosen from 1 choice
[    6.013122] PM: Adding info for usb:1-5:1.0
[    6.013417] PM: Adding info for No Bus:usbdev1.2_ep01
[    6.013598] PM: Adding info for No Bus:usbdev1.2_ep82
[    6.013816] PM: Adding info for No Bus:usbdev1.2_ep00
[    6.130124] usb 1-8: new high speed USB device using ehci_hcd and address 3
[    6.251446] PM: Adding info for No Bus:pcmC0D0p
[    6.251926] PM: Adding info for No Bus:pcmC0D0c
[    6.253233] PM: Adding info for No Bus:dsp
[    6.253995] PM: Adding info for No Bus:audio
[    6.254687] PM: Adding info for No Bus:hwC0D0
[    6.254874] PM: Adding info for No Bus:controlC0
[    6.255245] PM: Adding info for No Bus:mixer
[    6.259028] PM: Adding info for usb:1-8
[    6.259171] usb 1-8: configuration #1 chosen from 1 choice
[    6.259540] PM: Adding info for usb:1-8:1.0
[    6.259746] PM: Adding info for No Bus:usbdev1.3_ep81
[    6.259907] PM: Adding info for usb:1-8:1.1
[    6.260184] PM: Adding info for No Bus:usbdev1.3_ep82
[    6.260408] PM: Adding info for No Bus:usbdev1.3_ep00
[    6.386655] usual_tables: module license 'unspecified' taints kernel.
[    6.386816] Disabling lockdep due to kernel taint
[    6.439651] Linux video capture interface: v2.00
[    6.446770] Marking TSC unstable due to TSC halts in idle
[    6.456417] uvcvideo: Found UVC 1.00 device <unnamed> (eb1a:2761)
[    6.614887] PM: Adding info for No Bus:video0
[    6.615051] usbcore: registered new interface driver uvcvideo
[    6.615165] USB Video Class driver (v0.1.0)
[    6.763383] Clocksource tsc unstable (delta = -96153146 ns)
[    6.853199] PM: Adding info for No Bus:vcs2
[    6.853297] PM: Adding info for No Bus:vcsa2
[    6.861768] PM: Adding info for No Bus:vcs3
[    6.861864] PM: Adding info for No Bus:vcsa3
[    6.868203] PM: Adding info for No Bus:vcs4
[    6.868299] PM: Adding info for No Bus:vcsa4
[    6.870832] PM: Adding info for No Bus:vcs5
[    6.870920] PM: Adding info for No Bus:vcsa5
[    6.875213] PM: Adding info for No Bus:vcs6
[    6.875298] PM: Adding info for No Bus:vcsa6
[    7.854286] EXT4 FS on sda2, no journal
[    8.280115] Adding 524280k swap on /swapfile.  Priority:-1 extents:1683 across:3168868k 
[    9.510557] NET: Registered protocol family 10
[    9.512717] lo: Disabled Privacy Extensions
[   12.962439] PM: Adding info for No Bus:vcs7
[   12.962543] PM: Adding info for No Bus:vcsa7
[   15.819409] [drm] Initialized drm 1.1.0 20060810
[   16.163239] pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   16.163261] pci 0000:00:02.0: setting latency timer to 64
[   16.169550] PM: Adding info for No Bus:card0
[   16.170451] [drm:i915_gem_detect_bit_6_swizzle] *ERROR* Couldn't read from MCHBAR.  Disabling tiling.
[   16.170513] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[   17.740069] atl2 0000:03:00.0: irq 27 for MSI/MSI-X
[   17.740537] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   17.945294] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
[   17.945411] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   18.075778] NET: Registered protocol family 17
[   27.973369] eth0: no IPv6 routers present
[   57.549665] PM: Adding info for No Bus:vcs63
[   57.549769] PM: Adding info for No Bus:vcsa63
[   57.551117] [drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
[   60.580288] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[   60.580304] PM: Basic memory bitmaps created
[   60.580627] PM: Adding info for No Bus:vcs8
[   60.580727] PM: Adding info for No Bus:vcsa8
[   61.125471] Syncing filesystems ... done.
[   61.132711] Freezing user space processes ... (elapsed 0.00 seconds) done.
[   61.134847] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[   61.135207] PM: Shrinking memory...  <6>before: sc.nr_reclaimed = 0
[   61.180993] pass = 0, prio = 12, sc.nr_reclaimed = 0
[   61.181004] pass = 0, prio = 11, sc.nr_reclaimed = 0
[   61.181014] pass = 0, prio = 10, sc.nr_reclaimed = 0
[   61.181024] pass = 0, prio = 9, sc.nr_reclaimed = 0
[   61.181033] pass = 0, prio = 8, sc.nr_reclaimed = 0
[   61.181043] pass = 0, prio = 7, sc.nr_reclaimed = 0
[   61.186509] pass = 0, prio = 6, sc.nr_reclaimed = 0
[   61.186525] pass = 0, prio = 5, sc.nr_reclaimed = 0
[   61.186534] pass = 0, prio = 4, sc.nr_reclaimed = 0
[   61.186544] pass = 0, prio = 3, sc.nr_reclaimed = 0
[   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
[   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
[   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
[   61.556729] pass = 1, prio = 12, sc.nr_reclaimed = 0
[   61.556739] pass = 1, prio = 11, sc.nr_reclaimed = 0
[   61.556749] pass = 1, prio = 10, sc.nr_reclaimed = 0
[   61.556759] pass = 1, prio = 9, sc.nr_reclaimed = 0
[   61.556768] pass = 1, prio = 8, sc.nr_reclaimed = 0
[   61.556778] pass = 1, prio = 7, sc.nr_reclaimed = 0
[   61.556787] pass = 1, prio = 6, sc.nr_reclaimed = 0
[   61.556797] pass = 1, prio = 5, sc.nr_reclaimed = 0
[   61.556806] pass = 1, prio = 4, sc.nr_reclaimed = 0
[   61.556816] pass = 1, prio = 3, sc.nr_reclaimed = 0
[   61.556825] pass = 1, prio = 2, sc.nr_reclaimed = 0
[   61.556835] pass = 1, prio = 1, sc.nr_reclaimed = 0
[   61.595841] pass = 1, prio = 0, sc.nr_reclaimed = 0
[   61.595854] pass = 2, prio = 12, sc.nr_reclaimed = 0
[   61.595864] pass = 2, prio = 11, sc.nr_reclaimed = 0
[   61.595873] pass = 2, prio = 10, sc.nr_reclaimed = 0
[   61.595883] pass = 2, prio = 9, sc.nr_reclaimed = 0
[   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
[   61.710062] pass = 2, prio = 7, sc.nr_reclaimed = 0
[   61.710072] pass = 2, prio = 6, sc.nr_reclaimed = 0
[   61.710081] pass = 2, prio = 5, sc.nr_reclaimed = 0
[   61.710091] pass = 2, prio = 4, sc.nr_reclaimed = 0
[   61.710100] pass = 2, prio = 3, sc.nr_reclaimed = 0
[   61.710110] pass = 2, prio = 2, sc.nr_reclaimed = 0
[   61.710119] pass = 2, prio = 1, sc.nr_reclaimed = 0
[   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
[   61.823396] pass = 3, prio = 12, sc.nr_reclaimed = 0
[   61.823406] pass = 3, prio = 11, sc.nr_reclaimed = 0
[   61.823416] pass = 3, prio = 10, sc.nr_reclaimed = 0
[   61.823425] pass = 3, prio = 9, sc.nr_reclaimed = 0
[   61.823435] pass = 3, prio = 8, sc.nr_reclaimed = 0
[   61.823444] pass = 3, prio = 7, sc.nr_reclaimed = 0
[   61.823454] pass = 3, prio = 6, sc.nr_reclaimed = 0
[   61.823464] pass = 3, prio = 5, sc.nr_reclaimed = 0
[   61.823473] pass = 3, prio = 4, sc.nr_reclaimed = 0
[   61.823483] pass = 3, prio = 3, sc.nr_reclaimed = 0
[   61.823492] pass = 3, prio = 2, sc.nr_reclaimed = 0
[   61.823502] pass = 3, prio = 1, sc.nr_reclaimed = 0
[   62.586707] pass = 3, prio = 0, sc.nr_reclaimed = 2716
[   62.608037] pass = 4, prio = 12, sc.nr_reclaimed = 3070
[   62.634644] pass = 4, prio = 11, sc.nr_reclaimed = 5048
[   62.638782] pass = 4, prio = 10, sc.nr_reclaimed = 192
[   62.740052] pass = 4, prio = 9, sc.nr_reclaimed = 0
[   62.843385] pass = 4, prio = 8, sc.nr_reclaimed = 640
[   62.946726] pass = 4, prio = 7, sc.nr_reclaimed = 640
[   63.046711] pass = 4, prio = 6, sc.nr_reclaimed = 640
[   63.146717] pass = 4, prio = 5, sc.nr_reclaimed = 608
[   63.246712] pass = 4, prio = 4, sc.nr_reclaimed = 600
[   63.346704] pass = 4, prio = 3, sc.nr_reclaimed = 128
[   63.446698] pass = 4, prio = 2, sc.nr_reclaimed = 0
[   63.546705] pass = 4, prio = 1, sc.nr_reclaimed = 0
[   63.646698] pass = 4, prio = 0, sc.nr_reclaimed = 0
[   63.646708] after: sc.nr_reclaimed = 0
[   63.646715] shrink_all_memory(10000) failed
[   63.646722] Mem-Info:
[   63.646729] DMA per-cpu:
[   63.646738] CPU    0: hi:    0, btch:   1 usd:   0
[   63.646745] Normal per-cpu:
[   63.646753] CPU    0: hi:  186, btch:  31 usd: 182
[   63.646769] Active_anon:0 active_file:0 inactive_anon:0
[   63.646773]  inactive_file:22 unevictable:1098 dirty:0 writeback:0 unstable:0
[   63.646779]  free:91278 slab:1984 mapped:953 pagetables:529 bounce:0
[   63.646796] DMA free:8008kB min:88kB low:108kB high:132kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15804kB pages_scanned:0 all_unreclaimable? no
[   63.646809] lowmem_reserve[]: 0 483 483
[   63.646833] Normal free:357104kB min:2764kB low:3452kB high:4144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:88kB unevictable:4392kB present:495300kB pages_scanned:176 all_unreclaimable? no
[   63.646847] lowmem_reserve[]: 0 0 0
[   63.646862] DMA: 2*4kB 6*8kB 3*16kB 3*32kB 2*64kB 2*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 8008kB
[   63.646906] Normal: 298*4kB 269*8kB 214*16kB 154*32kB 119*64kB 75*128kB 66*256kB 44*512kB 26*1024kB 14*2048kB 57*4096kB = 357104kB
[   63.646951] 972 total pagecache pages
[   63.646958] 0 pages in swap cache
[   63.646966] Swap cache stats: add 6615, delete 6615, find 0/0
[   63.646974] Free swap  = 497820kB
[   63.646980] Total swap = 524280kB
[   63.652111] 128880 pages RAM
[   63.652119] 3192 pages reserved
[   63.652126] 37178 pages shared
[   63.652132] 4437 pages non-shared
[   63.652177] Restarting tasks ... done.
[   63.671646] PM: Basic memory bitmaps freed
[   64.521472] PM: Removing info for No Bus:vcs63
[   64.521642] PM: Removing info for No Bus:vcsa63
[   65.267700] atl2 0000:03:00.0: irq 27 for MSI/MSI-X
[   65.268233] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   65.270982] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
[   65.271111] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   75.913359] eth0: no IPv6 routers present
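
A quick cross-check of the Mem-Info dump above, as a minimal sketch rather than anything taken from the kernel: the per-order free block counts on the "DMA:" and "Normal:" lines should add up to the totals printed at the end of each line. The helper name buddy_free_kb() and the two arrays are made up for illustration, and the 4kB base page size is an assumption (it is consistent with the "2*4kB" entries in the dump).

```c
#include <stddef.h>

/* Free-block counts per buddy order, copied from the "DMA:" and "Normal:"
 * lines of the dump (index 0 = 4kB blocks ... index 10 = 4096kB blocks). */
const unsigned long dma_blocks[11] = { 2, 6, 3, 3, 2, 2, 1, 0, 1, 1, 1 };
const unsigned long normal_blocks[11] = {
	298, 269, 214, 154, 119, 75, 66, 44, 26, 14, 57
};

/* Sum count[order] * (4kB << order) over all orders.
 * Hypothetical helper; the 4kB base page size is an assumption. */
unsigned long buddy_free_kb(const unsigned long *count, size_t orders)
{
	unsigned long total_kb = 0;
	size_t order;

	for (order = 0; order < orders; order++)
		total_kb += count[order] * (4UL << order);
	return total_kb;
}
```

The two sums come to 8008kB and 357104kB, matching the dump, and together they equal the reported free:91278 pages at 4kB each. So a large amount of memory was already free at the point where shrink_all_memory(10000) reported failure.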


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-17 23:56     ` Laurent Pinchart
@ 2009-04-18 12:29           ` Rafael J. Wysocki
       [not found]       ` <200904180156.24366.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
  1 sibling, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-18 12:29 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Ming Lei, Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list-H+wXaHxf7aLQT0dZR+AlfA,
	mchehab-wEGCiKHe2LqWVfeAwA7xHQ

On Saturday 18 April 2009, Laurent Pinchart wrote:
> Hi,
> 
> On Friday 17 April 2009 23:36:11 Rafael J. Wysocki wrote:
> > On Friday 17 April 2009, Ming Lei wrote:
> > > 2009/4/17 Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>:
> > > > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13125
> > > > Subject         : active uvcvideo breaks over suspend
> > > > Submitter       : Alan Jenkins <alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA@public.gmane.org>
> > > > Date            : 2009-04-15 10:12 (2 days old)
> > > > References      :
> > > > http://marc.info/?l=linux-kernel&m=123979009508840&w=4
> > >
> > > It is a bug in the resume path of the uvcvideo driver, and I have sent a patch
> > > to laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org,
> > > mchehab-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org  and video4linux-list-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org to fix it, but
> > > there has been no response from them yet.
> > >
> > > The patch title is V4L/DVB:usbvideo:fix uvc resume failed.
> > >
> > > Rafael J.
> > >         If you would like to apply it, I can resend it to you.  Thanks!
> >
> > Please resend.
> 
> I'm reviewing the patch and I'll push it through my tree during the weekend. 

Great, thanks a lot!

> Sorry for the delay, I'm currently traveling.

No problem at all. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-18  4:51           ` leiming
  (?)
@ 2009-04-18 12:33           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-18 12:33 UTC (permalink / raw)
  To: leiming
  Cc: Linus Torvalds, Linux Kernel Mailing List, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	video4linux-list, laurent.pinchart, mchehab

On Saturday 18 April 2009, leiming wrote:
> On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
> > > uvc_video_device *video, 
> > >  	/* Buffers are already allocated, bail out. */
> > >  	if (video->urb_size)
> > > -		return 0;
> > > +		return DIV_ROUND_UP(video->urb_size, psize);
> > 
> > I don't think this is right. It should round _down_.
> > 
> > It's supposed to return 'npackets', but if you pass it a different
> > packet size than it was passed originally, it can now return a
> > potentially bigger number than the already allocated buffer, no?
> > 
> > So I think it should round down (ie use a regular divide). No?
> 
> Yes, you are correct. Please ignore my last reply; the fixed patch
> follows.
> 
> Thanks.

Thanks for the patch, I've updated the bug entry to point to it.

Best,
Rafael


> From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17 00:00:00 2001
> From: Ming Lei <tom.leiming@gmail.com>
> Date: Wed, 15 Apr 2009 22:32:51 +0800
> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)
> 
> URB buffers are now not freed before suspend, so during uvc resume
> uvc_alloc_urb_buffers() should return the packet count that was
> originally allocated, instead of zero.
> 
> This version rounds down when computing the packet count, as Linus
> suggested; otherwise the buffer may be overrun if the packet size
> is changed before uvc_alloc_urb_buffers() is called in this kind of
> case.
> 
> This patch is against v2.6.30-rc2.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/media/video/uvc/uvc_video.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/media/video/uvc/uvc_video.c b/drivers/media/video/uvc/uvc_video.c
> index a95e173..6ce974d 100644
> --- a/drivers/media/video/uvc/uvc_video.c
> +++ b/drivers/media/video/uvc/uvc_video.c
> @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct uvc_video_device *video,
>  
>  	/* Buffers are already allocated, bail out. */
>  	if (video->urb_size)
> -		return 0;
> +		return video->urb_size / psize;
>  
>  	/* Compute the number of packets. Bulk endpoints might transfer UVC
>  	 * payloads accross multiple URBs.
> -- 
> 1.6.0.GIT
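
To illustrate the rounding argument quoted above, here is a minimal sketch in plain C. The function names are hypothetical and this is not the uvcvideo driver's code; it only models the arithmetic. packets_round_up() computes what DIV_ROUND_UP() would, packets_round_down() is the plain divide used in v2, and bytes_needed() is the space a caller would touch if it trusted the returned packet count. psize is assumed to be nonzero.

```c
/* Hypothetical helpers for illustration only, not driver code. */

/* Packet count as DIV_ROUND_UP(urb_size, psize) would compute it. */
unsigned int packets_round_up(unsigned int urb_size, unsigned int psize)
{
	return (urb_size + psize - 1) / psize;
}

/* Packet count with a plain integer divide, as in the v2 patch. */
unsigned int packets_round_down(unsigned int urb_size, unsigned int psize)
{
	return urb_size / psize;
}

/* Bytes a caller would use if it filled npackets packets of psize bytes. */
unsigned int bytes_needed(unsigned int npackets, unsigned int psize)
{
	return npackets * psize;
}
```

With an example buffer of 3072 bytes (3 packets of 1024) and a later packet size of 1000, rounding up gives 4 packets, i.e. 4000 bytes, which is more than was allocated; rounding down gives 3 packets, i.e. 3000 bytes, which still fits. The sizes are made-up numbers chosen only to show the effect.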

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-18 12:38                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-18 12:38 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Linus Torvalds, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

On Saturday 18 April 2009, Alan Jenkins wrote:
> Linus Torvalds wrote:
> > On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
> >   
> >> Can you please try to reproduce the problem with the appended debug patch
> >> applied and send the output of dmesg to me?
> >>     
> >
> > Maybe something like this instead (or in addition to).
> >
> > It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
> > data.
> >
> > Untested, but trivial.
> >
> > 		Linus
> > ---
> >   
> 
> Ok, I applied both your and Rafael's debug patches.  dmesg attached.
> 
> After the failed hibernation, I noticed my touchpad wasn't working.  But 
> I think that's something else.  I had another go and couldn't reproduce 
> that.  It's happened to me once before while testing 2.6.30-; I've also 
> had the keyboard stop working at least once.  I'm hoping it's the same 
> bug as "20 ACPI interrupts per second on EEEPC" bug.  It could be 
> overloading my bug-ridden EC, which also acts as the keyboard controller.

Thanks for testing!

Clearly, sc.nr_reclaimed is reset in each iteration of the loop in
shrink_all_memory():

[   61.135207] PM: Shrinking memory...  <6>before: sc.nr_reclaimed = 0
[   61.180993] pass = 0, prio = 12, sc.nr_reclaimed = 0
[   61.181004] pass = 0, prio = 11, sc.nr_reclaimed = 0
[   61.181014] pass = 0, prio = 10, sc.nr_reclaimed = 0
[   61.181024] pass = 0, prio = 9, sc.nr_reclaimed = 0
[   61.181033] pass = 0, prio = 8, sc.nr_reclaimed = 0
[   61.181043] pass = 0, prio = 7, sc.nr_reclaimed = 0
[   61.186509] pass = 0, prio = 6, sc.nr_reclaimed = 0
[   61.186525] pass = 0, prio = 5, sc.nr_reclaimed = 0
[   61.186534] pass = 0, prio = 4, sc.nr_reclaimed = 0
[   61.186544] pass = 0, prio = 3, sc.nr_reclaimed = 0
[   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
[   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
[   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
[   61.556729] pass = 1, prio = 12, sc.nr_reclaimed = 0
[   61.556739] pass = 1, prio = 11, sc.nr_reclaimed = 0
[   61.556749] pass = 1, prio = 10, sc.nr_reclaimed = 0
[   61.556759] pass = 1, prio = 9, sc.nr_reclaimed = 0
[   61.556768] pass = 1, prio = 8, sc.nr_reclaimed = 0
[   61.556778] pass = 1, prio = 7, sc.nr_reclaimed = 0
[   61.556787] pass = 1, prio = 6, sc.nr_reclaimed = 0
[   61.556797] pass = 1, prio = 5, sc.nr_reclaimed = 0
[   61.556806] pass = 1, prio = 4, sc.nr_reclaimed = 0
[   61.556816] pass = 1, prio = 3, sc.nr_reclaimed = 0
[   61.556825] pass = 1, prio = 2, sc.nr_reclaimed = 0
[   61.556835] pass = 1, prio = 1, sc.nr_reclaimed = 0
[   61.595841] pass = 1, prio = 0, sc.nr_reclaimed = 0
[   61.595854] pass = 2, prio = 12, sc.nr_reclaimed = 0
[   61.595864] pass = 2, prio = 11, sc.nr_reclaimed = 0
[   61.595873] pass = 2, prio = 10, sc.nr_reclaimed = 0
[   61.595883] pass = 2, prio = 9, sc.nr_reclaimed = 0
[   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
[   61.710062] pass = 2, prio = 7, sc.nr_reclaimed = 0
[   61.710072] pass = 2, prio = 6, sc.nr_reclaimed = 0
[   61.710081] pass = 2, prio = 5, sc.nr_reclaimed = 0
[   61.710091] pass = 2, prio = 4, sc.nr_reclaimed = 0
[   61.710100] pass = 2, prio = 3, sc.nr_reclaimed = 0
[   61.710110] pass = 2, prio = 2, sc.nr_reclaimed = 0
[   61.710119] pass = 2, prio = 1, sc.nr_reclaimed = 0
[   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
[   61.823396] pass = 3, prio = 12, sc.nr_reclaimed = 0
[   61.823406] pass = 3, prio = 11, sc.nr_reclaimed = 0
[   61.823416] pass = 3, prio = 10, sc.nr_reclaimed = 0
[   61.823425] pass = 3, prio = 9, sc.nr_reclaimed = 0
[   61.823435] pass = 3, prio = 8, sc.nr_reclaimed = 0
[   61.823444] pass = 3, prio = 7, sc.nr_reclaimed = 0
[   61.823454] pass = 3, prio = 6, sc.nr_reclaimed = 0
[   61.823464] pass = 3, prio = 5, sc.nr_reclaimed = 0
[   61.823473] pass = 3, prio = 4, sc.nr_reclaimed = 0
[   61.823483] pass = 3, prio = 3, sc.nr_reclaimed = 0
[   61.823492] pass = 3, prio = 2, sc.nr_reclaimed = 0
[   61.823502] pass = 3, prio = 1, sc.nr_reclaimed = 0
[   62.586707] pass = 3, prio = 0, sc.nr_reclaimed = 2716
[   62.608037] pass = 4, prio = 12, sc.nr_reclaimed = 3070
[   62.634644] pass = 4, prio = 11, sc.nr_reclaimed = 5048
[   62.638782] pass = 4, prio = 10, sc.nr_reclaimed = 192
[   62.740052] pass = 4, prio = 9, sc.nr_reclaimed = 0
[   62.843385] pass = 4, prio = 8, sc.nr_reclaimed = 640
[   62.946726] pass = 4, prio = 7, sc.nr_reclaimed = 640
[   63.046711] pass = 4, prio = 6, sc.nr_reclaimed = 640
[   63.146717] pass = 4, prio = 5, sc.nr_reclaimed = 608
[   63.246712] pass = 4, prio = 4, sc.nr_reclaimed = 600
[   63.346704] pass = 4, prio = 3, sc.nr_reclaimed = 128
[   63.446698] pass = 4, prio = 2, sc.nr_reclaimed = 0
[   63.546705] pass = 4, prio = 1, sc.nr_reclaimed = 0
[   63.646698] pass = 4, prio = 0, sc.nr_reclaimed = 0
[   63.646708] after: sc.nr_reclaimed = 0
[   63.646715] shrink_all_memory(10000) failed

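which is evidently done by shrink_all_zones(): it assigns its local count
to sc->nr_reclaimed instead of adding to it, so whatever the earlier
iterations reclaimed is lost.  Sigh.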

The appended patch should help, please verify.

Thanks,
Rafael


---
 mm/vmscan.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2088,13 +2088,13 @@ static void shrink_all_zones(unsigned lo
 				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
 								sc, prio);
 				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed = nr_reclaimed;
+					sc->nr_reclaimed += nr_reclaimed;
 					return;
 				}
 			}
 		}
 	}
-	sc->nr_reclaimed = nr_reclaimed;
+	sc->nr_reclaimed += nr_reclaimed;
 }
 
 /*
@@ -2115,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned
 		.may_unmap = 0,
 		.may_writepage = 1,
 		.isolate_pages = isolate_pages_global,
+		.nr_reclaimed = 0,
 	};
 
 	current->reclaim_state = &reclaim_state;
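
The effect of the one-character change from "=" to "+=" can be modelled outside the kernel. What follows is a minimal sketch under stated assumptions, not the vmscan code: struct scan_control_model, record_overwrite(), record_accumulate() and replay() are hypothetical names, and only the nr_reclaimed field is modelled. It also assumes that each logged value is the count left behind by one shrink_all_zones() call, which depends on where the debug printk sits.

```c
/* Minimal stand-in for the kernel's struct scan_control; only the field
 * discussed in this thread is modelled. */
struct scan_control_model {
	unsigned long nr_reclaimed;
};

/* Pre-patch behaviour: each call overwrites the running total. */
void record_overwrite(struct scan_control_model *sc, unsigned long nr_reclaimed)
{
	sc->nr_reclaimed = nr_reclaimed;
}

/* Post-patch behaviour: each call adds to the running total. */
void record_accumulate(struct scan_control_model *sc, unsigned long nr_reclaimed)
{
	sc->nr_reclaimed += nr_reclaimed;
}

/* Replay a sequence of per-call reclaim counts through one of the two
 * recording functions and return the total left in the model. */
unsigned long replay(void (*record)(struct scan_control_model *, unsigned long),
		     const unsigned long *counts, unsigned int n)
{
	struct scan_control_model sc = { .nr_reclaimed = 0 };
	unsigned int i;

	for (i = 0; i < n; i++)
		record(&sc, counts[i]);
	return sc.nr_reclaimed;
}
```

Replaying the thirteen "pass = 4" values from the log (3070, 5048, 192, 0, 640, 640, 640, 608, 600, 128, 0, 0, 0) leaves 0 with the overwriting variant, which matches "after: sc.nr_reclaimed = 0", while the accumulating variant ends at 11566, above the 10000 pages requested.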

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-18 12:57                               ` Alan Jenkins
  0 siblings, 0 replies; 580+ messages in thread
From: Alan Jenkins @ 2009-04-18 12:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

Rafael J. Wysocki wrote:
> On Saturday 18 April 2009, Alan Jenkins wrote:
>   
>> Linus Torvalds wrote:
>>     
>>> On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
>>>   
>>>       
>>>> Can you please try to reproduce the problem with the appended debug patch
>>>> applied and send the output of dmesg to me?
>>>>     
>>>>         
>>> Maybe something like this instead (or in addition to).
>>>
>>> It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
>>> data.
>>>
>>> Untested, but trivial.
>>>
>>> 		Linus
>>> ---
>>>   
>>>       
>> Ok, I applied both your and Rafael's debug patches.  dmesg attached.
>>
>> After the failed hibernation, I noticed my touchpad wasn't working.  But 
>> I think that's something else.  I had another go and couldn't reproduce 
>> that.  It's happened to me once before while testing 2.6.30-; I've also 
>> had the keyboard stop working at least once.  I'm hoping it's the same 
>> bug as "20 ACPI interrupts per second on EEEPC" bug.  It could be 
>> overloading my bug-ridden EC, which also acts as the keyboard controller.
>>     
>
> Thanks for testing!
>
> Clearly, sc.nr_reclaimed is reset in each iteration of the loop in
> shrink_all_memory():
>
> [   61.135207] PM: Shrinking memory...  <6>before: sc.nr_reclaimed = 0
> [   61.180993] pass = 0, prio = 12, sc.nr_reclaimed = 0
> [   61.181004] pass = 0, prio = 11, sc.nr_reclaimed = 0
> [   61.181014] pass = 0, prio = 10, sc.nr_reclaimed = 0
> [   61.181024] pass = 0, prio = 9, sc.nr_reclaimed = 0
> [   61.181033] pass = 0, prio = 8, sc.nr_reclaimed = 0
> [   61.181043] pass = 0, prio = 7, sc.nr_reclaimed = 0
> [   61.186509] pass = 0, prio = 6, sc.nr_reclaimed = 0
> [   61.186525] pass = 0, prio = 5, sc.nr_reclaimed = 0
> [   61.186534] pass = 0, prio = 4, sc.nr_reclaimed = 0
> [   61.186544] pass = 0, prio = 3, sc.nr_reclaimed = 0
> [   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
> [   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
> [   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
> [   61.556729] pass = 1, prio = 12, sc.nr_reclaimed = 0
> [   61.556739] pass = 1, prio = 11, sc.nr_reclaimed = 0
> [   61.556749] pass = 1, prio = 10, sc.nr_reclaimed = 0
> [   61.556759] pass = 1, prio = 9, sc.nr_reclaimed = 0
> [   61.556768] pass = 1, prio = 8, sc.nr_reclaimed = 0
> [   61.556778] pass = 1, prio = 7, sc.nr_reclaimed = 0
> [   61.556787] pass = 1, prio = 6, sc.nr_reclaimed = 0
> [   61.556797] pass = 1, prio = 5, sc.nr_reclaimed = 0
> [   61.556806] pass = 1, prio = 4, sc.nr_reclaimed = 0
> [   61.556816] pass = 1, prio = 3, sc.nr_reclaimed = 0
> [   61.556825] pass = 1, prio = 2, sc.nr_reclaimed = 0
> [   61.556835] pass = 1, prio = 1, sc.nr_reclaimed = 0
> [   61.595841] pass = 1, prio = 0, sc.nr_reclaimed = 0
> [   61.595854] pass = 2, prio = 12, sc.nr_reclaimed = 0
> [   61.595864] pass = 2, prio = 11, sc.nr_reclaimed = 0
> [   61.595873] pass = 2, prio = 10, sc.nr_reclaimed = 0
> [   61.595883] pass = 2, prio = 9, sc.nr_reclaimed = 0
> [   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
> [   61.710062] pass = 2, prio = 7, sc.nr_reclaimed = 0
> [   61.710072] pass = 2, prio = 6, sc.nr_reclaimed = 0
> [   61.710081] pass = 2, prio = 5, sc.nr_reclaimed = 0
> [   61.710091] pass = 2, prio = 4, sc.nr_reclaimed = 0
> [   61.710100] pass = 2, prio = 3, sc.nr_reclaimed = 0
> [   61.710110] pass = 2, prio = 2, sc.nr_reclaimed = 0
> [   61.710119] pass = 2, prio = 1, sc.nr_reclaimed = 0
> [   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
> [   61.823396] pass = 3, prio = 12, sc.nr_reclaimed = 0
> [   61.823406] pass = 3, prio = 11, sc.nr_reclaimed = 0
> [   61.823416] pass = 3, prio = 10, sc.nr_reclaimed = 0
> [   61.823425] pass = 3, prio = 9, sc.nr_reclaimed = 0
> [   61.823435] pass = 3, prio = 8, sc.nr_reclaimed = 0
> [   61.823444] pass = 3, prio = 7, sc.nr_reclaimed = 0
> [   61.823454] pass = 3, prio = 6, sc.nr_reclaimed = 0
> [   61.823464] pass = 3, prio = 5, sc.nr_reclaimed = 0
> [   61.823473] pass = 3, prio = 4, sc.nr_reclaimed = 0
> [   61.823483] pass = 3, prio = 3, sc.nr_reclaimed = 0
> [   61.823492] pass = 3, prio = 2, sc.nr_reclaimed = 0
> [   61.823502] pass = 3, prio = 1, sc.nr_reclaimed = 0
> [   62.586707] pass = 3, prio = 0, sc.nr_reclaimed = 2716
> [   62.608037] pass = 4, prio = 12, sc.nr_reclaimed = 3070
> [   62.634644] pass = 4, prio = 11, sc.nr_reclaimed = 5048
> [   62.638782] pass = 4, prio = 10, sc.nr_reclaimed = 192
> [   62.740052] pass = 4, prio = 9, sc.nr_reclaimed = 0
> [   62.843385] pass = 4, prio = 8, sc.nr_reclaimed = 640
> [   62.946726] pass = 4, prio = 7, sc.nr_reclaimed = 640
> [   63.046711] pass = 4, prio = 6, sc.nr_reclaimed = 640
> [   63.146717] pass = 4, prio = 5, sc.nr_reclaimed = 608
> [   63.246712] pass = 4, prio = 4, sc.nr_reclaimed = 600
> [   63.346704] pass = 4, prio = 3, sc.nr_reclaimed = 128
> [   63.446698] pass = 4, prio = 2, sc.nr_reclaimed = 0
> [   63.546705] pass = 4, prio = 1, sc.nr_reclaimed = 0
> [   63.646698] pass = 4, prio = 0, sc.nr_reclaimed = 0
> [   63.646708] after: sc.nr_reclaimed = 0
> [   63.646715] shrink_all_memory(10000) failed
>
> which obviously is done by shrink_all_zones().  Sigh.
>
> The appended patch should help, please verify.
>   

Yes, that fixes it.

Thanks
Alan

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH] PM/Hibernate: Fix memory shrinking (Re: [Bug #13058] First hibernation attempt fails)
@ 2009-04-18 15:23                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-18 15:23 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: Linus Torvalds, Jens Axboe, Linux Kernel Mailing List,
	Kernel Testers List

On Saturday 18 April 2009, Alan Jenkins wrote:
> Rafael J. Wysocki wrote:
> > On Saturday 18 April 2009, Alan Jenkins wrote:
> >   
> >> Linus Torvalds wrote:
> >>     
> >>> On Fri, 17 Apr 2009, Rafael J. Wysocki wrote:
> >>>   
> >>>       
> >>>> Can you please try to reproduce the problem with the appended debug patch
> >>>> applied and send the output of dmesg to me?
> >>>>     
> >>>>         
> >>> Maybe something like this instead (or in addition to).
> >>>
> >>> It does "show_mem()" when memory shrinking fails. It will show a _lot_ of 
> >>> data.
> >>>
> >>> Untested, but trivial.
> >>>
> >>> 		Linus
> >>> ---
> >>>   
> >>>       
> >> Ok, I applied both your and Rafael's debug patches.  dmesg attached.
> >>
> >> After the failed hibernation, I noticed my touchpad wasn't working.  But 
> >> I think that's something else.  I had another go and couldn't reproduce 
> >> that.  It's happened to me once before while testing 2.6.30-; I've also 
> >> had the keyboard stop working at least once.  I'm hoping it's the same 
> >> bug as "20 ACPI interrupts per second on EEEPC" bug.  It could be 
> >> overloading my bug-ridden EC, which also acts as the keyboard controller.
> >>     
> >
> > Thanks for testing!
> >
> > Clearly, sc.nr_reclaimed is reset in each iteration of the loop in
> > shrink_all_memory():
> >
> > [   61.135207] PM: Shrinking memory...  <6>before: sc.nr_reclaimed = 0
> > [   61.180993] pass = 0, prio = 12, sc.nr_reclaimed = 0
> > [   61.181004] pass = 0, prio = 11, sc.nr_reclaimed = 0
> > [   61.181014] pass = 0, prio = 10, sc.nr_reclaimed = 0
> > [   61.181024] pass = 0, prio = 9, sc.nr_reclaimed = 0
> > [   61.181033] pass = 0, prio = 8, sc.nr_reclaimed = 0
> > [   61.181043] pass = 0, prio = 7, sc.nr_reclaimed = 0
> > [   61.186509] pass = 0, prio = 6, sc.nr_reclaimed = 0
> > [   61.186525] pass = 0, prio = 5, sc.nr_reclaimed = 0
> > [   61.186534] pass = 0, prio = 4, sc.nr_reclaimed = 0
> > [   61.186544] pass = 0, prio = 3, sc.nr_reclaimed = 0
> > [   61.333383] pass = 0, prio = 2, sc.nr_reclaimed = 7746
> > [   61.436711] pass = 0, prio = 1, sc.nr_reclaimed = 1957
> > [   61.556712] pass = 0, prio = 0, sc.nr_reclaimed = 4528
> > [   61.556729] pass = 1, prio = 12, sc.nr_reclaimed = 0
> > [   61.556739] pass = 1, prio = 11, sc.nr_reclaimed = 0
> > [   61.556749] pass = 1, prio = 10, sc.nr_reclaimed = 0
> > [   61.556759] pass = 1, prio = 9, sc.nr_reclaimed = 0
> > [   61.556768] pass = 1, prio = 8, sc.nr_reclaimed = 0
> > [   61.556778] pass = 1, prio = 7, sc.nr_reclaimed = 0
> > [   61.556787] pass = 1, prio = 6, sc.nr_reclaimed = 0
> > [   61.556797] pass = 1, prio = 5, sc.nr_reclaimed = 0
> > [   61.556806] pass = 1, prio = 4, sc.nr_reclaimed = 0
> > [   61.556816] pass = 1, prio = 3, sc.nr_reclaimed = 0
> > [   61.556825] pass = 1, prio = 2, sc.nr_reclaimed = 0
> > [   61.556835] pass = 1, prio = 1, sc.nr_reclaimed = 0
> > [   61.595841] pass = 1, prio = 0, sc.nr_reclaimed = 0
> > [   61.595854] pass = 2, prio = 12, sc.nr_reclaimed = 0
> > [   61.595864] pass = 2, prio = 11, sc.nr_reclaimed = 0
> > [   61.595873] pass = 2, prio = 10, sc.nr_reclaimed = 0
> > [   61.595883] pass = 2, prio = 9, sc.nr_reclaimed = 0
> > [   61.710044] pass = 2, prio = 8, sc.nr_reclaimed = 2895
> > [   61.710062] pass = 2, prio = 7, sc.nr_reclaimed = 0
> > [   61.710072] pass = 2, prio = 6, sc.nr_reclaimed = 0
> > [   61.710081] pass = 2, prio = 5, sc.nr_reclaimed = 0
> > [   61.710091] pass = 2, prio = 4, sc.nr_reclaimed = 0
> > [   61.710100] pass = 2, prio = 3, sc.nr_reclaimed = 0
> > [   61.710110] pass = 2, prio = 2, sc.nr_reclaimed = 0
> > [   61.710119] pass = 2, prio = 1, sc.nr_reclaimed = 0
> > [   61.823378] pass = 2, prio = 0, sc.nr_reclaimed = 1802
> > [   61.823396] pass = 3, prio = 12, sc.nr_reclaimed = 0
> > [   61.823406] pass = 3, prio = 11, sc.nr_reclaimed = 0
> > [   61.823416] pass = 3, prio = 10, sc.nr_reclaimed = 0
> > [   61.823425] pass = 3, prio = 9, sc.nr_reclaimed = 0
> > [   61.823435] pass = 3, prio = 8, sc.nr_reclaimed = 0
> > [   61.823444] pass = 3, prio = 7, sc.nr_reclaimed = 0
> > [   61.823454] pass = 3, prio = 6, sc.nr_reclaimed = 0
> > [   61.823464] pass = 3, prio = 5, sc.nr_reclaimed = 0
> > [   61.823473] pass = 3, prio = 4, sc.nr_reclaimed = 0
> > [   61.823483] pass = 3, prio = 3, sc.nr_reclaimed = 0
> > [   61.823492] pass = 3, prio = 2, sc.nr_reclaimed = 0
> > [   61.823502] pass = 3, prio = 1, sc.nr_reclaimed = 0
> > [   62.586707] pass = 3, prio = 0, sc.nr_reclaimed = 2716
> > [   62.608037] pass = 4, prio = 12, sc.nr_reclaimed = 3070
> > [   62.634644] pass = 4, prio = 11, sc.nr_reclaimed = 5048
> > [   62.638782] pass = 4, prio = 10, sc.nr_reclaimed = 192
> > [   62.740052] pass = 4, prio = 9, sc.nr_reclaimed = 0
> > [   62.843385] pass = 4, prio = 8, sc.nr_reclaimed = 640
> > [   62.946726] pass = 4, prio = 7, sc.nr_reclaimed = 640
> > [   63.046711] pass = 4, prio = 6, sc.nr_reclaimed = 640
> > [   63.146717] pass = 4, prio = 5, sc.nr_reclaimed = 608
> > [   63.246712] pass = 4, prio = 4, sc.nr_reclaimed = 600
> > [   63.346704] pass = 4, prio = 3, sc.nr_reclaimed = 128
> > [   63.446698] pass = 4, prio = 2, sc.nr_reclaimed = 0
> > [   63.546705] pass = 4, prio = 1, sc.nr_reclaimed = 0
> > [   63.646698] pass = 4, prio = 0, sc.nr_reclaimed = 0
> > [   63.646708] after: sc.nr_reclaimed = 0
> > [   63.646715] shrink_all_memory(10000) failed
> >
> > which obviously is done by shrink_all_zones().  Sigh.
> >
> > The appended patch should help, please verify.
> >   
> 
> Yes, that fixes it.

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM/Hibernate: Fix memory shrinking

Commit d979677c4c02f0a72db5a03ecd8184bd9d6695c8
(mm: shrink_all_memory(): use sc.nr_reclaimed) broke the memory
shrinking used by hibernation, because it did not update
shrink_all_zones() in accordance with the other changes it made.
Fix this by making shrink_all_zones() update sc->nr_reclaimed instead
of overwriting its value.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 mm/vmscan.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2088,13 +2088,13 @@ static void shrink_all_zones(unsigned lo
 				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
 								sc, prio);
 				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed = nr_reclaimed;
+					sc->nr_reclaimed += nr_reclaimed;
 					return;
 				}
 			}
 		}
 	}
-	sc->nr_reclaimed = nr_reclaimed;
+	sc->nr_reclaimed += nr_reclaimed;
 }
 
 /*
@@ -2115,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned
 		.may_unmap = 0,
 		.may_writepage = 1,
 		.isolate_pages = isolate_pages_global,
+		.nr_reclaimed = 0,
 	};
 
 	current->reclaim_state = &reclaim_state;

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
@ 2009-04-19  4:30           ` David Miller
  0 siblings, 0 replies; 580+ messages in thread
From: David Miller @ 2009-04-19  4:30 UTC (permalink / raw)
  To: holt
  Cc: mcarlson, rjw, linux-kernel, kernel-testers, benli, mchan,
	James.Bottomley

From: Robin Holt <holt@sgi.com>
Date: Fri, 17 Apr 2009 07:21:21 -0500

>> Actually, I think we do have a fix for this.  James and Robin both
>> reported that the test patch I sent out worked for them.  I'm preparing
>> a patchset for submission now.
>> 
>> James, Robin, can you confirm that you performed your tests with David's
>> patch reverted?
> 
> My test was done with the 2.6.28-rc1 kernel plus your patch and no
> others.

Can I get a final patch against Linus's current tree?

Thanks.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13068] Lockdep warining in inotify_dev_queue_event
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-19  9:36     ` Sachin Sant
  -1 siblings, 0 replies; 580+ messages in thread
From: Sachin Sant @ 2009-04-19  9:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Kernel Testers List

Rafael J. Wysocki wrote:
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13068
> Subject		: Lockdep warining in inotify_dev_queue_event
> Submitter	: Sachin Sant <sachinp@in.ibm.com>
> Date		: 2009-04-05 12:37 (12 days old)
> References	: http://marc.info/?l=linux-kernel&m=123893439229272&w=4
>   
I can recreate this with latest kernel. Easiest way is to use the LTP mm 
tests (runltp -f mm )

=================================
[ INFO: inconsistent lock state ]
2.6.30-rc2-git4 #4
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/334 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&inode->inotify_mutex){+.+.?.}, at: [<c000000000166cc0>] .inotify_inode_is_dead+0x38/0xc8
{RECLAIM_FS-ON-W} state was registered at:
  [<c000000000098d74>] .lockdep_trace_alloc+0xc4/0xf4
  [<c000000000122128>] .__kmalloc+0x100/0x274
  [<c0000000001682fc>] .kernel_event+0xb8/0x154
  [<c0000000001684b8>] .inotify_dev_queue_event+0x120/0x1cc
  [<c000000000166b2c>] .inotify_inode_queue_event+0xf0/0x160
  [<c000000000139c08>] .vfs_create+0x170/0x1dc
  [<c00000000013d5c0>] .do_filp_open+0x25c/0x964
  [<c00000000012b414>] .do_sys_open+0x80/0x140
  [<c00000000012b2f0>] .SyS_creat+0x18/0x2c
  [<c000000000008554>] syscall_exit+0x0/0x40
irq event stamp: 75815
hardirqs last  enabled at (75815): [<c0000000000cb824>] .__call_rcu+0x128/0x15c
hardirqs last disabled at (75814): [<c0000000000cb748>] .__call_rcu+0x4c/0x15c
softirqs last  enabled at (73084): [<c00000000002be8c>] .call_do_softirq+0x14/0x24
softirqs last disabled at (73071): [<c00000000002be8c>] .call_do_softirq+0x14/0x24

other info that might help us debug this:
2 locks held by kswapd0/334:
 #0:  (shrinker_rwsem){++++..}, at: [<c0000000000f57c8>] .shrink_slab+0x5c/0x228
 #1:  (&type->s_umount_key#15){++++..}, at: [<c000000000142f00>] .shrink_dcache_memory+0xfc/0x244

stack backtrace:
Call Trace:
[c00000004473b440] [c000000000011a54] .show_stack+0x6c/0x16c (unreliable)
[c00000004473b4f0] [c0000000000984d8] .print_usage_bug+0x1c0/0x1f4
[c00000004473b5b0] [c00000000009888c] .mark_lock+0x380/0x6e4
[c00000004473b660] [c00000000009a99c] .__lock_acquire+0x7a8/0x17b4
[c00000004473b760] [c00000000009bab0] .lock_acquire+0x108/0x154
[c00000004473b820] [c0000000005a8bac] .mutex_lock_nested+0x88/0x460
[c00000004473b920] [c000000000166cc0] .inotify_inode_is_dead+0x38/0xc8
[c00000004473b9d0] [c0000000001427f4] .dentry_iput+0xa0/0x128
[c00000004473ba60] [c0000000001429f0] .d_kill+0x5c/0xa0
[c00000004473baf0] [c000000000142d38] .__shrink_dcache_sb+0x304/0x3d0
[c00000004473bbd0] [c000000000142f48] .shrink_dcache_memory+0x144/0x244
[c00000004473bcb0] [c0000000000f58c8] .shrink_slab+0x15c/0x228
[c00000004473bd70] [c0000000000f61a4] .kswapd+0x4c0/0x678
[c00000004473bf00] [c000000000088ba4] .kthread+0x80/0xcc
[c00000004473bf90] [c00000000002c194] .kernel_thread+0x54/0x70
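
One way to read the report above: lockdep records, per lock class, whether
the lock has ever been held across an allocation that may recurse into
filesystem reclaim (RECLAIM_FS-ON) and whether it has ever been taken from
the reclaim path itself (IN-RECLAIM_FS). Seeing both on inotify_mutex means
kswapd could block on a lock whose holder is waiting for reclaim. The
following is only a toy userspace model of that bookkeeping, not lockdep's
actual implementation; record_usage(), usage_is_inconsistent() and the two
flag constants are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical usage bits, loosely modelled on the two states in the report. */
#define HELD_OVER_FS_RECLAIM 0x1u /* lock held while an allocation may enter FS reclaim */
#define USED_IN_RECLAIM_FS   0x2u /* lock acquired from inside the reclaim path */

/* Return the usage mask of a lock class after one more observed event. */
static unsigned int record_usage(unsigned int usage, unsigned int bit)
{
	assert(bit == HELD_OVER_FS_RECLAIM || bit == USED_IN_RECLAIM_FS);
	return usage | bit;
}

/* A class that has been seen in both states can deadlock against reclaim. */
static bool usage_is_inconsistent(unsigned int usage)
{
	const unsigned int both = HELD_OVER_FS_RECLAIM | USED_IN_RECLAIM_FS;

	return (usage & both) == both;
}
```

Feeding it the two events from the trace, first HELD_OVER_FS_RECLAIM (the
__kmalloc reached from inotify_dev_queue_event) and then USED_IN_RECLAIM_FS
(kswapd in inotify_inode_is_dead), makes usage_is_inconsistent() return
true, while either event alone does not.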

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13068] Lockdep warining in inotify_dev_queue_event
@ 2009-04-19 10:56       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-19 10:56 UTC (permalink / raw)
  To: Sachin Sant; +Cc: Linux Kernel Mailing List, Kernel Testers List

On Sunday 19 April 2009, Sachin Sant wrote:
> Rafael J. Wysocki wrote:
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13068
> > Subject		: Lockdep warining in inotify_dev_queue_event
> > Submitter	: Sachin Sant <sachinp@in.ibm.com>
> > Date		: 2009-04-05 12:37 (12 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123893439229272&w=4
> >   
> I can recreate this with latest kernel. Easiest way is to use the LTP mm 
> tests (runltp -f mm )
> 
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.30-rc2-git4 #4
> ---------------------------------
> inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> kswapd0/334 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&inode->inotify_mutex){+.+.?.}, at: [<c000000000166cc0>] .inotify_inode_is_dead+0x38/0xc8
> {RECLAIM_FS-ON-W} state was registered at:
>   [<c000000000098d74>] .lockdep_trace_alloc+0xc4/0xf4
>   [<c000000000122128>] .__kmalloc+0x100/0x274
>   [<c0000000001682fc>] .kernel_event+0xb8/0x154
>   [<c0000000001684b8>] .inotify_dev_queue_event+0x120/0x1cc
>   [<c000000000166b2c>] .inotify_inode_queue_event+0xf0/0x160
>   [<c000000000139c08>] .vfs_create+0x170/0x1dc
>   [<c00000000013d5c0>] .do_filp_open+0x25c/0x964
>   [<c00000000012b414>] .do_sys_open+0x80/0x140
>   [<c00000000012b2f0>] .SyS_creat+0x18/0x2c
>   [<c000000000008554>] syscall_exit+0x0/0x40
> irq event stamp: 75815
> hardirqs last  enabled at (75815): [<c0000000000cb824>] .__call_rcu+0x128/0x15c
> hardirqs last disabled at (75814): [<c0000000000cb748>] .__call_rcu+0x4c/0x15c
> softirqs last  enabled at (73084): [<c00000000002be8c>] .call_do_softirq+0x14/0x24
> softirqs last disabled at (73071): [<c00000000002be8c>] .call_do_softirq+0x14/0x24
> 
> other info that might help us debug this:
> 2 locks held by kswapd0/334:
>  #0:  (shrinker_rwsem){++++..}, at: [<c0000000000f57c8>] .shrink_slab+0x5c/0x228
>  #1:  (&type->s_umount_key#15){++++..}, at: [<c000000000142f00>] .shrink_dcache_memory+0xfc/0x244
> 
> stack backtrace:
> Call Trace:
> [c00000004473b440] [c000000000011a54] .show_stack+0x6c/0x16c (unreliable)
> [c00000004473b4f0] [c0000000000984d8] .print_usage_bug+0x1c0/0x1f4
> [c00000004473b5b0] [c00000000009888c] .mark_lock+0x380/0x6e4
> [c00000004473b660] [c00000000009a99c] .__lock_acquire+0x7a8/0x17b4
> [c00000004473b760] [c00000000009bab0] .lock_acquire+0x108/0x154
> [c00000004473b820] [c0000000005a8bac] .mutex_lock_nested+0x88/0x460
> [c00000004473b920] [c000000000166cc0] .inotify_inode_is_dead+0x38/0xc8
> [c00000004473b9d0] [c0000000001427f4] .dentry_iput+0xa0/0x128
> [c00000004473ba60] [c0000000001429f0] .d_kill+0x5c/0xa0
> [c00000004473baf0] [c000000000142d38] .__shrink_dcache_sb+0x304/0x3d0
> [c00000004473bbd0] [c000000000142f48] .shrink_dcache_memory+0x144/0x244
> [c00000004473bcb0] [c0000000000f58c8] .shrink_slab+0x15c/0x228
> [c00000004473bd70] [c0000000000f61a4] .kswapd+0x4c0/0x678
> [c00000004473bf00] [c000000000088ba4] .kthread+0x80/0xcc
> [c00000004473bf90] [c00000000002c194] .kernel_thread+0x54/0x70

Thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13120] BUG: using rootfstype=ext4 causes oops
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-19 18:31     ` Andrew Price
  -1 siblings, 0 replies; 580+ messages in thread
From: Andrew Price @ 2009-04-19 18:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	Bartlomiej Zolnierkiewicz

On Thu, Apr 16, 2009 at 11:45:07PM +0200, Rafael J. Wysocki wrote:
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13120
> Subject		: BUG: using rootfstype=ext4 causes oops
> Submitter	: Andrew Price <andy@andrewprice.me.uk>
> Date		: 2009-04-15 20:59 (2 days old)
> References	: http://marc.info/?l=linux-kernel&m=123982932807371&w=4
> Handled-By	: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Patch		: http://marc.info/?l=linux-kernel&m=123991090816794&w=4

Fixed in commit f505d49ffd25ed062e76ffd17568d3937fcd338c "ide: fix
barriers support" (which was merged in
df89f1ba971b3df2b7e1bc46ca7ce867539186fa ).

--
Andrew Price

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701
@ 2009-04-20  4:31             ` Michael Chan
  0 siblings, 0 replies; 580+ messages in thread
From: Michael Chan @ 2009-04-20  4:31 UTC (permalink / raw)
  To: 'David Miller', holt
  Cc: Matthew Carlson, rjw, linux-kernel, kernel-testers, Benjamin Li,
	James.Bottomley

David Miller wrote:

>
> From: Robin Holt <holt@sgi.com>
> Date: Fri, 17 Apr 2009 07:21:21 -0500
>
> >> Actually, I think we do have a fix for this.  James and Robin both
> >> reported that the test patch I sent out worked for them.  I'm
> >> preparing a patchset for submission now.
> >>
> >> James, Robin, can you confirm that you performed your tests with
> >> David's patch reverted?
> >
> > My test was done with the 2.6.28-rc1 kernel plus your patch and no
> > others.
>
> Can I get a final patch against Linus's current tree?
>

Matt should be able to provide a patch tomorrow.  We're still discussing
this with Grant Grundler on the parisc list.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-20 19:20               ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-04-20 19:20 UTC (permalink / raw)
  To: Pavel Machek
  Cc: torvalds, jens.axboe, alan-jenkins, rjw, linux-kernel, kernel-testers

On Tue, 7 Apr 2009 10:06:32 +0200
Pavel Machek <pavel@ucw.cz> wrote:

> > And the thing is, that "swsusp_shrink_memory()" is just full of 
> > heuristics. There's no hard numbers there. It doesn't seem to wait for 
> > writeout, it just does the equivalent of "shrink_list()" and 
> > "shrink_slab()", but it seems to have been basically cribbed half-way 
> > from the regular "try to free memory", without really doing it all.
> 
> akpm designed shrink_memory(). Long time ago it was just while (1)
> kmalloc() loop. It should be waiting. Andrew?

I always wanted the thing to just allocate all the memory which it
needed and then to either return it all to the caller or free it all
again for the caller to reallocate (preferably the former).

But for some reason which I don't recall (Pavel provided it, iirc) that
doesn't work.  So the current (and subsequently tweaked) scheme was put
in there instead.  It turned out to be surprisingly difficult and ugly
to graft it on top of the existing page reclaim code, and various
changes were subsequently made to make it sort-of-work.

Remind me: why can't we just allocate N pages at suspend-time?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-20 19:49                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-20 19:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pavel Machek, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers

On Monday 20 April 2009, Andrew Morton wrote:
> On Tue, 7 Apr 2009 10:06:32 +0200
> Pavel Machek <pavel@ucw.cz> wrote:
> 
> > > And the thing is, that "swsusp_shrink_memory()" is just full of 
> > > heuristics. There's no hard numbers there. It doesn't seem to wait for 
> > > writeout, it just does the equivalent of "shrink_list()" and 
> > > "shrink_slab()", but it seems to have been basically cribbed half-way 
> > > from the regular "try to free memory", without really doing it all.
> > 
> > akpm designed shrink_memory(). Long time ago it was just while (1)
> > kmalloc() loop. It should be waiting. Andrew?
> 
> I always wanted the thing to just allocate all the memory which it
> needed and then to either return it all to the caller or free it all
> again for the caller to reallocate (preferably the former).
> 
> But for some reason which I don't recall (Pavel provided it, iirc) that
> doesn't work.  So the current (and subsequently tweaked) scheme was put
> in there instead.  It turned out to be surprisingly difficult and ugly
> to graft it in top of the existing page reclaim code, and various
> changes were subsequently made to make it sort-of-work.
> 
> Remind me: why can't we just allocate N pages at suspend-time?

Well, IMO it may be worth trying anyway.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-20 19:53                 ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-04-20 19:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: torvalds, jens.axboe, alan-jenkins, rjw, linux-kernel, kernel-testers

Hi!

> > > And the thing is, that "swsusp_shrink_memory()" is just full of 
> > > heuristics. There's no hard numbers there. It doesn't seem to wait for 
> > > writeout, it just does the equivalent of "shrink_list()" and 
> > > "shrink_slab()", but it seems to have been basically cribbed half-way 
> > > from the regular "try to free memory", without really doing it all.
> > 
> > akpm designed shrink_memory(). Long time ago it was just while (1)
> > kmalloc() loop. It should be waiting. Andrew?
> 
> I always wanted the thing to just allocate all the memory which it
> needed and then to either return it all to the caller or free it all
> again for the caller to reallocate (preferably the former).

We need half of memory free for swsusp to work. If we "just allocate"
it, we will trigger the OOM killer; we'd rather fail the suspend than
OOM kill.
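
The "half of memory" figure follows from the atomic copy: every page still
in use needs a free page to be copied into, so used pages must not exceed
free pages. A rough sketch of that arithmetic is below. It is a
simplification under stated assumptions, not the kernel's accounting:
pages_to_free() is a hypothetical helper, and the reserve parameter merely
stands in for something like the 4MB PAGES_FOR_IO margin mentioned earlier
in the thread.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many pages would have to be reclaimed so that
 * every remaining used page has a free page to be copied into, with
 * `reserve` pages kept aside for I/O during the suspend.
 */
static unsigned long pages_to_free(unsigned long total, unsigned long used,
				   unsigned long reserve)
{
	unsigned long max_used;

	assert(used <= total);
	/* Need used + used (the copy) + reserve <= total. */
	max_used = total > reserve ? (total - reserve) / 2 : 0;
	return used > max_used ? used - max_used : 0;
}
```

With 262144 pages in total (1GB of 4K pages) and a 1024-page reserve, at
most 130560 pages may stay in use, so a system with 200000 pages in use
would first have to shed 69440 of them.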

> But for some reason which I don't recall (Pavel provided it, iirc)
> that

Alas, I do not remember that clearly.

> doesn't work.  So the current (and subsequently tweaked) scheme was put
> in there instead.  It turned out to be surprisingly difficult and ugly
> to graft it in top of the existing page reclaim code, and various
> changes were subsequently made to make it sort-of-work.
> 
> Remind me: why can't we just allocate N pages at suspend-time?

We need half of memory free. The reason we can't "just allocate" is
probably the OOM killer; but my memories are quite weak :-(.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-20 20:04                   ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-04-20 20:04 UTC (permalink / raw)
  To: Pavel Machek
  Cc: torvalds, jens.axboe, alan-jenkins, rjw, linux-kernel, kernel-testers

On Mon, 20 Apr 2009 21:53:06 +0200
Pavel Machek <pavel@ucw.cz> wrote:

> > Remind me: why can't we just allocate N pages at suspend-time?
> 
> We need half of memory free. The reason we can't "just allocate" is
> probably OOM killer; but my memories are quite weak :-(.

hm.  You'd think that with our splendid range of __GFP_foo flags, there
would be some combo which would suit this requirement but I can't
immediately spot one.

We can always add another I guess.  Something like...


diff -puN mm/page_alloc.c~a mm/page_alloc.c
--- a/mm/page_alloc.c~a
+++ a/mm/page_alloc.c
@@ -1620,7 +1620,8 @@ nofail_alloc:
 		}
 
 		/* The OOM killer will not help higher order allocs so fail */
-		if (order > PAGE_ALLOC_COSTLY_ORDER) {
+		if (order > PAGE_ALLOC_COSTLY_ORDER ||
+				(gfp_mask & __GFP_NO_OOM_KILL)) {
 			clear_zonelist_oom(zonelist, gfp_mask);
 			goto nopage;
 		}
diff -puN include/linux/gfp.h~a include/linux/gfp.h
--- a/include/linux/gfp.h~a
+++ a/include/linux/gfp.h
@@ -51,8 +51,9 @@ struct vm_area_struct;
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
 #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
 
-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* This equals 0, but use constants in case they ever change */
_
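
The two halves of the patch above can be modelled in userspace to see why
both are needed. This is only a sketch: the SKETCH_* constants and the
helpers gfp_bits_mask() and skips_oom_killer() are hypothetical stand-ins
for the kernel definitions, with the flag value taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_COSTLY_ORDER    3         /* stands in for PAGE_ALLOC_COSTLY_ORDER */
#define SKETCH_GFP_NO_OOM_KILL 0x200000u /* bit value proposed in the diff */

/* Mask covering the low `shift` flag bits, in the style of __GFP_BITS_MASK. */
static unsigned int gfp_bits_mask(unsigned int shift)
{
	assert(shift > 0 && shift < 32);
	return (1u << shift) - 1;
}

/*
 * Mirrors the patched condition: true when the allocation would fail
 * (goto nopage) rather than invoke the OOM killer.
 */
static bool skips_oom_killer(unsigned int gfp_mask, unsigned int order)
{
	return order > SKETCH_COSTLY_ORDER ||
	       (gfp_mask & SKETCH_GFP_NO_OOM_KILL);
}
```

The new bit is 1 << 21, so it falls outside a 21-bit mask and inside a
22-bit one, which is why the diff also bumps __GFP_BITS_SHIFT.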


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-18  4:51           ` leiming
                             ` (3 preceding siblings ...)
  (?)
@ 2009-04-20 20:08           ` Laurent Pinchart
  2009-04-21  1:47             ` Ming Lei
       [not found]             ` <200904202208.23899.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
  -1 siblings, 2 replies; 580+ messages in thread
From: Laurent Pinchart @ 2009-04-20 20:08 UTC (permalink / raw)
  To: leiming
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Adrian Bunk, Andrew Morton, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, video4linux-list, mchehab

On Saturday 18 April 2009 06:51:11 leiming wrote:
> On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
>
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
> > > uvc_video_device *video,
> > >  	/* Buffers are already allocated, bail out. */
> > >  	if (video->urb_size)
> > > -		return 0;
> > > +		return DIV_ROUND_UP(video->urb_size, psize);
> >
> > I don't think this is right. It should round _down_.
> >
> > It's supposed to return 'npackets', but if you pass it a different
> > packet size than it was passed originally, it can now return a
> > potentially bigger number than the already allocated buffer, no?
> >
> > So I think it should round down (ie use a regular divide). No?
>
> Yes, you are correct. Please ignore my last reply; the fixed patch
> follows.

psize and video->urb_size shouldn't have changed before and after resume, 
otherwise we'll get into trouble anyway. A regular divide and a round-up 
divide should then return the same result. I'll take the regular divide, as it 
will be more efficient.

> Thanks.
>
> From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17 00:00:00 2001
> From: Ming Lei <tom.leiming@gmail.com>
> Date: Wed, 15 Apr 2009 22:32:51 +0800
> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)
>
> URB buffers are no longer freed before suspend, so during UVC resume
> uvc_alloc_urb_buffers() should return the packet count that was
> originally allocated instead of zero.
>
> This version rounds down when computing the packet count, as Linus
> suggested; rounding up could otherwise lead to buffer corruption if
> the packet size is changed before uvc_alloc_urb_buffers() is called
> in this kind of case.

The comment is misleading. If the packet size changes we need to reallocate 
the buffers anyway. Have you checked if the packet size (which depends on the 
endpoint being selected) can be changed between suspend and resume, either by 
the uvcvideo driver (I don't think it can) or the USB core ?

Best regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-20 23:37                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-04-20 23:37 UTC (permalink / raw)
  To: pavel, torvalds, jens.axboe, alan-jenkins, rjw, linux-kernel,
	kernel-testers

On Mon, 20 Apr 2009 13:04:12 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 20 Apr 2009 21:53:06 +0200
> Pavel Machek <pavel@ucw.cz> wrote:
> 
> > > Remind me: why can't we just allocate N pages at suspend-time?
> > 
> > We need half of memory free. The reason we can't "just allocate" is
> > probably OOM killer; but my memories are quite weak :-(.
> 
> hm.  You'd think that with our splendid range of __GFP_foo flags, there
> would be some combo which would suit this requirement but I can't
> immediately spot one.
> 
> We can always add another I guess.  Something like...
> 
> 
> diff -puN mm/page_alloc.c~a mm/page_alloc.c
> --- a/mm/page_alloc.c~a
> +++ a/mm/page_alloc.c
> @@ -1620,7 +1620,8 @@ nofail_alloc:
>  		}
>  
>  		/* The OOM killer will not help higher order allocs so fail */
> -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> +				(gfp_mask & __GFP_NO_OOM_KILL)) {
>  			clear_zonelist_oom(zonelist, gfp_mask);
>  			goto nopage;
>  		}
> diff -puN include/linux/gfp.h~a include/linux/gfp.h
> --- a/include/linux/gfp.h~a
> +++ a/include/linux/gfp.h
> @@ -51,8 +51,9 @@ struct vm_area_struct;
>  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
>  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
> +#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
>  
> -#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
> +#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>  
>  /* This equals 0, but use constants in case they ever change */
> _
> 

Of course, this will protect the calling task from getting oom-killed. 
But it doesn't protect other tasks from getting oom-killed due to the
activity of _this_ task.

But I think that problem already exists, and that this proposal doesn't
worsen anything, yes?

Or is it the case that all other tasks are safely stuck in the freezer
at this time, so they won't be allocating any memory anyway?


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-20 20:08           ` Laurent Pinchart
@ 2009-04-21  1:47                 ` Ming Lei
       [not found]             ` <200904202208.23899.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
  1 sibling, 0 replies; 580+ messages in thread
From: Ming Lei @ 2009-04-21  1:47 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Adrian Bunk, Andrew Morton, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List,
	video4linux-list-H+wXaHxf7aLQT0dZR+AlfA,
	mchehab-wEGCiKHe2LqWVfeAwA7xHQ

2009/4/21 Laurent Pinchart <laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>:
> On Saturday 18 April 2009 06:51:11 leiming wrote:
>> On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
>>
>> Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
>> > > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
>> > > uvc_video_device *video,
>> > >   /* Buffers are already allocated, bail out. */
>> > >   if (video->urb_size)
>> > > -         return 0;
>> > > +         return DIV_ROUND_UP(video->urb_size, psize);
>> >
>> > I don't think this is right. It should round _down_.
>> >
>> > It's supposed to return 'npackets', but if you pass it a different
>> > packet size than it was passed originally, it can now return a
>> > potentially bigger number than the already allocated buffer, no?
>> >
>> > So I think it should round down (ie use a regular divide). No?
>>
>> Yes, you are correct. Please ignore my last reply; the fixed patch
>> follows.
>
> psize and video->urb_size shouldn't have changed before and after resume,
> otherwise we'll get into trouble anyway. A regular divide and a round-up
> divide should then return the same result. I'll take the regular divide, as it
> will be more efficient.

Yes.

>
>> Thanks.
>>
>> From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17 00:00:00 2001
>> From: Ming Lei <tom.leiming-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date: Wed, 15 Apr 2009 22:32:51 +0800
>> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)
>>
>> URB buffers are no longer freed before suspend, so during UVC resume
>> uvc_alloc_urb_buffers() should return the packet count that was
>> originally allocated instead of zero.
>>
>> This version rounds down when computing the packet count, as Linus
>> suggested; rounding up could otherwise lead to buffer corruption if
>> the packet size is changed before uvc_alloc_urb_buffers() is called
>> in this kind of case.
>
> The comment is misleading. If the packet size changes we need to reallocate
> the buffers anyway. Have you checked if the packet size (which depends on the
> endpoint being selected) can be changed between suspend and resume, either by
> the uvcvideo driver (I don't think it can) or the USB core ?

The packet size does not change between suspend and resume.  What I mean
is that uvc_alloc_urb_buffers() can still be used in other cases where the
buffers were not freed and are reused later. There seem to be no such
cases in uvcvideo now, but uvc_alloc_urb_buffers() really __can__ work in
such a case, can't it?

IMHO it is only used to allocate or reserve UVC_URBS USB buffers, each of
size video->urb_size, and npackets can shrink or grow if psize changes,
after all.

Thanks!

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
  2009-04-20 23:37                     ` Andrew Morton
  (?)
@ 2009-04-21 18:53                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-21 18:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Tuesday 21 April 2009, Andrew Morton wrote:
> On Mon, 20 Apr 2009 13:04:12 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Mon, 20 Apr 2009 21:53:06 +0200
> > Pavel Machek <pavel@ucw.cz> wrote:
> > 
> > > > Remind me: why can't we just allocate N pages at suspend-time?
> > > 
> > > We need half of memory free. The reason we can't "just allocate" is
> > > probably OOM killer; but my memories are quite weak :-(.
> > 
> > hm.  You'd think that with our splendid range of __GFP_foo flags, there
> > would be some combo which would suit this requirement but I can't
> > immediately spot one.
> > 
> > We can always add another I guess.  Something like...
> > 
> > 
> > diff -puN mm/page_alloc.c~a mm/page_alloc.c
> > --- a/mm/page_alloc.c~a
> > +++ a/mm/page_alloc.c
> > @@ -1620,7 +1620,8 @@ nofail_alloc:
> >  		}
> >  
> >  		/* The OOM killer will not help higher order allocs so fail */
> > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> >  			clear_zonelist_oom(zonelist, gfp_mask);
> >  			goto nopage;
> >  		}
> > diff -puN include/linux/gfp.h~a include/linux/gfp.h
> > --- a/include/linux/gfp.h~a
> > +++ a/include/linux/gfp.h
> > @@ -51,8 +51,9 @@ struct vm_area_struct;
> >  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
> >  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
> >  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
> > +#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
> >  
> > -#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
> > +#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
> >  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
> >  
> >  /* This equals 0, but use constants in case they ever change */
> > _
> > 
> 
> Of course, this will protect the calling task from getting oom-killed. 
> But it doesn't protect other tasks from getting oom-killed due to the
> activity of _this_ task.
> 
> But I think that problem already exists, and that this proposal doesn't
> worsen anything, yes?

I think it doesn't.

> Or is it the case that all other tasks are safely stuck in the freezer
> at this time, so they won't be allocating any memory anyway?

Except for the tasks (kernel threads) that are not frozen and which can
allocate memory as well.

However, the OOM killer is not really useful during suspend/hibernation, so
perhaps we can just disable it temporarily before that?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-21  1:47                 ` Ming Lei
  (?)
@ 2009-04-21 23:21                 ` Laurent Pinchart
  2009-05-09  3:28                   ` Ming Lei
  2009-05-09  3:28                   ` Ming Lei
  -1 siblings, 2 replies; 580+ messages in thread
From: Laurent Pinchart @ 2009-04-21 23:21 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Adrian Bunk, Andrew Morton, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, video4linux-list, mchehab

Hi,

On Tuesday 21 April 2009 03:47:34 Ming Lei wrote:
> 2009/4/21 Laurent Pinchart <laurent.pinchart@skynet.be>:
> > On Saturday 18 April 2009 06:51:11 leiming wrote:
> >> On Fri, 17 Apr 2009 19:55:29 -0700 (PDT)
> >>
> >> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >> > > @@ -742,7 +742,7 @@ static int uvc_alloc_urb_buffers(struct
> >> > > uvc_video_device *video,
> >> > >   /* Buffers are already allocated, bail out. */
> >> > >   if (video->urb_size)
> >> > > -         return 0;
> >> > > +         return DIV_ROUND_UP(video->urb_size, psize);
> >> >
> >> > I don't think this is right. It should round _down_.
> >> >
> >> > It's supposed to return 'npackets', but if you pass it a different
> >> > packet size than it was passed originally, it can now return a
> >> > potentially bigger number than the already allocated buffer, no?
> >> >
> >> > So I think it should round down (ie use a regular divide). No?
> >>
> >> Yes, you are correct. Please ignore my last reply; the fixed patch
> >> follows.
> >
> > psize and video->urb_size shouldn't have changed before and after resume,
> > otherwise we'll get into trouble anyway. A regular divide and a round-up
> > divide should then return the same result. I'll take the regular divide,
> > as it will be more efficient.
>
> Yes.
>
> >> Thanks.
> >>
> >> From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17 00:00:00 2001
> >> From: Ming Lei <tom.leiming@gmail.com>
> >> Date: Wed, 15 Apr 2009 22:32:51 +0800
> >> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)
> >>
> >> URB buffers are no longer freed before suspend, so during UVC resume
> >> uvc_alloc_urb_buffers() should return the packet count that was
> >> originally allocated instead of zero.
> >>
> >> This version rounds down when computing the packet count, as Linus
> >> suggested; rounding up could otherwise lead to buffer corruption if
> >> the packet size is changed before uvc_alloc_urb_buffers() is called
> >> in this kind of case.
> >
> > The comment is misleading. If the packet size changes we need to
> > reallocate the buffers anyway. Have you checked if the packet size (which
> > depends on the endpoint being selected) can be changed between suspend
> > and resume, either by the uvcvideo driver (I don't think it can) or the
> > USB core ?
>
> The packet size does not change between suspend and resume.  What I mean
> is that uvc_alloc_urb_buffers() can still be used in other cases where the
> buffers were not freed and are reused later. There seem to be no such
> cases in uvcvideo now, but uvc_alloc_urb_buffers() really __can__ work in
> such a case, can't it?
>
> IMHO it is only used to allocate or reserve UVC_URBS USB buffers, each of
> size video->urb_size, and npackets can shrink or grow if psize changes,
> after all.

You're right. Patch applied, thanks.

Best regards,

Laurent Pinchart


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13068] Lockdep warning in inotify_dev_queue_event
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-22  9:50     ` Sachin Sant
  -1 siblings, 0 replies; 580+ messages in thread
From: Sachin Sant @ 2009-04-22  9:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Ingo Molnar, peterz

Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13068
> Subject		: Lockdep warning in inotify_dev_queue_event
> Submitter	: Sachin Sant <sachinp@in.ibm.com>
> Date		: 2009-04-05 12:37 (12 days old)
> References	: http://marc.info/?l=linux-kernel&m=123893439229272&w=4
I can recreate this with 2.6.30-rc3 as well.

=================================
[ INFO: inconsistent lock state ]
2.6.30-rc3 #1
---------------------------------
inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
inotify02/31550 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&inode->inotify_mutex){+.+.?.}, at: [<c0000000001668ec>] .inotify_inode_queue_event+0x6c/0x160
{IN-RECLAIM_FS-W} state was registered at:
  [<c00000000009bb1c>] .lock_acquire+0x108/0x154
  [<c0000000005a8be4>] .mutex_lock_nested+0x88/0x460
  [<c000000000166b04>] .inotify_inode_is_dead+0x38/0xc8
  [<c000000000142708>] .dentry_iput+0xa0/0x128
  [<c000000000142904>] .d_kill+0x5c/0xa0
  [<c000000000142c4c>] .__shrink_dcache_sb+0x304/0x3d0
  [<c000000000142e5c>] .shrink_dcache_memory+0x144/0x244
  [<c0000000000f580c>] .shrink_slab+0x15c/0x228
  [<c0000000000f60f8>] .kswapd+0x4c4/0x67c
  [<c000000000088ba4>] .kthread+0x80/0xcc
  [<c00000000002c194>] .kernel_thread+0x54/0x70
irq event stamp: 2267
hardirqs last  enabled at (2267): [<c000000000120490>] .kmem_cache_alloc+0xec/0x1b4
hardirqs last disabled at (2266): [<c000000000120418>] .kmem_cache_alloc+0x74/0x1b4
softirqs last  enabled at (1654): [<c00000000002be8c>] .call_do_softirq+0x14/0x24
softirqs last disabled at (1639): [<c00000000002be8c>] .call_do_softirq+0x14/0x24

other info that might help us debug this:
4 locks held by inotify02/31550:
 #0:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c00000000013d41c>] .do_filp_open+0x1a4/0x964
 #1:  (&inode->inotify_mutex){+.+.?.}, at: [<c0000000001668ec>] .inotify_inode_queue_event+0x6c/0x160
 #2:  (&ih->mutex){+.+...}, at: [<c000000000166920>] .inotify_inode_queue_event+0xa0/0x160
 #3:  (&dev->ev_mutex){+.+...}, at: [<c00000000016822c>] .inotify_dev_queue_event+0x50/0x1cc

stack backtrace:
Call Trace:
[c00000001816b4b0] [c000000000011a54] .show_stack+0x6c/0x16c (unreliable)
[c00000001816b560] [c000000000098544] .print_usage_bug+0x1c0/0x1f4
[c00000001816b620] [c0000000000988f8] .mark_lock+0x380/0x6e4
[c00000001816b6d0] [c000000000098cd0] .mark_held_locks+0x74/0xc0
[c00000001816b770] [c000000000098de0] .lockdep_trace_alloc+0xc4/0xf4
[c00000001816b7f0] [c000000000122090] .__kmalloc+0x100/0x274
[c00000001816b8a0] [c000000000168140] .kernel_event+0xb8/0x154
[c00000001816b940] [c0000000001682fc] .inotify_dev_queue_event+0x120/0x1cc
[c00000001816b9f0] [c000000000166970] .inotify_inode_queue_event+0xf0/0x160
[c00000001816bac0] [c000000000139acc] .vfs_create+0x170/0x1dc
[c00000001816bb60] [c00000000013d4d4] .do_filp_open+0x25c/0x964
[c00000001816bd10] [c00000000012b37c] .do_sys_open+0x80/0x140
[c00000001816bdc0] [c00000000012b258] .SyS_creat+0x18/0x2c
[c00000001816be30] [c000000000008554] syscall_exit+0x0/0x40
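
Reading the two traces together: inotify_mutex was first seen being taken
from reclaim (kswapd -> shrink_slab -> ... -> inotify_inode_is_dead), and is
now held while kernel_event calls __kmalloc, an allocation that lockdep
treats as able to enter reclaim. The check behind the report can be modelled
in userspace roughly as below. This is only a sketch: every name in it
(usage_model, lock_class_model, mark_usage_model) is hypothetical and none
of it is the kernel's lockdep code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usage bits remembered per lock class (hypothetical model names). */
enum usage_model {
	USED_IN_RECLAIM   = 1 << 0, /* lock was acquired from the reclaim path  */
	HELD_OVER_RECLAIM = 1 << 1, /* lock was held across an allocation that
				     * may itself enter reclaim                 */
};

struct lock_class_model {
	const char *name;
	unsigned int usage;
};

/*
 * Record one more observed usage for a lock class.
 * Returns false once both usages have been seen for the same class:
 * an allocation under the lock could then recurse into reclaim and
 * try to take the lock again.
 */
static bool mark_usage_model(struct lock_class_model *lc, enum usage_model bit)
{
	const unsigned int both = USED_IN_RECLAIM | HELD_OVER_RECLAIM;

	assert(lc != NULL);
	lc->usage |= bit;
	return (lc->usage & both) != both;
}
```

Marking a class with USED_IN_RECLAIM and then with HELD_OVER_RECLAIM makes
the second call return false, which corresponds to the "inconsistent lock
state" message above. A class that only ever sees one of the two usages
never trips the check.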

Let me know if I can provide any other information.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
  2009-04-20 23:37                     ` Andrew Morton
  (?)
  (?)
@ 2009-04-22 13:07                     ` Pavel Machek
  2009-04-22 20:11                       ` Rafael J. Wysocki
  -1 siblings, 1 reply; 580+ messages in thread
From: Pavel Machek @ 2009-04-22 13:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: torvalds, jens.axboe, alan-jenkins, rjw, linux-kernel, kernel-testers

Hi!

> Of course, this will protect the calling task from getting oom-killed. 
> But it doesn't protect other tasks from getting oom-killed due to the
> activity of _this_ task.
> 
> But I think that problem already exists, and that this proposal doesn't
> worsen anything, yes?
> 
> Or is it the case that all other tasks are safely stuck in the freezer
> at this time, so they won't be allocating any memory anyway?

That is the idea, yes. ... but we now have more threads that are not
freezable... so they may allocate the memory.

Is it non-feasible to free memory without really going and allocating
everything?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
  2009-04-22 13:07                     ` Pavel Machek
@ 2009-04-22 20:11                       ` Rafael J. Wysocki
  2009-04-22 20:19                           ` Andrew Morton
  0 siblings, 1 reply; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-22 20:11 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Andrew Morton, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers

On Wednesday 22 April 2009, Pavel Machek wrote:
> Hi!
> 
> > Of course, this will protect the calling task from getting oom-killed. 
> > But it doesn't protect other tasks from getting oom-killed due to the
> > activity of _this_ task.
> > 
> > But I think that problem already exists, and that this proposal doesn't
> > worsen anything, yes?
> > 
> > Or is it the case that all other tasks are safely stuck in the freezer
> > at this time, so they won't be allocating any memory anyway?
> 
> That is the idea, yes. ... but we now have more threads that are not
> freezable... so they may allocate the memory.
> 
> Is it non-feasible to free memory without really going and allocating
> everything?

The question is whether there is a point.  In principle we can just go and
allocate as much as we need upfront.  It shouldn't change anything, because
we resume and suspend devices after creating the image anyway.

I think we could try to disable the OOM killer before suspend and just
allocate the memory for the image right before devices are suspended for the
first time.
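
To make the "allocate as much as we need upfront" idea concrete, here is a
rough userspace sketch. It assumes a fixed page size and uses malloc() as a
stand-in for the page allocator; image_pool, image_pool_prealloc() and
image_pool_free() are made-up names for illustration, not existing kernel
interfaces. The point is only the shape: either the whole reservation
succeeds, or everything is released and the caller gets an error it can use
to abort the transition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size for this model */

/* Hypothetical container for the pages reserved for the image. */
struct image_pool {
	void **pages;
	size_t nr_pages;
};

/* Release every page reserved so far and reset the pool. */
static void image_pool_free(struct image_pool *pool)
{
	for (size_t i = 0; i < pool->nr_pages; i++)
		free(pool->pages[i]);
	free(pool->pages);
	pool->pages = NULL;
	pool->nr_pages = 0;
}

/*
 * Reserve nr_pages pages up front.
 * Returns 0 on success. On failure nothing stays allocated and -1 is
 * returned, so the caller can give up cleanly.
 */
static int image_pool_prealloc(struct image_pool *pool, size_t nr_pages)
{
	pool->nr_pages = 0;
	pool->pages = calloc(nr_pages ? nr_pages : 1, sizeof(*pool->pages));
	if (!pool->pages)
		return -1;

	for (size_t i = 0; i < nr_pages; i++) {
		pool->pages[i] = malloc(SKETCH_PAGE_SIZE);
		if (!pool->pages[i]) {
			image_pool_free(pool);
			return -1;
		}
		pool->nr_pages++;
	}
	return 0;
}
```

In userspace malloc() may overcommit, so this does not reproduce real memory
pressure; it only illustrates the all-or-nothing reservation.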

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13058] First hibernation attempt fails
@ 2009-04-22 20:19                           ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-04-22 20:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Wed, 22 Apr 2009 22:11:17 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday 22 April 2009, Pavel Machek wrote:
> > Hi!
> > 
> > > Of course, this will protect the calling task from getting oom-killed. 
> > > But it doesn't protect other tasks from getting oom-killed due to the
> > > activity of _this_ task.
> > > 
> > > But I think that problem already exists, and that this proposal doesn't
> > > worsen anything, yes?
> > > 
> > > Or is it the case that all other tasks are safely stuck in the freezer
> > > at this time, so they won't be allocating any memory anyway?
> > 
> > That is the idea, yes. ... but we now have more threads that are not
> > freezable... so they may allocate the memory.
> > 
> > Is it non-feasible to free memory without really going and allocating
> > everything?
> 
> The question is whether there is a point.  In principle we can just go and
> allocate as much as we need upfront.  It shouldn't change anything, because
> we resume and suspend devices after creating the image anyway.
> 
> I think we could try to disable the OOM killer before suspend and just
> allocate the memory for the image right before devices are suspended for the
> first time.
> 

It would be nice to do.

shrink_all_memory() is simply trying to do something which page reclaim
doesn't expect to do (free memory when there's already lots of memory
free).  Consequently it doesn't do it very well, and there's a good
risk that changes to core reclaim will accidentally break
shrink_all_memory().  


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-16 21:42 ` Rafael J. Wysocki
                   ` (48 preceding siblings ...)
  (?)
@ 2009-04-24 13:44 ` Kalle Valo
       [not found]   ` <87ljpqqi89.fsf-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org>
  2009-04-25 21:57   ` Rafael J. Wysocki
  -1 siblings, 2 replies; 580+ messages in thread
From: Kalle Valo @ 2009-04-24 13:44 UTC (permalink / raw)
  To: ext Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> This message contains a list of some regressions from 2.6.29, for
> which there are no fixes in the mainline I know of. If any of them
> have been fixed already, please let me know.
>
> If you know of any other unresolved regressions from 2.6.29, please
> let me know either and I'll add them to the list. Also, please let
> me know if any of the entries below are invalid.
>
> Each entry from the list will be sent additionally in an automatic reply to
> this message with CCs to the people involved in reporting and handling the
> issue.
>

[...]

> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13067
> Subject         : iwl3945: wlan0: beacon loss from AP - sending probe request
> Submitter       : Maciej Rutecki <maciej.rutecki@gmail.com>
> Date            : 2009-04-05 9:11 (12 days old)
> References      : http://marc.info/?l=linux-kernel&m=123892272218266&w=4

The regression here is that I added a printk() to inform about beacon
loss. The issue has been there a long time, the printk() just exposed
it.

Michael wrote a patch which silences the printk:

http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=16eaea5faa37d552b14e246ca56a436e55ca67b3

I fixed the beacon loss detection here:

http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=3b6dc5a431e4fef35717cba53544a95209f49b68

John, I think Michael's patch should be sent to 2.6.30. Any chances
for that?

Rafael, is it possible to send your regression mails to
linux-wireless@vger.kernel.org as well? Very few wireless developers
have time to follow netdev or lkml.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
  2009-04-16 21:45   ` Rafael J. Wysocki
@ 2009-04-24 17:37     ` Adrian McMenamin
  -1 siblings, 0 replies; 580+ messages in thread
From: Adrian McMenamin @ 2009-04-24 17:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Manuel Lauss

On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13069
> Subject		: regression in 2.6.29-git3 on SH/Dreamcast
> Submitter	: Adrian McMenamin <adrian@newgolddream.dyndns.info>
> Date		: 2009-03-29 19:04 (19 days old)
> References	: http://marc.info/?l=linux-kernel&m=123835353115372&w=4
> 

At this point it *looks* as though it was simply a question of
insufficient memory to boot, but I think it needs further testing.
Nobody else seems to have picked it up though.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13108] 2.6.30-rc1: white screen during boot (regression) on spitz
  2009-04-16 21:45   ` Rafael J. Wysocki
  (?)
@ 2009-04-25 11:54   ` Pavel Machek
  2009-04-26 12:18     ` Rafael J. Wysocki
  -1 siblings, 1 reply; 580+ messages in thread
From: Pavel Machek @ 2009-04-25 11:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	Dmitry Eremin-Solenikov, Peter Zijlstra

On Thu 2009-04-16 23:45:04, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.29.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13108
> Subject		: 2.6.30-rc1: white screen during boot (regression) on spitz
> Submitter	: Pavel Machek <pavel@ucw.cz>
> Date		: 2009-04-10 10:34 (7 days old)
> References	: http://marc.info/?l=linux-kernel&m=123935954223418&w=4
> Handled-By	: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>

It was fixed in 2.6.30-rc3 as a test boot confirmed... (I have a link
to patch fixing it, but I guess we can simply close this.) Thanks!

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-24 13:44 ` Kalle Valo
@ 2009-04-25 21:57       ` Rafael J. Wysocki
  2009-04-25 21:57   ` Rafael J. Wysocki
  1 sibling, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-25 21:57 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List

On Friday 24 April 2009, Kalle Valo wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > This message contains a list of some regressions from 2.6.29, for
> > which there are no fixes in the mainline I know of. If any of them
> > have been fixed already, please let me know.
> >
> > If you know of any other unresolved regressions from 2.6.29, please
> > let me know either and I'll add them to the list. Also, please let
> > me know if any of the entries below are invalid.
> >
> > Each entry from the list will be sent additionally in an automatic reply to
> > this message with CCs to the people involved in reporting and handling the
> > issue.
> >
> 
> [...]
> 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13067
> > Subject         : iwl3945: wlan0: beacon loss from AP - sending probe request
> > Submitter       : Maciej Rutecki <maciej.rutecki@gmail.com>
> > Date            : 2009-04-05 9:11 (12 days old)
> > References      : http://marc.info/?l=linux-kernel&m=123892272218266&w=4
> 
> The regression here is that I added a printk() to inform about beacon
> loss. The issue has been there a long time, the printk() just exposed
> it.
> 
> Michael wrote a patch which silences the printk:
> 
> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=16eaea5faa37d552b14e246ca56a436e55ca67b3
> 
> I fixed the beacon loss detection here:
> 
> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=3b6dc5a431e4fef35717cba53544a95209f49b68
> 
> John, I think Michael's patch should be sent to 2.6.30. Any chances
> for that?
> 
> Rafael, is it possible to send your regression mails to
> linux-wireless@vger.kernel.org as well? Very few wireless developers
> have time to follow netdev or lkml.

Sure, the next reports will go there too.

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-25 21:57       ` Rafael J. Wysocki
  (?)
@ 2009-04-26  7:06       ` Kalle Valo
  -1 siblings, 0 replies; 580+ messages in thread
From: Kalle Valo @ 2009-04-26  7:06 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-wireless

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

>> Rafael, is it possible to send your regression mails to
>> linux-wireless@vger.kernel.org as well? Very few wireless developers
>> have time to follow netdev or lkml.
>
> Sure, the next reports will go there too.

Thanks a lot. We already got a new report.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13108] 2.6.30-rc1: white screen during boot (regression) on spitz
  2009-04-25 11:54   ` Pavel Machek
@ 2009-04-26 12:18     ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-04-26 12:18 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	Dmitry Eremin-Solenikov, Peter Zijlstra

On Saturday 25 April 2009, Pavel Machek wrote:
> On Thu 2009-04-16 23:45:04, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13108
> > Subject		: 2.6.30-rc1: white screen during boot (regression) on spitz
> > Submitter	: Pavel Machek <pavel@ucw.cz>
> > Date		: 2009-04-10 10:34 (7 days old)
> > References	: http://marc.info/?l=linux-kernel&m=123935954223418&w=4
> > Handled-By	: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> 
> It was fixed in 2.6.30-rc3 as a test boot confirmed... (I have a link
> to patch fixing it, but I guess we can simply close this.) Thanks!

Thanks, closed.

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
       [not found]   ` <200904171648.38172.rjw@sisk.pl>
@ 2009-04-26 13:35     ` Ed Tomlinson
  0 siblings, 0 replies; 580+ messages in thread
From: Ed Tomlinson @ 2009-04-26 13:35 UTC (permalink / raw)
  To: Rafael J. Wysocki, Avi Kivity, netdev, LKML

On Friday 17 April 2009 10:48:38 you wrote:
> On Friday 17 April 2009, Ed Tomlinson wrote:
> > On Thursday 16 April 2009 17:42:31 you wrote:
> > > If you know of any other unresolved regressions from 2.6.29, please let me know
> > > either and I'll add them to the list.  Also, please let me know if any of the
> > > entries below are invalid.
> > 
> > Rafael,
> > 
> > Do you want a bug raised?  The stall reported in thread "2.6.30-rc1 A few issues and a stall"
> > is still a problem with rc2.  The other issues mentioned in the thread are fixed.
> 
> If there's only one issue remaining unfixed, please file a bug in the Bugzilla
> and put my address in the CC list of the bug entry.

There is no need for me to raise a bug.  The problem is fixed with rc3 + Dave Miller's latest
network tree for Linus.

Thanks
Ed

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 0/3] PM: Drop shrink_all_memory (was: Re: [Bug #13058] First hibernation attempt fails)
@ 2009-05-01 22:26                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, pm list

On Wednesday 22 April 2009, Andrew Morton wrote:
> On Wed, 22 Apr 2009 22:11:17 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday 22 April 2009, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > Of course, this will protect the calling task from getting oom-killed. 
> > > > But it doesn't protect other tasks from getting oom-killed due to the
> > > > activity of _this_ task.
> > > > 
> > > > But I think that problem already exists, and that this proposal doesn't
> > > > worsen anything, yes?
> > > > 
> > > > Or is it the case that all other tasks are safely stuck in the freezer
> > > > at this time, so they won't be allocating any memory anyway?
> > > 
> > > That is the idea, yes. ... but we now have more threads that are not
> > > freezable... so they may allocate the memory.
> > > 
> > > Is it non-feasible to free memory without really going and allocating
> > > everything?
> > 
> > The question is whether there is a point.  In principle we can just go and
> > allocate as much as we need upfront.  It shouldn't change anything, because
> > we resume and suspend devices after creating the image anyway.
> > 
> > I think we could try to disable the OOM killer before suspend and just
> > allocate the memory for the image right before devices are suspended for the
> > first time.
> > 
> 
> It would be nice to do.
> 
> shrink_all_memory() is simply trying to do something which page reclaim
> doesn't expect to do (free memory when there's already lots of memory
> free).  Consequently it doesn't do it very well, and there's a good
> risk that changes to core reclaim will accidentally break
> shrink_all_memory().  

OK, a patchset follows:

[1/3] - disable the OOM killer during system-wide power transitions (should be
        done anyway IMO)
[2/3] - move swsusp_shrink_memory() to kernel/power/snapshot.c so that the
        next patch is easier to read
[3/3] - drop shrink_all_memory()

Please have a look and tell me what you think.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 1/3] PM: Disable OOM killer during system-wide power transitions
@ 2009-05-01 22:27                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

The OOM killer is not particularly useful during system-wide power
transitions, so do not use it if such a transition is in progress.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/suspend.h |    4 ++++
 kernel/power/disk.c     |   13 ++++++++++++-
 kernel/power/main.c     |   13 +++++++++++++
 kernel/power/power.h    |    1 +
 mm/page_alloc.c         |    8 ++++++--
 5 files changed, 36 insertions(+), 3 deletions(-)

Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -282,6 +282,8 @@ extern int unregister_pm_notifier(struct
 		{ .notifier_call = fn, .priority = pri };	\
 	register_pm_notifier(&fn##_nb);			\
 }
+
+extern bool pm_transition_in_progress(void);
 #else /* !CONFIG_PM_SLEEP */
 
 static inline int register_pm_notifier(struct notifier_block *nb)
@@ -295,6 +297,8 @@ static inline int unregister_pm_notifier
 }
 
 #define pm_notifier(fn, pri)	do { (void)(fn); } while (0)
+
+static inline bool pm_transition_in_progress(void) { return false; }
 #endif /* !CONFIG_PM_SLEEP */
 
 #ifndef CONFIG_HIBERNATION
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1619,8 +1619,12 @@ nofail_alloc:
 			goto got_pg;
 		}
 
-		/* The OOM killer will not help higher order allocs so fail */
-		if (order > PAGE_ALLOC_COSTLY_ORDER) {
+		/*
+		 * The OOM killer will not help higher order allocs and it is
+		 * not useful during system-wide power transitions, so fail.
+		 */
+		if (order > PAGE_ALLOC_COSTLY_ORDER
+		    || pm_transition_in_progress()) {
 			clear_zonelist_oom(zonelist, gfp_mask);
 			goto nopage;
 		}
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -299,9 +299,11 @@ int hibernation_snapshot(int platform_mo
 {
 	int error;
 
+	transition_in_progress = true;
+
 	error = platform_begin(platform_mode);
 	if (error)
-		return error;
+		goto Out;
 
 	/* Free memory before shutting down devices. */
 	error = swsusp_shrink_memory();
@@ -325,6 +327,9 @@ int hibernation_snapshot(int platform_mo
 	resume_console();
  Close:
 	platform_end(platform_mode);
+
+ Out:
+	transition_in_progress = false;
 	return error;
 
  Recover_platform:
@@ -607,6 +612,7 @@ int hibernate(void)
 		pr_debug("PM: Image restored successfully.\n");
 		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:
@@ -724,6 +730,8 @@ static int software_resume(void)
 		goto Done;
 	}
 
+	transition_in_progress = true;
+
 	pr_debug("PM: Reading hibernation image.\n");
 
 	error = swsusp_read(&flags);
@@ -732,6 +740,9 @@ static int software_resume(void)
 
 	printk(KERN_ERR "PM: Restore failed, recovering.\n");
 	swsusp_free();
+
+	transition_in_progress = false;
+
 	thaw_processes();
  Done:
 	free_basic_memory_bitmaps();
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -32,6 +32,13 @@ EXPORT_SYMBOL(pm_flags);
 
 #ifdef CONFIG_PM_SLEEP
 
+bool transition_in_progress;
+
+bool pm_transition_in_progress(void)
+{
+	return transition_in_progress;
+}
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
@@ -246,6 +253,8 @@ static int suspend_prepare(void)
 		goto Thaw;
 	}
 
+	transition_in_progress = true;
+
 	free_pages = global_page_state(NR_FREE_PAGES);
 	if (free_pages < FREE_PAGE_NUMBER) {
 		pr_debug("PM: free some memory\n");
@@ -258,6 +267,8 @@ static int suspend_prepare(void)
 	if (!error)
 		return 0;
 
+	transition_in_progress = false;
+
  Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
@@ -403,6 +414,8 @@ int suspend_devices_and_enter(suspend_st
  */
 static void suspend_finish(void)
 {
+	transition_in_progress = false;
+
 	suspend_thaw_processes();
 	usermodehelper_enable();
 	pm_notifier_call_chain(PM_POST_SUSPEND);
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -173,6 +173,7 @@ static inline int suspend_devices_and_en
 #ifdef CONFIG_PM_SLEEP
 /* kernel/power/main.c */
 extern int pm_notifier_call_chain(unsigned long val);
+extern bool transition_in_progress;
 #endif
 
 #ifdef CONFIG_HIGHMEM
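
The effect of the page_alloc.c hunk above can be modelled outside the kernel.
What follows is only a minimal userspace sketch, not kernel code: COSTLY_ORDER
stands in for PAGE_ALLOC_COSTLY_ORDER, and slow_path_decision(),
model_transition() and observe_decision() are hypothetical names made up for
the illustration.  It assumes the only thing of interest is the decision taken
once reclaim has already failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PAGE_ALLOC_COSTLY_ORDER (assumption for this model). */
#define COSTLY_ORDER 3

/* Mirrors the flag the patch adds to kernel/power/main.c. */
static bool transition_in_progress;

static bool pm_transition_in_progress(void)
{
	return transition_in_progress;
}

enum alloc_outcome { ALLOC_FAIL, ALLOC_TRY_OOM_KILL };

/*
 * Hypothetical model of the patched check: fail the allocation instead of
 * invoking the OOM killer for costly orders or during a power transition.
 */
static enum alloc_outcome slow_path_decision(unsigned int order)
{
	if (order > COSTLY_ORDER || pm_transition_in_progress())
		return ALLOC_FAIL;
	return ALLOC_TRY_OOM_KILL;
}

/*
 * Hypothetical model of how the patch brackets a transition: set the flag,
 * run the body, clear the flag on every exit path.
 */
static int model_transition(int (*body)(void))
{
	int error;

	assert(body != NULL);
	transition_in_progress = true;
	error = body();
	transition_in_progress = false;
	return error;
}

/* Example body: returns 0 if an order-0 allocation would fail, -1 if not. */
static int observe_decision(void)
{
	return slow_path_decision(0) == ALLOC_FAIL ? 0 : -1;
}
```

In this model, model_transition(observe_decision) returns 0 because the flag
is set while the body runs, and an order-0 request made afterwards is again
allowed to reach the OOM killer.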

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/3] PM/Hibernate: Move memory shrinking to snapshot.c
@ 2009-05-01 22:28                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |   76 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/power/swsusp.c   |   76 ------------------------------------------------
 2 files changed, 76 insertions(+), 76 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -569,6 +577,74 @@ static unsigned long memory_bm_next_pfn(
 }
 
 /**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
+/**
  *	This structure represents a range of page frames the contents of which
  *	should not be saved during the suspend.
  */
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
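
The per-pass arithmetic in swsusp_shrink_memory() may be easier to follow in
isolation.  The sketch below is a simplified userspace model under stated
assumptions: clamp_request() and next_request() are hypothetical helpers, the
zone accounting is collapsed into a single "shortfall" argument (tmp in the
patch), and image_limit is assumed to be image_size / PAGE_SIZE computed by
the caller.

```c
#include <assert.h>

#define SHRINK_BITE 10000L

/* Clamp one reclaim request the way __shrink_memory() does. */
static long clamp_request(long wanted)
{
	assert(wanted > 0);
	return wanted > SHRINK_BITE ? SHRINK_BITE : wanted;
}

/*
 * Hypothetical model of one loop pass:
 *   shortfall   - pages that still have to be freed for the snapshot to fit
 *   size        - pages the image would contain
 *   image_limit - preferred image size in pages
 * Returns the number of pages to request from reclaim, or 0 if the pass
 * has nothing left to ask for.
 */
static long next_request(long shortfall, long size, long image_limit)
{
	if (shortfall > 0)
		return clamp_request(shortfall);
	if (size > image_limit)
		return clamp_request(size - image_limit);
	return 0;
}
```

Assuming 4 KiB pages, the default image_size of 500 * 1024 * 1024 bytes is
128000 pages, so next_request(-5, 200000, 128000) asks for SHRINK_BITE pages
and next_request(0, 100000, 128000) asks for none.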

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/3] PM/Hibernate: Move memory shrinking to snapshot.c
  2009-05-01 22:26                             ` Rafael J. Wysocki
                                               ` (2 preceding siblings ...)
  (?)
@ 2009-05-01 22:28                             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, alan-jenkins, jens.axboe, pm list, kernel-testers,
	torvalds

From: Rafael J. Wysocki <rjw@sisk.pl>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |   76 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/power/swsusp.c   |   76 ------------------------------------------------
 2 files changed, 76 insertions(+), 76 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -569,6 +577,74 @@ static unsigned long memory_bm_next_pfn(
 }
 
 /**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
+/**
  *	This structure represents a range of page frames the contents of which
  *	should not be saved during the suspend.
  */
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/3] PM/Hibernate: Move memory shrinking to snapshot.c
@ 2009-05-01 22:28                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA, pm list

From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>

The next patch will change the memory shrinking code so that it makes
the kernel free memory by allocating pages, instead of using an
artificial memory shrinking mechanism.  To prepare for that, move
swsusp_shrink_memory() from kernel/power/swsusp.c to
kernel/power/snapshot.c, because the new memory-shrinking code is
going to use things that are local to kernel/power/snapshot.c.

Signed-off-by: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
---
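
For orientation, the size of the request that swsusp_shrink_memory()
issues in each loop iteration can be sketched in userspace C.  This is
only an illustration: shrink_request() and its parameters are
hypothetical names, not part of the patch, and the zone accounting that
produces the deficit is left out.

```c
#include <assert.h>

#define SHRINK_BITE 10000L	/* same per-iteration cap as in the patch */

/*
 * Hypothetical helper (not in the kernel): pages to request in one
 * iteration of the shrinking loop.
 *
 * deficit:     pages still missing before a snapshot is possible (may be <= 0)
 * size:        estimated image size in pages
 * image_bytes: preferred image size in bytes (cf. /sys/power/image_size)
 * page_size:   page size in bytes
 *
 * Returns 0 when there is nothing left to shrink.
 */
long shrink_request(long deficit, long size,
		    unsigned long image_bytes, unsigned long page_size)
{
	long image_pages, want = deficit;

	assert(page_size > 0);
	image_pages = (long)(image_bytes / page_size);

	/* Enough memory for a snapshot, but the image is above the preferred size. */
	if (want <= 0 && size > image_pages)
		want = size - image_pages;

	/* Never ask for more than one "bite" at a time. */
	if (want > SHRINK_BITE)
		want = SHRINK_BITE;

	return want > 0 ? want : 0;
}
```

With the default image_size of 500 MB and 4 KB pages the preferred image
is 128000 pages, so an estimated image of 130000 pages with no deficit
yields a request for 2000 pages.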
 kernel/power/snapshot.c |   76 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/power/swsusp.c   |   76 ------------------------------------------------
 2 files changed, 76 insertions(+), 76 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -569,6 +577,74 @@ static unsigned long memory_bm_next_pfn(
 }
 
 /**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
+/**
  *	This structure represents a range of page frames the contents of which
  *	should not be saved during the suspend.
  */
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-01 22:29                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it makes the
kernel free memory by allocating pages, instead of using an artificial
memory shrinking mechanism.  Remove the shrinking of memory from the
suspend-to-RAM code, where it is not really necessary.  Finally,
remove the memory shrinking functions from mm/vmscan.c, which are no
longer used.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
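
The allocate-and-mark idea described above can be tried outside the
kernel with a toy frame pool.  The sketch below is an assumption-laden
userspace model: frame_pool, frame_bitmap and the three functions are
hypothetical stand-ins for the page allocator, struct memory_bitmap,
alloc_and_mark_pages() and free_marked_pages(), and they do not use any
kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define NR_FRAMES	64	/* hypothetical pool size */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS	((NR_FRAMES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct frame_pool {
	unsigned char in_use[NR_FRAMES];	/* 1 = frame is allocated */
};

struct frame_bitmap {
	unsigned long bits[NR_WORDS];		/* 1 = frame was allocated by us */
};

/* Take the first free frame; return its number, or -1 if none is left. */
long pool_alloc(struct frame_pool *pool)
{
	long pfn;

	for (pfn = 0; pfn < NR_FRAMES; pfn++) {
		if (!pool->in_use[pfn]) {
			pool->in_use[pfn] = 1;
			return pfn;
		}
	}
	return -1;
}

/*
 * Allocate nr frames and mark each of them in bm.  Returns 0 on success
 * or -1 if the pool ran dry; frames obtained before the failure stay
 * marked, so the caller can still release them with free_marked().
 */
int alloc_and_mark(struct frame_pool *pool, struct frame_bitmap *bm, long nr)
{
	while (nr-- > 0) {
		long pfn = pool_alloc(pool);

		if (pfn < 0)
			return -1;
		bm->bits[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
	}
	return 0;
}

/* Release every marked frame, clear the marks, return the number released. */
long free_marked(struct frame_pool *pool, struct frame_bitmap *bm)
{
	long pfn, freed = 0;

	for (pfn = 0; pfn < NR_FRAMES; pfn++) {
		unsigned long mask = 1UL << (pfn % BITS_PER_WORD);

		if (bm->bits[pfn / BITS_PER_WORD] & mask) {
			pool->in_use[pfn] = 0;
			freed++;
		}
	}
	memset(bm, 0, sizeof(*bm));
	return freed;
}
```

Because only marked frames are released, frames that were in use before
the shrinking started are left alone, and a failed allocation needs no
special unwinding beyond the one free_marked() call on the exit path.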
 kernel/power/main.c     |   25 +-------
 kernel/power/snapshot.c |  118 ++++++++++++++++++++++++++++++---------
 mm/vmscan.c             |  142 ------------------------------------------------
 3 files changed, 93 insertions(+), 192 deletions(-)
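
One way to read the "- 2 * alloc_normal" and "- 2 * alloc_highmem"
terms in the new loop is shown below.  It rests on an assumption that
the patch does not state explicitly: pages held by the shrinking code
are counted as data pages while they are held, and each of them becomes
a free page once free_marked_pages() has run.  Under that assumption a
held page is removed once from the data count and added once to the
free count, which changes the shortfall by two.  The function names are
hypothetical.

```c
#include <assert.h>

/* Shortfall computed from the raw counters while `held` pages are held. */
long shortfall_while_holding(long data_pages, long free_pages,
			     long reserve, long held)
{
	return data_pages + reserve - 2 * held - free_pages;
}

/* The same shortfall, computed from the counters as they would look
 * after the held pages have been released (assumption: see above). */
long shortfall_after_release(long data_pages, long free_pages,
			     long reserve, long held)
{
	long data_after = data_pages - held;	/* held pages are no longer data */
	long free_after = free_pages + held;	/* ...and are free instead */

	return data_after + reserve - free_after;
}
```

For example, with 1000 data pages, 200 free pages and a reserve of 50,
the shortfall is 850 pages before anything is held and 250 pages once
300 pages are held.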

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -576,39 +576,89 @@ static unsigned long memory_bm_next_pfn(
 	return bb->start_pfn + bit;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
+ * free_marked_pages - release pages allocated during memory shrinking
+ * @bm: Memory bitmap where the allocated pages were marked.
  *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Free all pages marked in given memory bitmap.
  */
+static void free_marked_pages(struct memory_bitmap *bm)
+{
+	memory_bm_position_reset(bm);
+	for (;;) {
+		unsigned long pfn;
+		struct page *page;
 
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+		pfn = memory_bm_next_pfn(bm);
+		if (pfn == BM_END_OF_MAP)
+			break;
+		page = pfn_to_page(pfn);
+		__free_page(page);
+	}
+}
+
+/**
+ * alloc_and_mark_pages - allocate given number of pages and mark their PFNs
+ * @bm: Memory bitmap to use for marking allocated pages.
+ * @nr_pages: Number of pages to allocate.
+ *
+ * Allocate given number of pages and mark their PFNs in given memory bitmap,
+ * so that they can be released by free_marked_pages().
+ * Return value: The number of normal (i.e. non-highmem) pages allocated or
+ * -ENOMEM on failure.
+ */
+static long alloc_and_mark_pages(struct memory_bitmap *bm, long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	long nr_normal = 0;
+
+	while (nr_pages-- > 0) {
+		struct page *page;
+
+		page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
+		if (!page)
+			return -ENOMEM;
+		memory_bm_set_bit(bm, page_to_pfn(page));
+		if (!PageHighMem(page))
+			nr_normal++;
+	}
+
+	return nr_normal;
 }
 
+#define SHRINK_BITE	10000
+
+/**
+ * swsusp_shrink_memory - Try to make the kernel free as much memory as needed
+ */
 int swsusp_shrink_memory(void)
 {
 	long tmp;
 	struct zone *zone;
-	unsigned long pages = 0;
+	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
+	struct memory_bitmap *bm;
+	int error;
+
+	bm = kzalloc(sizeof(*bm), GFP_KERNEL);
+	if (!bm)
+		return -ENOMEM;
+	error = memory_bm_create(bm, GFP_KERNEL, PG_ANY);
+	if (error)
+		return error;
 
 	printk(KERN_INFO "PM: Shrinking memory...  ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+	for (;;) {
+		long size, highmem_size, ret;
+
+		highmem_size = count_highmem_pages() - 2 * alloc_highmem;
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES
+			- 2 * alloc_normal;
 		tmp = size;
 		size += highmem_size;
 		for_each_populated_zone(zone) {
@@ -621,27 +671,39 @@ int swsusp_shrink_memory(void)
 				tmp += zone->lowmem_reserve[ZONE_NORMAL];
 			}
 		}
-
 		if (highmem_size < 0)
 			highmem_size = 0;
-
 		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
+
+		if (tmp <= 0 && size > image_size / PAGE_SIZE)
+			tmp = size - (image_size / PAGE_SIZE);
+
+		if (tmp > SHRINK_BITE)
+			tmp = SHRINK_BITE;
+		else if (tmp <= 0)
+			break;
+
+		ret = alloc_and_mark_pages(bm, tmp);
+		if (ret < 0) {
+			error = -ENOMEM;
+			goto out;
 		}
+		alloc_normal += ret;
+		alloc_highmem += tmp - ret;
+		pages += tmp;
+
 		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	}
+
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk("\bdone (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
-	return 0;
+ out:
+	free_marked_pages(bm);
+	memory_bm_free(bm, PG_UNSAFE_CLEAR);
+
+	return error;
 }
 
 /**
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -195,9 +195,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -233,7 +230,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -250,26 +246,11 @@ static int suspend_prepare(void)
 
 	if (suspend_freeze_processes()) {
 		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	transition_in_progress = true;
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
-	if (!error)
+	} else {
+		transition_in_progress = true;
 		return 0;
+	}
 
-	transition_in_progress = false;
-
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-01 22:29                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-01 22:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA, pm list

From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>

Modify the hibernation memory shrinking code so that it makes the
kernel free memory by allocating pages, instead of using an artificial
memory shrinking mechanism.  Remove the shrinking of memory from the
suspend-to-RAM code, where it is not really necessary.  Finally,
remove the memory shrinking functions from mm/vmscan.c, which are no
longer used.

Signed-off-by: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
---
 kernel/power/main.c     |   25 +-------
 kernel/power/snapshot.c |  118 ++++++++++++++++++++++++++++++---------
 mm/vmscan.c             |  142 ------------------------------------------------
 3 files changed, 93 insertions(+), 192 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -576,39 +576,89 @@ static unsigned long memory_bm_next_pfn(
 	return bb->start_pfn + bit;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
+ * free_marked_pages - release pages allocated during memory shrinking
+ * @bm: Memory bitmap where the allocated pages were marked.
  *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Free all pages marked in given memory bitmap.
  */
+static void free_marked_pages(struct memory_bitmap *bm)
+{
+	memory_bm_position_reset(bm);
+	for (;;) {
+		unsigned long pfn;
+		struct page *page;
 
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+		pfn = memory_bm_next_pfn(bm);
+		if (pfn == BM_END_OF_MAP)
+			break;
+		page = pfn_to_page(pfn);
+		__free_page(page);
+	}
+}
+
+/**
+ * alloc_and_mark_pages - allocate given number of pages and mark their PFNs
+ * @bm: Memory bitmap to use for marking allocated pages.
+ * @nr_pages: Number of pages to allocate.
+ *
+ * Allocate given number of pages and mark their PFNs in given memory bitmap,
+ * so that they can be released by free_marked_pages().
+ * Return value: The number of normal (i.e. non-highmem) pages allocated or
+ * -ENOMEM on failure.
+ */
+static long alloc_and_mark_pages(struct memory_bitmap *bm, long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	long nr_normal = 0;
+
+	while (nr_pages-- > 0) {
+		struct page *page;
+
+		page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
+		if (!page)
+			return -ENOMEM;
+		memory_bm_set_bit(bm, page_to_pfn(page));
+		if (!PageHighMem(page))
+			nr_normal++;
+	}
+
+	return nr_normal;
 }
 
+#define SHRINK_BITE	10000
+
+/**
+ * swsusp_shrink_memory -  Try to make the kernel free as much memory as needed
+ */
 int swsusp_shrink_memory(void)
 {
 	long tmp;
 	struct zone *zone;
-	unsigned long pages = 0;
+	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
+	struct memory_bitmap *bm;
+	int error;
+
+	bm = kzalloc(sizeof(*bm), GFP_KERNEL);
+	if (!bm)
+		return -ENOMEM;
+	error = memory_bm_create(bm, GFP_KERNEL, PG_ANY);
+	if (error)
+		return error;
 
 	printk(KERN_INFO "PM: Shrinking memory...  ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+	for (;;) {
+		long size, highmem_size, ret;
+
+		highmem_size = count_highmem_pages() - 2 * alloc_highmem;
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES
+			- 2 * alloc_normal;
 		tmp = size;
 		size += highmem_size;
 		for_each_populated_zone(zone) {
@@ -621,27 +671,39 @@ int swsusp_shrink_memory(void)
 				tmp += zone->lowmem_reserve[ZONE_NORMAL];
 			}
 		}
-
 		if (highmem_size < 0)
 			highmem_size = 0;
-
 		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
+
+		if (tmp <= 0 && size > image_size / PAGE_SIZE)
+			tmp = size - (image_size / PAGE_SIZE);
+
+		if (tmp > SHRINK_BITE)
+			tmp = SHRINK_BITE;
+		else if (tmp <= 0)
+			break;
+
+		ret = alloc_and_mark_pages(bm, tmp);
+		if (ret < 0) {
+			error = -ENOMEM;
+			goto out;
 		}
+		alloc_normal += ret;
+		alloc_highmem += tmp - ret;
+		pages += tmp;
+
 		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	}
+
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk("\bdone (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
-	return 0;
+ out:
+	free_marked_pages(bm);
+	memory_bm_free(bm, PG_UNSAFE_CLEAR);
+
+	return error;
 }
 
 /**
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -195,9 +195,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -233,7 +230,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -250,26 +246,11 @@ static int suspend_prepare(void)
 
 	if (suspend_freeze_processes()) {
 		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	transition_in_progress = true;
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
-	if (!error)
+	} else {
+		transition_in_progress = true;
 		return 0;
+	}
 
-	transition_in_progress = false;
-
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/3] PM: Disable OOM killer during system-wide power transitions
@ 2009-05-01 23:09                                 ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-01 23:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Sat, 2 May 2009 00:27:30 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The OOM killer is not particularly useful during system-wide power
> transitions, so do not use it if such a transition is in progress.
> 

so...  I think what you've done here is to arrange for the page
allocator to return NULL if we're hibernating rather than oom-killing,
yes?

Does the same apply to suspending?  If so, why?

I think this is an OK change, as long as the only thing which is
allocating memory is hibernation itself.  If random processes are still
doing random memory allocations at this time then their failed memory
allocation could be just as fatal as an oom-killing.  More so if they're
/sbin/init or whatever.

So is it the case that pm_transition_in_progress() is only true during
the highly-constrained hibernation process?  After everything is frozen?

If so, there are alternatives - the calling process could set
PF_DONT_KILL_ANYONE_FOR_ME, or could pass __GFP_DONT_KILL_ANYONE_FOR_ME. 
Those might be worse alternatives, dunno - I'm just asking probing
questions ;)


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-01 23:14                                 ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-01 23:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Sat, 2 May 2009 00:29:38 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Modify the hibernation memory shrinking code so that it will make
> memory allocations to free memory instead of using an artificial
> memory shrinking mechanism for that.  Remove the shrinking of
> memory from the suspend-to-RAM code, where it is not really
> necessary.  Finally, remove the no longer used memory shrinking
> functions from mm/vmscan.c .
> 
> ...
>
> +static long alloc_and_mark_pages(struct memory_bitmap *bm, long nr_pages)
>  {
> -	if (tmp > SHRINK_BITE)
> -		tmp = SHRINK_BITE;
> -	return shrink_all_memory(tmp);
> +	long nr_normal = 0;
> +
> +	while (nr_pages-- > 0) {
> +		struct page *page;
> +
> +		page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
> +		if (!page)
> +			return -ENOMEM;
> +		memory_bm_set_bit(bm, page_to_pfn(page));
> +		if (!PageHighMem(page))
> +			nr_normal++;
> +	}
> +
> +	return nr_normal;
>  }

Do we need the bitmap?  I expect we can just string all these pages
onto a local list via page.lru.  Would need to check that - the
pageframe fields are quite overloaded.
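
A minimal userspace sketch of the list-based alternative raised above, for illustration only.  struct mock_page and both helpers are hypothetical stand-ins, not kernel code; the lru_next member merely plays the role of page.lru, and whether that field is actually free to use at this point is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified page descriptor. */
struct mock_page {
	struct mock_page *lru_next;	/* stands in for the page.lru link */
};

/* Allocate nr descriptors and push each onto *head; return the count. */
static long mock_alloc_to_list(struct mock_page **head, long nr)
{
	long done = 0;

	while (done < nr) {
		struct mock_page *page = malloc(sizeof(*page));

		if (!page)
			break;
		page->lru_next = *head;	/* reuses a field inside the page */
		*head = page;
		done++;
	}
	return done;
}

/* Pop and free every descriptor on the list; return how many were freed. */
static long mock_free_list(struct mock_page **head)
{
	long freed = 0;

	while (*head) {
		struct mock_page *page = *head;

		*head = page->lru_next;
		free(page);
		freed++;
	}
	assert(*head == NULL);
	return freed;
}
```

The list needs no storage of its own, but it only works if nothing else uses the link field while the pages are held.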


> ...
>
> +#define SHRINK_BITE	10000
> +		long size, highmem_size, ret;
> +
> +		highmem_size = count_highmem_pages() - 2 * alloc_highmem;
> +		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES
> +			- 2 * alloc_normal;

It'd be nice if this head-spinning arithmetic were spelled out in a
comment somewhere.  There are rather a lot of magic-number heuristics
in here.
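
One plausible reading of the "2 * alloc" terms, which the thread itself does not confirm, is sketched below.  It assumes that a preallocated page is both counted by count_data_pages() (it is in use) and missing from the free page counters, so it has to be corrected for twice.  The function name, its parameters and all numbers are hypothetical; highmem and the per-zone reserves are ignored.

```c
#include <assert.h>

/*
 * Illustrative model only.  'data' stands for count_data_pages(),
 * 'reserve' for PAGES_FOR_IO + SPARE_PAGES, 'free_pages' for the sum of
 * the zone free page counters, 'alloc_normal' for the number of pages
 * preallocated so far.
 */
static long pages_still_needed(long data, long reserve, long free_pages,
			       long alloc_normal)
{
	long size;

	assert(alloc_normal >= 0);
	/*
	 * Each preallocated page will be released before the image is
	 * created.  Subtract it once to drop it from the data count and
	 * once more to credit it back to the free count.
	 */
	size = data + reserve - 2 * alloc_normal;

	return size - free_pages;
}
```

For instance, with 1000 data pages, 100 reserved and 200 free, 900 more pages are needed.  If 300 pages are then preallocated, 200 taken from the free pool and 100 reclaimed from data, the counters read 1200 data pages and 0 free, and the formula gives 1200 + 100 - 600 - 0 = 700.  That agrees with the state after the pages are released: 900 data pages, 300 free, 100 reserved.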

>  		tmp = size;
>  		size += highmem_size;
>  		for_each_populated_zone(zone) {
> @@ -621,27 +671,39 @@ int swsusp_shrink_memory(void)

All looks pretty sane to me.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/3] PM: Disable OOM killer during system-wide power transitions
@ 2009-05-02 11:34                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-02 11:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Saturday 02 May 2009, Andrew Morton wrote:
> On Sat, 2 May 2009 00:27:30 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > The OOM killer is not particularly useful during system-wide power
> > transitions, so do not use it if such a transition is in progress.
> > 
> 
> so...  I think what you've done here is to arrange for the page
> allocator to return NULL if we're hibernating rather than oom-killing,
> yes?

Yes.

> Does the same apply to suspending?  If so, why?

Yes, because I think the OOM killer doesn't work during suspend anyway.  User
space processes are frozen and effectively in TASK_UNINTERRUPTIBLE, so they
won't be killed.

> I think this is an OK change, as long as the only thing which is
> allocating memory is hibernation itself.  If random processes are still
> doing random memory allocations at this time then their failed memory
> allocation could be just as fatal as an oom-killing.  More so if they're
> /sbin/init or whatever.

At this point all of the user space tasks are frozen.

> So is it the case that pm_transition_in_progress() is only true during
> the highly-constrained hibernation process?  After everything is frozen?

Yes.

> If so, there are alternatives - the calling process could set
> PF_DONT_KILL_ANYONE_FOR_ME, or could pass __GFP_DONT_KILL_ANYONE_FOR_ME. 
> Those might be worse alternatives, dunno - I'm just asking probing
> questions ;)

Well, the __GFP_DONT_KILL_ANYONE_FOR_ME variant would be easier to implement.

I don't really think that the freezing of processes plays very well with the
OOM killer, so disabling it altogether after the processes have been frozen
seems reasonable to me.  Still, if you prefer to use
__GFP_DONT_KILL_ANYONE_FOR_ME, it's fine by me.
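
A minimal userspace sketch of the behaviour discussed above, assuming a single global flag: when memory runs out during a power transition, the allocation fails instead of triggering the OOM killer.  Every name here (mock_alloc_page(), mock_oom_kill(), pages_left and so on) is a hypothetical stand-in, and the "killer" simply pretends to release 8 pages; none of this is the real page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's real data structures. */
static bool transition_in_progress;	/* set once tasks are frozen */
static size_t pages_left;		/* simulated pool of free pages */
static unsigned int oom_kills;		/* how often the mock killer ran */

/* Simulated OOM killer: pretend killing a task releases 8 pages. */
static void mock_oom_kill(void)
{
	oom_kills++;
	pages_left += 8;
}

/*
 * Simulated allocator slow path.  When the pool is empty and a power
 * transition is in progress, fail the allocation instead of killing
 * anything; otherwise fall back to the mock OOM killer.
 */
static void *mock_alloc_page(void)
{
	if (pages_left == 0) {
		if (transition_in_progress)
			return NULL;
		mock_oom_kill();
		assert(pages_left > 0);
	}
	pages_left--;
	return malloc(4096);
}
```

A per-call flag in the spirit of __GFP_DONT_KILL_ANYONE_FOR_ME would move the check from the global variable to an argument of the allocation function, so only callers that ask for it would see the NULL return.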

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-02 11:46                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-02 11:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Saturday 02 May 2009, Andrew Morton wrote:
> On Sat, 2 May 2009 00:29:38 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Modify the hibernation memory shrinking code so that it will make
> > memory allocations to free memory instead of using an artificial
> > memory shrinking mechanism for that.  Remove the shrinking of
> > memory from the suspend-to-RAM code, where it is not really
> > necessary.  Finally, remove the no longer used memory shrinking
> > functions from mm/vmscan.c .
> > 
> > ...
> >
> > +static long alloc_and_mark_pages(struct memory_bitmap *bm, long nr_pages)
> >  {
> > -	if (tmp > SHRINK_BITE)
> > -		tmp = SHRINK_BITE;
> > -	return shrink_all_memory(tmp);
> > +	long nr_normal = 0;
> > +
> > +	while (nr_pages-- > 0) {
> > +		struct page *page;
> > +
> > +		page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
> > +		if (!page)
> > +			return -ENOMEM;
> > +		memory_bm_set_bit(bm, page_to_pfn(page));
> > +		if (!PageHighMem(page))
> > +			nr_normal++;
> > +	}
> > +
> > +	return nr_normal;
> >  }
> 
> Do we need the bitmap?  I expect we can just string all these pages
> onto a local list via page.lru.  Would need to check that - the
> pageframe fields are quite overloaded.

The overloading of the page frame fields is the reason why we use the bitmaps
for hibernation. :-)
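
For illustration, a minimal userspace sketch of bitmap-based tracking, in which allocated frames are remembered by page frame number only and no field inside the page is touched.  struct pfn_bitmap, the frames[] array and the mock_* helpers are hypothetical and much simpler than the kernel's struct memory_bitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NR_FRAMES	64	/* hypothetical number of page frames */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS	((NR_FRAMES + WORD_BITS - 1) / WORD_BITS)

/* One bit per page frame number; nothing is stored in the page itself. */
struct pfn_bitmap {
	unsigned long bits[NR_WORDS];
};

static void *frames[NR_FRAMES];	/* mock page frames, indexed by PFN */

static void pfn_bitmap_set(struct pfn_bitmap *bm, unsigned long pfn)
{
	assert(pfn < NR_FRAMES);
	bm->bits[pfn / WORD_BITS] |= 1UL << (pfn % WORD_BITS);
}

static int pfn_bitmap_test(const struct pfn_bitmap *bm, unsigned long pfn)
{
	return (int)((bm->bits[pfn / WORD_BITS] >> (pfn % WORD_BITS)) & 1UL);
}

/* Allocate up to nr frames, marking each PFN; return how many succeeded. */
static long mock_alloc_and_mark(struct pfn_bitmap *bm, long nr)
{
	long done = 0;
	unsigned long pfn;

	for (pfn = 0; pfn < NR_FRAMES && done < nr; pfn++) {
		if (frames[pfn])
			continue;	/* frame already in use */
		frames[pfn] = malloc(4096);
		if (!frames[pfn])
			break;
		pfn_bitmap_set(bm, pfn);
		done++;
	}
	return done;
}

/* Release every frame whose PFN is marked; return how many were freed. */
static long mock_free_marked(struct pfn_bitmap *bm)
{
	long freed = 0;
	unsigned long pfn;

	for (pfn = 0; pfn < NR_FRAMES; pfn++) {
		if (!pfn_bitmap_test(bm, pfn))
			continue;
		free(frames[pfn]);
		frames[pfn] = NULL;
		freed++;
	}
	memset(bm, 0, sizeof(*bm));
	return freed;
}
```

The bitmap costs extra memory up front, but freeing the marked pages never depends on what the page descriptor fields currently hold.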

> > ...
> >
> > +#define SHRINK_BITE	10000
> > +		long size, highmem_size, ret;
> > +
> > +		highmem_size = count_highmem_pages() - 2 * alloc_highmem;
> > +		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES
> > +			- 2 * alloc_normal;
> 
> It'd be nice if this head-spinning arithmetic were spelled out in a
> comment somewhere.  There are rather a lot of magic-number heuristics
> in here.

Well, yeah.  I'll try to write up something. :-)

> >  		tmp = size;
> >  		size += highmem_size;
> >  		for_each_populated_zone(zone) {
> > @@ -621,27 +671,39 @@ int swsusp_shrink_memory(void)
> 
> All looks pretty sane to me.

Great, thanks for the comments!

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-02 11:46                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-02 11:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Saturday 02 May 2009, Andrew Morton wrote:
> On Sat, 2 May 2009 00:29:38 +0200
> "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
> 
> > From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
> > 
> > Modify the hibernation memory shrinking code so that it will make
> > memory allocations to free memory instead of using an artificial
> > memory shrinking mechanism for that.  Remove the shrinking of
> > memory from the suspend-to-RAM code, where it is not really
> > necessary.  Finally, remove the no longer used memory shrinking
> > functions from mm/vmscan.c .
> > 
> > ...
> >
> > +static long alloc_and_mark_pages(struct memory_bitmap *bm, long nr_pages)
> >  {
> > -	if (tmp > SHRINK_BITE)
> > -		tmp = SHRINK_BITE;
> > -	return shrink_all_memory(tmp);
> > +	long nr_normal = 0;
> > +
> > +	while (nr_pages-- > 0) {
> > +		struct page *page;
> > +
> > +		page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
> > +		if (!page)
> > +			return -ENOMEM;
> > +		memory_bm_set_bit(bm, page_to_pfn(page));
> > +		if (!PageHighMem(page))
> > +			nr_normal++;
> > +	}
> > +
> > +	return nr_normal;
> >  }
> 
> Do we need the bitmap?  I expect we can just string all these pages
> onto a local list via page.lru.  Would need to check that - the
> pageframe fields are quite overloaded.

This is the reason why we use the bitmaps for hibernation. :-)

> > ...
> >
> > +#define SHRINK_BITE	10000
> > +		long size, highmem_size, ret;
> > +
> > +		highmem_size = count_highmem_pages() - 2 * alloc_highmem;
> > +		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES
> > +			- 2 * alloc_normal;
> 
> It'd be nice if this head-spinning arithmetic were spelled out in a
> comment somewhere.  There are rather a lot of magic-number heuristics
> in here.

Well, yeah.  I'll try to write up something. :-)

> >  		tmp = size;
> >  		size += highmem_size;
> >  		for_each_populated_zone(zone) {
> > @@ -621,27 +671,39 @@ int swsusp_shrink_memory(void)
> 
> All looks pretty sane to me.

Great, thanks for the comments!

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory
@ 2009-05-02 17:49                                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-02 17:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Sat, 2 May 2009 13:46:34 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > Do we need the bitmap?  I expect we can just string all these pages
> > onto a local list via page.lru.  Would need to check that - the
> > pageframe fields are quite overloaded.
> 
> This is the reason why we use the bitmaps for hibernation. :-)

grep the tree for page->lru and you'll see that quite a few page
consumers are using it.  So you'd be pretty safe doing it this way.

Whether it's _worth_ doing it this way is debatable, given that
hibernation uses bitmaps elsewhere.  But it would shrink the patch a
bit I expect?
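
A minimal userspace sketch of the page->lru idea discussed above, assuming
hypothetical stand-ins (struct list_node for the kernel's struct list_head,
struct fake_page for struct page, malloc() for alloc_page()).  It only
illustrates stringing allocated pages onto a local list and releasing them
later; it is not the code from the patch, which uses a bitmap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for struct page; only the lru link is modelled. */
struct fake_page {
	struct list_node lru;	/* must stay the first member (see release) */
	int highmem;		/* nonzero if this would be a highmem page */
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/*
 * Allocate nr_pages fake pages and link each one onto the list.
 * Returns the number of non-highmem pages, or -1 if an allocation fails
 * (pages already linked stay on the list for the caller to release).
 */
static long alloc_and_link_pages(struct list_node *head, long nr_pages)
{
	long nr_normal = 0;

	while (nr_pages-- > 0) {
		struct fake_page *page = malloc(sizeof(*page));

		if (!page)
			return -1;
		/* Arbitrary pattern: every fourth page counts as highmem. */
		page->highmem = (nr_pages % 4 == 0);
		list_add_tail_node(&page->lru, head);
		if (!page->highmem)
			nr_normal++;
	}
	return nr_normal;
}

/* Unlink and free every page on the list; returns how many were freed. */
static long release_linked_pages(struct list_node *head)
{
	long freed = 0;

	while (head->next != head) {
		struct list_node *node = head->next;

		head->next = node->next;
		node->next->prev = head;
		/* lru is the first member, so the cast recovers the page. */
		free((struct fake_page *)node);
		freed++;
	}
	return freed;
}
```

The list needs no storage beyond the link embedded in each page, which is
why it would shrink the patch; the bitmap needs its own memory but does not
touch any field of the page itself.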

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
@ 2009-05-03  0:20                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Saturday 02 May 2009, Andrew Morton wrote:
> On Sat, 2 May 2009 13:46:34 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > Do we need the bitmap?  I expect we can just string all these pages
> > > onto a local list via page.lru.  Would need to check that - the
> > > pageframe fields are quite overloaded.
> > 
> > This is the reason why we use the bitmaps for hibernation. :-)
> 
> grep the tree for page->lru and you'll see that quite a few page
> consumers are using it.  So you'd be pretty safe doing it this way.
> 
> Whether it's _worth_ doing it this way is debatable, given that
> hibernation uses bitmaps elsewhere.  But it would shrink the patch a
> bit I expect?

It probably would, but it turns out we need not create the new bitmap, we
can use the existing ones for marking the allocated pages.  That also has
the benefit that we can use swsusp_free() to release them.

Modified patch series follows:

[1/4] - your patch introducing __GFP_NO_OOM_KILL (I decided it would be better
        to do it this way in this particular case.  The fact that the OOM killer
        is not going to work after tasks have been frozen is a different issue.)

[2/4] - move swsusp_shrink_memory to snapshot.c, no major changes

[3/4] - use memory allocations to make room for the image (added
        comments, used the existing bitmaps, cleaned up a bit)

[4/4] - new thing: do not release memory allocated by [3/4] and use it for
        creating the image directly.
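
The approach of marking allocated pages in a bitmap, so that one routine can
later release everything that is marked, can be sketched in userspace C as
follows.  The names (struct pfn_bitmap, bm_set_bit, bm_release_all) and the
size NR_PFNS are hypothetical, and a single bitmap is assumed for brevity;
this is not the interface of kernel/power/snapshot.c:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_PFNS		256	/* hypothetical number of page frames */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS	((NR_PFNS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per page frame number (pfn). */
struct pfn_bitmap {
	unsigned long words[BITMAP_WORDS];
};

static void bm_set_bit(struct pfn_bitmap *bm, unsigned long pfn)
{
	bm->words[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
}

static void bm_clear_bit(struct pfn_bitmap *bm, unsigned long pfn)
{
	bm->words[pfn / BITS_PER_WORD] &= ~(1UL << (pfn % BITS_PER_WORD));
}

static int bm_test_bit(const struct pfn_bitmap *bm, unsigned long pfn)
{
	return (bm->words[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL;
}

/*
 * Walk the whole bitmap and "release" every marked frame by clearing its
 * bit.  A real implementation would free the page here as well.
 * Returns the number of frames released.
 */
static unsigned long bm_release_all(struct pfn_bitmap *bm)
{
	unsigned long pfn, released = 0;

	for (pfn = 0; pfn < NR_PFNS; pfn++) {
		if (bm_test_bit(bm, pfn)) {
			bm_clear_bit(bm, pfn);
			released++;
		}
	}
	return released;
}
```

Because the release routine only looks at the bitmap, pages marked by the
shrinking code and pages marked while creating the image can be freed by the
same walk, which is the benefit mentioned above.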

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 1/4] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-03  0:22                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

From: Andrew Morton <akpm@linux-foundation.org>

> > Remind me: why can't we just allocate N pages at suspend-time?
> 
> We need half of memory free. The reason we can't "just allocate" is
> probably OOM killer; but my memories are quite weak :-(.

hm.  You'd think that with our splendid range of __GFP_foo flags, there
would be some combo which would suit this requirement but I can't
immediately spot one.

We can always add another I guess.  Something like...

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/gfp.h |    3 ++-
 mm/page_alloc.c     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1620,7 +1620,8 @@ nofail_alloc:
 		}
 
 		/* The OOM killer will not help higher order allocs so fail */
-		if (order > PAGE_ALLOC_COSTLY_ORDER) {
+		if (order > PAGE_ALLOC_COSTLY_ORDER ||
+				(gfp_mask & __GFP_NO_OOM_KILL)) {
 			clear_zonelist_oom(zonelist, gfp_mask);
 			goto nopage;
 		}
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -51,8 +51,9 @@ struct vm_area_struct;
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
 #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
 
-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* This equals 0, but use constants in case they ever change */
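
The bit arithmetic behind the gfp.h hunk can be checked with a small
userspace sketch.  The constant values are copied from the patch above; the
X_ prefix, the plain unsigned int standing in for gfp_t, and the helper
skips_oom_killer() are assumptions made for illustration only, and
X_COSTLY_ORDER assumes PAGE_ALLOC_COSTLY_ORDER is 3:

```c
#include <assert.h>

/* Values copied from the hunk above; unsigned int stands in for gfp_t. */
#define X_GFP_MOVABLE		0x100000u	/* bit 20 */
#define X_GFP_NO_OOM_KILL	0x200000u	/* bit 21, the new flag */
#define X_GFP_BITS_SHIFT	22
#define X_GFP_BITS_MASK		((1u << X_GFP_BITS_SHIFT) - 1)

/* Assumed value of PAGE_ALLOC_COSTLY_ORDER. */
#define X_COSTLY_ORDER		3

/*
 * Model of the modified test in mm/page_alloc.c: the OOM killer is
 * skipped for costly orders or when the caller passed the new flag.
 */
static int skips_oom_killer(unsigned int order, unsigned int gfp_mask)
{
	return order > X_COSTLY_ORDER ||
	       (gfp_mask & X_GFP_NO_OOM_KILL) != 0;
}
```

With the old shift of 21 the mask would cover bits 0 to 20 only, so the new
flag in bit 21 would be masked off; this is why the shift has to grow to 22
together with the new definition.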


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/4] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
@ 2009-05-03  0:23                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

From: Rafael J. Wysocki <rjw@sisk.pl>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c .

[rev. 2: Make some functions static and remove their headers from
 kernel/power/power.h]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/power.h    |    4 --
 kernel/power/snapshot.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/power/swsusp.c   |   76 ---------------------------------------------
 3 files changed, 79 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -840,7 +848,7 @@ static struct page *saveable_highmem_pag
  *	pages.
  */
 
-unsigned int count_highmem_pages(void)
+static unsigned int count_highmem_pages(void)
 {
 	struct zone *zone;
 	unsigned int n = 0;
@@ -902,7 +910,7 @@ static struct page *saveable_page(struct
  *	pages.
  */
 
-unsigned int count_data_pages(void)
+static unsigned int count_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long pfn, max_zone_pfn;
@@ -1058,6 +1066,74 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
 #ifdef CONFIG_HIGHMEM
 /**
   *	count_pages_for_highmem - compute the number of non-highmem pages
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern unsigned int count_data_pages(void);
+extern int swsusp_shrink_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
@@ -149,7 +149,6 @@ extern int swsusp_swap_in_use(void);
 
 /* kernel/power/disk.c */
 extern int swsusp_check(void);
-extern int swsusp_shrink_memory(void);
 extern void swsusp_free(void);
 extern int swsusp_read(unsigned int *flags_p);
 extern int swsusp_write(unsigned int flags);
@@ -176,7 +175,6 @@ extern int pm_notifier_call_chain(unsign
 #endif
 
 #ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
 int restore_highmem(void);
 #else
 static inline unsigned int count_highmem_pages(void) { return 0; }
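
The arithmetic of one pass through the loop in swsusp_shrink_memory(), as
moved by this patch, can be followed in a userspace model.  struct
zone_model and pages_to_free() are hypothetical names, and the inputs are
plain numbers supplied by the caller instead of values read from the VM; the
sketch only mirrors the order of the additions and subtractions above:

```c
#include <assert.h>

/* Hypothetical per-zone inputs that the kernel reads from struct zone. */
struct zone_model {
	int is_highmem;
	long free_pages;	/* zone_page_state(zone, NR_FREE_PAGES) */
	long lowmem_reserve;	/* zone->lowmem_reserve[ZONE_NORMAL] */
	long additional_pages;	/* snapshot_additional_pages(zone) */
};

/*
 * Returns the value "tmp" has before the "if (tmp > 0)" test: the number
 * of pages that still have to be freed (<= 0 means enough are free).
 * reserved stands for PAGES_FOR_IO + SPARE_PAGES.  The image size in
 * pages ("size" in the loop) is stored in *image_pages.
 */
static long pages_to_free(long data_pages, long highmem_pages, long reserved,
			  const struct zone_model *zones, int nr_zones,
			  long *image_pages)
{
	long highmem_size = highmem_pages;
	long size = data_pages + reserved;
	long tmp = size;
	int i;

	size += highmem_size;
	for (i = 0; i < nr_zones; i++) {
		tmp += zones[i].additional_pages;
		if (zones[i].is_highmem) {
			/* Free highmem can hold copies of highmem pages. */
			highmem_size -= zones[i].free_pages;
		} else {
			tmp -= zones[i].free_pages;
			tmp += zones[i].lowmem_reserve;
		}
	}

	/* Highmem pages that do not fit in free highmem need lowmem. */
	if (highmem_size < 0)
		highmem_size = 0;
	tmp += highmem_size;

	*image_pages = size;
	return tmp;
}
```

For example, with 1000 data pages, 300 highmem pages, 1280 reserved pages, a
normal zone with 1500 free, 50 reserved and 10 additional pages, and a
highmem zone with 100 free and 5 additional pages, the model gives
2280 + 10 - 1500 + 50 + 5 + 200 = 1045 pages to free and an image of 2580
pages.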


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/4] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
  2009-05-03  0:20                                       ` Rafael J. Wysocki
                                                         ` (2 preceding siblings ...)
  (?)
@ 2009-05-03  0:23                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, alan-jenkins, jens.axboe, linux-pm, kernel-testers,
	torvalds

From: Rafael J. Wysocki <rjw@sisk.pl>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c .

[rev. 2: Make some functions static and remove their headers from
 kernel/power/power.h]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/power.h    |    4 --
 kernel/power/snapshot.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/power/swsusp.c   |   76 ---------------------------------------------
 3 files changed, 79 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -840,7 +848,7 @@ static struct page *saveable_highmem_pag
  *	pages.
  */
 
-unsigned int count_highmem_pages(void)
+static unsigned int count_highmem_pages(void)
 {
 	struct zone *zone;
 	unsigned int n = 0;
@@ -902,7 +910,7 @@ static struct page *saveable_page(struct
  *	pages.
  */
 
-unsigned int count_data_pages(void)
+static unsigned int count_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long pfn, max_zone_pfn;
@@ -1058,6 +1066,74 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
 #ifdef CONFIG_HIGHMEM
 /**
   *	count_pages_for_highmem - compute the number of non-highmem pages
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern unsigned int count_data_pages(void);
+extern int swsusp_shrink_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
@@ -149,7 +149,6 @@ extern int swsusp_swap_in_use(void);
 
 /* kernel/power/disk.c */
 extern int swsusp_check(void);
-extern int swsusp_shrink_memory(void);
 extern void swsusp_free(void);
 extern int swsusp_read(unsigned int *flags_p);
 extern int swsusp_write(unsigned int flags);
@@ -176,7 +175,6 @@ extern int pm_notifier_call_chain(unsign
 #endif
 
 #ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
 int restore_highmem(void);
 #else
 static inline unsigned int count_highmem_pages(void) { return 0; }

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/4] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
@ 2009-05-03  0:23                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>

The next patch will modify the memory shrinking code so that it
frees memory by making memory allocations instead of using an
artificial memory shrinking mechanism.  To prepare for that, move
swsusp_shrink_memory() from kernel/power/swsusp.c to
kernel/power/snapshot.c, because the new memory-shrinking code will
use things that are local to kernel/power/snapshot.c.

[rev. 2: Make some functions static and remove their declarations
 from kernel/power/power.h]

Signed-off-by: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
---
 kernel/power/power.h    |    4 --
 kernel/power/snapshot.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/power/swsusp.c   |   76 ---------------------------------------------
 3 files changed, 79 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -840,7 +848,7 @@ static struct page *saveable_highmem_pag
  *	pages.
  */
 
-unsigned int count_highmem_pages(void)
+static unsigned int count_highmem_pages(void)
 {
 	struct zone *zone;
 	unsigned int n = 0;
@@ -902,7 +910,7 @@ static struct page *saveable_page(struct
  *	pages.
  */
 
-unsigned int count_data_pages(void)
+static unsigned int count_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long pfn, max_zone_pfn;
@@ -1058,6 +1066,74 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
 #ifdef CONFIG_HIGHMEM
 /**
   *	count_pages_for_highmem - compute the number of non-highmem pages
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern unsigned int count_data_pages(void);
+extern int swsusp_shrink_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
@@ -149,7 +149,6 @@ extern int swsusp_swap_in_use(void);
 
 /* kernel/power/disk.c */
 extern int swsusp_check(void);
-extern int swsusp_shrink_memory(void);
 extern void swsusp_free(void);
 extern int swsusp_read(unsigned int *flags_p);
 extern int swsusp_write(unsigned int flags);
@@ -176,7 +175,6 @@ extern int pm_notifier_call_chain(unsign
 #endif
 
 #ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
 int restore_highmem(void);
 #else
 static inline unsigned int count_highmem_pages(void) { return 0; }
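
For readers following the "size > image_size / PAGE_SIZE" test in
swsusp_shrink_memory(): image_size is kept in bytes, while the loop counts
page frames, so the comparison is done in pages.  A minimal userspace sketch
of that conversion, assuming a 4096-byte page (the helper names and the page
size are illustrative, not kernel code):

```c
/* Assumed page size for this example only; the kernel uses PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 4096UL

/* Convert a preferred image size in bytes to a whole number of pages. */
static unsigned long image_size_to_pages(unsigned long image_size_bytes,
					 unsigned long page_size)
{
	return image_size_bytes / page_size;
}

/*
 * Hypothetical predicate mirroring the patch's comparison: nonzero when the
 * number of saveable pages still exceeds the preferred image size.
 */
static int image_too_big(unsigned long saveable_pages,
			 unsigned long image_size_bytes)
{
	return saveable_pages >
		image_size_to_pages(image_size_bytes, EXAMPLE_PAGE_SIZE);
}
```

With the default of 500 * 1024 * 1024 bytes and 4096-byte pages, the limit
works out to 128000 page frames.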

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03  0:24                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it will make
memory allocations to free memory instead of using an artificial
memory shrinking mechanism for that.  Remove the shrinking of
memory from the suspend-to-RAM code, where it is not really
necessary.  Finally, remove the no longer used memory shrinking
functions from mm/vmscan.c.

[rev. 2: Use the existing memory bitmaps for marking preallocated
 image pages and use swsusp_free() for releasing them, introduce
 GFP_IMAGE, add comments describing the memory shrinking strategy.]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c     |   20 ------
 kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
 mm/vmscan.c             |  142 ------------------------------------------------
 3 files changed, 101 insertions(+), 193 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,41 +1066,97 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
+#ifdef CONFIG_HIGHMEM
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
+#else
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+#endif
+
+#define SHRINK_BITE	10000
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
+ * prealloc_pages - preallocate given number of pages and mark their PFNs
+ * @nr_pages: Number of pages to allocate.
  *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Allocate given number of pages and mark their PFNs in the hibernation memory
+ * bitmaps, so that they can be released by swsusp_free().
+ * Return value: The number of normal (ie. non-highmem) pages allocated or
+ * -ENOMEM on failure.
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+static long prealloc_pages(long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	long nr_normal = 0;
+
+	while (nr_pages-- > 0) {
+		struct page *page;
+
+		page = alloc_image_page(GFP_IMAGE);
+		if (!page)
+			return -ENOMEM;
+		if (!PageHighMem(page))
+			nr_normal++;
+	}
+
+	return nr_normal;
 }
 
+/**
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * preallocate memory in SHRINK_BITE chunks in a loop until the following
+ * condition is satisfied:
+ *
+ * [number of preallocated page frames] >=
+ *	(1/2) * ([total number of page frames in use] + PAGES_FOR_IO
+ *		+ SPARE_PAGES - [number of free page frames])
+ *
+ * because in that case, if all of the preallocated page frames are released,
+ * the total number of free page frames will be equal to or greater than the sum
+ * of the total number of page frames in use with PAGES_FOR_IO and SPARE_PAGES,
+ * which is what we need.
+ *
+ * If image_size is set below the number following from the above inequality,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size.
+ */
 int swsusp_shrink_memory(void)
 {
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
+	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
+	int error = 0;
 
 	printk(KERN_INFO "PM: Shrinking memory...  ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
+	for (;;) {
+		struct zone *zone;
+		long size, highmem_size, tmp, ret;
+
+		/*
+		 * Pages preallocated by this loop are not counted as data pages
+		 * by count_data_pages() and count_highmem_pages(), so we only
+		 * need to subtract their numbers once here to verify the
+		 * satisfaction of the stop condition.
+		 */
+		size = count_data_pages() - alloc_normal;
+		tmp = size + PAGES_FOR_IO + SPARE_PAGES;
+		highmem_size = count_highmem_pages() - alloc_highmem;
 		size += highmem_size;
+		/*
+		 * Highmem is treated differently, because we prefer not to
+		 * store copies of normal page frames in it during image
+		 * creation.
+		 */
 		for_each_populated_zone(zone) {
 			tmp += snapshot_additional_pages(zone);
 			if (is_highmem(zone)) {
@@ -1111,27 +1167,39 @@ int swsusp_shrink_memory(void)
 				tmp += zone->lowmem_reserve[ZONE_NORMAL];
 			}
 		}
-
 		if (highmem_size < 0)
 			highmem_size = 0;
-
 		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
+
+		if (tmp <= 0 && size > image_size / PAGE_SIZE)
+			tmp = size - (image_size / PAGE_SIZE);
+
+		if (tmp > SHRINK_BITE)
+			tmp = SHRINK_BITE;
+		else if (tmp <= 0)
+			break;
+
+		ret = prealloc_pages(tmp);
+		if (ret < 0) {
+			error = -ENOMEM;
+			goto out;
 		}
+		alloc_normal += ret;
+		alloc_highmem += tmp - ret;
+		pages += tmp;
+
 		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	}
+
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk("\bdone (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
-	return 0;
+ out:
+	/* Release the preallocated page frames. */
+	swsusp_free();
+
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -188,9 +188,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -226,7 +223,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -241,24 +237,10 @@ static int suspend_prepare(void)
 	if (error)
 		goto Finish;
 
-	if (suspend_freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
+	error = suspend_freeze_processes();
 	if (!error)
 		return 0;
 
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:
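
The per-iteration request size computed by the new loop in
swsusp_shrink_memory() can be sketched as a pure function.  This is a
simplified userspace model, not the kernel code: it ignores the highmem
split, snapshot_additional_pages() and lowmem_reserve, and it folds
PAGES_FOR_IO + SPARE_PAGES into a single "reserve" argument.  The function
name and parameters are assumptions made for illustration.

```c
#define SHRINK_BITE	10000L

/*
 * Return the number of pages the next pass would try to preallocate,
 * or 0 if the loop would stop.
 *
 * in_use      - saveable page frames, excluding ones already preallocated
 * free_pages  - page frames that are currently free
 * reserve     - stand-in for PAGES_FOR_IO + SPARE_PAGES
 * image_pages - preferred image size expressed in pages
 */
static long next_prealloc_request(long in_use, long free_pages,
				  long reserve, long image_pages)
{
	/* Shortfall: pages needed for the copy and the reserve. */
	long tmp = in_use + reserve - free_pages;

	/* Enough room already, but the image is larger than preferred. */
	if (tmp <= 0 && in_use > image_pages)
		tmp = in_use - image_pages;

	/* Never ask for more than one bite at a time. */
	if (tmp > SHRINK_BITE)
		tmp = SHRINK_BITE;

	return tmp > 0 ? tmp : 0;
}
```

For example, with 100000 page frames in use, 20000 free and a reserve of
1280 pages, the shortfall is 81280 pages, so the next request is clamped to
SHRINK_BITE.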


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
  2009-05-03  0:20                                       ` Rafael J. Wysocki
                                                         ` (4 preceding siblings ...)
  (?)
@ 2009-05-03  0:24                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, alan-jenkins, jens.axboe, linux-pm, kernel-testers,
	torvalds

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it will make
memory allocations to free memory instead of using an artificial
memory shrinking mechanism for that.  Remove the shrinking of
memory from the suspend-to-RAM code, where it is not really
necessary.  Finally, remove the no longer used memory shrinking
functions from mm/vmscan.c.

[rev. 2: Use the existing memory bitmaps for marking preallocated
 image pages and use swsusp_free() for releasing them, introduce
 GFP_IMAGE, add comments describing the memory shrinking strategy.]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c     |   20 ------
 kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
 mm/vmscan.c             |  142 ------------------------------------------------
 3 files changed, 101 insertions(+), 193 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,41 +1066,97 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
+#ifdef CONFIG_HIGHMEM
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
+#else
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+#endif
+
+#define SHRINK_BITE	10000
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
+ * prealloc_pages - preallocate given number of pages and mark their PFNs
+ * @nr_pages: Number of pages to allocate.
  *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Allocate given number of pages and mark their PFNs in the hibernation memory
+ * bitmaps, so that they can be released by swsusp_free().
+ * Return value: The number of normal (ie. non-highmem) pages allocated or
+ * -ENOMEM on failure.
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+static long prealloc_pages(long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	long nr_normal = 0;
+
+	while (nr_pages-- > 0) {
+		struct page *page;
+
+		page = alloc_image_page(GFP_IMAGE);
+		if (!page)
+			return -ENOMEM;
+		if (!PageHighMem(page))
+			nr_normal++;
+	}
+
+	return nr_normal;
 }
 
+/**
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * preallocate memory in SHRINK_BITE chunks in a loop until the following
+ * condition is satisfied:
+ *
+ * [number of preallocated page frames] >=
+ *	(1/2) * ([total number of page frames in use] + PAGES_FOR_IO
+ *		+ SPARE_PAGES - [number of free page frames])
+ *
+ * because in that case, if all of the preallocated page frames are released,
+ * the total number of free page frames will be equal to or greater than the sum
+ * of the total number of page frames in use with PAGES_FOR_IO and SPARE_PAGES,
+ * which is what we need.
+ *
+ * If image_size is set below the number following from the above inequality,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size.
+ */
 int swsusp_shrink_memory(void)
 {
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
+	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
+	int error = 0;
 
 	printk(KERN_INFO "PM: Shrinking memory...  ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
+	for (;;) {
+		struct zone *zone;
+		long size, highmem_size, tmp, ret;
+
+		/*
+		 * Pages preallocated by this loop are not counted as data pages
+		 * by count_data_pages() and count_highmem_pages(), so we only
+		 * need to subtract their numbers once here to verify the
+		 * satisfaction of the stop condition.
+		 */
+		size = count_data_pages() - alloc_normal;
+		tmp = size + PAGES_FOR_IO + SPARE_PAGES;
+		highmem_size = count_highmem_pages() - alloc_highmem;
 		size += highmem_size;
+		/*
+		 * Highmem is treated differently, because we prefer not to
+		 * store copies of normal page frames in it during image
+		 * creation.
+		 */
 		for_each_populated_zone(zone) {
 			tmp += snapshot_additional_pages(zone);
 			if (is_highmem(zone)) {
@@ -1111,27 +1167,39 @@ int swsusp_shrink_memory(void)
 				tmp += zone->lowmem_reserve[ZONE_NORMAL];
 			}
 		}
-
 		if (highmem_size < 0)
 			highmem_size = 0;
-
 		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
+
+		if (tmp <= 0 && size > image_size / PAGE_SIZE)
+			tmp = size - (image_size / PAGE_SIZE);
+
+		if (tmp > SHRINK_BITE)
+			tmp = SHRINK_BITE;
+		else if (tmp <= 0)
+			break;
+
+		ret = prealloc_pages(tmp);
+		if (ret < 0) {
+			error = -ENOMEM;
+			goto out;
 		}
+		alloc_normal += ret;
+		alloc_highmem += tmp - ret;
+		pages += tmp;
+
 		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	}
+
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk("\bdone (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
-	return 0;
+ out:
+	/* Release the preallocated page frames. */
+	swsusp_free();
+
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -188,9 +188,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -226,7 +223,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -241,24 +237,10 @@ static int suspend_prepare(void)
 	if (error)
 		goto Finish;
 
-	if (suspend_freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
+	error = suspend_freeze_processes();
 	if (!error)
 		return 0;
 
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03  0:24                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it frees memory
by making memory allocations instead of using an artificial memory
shrinking mechanism.  Remove the shrinking of memory from the
suspend-to-RAM code, where it is not really necessary.  Finally,
remove the memory shrinking functions from mm/vmscan.c, which are
no longer used.

[rev. 2: Use the existing memory bitmaps for marking preallocated
 image pages and use swsusp_free() for releasing them, introduce
 GFP_IMAGE, add comments describing the memory shrinking strategy.]
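
To make the chunking in the reworked swsusp_shrink_memory() loop easier
to follow, here is a minimal user-space sketch of the per-iteration size
decision, assuming the logic is exactly what the snapshot.c hunk of this
patch shows.  The helper name next_bite() and its parameter names are
hypothetical and exist only for illustration; SHRINK_BITE is the value
used by the patch.

```c
#define SHRINK_BITE	10000	/* chunk size used by the patch */

/*
 * Hypothetical helper (not part of the patch): decide how many pages the
 * loop would request from prealloc_pages() in one iteration.
 *
 * deficit     - pages still missing according to the free-memory estimate
 *               ("tmp" in the patch); may be zero or negative
 * in_use      - page frames currently in use ("size" in the patch)
 * image_pages - requested image size in pages (image_size / PAGE_SIZE)
 *
 * Returns 0 when the loop would stop.
 */
static long next_bite(long deficit, long in_use, long image_pages)
{
	/* Enough free memory, but the image would exceed the requested size. */
	if (deficit <= 0 && in_use > image_pages)
		deficit = in_use - image_pages;

	if (deficit > SHRINK_BITE)
		return SHRINK_BITE;	/* never take more than one bite */
	if (deficit <= 0)
		return 0;		/* nothing left to preallocate */
	return deficit;
}
```

For example, a deficit of 25000 pages would be handled as bites of 10000,
10000 and 5000 pages, provided the estimate does not change between
iterations (in the kernel it is recomputed on every pass).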

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c     |   20 ------
 kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
 mm/vmscan.c             |  142 ------------------------------------------------
 3 files changed, 101 insertions(+), 193 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,41 +1066,97 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
+#ifdef CONFIG_HIGHMEM
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
+#else
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+#endif
+
+#define SHRINK_BITE	10000
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
+ * prealloc_pages - preallocate given number of pages and mark their PFNs
+ * @nr_pages: Number of pages to allocate.
  *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Allocate given number of pages and mark their PFNs in the hibernation memory
+ * bitmaps, so that they can be released by swsusp_free().
+ * Return value: The number of normal (ie. non-highmem) pages allocated or
+ * -ENOMEM on failure.
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+static long prealloc_pages(long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	long nr_normal = 0;
+
+	while (nr_pages-- > 0) {
+		struct page *page;
+
+		page = alloc_image_page(GFP_IMAGE);
+		if (!page)
+			return -ENOMEM;
+		if (!PageHighMem(page))
+			nr_normal++;
+	}
+
+	return nr_normal;
 }
 
+/**
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * preallocate memory in SHRINK_BITE chunks in a loop until the following
+ * condition is satisfied:
+ *
+ * [number of preallocated page frames] >=
+ *	(1/2) * ([total number of page frames in use] + PAGES_FOR_IO
+ *		+ SPARE_PAGES - [number of free page frames])
+ *
+ * because in that case, if all of the preallocated page frames are released,
+ * the total number of free page frames will be equal to or greater than the sum
+ * of the total number of page frames in use with PAGES_FOR_IO and SPARE_PAGES,
+ * which is what we need.
+ *
+ * If image_size is set below the number following from the above inequality,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size.
+ */
 int swsusp_shrink_memory(void)
 {
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
+	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
+	int error = 0;
 
 	printk(KERN_INFO "PM: Shrinking memory...  ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
+	for (;;) {
+		struct zone *zone;
+		long size, highmem_size, tmp, ret;
+
+		/*
+		 * Pages preallocated by this loop are not counted as data pages
+		 * by count_data_pages() and count_highmem_pages(), so we only
+		 * need to subtract their numbers once here to verify the
+		 * satisfaction of the stop condition.
+		 */
+		size = count_data_pages() - alloc_normal;
+		tmp = size + PAGES_FOR_IO + SPARE_PAGES;
+		highmem_size = count_highmem_pages() - alloc_highmem;
 		size += highmem_size;
+		/*
+		 * Highmem is treated differently, because we prefer not to
+		 * store copies of normal page frames in it during image
+		 * creation.
+		 */
 		for_each_populated_zone(zone) {
 			tmp += snapshot_additional_pages(zone);
 			if (is_highmem(zone)) {
@@ -1111,27 +1167,39 @@ int swsusp_shrink_memory(void)
 				tmp += zone->lowmem_reserve[ZONE_NORMAL];
 			}
 		}
-
 		if (highmem_size < 0)
 			highmem_size = 0;
-
 		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
+
+		if (tmp <= 0 && size > image_size / PAGE_SIZE)
+			tmp = size - (image_size / PAGE_SIZE);
+
+		if (tmp > SHRINK_BITE)
+			tmp = SHRINK_BITE;
+		else if (tmp <= 0)
+			break;
+
+		ret = prealloc_pages(tmp);
+		if (ret < 0) {
+			error = -ENOMEM;
+			goto out;
 		}
+		alloc_normal += ret;
+		alloc_highmem += tmp - ret;
+		pages += tmp;
+
 		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	}
+
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk("\bdone (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
-	return 0;
+ out:
+	/* Release the preallocated page frames. */
+	swsusp_free();
+
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -188,9 +188,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -226,7 +223,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -241,24 +237,10 @@ static int suspend_prepare(void)
 	if (error)
 		goto Finish;
 
-	if (suspend_freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
+	error = suspend_freeze_processes();
 	if (!error)
 		return 0;
 
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:


* [PATCH 4/4] PM/Hibernate: Do not release preallocated memory unnecessarily
@ 2009-05-03  0:25                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

From: Rafael J. Wysocki <rjw@sisk.pl>

Since the hibernation code is now going to use allocations of memory
to create enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but doing
so lets us avoid freeing a great number of pages and then allocating
the same pages again later, so it is generally worth doing.
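
As a rough illustration of the accounting this implies, the sketch below
mirrors two checks from the snapshot.c changes in user space: how many
normal pages still have to be allocated once the preallocated ones are
reused, and whether the free-memory test passes when preallocated pages
count as available.  The function names are hypothetical, the highmem
contribution added by count_pages_for_highmem() is ignored, and the
PAGES_FOR_IO value is an assumption (4 MB worth of 4 KiB pages).

```c
/* Assumption: 4 MB worth of 4 KiB pages; the real value is arch dependent. */
#define PAGES_FOR_IO	1024U

/*
 * Hypothetical helper: number of normal pages that still have to be
 * allocated when 'preallocated' pages (alloc_normal in the patch) can be
 * reused as image pages.
 */
static unsigned int pages_still_needed(unsigned int needed,
				       unsigned int preallocated)
{
	return needed > preallocated ? needed - preallocated : 0;
}

/*
 * Hypothetical helper: the free-memory test with preallocated pages
 * counted as available, as in the patched enough_free_mem().  The
 * comparison is strict, as in the patch.
 */
static int enough_normal_mem(unsigned int nr_pages, unsigned int free_normal,
			     unsigned int alloc_normal)
{
	unsigned int available = free_normal + alloc_normal;

	return available > nr_pages + PAGES_FOR_IO;
}
```

With 5000 pages needed and 3000 already preallocated, only 2000 further
allocations remain; if the preallocation covered everything, none do.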

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c     |   15 +++-
 kernel/power/power.h    |    2 
 kernel/power/snapshot.c |  151 +++++++++++++++++++++++-------------------------
 3 files changed, 87 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -783,21 +783,6 @@ void free_basic_memory_bitmaps(void)
 	pr_debug("PM: Basic memory bitmaps freed\n");
 }
 
-/**
- *	snapshot_additional_pages - estimate the number of additional pages
- *	be needed for setting up the suspend image data structures for given
- *	zone (usually the returned value is greater than the exact number)
- */
-
-unsigned int snapshot_additional_pages(struct zone *zone)
-{
-	unsigned int res;
-
-	res = DIV_ROUND_UP(zone->spanned_pages, BM_BITS_PER_BLOCK);
-	res += DIV_ROUND_UP(res * sizeof(struct bm_block), PAGE_SIZE);
-	return 2 * res;
-}
-
 #ifdef CONFIG_HIGHMEM
 /**
  *	count_free_highmem_pages - compute the total number of free highmem
@@ -1033,6 +1018,25 @@ copy_data_pages(struct memory_bitmap *co
 static unsigned int nr_copy_pages;
 /* Number of pages needed for saving the original pfns of the image pages */
 static unsigned int nr_meta_pages;
+/*
+ * Numbers of normal and highmem page frames allocated for hibernation image
+ * before suspending devices.
+ */
+unsigned int alloc_normal, alloc_highmem;
+/*
+ * Memory bitmap used for marking saveable pages (during hibernation) or
+ * hibernation image pages (during restore)
+ */
+static struct memory_bitmap orig_bm;
+/*
+ * Memory bitmap used during hibernation for marking allocated page frames that
+ * will contain copies of saveable pages.  During restore it is initially used
+ * for marking hibernation image pages, but then the set bits from it are
+ * duplicated in @orig_bm and it is released.  On highmem systems it is next
+ * used for marking "safe" highmem pages, but it has to be reinitialized for
+ * this purpose.
+ */
+static struct memory_bitmap copy_bm;
 
 /**
  *	swsusp_free - free pages allocated for the suspend.
@@ -1064,6 +1068,8 @@ void swsusp_free(void)
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
 	buffer = NULL;
+	alloc_normal = 0;
+	alloc_highmem = 0;
 }
 
 /* Helper functions used for the shrinking of memory. */
@@ -1085,7 +1091,7 @@ void swsusp_free(void)
  * Return value: The number of normal (ie. non-highmem) pages allocated or
  * -ENOMEM on failure.
  */
-static long prealloc_pages(long nr_pages)
+static long prealloc_pages(struct memory_bitmap *bm, long nr_pages)
 {
 	long nr_normal = 0;
 
@@ -1095,6 +1101,7 @@ static long prealloc_pages(long nr_pages
 		page = alloc_image_page(GFP_IMAGE);
 		if (!page)
 			return -ENOMEM;
+		memory_bm_set_bit(bm, page_to_pfn(page));
 		if (!PageHighMem(page))
 			nr_normal++;
 	}
@@ -1103,7 +1110,7 @@ static long prealloc_pages(long nr_pages
 }
 
 /**
- * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ * hibernate_preallocate_memory - Preallocate memory for hibernation image
  *
  * To create a hibernation image it is necessary to make a copy of every page
  * frame in use.  We also need a number of page frames to be free during
@@ -1127,17 +1134,29 @@ static long prealloc_pages(long nr_pages
  * the preallocation of memory is continued until the total number of page
  * frames in use is below the requested image size.
  */
-int swsusp_shrink_memory(void)
+int hibernate_preallocate_memory(void)
 {
-	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
+	unsigned long pages = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
-	int error = 0;
+	int error;
 
-	printk(KERN_INFO "PM: Shrinking memory...  ");
+	printk(KERN_INFO "PM: Preallocating image memory ...  ");
 	do_gettimeofday(&start);
 
+	error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	alloc_normal = 0;
+	alloc_highmem = 0;
+	error = -ENOMEM;
+
 	for (;;) {
 		struct zone *zone;
 		long size, highmem_size, tmp, ret;
@@ -1158,7 +1177,6 @@ int swsusp_shrink_memory(void)
 		 * creation.
 		 */
 		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
 			if (is_highmem(zone)) {
 				highmem_size -=
 					zone_page_state(zone, NR_FREE_PAGES);
@@ -1179,11 +1197,9 @@ int swsusp_shrink_memory(void)
 		else if (tmp <= 0)
 			break;
 
-		ret = prealloc_pages(tmp);
-		if (ret < 0) {
-			error = -ENOMEM;
-			goto out;
-		}
+		ret = prealloc_pages(&copy_bm, tmp);
+		if (ret < 0)
+			goto err_out;
 		alloc_normal += ret;
 		alloc_highmem += tmp - ret;
 		pages += tmp;
@@ -1192,13 +1208,13 @@ int swsusp_shrink_memory(void)
 	}
 
 	do_gettimeofday(&stop);
-	printk("\bdone (preallocated %lu free pages)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
+	printk("\bdone (allocated %lu image pages)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Allocated");
 
- out:
-	/* Release the preallocated page frames. */
-	swsusp_free();
+	return 0;
 
+ err_out:
+	swsusp_free();
 	return error;
 }
 
@@ -1210,7 +1226,7 @@ int swsusp_shrink_memory(void)
 
 static unsigned int count_pages_for_highmem(unsigned int nr_highmem)
 {
-	unsigned int free_highmem = count_free_highmem_pages();
+	unsigned int free_highmem = count_free_highmem_pages() + alloc_highmem;
 
 	if (free_highmem >= nr_highmem)
 		nr_highmem = 0;
@@ -1232,19 +1248,17 @@ count_pages_for_highmem(unsigned int nr_
 static int enough_free_mem(unsigned int nr_pages, unsigned int nr_highmem)
 {
 	struct zone *zone;
-	unsigned int free = 0, meta = 0;
+	unsigned int free = alloc_normal;
 
-	for_each_zone(zone) {
-		meta += snapshot_additional_pages(zone);
+	for_each_zone(zone)
 		if (!is_highmem(zone))
 			free += zone_page_state(zone, NR_FREE_PAGES);
-	}
 
 	nr_pages += count_pages_for_highmem(nr_highmem);
-	pr_debug("PM: Normal pages needed: %u + %u + %u, available pages: %u\n",
-		nr_pages, PAGES_FOR_IO, meta, free);
+	pr_debug("PM: Normal pages needed: %u + %u, available pages: %u\n",
+		nr_pages, PAGES_FOR_IO, free);
 
-	return free > nr_pages + PAGES_FOR_IO + meta;
+	return free > nr_pages + PAGES_FOR_IO;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1266,7 +1280,7 @@ static inline int get_highmem_buffer(int
  */
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
 {
 	unsigned int to_alloc = count_free_highmem_pages();
 
@@ -1277,7 +1291,7 @@ alloc_highmem_image_pages(struct memory_
 	while (to_alloc-- > 0) {
 		struct page *page;
 
-		page = alloc_image_page(__GFP_HIGHMEM);
+		page = alloc_image_page(__GFP_HIGHMEM | __GFP_NO_OOM_KILL);
 		memory_bm_set_bit(bm, page_to_pfn(page));
 	}
 	return nr_highmem;
@@ -1286,7 +1300,7 @@ alloc_highmem_image_pages(struct memory_
 static inline int get_highmem_buffer(int safe_needed) { return 0; }
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -1305,51 +1319,36 @@ static int
 swsusp_alloc(struct memory_bitmap *orig_bm, struct memory_bitmap *copy_bm,
 		unsigned int nr_pages, unsigned int nr_highmem)
 {
-	int error;
-
-	error = memory_bm_create(orig_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
-
-	error = memory_bm_create(copy_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
+	int error = 0;
 
 	if (nr_highmem > 0) {
 		error = get_highmem_buffer(PG_ANY);
 		if (error)
-			goto Free;
-
-		nr_pages += alloc_highmem_image_pages(copy_bm, nr_highmem);
+			goto err_out;
+		if (nr_highmem > alloc_highmem) {
+			nr_highmem -= alloc_highmem;
+			nr_pages += alloc_highmem_pages(copy_bm, nr_highmem);
+		}
 	}
-	while (nr_pages-- > 0) {
-		struct page *page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
-
-		if (!page)
-			goto Free;
+	if (nr_pages > alloc_normal) {
+		nr_pages -= alloc_normal;
+		while (nr_pages-- > 0) {
+			struct page *page;
 
-		memory_bm_set_bit(copy_bm, page_to_pfn(page));
+			page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
+			if (!page)
+				goto err_out;
+			memory_bm_set_bit(copy_bm, page_to_pfn(page));
+		}
 	}
+
 	return 0;
 
- Free:
+ err_out:
 	swsusp_free();
-	return -ENOMEM;
+	return error;
 }
 
-/* Memory bitmap used for marking saveable pages (during suspend) or the
- * suspend image pages (during resume)
- */
-static struct memory_bitmap orig_bm;
-/* Memory bitmap used on suspend for marking allocated pages that will contain
- * the copies of saveable pages.  During resume it is initially used for
- * marking the suspend image pages, but then its set bits are duplicated in
- * @orig_bm and it is released.  Next, on systems with high memory, it may be
- * used for marking "safe" highmem pages, but it has to be reinitialized for
- * this purpose.
- */
-static struct memory_bitmap copy_bm;
-
 asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern int swsusp_shrink_memory(void);
+extern int hibernate_preallocate_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -303,8 +303,8 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		return error;
 
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
+	/* Preallocate image memory before shutting down devices. */
+	error = hibernate_preallocate_memory();
 	if (error)
 		goto Close;
 
@@ -320,6 +320,10 @@ int hibernation_snapshot(int platform_mo
 	/* Control returns here after successful restore */
 
  Resume_devices:
+	/* We may need to release the preallocated image pages here. */
+	if (error || !in_suspend)
+		swsusp_free();
+
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
@@ -593,7 +597,10 @@ int hibernate(void)
 		goto Thaw;
 
 	error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
-	if (in_suspend && !error) {
+	if (error)
+		goto Thaw;
+
+	if (in_suspend) {
 		unsigned int flags = 0;
 
 		if (hibernation_mode == HIBERNATION_PLATFORM)
@@ -605,8 +612,8 @@ int hibernate(void)
 			power_down();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
-		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:


* [PATCH 4/4] PM/Hibernate: Do not release preallocated memory unnecessarily
  2009-05-03  0:20                                       ` Rafael J. Wysocki
                                                         ` (6 preceding siblings ...)
  (?)
@ 2009-05-03  0:25                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03  0:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, alan-jenkins, jens.axboe, linux-pm, kernel-testers,
	torvalds

From: Rafael J. Wysocki <rjw@sisk.pl>

Since the hibernation code is now going to use allocations of memory
to create enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but doing
so lets us avoid freeing a great number of pages and then allocating
the same pages again later, so it is generally worth doing.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c     |   15 +++-
 kernel/power/power.h    |    2 
 kernel/power/snapshot.c |  151 +++++++++++++++++++++++-------------------------
 3 files changed, 87 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -783,21 +783,6 @@ void free_basic_memory_bitmaps(void)
 	pr_debug("PM: Basic memory bitmaps freed\n");
 }
 
-/**
- *	snapshot_additional_pages - estimate the number of additional pages
- *	be needed for setting up the suspend image data structures for given
- *	zone (usually the returned value is greater than the exact number)
- */
-
-unsigned int snapshot_additional_pages(struct zone *zone)
-{
-	unsigned int res;
-
-	res = DIV_ROUND_UP(zone->spanned_pages, BM_BITS_PER_BLOCK);
-	res += DIV_ROUND_UP(res * sizeof(struct bm_block), PAGE_SIZE);
-	return 2 * res;
-}
-
 #ifdef CONFIG_HIGHMEM
 /**
  *	count_free_highmem_pages - compute the total number of free highmem
@@ -1033,6 +1018,25 @@ copy_data_pages(struct memory_bitmap *co
 static unsigned int nr_copy_pages;
 /* Number of pages needed for saving the original pfns of the image pages */
 static unsigned int nr_meta_pages;
+/*
+ * Numbers of normal and highmem page frames allocated for hibernation image
+ * before suspending devices.
+ */
+unsigned int alloc_normal, alloc_highmem;
+/*
+ * Memory bitmap used for marking saveable pages (during hibernation) or
+ * hibernation image pages (during restore)
+ */
+static struct memory_bitmap orig_bm;
+/*
+ * Memory bitmap used during hibernation for marking allocated page frames that
+ * will contain copies of saveable pages.  During restore it is initially used
+ * for marking hibernation image pages, but then the set bits from it are
+ * duplicated in @orig_bm and it is released.  On highmem systems it is next
+ * used for marking "safe" highmem pages, but it has to be reinitialized for
+ * this purpose.
+ */
+static struct memory_bitmap copy_bm;
 
 /**
  *	swsusp_free - free pages allocated for the suspend.
@@ -1064,6 +1068,8 @@ void swsusp_free(void)
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
 	buffer = NULL;
+	alloc_normal = 0;
+	alloc_highmem = 0;
 }
 
 /* Helper functions used for the shrinking of memory. */
@@ -1085,7 +1091,7 @@ void swsusp_free(void)
  * Return value: The number of normal (ie. non-highmem) pages allocated or
  * -ENOMEM on failure.
  */
-static long prealloc_pages(long nr_pages)
+static long prealloc_pages(struct memory_bitmap *bm, long nr_pages)
 {
 	long nr_normal = 0;
 
@@ -1095,6 +1101,7 @@ static long prealloc_pages(long nr_pages
 		page = alloc_image_page(GFP_IMAGE);
 		if (!page)
 			return -ENOMEM;
+		memory_bm_set_bit(bm, page_to_pfn(page));
 		if (!PageHighMem(page))
 			nr_normal++;
 	}
@@ -1103,7 +1110,7 @@ static long prealloc_pages(long nr_pages
 }
 
 /**
- * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ * hibernate_preallocate_memory - Preallocate memory for hibernation image
  *
  * To create a hibernation image it is necessary to make a copy of every page
  * frame in use.  We also need a number of page frames to be free during
@@ -1127,17 +1134,29 @@ static long prealloc_pages(long nr_pages
  * the preallocation of memory is continued until the total number of page
  * frames in use is below the requested image size.
  */
-int swsusp_shrink_memory(void)
+int hibernate_preallocate_memory(void)
 {
-	unsigned long pages = 0, alloc_normal = 0, alloc_highmem = 0;
+	unsigned long pages = 0;
 	unsigned int i = 0;
 	char *p = "-\\|/";
 	struct timeval start, stop;
-	int error = 0;
+	int error;
 
-	printk(KERN_INFO "PM: Shrinking memory...  ");
+	printk(KERN_INFO "PM: Preallocating image memory ...  ");
 	do_gettimeofday(&start);
 
+	error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	alloc_normal = 0;
+	alloc_highmem = 0;
+	error = -ENOMEM;
+
 	for (;;) {
 		struct zone *zone;
 		long size, highmem_size, tmp, ret;
@@ -1158,7 +1177,6 @@ int swsusp_shrink_memory(void)
 		 * creation.
 		 */
 		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
 			if (is_highmem(zone)) {
 				highmem_size -=
 					zone_page_state(zone, NR_FREE_PAGES);
@@ -1179,11 +1197,9 @@ int swsusp_shrink_memory(void)
 		else if (tmp <= 0)
 			break;
 
-		ret = prealloc_pages(tmp);
-		if (ret < 0) {
-			error = -ENOMEM;
-			goto out;
-		}
+		ret = prealloc_pages(&copy_bm, tmp);
+		if (ret < 0)
+			goto err_out;
 		alloc_normal += ret;
 		alloc_highmem += tmp - ret;
 		pages += tmp;
@@ -1192,13 +1208,13 @@ int swsusp_shrink_memory(void)
 	}
 
 	do_gettimeofday(&stop);
-	printk("\bdone (preallocated %lu free pages)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
+	printk("\bdone (allocated %lu image pages)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Allocated");
 
- out:
-	/* Release the preallocated page frames. */
-	swsusp_free();
+	return 0;
 
+ err_out:
+	swsusp_free();
 	return error;
 }
 
@@ -1210,7 +1226,7 @@ int swsusp_shrink_memory(void)
 
 static unsigned int count_pages_for_highmem(unsigned int nr_highmem)
 {
-	unsigned int free_highmem = count_free_highmem_pages();
+	unsigned int free_highmem = count_free_highmem_pages() + alloc_highmem;
 
 	if (free_highmem >= nr_highmem)
 		nr_highmem = 0;
@@ -1232,19 +1248,17 @@ count_pages_for_highmem(unsigned int nr_
 static int enough_free_mem(unsigned int nr_pages, unsigned int nr_highmem)
 {
 	struct zone *zone;
-	unsigned int free = 0, meta = 0;
+	unsigned int free = alloc_normal;
 
-	for_each_zone(zone) {
-		meta += snapshot_additional_pages(zone);
+	for_each_zone(zone)
 		if (!is_highmem(zone))
 			free += zone_page_state(zone, NR_FREE_PAGES);
-	}
 
 	nr_pages += count_pages_for_highmem(nr_highmem);
-	pr_debug("PM: Normal pages needed: %u + %u + %u, available pages: %u\n",
-		nr_pages, PAGES_FOR_IO, meta, free);
+	pr_debug("PM: Normal pages needed: %u + %u, available pages: %u\n",
+		nr_pages, PAGES_FOR_IO, free);
 
-	return free > nr_pages + PAGES_FOR_IO + meta;
+	return free > nr_pages + PAGES_FOR_IO;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1266,7 +1280,7 @@ static inline int get_highmem_buffer(int
  */
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
 {
 	unsigned int to_alloc = count_free_highmem_pages();
 
@@ -1277,7 +1291,7 @@ alloc_highmem_image_pages(struct memory_
 	while (to_alloc-- > 0) {
 		struct page *page;
 
-		page = alloc_image_page(__GFP_HIGHMEM);
+		page = alloc_image_page(__GFP_HIGHMEM | __GFP_NO_OOM_KILL);
 		memory_bm_set_bit(bm, page_to_pfn(page));
 	}
 	return nr_highmem;
@@ -1286,7 +1300,7 @@ alloc_highmem_image_pages(struct memory_
 static inline int get_highmem_buffer(int safe_needed) { return 0; }
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -1305,51 +1319,36 @@ static int
 swsusp_alloc(struct memory_bitmap *orig_bm, struct memory_bitmap *copy_bm,
 		unsigned int nr_pages, unsigned int nr_highmem)
 {
-	int error;
-
-	error = memory_bm_create(orig_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
-
-	error = memory_bm_create(copy_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
+	int error = 0;
 
 	if (nr_highmem > 0) {
 		error = get_highmem_buffer(PG_ANY);
 		if (error)
-			goto Free;
-
-		nr_pages += alloc_highmem_image_pages(copy_bm, nr_highmem);
+			goto err_out;
+		if (nr_highmem > alloc_highmem) {
+			nr_highmem -= alloc_highmem;
+			nr_pages += alloc_highmem_pages(copy_bm, nr_highmem);
+		}
 	}
-	while (nr_pages-- > 0) {
-		struct page *page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
-
-		if (!page)
-			goto Free;
+	if (nr_pages > alloc_normal) {
+		nr_pages -= alloc_normal;
+		while (nr_pages-- > 0) {
+			struct page *page;
 
-		memory_bm_set_bit(copy_bm, page_to_pfn(page));
+			page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
+			if (!page)
+				goto err_out;
+			memory_bm_set_bit(copy_bm, page_to_pfn(page));
+		}
 	}
+
 	return 0;
 
- Free:
+ err_out:
 	swsusp_free();
-	return -ENOMEM;
+	return error;
 }
 
-/* Memory bitmap used for marking saveable pages (during suspend) or the
- * suspend image pages (during resume)
- */
-static struct memory_bitmap orig_bm;
-/* Memory bitmap used on suspend for marking allocated pages that will contain
- * the copies of saveable pages.  During resume it is initially used for
- * marking the suspend image pages, but then its set bits are duplicated in
- * @orig_bm and it is released.  Next, on systems with high memory, it may be
- * used for marking "safe" highmem pages, but it has to be reinitialized for
- * this purpose.
- */
-static struct memory_bitmap copy_bm;
-
 asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern int swsusp_shrink_memory(void);
+extern int hibernate_preallocate_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -303,8 +303,8 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		return error;
 
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
+	/* Preallocate image memory before shutting down devices. */
+	error = hibernate_preallocate_memory();
 	if (error)
 		goto Close;
 
@@ -320,6 +320,10 @@ int hibernation_snapshot(int platform_mo
 	/* Control returns here after successful restore */
 
  Resume_devices:
+	/* We may need to release the preallocated image pages here. */
+	if (error || !in_suspend)
+		swsusp_free();
+
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
@@ -593,7 +597,10 @@ int hibernate(void)
 		goto Thaw;
 
 	error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
-	if (in_suspend && !error) {
+	if (error)
+		goto Thaw;
+
+	if (in_suspend) {
 		unsigned int flags = 0;
 
 		if (hibernation_mode == HIBERNATION_PLATFORM)
@@ -605,8 +612,8 @@ int hibernate(void)
 			power_down();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
-		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:
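
The accounting change in swsusp_alloc() above can be read as "allocate only the shortfall": pages preallocated before the devices are suspended are counted first, and atomic allocations are made only for the remainder. A minimal userspace sketch of that arithmetic follows. It is not kernel code, and shortfall() is a hypothetical helper name that does not appear in the patch.

```c
#include <assert.h>

/*
 * Illustrative model only. Mirrors the "if (nr_pages > alloc_normal)"
 * and "if (nr_highmem > alloc_highmem)" checks added to swsusp_alloc():
 * returns how many pages still have to be allocated once the
 * preallocated ones have been taken into account.
 */
static unsigned int shortfall(unsigned int needed, unsigned int prealloc)
{
	unsigned int extra = needed > prealloc ? needed - prealloc : 0;

	/* The remainder can never exceed the original request. */
	assert(extra <= needed);
	return extra;
}
```

With example numbers (not taken from any real system), shortfall(3500, 3000) leaves 500 pages to allocate atomically, while shortfall(2000, 3000) leaves none, which is the case the patch is meant to make common.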

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03  3:06                                           ` Linus Torvalds
  0 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-05-03  3:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm



On Sun, 3 May 2009, Rafael J. Wysocki wrote:
>
> Remove the shrinking of memory from the suspend-to-RAM code, where it is 
> not really necessary.

Hmm. Shouldn't we do this _regardless_?

IOW, shouldn't this be a totally separate patch? It seems to be left-over 
from when we shared the same code-paths, and before the split of the STR 
and hibernate code?

IOW, shouldn't the very _first_ patch just be this part? That code doesn't 
make any sense anyway (that FREE_PAGE_NUMBER really _is_ totally 
arbitrary).

This part seems to be totally independent of all the other parts in your 
patch-series. No?

		Linus

---
 kernel/power/main.c |   19 +------------------
 1 files changed, 1 insertions(+), 18 deletions(-)

diff --git a/kernel/power/main.c b/kernel/power/main.c
index f99ed6a..e3197e9 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -188,9 +188,6 @@ static void suspend_test_finish(const char *label)
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -241,24 +238,10 @@ static int suspend_prepare(void)
 	if (error)
 		goto Finish;
 
-	if (suspend_freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
+	error = suspend_freeze_processes();
 	if (!error)
 		return 0;
 
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:
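
One side effect of the hunk above that is easy to miss: the old code reported any freezer failure as -EAGAIN, while the new code passes the return value of suspend_freeze_processes() through unchanged. A minimal userspace model of the two flows is sketched below. The function names prepare_old() and prepare_new() and the still_short parameter are hypothetical and only stand in for the logic visible in the diff.

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of the removed flow. freeze_ret stands in for the return value
 * of suspend_freeze_processes(); still_short is nonzero when fewer than
 * FREE_PAGE_NUMBER pages were free even after shrinking.
 */
static int prepare_old(int freeze_ret, int still_short)
{
	assert(freeze_ret <= 0);	/* 0 or a negative errno value */
	if (freeze_ret)
		return -EAGAIN;		/* original error code is lost */
	if (still_short)
		return -ENOMEM;
	return 0;
}

/* Model of the new flow: the freezer's own result is returned as-is. */
static int prepare_new(int freeze_ret)
{
	assert(freeze_ret <= 0);
	return freeze_ret;
}
```

In this model prepare_old(-EBUSY, 0) yields -EAGAIN, whereas prepare_new(-EBUSY) yields -EBUSY, so callers that test for a specific error value would see a difference.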

^ permalink raw reply related	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03  9:36                                             ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-03  9:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Andrew Morton, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

Hi!

> > Remove the shrinking of memory from the suspend-to-RAM code, where it is 
> > not really necessary.
> 
> Hmm. Shouldn't we do this _regardless_?
> 
> IOW, shouldn't this be a totally separate patch? It seems to be left-over 
> from when we shared the same code-paths, and before the split of the STR 
> and hibernate code?
> 
> IOW, shouldn't the very _first_ patch just be this part? That code doesn't 
> make any sense anyway (that FREE_PAGE_NUMBER really _is_ totally 
> arbitrary).
> 
> This part seems to be totally independent of all the other parts in your 
> patch-series. No?

I'm not sure this one is a good idea: drivers will need to allocate
memory during suspend/resume, and when processes are frozen/disk
driver is suspended, normal memory management will no longer work.

So, freeing 4M of memory before starting suspend seems like a good
idea. That way those small allocations will not fail.
								Pavel
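
For scale, the two thresholds discussed in this thread can be converted between pages and bytes. The sketch below assumes a 4 KiB page size, which is an assumption and not something stated in the thread (the real PAGE_SIZE is architecture-dependent); the helper names are hypothetical.

```c
#include <assert.h>

/* Assumed page size: 4 KiB. The real PAGE_SIZE depends on the architecture. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Hypothetical helper: number of bytes covered by a given number of pages. */
static unsigned long pages_to_bytes(unsigned long pages)
{
	return pages * ASSUMED_PAGE_SIZE;
}

/* Hypothetical helper: pages needed to cover a byte count, rounded up. */
static unsigned long bytes_to_pages(unsigned long bytes)
{
	assert(ASSUMED_PAGE_SIZE > 0);
	return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

Under that assumption, pages_to_bytes(100) gives 409600 bytes (400 KiB) for FREE_PAGE_NUMBER, and bytes_to_pages(4UL * 1024 * 1024) gives 1024 pages for the 4MB figure quoted earlier in the thread for PAGES_FOR_IO.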


> @@ -188,9 +188,6 @@ static void suspend_test_finish(const char *label)
>  
>  #endif
>  
> -/* This is just an arbitrary number */
> -#define FREE_PAGE_NUMBER (100)
> -
>  static struct platform_suspend_ops *suspend_ops;
>  
>  /**
> @@ -241,24 +238,10 @@ static int suspend_prepare(void)
>  	if (error)
>  		goto Finish;
>  
> -	if (suspend_freeze_processes()) {
> -		error = -EAGAIN;
> -		goto Thaw;
> -	}
> -
> -	free_pages = global_page_state(NR_FREE_PAGES);
> -	if (free_pages < FREE_PAGE_NUMBER) {
> -		pr_debug("PM: free some memory\n");
> -		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
> -		if (nr_free_pages() < FREE_PAGE_NUMBER) {
> -			error = -ENOMEM;
> -			printk(KERN_ERR "PM: No enough memory\n");
> -		}
> -	}
> +	error = suspend_freeze_processes();
>  	if (!error)
>  		return 0;
>  
> - Thaw:
>  	suspend_thaw_processes();
>  	usermodehelper_enable();
>   Finish:

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/3] PM: Disable OOM killer during system-wide power transitions
@ 2009-05-03  9:47                                     ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-03  9:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, torvalds, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm


> > Does the same apply to suspending?  If so, why?
> 
> Because I think it doesn't work anyway.  User space processes are frozen and
> effectively in TASK_UNINTERRUPTIBLE, so they won't be killed.
> 
> > I think this is an OK change, as long as the only thing which is
> > allocating memory is hibernation itself.  If random processes are still
> > doing random memory allocations at this time then their failed memory
> > allocation could be just as fatal as an oom-killing.  Moreso if they're
> > /sbin/init or whatever.
> 
> At this point all of the user space tasks are frozen.

Well, many kernel threads still remain active. But it is true that OOM
killer is unlikely to help.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03 11:51                                           ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-03 11:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sun, May 03, 2009 at 02:24:20AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Modify the hibernation memory shrinking code so that it will make
> memory allocations to free memory instead of using an artificial
> memory shrinking mechanism for that.  Remove the shrinking of
> memory from the suspend-to-RAM code, where it is not really
> necessary.  Finally, remove the no longer used memory shrinking
> functions from mm/vmscan.c .
> 
> [rev. 2: Use the existing memory bitmaps for marking preallocated
>  image pages and use swsusp_free() for releasing them, introduce
>  GFP_IMAGE, add comments describing the memory shrinking strategy.]
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  kernel/power/main.c     |   20 ------
>  kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
>  mm/vmscan.c             |  142 ------------------------------------------------
>  3 files changed, 101 insertions(+), 193 deletions(-)
> 
> Index: linux-2.6/kernel/power/snapshot.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/snapshot.c
> +++ linux-2.6/kernel/power/snapshot.c
> @@ -1066,41 +1066,97 @@ void swsusp_free(void)
>  	buffer = NULL;
>  }
>  
> +/* Helper functions used for the shrinking of memory. */
> +
> +#ifdef CONFIG_HIGHMEM
> +#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
> +#else
> +#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
> +#endif

The CONFIG_HIGHMEM test is not necessary: __GFP_HIGHMEM is always defined.
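
A minimal userspace sketch of the simplification suggested above: one
GFP_IMAGE definition with no CONFIG_HIGHMEM test. The values given to
GFP_KERNEL and __GFP_HIGHMEM are stand-ins assumed for illustration only;
__GFP_NO_OOM_KILL uses the value proposed in patch 1/4. The helper
gfp_image_has() is hypothetical and is not part of the patch.

```c
/* Userspace sketch only: these are stand-ins, not the kernel headers. */
typedef unsigned int gfp_t;

#define GFP_KERNEL        ((gfp_t)0xd0u)     /* assumed stand-in value */
#define __GFP_HIGHMEM     ((gfp_t)0x02u)     /* assumed stand-in value */
#define __GFP_NO_OOM_KILL ((gfp_t)0x200000u) /* value proposed in patch 1/4 */

/* Single definition: no #ifdef CONFIG_HIGHMEM around it. */
#define GFP_IMAGE (GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)

/* Hypothetical helper: returns 1 if every bit of 'flag' is set in GFP_IMAGE. */
int gfp_image_has(gfp_t flag)
{
	return (GFP_IMAGE & flag) == flag;
}
```

Since __GFP_HIGHMEM is always defined, the same line compiles whether or
not CONFIG_HIGHMEM is set, which is all the comment above relies on.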

> +#define SHRINK_BITE	10000

This is ~40MB. A full scan of (for example) 8G worth of pages will be
time consuming, not to mention we have to do it 2*(8G-500M)/40M = 384 times!

Can we make it a LONG_MAX?
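
The estimate above can be reproduced with a few lines of integer
arithmetic. This is only a sketch: it assumes 4 KiB pages, reads "8G" as
8192M, and rounds the bite to 40M as the estimate does. The function
names are made up for illustration.

```c
#define SHRINK_BITE 10000UL /* pages per step, from the patch */
#define PAGE_KIB    4UL     /* assumption: 4 KiB pages */

/* Size of one bite in KiB: 10000 pages * 4 KiB. */
unsigned long bite_kib(void)
{
	return SHRINK_BITE * PAGE_KIB;
}

/*
 * Number of passes as estimated above: 2 * (total - kept) / bite,
 * with all three arguments in megabytes (integer division).
 */
unsigned long est_passes(unsigned long total_mb, unsigned long kept_mb,
			 unsigned long bite_mb)
{
	return 2 * (total_mb - kept_mb) / bite_mb;
}
```

bite_kib() gives 40000 KiB, i.e. roughly 40MB, and
est_passes(8192, 500, 40) gives 384.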

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/4] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-03 11:54                                           ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-03 11:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sun, May 03, 2009 at 02:22:06AM +0200, Rafael J. Wysocki wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> > > Remind me: why can't we just allocate N pages at suspend-time?
> > 
> > We need half of memory free. The reason we can't "just allocate" is
> > probably OOM killer; but my memories are quite weak :-(.
> 
> hm.  You'd think that with our splendid range of __GFP_foo flags, there
> would be some combo which would suit this requirement but I can't
> immediately spot one.
> 
> We can always add another I guess.  Something like...
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  include/linux/gfp.h |    3 ++-
>  mm/page_alloc.c     |    3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -1620,7 +1620,8 @@ nofail_alloc:
>  		}
>  
>  		/* The OOM killer will not help higher order allocs so fail */
> -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> +				(gfp_mask & __GFP_NO_OOM_KILL)) {
>  			clear_zonelist_oom(zonelist, gfp_mask);
>  			goto nopage;
>  		}
> Index: linux-2.6/include/linux/gfp.h
> ===================================================================
> --- linux-2.6.orig/include/linux/gfp.h
> +++ linux-2.6/include/linux/gfp.h
> @@ -51,8 +51,9 @@ struct vm_area_struct;
>  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
>  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
> +#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
>  
> -#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
> +#define __GFP_BITS_SHIFT 22	/* Number of__GFP_FOO bits */
                                            ^ missed a white space :)
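
As a side note on why the shift goes from 21 to 22: the new flag
occupies bit 21, so a 21-bit mask would cut it off. The sketch below
checks that in userspace. The two helpers are hypothetical and exist
only for this illustration; the flag value is the one from the patch.

```c
typedef unsigned int gfp_t;

#define __GFP_NO_OOM_KILL ((gfp_t)0x200000u) /* from the patch above */

/* Index of the highest set bit, or -1 if no bit is set. */
int highest_bit(gfp_t flag)
{
	int n = -1;

	while (flag) {
		flag >>= 1;
		n++;
	}
	return n;
}

/* Returns 1 if 'flag' lies entirely within the low 'shift' bits. */
int fits_in_shift(gfp_t flag, int shift)
{
	gfp_t mask = (gfp_t)((1u << shift) - 1);

	return (flag & ~mask) == 0;
}
```

highest_bit(__GFP_NO_OOM_KILL) is 21, so the flag fits with a shift of 22
but not with a shift of 21.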

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
@ 2009-05-03 13:08                                         ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-03 13:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm


Hi Rafael,

I happened to be doing some benchmarks on the older shrink_all_memory().
Hopefully they can be a useful reference point for the new design.

The current swsusp_shrink_memory()/shrink_all_memory() are terribly
inefficient: it takes 7-9s to free up 1.4G memory:

[  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)
[  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)
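
For reference, the MB/s figures in these two lines can be reproduced
with truncating integer arithmetic. This is a sketch that matches the
logged numbers, not a copy of the kernel code; the function names are
made up, 4 KiB pages are assumed, and the elapsed time is passed in
centiseconds (7.03 s = 703).

```c
/* Pages to kilobytes, assuming 4 KiB pages. */
unsigned long long pages_to_kbytes(unsigned long long pages)
{
	return pages * 4;
}

/* Throughput in kB/s; elapsed time is given in centiseconds. */
unsigned long long kbytes_per_sec(unsigned long long kbytes,
				  unsigned long long centisecs)
{
	return kbytes * 100 / centisecs;
}

/* Whole-MB part of the "MB.xx" figure (1 MB taken as 1000 kB). */
unsigned long long speed_mb_whole(unsigned long long kps)
{
	return kps / 1000;
}

/* Two-digit fractional part of the "MB.xx" figure, truncated. */
unsigned long long speed_mb_frac(unsigned long long kps)
{
	return (kps % 1000) / 10;
}
```

With 1413380 kbytes in 703 centiseconds this gives 201049 kB/s, printed
as 201.04; with 1490116 kbytes in 937 centiseconds it gives 159030 kB/s,
printed as 159.03.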

Below are the logs I collected by injecting printks. There are
basically two major problems:
- swsusp_shrink_memory() scans the whole 2G memory again and again;
- shrink_all_memory() is slow. It won't reclaim pages at all with
  small priority values, because its batch size is 10000 pages.

I wonder if it's possible to free up the memory within 1s at all.
(Maybe the slowness is due to too many enabled debugging options...)

Thanks,
Fengguang
---

vanilla 2.6.30-rc2-next-20090417:

[  124.516187] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  124.523087] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  124.530060] PM: Basic memory bitmaps created
[  124.534421] PM: Syncing filesystems ... done.
[  124.842282] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  124.849800] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  124.857571] PM: Shrinking memory...  tmp=471584, size=491906, highmem_size=0
[  124.939103] shrink_all_memory: pages=10000
[  125.019543] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.027636]-tmp=451770, size=481986, highmem_size=0
[  125.107571] shrink_all_memory: pages=10000
[  125.139928] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=8500
[  125.280940] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=1500, reclaimed=1500
[  125.547990] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.556135]\tmp=411598, size=461898, highmem_size=0
[  125.637414] shrink_all_memory: pages=10000
[  125.716890] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=10000
[  125.725092]|tmp=391507, size=451854, highmem_size=0
[  125.806935] shrink_all_memory: pages=10000
[  125.886317] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.894531]/tmp=371481, size=441841, highmem_size=0
[  125.976823] shrink_all_memory: pages=10000
[  126.104367] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  126.112572]-tmp=351715, size=431952, highmem_size=0
[  126.195178] shrink_all_memory: pages=10000
[  126.274586] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.282698]\tmp=331949, size=422063, highmem_size=0
[  126.365743] shrink_all_memory: pages=10000
[  126.445851] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.453968]|tmp=311858, size=412019, highmem_size=0
[  126.537417] shrink_all_memory: pages=10000
[  126.616980] shrink_all_zones: pass=0, prio=9, lru=Normal.2, pages=10000, reclaimed=10000
[  126.625180]/tmp=291751, size=401975, highmem_size=0
[  126.709066] shrink_all_memory: pages=10000
[  126.788665] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.796833]-tmp=271725, size=391962, highmem_size=0
[  126.880997] shrink_all_memory: pages=10000
[  127.008443] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.016667]\tmp=251716, size=381949, highmem_size=0
[  127.101581] shrink_all_memory: pages=10000
[  127.181588] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.189728]|tmp=231673, size=371936, highmem_size=0
[  127.275105] shrink_all_memory: pages=10000
[  127.354799] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.363003]/tmp=211599, size=361892, highmem_size=0
[  127.448750] shrink_all_memory: pages=10000
[  127.528252] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.536385]-tmp=191621, size=351910, highmem_size=0
[  127.622369] shrink_all_memory: pages=10000
[  127.750093] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.758295]\tmp=171539, size=341866, highmem_size=0
[  127.844867] shrink_all_memory: pages=10000
[  127.925614] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.933758]|tmp=151465, size=331822, highmem_size=0
[  128.020878] shrink_all_memory: pages=10000
[  128.100580] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  128.108803]/tmp=131391, size=321778, highmem_size=0
[  128.196312] shrink_all_memory: pages=10000
[  128.275643] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  128.283769]-tmp=111413, size=311796, highmem_size=0
[  128.371814] shrink_all_memory: pages=10000
[  128.501803] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  128.510007]\tmp=91339, size=301752, highmem_size=0
[  128.597726] shrink_all_memory: pages=10000
[  128.677138] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  128.685277]|tmp=71296, size=291739, highmem_size=0
[  128.774061] shrink_all_memory: pages=10000
[  128.855940] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  128.864145]/tmp=51259, size=281726, highmem_size=0
[  128.953486] shrink_all_memory: pages=10000
[  129.033417] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.041553]-tmp=31172, size=271682, highmem_size=0
[  129.131233] shrink_all_memory: pages=10000
[  129.210693] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  129.218994]\tmp=11146, size=261669, highmem_size=0
[  129.309142] shrink_all_memory: pages=10000
[  129.388523] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.396648]|tmp=-8880, size=251656, highmem_size=0
[  129.487193] shrink_all_memory: pages=10000
[  129.614831] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  129.623059]/tmp=-28954, size=241612, highmem_size=0
[  129.714055] shrink_all_memory: pages=10000
[  129.794104] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.802246]-tmp=-48932, size=231630, highmem_size=0
[  129.893893] shrink_all_memory: pages=10000
[  129.973667] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  129.981892]\tmp=-69020, size=221586, highmem_size=0
[  130.073916] shrink_all_memory: pages=10000
[  130.154620] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=10000
[  130.162853]|tmp=-89156, size=211511, highmem_size=0
[  130.255274] shrink_all_memory: pages=10000
[  130.334612] shrink_all_zones: pass=0, prio=8, lru=DMA32.2, pages=10000, reclaimed=10000
[  130.342750]/tmp=-109182, size=201498, highmem_size=0
[  130.435551] shrink_all_memory: pages=10000
[  130.515074] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=10000
[  130.523305]-tmp=-129273, size=191454, highmem_size=0
[  130.616714] shrink_all_memory: pages=10000
[  130.696350] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  130.704490]\tmp=-149299, size=181441, highmem_size=0
[  130.798322] shrink_all_memory: pages=10000
[  130.877834] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  130.886038]|tmp=-169325, size=171428, highmem_size=0
[  130.980312] shrink_all_memory: pages=10000
[  131.107844] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=10000
[  131.115982]/tmp=-189351, size=161415, highmem_size=0
[  131.210530] shrink_all_memory: pages=10000
[  131.291223] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=10000
[  131.299433]-tmp=-209459, size=151371, highmem_size=0
[  131.394488] shrink_all_memory: pages=10000
[  131.474123] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=10000
[  131.482344]\tmp=-229420, size=141389, highmem_size=0
[  131.577910] shrink_all_memory: pages=10000
[  131.657376] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  131.665498]|tmp=-249511, size=131345, highmem_size=0
[  131.761676] shrink_all_memory: pages=3345
[  131.791048] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=3345, reclaimed=3345
[  131.799085]/tmp=-256256, size=127966, highmem_size=0
[  131.895290]done (353345 pages freed)
[  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)


1/30 memory being mapped, vanilla 2.6.30-rc2-next-20090417:

AnonPages:         38684 kB
Mapped:            66940 kB

[  722.944082] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  722.956215] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  722.963053] PM: Basic memory bitmaps created
[  722.967365] PM: Syncing filesystems ... done.
[  723.361274] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  723.369310] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  723.377342] PM: Shrinking memory...  tmp=508165, size=510179, highmem_size=0
[  723.563602] shrink_all_memory: pages=10000
[  723.648921] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9766
[  723.733064] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19581
[  723.741225]-tmp=468972, size=490587, highmem_size=0
[  723.821406] shrink_all_memory: pages=10000
[  723.902912] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9804
[  723.987433] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19617
[  723.995565]\tmp=429714, size=470964, highmem_size=0
[  724.077458] shrink_all_memory: pages=10000
[  724.160394] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9808
[  724.261056] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19610
[  724.269482]|tmp=390489, size=451341, highmem_size=0
[  724.353672] shrink_all_memory: pages=10000
[  724.556153] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9806
[  724.669591] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19636
[  724.677770]/tmp=351365, size=431780, highmem_size=0
[  724.762188] shrink_all_memory: pages=10000
[  724.923372] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9805
[  725.037897] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19620
[  725.046501]-tmp=312189, size=412188, highmem_size=0
[  725.133452] shrink_all_memory: pages=10000
[  725.371199] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9781
[  725.519983] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19585
[  725.528233]\tmp=273061, size=392627, highmem_size=0
[  725.616020] shrink_all_memory: pages=10000
[  725.801211] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  725.954523] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19617
[  725.962685]|tmp=233885, size=373035, highmem_size=0
[  726.051775] shrink_all_memory: pages=10000
[  726.296589] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  726.449150] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19682
[  726.457342]/tmp=194507, size=353350, highmem_size=0
[  726.548053] shrink_all_memory: pages=10000
[  726.759180] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9803
[  726.940475] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19561
[  726.948638]-tmp=155396, size=333789, highmem_size=0
[  727.040362] shrink_all_memory: pages=10000
[  727.257478] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.442356] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19544
[  727.450548]\tmp=116319, size=314259, highmem_size=0
[  727.543609] shrink_all_memory: pages=10000
[  727.755346] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.894707] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19555
[  727.902910]|tmp=77256, size=294729, highmem_size=0
[  727.997018] shrink_all_memory: pages=10000
[  728.170973] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9799
[  728.332426] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19545
[  728.341152]/tmp=38210, size=275199, highmem_size=0
[  728.437625] shrink_all_memory: pages=10000
[  728.673862] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9773
[  728.812572] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19517
[  728.820738]-tmp=-852, size=255669, highmem_size=0
[  728.917360] shrink_all_memory: pages=10000
[  729.110178] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9839
[  729.266243] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19610
[  729.274407]\tmp=-40045, size=236077, highmem_size=0
[  729.372371] shrink_all_memory: pages=10000
[  729.553743] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=9741
[  729.673174] shrink_all_zones: pass=0, prio=3, lru=DMA.2, pages=259, reclaimed=256
[  729.681224] shrink_all_zones: pass=0, prio=3, lru=DMA32.0, pages=259, reclaimed=256
[  729.693997] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=259, reclaimed=513
[  730.006423] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=9487, reclaimed=9296
[  730.177563] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=9487, reclaimed=18626
[  730.185640]|tmp=-98138, size=207022, highmem_size=0
[  730.285280] shrink_all_memory: pages=10000
[  730.484499] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9807
[  730.637792] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19613
[  730.645975]/tmp=-137343, size=187430, highmem_size=0
[  730.746709] shrink_all_memory: pages=10000
[  730.754374] shrink_all_zones: pass=0, prio=5, lru=Normal.0, pages=10000, reclaimed=0
[  731.101101] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9777
[  731.257243] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19582
[  731.265411]-tmp=-176567, size=167807, highmem_size=0
[  731.367111] shrink_all_memory: pages=10000
[  731.567779] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9811
[  731.803019] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19615
[  731.811189]\tmp=-215837, size=148184, highmem_size=0
[  731.913738] shrink_all_memory: pages=10000
[  732.123893] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=10000, reclaimed=9808
[  732.312075] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=10000, reclaimed=19580
[  732.320234]|tmp=-254948, size=128623, highmem_size=0
[  732.423776] shrink_all_memory: pages=623
[  732.432862] shrink_all_zones: pass=0, prio=12, lru=DMA.2, pages=623, reclaimed=617
[  732.441782] shrink_all_zones: pass=0, prio=12, lru=Normal.0, pages=623, reclaimed=617
[  732.453341] shrink_all_zones: pass=0, prio=11, lru=DMA.0, pages=6, reclaimed=0
[  732.460712] shrink_all_zones: pass=0, prio=11, lru=DMA32.0, pages=6, reclaimed=0
[  732.468390] shrink_all_zones: pass=0, prio=11, lru=DMA32.2, pages=6, reclaimed=6
[  732.488091] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=623, reclaimed=617
[  732.508256] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=623, reclaimed=1233
[  732.516202]/tmp=-258774, size=126704, highmem_size=0
[  732.753869]done (372529 pages freed)
[  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)



* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
From: Wu Fengguang @ 2009-05-03 13:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, alan-jenkins, jens.axboe, Andrew Morton,
	kernel-testers, torvalds, linux-pm


Hi Rafael,

I happened to be doing some benchmarks on the older shrink_all_memory().
Hopefully they can be a useful reference point for the new design.

The current swsusp_shrink_memory()/shrink_all_memory() are terribly
inefficient: it takes 7-9s to free up 1.4G of memory:

[  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)
[  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)
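
The two summary lines can be cross-checked against the page counts in
the logs ("done (353345 pages freed)" and "done (372529 pages freed)").
A minimal userspace sketch, assuming 4 KiB pages (not stated in the log,
but consistent with its numbers) and truncating integer arithmetic for
the MB/s figure. The helper names are hypothetical and this is not the
kernel's own reporting code:

```c
#include <stdio.h>

/* Assumed page size in kB (4 KiB pages); the log does not state it. */
#define PAGE_KB 4UL

/* Convert a page count, as in "done (N pages freed)", to kilobytes. */
static unsigned long pages_to_kb(unsigned long nr_pages)
{
	return nr_pages * PAGE_KB;
}

/* Throughput in kB/s from kilobytes and elapsed centiseconds (truncating). */
static unsigned long kb_per_sec(unsigned long kb, unsigned long centisecs)
{
	if (centisecs == 0)
		centisecs = 1;	/* avoid division by zero */
	return kb * 100 / centisecs;
}

/* Format a summary in the style of the "PM: Freed ..." lines. */
static int format_freed(char *buf, size_t len,
			unsigned long nr_pages, unsigned long centisecs)
{
	unsigned long kb = pages_to_kb(nr_pages);
	unsigned long kps = kb_per_sec(kb, centisecs);

	return snprintf(buf, len,
			"Freed %lu kbytes in %lu.%02lu seconds (%lu.%02lu MB/s)",
			kb, centisecs / 100, centisecs % 100,
			kps / 1000, (kps % 1000) / 10);
}
```

Under these assumptions, 353345 pages in 703 centiseconds gives
1413380 kbytes at 201.04 MB/s, and 372529 pages in 937 centiseconds
gives 1490116 kbytes at 159.03 MB/s, which agrees with both lines.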

Below are the logs I collected by injecting printks. There are
basically two major problems:
- swsusp_shrink_memory() scans the whole 2G of memory again and again;
- shrink_all_memory() is slow. It won't reclaim pages at all with
  small priority values, because its batch size is 10000 pages.
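
The batching effect in the second point can be illustrated with a toy
model. This is only a sketch under stated assumptions, not the kernel
source: it assumes that each priority step adds (lru_pages >> prio) + 1
pages of scan credit to an LRU list, and that nothing is scanned until
the accumulated credit reaches the requested batch. The function name
and the example sizes are hypothetical:

```c
/*
 * Toy model (an assumption, not kernel code): walk priorities from
 * start_prio down to 0, accumulating scan credit for one LRU list.
 * Returns the first priority at which the credit reaches the batch
 * size, i.e. the first pass that would actually scan, or -1 if the
 * credit never reaches it.
 */
static int first_scanning_prio(unsigned long lru_pages,
			       unsigned long batch, int start_prio)
{
	unsigned long credit = 0;
	int prio;

	for (prio = start_prio; prio >= 0; prio--) {
		/* Credit per step roughly doubles as prio counts down. */
		credit += (lru_pages >> prio) + 1;
		if (credit >= batch)
			return prio;
	}
	return -1;
}
```

In this model, a hypothetical 200000-page list with a 10000-page batch
does no scanning from priority 12 down to 6 and first scans at
priority 5, while a 100-page batch would start scanning at priority 11.
The larger the batch, the more priority passes are spent doing nothing.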

I wonder if it's possible to free up the memory within 1s at all.
(Maybe the slowness is due to too many enabled debugging options...)

Thanks,
Fengguang
---

vanilla 2.6.30-rc2-next-20090417:

[  124.516187] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  124.523087] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  124.530060] PM: Basic memory bitmaps created
[  124.534421] PM: Syncing filesystems ... done.
[  124.842282] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  124.849800] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  124.857571] PM: Shrinking memory...  tmp=471584, size=491906, highmem_size=0
[  124.939103] shrink_all_memory: pages=10000
[  125.019543] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.027636]-tmp=451770, size=481986, highmem_size=0
[  125.107571] shrink_all_memory: pages=10000
[  125.139928] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=8500
[  125.280940] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=1500, reclaimed=1500
[  125.547990] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.556135]\tmp=411598, size=461898, highmem_size=0
[  125.637414] shrink_all_memory: pages=10000
[  125.716890] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=10000
[  125.725092]|tmp=391507, size=451854, highmem_size=0
[  125.806935] shrink_all_memory: pages=10000
[  125.886317] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  125.894531]/tmp=371481, size=441841, highmem_size=0
[  125.976823] shrink_all_memory: pages=10000
[  126.104367] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  126.112572]-tmp=351715, size=431952, highmem_size=0
[  126.195178] shrink_all_memory: pages=10000
[  126.274586] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.282698]\tmp=331949, size=422063, highmem_size=0
[  126.365743] shrink_all_memory: pages=10000
[  126.445851] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.453968]|tmp=311858, size=412019, highmem_size=0
[  126.537417] shrink_all_memory: pages=10000
[  126.616980] shrink_all_zones: pass=0, prio=9, lru=Normal.2, pages=10000, reclaimed=10000
[  126.625180]/tmp=291751, size=401975, highmem_size=0
[  126.709066] shrink_all_memory: pages=10000
[  126.788665] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  126.796833]-tmp=271725, size=391962, highmem_size=0
[  126.880997] shrink_all_memory: pages=10000
[  127.008443] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.016667]\tmp=251716, size=381949, highmem_size=0
[  127.101581] shrink_all_memory: pages=10000
[  127.181588] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.189728]|tmp=231673, size=371936, highmem_size=0
[  127.275105] shrink_all_memory: pages=10000
[  127.354799] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.363003]/tmp=211599, size=361892, highmem_size=0
[  127.448750] shrink_all_memory: pages=10000
[  127.528252] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.536385]-tmp=191621, size=351910, highmem_size=0
[  127.622369] shrink_all_memory: pages=10000
[  127.750093] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  127.758295]\tmp=171539, size=341866, highmem_size=0
[  127.844867] shrink_all_memory: pages=10000
[  127.925614] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  127.933758]|tmp=151465, size=331822, highmem_size=0
[  128.020878] shrink_all_memory: pages=10000
[  128.100580] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  128.108803]/tmp=131391, size=321778, highmem_size=0
[  128.196312] shrink_all_memory: pages=10000
[  128.275643] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  128.283769]-tmp=111413, size=311796, highmem_size=0
[  128.371814] shrink_all_memory: pages=10000
[  128.501803] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  128.510007]\tmp=91339, size=301752, highmem_size=0
[  128.597726] shrink_all_memory: pages=10000
[  128.677138] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  128.685277]|tmp=71296, size=291739, highmem_size=0
[  128.774061] shrink_all_memory: pages=10000
[  128.855940] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  128.864145]/tmp=51259, size=281726, highmem_size=0
[  128.953486] shrink_all_memory: pages=10000
[  129.033417] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.041553]-tmp=31172, size=271682, highmem_size=0
[  129.131233] shrink_all_memory: pages=10000
[  129.210693] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  129.218994]\tmp=11146, size=261669, highmem_size=0
[  129.309142] shrink_all_memory: pages=10000
[  129.388523] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.396648]|tmp=-8880, size=251656, highmem_size=0
[  129.487193] shrink_all_memory: pages=10000
[  129.614831] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  129.623059]/tmp=-28954, size=241612, highmem_size=0
[  129.714055] shrink_all_memory: pages=10000
[  129.794104] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  129.802246]-tmp=-48932, size=231630, highmem_size=0
[  129.893893] shrink_all_memory: pages=10000
[  129.973667] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=10000, reclaimed=10000
[  129.981892]\tmp=-69020, size=221586, highmem_size=0
[  130.073916] shrink_all_memory: pages=10000
[  130.154620] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=10000
[  130.162853]|tmp=-89156, size=211511, highmem_size=0
[  130.255274] shrink_all_memory: pages=10000
[  130.334612] shrink_all_zones: pass=0, prio=8, lru=DMA32.2, pages=10000, reclaimed=10000
[  130.342750]/tmp=-109182, size=201498, highmem_size=0
[  130.435551] shrink_all_memory: pages=10000
[  130.515074] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=10000
[  130.523305]-tmp=-129273, size=191454, highmem_size=0
[  130.616714] shrink_all_memory: pages=10000
[  130.696350] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=10000
[  130.704490]\tmp=-149299, size=181441, highmem_size=0
[  130.798322] shrink_all_memory: pages=10000
[  130.877834] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=10000
[  130.886038]|tmp=-169325, size=171428, highmem_size=0
[  130.980312] shrink_all_memory: pages=10000
[  131.107844] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=10000
[  131.115982]/tmp=-189351, size=161415, highmem_size=0
[  131.210530] shrink_all_memory: pages=10000
[  131.291223] shrink_all_zones: pass=0, prio=7, lru=Normal.2, pages=10000, reclaimed=10000
[  131.299433]-tmp=-209459, size=151371, highmem_size=0
[  131.394488] shrink_all_memory: pages=10000
[  131.474123] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=10000
[  131.482344]\tmp=-229420, size=141389, highmem_size=0
[  131.577910] shrink_all_memory: pages=10000
[  131.657376] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=10000
[  131.665498]|tmp=-249511, size=131345, highmem_size=0
[  131.761676] shrink_all_memory: pages=3345
[  131.791048] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=3345, reclaimed=3345
[  131.799085]/tmp=-256256, size=127966, highmem_size=0
[  131.895290]done (353345 pages freed)
[  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)


1/30 memory being mapped, vanilla 2.6.30-rc2-next-20090417:

AnonPages:         38684 kB
Mapped:            66940 kB

[  722.944082] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  722.956215] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  722.963053] PM: Basic memory bitmaps created
[  722.967365] PM: Syncing filesystems ... done.
[  723.361274] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  723.369310] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  723.377342] PM: Shrinking memory...  tmp=508165, size=510179, highmem_size=0
[  723.563602] shrink_all_memory: pages=10000
[  723.648921] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9766
[  723.733064] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19581
[  723.741225]-tmp=468972, size=490587, highmem_size=0
[  723.821406] shrink_all_memory: pages=10000
[  723.902912] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9804
[  723.987433] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19617
[  723.995565]\tmp=429714, size=470964, highmem_size=0
[  724.077458] shrink_all_memory: pages=10000
[  724.160394] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9808
[  724.261056] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19610
[  724.269482]|tmp=390489, size=451341, highmem_size=0
[  724.353672] shrink_all_memory: pages=10000
[  724.556153] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9806
[  724.669591] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19636
[  724.677770]/tmp=351365, size=431780, highmem_size=0
[  724.762188] shrink_all_memory: pages=10000
[  724.923372] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9805
[  725.037897] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19620
[  725.046501]-tmp=312189, size=412188, highmem_size=0
[  725.133452] shrink_all_memory: pages=10000
[  725.371199] shrink_all_zones: pass=0, prio=5, lru=DMA32.2, pages=10000, reclaimed=9781
[  725.519983] shrink_all_zones: pass=0, prio=5, lru=Normal.2, pages=10000, reclaimed=19585
[  725.528233]\tmp=273061, size=392627, highmem_size=0
[  725.616020] shrink_all_memory: pages=10000
[  725.801211] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  725.954523] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19617
[  725.962685]|tmp=233885, size=373035, highmem_size=0
[  726.051775] shrink_all_memory: pages=10000
[  726.296589] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  726.449150] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19682
[  726.457342]/tmp=194507, size=353350, highmem_size=0
[  726.548053] shrink_all_memory: pages=10000
[  726.759180] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9803
[  726.940475] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19561
[  726.948638]-tmp=155396, size=333789, highmem_size=0
[  727.040362] shrink_all_memory: pages=10000
[  727.257478] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.442356] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19544
[  727.450548]\tmp=116319, size=314259, highmem_size=0
[  727.543609] shrink_all_memory: pages=10000
[  727.755346] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.894707] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19555
[  727.902910]|tmp=77256, size=294729, highmem_size=0
[  727.997018] shrink_all_memory: pages=10000
[  728.170973] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9799
[  728.332426] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19545
[  728.341152]/tmp=38210, size=275199, highmem_size=0
[  728.437625] shrink_all_memory: pages=10000
[  728.673862] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9773
[  728.812572] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19517
[  728.820738]-tmp=-852, size=255669, highmem_size=0
[  728.917360] shrink_all_memory: pages=10000
[  729.110178] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9839
[  729.266243] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19610
[  729.274407]\tmp=-40045, size=236077, highmem_size=0
[  729.372371] shrink_all_memory: pages=10000
[  729.553743] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=9741
[  729.673174] shrink_all_zones: pass=0, prio=3, lru=DMA.2, pages=259, reclaimed=256
[  729.681224] shrink_all_zones: pass=0, prio=3, lru=DMA32.0, pages=259, reclaimed=256
[  729.693997] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=259, reclaimed=513
[  730.006423] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=9487, reclaimed=9296
[  730.177563] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=9487, reclaimed=18626
[  730.185640]|tmp=-98138, size=207022, highmem_size=0
[  730.285280] shrink_all_memory: pages=10000
[  730.484499] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9807
[  730.637792] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19613
[  730.645975]/tmp=-137343, size=187430, highmem_size=0
[  730.746709] shrink_all_memory: pages=10000
[  730.754374] shrink_all_zones: pass=0, prio=5, lru=Normal.0, pages=10000, reclaimed=0
[  731.101101] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9777
[  731.257243] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19582
[  731.265411]-tmp=-176567, size=167807, highmem_size=0
[  731.367111] shrink_all_memory: pages=10000
[  731.567779] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9811
[  731.803019] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19615
[  731.811189]\tmp=-215837, size=148184, highmem_size=0
[  731.913738] shrink_all_memory: pages=10000
[  732.123893] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=10000, reclaimed=9808
[  732.312075] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=10000, reclaimed=19580
[  732.320234]|tmp=-254948, size=128623, highmem_size=0
[  732.423776] shrink_all_memory: pages=623
[  732.432862] shrink_all_zones: pass=0, prio=12, lru=DMA.2, pages=623, reclaimed=617
[  732.441782] shrink_all_zones: pass=0, prio=12, lru=Normal.0, pages=623, reclaimed=617
[  732.453341] shrink_all_zones: pass=0, prio=11, lru=DMA.0, pages=6, reclaimed=0
[  732.460712] shrink_all_zones: pass=0, prio=11, lru=DMA32.0, pages=6, reclaimed=0
[  732.468390] shrink_all_zones: pass=0, prio=11, lru=DMA32.2, pages=6, reclaimed=6
[  732.488091] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=623, reclaimed=617
[  732.508256] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=623, reclaimed=1233
[  732.516202]/tmp=-258774, size=126704, highmem_size=0
[  732.753869]done (372529 pages freed)
[  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)

[  725.616020] shrink_all_memory: pages=10000
[  725.801211] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  725.954523] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19617
[  725.962685]|tmp=233885, size=373035, highmem_size=0
[  726.051775] shrink_all_memory: pages=10000
[  726.296589] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  726.449150] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19682
[  726.457342]/tmp=194507, size=353350, highmem_size=0
[  726.548053] shrink_all_memory: pages=10000
[  726.759180] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9803
[  726.940475] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19561
[  726.948638]-tmp=155396, size=333789, highmem_size=0
[  727.040362] shrink_all_memory: pages=10000
[  727.257478] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.442356] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19544
[  727.450548]\tmp=116319, size=314259, highmem_size=0
[  727.543609] shrink_all_memory: pages=10000
[  727.755346] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9804
[  727.894707] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19555
[  727.902910]|tmp=77256, size=294729, highmem_size=0
[  727.997018] shrink_all_memory: pages=10000
[  728.170973] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9799
[  728.332426] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19545
[  728.341152]/tmp=38210, size=275199, highmem_size=0
[  728.437625] shrink_all_memory: pages=10000
[  728.673862] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9773
[  728.812572] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19517
[  728.820738]-tmp=-852, size=255669, highmem_size=0
[  728.917360] shrink_all_memory: pages=10000
[  729.110178] shrink_all_zones: pass=0, prio=4, lru=DMA32.2, pages=10000, reclaimed=9839
[  729.266243] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=19610
[  729.274407]\tmp=-40045, size=236077, highmem_size=0
[  729.372371] shrink_all_memory: pages=10000
[  729.553743] shrink_all_zones: pass=0, prio=4, lru=Normal.2, pages=10000, reclaimed=9741
[  729.673174] shrink_all_zones: pass=0, prio=3, lru=DMA.2, pages=259, reclaimed=256
[  729.681224] shrink_all_zones: pass=0, prio=3, lru=DMA32.0, pages=259, reclaimed=256
[  729.693997] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=259, reclaimed=513
[  730.006423] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=9487, reclaimed=9296
[  730.177563] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=9487, reclaimed=18626
[  730.185640]|tmp=-98138, size=207022, highmem_size=0
[  730.285280] shrink_all_memory: pages=10000
[  730.484499] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9807
[  730.637792] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19613
[  730.645975]/tmp=-137343, size=187430, highmem_size=0
[  730.746709] shrink_all_memory: pages=10000
[  730.754374] shrink_all_zones: pass=0, prio=5, lru=Normal.0, pages=10000, reclaimed=0
[  731.101101] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9777
[  731.257243] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19582
[  731.265411]-tmp=-176567, size=167807, highmem_size=0
[  731.367111] shrink_all_memory: pages=10000
[  731.567779] shrink_all_zones: pass=0, prio=3, lru=DMA32.2, pages=10000, reclaimed=9811
[  731.803019] shrink_all_zones: pass=0, prio=3, lru=Normal.2, pages=10000, reclaimed=19615
[  731.811189]\tmp=-215837, size=148184, highmem_size=0
[  731.913738] shrink_all_memory: pages=10000
[  732.123893] shrink_all_zones: pass=0, prio=2, lru=DMA32.2, pages=10000, reclaimed=9808
[  732.312075] shrink_all_zones: pass=0, prio=2, lru=Normal.2, pages=10000, reclaimed=19580
[  732.320234]|tmp=-254948, size=128623, highmem_size=0
[  732.423776] shrink_all_memory: pages=623
[  732.432862] shrink_all_zones: pass=0, prio=12, lru=DMA.2, pages=623, reclaimed=617
[  732.441782] shrink_all_zones: pass=0, prio=12, lru=Normal.0, pages=623, reclaimed=617
[  732.453341] shrink_all_zones: pass=0, prio=11, lru=DMA.0, pages=6, reclaimed=0
[  732.460712] shrink_all_zones: pass=0, prio=11, lru=DMA32.0, pages=6, reclaimed=0
[  732.468390] shrink_all_zones: pass=0, prio=11, lru=DMA32.2, pages=6, reclaimed=6
[  732.488091] shrink_all_zones: pass=0, prio=6, lru=DMA32.2, pages=623, reclaimed=617
[  732.508256] shrink_all_zones: pass=0, prio=6, lru=Normal.2, pages=623, reclaimed=1233
[  732.516202]/tmp=-258774, size=126704, highmem_size=0
[  732.753869]done (372529 pages freed)
[  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03 16:15                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03 16:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, pavel, jens.axboe, alan-jenkins, linux-kernel,
	kernel-testers, linux-pm

On Sunday 03 May 2009, Linus Torvalds wrote:
> 
> On Sun, 3 May 2009, Rafael J. Wysocki wrote:
> >
> > Remove the shrinking of memory from the suspend-to-RAM code, where it is 
> > not really necessary.
> 
> Hmm. Shouldn't we do this _regardless_?
> 
> IOW, shouldn't this be a totally separate patch? It seems to be left-over 
> from when we shared the same code-paths, and before the split of the STR 
> and hibernate code?
> 
> IOW, shouldn't the very _first_ patch just be this part? That code doesn't 
> make any sense anyway (that FREE_PAGE_NUMBER really _is_ totally 
> arbitrary).
> 
> This part seems to be totally independent of all the other parts in your 
> patch-series. No?

I'm removing this along with shrink_all_memory() which it depends on, but I can
put that into a separate patch if you prefer.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
  2009-05-03 11:51                                           ` Wu Fengguang
@ 2009-05-03 16:22                                             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03 16:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sunday 03 May 2009, Wu Fengguang wrote:
> On Sun, May 03, 2009 at 02:24:20AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Modify the hibernation memory shrinking code so that it will make
> > memory allocations to free memory instead of using an artificial
> > memory shrinking mechanism for that.  Remove the shrinking of
> > memory from the suspend-to-RAM code, where it is not really
> > necessary.  Finally, remove the no longer used memory shrinking
> > functions from mm/vmscan.c .
> > 
> > [rev. 2: Use the existing memory bitmaps for marking preallocated
> >  image pages and use swsusp_free() for releasing them, introduce
> >  GFP_IMAGE, add comments describing the memory shrinking strategy.]
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  kernel/power/main.c     |   20 ------
> >  kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
> >  mm/vmscan.c             |  142 ------------------------------------------------
> >  3 files changed, 101 insertions(+), 193 deletions(-)
> > 
> > Index: linux-2.6/kernel/power/snapshot.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/snapshot.c
> > +++ linux-2.6/kernel/power/snapshot.c
> > @@ -1066,41 +1066,97 @@ void swsusp_free(void)
> >  	buffer = NULL;
> >  }
> >  
> > +/* Helper functions used for the shrinking of memory. */
> > +
> > +#ifdef CONFIG_HIGHMEM
> > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
> > +#else
> > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
> > +#endif
> 
> The CONFIG_HIGHMEM test is not necessary: __GFP_HIGHMEM is always defined.
> 
> > +#define SHRINK_BITE	10000
> 
> This is ~40MB. A full scan of (for example) 8G pages will be time
> consuming, not to mention we have to do it 2*(8G-500M)/40M = 384 times!
> 
> Can we make it a LONG_MAX? 

No, I don't think so.  The problem is that the number of pages we'll need to
copy generally shrinks as we allocate memory, so we can't do that in one shot.

We can make it a greater number, but I don't really think it would be a good
idea to make it greater than 100 MB.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
  2009-05-03 13:08                                         ` Wu Fengguang
@ 2009-05-03 16:30                                           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03 16:30 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sunday 03 May 2009, Wu Fengguang wrote:
> 
> Hi Rafael,

Hi,

> I happened to be doing some benchmarks on the older shrink_all_memory().
> Hopefully they can be a useful reference point for the new design.
> 
> The current swsusp_shrink_memory()/shrink_all_memory() are terribly
> inefficient: it takes 7-9s to free up 1.4G memory:

One reason may be that it takes too many steps to do it,

> [  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)
> [  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)

because the new way doesn't seem to do any better.

> Below are the logs I collected by injecting printks. There are
> basically two major problems:
> - swsusp_shrink_memory() scans the whole 2G memory again and again;
> - shrink_all_memory() is slow. It won't reclaim pages at all with
>   small priority values, because its batching size is 10000 pages.

I know that swsusp_shrink_memory() has problems; that's why I'd like to get rid
of it.

> I wonder if it's possible to free up the memory within 1s at all.

I'm not sure.

Apparently, the counting of saveable pages takes substantial time (0.5 s each
iteration on my 64-bit test box), so we can improve that by limiting the number
of iterations.

Well, perhaps we can do it all in one shot after all; I'll think about how to
do that.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-03 16:35                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-03 16:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Linus Torvalds, Andrew Morton, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sunday 03 May 2009, Pavel Machek wrote:
> Hi!

Hi,

> > > Remove the shrinking of memory from the suspend-to-RAM code, where it is 
> > > not really necessary.
> > 
> > Hmm. Shouldn't we do this _regardless_?
> > 
> > IOW, shouldn't this be a totally separate patch? It seems to be left-over 
> > from when we shared the same code-paths, and before the split of the STR 
> > and hibernate code?
> > 
> > IOW, shouldn't the very _first_ patch just be this part? That code doesn't 
> > make any sense anyway (that FREE_PAGE_NUMBER really _is_ totally 
> > arbitrary).
> > 
> > This part seems to be totally independent of all the other parts in your 
> > patch-series. No?
> 
> I'm not sure this one is a good idea: drivers will need to allocate
> memory during suspend/resume, and when processes are frozen/disk
> driver is suspended, normal memory management will no longer work.
> 
> So, freeing 4M of memory before starting suspend seems like a good
> idea. That way those small allocations will not fail.

I don't think we've ever had problems with the drivers having too little
memory to suspend.

I'm opting for removing this code and seeing if that leads to any regressions.
If it does, we can still get some free memory by allocating and releasing it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 0/5] PM: Drop shrink_all_memory (rev. 3)
@ 2009-05-04  0:08                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:08 UTC (permalink / raw)
  To: Wu Fengguang, linux-pm
  Cc: Andrew Morton, pavel, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers

On Sunday 03 May 2009, Rafael J. Wysocki wrote:
> On Sunday 03 May 2009, Wu Fengguang wrote:
> > 
> > Hi Rafael,
> 
> Hi,
> 
> > I happened to be doing some benchmarks on the older shrink_all_memory().
> > Hopefully they can be a useful reference point for the new design.
> > 
> > The current swsusp_shrink_memory()/shrink_all_memory() are terribly
> > inefficient: it takes 7-9s to free up 1.4G memory:
> 
> One reason may be that it takes too many steps to do it,
> 
> > [  131.899389] PM: Freed 1413380 kbytes in 7.03 seconds (201.04 MB/s)
> > [  732.757916] PM: Freed 1490116 kbytes in 9.37 seconds (159.03 MB/s)
> 
> because the new way doesn't seem to do any better.
> 
> > Below are the logs I collected by injecting printks. There are
> > basically two major problems:
> > - swsusp_shrink_memory() scans the whole 2G memory again and again;
> > - shrink_all_memory() is slow. It won't reclaim pages at all with
> >   small priority values, because its batching size is 10000 pages.
> 
> I know that swsusp_shrink_memory() has problems, that's why I'd like to get rid
> of it.
> 
> > I wonder if it's possible to free up the memory within 1s at all.
> 
> I'm not sure.
> 
> Apparently, the counting of saveable pages takes substantial time (0.5 s each
> iteration on my 64-bit test box), so we can improve that by limiting the number
> of iterations.
> 
> Well, perhaps we can do it all in one shot after all, I'll think how to do that.

I've changed swsusp_shrink_memory() to preallocate all of the pages in one
iteration.  Although it doesn't seem to improve the speed of memory shrinking,
the function is simpler in this form.

Anyway, updated patch series follows:

[1/5] - Andrew's patch introducing __GFP_NO_OOM_KILL (I decided it would be
        better to do it this way in this particular case.  The fact that the OOM
        killer is not going to work after tasks have been frozen is a different
        issue.)

[2/5] - move swsusp_shrink_memory to snapshot.c, no major changes

[3/5] - remove the shrinking of memory from suspend code (in a separate patch
        as requested by Linus)

[4/5] - use memory allocations to make room for the image

[5/5] - do not release all memory allocated by [4/5] and use it for
        creating the image directly (some allocated memory is released).

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04  0:10                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:10 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Andrew Morton <akpm@linux-foundation.org>

> > Remind me: why can't we just allocate N pages at suspend-time?
> 
> We need half of memory free. The reason we can't "just allocate" is
> probably OOM killer; but my memories are quite weak :-(.

hm.  You'd think that with our splendid range of __GFP_foo flags, there
would be some combo which would suit this requirement but I can't
immediately spot one.

We can always add another I guess.  Something like...

[rjw: fixed white space]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/gfp.h |    3 ++-
 mm/page_alloc.c     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1620,7 +1620,8 @@ nofail_alloc:
 		}
 
 		/* The OOM killer will not help higher order allocs so fail */
-		if (order > PAGE_ALLOC_COSTLY_ORDER) {
+		if (order > PAGE_ALLOC_COSTLY_ORDER ||
+				(gfp_mask & __GFP_NO_OOM_KILL)) {
 			clear_zonelist_oom(zonelist, gfp_mask);
 			goto nopage;
 		}
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -51,8 +51,9 @@ struct vm_area_struct;
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
 #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
 
-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* This equals 0, but use constants in case they ever change */

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/5] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
@ 2009-05-04  0:11                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:11 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c.

[rev. 2: Make some functions static and remove their headers from
 kernel/power/power.h]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/power.h    |    4 --
 kernel/power/snapshot.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/power/swsusp.c   |   76 ---------------------------------------------
 3 files changed, 79 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -840,7 +848,7 @@ static struct page *saveable_highmem_pag
  *	pages.
  */
 
-unsigned int count_highmem_pages(void)
+static unsigned int count_highmem_pages(void)
 {
 	struct zone *zone;
 	unsigned int n = 0;
@@ -902,7 +910,7 @@ static struct page *saveable_page(struct
  *	pages.
  */
 
-unsigned int count_data_pages(void)
+static unsigned int count_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long pfn, max_zone_pfn;
@@ -1058,6 +1066,74 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
 #ifdef CONFIG_HIGHMEM
 /**
   *	count_pages_for_highmem - compute the number of non-highmem pages
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern unsigned int count_data_pages(void);
+extern int swsusp_shrink_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
@@ -149,7 +149,6 @@ extern int swsusp_swap_in_use(void);
 
 /* kernel/power/disk.c */
 extern int swsusp_check(void);
-extern int swsusp_shrink_memory(void);
 extern void swsusp_free(void);
 extern int swsusp_read(unsigned int *flags_p);
 extern int swsusp_write(unsigned int flags);
@@ -176,7 +175,6 @@ extern int pm_notifier_call_chain(unsign
 #endif
 
 #ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
 int restore_highmem(void);
 #else
 static inline unsigned int count_highmem_pages(void) { return 0; }

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 2/5] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
@ 2009-05-04  0:11                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:11 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Andrew Morton, pavel-+ZI9xUNit7I,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA

From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>

The next patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that.  For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c.

[rev. 2: Make some functions static and remove their headers from
 kernel/power/power.h]

Signed-off-by: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
---
 kernel/power/power.h    |    4 --
 kernel/power/snapshot.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/power/swsusp.c   |   76 ---------------------------------------------
 3 files changed, 79 insertions(+), 81 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -39,6 +39,14 @@ static int swsusp_page_is_free(struct pa
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
 
+/*
+ * Preferred image size in bytes (tunable via /sys/power/image_size).
+ * When it is set to N, swsusp will do its best to ensure the image
+ * size will not exceed N bytes, but if that is impossible, it will
+ * try to create the smallest image possible.
+ */
+unsigned long image_size = 500 * 1024 * 1024;
+
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
  * allocated by the "resume" kernel, so their contents cannot be written
@@ -840,7 +848,7 @@ static struct page *saveable_highmem_pag
  *	pages.
  */
 
-unsigned int count_highmem_pages(void)
+static unsigned int count_highmem_pages(void)
 {
 	struct zone *zone;
 	unsigned int n = 0;
@@ -902,7 +910,7 @@ static struct page *saveable_page(struct
  *	pages.
  */
 
-unsigned int count_data_pages(void)
+static unsigned int count_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long pfn, max_zone_pfn;
@@ -1058,6 +1066,74 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/**
+ *	swsusp_shrink_memory -  Try to free as much memory as needed
+ *
+ *	... but do not OOM-kill anyone
+ *
+ *	Notice: all userland should be stopped before it is called, or
+ *	livelock is possible.
+ */
+
+#define SHRINK_BITE	10000
+static inline unsigned long __shrink_memory(long tmp)
+{
+	if (tmp > SHRINK_BITE)
+		tmp = SHRINK_BITE;
+	return shrink_all_memory(tmp);
+}
+
+int swsusp_shrink_memory(void)
+{
+	long tmp;
+	struct zone *zone;
+	unsigned long pages = 0;
+	unsigned int i = 0;
+	char *p = "-\\|/";
+	struct timeval start, stop;
+
+	printk(KERN_INFO "PM: Shrinking memory...  ");
+	do_gettimeofday(&start);
+	do {
+		long size, highmem_size;
+
+		highmem_size = count_highmem_pages();
+		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
+		tmp = size;
+		size += highmem_size;
+		for_each_populated_zone(zone) {
+			tmp += snapshot_additional_pages(zone);
+			if (is_highmem(zone)) {
+				highmem_size -=
+					zone_page_state(zone, NR_FREE_PAGES);
+			} else {
+				tmp -= zone_page_state(zone, NR_FREE_PAGES);
+				tmp += zone->lowmem_reserve[ZONE_NORMAL];
+			}
+		}
+
+		if (highmem_size < 0)
+			highmem_size = 0;
+
+		tmp += highmem_size;
+		if (tmp > 0) {
+			tmp = __shrink_memory(tmp);
+			if (!tmp)
+				return -ENOMEM;
+			pages += tmp;
+		} else if (size > image_size / PAGE_SIZE) {
+			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
+			pages += tmp;
+		}
+		printk("\b%c", p[i++%4]);
+	} while (tmp > 0);
+	do_gettimeofday(&stop);
+	printk("\bdone (%lu pages freed)\n", pages);
+	swsusp_show_speed(&start, &stop, pages, "Freed");
+
+	return 0;
+}
+
 #ifdef CONFIG_HIGHMEM
 /**
   *	count_pages_for_highmem - compute the number of non-highmem pages
Index: linux-2.6/kernel/power/swsusp.c
===================================================================
--- linux-2.6.orig/kernel/power/swsusp.c
+++ linux-2.6/kernel/power/swsusp.c
@@ -55,14 +55,6 @@
 
 #include "power.h"
 
-/*
- * Preferred image size in bytes (tunable via /sys/power/image_size).
- * When it is set to N, swsusp will do its best to ensure the image
- * size will not exceed N bytes, but if that is impossible, it will
- * try to create the smallest image possible.
- */
-unsigned long image_size = 500 * 1024 * 1024;
-
 int in_suspend __nosavedata = 0;
 
 /**
@@ -195,74 +187,6 @@ void swsusp_show_speed(struct timeval *s
 			kps / 1000, (kps % 1000) / 10);
 }
 
-/**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
- */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
-int swsusp_shrink_memory(void)
-{
-	long tmp;
-	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
-	struct timeval start, stop;
-
-	printk(KERN_INFO "PM: Shrinking memory...  ");
-	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
-
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
-
-		if (highmem_size < 0)
-			highmem_size = 0;
-
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
-	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
-
-	return 0;
-}
-
 /*
  * Platforms, like ACPI, may want us to save some memory used by them during
  * hibernation and to restore the contents of this memory during the subsequent
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern unsigned int count_data_pages(void);
+extern int swsusp_shrink_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
@@ -149,7 +149,6 @@ extern int swsusp_swap_in_use(void);
 
 /* kernel/power/disk.c */
 extern int swsusp_check(void);
-extern int swsusp_shrink_memory(void);
 extern void swsusp_free(void);
 extern int swsusp_read(unsigned int *flags_p);
 extern int swsusp_write(unsigned int flags);
@@ -176,7 +175,6 @@ extern int pm_notifier_call_chain(unsign
 #endif
 
 #ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
 int restore_highmem(void);
 #else
 static inline unsigned int count_highmem_pages(void) { return 0; }


* [PATCH 3/5] PM/Suspend: Do not shrink memory before suspend
@ 2009-05-04  0:12                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:12 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

Remove the shrinking of memory from the suspend-to-RAM code, where
it is not really necessary.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c |   20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -188,9 +188,6 @@ static void suspend_test_finish(const ch
 
 #endif
 
-/* This is just an arbitrary number */
-#define FREE_PAGE_NUMBER (100)
-
 static struct platform_suspend_ops *suspend_ops;
 
 /**
@@ -226,7 +223,6 @@ int suspend_valid_only_mem(suspend_state
 static int suspend_prepare(void)
 {
 	int error;
-	unsigned int free_pages;
 
 	if (!suspend_ops || !suspend_ops->enter)
 		return -EPERM;
@@ -241,24 +237,10 @@ static int suspend_prepare(void)
 	if (error)
 		goto Finish;
 
-	if (suspend_freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
-	free_pages = global_page_state(NR_FREE_PAGES);
-	if (free_pages < FREE_PAGE_NUMBER) {
-		pr_debug("PM: free some memory\n");
-		shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
-		if (nr_free_pages() < FREE_PAGE_NUMBER) {
-			error = -ENOMEM;
-			printk(KERN_ERR "PM: No enough memory\n");
-		}
-	}
+	error = suspend_freeze_processes();
 	if (!error)
 		return 0;
 
- Thaw:
 	suspend_thaw_processes();
 	usermodehelper_enable();
  Finish:


* [PATCH 4/5] PM/Hibernate: Use memory allocations to free memory (rev. 3)
@ 2009-05-04  0:20                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:20 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it frees memory
by making memory allocations instead of using an artificial memory
shrinking mechanism.  Remove the memory shrinking functions that are
no longer used from mm/vmscan.c.

[rev. 2: Use the existing memory bitmaps for marking preallocated
 image pages and use swsusp_free() for releasing them, add comments
 describing the memory shrinking strategy.
 rev. 3: Change the memory shrinking strategy to preallocate as much
 memory as needed to get the right image size in one shot.]
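
The sizing arithmetic of the one-shot strategy can be sketched in user
space as follows.  This is a model under stated assumptions, not the
kernel code: struct shrink_inputs, struct shrink_plan and plan_shrink()
are hypothetical names, every input is supplied by the caller instead of
being read from the zones, and the two underflow guards are additions
that the patch itself does not contain.  Preallocating count - max_size
page frames is, up to rounding, the same as allocating
(count + metadata + PAGES_FOR_IO + SPARE_PAGES) / 2 of them, which is
the formula quoted in the comment above swsusp_shrink_memory().

```c
#include <stddef.h>

/* Hypothetical inputs; all values are page frames unless noted. */
struct shrink_inputs {
	unsigned long saveable;		/* pages that must be copied into the image */
	unsigned long free_pages;	/* free frames summed over all zones */
	unsigned long lowmem_reserve;	/* frames held back in non-highmem zones */
	unsigned long metadata;		/* frames needed for image metadata */
	unsigned long pages_for_io;	/* stand-in for PAGES_FOR_IO */
	unsigned long spare_pages;	/* stand-in for SPARE_PAGES */
	unsigned long image_size;	/* preferred image size, in bytes */
	unsigned long page_size;	/* bytes per page, must be non-zero */
};

struct shrink_plan {
	unsigned long max_size;		/* largest image that still fits, in pages */
	unsigned long target;		/* saveable pages allowed to remain */
	unsigned long to_prealloc;	/* frames to preallocate, 0 if none needed */
};

struct shrink_plan plan_shrink(const struct shrink_inputs *in)
{
	struct shrink_plan plan = { 0, 0, 0 };
	unsigned long usable = in->saveable + in->free_pages;
	unsigned long overhead = in->metadata + in->pages_for_io +
				 in->spare_pages;
	unsigned long count;

	/* Guards added for the sketch: avoid unsigned wrap-around. */
	if (usable <= in->lowmem_reserve)
		return plan;
	count = usable - in->lowmem_reserve;
	if (count <= overhead)
		return plan;

	/* The image and its in-memory copy must both fit, hence the halving. */
	plan.max_size = (count - overhead) / 2;

	/* Round image_size up to whole pages, then clamp to the maximum. */
	plan.target = (in->image_size + in->page_size - 1) / in->page_size;
	if (plan.target > plan.max_size)
		plan.target = plan.max_size;

	/* Fewer saveable pages than the target: nothing has to be freed. */
	if (plan.target > in->saveable)
		return plan;

	plan.to_prealloc = count - plan.target;
	return plan;
}
```

As an illustration with made-up numbers: 300000 saveable pages, 200000
free pages, no lowmem reserve, 100 metadata pages, 1024 and 256 pages
for the two reserves, 4096-byte pages and a 500 MB image_size give
max_size = 249310, a target of 128000 pages and 372000 page frames to
preallocate.  With only 100000 saveable pages the target already exceeds
what is in use, so nothing is preallocated.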

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |  119 +++++++++++++++++++++++-----------------
 mm/vmscan.c             |  142 ------------------------------------------------
 2 files changed, 70 insertions(+), 191 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,69 +1066,90 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper function used for the shrinking of memory. */
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * compute the total number of available page frames and allocate at least
  *
- *	... but do not OOM-kill anyone
+ * ([page frames total] + PAGES_FOR_IO + SPARE_PAGES + [metadata pages]) / 2
  *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * of them, which corresponds to the maximum size of a hibernation image.
+ *
+ * If image_size is set below the number following from the above formula,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size or it is impossible to
+ * allocate more memory, whichever happens first.
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
 int swsusp_shrink_memory(void)
 {
-	long tmp;
 	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
+	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
+	int error = 0;
 
-	printk(KERN_INFO "PM: Shrinking memory...  ");
+	printk(KERN_INFO "PM: Shrinking memory ... ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
+	/* Count the number of saveable data pages. */
+	saveable = count_data_pages() + count_highmem_pages();
+
+	/*
+	 * Compute the total number of page frames we can use (count) and the
+	 * number of pages needed for image metadata (size).
+	 */
+	count = saveable;
+	size = 0;
+	for_each_populated_zone(zone) {
+		size += snapshot_additional_pages(zone);
+		count += zone_page_state(zone, NR_FREE_PAGES);
+		if (!is_highmem(zone))
+			count -= zone->lowmem_reserve[ZONE_NORMAL];
+	}
+
+	/* Compute the maximum number of saveable pages to leave in memory. */
+	max_size = (count - (size + PAGES_FOR_IO + SPARE_PAGES)) / 2;
+	size = DIV_ROUND_UP(image_size, PAGE_SIZE);
+	if (size > max_size)
+		size = max_size;
+	/*
+	 * If the current number of saveable pages is less than the maximum,
+	 * we don't need to do anything more.
+	 */
+	if (size > saveable)
+		goto out;
 
-		if (highmem_size < 0)
-			highmem_size = 0;
+	/* Preallocate memory. */
+	for (count -= size; count > 0; count--) {
+		struct page *page;
 
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+		page = alloc_image_page(GFP_KERNEL | __GFP_NO_OOM_KILL);
+		if (!page)
+			break;
+		pages++;
+	}
+	/* If size < max_size, preallocating enough memory may be impossible. */
+	if (count > 0 && size == max_size)
+		error = -ENOMEM;
+
+	/* Release all of the preallocated page frames. */
+	swsusp_free();
+
+	if (error) {
+		printk(KERN_CONT "\n");
+		return error;
+	}
+
+ out:
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
 	return 0;
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,


* [PATCH 4/5] PM/Hibernate: Use memory allocations to free memory (rev. 3)
@ 2009-05-04  0:20                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:20 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

Modify the hibernation memory shrinking code so that it frees memory
by making memory allocations instead of using an artificial memory
shrinking mechanism.  Remove the no longer used memory shrinking
functions from mm/vmscan.c.

[rev. 2: Use the existing memory bitmaps for marking preallocated
 image pages and use swsusp_free() for releasing them, add comments
 describing the memory shrinking strategy.
 rev. 3: Change the memory shrinking strategy to preallocate as much
 memory as needed to get the right image size in one shot.]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
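Illustration only, not part of the patch: the sizing rule described
above can be modelled in plain userspace C.  The names plan_prealloc()
and struct prealloc_plan are hypothetical, the lowmem reserve is assumed
to be already subtracted from free_pages, and PAGES_FOR_IO and
SPARE_PAGES are folded into a single "reserve" argument.  This is a
sketch of the arithmetic under those assumptions, not the kernel code.

```c
#include <assert.h>

/* Illustrative page size; the kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE	4096UL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical result type, for illustration only. */
struct prealloc_plan {
	unsigned long max_size;	/* most saveable pages that may stay in memory */
	unsigned long size;	/* target image size, in pages */
	unsigned long to_alloc;	/* page frames to preallocate, 0 if none needed */
};

/*
 * saveable   - saveable data pages (normal + highmem)
 * free_pages - free page frames, lowmem reserve already subtracted
 * meta       - pages needed for image metadata
 * reserve    - stands in for PAGES_FOR_IO + SPARE_PAGES
 * image_size - requested image size in bytes
 */
static struct prealloc_plan plan_prealloc(unsigned long saveable,
					  unsigned long free_pages,
					  unsigned long meta,
					  unsigned long reserve,
					  unsigned long image_size)
{
	struct prealloc_plan p;
	unsigned long count = saveable + free_pages;

	/* The subtraction below is unsigned; assume it cannot underflow. */
	assert(count > meta + reserve);

	/* Half of what is left after the overhead can hold the image. */
	p.max_size = (count - (meta + reserve)) / 2;

	/* Honor the requested image size, but never exceed the maximum. */
	p.size = DIV_ROUND_UP(image_size, SKETCH_PAGE_SIZE);
	if (p.size > p.max_size)
		p.size = p.max_size;

	/* Nothing to do if the saveable pages already fit. */
	p.to_alloc = (p.size > saveable) ? 0 : count - p.size;
	return p;
}
```

With made-up inputs of 200000 saveable pages, 60000 free pages, 100
metadata pages, a reserve of 1280 pages and a 500 MB image_size, the
sketch gives max_size = 129310, size = 128000 and to_alloc = 132000.
If image_size is larger than the maximum, to_alloc becomes
(count + meta + reserve) / 2, which matches the formula in the new
kerneldoc comment.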
 kernel/power/snapshot.c |  119 +++++++++++++++++++++++-----------------
 mm/vmscan.c             |  142 ------------------------------------------------
 2 files changed, 70 insertions(+), 191 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,69 +1066,90 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper function used for the shrinking of memory. */
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * compute the total number of available page frames and allocate at least
  *
- *	... but do not OOM-kill anyone
+ * ([page frames total] + PAGES_FOR_IO + SPARE_PAGES + [metadata pages]) / 2
  *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * of them, which corresponds to the maximum size of a hibernation image.
+ *
+ * If image_size is set below the number following from the above formula,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size or it is impossible to
+ * allocate more memory, whichever happens first.
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
-{
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
-}
-
 int swsusp_shrink_memory(void)
 {
-	long tmp;
 	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
+	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
+	int error = 0;
 
-	printk(KERN_INFO "PM: Shrinking memory...  ");
+	printk(KERN_INFO "PM: Shrinking memory ... ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
+	/* Count the number of saveable data pages. */
+	saveable = count_data_pages() + count_highmem_pages();
+
+	/*
+	 * Compute the total number of page frames we can use (count) and the
+	 * number of pages needed for image metadata (size).
+	 */
+	count = saveable;
+	size = 0;
+	for_each_populated_zone(zone) {
+		size += snapshot_additional_pages(zone);
+		count += zone_page_state(zone, NR_FREE_PAGES);
+		if (!is_highmem(zone))
+			count -= zone->lowmem_reserve[ZONE_NORMAL];
+	}
+
+	/* Compute the maximum number of saveable pages to leave in memory. */
+	max_size = (count - (size + PAGES_FOR_IO + SPARE_PAGES)) / 2;
+	size = DIV_ROUND_UP(image_size, PAGE_SIZE);
+	if (size > max_size)
+		size = max_size;
+	/*
+	 * If the current number of saveable pages is less than the maximum,
+	 * we don't need to do anything more.
+	 */
+	if (size > saveable)
+		goto out;
 
-		if (highmem_size < 0)
-			highmem_size = 0;
+	/* Preallocate memory. */
+	for (count -= size; count > 0; count--) {
+		struct page *page;
 
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+		page = alloc_image_page(GFP_KERNEL | __GFP_NO_OOM_KILL);
+		if (!page)
+			break;
+		pages++;
+	}
+	/* If size < max_size, preallocating enough memory may be impossible. */
+	if (count > 0 && size == max_size)
+		error = -ENOMEM;
+
+	/* Release all of the preallocated page frames. */
+	swsusp_free();
+
+	if (error) {
+		printk(KERN_CONT "\n");
+		return error;
+	}
+
+ out:
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
 	return 0;
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2054,148 +2054,6 @@ unsigned long global_lru_pages(void)
 		+ global_page_state(NR_INACTIVE_FILE);
 }
 
-#ifdef CONFIG_PM
-/*
- * Helper function for shrink_all_memory().  Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
-				      int pass, struct scan_control *sc)
-{
-	struct zone *zone;
-	unsigned long nr_reclaimed = 0;
-
-	for_each_populated_zone(zone) {
-		enum lru_list l;
-
-		if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
-			continue;
-
-		for_each_evictable_lru(l) {
-			enum zone_stat_item ls = NR_LRU_BASE + l;
-			unsigned long lru_pages = zone_page_state(zone, ls);
-
-			/* For pass = 0, we don't shrink the active list */
-			if (pass == 0 && (l == LRU_ACTIVE_ANON ||
-						l == LRU_ACTIVE_FILE))
-				continue;
-
-			zone->lru[l].nr_scan += (lru_pages >> prio) + 1;
-			if (zone->lru[l].nr_scan >= nr_pages || pass > 3) {
-				unsigned long nr_to_scan;
-
-				zone->lru[l].nr_scan = 0;
-				nr_to_scan = min(nr_pages, lru_pages);
-				nr_reclaimed += shrink_list(l, nr_to_scan, zone,
-								sc, prio);
-				if (nr_reclaimed >= nr_pages) {
-					sc->nr_reclaimed += nr_reclaimed;
-					return;
-				}
-			}
-		}
-	}
-	sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
- * freed pages.
- *
- * Rather than trying to age LRUs the aim is to preserve the overall
- * LRU order by reclaiming preferentially
- * inactive > active > active referenced > active mapped
- */
-unsigned long shrink_all_memory(unsigned long nr_pages)
-{
-	unsigned long lru_pages, nr_slab;
-	int pass;
-	struct reclaim_state reclaim_state;
-	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
-		.may_unmap = 0,
-		.may_writepage = 1,
-		.isolate_pages = isolate_pages_global,
-		.nr_reclaimed = 0,
-	};
-
-	current->reclaim_state = &reclaim_state;
-
-	lru_pages = global_lru_pages();
-	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
-	/* If slab caches are huge, it's better to hit them first */
-	while (nr_slab >= lru_pages) {
-		reclaim_state.reclaimed_slab = 0;
-		shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
-		if (!reclaim_state.reclaimed_slab)
-			break;
-
-		sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		if (sc.nr_reclaimed >= nr_pages)
-			goto out;
-
-		nr_slab -= reclaim_state.reclaimed_slab;
-	}
-
-	/*
-	 * We try to shrink LRUs in 5 passes:
-	 * 0 = Reclaim from inactive_list only
-	 * 1 = Reclaim from active list but don't reclaim mapped
-	 * 2 = 2nd pass of type 1
-	 * 3 = Reclaim mapped (normal reclaim)
-	 * 4 = 2nd pass of type 3
-	 */
-	for (pass = 0; pass < 5; pass++) {
-		int prio;
-
-		/* Force reclaiming mapped pages in the passes #3 and #4 */
-		if (pass > 2)
-			sc.may_unmap = 1;
-
-		for (prio = DEF_PRIORITY; prio >= 0; prio--) {
-			unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
-			sc.nr_scanned = 0;
-			sc.swap_cluster_max = nr_to_scan;
-			shrink_all_zones(nr_to_scan, prio, pass, &sc);
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-			if (sc.nr_reclaimed >= nr_pages)
-				goto out;
-
-			if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
-				congestion_wait(WRITE, HZ / 10);
-		}
-	}
-
-	/*
-	 * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
-	 * something in slab caches
-	 */
-	if (!sc.nr_reclaimed) {
-		do {
-			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
-			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
-		} while (sc.nr_reclaimed < nr_pages &&
-				reclaim_state.reclaimed_slab > 0);
-	}
-
-
-out:
-	current->reclaim_state = NULL;
-
-	return sc.nr_reclaimed;
-}
-#endif
-
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-04  0:22                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

Since the hibernation code is now going to use allocations of memory
to create enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but the
rearrangement allows us to avoid freeing a great number of pages and
allocating these same pages once again later, so it generally is
worth doing.

[rev. 2: Change the strategy of preallocating memory to allocate as
 many pages as needed to get the right image size in one shot (the
 excess allocated pages are released afterwards).]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
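Illustration only, not part of the patch: the "allocate in one shot,
then release the excess" strategy described above can be sketched in
userspace C with a toy page-frame pool.  struct frame_pool,
pool_alloc(), preallocate() and release_excess() are hypothetical
names; the image[] array is a stand-in for copy_bm and nr_image for
alloc_normal + alloc_highmem.  The real code walks a memory bitmap and
calls __free_page(), which this sketch does not attempt to model.

```c
#include <stdbool.h>

#define NR_FRAMES	64	/* size of the simulated pool (illustrative) */

/* Toy model of physical memory, for illustration only. */
struct frame_pool {
	bool in_use[NR_FRAMES];	/* frame is allocated to somebody */
	bool image[NR_FRAMES];	/* frame is marked as an image page */
	unsigned int nr_image;	/* number of frames marked in image[] */
};

/* Grab the lowest free frame, or return -1 if the pool is exhausted. */
static int pool_alloc(struct frame_pool *p)
{
	for (int pfn = 0; pfn < NR_FRAMES; pfn++) {
		if (!p->in_use[pfn]) {
			p->in_use[pfn] = true;
			return pfn;
		}
	}
	return -1;
}

/* Preallocate up to 'want' frames and mark each one as an image page. */
static unsigned int preallocate(struct frame_pool *p, unsigned int want)
{
	unsigned int got = 0;

	while (got < want) {
		int pfn = pool_alloc(p);

		if (pfn < 0)
			break;	/* out of memory: stop, do not retry */
		p->image[pfn] = true;
		p->nr_image++;
		got++;
	}
	return got;
}

/* Walk the marked frames from the start and free all but 'keep' of them. */
static unsigned int release_excess(struct frame_pool *p, unsigned int keep)
{
	unsigned int released = 0;

	for (int pfn = 0; pfn < NR_FRAMES && p->nr_image > keep; pfn++) {
		if (!p->image[pfn])
			continue;
		p->image[pfn] = false;	/* clear the bitmap bit */
		p->in_use[pfn] = false;	/* give the frame back */
		p->nr_image--;
		released++;
	}
	return released;
}
```

A caller would run preallocate() with the full number of frames that
must be taken away from the rest of the system, then release_excess()
with the number of frames the image actually needs.  The frames that
stay marked can be reused later as image pages, so they do not have to
be freed and allocated again.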
 kernel/power/disk.c     |   15 +++-
 kernel/power/power.h    |    2 
 kernel/power/snapshot.c |  157 ++++++++++++++++++++++++++++++------------------
 3 files changed, 112 insertions(+), 62 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1033,6 +1033,25 @@ copy_data_pages(struct memory_bitmap *co
 static unsigned int nr_copy_pages;
 /* Number of pages needed for saving the original pfns of the image pages */
 static unsigned int nr_meta_pages;
+/*
+ * Numbers of normal and highmem page frames allocated for hibernation image
+ * before suspending devices.
+ */
+unsigned int alloc_normal, alloc_highmem;
+/*
+ * Memory bitmap used for marking saveable pages (during hibernation) or
+ * hibernation image pages (during restore)
+ */
+static struct memory_bitmap orig_bm;
+/*
+ * Memory bitmap used during hibernation for marking allocated page frames that
+ * will contain copies of saveable pages.  During restore it is initially used
+ * for marking hibernation image pages, but then the set bits from it are
+ * duplicated in @orig_bm and it is released.  On highmem systems it is next
+ * used for marking "safe" highmem pages, but it has to be reinitialized for
+ * this purpose.
+ */
+static struct memory_bitmap copy_bm;
 
 /**
  *	swsusp_free - free pages allocated for the suspend.
@@ -1064,12 +1083,16 @@ void swsusp_free(void)
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
 	buffer = NULL;
+	alloc_normal = 0;
+	alloc_highmem = 0;
 }
 
 /* Helper function used for the shrinking of memory. */
 
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+
 /**
- * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ * hibernate_preallocate_memory - Preallocate memory for hibernation image
  *
  * To create a hibernation image it is necessary to make a copy of every page
  * frame in use.  We also need a number of page frames to be free during
@@ -1088,16 +1111,27 @@ void swsusp_free(void)
  * frames in use is below the requested image size or it is impossible to
  * allocate more memory, whichever happens first.
  */
-int swsusp_shrink_memory(void)
+int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
-	int error = 0;
+	int error;
 
-	printk(KERN_INFO "PM: Shrinking memory ... ");
+	printk(KERN_INFO "PM: Preallocating image memory ... ");
 	do_gettimeofday(&start);
 
+	error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	alloc_normal = 0;
+	alloc_highmem = 0;
+
 	/* Count the number of saveable data pages. */
 	saveable = count_data_pages() + count_highmem_pages();
 
@@ -1130,29 +1164,55 @@ int swsusp_shrink_memory(void)
 	for (count -= size; count > 0; count--) {
 		struct page *page;
 
-		page = alloc_image_page(GFP_KERNEL | __GFP_NO_OOM_KILL);
+		page = alloc_image_page(GFP_IMAGE);
 		if (!page)
 			break;
-		pages++;
+		memory_bm_set_bit(&copy_bm, page_to_pfn(page));
+		if (PageHighMem(page))
+			alloc_highmem++;
+		else
+			alloc_normal++;
 	}
 	/* If size < max_size, preallocating enough memory may be impossible. */
 	if (count > 0 && size == max_size)
 		error = -ENOMEM;
+	if (error)
+		goto err_out;
 
-	/* Release all of the preallocated page frames. */
-	swsusp_free();
+	/* Save the number of allocated pages for the statistics below. */
+	pages = alloc_normal + alloc_highmem;
 
-	if (error) {
-		printk(KERN_CONT "\n");
-		return error;
+	/*
+	 * We only need 'size' page frames for the image but we have allocated
+	 * more.  Release the excessive ones now.
+	 */
+	memory_bm_position_reset(&copy_bm);
+	while (alloc_normal + alloc_highmem > size) {
+		unsigned long pfn = memory_bm_next_pfn(&copy_bm);
+		struct page *page = pfn_to_page(pfn);
+
+		memory_bm_clear_bit(&copy_bm, pfn);
+		if (PageHighMem(page))
+			alloc_highmem--;
+		else
+			alloc_normal--;
+		swsusp_unset_page_forbidden(page);
+		swsusp_unset_page_free(page);
+		__free_page(page);
 	}
 
  out:
 	do_gettimeofday(&stop);
-	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
+	printk(KERN_CONT "done (allocated %lu pages, %lu image pages kept)\n",
+		pages, size);
+	swsusp_show_speed(&start, &stop, pages, "Allocated");
 
 	return 0;
+
+ err_out:
+	printk(KERN_CONT "\n");
+	swsusp_free();
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1163,7 +1223,7 @@ int swsusp_shrink_memory(void)
 
 static unsigned int count_pages_for_highmem(unsigned int nr_highmem)
 {
-	unsigned int free_highmem = count_free_highmem_pages();
+	unsigned int free_highmem = count_free_highmem_pages() + alloc_highmem;
 
 	if (free_highmem >= nr_highmem)
 		nr_highmem = 0;
@@ -1185,19 +1245,17 @@ count_pages_for_highmem(unsigned int nr_
 static int enough_free_mem(unsigned int nr_pages, unsigned int nr_highmem)
 {
 	struct zone *zone;
-	unsigned int free = 0, meta = 0;
+	unsigned int free = alloc_normal;
 
-	for_each_zone(zone) {
-		meta += snapshot_additional_pages(zone);
+	for_each_zone(zone)
 		if (!is_highmem(zone))
 			free += zone_page_state(zone, NR_FREE_PAGES);
-	}
 
 	nr_pages += count_pages_for_highmem(nr_highmem);
-	pr_debug("PM: Normal pages needed: %u + %u + %u, available pages: %u\n",
-		nr_pages, PAGES_FOR_IO, meta, free);
+	pr_debug("PM: Normal pages needed: %u + %u, available pages: %u\n",
+		nr_pages, PAGES_FOR_IO, free);
 
-	return free > nr_pages + PAGES_FOR_IO + meta;
+	return free > nr_pages + PAGES_FOR_IO;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1219,7 +1277,7 @@ static inline int get_highmem_buffer(int
  */
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
 {
 	unsigned int to_alloc = count_free_highmem_pages();
 
@@ -1230,7 +1288,7 @@ alloc_highmem_image_pages(struct memory_
 	while (to_alloc-- > 0) {
 		struct page *page;
 
-		page = alloc_image_page(__GFP_HIGHMEM);
+		page = alloc_image_page(__GFP_HIGHMEM | __GFP_NO_OOM_KILL);
 		memory_bm_set_bit(bm, page_to_pfn(page));
 	}
 	return nr_highmem;
@@ -1239,7 +1297,7 @@ alloc_highmem_image_pages(struct memory_
 static inline int get_highmem_buffer(int safe_needed) { return 0; }
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -1258,51 +1316,36 @@ static int
 swsusp_alloc(struct memory_bitmap *orig_bm, struct memory_bitmap *copy_bm,
 		unsigned int nr_pages, unsigned int nr_highmem)
 {
-	int error;
-
-	error = memory_bm_create(orig_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
-
-	error = memory_bm_create(copy_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
+	int error = 0;
 
 	if (nr_highmem > 0) {
 		error = get_highmem_buffer(PG_ANY);
 		if (error)
-			goto Free;
-
-		nr_pages += alloc_highmem_image_pages(copy_bm, nr_highmem);
+			goto err_out;
+		if (nr_highmem > alloc_highmem) {
+			nr_highmem -= alloc_highmem;
+			nr_pages += alloc_highmem_pages(copy_bm, nr_highmem);
+		}
 	}
-	while (nr_pages-- > 0) {
-		struct page *page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
-
-		if (!page)
-			goto Free;
+	if (nr_pages > alloc_normal) {
+		nr_pages -= alloc_normal;
+		while (nr_pages-- > 0) {
+			struct page *page;
 
-		memory_bm_set_bit(copy_bm, page_to_pfn(page));
+			page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
+			if (!page)
+				goto err_out;
+			memory_bm_set_bit(copy_bm, page_to_pfn(page));
+		}
 	}
+
 	return 0;
 
- Free:
+ err_out:
 	swsusp_free();
-	return -ENOMEM;
+	return error;
 }
 
-/* Memory bitmap used for marking saveable pages (during suspend) or the
- * suspend image pages (during resume)
- */
-static struct memory_bitmap orig_bm;
-/* Memory bitmap used on suspend for marking allocated pages that will contain
- * the copies of saveable pages.  During resume it is initially used for
- * marking the suspend image pages, but then its set bits are duplicated in
- * @orig_bm and it is released.  Next, on systems with high memory, it may be
- * used for marking "safe" highmem pages, but it has to be reinitialized for
- * this purpose.
- */
-static struct memory_bitmap copy_bm;
-
 asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern int swsusp_shrink_memory(void);
+extern int hibernate_preallocate_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -303,8 +303,8 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		return error;
 
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
+	/* Preallocate image memory before shutting down devices. */
+	error = hibernate_preallocate_memory();
 	if (error)
 		goto Close;
 
@@ -320,6 +320,10 @@ int hibernation_snapshot(int platform_mo
 	/* Control returns here after successful restore */
 
  Resume_devices:
+	/* We may need to release the preallocated image pages here. */
+	if (error || !in_suspend)
+		swsusp_free();
+
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
@@ -593,7 +597,10 @@ int hibernate(void)
 		goto Thaw;
 
 	error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
-	if (in_suspend && !error) {
+	if (error)
+		goto Thaw;
+
+	if (in_suspend) {
 		unsigned int flags = 0;
 
 		if (hibernation_mode == HIBERNATION_PLATFORM)
@@ -605,8 +612,8 @@ int hibernate(void)
 			power_down();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
-		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-04  0:08                                             ` Rafael J. Wysocki
                                                               ` (9 preceding siblings ...)
  (?)
@ 2009-05-04  0:22                                             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-kernel, alan-jenkins, jens.axboe, linux-pm, kernel-testers,
	torvalds, Andrew Morton

From: Rafael J. Wysocki <rjw@sisk.pl>

Since the hibernation code is now going to use allocations of memory
to create enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but the
rearrangement allows us to avoid freeing a great number of pages and
allocating these same pages once again later, so it generally is
worth doing.

[rev. 2: Change the strategy of preallocating memory to allocate as
 many pages as needed to get the right image size in one shot (the
 excess allocated pages are released afterwards).]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c     |   15 +++-
 kernel/power/power.h    |    2 
 kernel/power/snapshot.c |  157 ++++++++++++++++++++++++++++++------------------
 3 files changed, 112 insertions(+), 62 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1033,6 +1033,25 @@ copy_data_pages(struct memory_bitmap *co
 static unsigned int nr_copy_pages;
 /* Number of pages needed for saving the original pfns of the image pages */
 static unsigned int nr_meta_pages;
+/*
+ * Numbers of normal and highmem page frames allocated for hibernation image
+ * before suspending devices.
+ */
+unsigned int alloc_normal, alloc_highmem;
+/*
+ * Memory bitmap used for marking saveable pages (during hibernation) or
+ * hibernation image pages (during restore)
+ */
+static struct memory_bitmap orig_bm;
+/*
+ * Memory bitmap used during hibernation for marking allocated page frames that
+ * will contain copies of saveable pages.  During restore it is initially used
+ * for marking hibernation image pages, but then the set bits from it are
+ * duplicated in @orig_bm and it is released.  On highmem systems it is next
+ * used for marking "safe" highmem pages, but it has to be reinitialized for
+ * this purpose.
+ */
+static struct memory_bitmap copy_bm;
 
 /**
  *	swsusp_free - free pages allocated for the suspend.
@@ -1064,12 +1083,16 @@ void swsusp_free(void)
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
 	buffer = NULL;
+	alloc_normal = 0;
+	alloc_highmem = 0;
 }
 
 /* Helper function used for the shrinking of memory. */
 
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+
 /**
- * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ * hibernate_preallocate_memory - Preallocate memory for hibernation image
  *
  * To create a hibernation image it is necessary to make a copy of every page
  * frame in use.  We also need a number of page frames to be free during
@@ -1088,16 +1111,27 @@ void swsusp_free(void)
  * frames in use is below the requested image size or it is impossible to
  * allocate more memory, whichever happens first.
  */
-int swsusp_shrink_memory(void)
+int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
-	int error = 0;
+	int error;
 
-	printk(KERN_INFO "PM: Shrinking memory ... ");
+	printk(KERN_INFO "PM: Preallocating image memory ... ");
 	do_gettimeofday(&start);
 
+	error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	alloc_normal = 0;
+	alloc_highmem = 0;
+
 	/* Count the number of saveable data pages. */
 	saveable = count_data_pages() + count_highmem_pages();
 
@@ -1130,29 +1164,55 @@ int swsusp_shrink_memory(void)
 	for (count -= size; count > 0; count--) {
 		struct page *page;
 
-		page = alloc_image_page(GFP_KERNEL | __GFP_NO_OOM_KILL);
+		page = alloc_image_page(GFP_IMAGE);
 		if (!page)
 			break;
-		pages++;
+		memory_bm_set_bit(&copy_bm, page_to_pfn(page));
+		if (PageHighMem(page))
+			alloc_highmem++;
+		else
+			alloc_normal++;
 	}
 	/* If size < max_size, preallocating enough memory may be impossible. */
 	if (count > 0 && size == max_size)
 		error = -ENOMEM;
+	if (error)
+		goto err_out;
 
-	/* Release all of the preallocated page frames. */
-	swsusp_free();
+	/* Save the number of allocated pages for the statistics below. */
+	pages = alloc_normal + alloc_highmem;
 
-	if (error) {
-		printk(KERN_CONT "\n");
-		return error;
+	/*
+	 * We only need 'size' page frames for the image but we have allocated
+	 * more.  Release the excessive ones now.
+	 */
+	memory_bm_position_reset(&copy_bm);
+	while (alloc_normal + alloc_highmem > size) {
+		unsigned long pfn = memory_bm_next_pfn(&copy_bm);
+		struct page *page = pfn_to_page(pfn);
+
+		memory_bm_clear_bit(&copy_bm, pfn);
+		if (PageHighMem(page))
+			alloc_highmem--;
+		else
+			alloc_normal--;
+		swsusp_unset_page_forbidden(page);
+		swsusp_unset_page_free(page);
+		__free_page(page);
 	}
 
  out:
 	do_gettimeofday(&stop);
-	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
+	printk(KERN_CONT "done (allocated %lu pages, %lu image pages kept)\n",
+		pages, size);
+	swsusp_show_speed(&start, &stop, pages, "Allocated");
 
 	return 0;
+
+ err_out:
+	printk(KERN_CONT "\n");
+	swsusp_free();
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1163,7 +1223,7 @@ int swsusp_shrink_memory(void)
 
 static unsigned int count_pages_for_highmem(unsigned int nr_highmem)
 {
-	unsigned int free_highmem = count_free_highmem_pages();
+	unsigned int free_highmem = count_free_highmem_pages() + alloc_highmem;
 
 	if (free_highmem >= nr_highmem)
 		nr_highmem = 0;
@@ -1185,19 +1245,17 @@ count_pages_for_highmem(unsigned int nr_
 static int enough_free_mem(unsigned int nr_pages, unsigned int nr_highmem)
 {
 	struct zone *zone;
-	unsigned int free = 0, meta = 0;
+	unsigned int free = alloc_normal;
 
-	for_each_zone(zone) {
-		meta += snapshot_additional_pages(zone);
+	for_each_zone(zone)
 		if (!is_highmem(zone))
 			free += zone_page_state(zone, NR_FREE_PAGES);
-	}
 
 	nr_pages += count_pages_for_highmem(nr_highmem);
-	pr_debug("PM: Normal pages needed: %u + %u + %u, available pages: %u\n",
-		nr_pages, PAGES_FOR_IO, meta, free);
+	pr_debug("PM: Normal pages needed: %u + %u, available pages: %u\n",
+		nr_pages, PAGES_FOR_IO, free);
 
-	return free > nr_pages + PAGES_FOR_IO + meta;
+	return free > nr_pages + PAGES_FOR_IO;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1219,7 +1277,7 @@ static inline int get_highmem_buffer(int
  */
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
 {
 	unsigned int to_alloc = count_free_highmem_pages();
 
@@ -1230,7 +1288,7 @@ alloc_highmem_image_pages(struct memory_
 	while (to_alloc-- > 0) {
 		struct page *page;
 
-		page = alloc_image_page(__GFP_HIGHMEM);
+		page = alloc_image_page(__GFP_HIGHMEM | __GFP_NO_OOM_KILL);
 		memory_bm_set_bit(bm, page_to_pfn(page));
 	}
 	return nr_highmem;
@@ -1239,7 +1297,7 @@ alloc_highmem_image_pages(struct memory_
 static inline int get_highmem_buffer(int safe_needed) { return 0; }
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -1258,51 +1316,36 @@ static int
 swsusp_alloc(struct memory_bitmap *orig_bm, struct memory_bitmap *copy_bm,
 		unsigned int nr_pages, unsigned int nr_highmem)
 {
-	int error;
-
-	error = memory_bm_create(orig_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
-
-	error = memory_bm_create(copy_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
+	int error = 0;
 
 	if (nr_highmem > 0) {
 		error = get_highmem_buffer(PG_ANY);
 		if (error)
-			goto Free;
-
-		nr_pages += alloc_highmem_image_pages(copy_bm, nr_highmem);
+			goto err_out;
+		if (nr_highmem > alloc_highmem) {
+			nr_highmem -= alloc_highmem;
+			nr_pages += alloc_highmem_pages(copy_bm, nr_highmem);
+		}
 	}
-	while (nr_pages-- > 0) {
-		struct page *page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
-
-		if (!page)
-			goto Free;
+	if (nr_pages > alloc_normal) {
+		nr_pages -= alloc_normal;
+		while (nr_pages-- > 0) {
+			struct page *page;
 
-		memory_bm_set_bit(copy_bm, page_to_pfn(page));
+			page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
+			if (!page)
+				goto err_out;
+			memory_bm_set_bit(copy_bm, page_to_pfn(page));
+		}
 	}
+
 	return 0;
 
- Free:
+ err_out:
 	swsusp_free();
-	return -ENOMEM;
+	return error;
 }
 
-/* Memory bitmap used for marking saveable pages (during suspend) or the
- * suspend image pages (during resume)
- */
-static struct memory_bitmap orig_bm;
-/* Memory bitmap used on suspend for marking allocated pages that will contain
- * the copies of saveable pages.  During resume it is initially used for
- * marking the suspend image pages, but then its set bits are duplicated in
- * @orig_bm and it is released.  Next, on systems with high memory, it may be
- * used for marking "safe" highmem pages, but it has to be reinitialized for
- * this purpose.
- */
-static struct memory_bitmap copy_bm;
-
 asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern int swsusp_shrink_memory(void);
+extern int hibernate_preallocate_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -303,8 +303,8 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		return error;
 
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
+	/* Preallocate image memory before shutting down devices. */
+	error = hibernate_preallocate_memory();
 	if (error)
 		goto Close;
 
@@ -320,6 +320,10 @@ int hibernation_snapshot(int platform_mo
 	/* Control returns here after successful restore */
 
  Resume_devices:
+	/* We may need to release the preallocated image pages here. */
+	if (error || !in_suspend)
+		swsusp_free();
+
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
@@ -593,7 +597,10 @@ int hibernate(void)
 		goto Thaw;
 
 	error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
-	if (in_suspend && !error) {
+	if (error)
+		goto Thaw;
+
+	if (in_suspend) {
 		unsigned int flags = 0;
 
 		if (hibernation_mode == HIBERNATION_PLATFORM)
@@ -605,8 +612,8 @@ int hibernate(void)
 			power_down();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
-		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:

^ permalink raw reply	[flat|nested] 580+ messages in thread

* [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-04  0:22                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04  0:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

From: Rafael J. Wysocki <rjw@sisk.pl>

Since the hibernation code is now going to use allocations of memory
to create enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but it
allows us to avoid freeing a great number of pages and allocating
these same pages once again later, so it generally is worth doing.

[rev. 2: Change the strategy of preallocating memory to allocate as
 many pages as needed to get the right image size in one shot (the
 excess allocated pages are released afterwards).]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c     |   15 +++-
 kernel/power/power.h    |    2 
 kernel/power/snapshot.c |  157 ++++++++++++++++++++++++++++++------------------
 3 files changed, 112 insertions(+), 62 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1033,6 +1033,25 @@ copy_data_pages(struct memory_bitmap *co
 static unsigned int nr_copy_pages;
 /* Number of pages needed for saving the original pfns of the image pages */
 static unsigned int nr_meta_pages;
+/*
+ * Numbers of normal and highmem page frames allocated for hibernation image
+ * before suspending devices.
+ */
+unsigned int alloc_normal, alloc_highmem;
+/*
+ * Memory bitmap used for marking saveable pages (during hibernation) or
+ * hibernation image pages (during restore)
+ */
+static struct memory_bitmap orig_bm;
+/*
+ * Memory bitmap used during hibernation for marking allocated page frames that
+ * will contain copies of saveable pages.  During restore it is initially used
+ * for marking hibernation image pages, but then the set bits from it are
+ * duplicated in @orig_bm and it is released.  On highmem systems it is next
+ * used for marking "safe" highmem pages, but it has to be reinitialized for
+ * this purpose.
+ */
+static struct memory_bitmap copy_bm;
 
 /**
  *	swsusp_free - free pages allocated for the suspend.
@@ -1064,12 +1083,16 @@ void swsusp_free(void)
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
 	buffer = NULL;
+	alloc_normal = 0;
+	alloc_highmem = 0;
 }
 
 /* Helper function used for the shrinking of memory. */
 
+#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
+
 /**
- * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ * hibernate_preallocate_memory - Preallocate memory for hibernation image
  *
  * To create a hibernation image it is necessary to make a copy of every page
  * frame in use.  We also need a number of page frames to be free during
@@ -1088,16 +1111,27 @@ void swsusp_free(void)
  * frames in use is below the requested image size or it is impossible to
  * allocate more memory, whichever happens first.
  */
-int swsusp_shrink_memory(void)
+int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
-	int error = 0;
+	int error;
 
-	printk(KERN_INFO "PM: Shrinking memory ... ");
+	printk(KERN_INFO "PM: Preallocating image memory ... ");
 	do_gettimeofday(&start);
 
+	error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+	if (error)
+		goto err_out;
+
+	alloc_normal = 0;
+	alloc_highmem = 0;
+
 	/* Count the number of saveable data pages. */
 	saveable = count_data_pages() + count_highmem_pages();
 
@@ -1130,29 +1164,55 @@ int swsusp_shrink_memory(void)
 	for (count -= size; count > 0; count--) {
 		struct page *page;
 
-		page = alloc_image_page(GFP_KERNEL | __GFP_NO_OOM_KILL);
+		page = alloc_image_page(GFP_IMAGE);
 		if (!page)
 			break;
-		pages++;
+		memory_bm_set_bit(&copy_bm, page_to_pfn(page));
+		if (PageHighMem(page))
+			alloc_highmem++;
+		else
+			alloc_normal++;
 	}
 	/* If size < max_size, preallocating enough memory may be impossible. */
 	if (count > 0 && size == max_size)
 		error = -ENOMEM;
+	if (error)
+		goto err_out;
 
-	/* Release all of the preallocated page frames. */
-	swsusp_free();
+	/* Save the number of allocated pages for the statistics below. */
+	pages = alloc_normal + alloc_highmem;
 
-	if (error) {
-		printk(KERN_CONT "\n");
-		return error;
+	/*
+	 * We only need 'size' page frames for the image but we have allocated
+	 * more.  Release the excessive ones now.
+	 */
+	memory_bm_position_reset(&copy_bm);
+	while (alloc_normal + alloc_highmem > size) {
+		unsigned long pfn = memory_bm_next_pfn(&copy_bm);
+		struct page *page = pfn_to_page(pfn);
+
+		memory_bm_clear_bit(&copy_bm, pfn);
+		if (PageHighMem(page))
+			alloc_highmem--;
+		else
+			alloc_normal--;
+		swsusp_unset_page_forbidden(page);
+		swsusp_unset_page_free(page);
+		__free_page(page);
 	}
 
  out:
 	do_gettimeofday(&stop);
-	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
-	swsusp_show_speed(&start, &stop, pages, "Freed");
+	printk(KERN_CONT "done (allocated %lu pages, %lu image pages kept)\n",
+		pages, size);
+	swsusp_show_speed(&start, &stop, pages, "Allocated");
 
 	return 0;
+
+ err_out:
+	printk(KERN_CONT "\n");
+	swsusp_free();
+	return error;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1163,7 +1223,7 @@ int swsusp_shrink_memory(void)
 
 static unsigned int count_pages_for_highmem(unsigned int nr_highmem)
 {
-	unsigned int free_highmem = count_free_highmem_pages();
+	unsigned int free_highmem = count_free_highmem_pages() + alloc_highmem;
 
 	if (free_highmem >= nr_highmem)
 		nr_highmem = 0;
@@ -1185,19 +1245,17 @@ count_pages_for_highmem(unsigned int nr_
 static int enough_free_mem(unsigned int nr_pages, unsigned int nr_highmem)
 {
 	struct zone *zone;
-	unsigned int free = 0, meta = 0;
+	unsigned int free = alloc_normal;
 
-	for_each_zone(zone) {
-		meta += snapshot_additional_pages(zone);
+	for_each_zone(zone)
 		if (!is_highmem(zone))
 			free += zone_page_state(zone, NR_FREE_PAGES);
-	}
 
 	nr_pages += count_pages_for_highmem(nr_highmem);
-	pr_debug("PM: Normal pages needed: %u + %u + %u, available pages: %u\n",
-		nr_pages, PAGES_FOR_IO, meta, free);
+	pr_debug("PM: Normal pages needed: %u + %u, available pages: %u\n",
+		nr_pages, PAGES_FOR_IO, free);
 
-	return free > nr_pages + PAGES_FOR_IO + meta;
+	return free > nr_pages + PAGES_FOR_IO;
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1219,7 +1277,7 @@ static inline int get_highmem_buffer(int
  */
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int nr_highmem)
 {
 	unsigned int to_alloc = count_free_highmem_pages();
 
@@ -1230,7 +1288,7 @@ alloc_highmem_image_pages(struct memory_
 	while (to_alloc-- > 0) {
 		struct page *page;
 
-		page = alloc_image_page(__GFP_HIGHMEM);
+		page = alloc_image_page(__GFP_HIGHMEM | __GFP_NO_OOM_KILL);
 		memory_bm_set_bit(bm, page_to_pfn(page));
 	}
 	return nr_highmem;
@@ -1239,7 +1297,7 @@ alloc_highmem_image_pages(struct memory_
 static inline int get_highmem_buffer(int safe_needed) { return 0; }
 
 static inline unsigned int
-alloc_highmem_image_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
+alloc_highmem_pages(struct memory_bitmap *bm, unsigned int n) { return 0; }
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -1258,51 +1316,36 @@ static int
 swsusp_alloc(struct memory_bitmap *orig_bm, struct memory_bitmap *copy_bm,
 		unsigned int nr_pages, unsigned int nr_highmem)
 {
-	int error;
-
-	error = memory_bm_create(orig_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
-
-	error = memory_bm_create(copy_bm, GFP_ATOMIC | __GFP_COLD, PG_ANY);
-	if (error)
-		goto Free;
+	int error = 0;
 
 	if (nr_highmem > 0) {
 		error = get_highmem_buffer(PG_ANY);
 		if (error)
-			goto Free;
-
-		nr_pages += alloc_highmem_image_pages(copy_bm, nr_highmem);
+			goto err_out;
+		if (nr_highmem > alloc_highmem) {
+			nr_highmem -= alloc_highmem;
+			nr_pages += alloc_highmem_pages(copy_bm, nr_highmem);
+		}
 	}
-	while (nr_pages-- > 0) {
-		struct page *page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
-
-		if (!page)
-			goto Free;
+	if (nr_pages > alloc_normal) {
+		nr_pages -= alloc_normal;
+		while (nr_pages-- > 0) {
+			struct page *page;
 
-		memory_bm_set_bit(copy_bm, page_to_pfn(page));
+			page = alloc_image_page(GFP_ATOMIC | __GFP_COLD);
+			if (!page)
+				goto err_out;
+			memory_bm_set_bit(copy_bm, page_to_pfn(page));
+		}
 	}
+
 	return 0;
 
- Free:
+ err_out:
 	swsusp_free();
-	return -ENOMEM;
+	return error;
 }
 
-/* Memory bitmap used for marking saveable pages (during suspend) or the
- * suspend image pages (during resume)
- */
-static struct memory_bitmap orig_bm;
-/* Memory bitmap used on suspend for marking allocated pages that will contain
- * the copies of saveable pages.  During resume it is initially used for
- * marking the suspend image pages, but then its set bits are duplicated in
- * @orig_bm and it is released.  Next, on systems with high memory, it may be
- * used for marking "safe" highmem pages, but it has to be reinitialized for
- * this purpose.
- */
-static struct memory_bitmap copy_bm;
-
 asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -74,7 +74,7 @@ extern asmlinkage int swsusp_arch_resume
 
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
-extern int swsusp_shrink_memory(void);
+extern int hibernate_preallocate_memory(void);
 
 /**
  *	Auxiliary structure used for reading the snapshot image data and
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -303,8 +303,8 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		return error;
 
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
+	/* Preallocate image memory before shutting down devices. */
+	error = hibernate_preallocate_memory();
 	if (error)
 		goto Close;
 
@@ -320,6 +320,10 @@ int hibernation_snapshot(int platform_mo
 	/* Control returns here after successful restore */
 
  Resume_devices:
+	/* We may need to release the preallocated image pages here. */
+	if (error || !in_suspend)
+		swsusp_free();
+
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
@@ -593,7 +597,10 @@ int hibernate(void)
 		goto Thaw;
 
 	error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
-	if (in_suspend && !error) {
+	if (error)
+		goto Thaw;
+
+	if (in_suspend) {
 		unsigned int flags = 0;
 
 		if (hibernation_mode == HIBERNATION_PLATFORM)
@@ -605,8 +612,8 @@ int hibernate(void)
 			power_down();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
-		swsusp_free();
 	}
+
  Thaw:
 	thaw_processes();
  Finish:
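
For illustration only (not part of the patch): the allocate-then-trim
bookkeeping that hibernate_preallocate_memory() performs can be modelled
in user space.  This is a minimal sketch that assumes a hypothetical
array of fake pages in place of real page frames and copy_bm; only the
counter names alloc_normal and alloc_highmem are taken from the patch,
everything else is made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page frame tracked in copy_bm. */
struct fake_page {
	bool allocated;	/* models a set bit in copy_bm */
	bool highmem;	/* models what PageHighMem() would report */
};

/* Counters named after the ones the patch introduces. */
static unsigned int alloc_normal, alloc_highmem;

/* Mark the first 'count' pages as preallocated and count them by type. */
static void model_preallocate(struct fake_page *pages, size_t nr, size_t count)
{
	alloc_normal = 0;
	alloc_highmem = 0;
	for (size_t i = 0; i < nr && count > 0; i++, count--) {
		pages[i].allocated = true;
		if (pages[i].highmem)
			alloc_highmem++;
		else
			alloc_normal++;
	}
}

/* Release pages in array order until only 'size' of them remain. */
static void model_trim(struct fake_page *pages, size_t nr, size_t size)
{
	for (size_t i = 0; i < nr && alloc_normal + alloc_highmem > size; i++) {
		if (!pages[i].allocated)
			continue;
		pages[i].allocated = false;
		if (pages[i].highmem)
			alloc_highmem--;
		else
			alloc_normal--;
	}
}
```

The point of the design is that the pages left after trimming stay
allocated and are later reused as image pages, so swsusp_alloc() only
has to allocate the difference.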

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-04  0:10                                               ` Rafael J. Wysocki
  (?)
  (?)
@ 2009-05-04  0:38                                               ` David Rientjes
  2009-05-04 15:02                                                 ` Rafael J. Wysocki
  2009-05-04 15:02                                                   ` Rafael J. Wysocki
  -1 siblings, 2 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-04  0:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Mon, 4 May 2009, Rafael J. Wysocki wrote:

> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -1620,7 +1620,8 @@ nofail_alloc:
>  		}
>  
>  		/* The OOM killer will not help higher order allocs so fail */
> -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> +				(gfp_mask & __GFP_NO_OOM_KILL)) {
>  			clear_zonelist_oom(zonelist, gfp_mask);
>  			goto nopage;
>  		}

This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
(the "goto nopage" above), but only for allocations with __GFP_FS set and 
__GFP_NORETRY clear.
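
To make the inconsistency concrete, here is a small user-space model of
the condition described above.  It is only a sketch: the bit values are
made up for the example (the real ones live in include/linux/gfp.h), and
the assumption that the quoted hunk is reached only with __GFP_FS set
and __GFP_NORETRY clear is taken from the remark above rather than from
the full source.

```c
#include <stdbool.h>

/* Made-up bit values for this model only. */
#define MODEL_GFP_FS		0x1u
#define MODEL_GFP_NORETRY	0x2u
#define MODEL_GFP_NO_OOM_KILL	0x4u

/*
 * Assumption from the remark above: the branch containing the hunk is
 * entered only when __GFP_FS is set and __GFP_NORETRY is clear.
 */
static bool oom_branch_entered(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_FS) && !(gfp_mask & MODEL_GFP_NORETRY);
}

/* True when the new flag sends the allocation to "goto nopage". */
static bool no_oom_kill_forces_nopage(unsigned int gfp_mask)
{
	return oom_branch_entered(gfp_mask) &&
	       (gfp_mask & MODEL_GFP_NO_OOM_KILL);
}
```

In this model a mask without MODEL_GFP_FS never reaches the new test, so
the flag changes behaviour for some callers and is silently ignored for
others.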

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-04  9:31                                               ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-04  9:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, Andrew Morton, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sun 2009-05-03 18:22:54, Rafael J. Wysocki wrote:
> On Sunday 03 May 2009, Wu Fengguang wrote:
> > On Sun, May 03, 2009 at 02:24:20AM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Modify the hibernation memory shrinking code so that it will make
> > > memory allocations to free memory instead of using an artificial
> > > memory shrinking mechanism for that.  Remove the shrinking of
> > > memory from the suspend-to-RAM code, where it is not really
> > > necessary.  Finally, remove the no longer used memory shrinking
> > > functions from mm/vmscan.c .
> > > 
> > > [rev. 2: Use the existing memory bitmaps for marking preallocated
> > >  image pages and use swsusp_free() for releasing them, introduce
> > >  GFP_IMAGE, add comments describing the memory shrinking strategy.]
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > ---
> > >  kernel/power/main.c     |   20 ------
> > >  kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
> > >  mm/vmscan.c             |  142 ------------------------------------------------
> > >  3 files changed, 101 insertions(+), 193 deletions(-)
> > > 
> > > Index: linux-2.6/kernel/power/snapshot.c
> > > ===================================================================
> > > --- linux-2.6.orig/kernel/power/snapshot.c
> > > +++ linux-2.6/kernel/power/snapshot.c
> > > @@ -1066,41 +1066,97 @@ void swsusp_free(void)
> > >  	buffer = NULL;
> > >  }
> > >  
> > > +/* Helper functions used for the shrinking of memory. */
> > > +
> > > +#ifdef CONFIG_HIGHMEM
> > > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
> > > +#else
> > > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
> > > +#endif
> > 
> > The CONFIG_HIGHMEM test is not necessary: __GFP_HIGHMEM is always defined.
> > 
> > > +#define SHRINK_BITE	10000
> > 
> > This is ~40MB. A full scan of (for example) 8G pages will be time
> > consuming, not to mention we have to do it 2*(8G-500M)/40M = 384 times!
> > 
> > Can we make it a LONG_MAX? 
> 
> No, I don't think so.  The problem is that the number of pages we'll need to
> copy generally shrinks as we allocate memory, so we can't do that in one shot.
> 
> We can make it a greater number, but I don't really think it would be a good
> idea to make it greater than 100 MB.

Well, even 100MB is quite big: on a 128MB machine, that will probably
mean freeing all the memory (instead of "as much as needed"). And that
memory needs to go to disk, so it will be slow.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
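
The numbers quoted above can be checked with a few lines of C.  This is
only a sketch of the arithmetic: it assumes 4 KiB pages (which is what
makes 10000 pages come out at roughly 40MB) and takes 8G as 8192MB; the
function names are made up for the example.

```c
/* Assumed page size in KiB; 4 KiB is common but not universal. */
#define MODEL_PAGE_SIZE_KIB	4ul

/* Size of one SHRINK_BITE step in KiB for a given number of pages. */
static unsigned long bite_kib(unsigned long bite_pages)
{
	return bite_pages * MODEL_PAGE_SIZE_KIB;
}

/* The estimate 2 * (total - kept) / step, all in MB, rounded down. */
static unsigned long scan_passes(unsigned long total_mb,
				 unsigned long kept_mb,
				 unsigned long step_mb)
{
	return 2 * (total_mb - kept_mb) / step_mb;
}
```

bite_kib(10000) gives 40000 KiB, scan_passes(8192, 500, 40) gives 384,
and raising the step to 100MB still leaves 153 passes on such a box.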

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
@ 2009-05-04  9:33                                             ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-04  9:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, Andrew Morton, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

Hi!

> I know that swsusp_shrink_memory() has problems, that's why I'd like to get rid
> of it.
> 
> > I wonder if it's possible to free up the memory within 1s at all.
> 
> I'm not sure.
> 
> Apparently, the counting of saveable pages takes substantial time (0.5 s each
> iteration on my 64-bit test box), so we can improve that by limiting the number
> of iterations.

We could increase the step size after each step: free in a 40MB step,
then an 80MB step, then a 160MB step, ...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
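
A rough sketch of what the doubling idea buys, as plain arithmetic in
user-space C.  The function names and the example target of 7692MB
(8192MB minus 500MB) are assumptions for illustration; nothing here is
kernel code.

```c
/* Steps needed to free target_mb when every step has the same size. */
static unsigned int fixed_steps(unsigned long target_mb, unsigned long step_mb)
{
	if (step_mb == 0)
		return 0;	/* avoid dividing by zero in the model */
	return (unsigned int)((target_mb + step_mb - 1) / step_mb);
}

/* Steps needed when the step size doubles after each step. */
static unsigned int doubling_steps(unsigned long target_mb,
				   unsigned long first_step_mb)
{
	unsigned long freed = 0, step = first_step_mb;
	unsigned int steps = 0;

	if (first_step_mb == 0)
		return 0;	/* a zero step would never terminate */

	while (freed < target_mb) {
		freed += step;
		step *= 2;
		steps++;
	}
	return steps;
}
```

For a 7692MB target starting at 40MB this is 8 steps instead of 193.
The price is that the last step can overshoot by a lot, which matters
on small machines.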

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-04  9:36                                                 ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-04  9:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Andrew Morton, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Sun 2009-05-03 18:35:06, Rafael J. Wysocki wrote:
> On Sunday 03 May 2009, Pavel Machek wrote:
> > Hi!
> 
> Hi,
> 
> > > > Remove the shrinking of memory from the suspend-to-RAM code, where it is 
> > > > not really necessary.
> > > 
> > > Hmm. Shouldn't we do this _regardless_?
> > > 
> > > IOW, shouldn't this be a totally separate patch? It seems to be left-over 
> > > from when we shared the same code-paths, and before the split of the STR 
> > > and hibernate code?
> > > 
> > > IOW, shouldn't the very _first_ patch just be this part? That code doesn't 
> > > make any sense anyway (that FREE_PAGE_NUMBER really _is_ totally 
> > > arbitrary).
> > > 
> > > This part seems to be totally independent of all the other parts in your 
> > > patch-series. No?
> > 
> > I'm not sure this one is a good idea: drivers will need to allocate
> > memory during suspend/resume, and when processes are frozen/disk
> > driver is suspended, normal memory management will no longer work.
> > 
> > So, freeing 4M of memory before starting suspend seems like a good
> > idea. That way those small allocations will not fail.
> 
> I don't think we've ever had problems with the drivers having too little
> memory to suspend.

Well, we had the 4MB buffer there, so it is hardly surprising.

> I'm opting for removing this code and seeing if that leads to any regressions.
> If it does, we can still get some free memory by allocating and releasing it.

I believe we should. If we don't... we will not get any regression
reports, because it will probably just hang with a black screen :-(, and
"being out of memory during suspend" is probably going to be hard to
reproduce.

Perhaps we should try to _eat_ all memory available during suspend to
test driver behaviour with 0 pages free?

	while (kmalloc(100, GFP_ATOMIC))
		;

in the suspend path should be enough for testing.
									Pavel
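
The loop above discards every pointer, so nothing can be given back once
the test is over. A minimal userspace sketch of the same exhaustion idea,
assuming a simulated fixed-budget pool (struct pool, pool_alloc(),
eat_pool() and release_chunks() are all hypothetical names, and malloc()
stands in for kmalloc()), chains the chunks so they can be released:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the allocator: refuses once budget is used up. */
struct pool {
	size_t budget;
	size_t used;
};

/* Each eaten chunk doubles as a list node so it can be freed later. */
struct chunk {
	struct chunk *next;
};

static void *pool_alloc(struct pool *p, size_t size)
{
	void *mem;

	if (size > p->budget - p->used)
		return NULL;		/* simulated out-of-memory */
	mem = malloc(size);
	if (mem)
		p->used += size;
	return mem;
}

/* Grab chunks until the pool refuses; report how many were taken. */
static struct chunk *eat_pool(struct pool *p, size_t size, size_t *count)
{
	struct chunk *head = NULL, *c;

	assert(size >= sizeof(struct chunk));
	*count = 0;
	while ((c = pool_alloc(p, size)) != NULL) {
		c->next = head;
		head = c;
		(*count)++;
	}
	return head;
}

/* Give everything back once the code under test has run. */
static void release_chunks(struct pool *p, struct chunk *head, size_t size)
{
	while (head) {
		struct chunk *next = head->next;

		free(head);
		p->used -= size;
		head = next;
	}
}
```

The code under test would run between eat_pool() and release_chunks(),
when every further pool_alloc() call fails.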
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 2/5] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
@ 2009-05-04 13:35                                                 ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-04 13:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, linux-pm, Andrew Morton, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Mon 2009-05-04 02:11:02, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The next patch is going to modify the memory shrinking code so that
> it will make memory allocations to free memory instead of using an
> artificial memory shrinking mechanism for that.  For this purpose it
> is convenient to move swsusp_shrink_memory() from
> kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
> memory-shrinking code is going to use things that are local to
> kernel/power/snapshot.c .
> 
> [rev. 2: Make some functions static and remove their headers from
>  kernel/power/power.h]
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Acked-by: Pavel Machek <pavel@ucw.cz>

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 15:02                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 15:02 UTC (permalink / raw)
  To: David Rientjes
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Monday 04 May 2009, David Rientjes wrote:
> On Mon, 4 May 2009, Rafael J. Wysocki wrote:
> 
> > Index: linux-2.6/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -1620,7 +1620,8 @@ nofail_alloc:
> >  		}
> >  
> >  		/* The OOM killer will not help higher order allocs so fail */
> > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> >  			clear_zonelist_oom(zonelist, gfp_mask);
> >  			goto nopage;
> >  		}
> 
> This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
> (the "goto nopage" above), but only for allocations with __GFP_FS set and 
> __GFP_NORETRY clear.

Well, what would you suggest?

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 16:44                                                     ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-04 16:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Mon, 4 May 2009, Rafael J. Wysocki wrote:

> > > Index: linux-2.6/mm/page_alloc.c
> > > ===================================================================
> > > --- linux-2.6.orig/mm/page_alloc.c
> > > +++ linux-2.6/mm/page_alloc.c
> > > @@ -1620,7 +1620,8 @@ nofail_alloc:
> > >  		}
> > >  
> > >  		/* The OOM killer will not help higher order allocs so fail */
> > > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> > >  			clear_zonelist_oom(zonelist, gfp_mask);
> > >  			goto nopage;
> > >  		}
> > 
> > This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
> > (the "goto nopage" above), but only for allocations with __GFP_FS set and 
> > __GFP_NORETRY clear.
> 
> Well, what would you suggest?
> 

A couple things:

 - rebase this on mmotm so that it doesn't conflict with Mel Gorman's page
   allocator speedup changes, and

 - avoid the final call to get_page_from_freelist() for
   __GFP_NO_OOM_KILL allocations by adding a
   !(gfp_mask & __GFP_NO_OOM_KILL) check alongside
   (gfp_mask & __GFP_FS) and !(gfp_mask & __GFP_NORETRY), because that
   call should really only catch parallel oom killings, which won't
   happen in your suspend case since it uses ALLOC_WMARK_HIGH.

The latter is important to avoid unnecessary dependencies among low-level 
__GFP_* flags (although all __GFP_NO_OOM_KILL allocations should really 
be passing __GFP_NORETRY too, to avoid relying too heavily on direct 
reclaim).
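
The combined check suggested above can be sketched in userspace as
follows. This is only an illustration: the SK_* bit values are
placeholders chosen for the sketch rather than values to rely on, and
may_enter_oom_path() is a hypothetical helper, not a function in
mm/page_alloc.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag bits for this sketch (assumptions, not kernel ABI). */
#define SK_GFP_FS		0x80u
#define SK_GFP_NORETRY		0x1000u
#define SK_GFP_NO_OOM_KILL	0x200000u

/* The three flags must occupy distinct bits for the test below to work. */
static_assert((SK_GFP_FS & SK_GFP_NORETRY) == 0 &&
	      (SK_GFP_FS & SK_GFP_NO_OOM_KILL) == 0 &&
	      (SK_GFP_NORETRY & SK_GFP_NO_OOM_KILL) == 0,
	      "flag bits must not overlap");

/*
 * True only when the allocation may do filesystem work, has not asked
 * for "no retry", and has not opted out of OOM killing.
 */
static bool may_enter_oom_path(unsigned int gfp_mask)
{
	return (gfp_mask & SK_GFP_FS) &&
	       !(gfp_mask & SK_GFP_NORETRY) &&
	       !(gfp_mask & SK_GFP_NO_OOM_KILL);
}
```

Testing all three flags in one place keeps the opt-out independent of
whether the caller also passes the no-retry flag.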

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 19:01                                                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-04 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Mon, 4 May 2009 17:02:22 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Monday 04 May 2009, David Rientjes wrote:
> > On Mon, 4 May 2009, Rafael J. Wysocki wrote:
> > 
> > > Index: linux-2.6/mm/page_alloc.c
> > > ===================================================================
> > > --- linux-2.6.orig/mm/page_alloc.c
> > > +++ linux-2.6/mm/page_alloc.c
> > > @@ -1620,7 +1620,8 @@ nofail_alloc:
> > >  		}
> > >  
> > >  		/* The OOM killer will not help higher order allocs so fail */
> > > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> > >  			clear_zonelist_oom(zonelist, gfp_mask);
> > >  			goto nopage;
> > >  		}
> > 
> > This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
> > (the "goto nopage" above), but only for allocations with __GFP_FS set and 
> > __GFP_NORETRY clear.
> 
> Well, what would you suggest?
> 

Did you check whether the existing __GFP_NORETRY will work as-is for
this requirement?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 19:51                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 19:51 UTC (permalink / raw)
  To: David Rientjes
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Monday 04 May 2009, David Rientjes wrote:
> On Mon, 4 May 2009, Rafael J. Wysocki wrote:
> 
> > > > Index: linux-2.6/mm/page_alloc.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/mm/page_alloc.c
> > > > +++ linux-2.6/mm/page_alloc.c
> > > > @@ -1620,7 +1620,8 @@ nofail_alloc:
> > > >  		}
> > > >  
> > > >  		/* The OOM killer will not help higher order allocs so fail */
> > > > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > > > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > > > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> > > >  			clear_zonelist_oom(zonelist, gfp_mask);
> > > >  			goto nopage;
> > > >  		}
> > > 
> > > This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
> > > (the "goto nopage" above), but only for allocations with __GFP_FS set and 
> > > __GFP_NORETRY clear.
> > 
> > Well, what would you suggest?
> > 
> 
> A couple things:
> 
>  - rebase this on mmotm so that it doesn't conflict with Mel Gorman's page
>    allocator speedup changes, and

I'm going to rebase the patchset on top of linux-next eventually.

>  - avoid the final call to get_page_from_freelist() for 
>    !(gfp_mask & __GFP_NO_OOM_KILL) by adding a check for it alongside 
>    (gfp_mask & __GFP_FS) and !(gfp_mask & __GFP_NORETRY) because it should
>    really only catch parallel oom killings which won't happen in your 
>    suspend case since it uses ALLOC_WMARK_HIGH.
> 
> The latter is important to avoid unnecessary dependencies among low-level 
> __GFP_* flags (although all __GFP_NO_OOM_KILL allocations should really 
> all be passing __GFP_NORETRY too to avoid relying too heavily on direct 
> reclaim).

OK, thanks.

Something like this?

---
 include/linux/gfp.h |    3 ++-
 mm/page_alloc.c     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1599,7 +1599,8 @@ nofail_alloc:
 					zonelist, high_zoneidx, alloc_flags);
 		if (page)
 			goto got_pg;
-	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
+			&& !(gfp_mask & __GFP_NO_OOM_KILL)) {
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -51,8 +51,9 @@ struct vm_area_struct;
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
 #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
 
-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* This equals 0, but use constants in case they ever change */
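
A small userspace check of why the shift has to move from 21 to 22 once
a flag at bit 21 (0x200000) is added. The helper names gfp_bits_mask()
and flag_fits() are hypothetical, and the sketch uses an unsigned
constant where the header uses a plain 1.

```c
#include <assert.h>

/* Values copied from the hunk above; the SK_ prefix marks them as local. */
#define SK_GFP_MOVABLE		0x100000u
#define SK_GFP_NO_OOM_KILL	0x200000u

/* Mask covering the low `shift` bits, as __GFP_BITS_MASK does. */
static unsigned int gfp_bits_mask(unsigned int shift)
{
	assert(shift < 32);
	return (1u << shift) - 1;
}

/* Nonzero when every bit of `flag` survives masking. */
static int flag_fits(unsigned int flag, unsigned int shift)
{
	return (flag & gfp_bits_mask(shift)) == flag;
}
```

With a shift of 21 the mask is 0x1fffff, which would drop the new flag;
with 22 it becomes 0x3fffff and the flag is kept.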

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-04 16:44                                                     ` David Rientjes
  (?)
@ 2009-05-04 19:51                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 19:51 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel-testers, linux-kernel, alan-jenkins, jens.axboe, linux-pm,
	Wu Fengguang, torvalds, Andrew Morton

On Monday 04 May 2009, David Rientjes wrote:
> On Mon, 4 May 2009, Rafael J. Wysocki wrote:
> 
> > > > Index: linux-2.6/mm/page_alloc.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/mm/page_alloc.c
> > > > +++ linux-2.6/mm/page_alloc.c
> > > > @@ -1620,7 +1620,8 @@ nofail_alloc:
> > > >  		}
> > > >  
> > > >  		/* The OOM killer will not help higher order allocs so fail */
> > > > -		if (order > PAGE_ALLOC_COSTLY_ORDER) {
> > > > +		if (order > PAGE_ALLOC_COSTLY_ORDER ||
> > > > +				(gfp_mask & __GFP_NO_OOM_KILL)) {
> > > >  			clear_zonelist_oom(zonelist, gfp_mask);
> > > >  			goto nopage;
> > > >  		}
> > > 
> > > This is inconsistent because __GFP_NO_OOM_KILL now implies __GFP_NORETRY 
> > > (the "goto nopage" above), but only for allocations with __GFP_FS set and 
> > > __GFP_NORETRY clear.
> > 
> > Well, what would you suggest?
> > 
> 
> A couple things:
> 
>  - rebase this on mmotm so that it doesn't conflict with Mel Gorman's page
>    allocator speedup changes, and

I'm going to rebase the patchset on top of linux-next eventually.

>  - avoid the final call to get_page_from_freelist() for 
>    !(gfp_mask & __GFP_NO_OOM_KILL) by adding a check for it alongside 
>    (gfp_mask & __GFP_FS) and !(gfp_mask & __GFP_NORETRY) because it should
>    really only catch parallel oom killings which won't happen in your 
>    suspend case since it uses ALLOC_WMARK_HIGH.
> 
> The latter is important to avoid unnecessary dependencies among low-level 
> __GFP_* flags (although all __GFP_NO_OOM_KILL allocations should really 
> all be passing __GFP_NORETRY too to avoid relying too heavily on direct 
> reclaim).

OK, thanks.

Something like this?

---
 include/linux/gfp.h |    3 ++-
 mm/page_alloc.c     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1599,7 +1599,8 @@ nofail_alloc:
 					zonelist, high_zoneidx, alloc_flags);
 		if (page)
 			goto got_pg;
-	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
+			&& !(gfp_mask & __GFP_NO_OOM_KILL)) {
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -51,8 +51,9 @@ struct vm_area_struct;
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
 #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
 
-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* This equals 0, but use constants in case they ever change */
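
The gfp.h hunk above bumps __GFP_BITS_SHIFT from 21 to 22 because the new
flag, 0x200000u, is bit 21 and would otherwise fall outside __GFP_BITS_MASK.
A minimal user-space sketch of that arithmetic follows. The macro and function
names are hypothetical stand-ins (gfp_t is modeled as a plain unsigned int);
only the numeric values come from the patch.

```c
#include <stdbool.h>

/* Values copied from the quoted gfp.h hunk; names are illustrative only. */
#define GFP_MOVABLE_BIT      0x100000u   /* stands in for __GFP_MOVABLE */
#define GFP_NO_OOM_KILL_BIT  0x200000u   /* stands in for __GFP_NO_OOM_KILL */

/* Same formula as __GFP_BITS_MASK, for a given __GFP_BITS_SHIFT. */
static unsigned int gfp_bits_mask(unsigned int shift)
{
	return (1u << shift) - 1u;
}

/* True if every bit of flag survives masking with the given shift. */
static bool flag_fits(unsigned int flag, unsigned int shift)
{
	return (flag & gfp_bits_mask(shift)) == flag;
}
```

With these helpers, flag_fits(GFP_NO_OOM_KILL_BIT, 21) is false while
flag_fits(GFP_NO_OOM_KILL_BIT, 22) is true, which is why the shift has to
change together with the new define.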

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)
@ 2009-05-04 19:52                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 19:52 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Wu Fengguang, Andrew Morton, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Monday 04 May 2009, Pavel Machek wrote:
> On Sun 2009-05-03 18:22:54, Rafael J. Wysocki wrote:
> > On Sunday 03 May 2009, Wu Fengguang wrote:
> > > On Sun, May 03, 2009 at 02:24:20AM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > Modify the hibernation memory shrinking code so that it will make
> > > > memory allocations to free memory instead of using an artificial
> > > > memory shrinking mechanism for that.  Remove the shrinking of
> > > > memory from the suspend-to-RAM code, where it is not really
> > > > necessary.  Finally, remove the no longer used memory shrinking
> > > > functions from mm/vmscan.c .
> > > > 
> > > > [rev. 2: Use the existing memory bitmaps for marking preallocated
> > > > >  image pages and use swsusp_free() for releasing them, introduce
> > > >  GFP_IMAGE, add comments describing the memory shrinking strategy.]
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > ---
> > > >  kernel/power/main.c     |   20 ------
> > > >  kernel/power/snapshot.c |  132 +++++++++++++++++++++++++++++++++-----------
> > > >  mm/vmscan.c             |  142 ------------------------------------------------
> > > >  3 files changed, 101 insertions(+), 193 deletions(-)
> > > > 
> > > > Index: linux-2.6/kernel/power/snapshot.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/kernel/power/snapshot.c
> > > > +++ linux-2.6/kernel/power/snapshot.c
> > > > @@ -1066,41 +1066,97 @@ void swsusp_free(void)
> > > >  	buffer = NULL;
> > > >  }
> > > >  
> > > > +/* Helper functions used for the shrinking of memory. */
> > > > +
> > > > +#ifdef CONFIG_HIGHMEM
> > > > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NO_OOM_KILL)
> > > > +#else
> > > > +#define GFP_IMAGE	(GFP_KERNEL | __GFP_NO_OOM_KILL)
> > > > +#endif
> > > 
> > > The CONFIG_HIGHMEM test is not necessary: __GFP_HIGHMEM is always defined.
> > > 
> > > > +#define SHRINK_BITE	10000
> > > 
> > > This is ~40MB. A full scan of (for example) 8G worth of pages will be
> > > time-consuming, not to mention we have to do it 2*(8G-500M)/40M = 384 times!
> > > 
> > > Can we make it a LONG_MAX? 
> > 
> > No, I don't think so.  The problem is that the number of pages we'll need to
> > copy generally shrinks as we allocate memory, so we can't do that in one shot.
> > 
> > We can make it a greater number, but I don't really think it would be a good
> > idea to make it greater than 100 MB.
> 
> Well, even 100MB is quite big: on a 128MB machine, that will probably
> mean freeing all the memory (instead of "as much as needed"). And that
> memory needs to go to disk, so it will be slow.

But we're going to free it anyway?

Rafael
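
The figures quoted in this exchange can be checked with a little arithmetic.
The sketch below assumes 4 KiB pages (the page size is not stated in the
thread) and reads "8G" as 8192M; the function names are hypothetical and the
second helper just restates the quoted 2*(8G-500M)/40M estimate.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ull   /* assumption: 4 KiB pages */
#define SHRINK_BITE     10000ull  /* pages per step, from the quoted patch */

/* Size of one SHRINK_BITE step in bytes. */
static uint64_t bite_bytes(void)
{
	return SHRINK_BITE * PAGE_SIZE_BYTES;
}

/* The quoted estimate: 2 * (total - kept) / step, all in megabytes. */
static uint64_t estimated_scans(uint64_t total_mb, uint64_t kept_mb,
				uint64_t step_mb)
{
	return 2 * (total_mb - kept_mb) / step_mb;
}
```

Under those assumptions bite_bytes() gives 40960000 bytes, i.e. roughly 40MB,
and estimated_scans(8192, 500, 40) gives 384, matching the numbers in the
quoted review.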

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
  2009-05-04  9:33                                             ` Pavel Machek
  (?)
@ 2009-05-04 19:53                                             ` Rafael J. Wysocki
  2009-05-04 20:27                                               ` Pavel Machek
  2009-05-04 20:27                                                 ` Pavel Machek
  -1 siblings, 2 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 19:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Wu Fengguang, Andrew Morton, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Monday 04 May 2009, Pavel Machek wrote:
> Hi!
> 
> > I know that swsusp_shrink_memory() has problems, that's why I'd like to get rid
> > of it.
> > 
> > > I wonder if it's possible to free up the memory within 1s at all.
> > 
> > I'm not sure.
> > 
> > Apparently, the counting of saveable pages takes substantial time (0.5 s each
> > iteration on my 64-bit test box), so we can improve that by limiting the number
> > of iterations.
> 
> We could increase the step size after each step: free in a 40MB step, then
> an 80MB step, then a 160MB step, ...

Why not just one step?  It doesn't seem to hurt performance AFAICS.

Thanks,
Rafael
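
To put rough numbers on the step-size options discussed here, the following
user-space sketch counts how many passes each strategy needs. It is only an
illustration: the function names are hypothetical, and it assumes every pass
frees exactly the requested amount, which real reclaim does not guarantee.

```c
/* Passes needed to free target_mb when every pass frees step_mb. */
static unsigned int passes_fixed(unsigned long target_mb, unsigned long step_mb)
{
	return (unsigned int)((target_mb + step_mb - 1) / step_mb);
}

/* Passes needed when the step starts at first_mb and doubles each pass. */
static unsigned int passes_doubling(unsigned long target_mb,
				    unsigned long first_mb)
{
	unsigned long freed = 0, step = first_mb;
	unsigned int passes = 0;

	while (freed < target_mb) {
		freed += step;
		step *= 2;
		passes++;
	}
	return passes;
}
```

For a target of 7692MB (8192M minus 500M, borrowed from the 8G example
elsewhere in the thread), fixed 40MB steps need 193 passes while doubling from
40MB needs 8. Since each pass recounts the saveable pages, the number of passes
is what drives the cost, and a single step is the limiting case of one pass.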

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 20:02                                                         ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-04 20:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Mon, 4 May 2009, Rafael J. Wysocki wrote:

> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -1599,7 +1599,8 @@ nofail_alloc:
>  					zonelist, high_zoneidx, alloc_flags);
>  		if (page)
>  			goto got_pg;
> -	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
> +	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
> +			&& !(gfp_mask & __GFP_NO_OOM_KILL)) {
>  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
>  			schedule_timeout_uninterruptible(1);
>  			goto restart;
> Index: linux-2.6/include/linux/gfp.h
> ===================================================================
> --- linux-2.6.orig/include/linux/gfp.h
> +++ linux-2.6/include/linux/gfp.h
> @@ -51,8 +51,9 @@ struct vm_area_struct;
>  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
>  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
> +#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
>  
> -#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
> +#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>  
>  /* This equals 0, but use constants in case they ever change */
> 

Yeah, that's much better, thanks.  There are currently concerns about adding 
a new gfp flag in another thread (__GFP_PANIC), though, so you might find 
some resistance to adding a flag with a very specific and limited use case.

I think you might have better luck in doing

	struct zone *z;

	for_each_populated_zone(z)
		zone_set_flag(z, ZONE_OOM_LOCKED);
	
if all other tasks are really in D state at this point since oom killer 
serialization is done with try locks in the page allocator.  This is 
equivalent to __GFP_NO_OOM_KILL.
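
The try-lock behaviour this suggestion relies on can be modeled in user space
as below. This is a sketch, not kernel code: struct zone_model and all function
names are hypothetical, and only the "fail if any zone is already locked" rule
is modeled. What the allocator does after a failed try-lock (in the quoted
hunk, sleep and "goto restart") is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; only the lock bit is modeled. */
struct zone_model {
	bool oom_locked;	/* stands in for ZONE_OOM_LOCKED */
};

/*
 * Try-lock over a set of zones: fail without changing anything if any
 * zone is already locked, otherwise lock them all.
 */
static bool try_lock_zones(struct zone_model *zones, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (zones[i].oom_locked)
			return false;
	for (i = 0; i < n; i++)
		zones[i].oom_locked = true;
	return true;
}

/* Pre-lock every zone, mirroring the loop suggested above. */
static void lock_all_zones(struct zone_model *zones, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		zones[i].oom_locked = true;
}

/* True if the OOM-killer path would be entered for this attempt. */
static bool would_invoke_oom_killer(struct zone_model *zones, size_t n)
{
	return try_lock_zones(zones, n);
}
```

Once lock_all_zones() has run, would_invoke_oom_killer() returns false for
every later attempt, which is the sense in which pre-locking the zones keeps
the OOM killer from being called.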

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory)
@ 2009-05-04 20:27                                                 ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-04 20:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, Andrew Morton, torvalds, jens.axboe, alan-jenkins,
	linux-kernel, kernel-testers, linux-pm

On Mon 2009-05-04 21:53:36, Rafael J. Wysocki wrote:
> On Monday 04 May 2009, Pavel Machek wrote:
> > Hi!
> > 
> > > I know that swsusp_shrink_memory() has problems, that's why I'd like to get rid
> > > of it.
> > > 
> > > > I wonder if it's possible to free up the memory within 1s at all.
> > > 
> > > I'm not sure.
> > > 
> > > Apparently, the counting of saveable pages takes substantial time (0.5 s each
> > > iteration on my 64-bit test box), so we can improve that by limiting the number
> > > of iterations.
> > 
> > > We could increase the step size after each step: free in a 40MB step, then
> > > an 80MB step, then a 160MB step, ...
> 
> Why not just one step?  It doesn't seem to hurt performance AFAICS.

One step is obviously fine, too. 
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-04 22:23                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-04 22:23 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Wu Fengguang, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Monday 04 May 2009, David Rientjes wrote:
> On Mon, 4 May 2009, Rafael J. Wysocki wrote:
> 
> > Index: linux-2.6/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -1599,7 +1599,8 @@ nofail_alloc:
> >  					zonelist, high_zoneidx, alloc_flags);
> >  		if (page)
> >  			goto got_pg;
> > -	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
> > +	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
> > +			&& !(gfp_mask & __GFP_NO_OOM_KILL)) {
> >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> >  			schedule_timeout_uninterruptible(1);
> >  			goto restart;
> > Index: linux-2.6/include/linux/gfp.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/gfp.h
> > +++ linux-2.6/include/linux/gfp.h
> > @@ -51,8 +51,9 @@ struct vm_area_struct;
> >  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
> >  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
> >  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
> > +#define __GFP_NO_OOM_KILL ((__force gfp_t)0x200000u)  /* Don't invoke out_of_memory() */
> >  
> > -#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
> > +#define __GFP_BITS_SHIFT 22	/* Number of __GFP_FOO bits */
> >  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
> >  
> >  /* This equals 0, but use constants in case they ever change */
> > 
> 
> Yeah, that's much better, thanks.  There are currently concerns about adding
> a new gfp flag in another thread (__GFP_PANIC), though, so you might find
> some resistance to adding a flag with a very specific and limited use case.

Oh great.  Andrew, what's your opinion?

> I think you might have better luck in doing
> 
> 	struct zone *z;
> 
> 	for_each_populated_zone(z)
> 		zone_set_flag(z, ZONE_OOM_LOCKED);
> 	
> if all other tasks are really in D state at this point since oom killer 
> serialization is done with try locks in the page allocator.

Not all of them, actually.  Some kernel threads are not freezable.

> This is equivalent to __GFP_NO_OOM_KILL.

In that case I think I'd go back to my initial idea with disabling the OOM
killer after freezing tasks.

Roughly, this.  [The idea is that the OOM killer is not really going to work
while tasks are frozen, so we can just give up calling it in that case.]

---
 include/linux/freezer.h |    2 ++
 kernel/power/process.c  |   12 ++++++++++++
 mm/page_alloc.c         |    4 +++-
 3 files changed, 17 insertions(+), 1 deletion(-)

Index: linux-2.6/kernel/power/process.c
===================================================================
--- linux-2.6.orig/kernel/power/process.c
+++ linux-2.6/kernel/power/process.c
@@ -19,6 +19,8 @@
  */
 #define TIMEOUT	(20 * HZ)
 
+static bool tasks_frozen;
+
 static inline int freezeable(struct task_struct * p)
 {
 	if ((p == current) ||
@@ -120,6 +122,10 @@ int freeze_processes(void)
  Exit:
 	BUG_ON(in_atomic());
 	printk("\n");
+
+	if (!error)
+		tasks_frozen = true;
+
 	return error;
 }
 
@@ -145,6 +151,8 @@ static void thaw_tasks(bool nosig_only)
 
 void thaw_processes(void)
 {
+	tasks_frozen = false;
+
 	printk("Restarting tasks ... ");
 	thaw_tasks(true);
 	thaw_tasks(false);
@@ -152,3 +160,7 @@ void thaw_processes(void)
 	printk("done.\n");
 }
 
+bool processes_are_frozen(void)
+{
+	return tasks_frozen;
+}
Index: linux-2.6/include/linux/freezer.h
===================================================================
--- linux-2.6.orig/include/linux/freezer.h
+++ linux-2.6/include/linux/freezer.h
@@ -50,6 +50,7 @@ extern int thaw_process(struct task_stru
 extern void refrigerator(void);
 extern int freeze_processes(void);
 extern void thaw_processes(void);
+extern bool processes_are_frozen(void);
 
 static inline int try_to_freeze(void)
 {
@@ -170,6 +171,7 @@ static inline int thaw_process(struct ta
 static inline void refrigerator(void) {}
 static inline int freeze_processes(void) { BUG(); return 0; }
 static inline void thaw_processes(void) {}
+static inline bool processes_are_frozen(void) { return false; }
 
 static inline int try_to_freeze(void) { return 0; }
 
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -46,6 +46,7 @@
 #include <linux/page-isolation.h>
 #include <linux/page_cgroup.h>
 #include <linux/debugobjects.h>
+#include <linux/freezer.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -1599,7 +1600,8 @@ nofail_alloc:
 					zonelist, high_zoneidx, alloc_flags);
 		if (page)
 			goto got_pg;
-	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
+			&& !processes_are_frozen()) {
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-05  0:37                                                             ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-05  0:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Wu Fengguang, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Tue, 5 May 2009, Rafael J. Wysocki wrote:

> Index: linux-2.6/kernel/power/process.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/process.c
> +++ linux-2.6/kernel/power/process.c
> @@ -19,6 +19,8 @@
>   */
>  #define TIMEOUT	(20 * HZ)
>  
> +static bool tasks_frozen;
> +
>  static inline int freezeable(struct task_struct * p)
>  {
>  	if ((p == current) ||
> @@ -120,6 +122,10 @@ int freeze_processes(void)
>   Exit:
>  	BUG_ON(in_atomic());
>  	printk("\n");
> +
> +	if (!error)
> +		tasks_frozen = true;
> +
>  	return error;
>  }
>  
> @@ -145,6 +151,8 @@ static void thaw_tasks(bool nosig_only)
>  
>  void thaw_processes(void)
>  {
> +	tasks_frozen = false;
> +
>  	printk("Restarting tasks ... ");
>  	thaw_tasks(true);
>  	thaw_tasks(false);
> @@ -152,3 +160,7 @@ void thaw_processes(void)
>  	printk("done.\n");
>  }
>  
> +bool processes_are_frozen(void)
> +{
> +	return tasks_frozen;
> +}
> Index: linux-2.6/include/linux/freezer.h
> ===================================================================
> --- linux-2.6.orig/include/linux/freezer.h
> +++ linux-2.6/include/linux/freezer.h
> @@ -50,6 +50,7 @@ extern int thaw_process(struct task_stru
>  extern void refrigerator(void);
>  extern int freeze_processes(void);
>  extern void thaw_processes(void);
> +extern bool processes_are_frozen(void);
>  
>  static inline int try_to_freeze(void)
>  {
> @@ -170,6 +171,7 @@ static inline int thaw_process(struct ta
>  static inline void refrigerator(void) {}
>  static inline int freeze_processes(void) { BUG(); return 0; }
>  static inline void thaw_processes(void) {}
> +static inline bool processes_are_frozen(void) { return false; }
>  
>  static inline int try_to_freeze(void) { return 0; }
>  
> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -46,6 +46,7 @@
>  #include <linux/page-isolation.h>
>  #include <linux/page_cgroup.h>
>  #include <linux/debugobjects.h>
> +#include <linux/freezer.h>
>  
>  #include <asm/tlbflush.h>
>  #include <asm/div64.h>
> @@ -1599,7 +1600,8 @@ nofail_alloc:
>  					zonelist, high_zoneidx, alloc_flags);
>  		if (page)
>  			goto got_pg;
> -	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
> +	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
> +			&& !processes_are_frozen()) {
>  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
>  			schedule_timeout_uninterruptible(1);
>  			goto restart;

Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
a new gfp flag.  Thanks.
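
As an illustration of the condition discussed above, here is a minimal
userspace sketch of the gate that decides whether the allocator may enter the
OOM-killer path. It is a model only, not kernel code: the `SKETCH_` flag
values and the `sketch_` helpers are hypothetical stand-ins for `__GFP_FS`,
`__GFP_NORETRY`, `tasks_frozen` and `processes_are_frozen()` from the quoted
patch.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real flags are defined in include/linux/gfp.h. */
#define SKETCH_GFP_FS      0x80u
#define SKETCH_GFP_NORETRY 0x1000u

/* Stand-in for the tasks_frozen variable added by the patch. */
static bool sketch_tasks_frozen;

/* Stand-in for processes_are_frozen() from the patch. */
static bool sketch_processes_are_frozen(void)
{
	return sketch_tasks_frozen;
}

/*
 * Mirrors the patched "else if" condition: the OOM path is considered only
 * when the allocation may touch the filesystem, has not asked for "no
 * retry", and tasks are not frozen.
 */
static bool sketch_may_invoke_oom_killer(unsigned int gfp_mask)
{
	return (gfp_mask & SKETCH_GFP_FS) &&
	       !(gfp_mask & SKETCH_GFP_NORETRY) &&
	       !sketch_processes_are_frozen();
}
```

With `sketch_tasks_frozen` set, every mask is rejected regardless of its bits,
which is the behaviour a per-allocation flag would otherwise have to request
explicitly.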

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-05  2:24                                                 ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-05  2:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Since the hibernation code is now going to use allocations of memory
> to create enough room for the image, it can also use the page frames
> allocated at this stage as image page frames.  The low-level
> hibernation code needs to be rearranged for this purpose, but it
> allows us to avoid freeing a great number of pages and allocating
> these same pages once again later, so it generally is worth doing.
> 
> [rev. 2: Change the strategy of preallocating memory to allocate as
>  many pages as needed to get the right image size in one shot (the
>  excessive allocated pages are released afterwards).]

Rafael, I tried out your patches and found that memory shrinking is now twice as fast!

[  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
[  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
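
The figures in the two log lines are consistent with each other, as the
following sketch shows. It assumes 4 KiB pages and assumes the rate is derived
with integer arithmetic as kbytes per second divided by 1000; the kernel
helper that prints the line is not quoted in this thread, so the
`sketch_show_speed()` name and the struct are hypothetical.

```c
/* Assumption: 4 KiB pages, as on typical x86 configurations. */
#define SKETCH_PAGE_SIZE 4096L

struct sketch_speed {
	long kbytes;	/* total size in kbytes */
	long mb_whole;	/* integer part of the MB/s figure */
	long mb_frac;	/* two-digit fractional part of the MB/s figure */
};

/* Converts a page count and an elapsed time in centiseconds to a rate. */
static struct sketch_speed sketch_show_speed(long nr_pages, long centisecs)
{
	struct sketch_speed s;
	long kps;

	if (centisecs == 0)
		centisecs = 1;	/* avoid division by zero */

	s.kbytes = nr_pages * (SKETCH_PAGE_SIZE / 1024);
	kps = s.kbytes * 100 / centisecs;	/* kbytes per second */
	s.mb_whole = kps / 1000;
	s.mb_frac = (kps % 1000) / 10;
	return s;
}
```

For 383900 pages in 3.43 seconds (343 centiseconds) this gives 1535600 kbytes
and 447.69 MB/s, matching the log.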

For your reference, here is the free memory before and after
hibernate_preallocate_memory():

        # free
                     total       used       free     shared    buffers     cached
        Mem:          1933       1917         15          0          0       1845
        -/+ buffers/cache:         72       1861
        Swap:            0          0          0

        # free
                     total       used       free     shared    buffers     cached
        Mem:          1933        920       1012          0          0        356
        -/+ buffers/cache:        563       1369
        Swap:            0          0          0

It seems that the preallocated memory is not freed on -ENOMEM.

+       error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+       if (error)
+               goto err_out;
+
+       error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+       if (error)
+               goto err_out;

memory_bm_create() is called a number of times, and each call in turn
calls create_mem_extents()/memory_bm_free(). Can this be optimized so
that they are called only once?

A side note: there is somewhat duplicated *_extent_*() logic in the
filesystems. Is it possible to abstract out some of the common code?

+       for_each_populated_zone(zone) {
+               size += snapshot_additional_pages(zone);
+               count += zone_page_state(zone, NR_FREE_PAGES);
+               if (!is_highmem(zone))
+                       count -= zone->lowmem_reserve[ZONE_NORMAL];
+       }

Why [ZONE_NORMAL] instead of [zone]? ZONE_NORMAL may not always be the largest
zone; for example, my 4GB laptop has a tiny ZONE_NORMAL and a large ZONE_DMA32.

+       /* If size < max_size, preallocating enough memory may be impossible. */
+       if (count > 0 && size == max_size)
+               error = -ENOMEM;
+       if (error)
+               goto err_out;

The two if()s can be merged.
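
A minimal sketch of what the merge could look like, written as plain userspace
C rather than as the actual patch. The two functions are hypothetical wrappers
around the quoted condition, and the equivalence assumes that `error` is zero
when the check is reached; if an earlier step could already have set it, the
separate `if (error)` test would still be needed.

```c
#include <errno.h>

/* Two-step form, following the quoted patch excerpt. */
static int sketch_check_two_step(long count, long size, long max_size)
{
	int error = 0;	/* assumption: no earlier error is pending */

	if (count > 0 && size == max_size)
		error = -ENOMEM;
	if (error)
		goto err_out;
	return 0;

err_out:
	return error;
}

/* Merged form: the condition and the bail-out share one if(). */
static int sketch_check_merged(long count, long size, long max_size)
{
	if (count > 0 && size == max_size)
		return -ENOMEM;
	return 0;
}
```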

Finally, my major concern about the transition to preallocation-based
memory shrinking: will it lead to more random swap I/O?

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-05  2:24                                                 ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-05  2:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
> 
> Since the hibernation code is now going to use allocations of memory
> to create enough room for the image, it can also use the page frames
> allocated at this stage as image page frames.  The low-level
> hibernation code needs to be rearranged for this purpose, but it
> allows us to avoid freeing a great number of pages and allocating
> these same pages once again later, so it generally is worth doing.
> 
> [rev. 2: Change the strategy of preallocating memory to allocate as
>  many pages as needed to get the right image size in one shot (the
>  excessive allocated pages are released afterwards).]

Rafael, I tried out your patches and found doubled memory shrink speed!

[  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
[  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)

For you reference, here is the free memory before/after
hibernate_preallocate_memory():

        # free
                     total       used       free     shared    buffers     cached
        Mem:          1933       1917         15          0          0       1845
        -/+ buffers/cache:         72       1861
        Swap:            0          0          0

        # free
                     total       used       free     shared    buffers     cached
        Mem:          1933        920       1012          0          0        356
        -/+ buffers/cache:        563       1369
        Swap:            0          0          0

It seems that the preallocated memory is not freed on -ENOMEM.

+       error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
+       if (error)
+               goto err_out;
+
+       error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
+       if (error)
+               goto err_out;

memory_bm_create() is called a number of times, each time it will
call create_mem_extents()/memory_bm_free(). Can they be optimized to
be called only once?

A side note: there are somehow duplicated *_extent_*() logics in the
filesystems, is it possible that we abstract out some of the common code?

+       for_each_populated_zone(zone) {
+               size += snapshot_additional_pages(zone);
+               count += zone_page_state(zone, NR_FREE_PAGES);
+               if (!is_highmem(zone))
+                       count -= zone->lowmem_reserve[ZONE_NORMAL];
+       }

Why [ZONE_NORMAL] instead of [zone]? ZONE_NORMAL may not always be the largest zone,
for example, My 4GB laptop has a tiny ZONE_NORMAL and a large ZONE_DMA32.

+       /* If size < max_size, preallocating enough memory may be impossible. */
+       if (count > 0 && size == max_size)
+               error = -ENOMEM;
+       if (error)
+               goto err_out;

The two if()s can be merged.

At last, I'd express my major concern about the transition to preallocate
based memory shrinking: will it lead to more random swapping IOs?

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-05  2:24                                                 ` Wu Fengguang
@ 2009-05-05  2:46                                                   ` Wu Fengguang
  -1 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-05  2:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Tue, May 05, 2009 at 10:24:27AM +0800, Wu Fengguang wrote:
> On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Since the hibernation code is now going to use allocations of memory
> > to create enough room for the image, it can also use the page frames
> > allocated at this stage as image page frames.  The low-level
> > hibernation code needs to be rearranged for this purpose, but it
> > allows us to avoid freeing a great number of pages and allocating
> > these same pages once again later, so it generally is worth doing.
> > 
> > [rev. 2: Change the strategy of preallocating memory to allocate as
> >  many pages as needed to get the right image size in one shot (the
> >  excessive allocated pages are released afterwards).]
> 
> Rafael, I tried out your patches and found doubled memory shrink speed!
> 
> [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
 
> For your reference, here is the free memory before/after
> hibernate_preallocate_memory():
> 
>         # free
>                      total       used       free     shared    buffers     cached
>         Mem:          1933       1917         15          0          0       1845
>         -/+ buffers/cache:         72       1861
>         Swap:            0          0          0
> 
>         # free
>                      total       used       free     shared    buffers     cached
>         Mem:          1933        920       1012          0          0        356
>         -/+ buffers/cache:        563       1369
>         Swap:            0          0          0
> 
> It seems that the preallocated memory is not freed on -ENOMEM.

Ah, this was my fault.

I had been using this debugging trick:

        @@ -1207,7 +1207,7 @@ int hibernate_preallocate_memory(void)
                        pages, size);
                swsusp_show_speed(&start, &stop, pages, "Allocated");

        -       return 0;
        +       return -ENOMEM;

          err_out:
                printk(KERN_CONT "\n");

That "return -ENOMEM" should be "error = -ENOMEM" :-)
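
The effect can be reproduced in a small user-space model. This is only a
sketch: held, grab(), drop() and prealloc() are hypothetical stand-ins for
the preallocated pages and for hibernate_preallocate_memory(), and the
model always forces the failure.

```c
#include <errno.h>
#include <stdlib.h>

/* Number of buffers currently held; stands in for preallocated pages. */
int held;

static void *grab(void)
{
	void *p = malloc(64);

	if (p)
		held++;
	return p;
}

static void drop(void *p)
{
	if (p) {
		free(p);
		held--;
	}
}

/* Forces a failure either with a bare return or through err_out. */
int prealloc(int use_goto)
{
	int error = 0;
	void *buf = grab();

	if (!buf) {
		error = -ENOMEM;
		goto err_out;
	}

	if (!use_goto)
		return -ENOMEM;	/* debugging hack: skips err_out, buf stays held */

	error = -ENOMEM;	/* forced failure routed through the cleanup path */

err_out:
	drop(buf);
	return error;
}
```

With the bare return the buffer is still counted in held afterwards, which
corresponds to the memory that stayed allocated in the earlier free output.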

Here is one more run:

[  194.016991] PM: Preallocating image memory ... done (allocated 383897 pages, 128000 image pages kept)
[  196.505999] PM: Allocated 1535588 kbytes in 2.47 seconds (621.69 MB/s)
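
The figures in these log lines can be cross-checked with integer
arithmetic. The sketch below assumes 4 KiB pages (which the kbytes/pages
ratio in the log implies) and an elapsed time rounded down to
centiseconds; speed_kps() and show_speed() are hypothetical helpers, not
the kernel's functions.

```c
#include <stdio.h>

/* Assumption: 4 KiB pages, i.e. 4 kbytes per page. */
#define KB_PER_PAGE 4

/* Throughput in kbytes/s from a page count and a time in centiseconds. */
int speed_kps(int nr_pages, int centisecs)
{
	int k = nr_pages * KB_PER_PAGE;

	if (centisecs == 0)
		centisecs = 1;	/* avoid division by zero */
	return (k * 100) / centisecs;
}

/* Format the result in the same shape as the log lines above. */
void show_speed(char *buf, size_t len, int nr_pages, int centisecs)
{
	int k = nr_pages * KB_PER_PAGE;
	int kps = speed_kps(nr_pages, centisecs);

	snprintf(buf, len, "%d kbytes in %d.%02d seconds (%d.%02d MB/s)",
		 k, centisecs / 100, centisecs % 100,
		 kps / 1000, (kps % 1000) / 10);
}
```

Note that "MB/s" here is kbytes/s divided by 1000, so the printed rate is
slightly higher than a value computed with 1024 kbytes per MB.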

Now the free report is back to normal:

# free
             total       used       free     shared    buffers     cached
Mem:          1933         74       1858          0          0         15


Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-05 22:19                                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-05 22:19 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Wu Fengguang, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Tuesday 05 May 2009, David Rientjes wrote:
> On Tue, 5 May 2009, Rafael J. Wysocki wrote:
> 
> > Index: linux-2.6/kernel/power/process.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/process.c
> > +++ linux-2.6/kernel/power/process.c
> > @@ -19,6 +19,8 @@
> >   */
> >  #define TIMEOUT	(20 * HZ)
> >  
> > +static bool tasks_frozen;
> > +
> >  static inline int freezeable(struct task_struct * p)
> >  {
> >  	if ((p == current) ||
> > @@ -120,6 +122,10 @@ int freeze_processes(void)
> >   Exit:
> >  	BUG_ON(in_atomic());
> >  	printk("\n");
> > +
> > +	if (!error)
> > +		tasks_frozen = true;
> > +
> >  	return error;
> >  }
> >  
> > @@ -145,6 +151,8 @@ static void thaw_tasks(bool nosig_only)
> >  
> >  void thaw_processes(void)
> >  {
> > +	tasks_frozen = false;
> > +
> >  	printk("Restarting tasks ... ");
> >  	thaw_tasks(true);
> >  	thaw_tasks(false);
> > @@ -152,3 +160,7 @@ void thaw_processes(void)
> >  	printk("done.\n");
> >  }
> >  
> > +bool processes_are_frozen(void)
> > +{
> > +	return tasks_frozen;
> > +}
> > Index: linux-2.6/include/linux/freezer.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/freezer.h
> > +++ linux-2.6/include/linux/freezer.h
> > @@ -50,6 +50,7 @@ extern int thaw_process(struct task_stru
> >  extern void refrigerator(void);
> >  extern int freeze_processes(void);
> >  extern void thaw_processes(void);
> > +extern bool processes_are_frozen(void);
> >  
> >  static inline int try_to_freeze(void)
> >  {
> > @@ -170,6 +171,7 @@ static inline int thaw_process(struct ta
> >  static inline void refrigerator(void) {}
> >  static inline int freeze_processes(void) { BUG(); return 0; }
> >  static inline void thaw_processes(void) {}
> > +static inline bool processes_are_frozen(void) { return false; }
> >  
> >  static inline int try_to_freeze(void) { return 0; }
> >  
> > Index: linux-2.6/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -46,6 +46,7 @@
> >  #include <linux/page-isolation.h>
> >  #include <linux/page_cgroup.h>
> >  #include <linux/debugobjects.h>
> > +#include <linux/freezer.h>
> >  
> >  #include <asm/tlbflush.h>
> >  #include <asm/div64.h>
> > @@ -1599,7 +1600,8 @@ nofail_alloc:
> >  					zonelist, high_zoneidx, alloc_flags);
> >  		if (page)
> >  			goto got_pg;
> > -	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
> > +	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)
> > +			&& !processes_are_frozen()) {
> >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> >  			schedule_timeout_uninterruptible(1);
> >  			goto restart;
> 
> Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
> a new gfp flag.  Thanks.

Well, you're welcome.

BTW, I think that Andrew was actually right when he asked if I checked whether
the existing __GFP_NORETRY would work as-is for __GFP_FS set and
__GFP_NORETRY unset.  Namely, in that case we never reach the code before
nopage: that checks __GFP_NORETRY, do we?

So I think we shouldn't modify the 'else if' condition above, and should
instead check for !processes_are_frozen() at the beginning of the block below.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-05 22:37                                                                 ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-05 22:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wed, 6 May 2009 00:19:35 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > > +			&& !processes_are_frozen()) {
> > >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> > >  			schedule_timeout_uninterruptible(1);
> > >  			goto restart;
> > 
> > Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
> > a new gfp flag.  Thanks.
> 
> Well, you're welcome.
> 
> BTW, I think that Andrew was actually right when he asked if I checked whether
> the existing __GFP_NORETRY would work as-is for __GFP_FS set and
> __GFP_NORETRY unset.  Namely, in that case we never reach the code before
> nopage: that checks __GFP_NORETRY, do we?
> 
> So I think we shouldn't modify the 'else if' condition above, and should
> instead check for !processes_are_frozen() at the beginning of the block below.

Confused.

I suspect that hibernation can allocate its pages with
__GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
will do the right thing: no oom-killings.

In which case, processes_are_frozen() is not needed at all?
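
A small model of the unmodified 'else if' condition quoted above shows why
that flag combination would keep the OOM killer out of the picture. The
X_GFP_* bit values are hypothetical and chosen only for illustration, and
may_invoke_oom_killer() models just that one condition, not the rest of
the allocator slow path.

```c
#include <stdbool.h>

/* Hypothetical bit values; the kernel's real __GFP_* values differ. */
#define X_GFP_WAIT	0x1u
#define X_GFP_FS	0x2u
#define X_GFP_NORETRY	0x4u
#define X_GFP_NOWARN	0x8u

/* True when the OOM-killing branch is eligible for this gfp mask. */
bool may_invoke_oom_killer(unsigned int gfp_mask)
{
	return (gfp_mask & X_GFP_FS) && !(gfp_mask & X_GFP_NORETRY);
}
```

With X_GFP_NORETRY set the function returns false whatever the other bits
are, so in this model no extra processes_are_frozen() test is required.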

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-05  2:24                                                 ` Wu Fengguang
@ 2009-05-05 23:05                                                   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-05 23:05 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Tuesday 05 May 2009, Wu Fengguang wrote:
> On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Since the hibernation code is now going to use allocations of memory
> > to create enough room for the image, it can also use the page frames
> > allocated at this stage as image page frames.  The low-level
> > hibernation code needs to be rearranged for this purpose, but it
> > allows us to avoid freeing a great number of pages and allocating
> > these same pages once again later, so it generally is worth doing.
> > 
> > [rev. 2: Change the strategy of preallocating memory to allocate as
> >  many pages as needed to get the right image size in one shot (the
> >  excessive allocated pages are released afterwards).]
> 
> Rafael, I tried out your patches and found doubled memory shrink speed!
>
> [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)

Unfortunately, I'm observing a regression, and a huge one.

On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
and that takes ~2 s with the old code and ~15 s with the new one.

It helps to call shrink_all_memory() once with a sufficiently large argument
before the preallocation.
 
> For your reference, here is the free memory before/after
> hibernate_preallocate_memory():
> 
>         # free
>                      total       used       free     shared    buffers     cached
>         Mem:          1933       1917         15          0          0       1845
>         -/+ buffers/cache:         72       1861
>         Swap:            0          0          0
> 
>         # free
>                      total       used       free     shared    buffers     cached
>         Mem:          1933        920       1012          0          0        356
>         -/+ buffers/cache:        563       1369
>         Swap:            0          0          0
> 
> It seems that the preallocated memory is not freed on -ENOMEM.
> 
> +       error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
> +       if (error)
> +               goto err_out;
> +
> +       error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
> +       if (error)
> +               goto err_out;
> 
> memory_bm_create() is called a number of times, each time it will
> call create_mem_extents()/memory_bm_free(). Can they be optimized to
> be called only once?

Possibly, but not right now if you please?  This is just moving code BTW.

> A side note: there is somewhat duplicated *_extent_*() logic in the
> filesystems; is it possible to abstract out some of the common code?

I think we can do it, but it really is low priority to me at the moment.

> +       for_each_populated_zone(zone) {
> +               size += snapshot_additional_pages(zone);
> +               count += zone_page_state(zone, NR_FREE_PAGES);
> +               if (!is_highmem(zone))
> +                       count -= zone->lowmem_reserve[ZONE_NORMAL];
> +       }
> 
> Why [ZONE_NORMAL] instead of [zone]? ZONE_NORMAL may not always be the largest zone;
> for example, my 4GB laptop has a tiny ZONE_NORMAL and a large ZONE_DMA32.

Ah, this is a leftover and it should be changed or even dropped.  Can you
please remind me how exactly lowmem_reserve[] is supposed to work?
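
For reference, a simplified sketch of how lowmem_reserve[] is commonly
described as working: the array is indexed by the highest zone an allocation
is allowed to use (its "class zone"), and each entry is the number of pages
this zone holds back from such allocations.  The struct and function names
below are hypothetical stand-ins, not the actual mm code:

```c
#include <assert.h>

/* Simplified zone indices; real kernels may have more (e.g. HIGHMEM). */
enum zone_idx { Z_DMA, Z_DMA32, Z_NORMAL, NR_ZONES };

/* Hypothetical stand-in for the few struct zone fields that matter here. */
struct zone_model {
	long free_pages;
	long min_pages;			/* the "min" watermark */
	long lowmem_reserve[NR_ZONES];	/* indexed by the allocation's class zone */
};

/*
 * Return 1 if the zone may serve an allocation whose highest permitted
 * zone is classzone_idx, 0 if the zone should be skipped.
 */
static int zone_may_serve(const struct zone_model *z, enum zone_idx classzone_idx)
{
	assert(classzone_idx < NR_ZONES);
	return z->free_pages > z->min_pages + z->lowmem_reserve[classzone_idx];
}
```

In this model a DMA zone with 3000 free pages, a min watermark of 100 and
reserves of { 0, 2000, 4000 } still serves DMA and DMA32-class requests but
refuses a NORMAL-class one.  As far as I know, the entry for a zone's own
index is normally zero, so which index to subtract depends on the kind of
allocation one wants to protect against.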

> +       /* If size < max_size, preallocating enough memory may be impossible. */
> +       if (count > 0 && size == max_size)
> +               error = -ENOMEM;
> +       if (error)
> +               goto err_out;
> 
> The two if()s can be merged.

Unfortunately, the first one is actually wrong. :-)

It's not present in the updated patchset I'm going to send tomorrow.

> Lastly, I'd like to express my major concern about the transition to
> preallocation-based memory shrinking: will it lead to more random swap I/O?

Hmm.  I don't immediately see why it would.  Maybe the regression I'm seeing
is related to that ...

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-05  2:46                                                   ` Wu Fengguang
@ 2009-05-05 23:07                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-05 23:07 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Tuesday 05 May 2009, Wu Fengguang wrote:
> On Tue, May 05, 2009 at 10:24:27AM +0800, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> > 
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
>  
> > For your reference, here is the free memory before/after
> > hibernate_preallocate_memory():
> > 
> >         # free
> >                      total       used       free     shared    buffers     cached
> >         Mem:          1933       1917         15          0          0       1845
> >         -/+ buffers/cache:         72       1861
> >         Swap:            0          0          0
> > 
> >         # free
> >                      total       used       free     shared    buffers     cached
> >         Mem:          1933        920       1012          0          0        356
> >         -/+ buffers/cache:        563       1369
> >         Swap:            0          0          0
> > 
> > It seems that the preallocated memory is not freed on -ENOMEM.
> 
> Ah, this was my fault.
> 
> I used to do this debugging trick:
> 
>         @@ -1207,7 +1207,7 @@ int hibernate_preallocate_memory(void)
>                         pages, size);
>                 swsusp_show_speed(&start, &stop, pages, "Allocated");
> 
>         -       return 0;
>         +       return -ENOMEM;
> 
>           err_out:
>                 printk(KERN_CONT "\n");
> 
> That "return -ENOMEM" should be "error = -ENOMEM" :-)
> 
> Here is one more run:
> 
> [  194.016991] PM: Preallocating image memory ... done (allocated 383897 pages, 128000 image pages kept)
> [  196.505999] PM: Allocated 1535588 kbytes in 2.47 seconds (621.69 MB/s)
> 
> Now the free report is back to normal:
> 
> # free
>              total       used       free     shared    buffers     cached
> Mem:          1933         74       1858          0          0         15

Thanks for testing.  The results look encouraging, but I'd also like to get rid
of the regression mentioned in my previous message.
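
The debugging slip quoted above is a general hazard of the goto-cleanup
pattern: a plain return placed before the label bypasses the cleanup.  A
minimal userspace sketch, with hypothetical helpers standing in for the page
preallocation and release:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_buffers;	/* allocations that have not been released yet */

/* Hypothetical stand-in for preallocating image memory. */
static void *grab(void)
{
	void *p = malloc(64);

	if (p)
		live_buffers++;
	return p;
}

/* Hypothetical stand-in for releasing preallocated memory. */
static void release(void *p)
{
	if (p) {
		free(p);
		live_buffers--;
	}
}

/* Forces a failure either by returning directly or via the error path. */
static int prealloc_model(int return_directly)
{
	int error = 0;
	void *buf = grab();

	if (!buf) {
		error = -ENOMEM;
		goto err_out;
	}

	if (return_directly)
		return -ENOMEM;	/* skips err_out: buf is never released */

	error = -ENOMEM;	/* falls through to the cleanup below */

 err_out:
	release(buf);
	return error;
}
```

Both variants report -ENOMEM to the caller, but only prealloc_model(0) leaves
live_buffers at zero; prealloc_model(1) leaves the buffer allocated, which is
the same effect as the "memory not freed" report.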

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-05 23:20                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-05 23:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wednesday 06 May 2009, Andrew Morton wrote:
> On Wed, 6 May 2009 00:19:35 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > > +			&& !processes_are_frozen()) {
> > > >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> > > >  			schedule_timeout_uninterruptible(1);
> > > >  			goto restart;
> > > 
> > > Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
> > > a new gfp flag.  Thanks.
> > 
> > Well, you're welcome.
> > 
> > BTW, I think that Andrew was actually right when he asked if I checked whether
> > the existing __GFP_NORETRY would work as-is for __GFP_FS set and
> > __GFP_NORETRY unset.  Namely, in that case we never reach the code before
> > nopage: that checks __GFP_NORETRY, do we?
> > 
> > So I think we shouldn't modify the 'else if' condition above and check for
> > !processes_are_frozen() at the beginning of the block below.
> 
> Confused.
> 
> I'm suspecting that hibernation can allocate its pages with
> __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> will dtrt: no oom-killings.
> 
> In which case, processes_are_frozen() is not needed at all?

__GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
the combination.

Anyway, even if the hibernation code itself doesn't trigger the OOM killer,
anyone else allocating memory in parallel, or after we've preallocated the
image memory, may still trigger it.  So it seems processes_are_frozen()
may still be useful?
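
To make the two options concrete, here is a rough userspace model of the
decision being discussed, as I read the quoted hunk: the OOM killer is only
considered for allocations with __GFP_FS set and __GFP_NORETRY unset, and the
patch adds the frozen-tasks check on top.  The bit values and the function
name are made up for illustration; only the shape of the condition matters:

```c
#include <assert.h>

/* Hypothetical bit values for illustration only; the kernel's own differ. */
#define MODEL_GFP_WAIT		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_NORETRY	0x4u
#define MODEL_GFP_NOWARN	0x8u

/*
 * Return 1 if a failing allocation with this mask would go on to the OOM
 * killer in the model, 0 if it would simply fail.
 */
static int may_invoke_oom_killer(unsigned int gfp_mask, int tasks_frozen)
{
	if (!(gfp_mask & MODEL_GFP_FS))
		return 0;	/* callers that cannot do FS I/O never OOM-kill */
	if (gfp_mask & MODEL_GFP_NORETRY)
		return 0;	/* the caller asked to fail instead of retrying */
	if (tasks_frozen)
		return 0;	/* the check added by the patch under discussion */
	return 1;
}
```

In this model the mask Andrew suggests (FS | WAIT | NORETRY | NOWARN) never
reaches the OOM killer regardless of the freezer state, while an ordinary
FS | WAIT allocation made by someone else is only stopped by the frozen-tasks
check.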

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-05 23:40                                                                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-05 23:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wed, 6 May 2009 01:20:34 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday 06 May 2009, Andrew Morton wrote:
> > On Wed, 6 May 2009 00:19:35 +0200
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > > > +			&& !processes_are_frozen()) {
> > > > >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> > > > >  			schedule_timeout_uninterruptible(1);
> > > > >  			goto restart;
> > > > 
> > > > Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
> > > > a new gfp flag.  Thanks.
> > > 
> > > Well, you're welcome.
> > > 
> > > BTW, I think that Andrew was actually right when he asked if I checked whether
> > > the existing __GFP_NORETRY would work as-is for __GFP_FS set and
> > > __GFP_NORETRY unset.  Namely, in that case we never reach the code before
> > > nopage: that checks __GFP_NORETRY, do we?
> > > 
> > > So I think we shouldn't modify the 'else if' condition above and check for
> > > !processes_are_frozen() at the beginning of the block below.
> > 
> > Confused.
> > 
> > I'm suspecting that hibernation can allocate its pages with
> > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > will dtrt: no oom-killings.
> > 
> > In which case, processes_are_frozen() is not needed at all?
> 
> __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> the combination.

OK.  __GFP_WAIT is the big hammer.

> Anyway, even if the hibernation code itself doesn't trigger the OOM killer,
> anyone else allocating memory in parallel, or after we've preallocated the
> image memory, may still trigger it.  So it seems processes_are_frozen()
> may still be useful?

Could be.  But only kernel threads are active at this time (yes?), and they
won't have much work to do because userspace is asleep.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-05 23:40                                                       ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-05 23:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wed, May 06, 2009 at 07:07:44AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Tue, May 05, 2009 at 10:24:27AM +0800, Wu Fengguang wrote:
> > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > Since the hibernation code is now going to use allocations of memory
> > > > to create enough room for the image, it can also use the page frames
> > > > allocated at this stage as image page frames.  The low-level
> > > > hibernation code needs to be rearranged for this purpose, but it
> > > > allows us to avoid freeing a great number of pages and allocating
> > > > these same pages once again later, so it generally is worth doing.
> > > > 
> > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > >  many pages as needed to get the right image size in one shot (the
> > > >  excessive allocated pages are released afterwards).]
> > > 
> > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > > 
> > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> >  
> > > For your reference, here is the free memory before/after
> > > hibernate_preallocate_memory():
> > > 
> > >         # free
> > >                      total       used       free     shared    buffers     cached
> > >         Mem:          1933       1917         15          0          0       1845
> > >         -/+ buffers/cache:         72       1861
> > >         Swap:            0          0          0
> > > 
> > >         # free
> > >                      total       used       free     shared    buffers     cached
> > >         Mem:          1933        920       1012          0          0        356
> > >         -/+ buffers/cache:        563       1369
> > >         Swap:            0          0          0
> > > 
> > > It seems that the preallocated memory is not freed on -ENOMEM.
> > 
> > Ah, this was my fault.
> > 
> > I used to do this debugging trick:
> > 
> >         @@ -1207,7 +1207,7 @@ int hibernate_preallocate_memory(void)
> >                         pages, size);
> >                 swsusp_show_speed(&start, &stop, pages, "Allocated");
> > 
> >         -       return 0;
> >         +       return -ENOMEM;
> > 
> >           err_out:
> >                 printk(KERN_CONT "\n");
> > 
> > That "return -ENOMEM" should be "error = -ENOMEM" :-)
> > 
> > Here is one more run:
> > 
> > [  194.016991] PM: Preallocating image memory ... done (allocated 383897 pages, 128000 image pages kept)
> > [  196.505999] PM: Allocated 1535588 kbytes in 2.47 seconds (621.69 MB/s)
> > 
> > Now the free report is back to normal:
> > 
> > # free
> >              total       used       free     shared    buffers     cached
> > Mem:          1933         74       1858          0          0         15

The 'free' output above still exposes something wrong: only 74M of memory is
left in use, instead of image_size=500M. I'm prepared to test your updated patches :-)
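
The debugging mistake described above can be sketched in userspace C. This is only a model under stated assumptions: it assumes that the err_out path is what releases the preallocated memory, and every name here (sketch_image, sketch_preallocate(), the two flags) is hypothetical rather than taken from the patch.

```c
#include <errno.h>
#include <stdlib.h>

/* Stands in for the preallocated image pages. */
static void *sketch_image;

static void sketch_free_image(void)
{
	free(sketch_image);
	sketch_image = NULL;
}

/* force_failure emulates the injected -ENOMEM; bypass_cleanup selects the
 * buggy variant that returns directly instead of going through err_out. */
static int sketch_preallocate(int force_failure, int bypass_cleanup)
{
	int error = 0;

	sketch_image = malloc(4096);
	if (!sketch_image)
		return -ENOMEM;

	if (force_failure) {
		if (bypass_cleanup)
			return -ENOMEM;	/* bug: skips err_out, memory stays allocated */
		error = -ENOMEM;
		goto err_out;
	}
	return 0;

err_out:
	sketch_free_image();
	return error;
}
```

With bypass_cleanup set, the caller sees -ENOMEM while sketch_image is still allocated, which mirrors the "memory is not freed on -ENOMEM" observation. Setting error and taking the err_out path releases it.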

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-06 13:30                                                     ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-06 13:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.

Yes, there is some strange behavior. I populated the page cache with
1/30 mapped file pages and the rest normal file pages, all
referenced once. I got this on "echo disk > /sys/power/state":

[  462.820098] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  462.827161] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  462.834249] PM: Basic memory bitmaps created
[  462.838631] PM: Syncing filesystems ... done.
[  463.167805] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  463.175738] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  463.183834] PM: Preallocating image memory ... done (allocated 383898 pages, 128000 image pages kept)
[  469.605741] PM: Allocated 1535592 kbytes in 6.41 seconds (239.56 MB/s)
[  469.612325]
[  469.768796] Restarting tasks ... done.
[  469.775044] PM: Basic memory bitmaps freed

Immediately after that, I copied a big sparse file into memory, and got this:

[  508.097913] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  508.104799] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  508.111702] PM: Basic memory bitmaps created
[  508.116073] PM: Syncing filesystems ... done.
[  509.208608] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  509.216692] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  509.224708] PM: Preallocating image memory ... done (allocated 383872 pages, 128000 image pages kept)
[  520.951882] PM: Allocated 1535488 kbytes in 11.71 seconds (131.12 MB/s)

It's much worse.

Your patches are really interesting exercises for the vmscan code ;-)
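
For readers comparing the runs, the throughput figures in the "PM: Allocated" lines can be reproduced with integer arithmetic. The sketch below is inferred from the logged numbers only, not from the kernel source: it assumes 4 KiB pages and that "MB/s" means kbytes per second divided by 1000. The function names are hypothetical.

```c
/* Assumption: 4 KiB pages, as the page and kbyte counts in the logs imply. */
#define SKETCH_PAGE_KB 4L

/* Kilobytes represented by nr_pages. */
static long sketch_kbytes(long nr_pages)
{
	return nr_pages * SKETCH_PAGE_KB;
}

/* Throughput in kbytes per second, given elapsed time in centiseconds.
 * Integer division truncates, matching the logged values. */
static long sketch_kps(long nr_pages, long centisecs)
{
	if (centisecs == 0)
		centisecs = 1;	/* avoid division by zero */
	return sketch_kbytes(nr_pages) * 100 / centisecs;
}
```

For the two runs above, sketch_kps(383898, 641) gives 239561 (logged as 239.56 MB/s) and sketch_kps(383872, 1171) gives 131126 (logged as 131.12 MB/s), so the second run is a bit under half the speed of the first for nearly the same number of pages.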

> > +       error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
> > +       if (error)
> > +               goto err_out;
> > +
> > +       error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
> > +       if (error)
> > +               goto err_out;
> > 
> > memory_bm_create() is called a number of times, each time it will
> > call create_mem_extents()/memory_bm_free(). Can they be optimized to
> > be called only once?
> 
> Possibly, but not right now if you please?  This is just moving code BTW.

OK.

> 
> > A side note: there are somehow duplicated *_extent_*() logics in the
> > filesystems, is it possible that we abstract out some of the common code?
> 
> I think we can do it, but it really is low priority to me at the moment.

OK. It was just a wild thought.

> 
> > +       for_each_populated_zone(zone) {
> > +               size += snapshot_additional_pages(zone);
> > +               count += zone_page_state(zone, NR_FREE_PAGES);
> > +               if (!is_highmem(zone))
> > +                       count -= zone->lowmem_reserve[ZONE_NORMAL];
> > +       }
> > 
> > Why [ZONE_NORMAL] instead of [zone]? ZONE_NORMAL may not always be the largest zone,
> > for example, My 4GB laptop has a tiny ZONE_NORMAL and a large ZONE_DMA32.
> 
> Ah, this is a leftover and it should be changed or even dropped.  Can you
> please remind me how exactly lowmem_reserve[] is supposed to work?

totalreserve_pages could be better. When free memory drops below that
threshold (it actually works per zone), kswapd wakes up and tries to
reclaim pages. If the total of reclaimable + free pages is as low as
totalreserve_pages, that would drive kswapd mad: it would scan whole
zones, trying to squeeze the last pages out of them.  Sure, kswapd will
stop somewhere, but the resulting scan:reclaim ratio would be pretty
high and would therefore hurt performance.

So we should stop preallocation when reclaimable pages go down to
something like (5*totalreserve_pages). The vmscan madness may set in earlier
because of an unbalanced distribution of reclaimable pages among the zones.
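
The stop condition suggested here can be written down as a small predicate. This is a sketch only: the factor of 5 is the "something like" value from the paragraph above, not a tuned constant, and the function and macro names are hypothetical. It also ignores the per-zone imbalance mentioned above, which could make a global check fire too late.

```c
#include <stdbool.h>

/* "Something like 5*totalreserve_pages", taken from the text above. */
#define SKETCH_RESERVE_MULTIPLIER 5UL

/* Returns true when preallocation should stop because the remaining
 * reclaimable pages are close enough to the reserve that further reclaim
 * would mostly be wasted scanning. */
static bool sketch_should_stop_preallocation(unsigned long reclaimable_pages,
					     unsigned long totalreserve_pages)
{
	return reclaimable_pages <=
	       SKETCH_RESERVE_MULTIPLIER * totalreserve_pages;
}
```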

> > At last, I'd express my major concern about the transition to preallocate
> > based memory shrinking: will it lead to more random swapping IOs?
> 
> Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> is related to that ...

OK. Anyway, a preallocation-based shrinking policy could be far from optimal.
I'd suggest switching to user-space-directed shrinking via fadvise(DONTNEED),
and leaving the kernel one as a fail-safe path.  The user space tool could
gather page information from the filecache interface, which I've been
maintaining out of tree, and drop inactive/active pages from large
files first. That should be a better policy, at least for rotational disks.

Thanks,
Fengguang
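
As an illustration of the user-space-directed approach, here is a minimal sketch of dropping one file's cached pages with posix_fadvise() and POSIX_FADV_DONTNEED. It does not use the out-of-tree filecache interface mentioned above, so choosing which files to drop is left out entirely. The helper name sketch_drop_file_cache() is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <unistd.h>

/* Ask the kernel to drop cached pages for the whole file at path.
 * Returns 0 on success, -1 if the file cannot be opened, or the positive
 * error number returned by posix_fadvise(). */
static int sketch_drop_file_cache(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;

	/* offset 0 with len 0 covers the file from start to end */
	ret = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);
	return ret;
}
```

The call is advisory, and dirty pages are generally not discarded by it, so a real tool would probably sync the file first.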


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-06 13:30                                                     ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-06 13:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Andrew Morton, pavel-+ZI9xUNit7I,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.

Yes, there is some strange behavior. I populated the page cache with
1/30 mapped file pages and the rest normal file pages, all
referenced once. I got this on "echo disk > /sys/power/state":

[  462.820098] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  462.827161] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  462.834249] PM: Basic memory bitmaps created
[  462.838631] PM: Syncing filesystems ... done.
[  463.167805] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  463.175738] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  463.183834] PM: Preallocating image memory ... done (allocated 383898 pages, 128000 image pages kept)
[  469.605741] PM: Allocated 1535592 kbytes in 6.41 seconds (239.56 MB/s)
[  469.612325]
[  469.768796] Restarting tasks ... done.
[  469.775044] PM: Basic memory bitmaps freed

Immediately after that, I copied a big sparse file into memory, and got this:

[  508.097913] PM: Marking nosave pages: 0000000000001000 - 0000000000006000
[  508.104799] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  508.111702] PM: Basic memory bitmaps created
[  508.116073] PM: Syncing filesystems ... done.
[  509.208608] Freezing user space processes ... (elapsed 0.00 seconds) done.
[  509.216692] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[  509.224708] PM: Preallocating image memory ... done (allocated 383872 pages, 128000 image pages kept)
[  520.951882] PM: Allocated 1535488 kbytes in 11.71 seconds (131.12 MB/s)

It's much worse.

Your patches are really interesting exercises for the vmscan code ;-)

> > +       error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY);
> > +       if (error)
> > +               goto err_out;
> > +
> > +       error = memory_bm_create(&copy_bm, GFP_IMAGE, PG_ANY);
> > +       if (error)
> > +               goto err_out;
> > 
> > memory_bm_create() is called a number of times, each time it will
> > call create_mem_extents()/memory_bm_free(). Can they be optimized to
> > be called only once?
> 
> Possibly, but not right now if you please?  This is just moving code BTW.

OK.

> 
> > A side note: there are somehow duplicated *_extent_*() logics in the
> > filesystems, is it possible that we abstract out some of the common code?
> 
> I think we can do it, but it really is low priority to me at the moment.

OK. It was just a wild thought.

> 
> > +       for_each_populated_zone(zone) {
> > +               size += snapshot_additional_pages(zone);
> > +               count += zone_page_state(zone, NR_FREE_PAGES);
> > +               if (!is_highmem(zone))
> > +                       count -= zone->lowmem_reserve[ZONE_NORMAL];
> > +       }
> > 
> > Why [ZONE_NORMAL] instead of [zone]? ZONE_NORMAL may not always be the largest zone,
> > for example, My 4GB laptop has a tiny ZONE_NORMAL and a large ZONE_DMA32.
> 
> Ah, this is a leftover and it should be changed or even dropped.  Can you
> please remind me how exactly lowmem_reserve[] is supposed to work?

totalreserve_pages could be better. When free memory drops below that
threshold (it actually works per zone), kswapd will wake up and try to
reclaim pages. If the total reclaimable+free pages are as low as
totalreserve_pages, that would drive kswapd mad: it would scan whole
zones, trying to squeeze the last pages out of them.  Sure, kswapd will
stop somewhere, but the resulting scan:reclaim ratio would be pretty
high and therefore hurt performance.

So we should stop preallocation when reclaimable pages go down to
something like (5*totalreserve_pages). The vmscan madness may come earlier
because of an unbalanced distribution of reclaimable pages among the zones.
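
A rough sketch of that stopping rule, written as plain user-space C
rather than kernel code. struct zone_stat, RESERVE_FACTOR and
should_stop_prealloc() are hypothetical names for illustration; the
factor of 5 is only the ballpark figure suggested above.

```c
#include <stdbool.h>

/* Hypothetical per-zone snapshot; not a kernel structure. */
struct zone_stat {
	unsigned long reclaimable; /* pages vmscan could still reclaim */
};

/* Assumed safety factor, from the "5*totalreserve_pages" suggestion. */
#define RESERVE_FACTOR 5UL

/* Sum the reclaimable pages over all zones. */
static unsigned long total_reclaimable(const struct zone_stat *z, int n)
{
	unsigned long sum = 0;

	for (int i = 0; i < n; i++)
		sum += z[i].reclaimable;
	return sum;
}

/* True once the reclaimable pages have dropped to the threshold, i.e.
 * further preallocation would likely cost far more scanning than it frees. */
static bool should_stop_prealloc(const struct zone_stat *z, int n,
				 unsigned long totalreserve)
{
	return total_reclaimable(z, n) <= RESERVE_FACTOR * totalreserve;
}
```

This checks only the global sum. As noted, an unbalanced distribution
among the zones could make a single zone hit the problem earlier, so a
per-zone check may be needed as well.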

> > At last, I'd express my major concern about the transition to preallocate
> > based memory shrinking: will it lead to more random swapping IOs?
> 
> Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> is related to that ...

OK. Anyway, a preallocation-based shrinking policy could be far from optimal.
I'd suggest switching to user-space-directed shrinking via fadvise(DONTNEED),
and leaving the kernel one as a fail-safe path.  The user space tool could
gather page information from the filecache interface which I've been
maintaining out of tree, and drop inactive/active pages from large
files first. That should be a better policy, at least for rotational disks.
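
A minimal sketch of the user-space side, assuming the tool already
knows which file to drop (the filecache interface is out of tree and is
not shown). drop_file_cache() is a hypothetical helper name;
posix_fadvise() with POSIX_FADV_DONTNEED is the standard call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <fcntl.h>
#include <unistd.h>

/* Ask the kernel to drop the cached pages of one file.
 * Returns 0 on success, -1 if the file cannot be opened, or the
 * error number returned by posix_fadvise(). */
static int drop_file_cache(const char *path)
{
	int fd = open(path, O_RDONLY);
	int err;

	if (fd < 0)
		return -1;
	/* offset 0 with len 0 covers the whole file */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);
	return err;
}
```

Dirty pages are not dropped by this call, so a real tool would
probably sync the file before advising.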

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-06 13:56                                                     ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-06 13:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.
[snip]
> > At last, I'd express my major concern about the transition to preallocate
> > based memory shrinking: will it lead to more random swapping IOs?
> 
> Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> is related to that ...

So you do have a swap file enabled? hibernate_preallocate_memory() will
first try to allocate as many pages as possible (savable+free), and
then free (allocated-image_size) pages. That means *all*
swappable pages will be swapped out in the process - that's a major
performance regression!  And the zones are likely to be *over scanned*
and go to the *all unreclaimable* state! (Hopefully they are already
small by then.)
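
To put numbers on the allocate-then-release step, here is a small
sketch using the figures from the quoted log line (383900 pages
allocated, 128000 image pages kept). It assumes 4 KiB pages; the helper
names are hypothetical, and the figures are from my test run, not from
the Atom box.

```c
/* Assumption: 4 KiB pages. */
#define PAGE_KB 4UL

/* Pages given back after preallocation: (allocated - image_size). */
static unsigned long pages_released(unsigned long allocated,
				    unsigned long image_pages)
{
	return allocated > image_pages ? allocated - image_pages : 0;
}

/* Convert a page count to whole MiB (rounded down). */
static unsigned long pages_to_mib(unsigned long pages)
{
	return pages * PAGE_KB / 1024;
}
```

With those inputs, 255900 pages (about 999 MiB) are released again and
500 MiB are kept for the image.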

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-06 13:56                                                     ` Wu Fengguang
@ 2009-05-06 20:54                                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-06 20:54 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Wu Fengguang, linux-pm, Andrew Morton, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Wednesday 06 May 2009, Wu Fengguang wrote:
> On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> > On Tuesday 05 May 2009, Wu Fengguang wrote:
> > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > Since the hibernation code is now going to use allocations of memory
> > > > to create enough room for the image, it can also use the page frames
> > > > allocated at this stage as image page frames.  The low-level
> > > > hibernation code needs to be rearranged for this purpose, but it
> > > > allows us to avoid freeing a great number of pages and allocating
> > > > these same pages once again later, so it generally is worth doing.
> > > > 
> > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > >  many pages as needed to get the right image size in one shot (the
> > > >  excessive allocated pages are released afterwards).]
> > > 
> > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > >
> > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> > 
> > Unfortunately, I'm observing a regression and a huge one.
> > 
> > On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> > with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> > and that takes ~2 s with the old code and ~15 s with the new one.
> > 
> > It helps to call shrink_all_memory() once with a sufficiently large argument
> > before the preallocation.
> [snip]
> > > At last, I'd express my major concern about the transition to preallocate
> > > based memory shrinking: will it lead to more random swapping IOs?
> > 
> > Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> > is related to that ...
> 
> So you do have swap file enabled? hibernate_preallocate_memory() will
> firstly try to allocate as much pages as possible(savable+free), and
> then to free up (allocated-image_size) pages.

No.  It's going to allocate (total RAM - anticipated image size) and then free
up (allocated-image_size) pages.

If we consider maximum image sizes, that means allocating slightly more than
50% of RAM, so it really shouldn't regress that much IMO.
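
To put rough numbers on that, here is a standalone sketch (not the kernel
code; the helper name, the 4 KB page size and the example image size just
under half of RAM are assumptions for illustration):

```c
#include <assert.h>

/*
 * Pages the preallocation step would request: everything except the
 * anticipated image.  Illustrative helper, not a kernel function.
 */
static unsigned long pages_to_preallocate(unsigned long total_pages,
					  unsigned long image_pages)
{
	/* The image cannot be larger than RAM itself. */
	assert(image_pages <= total_pages);
	return total_pages - image_pages;
}
```

With 1 GB of RAM (262144 pages of 4 KB) and an image of 130048 pages, this
gives 132096 pages, i.e. about 50.4% of RAM.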

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-07  1:58                                                         ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-07  1:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, May 07, 2009 at 04:54:09AM +0800, Rafael J. Wysocki wrote:
> On Wednesday 06 May 2009, Wu Fengguang wrote:
> > On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> > > On Tuesday 05 May 2009, Wu Fengguang wrote:
> > > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > 
> > > > > Since the hibernation code is now going to use allocations of memory
> > > > > to create enough room for the image, it can also use the page frames
> > > > > allocated at this stage as image page frames.  The low-level
> > > > > hibernation code needs to be rearranged for this purpose, but it
> > > > > allows us to avoid freeing a great number of pages and allocating
> > > > > these same pages once again later, so it generally is worth doing.
> > > > > 
> > > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > > >  many pages as needed to get the right image size in one shot (the
> > > > >  excessive allocated pages are released afterwards).]
> > > > 
> > > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > > >
> > > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> > > 
> > > Unfortunately, I'm observing a regression and a huge one.
> > > 
> > > On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> > > with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> > > and that takes ~2 s with the old code and ~15 s with the new one.
> > > 
> > > It helps to call shrink_all_memory() once with a sufficiently large argument
> > > before the preallocation.
> > [snip]
> > > > At last, I'd express my major concern about the transition to preallocate
> > > > based memory shrinking: will it lead to more random swapping IOs?
> > > 
> > > Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> > > is related to that ...
> > 
> > So you do have swap file enabled? hibernate_preallocate_memory() will
> > firstly try to allocate as much pages as possible(savable+free), and
> > then to free up (allocated-image_size) pages.
> 
> No.  It's going to allocate (total RAM - anticipated image size) and then free
> up (allocated-image_size) pages.

Ah yes - I didn't notice that count was subtracted here:

        for (count -= size; count > 0; count--) {

Make "count -= size" a standalone line to make that more obvious?
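
Roughly what I have in mind, as a sketch only (the loop body is not shown
above, so release_one_page() is a hypothetical stand-in, and the long type
for count and size is an assumption):

```c
#include <assert.h>

/* Hypothetical stand-in for the real loop body: just counts calls. */
static unsigned long released;

static void release_one_page(void)
{
	released++;
}

/* Current form: the subtraction is hidden in the for-initializer. */
static void free_excess_compact(long count, long size)
{
	for (count -= size; count > 0; count--)
		release_one_page();
}

/* Suggested form: the subtraction stands on its own line. */
static void free_excess_explicit(long count, long size)
{
	/* Only the pages beyond the image size are to be released. */
	count -= size;

	for (; count > 0; count--)
		release_one_page();
}
```

Both forms release the same number of pages; with the figures from the log
above (383900 allocated, 128000 kept) that is 255900 pages.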

> If we consider maximum image sizes, that means allocating slightly more than
> 50% of RAM, so it really shouldn't regress that much IMO.

Right, that would be less of a problem.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-07  1:58                                                         ` Wu Fengguang
@ 2009-05-07 12:20                                                           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 12:20 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, Wu Fengguang wrote:
> On Thu, May 07, 2009 at 04:54:09AM +0800, Rafael J. Wysocki wrote:
> > On Wednesday 06 May 2009, Wu Fengguang wrote:
> > > On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> > > > On Tuesday 05 May 2009, Wu Fengguang wrote:
> > > > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > > 
> > > > > > Since the hibernation code is now going to use allocations of memory
> > > > > > to create enough room for the image, it can also use the page frames
> > > > > > allocated at this stage as image page frames.  The low-level
> > > > > > hibernation code needs to be rearranged for this purpose, but it
> > > > > > allows us to avoid freeing a great number of pages and allocating
> > > > > > these same pages once again later, so it generally is worth doing.
> > > > > > 
> > > > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > > > >  many pages as needed to get the right image size in one shot (the
> > > > > >  excessive allocated pages are released afterwards).]
> > > > > 
> > > > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > > > >
> > > > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> > > > 
> > > > Unfortunately, I'm observing a regression and a huge one.
> > > > 
> > > > On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> > > > with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> > > > and that takes ~2 s with the old code and ~15 s with the new one.
> > > > 
> > > > It helps to call shrink_all_memory() once with a sufficiently large argument
> > > > before the preallocation.
> > > [snip]
> > > > > At last, I'd express my major concern about the transition to preallocate
> > > > > based memory shrinking: will it lead to more random swapping IOs?
> > > > 
> > > > Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> > > > is related to that ...
> > > 
> > > So you do have swap file enabled? hibernate_preallocate_memory() will
> > > firstly try to allocate as much pages as possible(savable+free), and
> > > then to free up (allocated-image_size) pages.
> > 
> > No.  It's going to allocate (total RAM - anticipated image size) and then free
> > up (allocated-image_size) pages.
> 
> Ah yes - I didn't notice that count was subtracted here:
> 
>         for (count -= size; count > 0; count--) {
> 
> Make "count -= size" a standalone line to make that more obvious?

That should be clear in the new patches:
http://patchwork.kernel.org/patch/22193/
http://patchwork.kernel.org/patch/22191/

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-05-07 12:34                                                             ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-07 12:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, May 07, 2009 at 02:20:42PM +0200, Rafael J. Wysocki wrote:
> On Thursday 07 May 2009, Wu Fengguang wrote:
> > On Thu, May 07, 2009 at 04:54:09AM +0800, Rafael J. Wysocki wrote:
> > > On Wednesday 06 May 2009, Wu Fengguang wrote:
> > > > On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> > > > > On Tuesday 05 May 2009, Wu Fengguang wrote:
> > > > > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > > > 
> > > > > > > Since the hibernation code is now going to use allocations of memory
> > > > > > > to create enough room for the image, it can also use the page frames
> > > > > > > allocated at this stage as image page frames.  The low-level
> > > > > > > hibernation code needs to be rearranged for this purpose, but it
> > > > > > > allows us to avoid freeing a great number of pages and allocating
> > > > > > > these same pages once again later, so it generally is worth doing.
> > > > > > > 
> > > > > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > > > > >  many pages as needed to get the right image size in one shot (the
> > > > > > >  excessive allocated pages are released afterwards).]
> > > > > > 
> > > > > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > > > > >
> > > > > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > > > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> > > > > 
> > > > > Unfortunately, I'm observing a regression and a huge one.
> > > > > 
> > > > > On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> > > > > with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> > > > > and that takes ~2 s with the old code and ~15 s with the new one.
> > > > > 
> > > > > It helps to call shrink_all_memory() once with a sufficiently large argument
> > > > > before the preallocation.
> > > > [snip]
> > > > > > At last, I'd express my major concern about the transition to preallocate
> > > > > > based memory shrinking: will it lead to more random swapping IOs?
> > > > > 
> > > > > Hmm.  I don't see immediately why would it.  Maybe the regression I'm seeing
> > > > > is related to that ...
> > > > 
> > > > So you do have swap file enabled? hibernate_preallocate_memory() will
> > > > firstly try to allocate as much pages as possible(savable+free), and
> > > > then to free up (allocated-image_size) pages.
> > > 
> > > No.  It's going to allocate (total RAM - anticipated image size) and then free
> > > up (allocated-image_size) pages.
> > 
> > Ah yes - I didn't notice that count was subtracted here:
> > 
> >         for (count -= size; count > 0; count--) {
> > 
> > Make "count -= size" a standalone line to make that more obvious?
> 
> That should be clear in the new patches:
> http://patchwork.kernel.org/patch/22193/
> http://patchwork.kernel.org/patch/22191/

Yes, thanks! That's much better :)

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 18:09                                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 18:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wednesday 06 May 2009, Andrew Morton wrote:
> On Wed, 6 May 2009 01:20:34 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday 06 May 2009, Andrew Morton wrote:
> > > On Wed, 6 May 2009 00:19:35 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > > > +			&& !processes_are_frozen()) {
> > > > > >  		if (!try_set_zone_oom(zonelist, gfp_mask)) {
> > > > > >  			schedule_timeout_uninterruptible(1);
> > > > > >  			goto restart;
> > > > > 
> > > > > Cool, that looks like the semantics of __GFP_NO_OOM_KILL without requiring 
> > > > > a new gfp flag.  Thanks.
> > > > 
> > > > Well, you're welcome.
> > > > 
> > > > BTW, I think that Andrew was actually right when he asked if I checked whether
> > > > the existing __GFP_NORETRY would work as-is for __GFP_FS set and
> > > > __GFP_NORETRY unset.  Namely, in that case we never reach the code before
> > > > nopage: that checks __GFP_NORETRY, do we?
> > > > 
> > > > So I think we shouldn't modify the 'else if' condition above and check for
> > > > !processes_are_frozen() at the beginning of the block below.
> > > 
> > > Confused.
> > > 
> > > I'm suspecting that hibernation can allocate its pages with
> > > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > > will dtrt: no oom-killings.
> > > 
> > > In which case, processes_are_frozen() is not needed at all?
> > 
> > __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> > the combination.
> 
> OK.  __GFP_WAIT is the big hammer.

Unfortunately it fails too quickly with the combination as well, so it looks
like we can't use __GFP_NORETRY during hibernation.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 18:48                                                                         ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 18:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009 20:09:52 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > > > I'm suspecting that hibernation can allocate its pages with
> > > > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > > > will dtrt: no oom-killings.
> > > > 
> > > > In which case, processes_are_frozen() is not needed at all?
> > > 
> > > __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> > > the combination.
> > 
> > OK.  __GFP_WAIT is the big hammer.
> 
> Unfortunately it fails too quickly with the combination as well, so it looks
> like we can't use __GFP_NORETRY during hibernation.

hm.

So where do we stand now?

I'm not a big fan of the global application-specific state change
thing.  Something like __GFP_NO_OOM_KILL has a better chance of being
reused by other subsystems in the future, which is a good indicator.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 18:50                                                                         ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 18:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Rafael J. Wysocki wrote:

> Unfortunately it fails too quickly with the combination as well, so it looks
> like we can't use __GFP_NORETRY during hibernation.
> 

If you know that no other tasks are in the oom killer at suspend time, you 
can do what I mentioned earlier:

	struct zone *z;
	for_each_populated_zone(z)
		zone_set_flag(z, ZONE_OOM_LOCKED);

and then later

	for_each_populated_zone(z)
		zone_clear_flag(z, ZONE_OOM_LOCKED);

The only race there is if a task is currently in the oom killer and will 
subsequently clear ZONE_OOM_LOCKED for its zonelist.
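
To illustrate why pre-setting the flag keeps the OOM killer out of the way,
here is a toy userspace model.  It is only a sketch under stated assumptions:
struct model_zone and the model_*() helpers are hypothetical stand-ins, not
kernel interfaces, and the try-lock is assumed to fail as soon as any
populated zone is already marked, which is how the snippet quoted earlier in
the thread ends up sleeping and restarting instead of killing a task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory zone; not the kernel's struct zone. */
struct model_zone {
	bool populated;
	bool oom_locked;	/* models ZONE_OOM_LOCKED */
};

/* Suspend side: mark every populated zone as OOM-locked. */
void model_lock_all(struct model_zone *zones, size_t n)
{
	assert(zones != NULL);
	for (size_t i = 0; i < n; i++)
		if (zones[i].populated)
			zones[i].oom_locked = true;
}

/* Resume side: clear the mark again. */
void model_unlock_all(struct model_zone *zones, size_t n)
{
	assert(zones != NULL);
	for (size_t i = 0; i < n; i++)
		if (zones[i].populated)
			zones[i].oom_locked = false;
}

/*
 * Allocator side: the OOM killer may only run if no populated zone is
 * already locked.  On success all populated zones become locked.
 */
bool model_try_set_oom(struct model_zone *zones, size_t n)
{
	assert(zones != NULL);
	for (size_t i = 0; i < n; i++)
		if (zones[i].populated && zones[i].oom_locked)
			return false;	/* caller would sleep and retry */

	for (size_t i = 0; i < n; i++)
		if (zones[i].populated)
			zones[i].oom_locked = true;
	return true;
}
```

In this model, once model_lock_all() has run, every model_try_set_oom() call
fails until model_unlock_all() is called, so the path that would invoke the
OOM killer is never taken.  The race described above is the case where some
other task already holds the mark and later clears it.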

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 19:33                                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 19:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, Andrew Morton wrote:
> On Thu, 7 May 2009 20:09:52 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > > > I'm suspecting that hibernation can allocate its pages with
> > > > > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > > > > will dtrt: no oom-killings.
> > > > > 
> > > > > In which case, processes_are_frozen() is not needed at all?
> > > > 
> > > > __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> > > > the combination.
> > > 
> > > OK.  __GFP_WAIT is the big hammer.
> > 
> > Unfortunately it fails too quickly with the combination as well, so it looks
> > like we can't use __GFP_NORETRY during hibernation.
> 
> hm.
> 
> So where do we stand now?
> 
> I'm not a big fan of the global application-specific state change
> thing.  Something like __GFP_NO_OOM_KILL has a better chance of being
> reused by other subsystems in the future, which is a good indicator.

I'm not against __GFP_NO_OOM_KILL, but there's been some strong resistance to
adding new __GFP_FOO flags recently.  Is there any likelihood anyone else will
really need it any time soon?

The advantage of the freezer-based approach is that it disables the OOM killer
when it's not going to work anyway, so it looks like a reasonable thing to do
regardless.  IMHO.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:02                                                                             ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 20:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009 21:33:47 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Thursday 07 May 2009, Andrew Morton wrote:
> > On Thu, 7 May 2009 20:09:52 +0200
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > > > > I'm suspecting that hibernation can allocate its pages with
> > > > > > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > > > > > will dtrt: no oom-killings.
> > > > > > 
> > > > > > In which case, processes_are_frozen() is not needed at all?
> > > > > 
> > > > > __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> > > > > the combination.
> > > > 
> > > > OK.  __GFP_WAIT is the big hammer.
> > > 
> > > Unfortunately it fails too quickly with the combination as well, so it looks
> > > like we can't use __GFP_NORETRY during hibernation.
> > 
> > hm.
> > 
> > So where do we stand now?
> > 
> > I'm not a big fan of the global application-specific state change
> > thing.  Something like __GFP_NO_OOM_KILL has a better chance of being
> > reused by other subsystems in the future, which is a good indicator.
> 
> I'm not against __GFP_NO_OOM_KILL, but there's been some strong resistance to
> adding new __GFP_FOO flags recently.

We have six or seven left - hardly a crisis.

>  Is there any likelihood anyone else will
> really need it any time soon?

Dunno - people do all sorts of crazy things.  But it's more likely to
be reused than a PM-specific global!

I have no strong feelings really, but slotting into the existing
technique with something which might be reusable is quite a bit tidier.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:18                                                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 20:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, Andrew Morton wrote:
> On Thu, 7 May 2009 21:33:47 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Thursday 07 May 2009, Andrew Morton wrote:
> > > On Thu, 7 May 2009 20:09:52 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > > > > I'm suspecting that hibernation can allocate its pages with
> > > > > > > __GFP_FS|__GFP_WAIT|__GFP_NORETRY|__GFP_NOWARN, and the page allocator
> > > > > > > will dtrt: no oom-killings.
> > > > > > > 
> > > > > > > In which case, processes_are_frozen() is not needed at all?
> > > > > > 
> > > > > > __GFP_NORETRY alone causes it to fail relatively quickly, but I'll try with
> > > > > > the combination.
> > > > > 
> > > > > OK.  __GFP_WAIT is the big hammer.
> > > > 
> > > > Unfortunately it fails too quickly with the combination as well, so it looks
> > > > like we can't use __GFP_NORETRY during hibernation.
> > > 
> > > hm.
> > > 
> > > So where do we stand now?
> > > 
> > > I'm not a big fan of the global application-specific state change
> > > thing.  Something like __GFP_NO_OOM_KILL has a better chance of being
> > > reused by other subsystems in the future, which is a good indicator.
> > 
> > I'm not against __GFP_NO_OOM_KILL, but there's been some strong resistance to
> > adding new __GFP_FOO flags recently.
> 
> We have six or seven left - hardly a crisis.
> 
> >  Is there any likelihood anyone else will
> > really need it any time soon?
> 
> Dunno - people do all sorts of crazy things.  But it's more likely to
> be reused than a PM-specific global!
> 
> I have no strong feelings really, but slotting into the existing
> technique with something which might be reusable is quite a bit tidier.

OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
I'll use the freezer-based approach instead.
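
A minimal userspace sketch of how a flag like __GFP_NO_OOM_KILL could slot
into the existing gfp-mask technique.  The SK_* names and bit values below are
made up for illustration and are not the kernel's definitions, and the decision
logic is a simplified model rather than the page allocator's actual code.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real gfp encodings. */
typedef unsigned int sk_gfp_t;

#define SK_GFP_WAIT        (1u << 0) /* caller may sleep and reclaim */
#define SK_GFP_FS          (1u << 1) /* reclaim may call into filesystems */
#define SK_GFP_NORETRY     (1u << 2) /* fail fast instead of retrying */
#define SK_GFP_NOWARN      (1u << 3) /* no warning on failure */
#define SK_GFP_NO_OOM_KILL (1u << 4) /* proposed: never invoke the OOM killer */

/* Model: may a failing allocation with this mask reach the OOM killer? */
static bool sk_may_oom_kill(sk_gfp_t mask)
{
	if (!(mask & SK_GFP_WAIT))
		return false;	/* non-sleeping callers simply fail (model assumption) */
	if (mask & SK_GFP_NORETRY)
		return false;	/* fail-fast callers give up instead (model assumption) */
	if (mask & SK_GFP_NO_OOM_KILL)
		return false;	/* the proposed opt-out */
	return true;
}

/* Hypothetical mask for hibernation's image allocations. */
static sk_gfp_t sk_hibernation_mask(void)
{
	return SK_GFP_WAIT | SK_GFP_FS | SK_GFP_NOWARN | SK_GFP_NO_OOM_KILL;
}
```

The point of a separate bit is that it suppresses the OOM killer without also
carrying the fail-fast behaviour that made __GFP_NORETRY unusable in the tests
reported earlier in the thread.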

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:25                                                                                 ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 20:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Rafael J. Wysocki wrote:

> OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> I'll use the freezer-based approach instead.
> 

Third time I'm going to suggest this, and I'd like a response on why it's 
not possible instead of being ignored.

All of your tasks are in D state other than kthreads, right?  That means 
they won't be in the oom killer (thus no zones are oom locked), so you can 
easily do this

	struct zone *z;
	for_each_populated_zone(z)
		zone_set_flag(z, ZONE_OOM_LOCKED);

and then

	for_each_populated_zone(z)
		zone_clear_flag(z, ZONE_OOM_LOCKED);

The serialization is done with trylocks so this will never invoke the oom 
killer because all zones in the allocator's zonelist will be oom locked.

Why does this not work for you?
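
The trylock serialization described above can be modelled in userspace roughly
as follows.  This is a single-threaded sketch under stated assumptions: the
sk_* names are hypothetical, a plain bool stands in for ZONE_OOM_LOCKED, and
the locking that the real allocator needs around the flag updates is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_zone {
	bool oom_locked;	/* stands in for the ZONE_OOM_LOCKED flag */
};

/*
 * Trylock over a zonelist: succeed only if no zone is already marked,
 * and in that case mark all of them.  Never blocks.
 */
static bool sk_try_set_zone_oom(struct sk_zone **zonelist, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (zonelist[i]->oom_locked)
			return false;
	for (i = 0; i < n; i++)
		zonelist[i]->oom_locked = true;
	return true;
}

static void sk_clear_zonelist_oom(struct sk_zone **zonelist, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		zonelist[i]->oom_locked = false;
}

/* Allocation slow path: only the trylock winner may run the OOM killer. */
static bool sk_would_invoke_oom_killer(struct sk_zone **zonelist, size_t n)
{
	if (!sk_try_set_zone_oom(zonelist, n))
		return false;	/* looks like someone else is handling OOM */
	/* ... the OOM killer would run here ... */
	sk_clear_zonelist_oom(zonelist, n);
	return true;
}

/* The suggestion: pre-mark every zone while the image is being created. */
static void sk_block_oom_killer(struct sk_zone *zones, size_t n, bool block)
{
	size_t i;

	for (i = 0; i < n; i++)
		zones[i].oom_locked = block;
}
```

With every zone pre-marked, the trylock fails for any zonelist, so the
allocation path in this model backs off instead of reaching the OOM killer;
clearing the marks afterwards restores normal behaviour.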

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:35                                                                                   ` Pavel Machek
  0 siblings, 0 replies; 580+ messages in thread
From: Pavel Machek @ 2009-05-07 20:35 UTC (permalink / raw)
  To: David Rientjes
  Cc: Rafael J. Wysocki, Andrew Morton, fengguang.wu, linux-pm,
	torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu 2009-05-07 13:25:06, David Rientjes wrote:
> On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> 
> > OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> > I'll use the freezer-based approach instead.
> > 
> 
> Third time I'm going to suggest this, and I'd like a response on why it's 
> not possible instead of being ignored.
> 
> All of your tasks are in D state other than kthreads, right?  That means 
> they won't be in the oom killer (thus no zones are oom locked), so you can 
> easily do this

Well, OOM killer may be running on behalf of some kthread at that
point....? Quite unlikely, but possible AFAICT.
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:38                                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 20:38 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> 
> > OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> > I'll use the freezer-based approach instead.
> > 
> 
> Third time I'm going to suggest this, and I'd like a response on why it's 
> not possible instead of being ignored.
> 
> All of your tasks are in D state other than kthreads, right?  That means 
> they won't be in the oom killer (thus no zones are oom locked), so you can 
> easily do this
> 
> 	struct zone *z;
> 	for_each_populated_zone(z)
> 		zone_set_flag(z, ZONE_OOM_LOCKED);
> 
> and then
> 
> 	for_each_populated_zone(z)
> 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> 
> The serialization is done with trylocks so this will never invoke the oom 
> killer because all zones in the allocator's zonelist will be oom locked.
> 
> Why does this not work for you?

Well, it might work too, but why are you insisting?  How's it better than
__GFP_NO_OOM_KILL, actually?

Andrew, what do you think about this?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:40                                                                                     ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 20:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rafael J. Wysocki, Andrew Morton, fengguang.wu, linux-pm,
	torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Pavel Machek wrote:

> > Third time I'm going to suggest this, and I'd like a response on why it's 
> > not possible instead of being ignored.
> > 
> > All of your tasks are in D state other than kthreads, right?  That means 
> > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > easily do this
> 
> Well, OOM killer may be running on behalf of some kthread at that
> point....? Quite unlikely, but possible AFAICT.

The oom killer doesn't care about the task's state, so this will be a
genuine oom situation: it will kill a task (one in D state, since
kthreads are inherently immune), and that task will die when unfrozen.
That would have had to happen anyway once all tasks wake up, since the
system is completely out of memory (except for kswapd, which is given
access to memory reserves because of PF_MEMALLOC).  So you no longer need
to worry about completely blocking out the oom killer: the next kthread
to invoke it in such a situation will end up doing a no-op, because it
finds a task with TIF_MEMDIE set.
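
The "no-op because it finds a task with TIF_MEMDIE set" behaviour can be
sketched in userspace like this.  The structure, the sk_* names and the
"badness" score are hypothetical stand-ins for illustration, not the kernel's
oom_kill.c code.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_task {
	bool is_kthread;	/* kernel threads are never selected */
	bool memdie;		/* stands in for TIF_MEMDIE */
	unsigned long badness;	/* hypothetical victim score */
};

enum sk_oom_result { SK_OOM_NOOP, SK_OOM_KILLED, SK_OOM_NO_VICTIM };

/*
 * Pick the highest-scoring user task and mark it as dying.  If some task
 * is already marked, do nothing: a kill is already pending, and a frozen
 * victim will only exit once it is thawed.
 */
static enum sk_oom_result sk_oom_kill(struct sk_task *tasks, size_t n)
{
	struct sk_task *victim = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sk_task *t = &tasks[i];

		if (t->is_kthread)
			continue;
		if (t->memdie)
			return SK_OOM_NOOP;
		if (!victim || t->badness > victim->badness)
			victim = t;
	}
	if (!victim)
		return SK_OOM_NO_VICTIM;
	victim->memdie = true;
	return SK_OOM_KILLED;
}
```

In this model the first invocation marks one victim, and every later
invocation returns without selecting another task for as long as that mark
stays set.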

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:42                                                                                     ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 20:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Rafael J. Wysocki wrote:

> > Third time I'm going to suggest this, and I'd like a response on why it's 
> > not possible instead of being ignored.
> > 
> > All of your tasks are in D state other than kthreads, right?  That means 
> > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > easily do this
> > 
> > 	struct zone *z;
> > 	for_each_populated_zone(z)
> > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > 
> > and then
> > 
> > 	for_each_populated_zone(z)
> > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > 
> > The serialization is done with trylocks so this will never invoke the oom 
> > killer because all zones in the allocator's zonelist will be oom locked.
> > 
> > Why does this not work for you?
> 
> Well, it might work too, but why are you insisting?  How's it better than
> __GFP_NO_OOM_KILL, actually?
> 

Because I agree with Christoph's concerns about needlessly adding
additional gfp flags; he was responding to the proposed addition of
__GFP_PANIC, which could be handled in other, much simpler ways, just as
this flag can, as I've shown.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 20:56                                                                                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 20:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009 22:38:13 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Thursday 07 May 2009, David Rientjes wrote:
> > On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> > 
> > > OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> > > I'll use the freezer-based approach instead.
> > > 
> > 
> > Third time I'm going to suggest this, and I'd like a response on why it's 
> > not possible instead of being ignored.
> > 
> > All of your tasks are in D state other than kthreads, right?  That means 
> > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > easily do this
> > 
> > 	struct zone *z;
> > 	for_each_populated_zone(z)
> > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > 
> > and then
> > 
> > 	for_each_populated_zone(z)
> > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > 
> > The serialization is done with trylocks so this will never invoke the oom 
> > killer because all zones in the allocator's zonelist will be oom locked.
> > 
> > Why does this not work for you?
> 
> Well, it might work too, but why are you insisting?  How's it better than
> __GFP_NO_OOM_KILL, actually?
> 
> Andrew, what do you think about this?

I don't think I understand the proposal.  Is it to provide a means by
which PM can go in and set a state bit against each and every zone?  If
so, that's still a global boolean, only messier.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 21:25                                                                                       ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 21:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael J. Wysocki, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Andrew Morton wrote:

> > > All of your tasks are in D state other than kthreads, right?  That means 
> > > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > > easily do this
> > > 
> > > 	struct zone *z;
> > > 	for_each_populated_zone(z)
> > > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > > 
> > > and then
> > > 
> > > 	for_each_populated_zone(z)
> > > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > > 
> > > The serialization is done with trylocks so this will never invoke the oom 
> > > killer because all zones in the allocator's zonelist will be oom locked.
> > > 
> > > Why does this not work for you?
> > 
> > Well, it might work too, but why are you insisting?  How's it better than
> > __GFP_NO_OOM_KILL, actually?
> > 
> > Andrew, what do you think about this?
> 
> I don't think I understand the proposal.  Is it to provide a means by
> which PM can go in and set a state bit against each and every zone?  If
> so, that's still a global boolean, only messier.
> 

Why can't it be global while preallocating memory for hibernation since 
nothing but kthreads could allocate at this point and if the system is oom 
then the oom killer wouldn't be able to do anything anyway since it can't 
kill them?

The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
whether they specify it or not, since the oom killer would simply kill a 
task in D state which can't exit or free memory, and subsequent allocations 
would make the oom killer a no-op because there's an eligible task with 
TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
calling the oom killer in the first place and killing an unresponsive task, 
but that would have to happen anyway when thawed since the system is oom 
(or otherwise lock up for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
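
The "implicit no-op" argument above can be sketched as a small userspace
model.  It assumes that the oom killer refuses to pick a new victim while
any task still carries the TIF_MEMDIE marker, and that a frozen victim never
gets to exit and clear it.  The struct, fields and function below are
hypothetical illustrations, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all names here are hypothetical. */
struct model_task {
	bool kthread;		/* kernel threads are never selected */
	bool frozen;		/* in D state: cannot exit or free memory */
	bool memdie;		/* stands in for TIF_MEMDIE */
	unsigned long rss;	/* crude badness score */
};

enum model_oom_result { MODEL_OOM_NOOP, MODEL_OOM_KILLED };

/* Pick the largest non-kthread task unless a victim is already pending. */
enum model_oom_result model_oom_kill(struct model_task *t, size_t n,
				     size_t *victim)
{
	size_t best = n;

	assert(t != NULL && victim != NULL);

	/* A pending victim turns every later invocation into a no-op. */
	for (size_t i = 0; i < n; i++)
		if (t[i].memdie)
			return MODEL_OOM_NOOP;

	for (size_t i = 0; i < n; i++) {
		if (t[i].kthread)
			continue;
		if (best == n || t[i].rss > t[best].rss)
			best = i;
	}
	if (best == n)
		return MODEL_OOM_NOOP;	/* nothing killable */

	t[best].memdie = true;		/* frozen victim keeps this set */
	*victim = best;
	return MODEL_OOM_KILLED;
}

/* Memory actually released by a "killed" task in this model. */
unsigned long model_pages_freed(const struct model_task *t)
{
	return t->frozen ? 0 : t->rss;
}
```

In this model the first call marks one frozen task, frees nothing, and
every subsequent call returns MODEL_OOM_NOOP.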

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 21:36                                                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 21:36 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Andrew Morton wrote:
> 
> > > > All of your tasks are in D state other than kthreads, right?  That means 
> > > > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > > > easily do this
> > > > 
> > > > 	struct zone *z;
> > > > 	for_each_populated_zone(z)
> > > > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > > > 
> > > > and then
> > > > 
> > > > 	for_each_populated_zone(z)
> > > > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > > > 
> > > > The serialization is done with trylocks so this will never invoke the oom 
> > > > killer because all zones in the allocator's zonelist will be oom locked.
> > > > 
> > > > Why does this not work for you?
> > > 
> > > Well, it might work too, but why are you insisting?  How's it better than
> > > __GFP_NO_OOM_KILL, actually?
> > > 
> > > Andrew, what do you think about this?
> > 
> > I don't think I understand the proposal.  Is it to provide a means by
> > which PM can go in and set a state bit against each and every zone?  If
> > so, that's still a global boolean, only messier.
> > 
> 
> Why can't it be global while preallocating memory for hibernation since 
> nothing but kthreads could allocate at this point and if the system is oom 
> then the oom killer wouldn't be able to do anything anyway since it can't 
> kill them?
> 
> The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> whether it specifies it or not since the oom killer would simply kill a 
> task in D state which can't exit or free memory and subsequent allocations 
> would make the oom killer a no-op because there's an eligible task with 
> TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> calling the oom killer in a first place and killing an unresponsive task 

That's exactly what we're trying to do.  We don't want tasks to get killed just
because we're freeing memory for the hibernation image.

> but that would have to happen anyway when thawed since the system is oom 
> (or otherwise lockup for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).

Are you sure?  The image memory is freed before thawing tasks.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 21:46                                                                                           ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 21:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Rafael J. Wysocki wrote:

> > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > whether it specifies it or not since the oom killer would simply kill a 
> > task in D state which can't exit or free memory and subsequent allocations 
> > would make the oom killer a no-op because there's an eligible task with 
> > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > calling the oom killer in a first place and killing an unresponsive task 
> 
> That's exactly what we're trying to do.  We don't want tasks to get killed just
> because we're freeing memory for hibernation image.
> 

Then, again, why can't you just lock out the oom killer as I suggested if 
__GFP_NO_OOM_KILL is actually implied for all allocations when 
preallocating?  It prevents adding an unnecessary gfp flag, sprinkling it 
around in the hibernation code, and a comment would actually explain why 
it's the right thing to do (i.e. no threads other than kthreads 
could possibly be executing the oom killer and if they are oom then we'll 
have to kill a userspace task anyway when thawed).

> > but that would have to happen anyway when thawed since the system is oom 
> > (or otherwise lockup for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> 
> Are you sure?  The image memory is freed before thawing tasks.
> 

If you try to allocate any non-__GFP_NORETRY memory such as GFP_KERNEL 
with order < PAGE_ALLOC_COSTLY_ORDER and direct reclaim cannot free memory 
(and the oom killer is implicitly a no-op whether you specify 
__GFP_NO_OOM_KILL or not), then you could loop endlessly in the page 
allocator.  When allocating GFP_IMAGE you need to ensure that can't happen 
and __GFP_NORETRY may not be your best option because it could fail 
unnecessarily when reclaim could have helped.
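
The looping concern can be illustrated with a reduced userspace model of
the allocator's retry decision.  It covers only the two inputs named above
(__GFP_NORETRY and the allocation order) and ignores every other flag.  The
constants are hypothetical mirrors of the kernel names, and the comparison
is treated as inclusive here, which is an assumption; the text above writes
it as strict.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; values and names are hypothetical stand-ins. */
#define MODEL_GFP_NORETRY	0x1u	/* mirrors __GFP_NORETRY */
#define MODEL_COSTLY_ORDER	3u	/* mirrors PAGE_ALLOC_COSTLY_ORDER */

/*
 * Decide whether a failed allocation attempt goes around again.
 * Reclaim progress is deliberately not consulted for small orders,
 * which is what makes an endless loop possible when nothing can be freed.
 */
bool model_should_retry(unsigned int gfp, unsigned int order)
{
	assert(order < 32);		/* sanity bound for the model */

	if (gfp & MODEL_GFP_NORETRY)
		return false;		/* caller accepts failure */
	return order <= MODEL_COSTLY_ORDER;
}

/* Count loop iterations until the model gives up, capped at 'limit'. */
unsigned int model_alloc_attempts(unsigned int gfp, unsigned int order,
				  unsigned int limit)
{
	unsigned int attempts = 1;	/* the first try always happens */

	while (attempts < limit && model_should_retry(gfp, order))
		attempts++;		/* reclaim freed nothing; retry */
	return attempts;
}
```

An order-0 request without MODEL_GFP_NORETRY only stops at the artificial
cap, while the same request with the flag stops after one attempt, possibly
earlier than reclaim would have needed.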

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 21:50                                                                                         ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 21:50 UTC (permalink / raw)
  To: David Rientjes
  Cc: rjw, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009 14:25:23 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:

> On Thu, 7 May 2009, Andrew Morton wrote:
> 
> > > > All of your tasks are in D state other than kthreads, right?  That means 
> > > > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > > > easily do this
> > > > 
> > > > 	struct zone *z;
> > > > 	for_each_populated_zone(z)
> > > > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > > > 
> > > > and then
> > > > 
> > > > 	for_each_populated_zone(z)
> > > > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > > > 
> > > > The serialization is done with trylocks so this will never invoke the oom 
> > > > killer because all zones in the allocator's zonelist will be oom locked.
> > > > 
> > > > Why does this not work for you?
> > > 
> > > Well, it might work too, but why are you insisting?  How's it better than
> > > __GFP_NO_OOM_KILL, actually?
> > > 
> > > Andrew, what do you think about this?
> > 
> > I don't think I understand the proposal.  Is it to provide a means by
> > which PM can go in and set a state bit against each and every zone?  If
> > so, that's still a global boolean, only messier.
> > 
> 
> Why can't it be global while preallocating memory for hibernation since 
> nothing but kthreads could allocate at this point and if the system is oom 
> then the oom killer wouldn't be able to do anything anyway since it can't 
> kill them?

- globals are bad

- the standard way of controlling memory allocator behaviour is via
  the gfp_t.  Bypassing that is an unusual step and needs a higher
  level of justification, which I'm not seeing here.

- if we do this via an unusual global, we reduce the chances that
  another subsystem could use the new feature.

  I don't know what subsystem that might be, but I bet they're out
  there.  checkpoint-restart, virtual machines, ballooning memory
  drivers, kexec loading, etc.
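
To illustrate the gfp_t point, here is a minimal userspace sketch, not
kernel code.  The bit values and the helper sketch_may_oom_kill() are
assumptions made up for illustration; the actual patch may define the flag
and its check quite differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's gfp_t. */
typedef unsigned int gfp_t;

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_GFP_WAIT        0x10u
#define SKETCH_GFP_NO_OOM_KILL 0x400000u

/*
 * The caller states its wish per allocation through the mask;
 * no global state is consulted to reach the decision.
 */
static inline bool sketch_may_oom_kill(gfp_t gfp_mask)
{
	return !(gfp_mask & SKETCH_GFP_NO_OOM_KILL);
}
```

Any subsystem could then opt out for one allocation by OR-ing the flag
into its mask, e.g. sketch_may_oom_kill(SKETCH_GFP_WAIT |
SKETCH_GFP_NO_OOM_KILL) is false, while other callers are unaffected.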

> The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> whether it specifies it or not since the oom killer would simply kill a 
> task in D state which can't exit or free memory and subsequent allocations 
> would make the oom killer a no-op because there's an eligible task with 
> TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> calling the oom killer in a first place and killing an unresponsive task 
> but that would have to happen anyway when thawed since the system is oom 
> (or otherwise lockup for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).

All the above is specific to the PM application only, when userspace
tasks are stopped.


It might well end up that stopping userspace (beforehand or before
oom-killing) is a hard requirement for reliably disabling the
oom-killer.  Because the __GFP_NO_OOM_KILL user will be safe, but
random other allocations from other tasks will not be.  So perhaps we
_do_ need a global, and random userspace processes should test and
sleep upon that global if they're heading in the direction of the
oom-killer.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 22:05                                                                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 22:05 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> 
> > > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > > whether it specifies it or not since the oom killer would simply kill a 
> > > task in D state which can't exit or free memory and subsequent allocations 
> > > would make the oom killer a no-op because there's an eligible task with 
> > > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > > calling the oom killer in a first place and killing an unresponsive task 
> > 
> > That's exactly what we're trying to do.  We don't want tasks to get killed just
> > because we're freeing memory for hibernation image.
> > 
> 
> Then, again, why can't you just lock out the oom killer as I suggested if 
> __GFP_NO_OOM_KILL is actually implied for all allocations when 
> preallocating?  It prevents adding an unnecessary gfp flag, sprinkling it 
> around in the hibernation code,

In one place really.

> and a comment would actually explain why it's the right thing to do (i.e. no
> other threads other than kthreads could possibly be executing the oom killer
> and if they are oom then we'll have to kill a userspace task anyway when
> thawed).

Quite frankly, I prefer my freezer-based patch to this.  I'm not really
inclined to fiddle with the mm internals from within snapshot.c.

Still, I trust Andrew's experience, and that's why I'm going to try the
__GFP_NO_OOM_KILL approach first, as I already said.  If there is a problem
with it, I'm going to use the freezer-based approach.
 
> > > but that would have to happen anyway when thawed since the system is oom 
> > > (or otherwise lockup for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> > 
> > Are you sure?  The image memory is freed before thawing tasks.
> > 
> 
> If you try to allocate any non-__GFP_NORETRY memory such as GFP_KERNEL 
> with order < PAGE_ALLOC_COSTLY_ORDER and direct reclaim cannot free memory 
> (and the oom killer is implicitly a no-op whether you specify 
> __GFP_NO_OOM_KILL or not), then you could loop endlessly in the page 
> allocator.  When allocating GFP_IMAGE you need to ensure that can't happen 
> and __GFP_NORETRY may not be your best option because it could fail 
> unnecessarily when reclaim could have helped.

I'm really unsure what you mean and how that is related to your previous remark
about what's going to happen after the thawing of tasks.
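
For reference, the retry behaviour described in the quoted text can be
sketched in userspace as below.  This is only a model of that description:
the bit value, the names, and the exact comparison against the costly order
are assumptions, and the real page allocator has more cases than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's gfp_t. */
typedef unsigned int gfp_t;

#define SKETCH_GFP_NORETRY  0x1000u /* hypothetical bit value */
#define SKETCH_COSTLY_ORDER 3       /* stand-in for PAGE_ALLOC_COSTLY_ORDER */

/*
 * Models the situation where direct reclaim freed nothing and the OOM
 * killer cannot help.  Returns true if the allocation would keep
 * retrying (and so could loop endlessly), false if it would fail.
 */
static inline bool sketch_would_loop(gfp_t gfp_mask, unsigned int order)
{
	if (gfp_mask & SKETCH_GFP_NORETRY)
		return false;	/* gives up, possibly too early */
	return order < SKETCH_COSTLY_ORDER;
}
```

Under these assumptions an order-0 allocation without the no-retry bit
loops, while the same allocation with it fails immediately, which is the
trade-off mentioned for GFP_IMAGE.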

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 22:14                                                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 22:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, Andrew Morton wrote:
> On Thu, 7 May 2009 14:25:23 -0700 (PDT)
> David Rientjes <rientjes@google.com> wrote:
> 
> > On Thu, 7 May 2009, Andrew Morton wrote:
> > 
> > > > > All of your tasks are in D state other than kthreads, right?  That means 
> > > > > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > > > > easily do this
> > > > > 
> > > > > 	struct zone *z;
> > > > > 	for_each_populated_zone(z)
> > > > > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > > > > 
> > > > > and then
> > > > > 
> > > > > 	for_each_populated_zone(z)
> > > > > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > > > > 
> > > > > The serialization is done with trylocks so this will never invoke the oom 
> > > > > killer because all zones in the allocator's zonelist will be oom locked.
> > > > > 
> > > > > Why does this not work for you?
> > > > 
> > > > Well, it might work too, but why are you insisting?  How's it better than
> > > > __GFP_NO_OOM_KILL, actually?
> > > > 
> > > > Andrew, what do you think about this?
> > > 
> > > I don't think I understand the proposal.  Is it to provide a means by
> > > which PM can go in and set a state bit against each and every zone?  If
> > > so, that's still a global boolean, only messier.
> > > 
> > 
> > Why can't it be global while preallocating memory for hibernation since 
> > nothing but kthreads could allocate at this point and if the system is oom 
> > then the oom killer wouldn't be able to do anything anyway since it can't 
> > kill them?
> 
> - globals are bad
> 
> - the standard way of controlling memory allocator behaviour is via
>   the gfp_t.  Bypassing that is an unusual step and needs a higher
>   level of justification, which I'm not seeing here.
> 
> - if we do this via an unusual global, we reduce the chances that
>   another subsystem could use the new feature.
> 
>   I don't know what subsystem that might be, but I bet they're out
>   there.  checkpoint-restart, virtual machines, ballooning memory
>   drivers, kexec loading, etc.
> 
> > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > whether it specifies it or not since the oom killer would simply kill a 
> > task in D state which can't exit or free memory and subsequent allocations 
> > would make the oom killer a no-op because there's an eligible task with 
> > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > calling the oom killer in a first place and killing an unresponsive task 
> > but that would have to happen anyway when thawed since the system is oom 
> > (or otherwise lockup for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> 
> All the above is specific to the PM application only, when userspace
> tasks are stopped.
> 
> 
> It might well end up that stopping userspace (beforehand or before
> oom-killing) is a hard requirement for reliably disabling the
> oom-killer.

In fact, I think it is, and that's why I wanted to make it freezer-dependent.

IOW, you need to freeze user space completely before trying to disable the
OOM killer.  Conversely, if you _have_ frozen user space completely, the OOM
killer won't really help, so why let it run at all in that situation?
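
A rough userspace sketch of that dependency follows.  All names here are
hypothetical and this is not the freezer-based patch itself; it only shows
the ordering rule that disabling is refused until user space is frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global state, for this sketch only. */
static bool sketch_userspace_frozen;
static bool sketch_oom_disabled;

static inline void sketch_freeze_userspace(void)
{
	sketch_userspace_frozen = true;
}

static inline void sketch_thaw_userspace(void)
{
	/* Re-enable the OOM killer before tasks get to run again. */
	sketch_oom_disabled = false;
	sketch_userspace_frozen = false;
}

/* Returns 0 on success, -1 if user space is not frozen yet. */
static inline int sketch_oom_killer_disable(void)
{
	if (!sketch_userspace_frozen)
		return -1;
	sketch_oom_disabled = true;
	return 0;
}

static inline bool sketch_oom_killer_allowed(void)
{
	return !sketch_oom_disabled;
}
```

In this model a call to sketch_oom_killer_disable() before
sketch_freeze_userspace() fails and leaves the OOM killer enabled.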

FWIW, I've just posted an updated patchset with the first patch replaced by
the one introducing __GFP_NO_OOM_KILL, but perhaps I should use the
freezer-based one after all?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 22:16                                                                                           ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 22:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rjw, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Andrew Morton wrote:

> - the standard way of controlling memory allocator behaviour is via
>   the gfp_t.  Bypassing that is an unusual step and needs a higher
>   level of justification, which I'm not seeing here.
> 

The standard way of controlling the oom killer behavior for a zone is via 
the ZONE_OOM_LOCKED bit.

> - if we do this via an unusual global, we reduce the chances that
>   another subsystem could use the new feature.
> 
>   I don't know what subsystem that might be, but I bet they're out
>   there.  checkpoint-restart, virtual machines, ballooning memory
>   drivers, kexec loading, etc.
> 

There are two separate issues here: the use of ZONE_OOM_LOCKED to control 
whether or not to invoke the oom killer for a specific zone (which is 
already its only function), and the fact that in this case we're doing it 
for all zones.  It seems like you're concerned with the latter, but the 
distinction in the hibernation case is that no memory freeing would be 
possible as the result of the oom killer for _all_ zones, so it makes 
sense to lock them all out.

> > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > whether it specifies it or not since the oom killer would simply kill a 
> > task in D state which can't exit or free memory and subsequent allocations 
> > would make the oom killer a no-op because there's an eligible task with 
> > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > calling the oom killer in the first place and killing an unresponsive task 
> > but that would have to happen anyway when thawed since the system is oom 
> > (or otherwise lock up for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> 
> All the above is specific to the PM application only, when userspace
> tasks are stopped.
> 

I'm not arguing that the only way we can ever implement __GFP_NO_OOM_KILL 
is for the entire system: we can set ZONE_OOM_LOCKED for only the zones in 
the zonelist that are passed to the page allocator.  For this particular 
purpose, that is naturally all zones; for other future use cases we may 
choose to lock out only the zones we're allowed to allocate from in that 
context.
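
As a rough illustration of the per-zonelist idea, here is a userspace toy
model, not kernel code. Every toy_* name and the ZONE_OOM_LOCKED_BIT value
are hypothetical and chosen only for this sketch. It assumes the OOM killer
is skipped for an allocation whenever any zone in that allocation's zonelist
carries the flag, so locking all zones disables it globally while locking a
subset affects only allocations that may use those zones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the real kernel flag layout is not modelled here. */
#define ZONE_OOM_LOCKED_BIT (1UL << 0)

/* Toy stand-in for struct zone: only the name and the flag word matter. */
struct toy_zone {
	const char *name;
	unsigned long flags;
};

/* Mark every zone in the list as off-limits to the OOM killer. */
static void toy_lock_zones(struct toy_zone **zones, size_t nr)
{
	assert(zones != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++)
		zones[i]->flags |= ZONE_OOM_LOCKED_BIT;
}

/* Undo toy_lock_zones() for the same list. */
static void toy_unlock_zones(struct toy_zone **zones, size_t nr)
{
	assert(zones != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++)
		zones[i]->flags &= ~ZONE_OOM_LOCKED_BIT;
}

/* The OOM killer may act for an allocation only if none of its zones is locked. */
static bool toy_oom_allowed(struct toy_zone **zones, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (zones[i]->flags & ZONE_OOM_LOCKED_BIT)
			return false;
	return true;
}
```

For hibernation the list passed to toy_lock_zones() would hold every zone;
a narrower user such as memory hot-remove would pass only the zones it cares
about.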

> It might well end up that stopping userspace (beforehand or before
> oom-killing) is a hard requirement for reliably disabling the
> oom-killer.

Yes, globally, but future use cases may disable only specific zones such 
as with memory hot-remove.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-07 22:14                                                                                           ` Rafael J. Wysocki
  (?)
  (?)
@ 2009-05-07 22:38                                                                                           ` Andrew Morton
  2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
  2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
  -1 siblings, 2 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 22:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Fri, 8 May 2009 00:14:48 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> IOW, you need to freeze the user space totally before trying to disable the
> OOM killer.

Not necessarily.  We only need to take action if a task is about to
start oom-killing - presumably by taking a nap.

If a process is sitting there happily computing pi then we can leave it
running.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 22:45                                                                                             ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 22:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: rjw, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009 15:16:17 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:

> On Thu, 7 May 2009, Andrew Morton wrote:
> 
> > - the standard way of controlling memory allocator behaviour is via
> >   the gfp_t.  Bypassing that is an unusual step and needs a higher
> >   level of justification, which I'm not seeing here.
> > 
> 
> The standard way of controlling the oom killer behavior for a zone is via 
> the ZONE_OOM_LOCKED bit.

oop, I didn't remember/realise that ZONE_OOM_LOCKED already exists.

> > - if we do this via an unusual global, we reduce the chances that
> >   another subsystem could use the new feature.
> > 
> >   I don't know what subsystem that might be, but I bet they're out
> >   there.  checkpoint-restart, virtual machines, ballooning memory
> >   drivers, kexec loading, etc.
> > 
> 
> There are two separate issues here: the use of ZONE_OOM_LOCKED to control 
> whether or not to invoke the oom killer for a specific zone (which is 
> already its only function), and the fact that in this case we're doing it 
> for all zones.  It seems like you're concerned with the latter, but the 
> distinction in the hibernation case is that no memory freeing would be 
> possible as the result of the oom killer for _all_ zones, so it makes 
> sense to lock them all out.

OK.

> > > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > > whether it specifies it or not since the oom killer would simply kill a 
> > > task in D state which can't exit or free memory and subsequent allocations 
> > > would make the oom killer a no-op because there's an eligible task with 
> > > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > > calling the oom killer in the first place and killing an unresponsive task 
> > > but that would have to happen anyway when thawed since the system is oom 
> > > (or otherwise lock up for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> > 
> > All the above is specific to the PM application only, when userspace
> > tasks are stopped.
> > 
> 
> I'm not arguing that the only way we can ever implement __GFP_NO_OOM_KILL 
> is for the entire system: we can set ZONE_OOM_LOCKED for only the zones in 
> the zonelist that are passed to the page allocator.  For this particular 
> purpose, that is naturally all zones; for other future use cases it may be 
> chosen only to lock out the zones we're allowed to allocate from in that 
> context.

OK.

> > It might well end up that stopping userspace (beforehand or before
> > oom-killing) is a hard requirement for reliably disabling the
> > oom-killer.
> 
> Yes, globally, but future use cases may disable only specific zones such 
> as with memory hot-remove.

<goes off to find out what ZONE_OOM_LOCKED does>

That took remarkably longer than one would have expected..

Yes, OK, I agree, globally setting ZONE_OOM_LOCKED would produce a
decent result.

The setting and clearing of that thing looks gruesomely racy..


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-07 22:38                                                                                           ` Andrew Morton
@ 2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
  2009-05-07 23:15                                                                                               ` Andrew Morton
  2009-05-07 23:15                                                                                                 ` Andrew Morton
  2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
  1 sibling, 2 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 22:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, Andrew Morton wrote:
> On Fri, 8 May 2009 00:14:48 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > IOW, you need to freeze the user space totally before trying to disable the
> > OOM killer.
> 
> Not necessarily.  We only need to take action if a task is about to
> start oom-killing - presumably by taking a nap.
> 
> If a process is sitting there happily computing pi then we can leave it
> running.

Well, the point is we don't really know what the task is going to do next.
Is it going to continue computing pi, or is it going to execl(huge_binary), for
example?

If we knew what tasks were going to do in advance, the whole freezing wouldn't
really be necessary. :-)

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 22:59                                                                                               ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-07 22:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rjw, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Thu, 7 May 2009, Andrew Morton wrote:

> The setting and clearing of that thing looks gruesomely racy..
> 

It's not racy currently because zone_scan_lock ensures ZONE_OOM_LOCKED 
gets test/set and cleared atomically for the entire zonelist (the clear 
happens for the same zonelist that was test/set).
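
The all-or-nothing behaviour described above can be sketched in userspace
as follows. This is only a model under stated assumptions: a pthread mutex
stands in for zone_scan_lock, a plain bool stands in for the ZONE_OOM_LOCKED
bit, and all toy_* names are hypothetical rather than the kernel's actual
helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct zone; oom_locked models the ZONE_OOM_LOCKED bit. */
struct toy_zone {
	bool oom_locked;
};

/* One lock serialises every scan, playing the role of zone_scan_lock. */
static pthread_mutex_t toy_scan_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Claim all zones in the list, or none of them.  Returns false, leaving
 * every zone untouched, if any zone is already claimed by someone else.
 */
static bool toy_try_set_zonelist_oom(struct toy_zone **zones, size_t nr)
{
	bool ok = true;

	assert(zones != NULL || nr == 0);
	pthread_mutex_lock(&toy_scan_lock);
	for (size_t i = 0; i < nr; i++) {
		if (zones[i]->oom_locked) {
			ok = false;
			break;
		}
	}
	if (ok) {
		for (size_t i = 0; i < nr; i++)
			zones[i]->oom_locked = true;
	}
	pthread_mutex_unlock(&toy_scan_lock);
	return ok;
}

/* Release the same list that was claimed by a successful try-set. */
static void toy_clear_zonelist_oom(struct toy_zone **zones, size_t nr)
{
	assert(zones != NULL || nr == 0);
	pthread_mutex_lock(&toy_scan_lock);
	for (size_t i = 0; i < nr; i++)
		zones[i]->oom_locked = false;
	pthread_mutex_unlock(&toy_scan_lock);
}
```

Because the test and the set happen under one lock, two callers whose
zonelists overlap can never both succeed; the loser sees the claim and
backs off without marking any zone.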

Using it for hibernation in the way I've proposed will open it up to the 
race I described earlier: a kthread that is in the oom killer will 
subsequently clear ZONE_OOM_LOCKED for its zonelist (all other tasks are 
frozen, so they can't be in the oom killer).  That's perfectly acceptable, 
however.  If kthreads can't get memory, the system is by definition already 
oom, so it will end up killing a user task even though that task is stuck in 
D state and will only exit on thaw.  We aren't concerned about killing 
needlessly, because the oom killer becomes a no-op when it finds a task with 
TIF_MEMDIE set, i.e. one that has already been killed but hasn't exited.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 23:11                                                                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 23:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Andrew Morton wrote:
> 
> > The setting and clearing of that thing looks gruesomely racy..
> > 
> 
> It's not racy currently because zone_scan_lock ensures ZONE_OOM_LOCKED 
> gets test/set and cleared atomically for the entire zonelist (the clear 
> happens for the same zonelist that was test/set).
> 
> Using it for hibernation in the way I've proposed will open it up to the 
> race I earlier described: when a kthread is in the oom killer and 
> subsequently clears its zonelist of ZONE_OOM_LOCKED (all other tasks are 
> frozen so they can't be in the oom killer).  That's perfectly acceptable, 
> however, since the system is by definition already oom if kthreads can't 
> get memory so it will end up killing a user task even though it's stuck in 
> D state and will exit on thaw; we aren't concerned about killing 
> needlessly because the oom killer becomes a no-op when it finds a task 
> that has already been killed but hasn't exited by way of TIF_MEMDIE.

OK there.

So everyone seems to agree we can do something like in the patch below?

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM/Hibernate: Rework shrinking of memory

Rework swsusp_shrink_memory() so that it calls shrink_all_memory()
just once to make some room for the image and then allocates memory
to apply more pressure to the memory management subsystem, if
necessary.

Unfortunately, we don't seem to be able to drop shrink_all_memory()
entirely just yet, because that would lead to huge performance
regressions in some test cases.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |  151 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 104 insertions(+), 47 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1066,69 +1066,126 @@ void swsusp_free(void)
 	buffer = NULL;
 }
 
+/* Helper functions used for the shrinking of memory. */
+
 /**
- *	swsusp_shrink_memory -  Try to free as much memory as needed
- *
- *	... but do not OOM-kill anyone
+ * preallocate_image_memory - Allocate given number of page frames
+ * @nr_pages: Number of page frames to allocate
  *
- *	Notice: all userland should be stopped before it is called, or
- *	livelock is possible.
+ * Return value: Number of page frames actually allocated
  */
-
-#define SHRINK_BITE	10000
-static inline unsigned long __shrink_memory(long tmp)
+static unsigned long preallocate_image_memory(unsigned long nr_pages)
 {
-	if (tmp > SHRINK_BITE)
-		tmp = SHRINK_BITE;
-	return shrink_all_memory(tmp);
+	unsigned long nr_alloc = 0;
+
+	while (nr_pages > 0) {
+		if (!alloc_image_page(GFP_KERNEL | __GFP_NOWARN))
+			break;
+		nr_pages--;
+		nr_alloc++;
+	}
+
+	return nr_alloc;
 }
 
+/**
+ * swsusp_shrink_memory -  Make the kernel release as much memory as needed
+ *
+ * To create a hibernation image it is necessary to make a copy of every page
+ * frame in use.  We also need a number of page frames to be free during
+ * hibernation for allocations made while saving the image and for device
+ * drivers, in case they need to allocate memory from their hibernation
+ * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
+ * respectively, both of which are rough estimates).  To make this happen, we
+ * compute the total number of available page frames and allocate at least
+ *
+ * ([page frames total] + PAGES_FOR_IO + [metadata pages]) / 2 + 2 * SPARE_PAGES
+ *
+ * of them, which corresponds to the maximum size of a hibernation image.
+ *
+ * If image_size is set below the number following from the above formula,
+ * the preallocation of memory is continued until the total number of page
+ * frames in use is below the requested image size or it is impossible to
+ * allocate more memory, whichever happens first.
+ */
 int swsusp_shrink_memory(void)
 {
-	long tmp;
 	struct zone *zone;
-	unsigned long pages = 0;
-	unsigned int i = 0;
-	char *p = "-\\|/";
+	unsigned long saveable, size, max_size, count, pages = 0;
 	struct timeval start, stop;
+	int error = 0;
 
-	printk(KERN_INFO "PM: Shrinking memory...  ");
+	printk(KERN_INFO "PM: Shrinking memory ... ");
 	do_gettimeofday(&start);
-	do {
-		long size, highmem_size;
 
-		highmem_size = count_highmem_pages();
-		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
-		tmp = size;
-		size += highmem_size;
-		for_each_populated_zone(zone) {
-			tmp += snapshot_additional_pages(zone);
-			if (is_highmem(zone)) {
-				highmem_size -=
-					zone_page_state(zone, NR_FREE_PAGES);
-			} else {
-				tmp -= zone_page_state(zone, NR_FREE_PAGES);
-				tmp += zone->lowmem_reserve[ZONE_NORMAL];
-			}
-		}
+	/* Count the number of saveable data pages. */
+	saveable = count_data_pages() + count_highmem_pages();
 
-		if (highmem_size < 0)
-			highmem_size = 0;
+	/*
+	 * Compute the total number of page frames we can use (count) and the
+	 * number of pages needed for image metadata (size).
+	 */
+	count = saveable;
+	size = 0;
+	for_each_populated_zone(zone) {
+		size += snapshot_additional_pages(zone);
+		count += zone_page_state(zone, NR_FREE_PAGES);
+		count -= zone->pages_min;
+	}
 
-		tmp += highmem_size;
-		if (tmp > 0) {
-			tmp = __shrink_memory(tmp);
-			if (!tmp)
-				return -ENOMEM;
-			pages += tmp;
-		} else if (size > image_size / PAGE_SIZE) {
-			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
-			pages += tmp;
-		}
-		printk("\b%c", p[i++%4]);
-	} while (tmp > 0);
+	/* Compute the maximum number of saveable pages to leave in memory. */
+	max_size = (count - (size + PAGES_FOR_IO)) / 2 - 2 * SPARE_PAGES;
+	size = DIV_ROUND_UP(image_size, PAGE_SIZE);
+	if (size > max_size)
+		size = max_size;
+	/*
+	 * If the maximum is not less than the current number of saveable pages
+	 * in memory, we don't need to do anything more.
+	 */
+	if (size >= saveable)
+		goto out;
+
+	/*
+	 * Let the memory management subsystem know that we're going to need a
+	 * large number of page frames to allocate and make it free some memory.
+	 * NOTE: If this is not done, performance is heavily affected in some
+	 * test cases.
+	 */
+	shrink_all_memory(saveable - size);
+
+	/*
+	 * Prevent the OOM killer from triggering while we're allocating image
+	 * memory.
+	 */
+	for_each_populated_zone(zone)
+		zone_set_flag(zone, ZONE_OOM_LOCKED);
+	/*
+	 * The number of saveable pages in memory was too high, so apply some
+	 * pressure to decrease it.  First, make room for the largest possible
+	 * image and fail if that doesn't work.  Next, try to decrease the size
+	 * of the image as much as indicated by image_size.
+	 */
+	count -= max_size;
+	pages = preallocate_image_memory(count);
+	if (pages < count)
+		error = -ENOMEM;
+	else
+		pages += preallocate_image_memory(max_size - size);
+
+	for_each_populated_zone(zone)
+		zone_clear_flag(zone, ZONE_OOM_LOCKED);
+
+	/* Release all of the preallocated page frames. */
+	swsusp_free();
+
+	if (error) {
+		printk(KERN_CONT "\n");
+		return error;
+	}
+
+ out:
 	do_gettimeofday(&stop);
-	printk("\bdone (%lu pages freed)\n", pages);
+	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
 	swsusp_show_speed(&start, &stop, pages, "Freed");
 
 	return 0;
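
The arithmetic in the patch can be checked with a small user-space sketch. It is not kernel code: plan_shrink(), simulate_prealloc(), struct shrink_plan and the SKETCH_* constants are hypothetical names made up for illustration. The constants assume 4 KiB pages; the 1 MiB SPARE_PAGES figure is an assumption, not something stated in this thread.

```c
/*
 * Assumed constants for 4 KiB pages.  The thread only says that
 * PAGES_FOR_IO is "4MB worth of pages"; the SPARE_PAGES value
 * (1 MiB worth) is an assumption made for this illustration.
 */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_PAGES_FOR_IO	((4096UL * 1024) / SKETCH_PAGE_SIZE)
#define SKETCH_SPARE_PAGES	((1024UL * 1024) / SKETCH_PAGE_SIZE)

struct shrink_plan {
	unsigned long max_size;		/* largest possible image, in pages */
	unsigned long target;		/* min(image_size in pages, max_size) */
	unsigned long must_alloc;	/* stage 1: count - max_size */
	unsigned long may_alloc;	/* stage 2: max_size - target */
};

/*
 * Mirrors the computation in the patch.  Returns 0 if the saveable pages
 * already fit (the "goto out" case), 1 if preallocation is needed.
 * Inputs are assumed large enough that the unsigned math does not wrap.
 */
static int plan_shrink(unsigned long saveable, unsigned long free_pages,
		       unsigned long reserved, unsigned long metadata,
		       unsigned long image_size_bytes, struct shrink_plan *p)
{
	unsigned long count = saveable + free_pages - reserved;

	p->max_size = (count - (metadata + SKETCH_PAGES_FOR_IO)) / 2
			- 2 * SKETCH_SPARE_PAGES;
	p->target = (image_size_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	if (p->target > p->max_size)
		p->target = p->max_size;
	p->must_alloc = 0;
	p->may_alloc = 0;
	if (p->target >= saveable)
		return 0;
	p->must_alloc = count - p->max_size;
	p->may_alloc = p->max_size - p->target;
	return 1;
}

/*
 * Mirrors the two-stage preallocation: stage 1 is mandatory (failure maps
 * to -ENOMEM in the patch, -1 here), stage 2 is best effort.
 * "allocatable" stands in for what a mock allocator could hand out.
 */
static int simulate_prealloc(const struct shrink_plan *p,
			     unsigned long allocatable, unsigned long *pages)
{
	unsigned long got;

	got = allocatable < p->must_alloc ? allocatable : p->must_alloc;
	*pages = got;
	if (got < p->must_alloc)
		return -1;
	allocatable -= got;
	*pages += allocatable < p->may_alloc ? allocatable : p->may_alloc;
	return 0;
}
```

With saveable = 200000, free = 60000, reserved = 2000, metadata = 100 and image_size = 500 MiB, count is 258000, max_size comes out as 127926, and the mandatory allocation is 130074 pages. That agrees with the formula in the kernel-doc comment: (258000 + 1024 + 100) / 2 + 2 * 256 = 130074.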

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 23:15                                                                                                 ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-07 23:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Fri, 8 May 2009 00:50:41 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Friday 08 May 2009, Andrew Morton wrote:
> > On Fri, 8 May 2009 00:14:48 +0200
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > IOW, you need to freeze the user space totally before trying to disable the
> > > OOM killer.
> > 
> > Not necessarily.  We only need to take action if a task is about to
> > start oom-killing - presumably by taking a nap.
> > 
> > If a process is sitting there happily computing pi then we can leave it
> > running.
> 
> Well, the point is we don't really know what the task is going to do next.
> Is it going to continue computing pi, or is it going to execl(huge_binary), for
> example?
> 
> If we knew what tasks were going to do in advance, the whole freezing wouldn't
> really be necessary. :-)

argh.  Third time:

- if the task is computing pi, let it do so.

- if the task tries to allocate memory and succeeds, let it proceed.

- if the task tries to allocate memory and fails and then tries to invoke
  the oom-killer, stop the task.
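
The three cases above can be written down as a tiny decision function. This is only a sketch of the policy as stated in this message, not an existing kernel interface: the enum names and on_task_event() are hypothetical, and the "oom_killer_disabled" flag is an assumed stand-in for "hibernation has asked for the OOM killer to stay quiet".

```c
#include <stdbool.h>

/* What a task was observed doing (hypothetical classification). */
enum task_event {
	TASK_COMPUTING,		/* e.g. happily computing pi */
	TASK_ALLOC_SUCCEEDED,	/* asked for memory and got it */
	TASK_ALLOC_FAILED,	/* asked for memory, failed, heading for OOM */
};

/* What to do with the task in response. */
enum task_action {
	TASK_CONTINUE,
	TASK_STOP,
	TASK_INVOKE_OOM_KILLER,
};

static enum task_action on_task_event(enum task_event ev,
				      bool oom_killer_disabled)
{
	switch (ev) {
	case TASK_COMPUTING:
	case TASK_ALLOC_SUCCEEDED:
		/* Cases 1 and 2: leave the task alone. */
		return TASK_CONTINUE;
	case TASK_ALLOC_FAILED:
		/* Case 3: stop the task instead of letting it OOM-kill. */
		return oom_killer_disabled ? TASK_STOP
					   : TASK_INVOKE_OOM_KILLER;
	}
	return TASK_CONTINUE;
}
```

The point of the sketch is that only the third case needs any action, so tasks do not have to be frozen up front for this particular purpose.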


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-07 23:24                                                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-07 23:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, Andrew Morton wrote:
> On Fri, 8 May 2009 00:50:41 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Friday 08 May 2009, Andrew Morton wrote:
> > > On Fri, 8 May 2009 00:14:48 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > IOW, you need to freeze the user space totally before trying to disable the
> > > > OOM killer.
> > > 
> > > Not necessarily.  We only need to take action if a task is about to
> > > start oom-killing - presumably by taking a nap.
> > > 
> > > If a process is sitting there happily computing pi then we can leave it
> > > running.
> > 
> > Well, the point is we don't really know what the task is going to do next.
> > Is it going to continue computing pi, or is it going to execl(huge_binary), for
> > example?
> > 
> > If we knew what tasks were going to do in advance, the whole freezing wouldn't
> > really be necessary. :-)
> 
> argh.  Third time:
> 
> - if the task is computing pi, let it do so.
> 
> - if the task tries to allocate memory and succeeds, let it proceed.
> 
> - if the task tries to allocate memory and fails and then tries to invoke
>   the oom-killer, stop the task.

Understood.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-08  1:16                                                                                                   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 580+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-08  1:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Rientjes, Andrew Morton, fengguang.wu, linux-pm, pavel,
	torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Fri, 8 May 2009 01:11:30 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> +	for_each_populated_zone(zone)
> +		zone_set_flag(zone, ZONE_OOM_LOCKED);

> +	for_each_populated_zone(zone)
> +		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> +

Isn't it better to make above 2 be functions and move to mm/oom_kill.c ?

Thanks,
-Kame
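
One possible shape for the suggestion above is a pair of helpers that wrap the two loops. The following is a user-space mock, not kernel code: struct mock_zone, the mock_zones array, and the names oom_lock_all_zones() / oom_unlock_all_zones() are hypothetical and only stand in for for_each_populated_zone() with zone_set_flag() / zone_clear_flag() on ZONE_OOM_LOCKED.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for struct zone: only what the two loops touch. */
struct mock_zone {
	bool populated;		/* for_each_populated_zone() skips empty zones */
	bool oom_locked;	/* stands in for the ZONE_OOM_LOCKED flag */
};

#define NR_MOCK_ZONES 3

static struct mock_zone mock_zones[NR_MOCK_ZONES] = {
	{ .populated = true },
	{ .populated = false },
	{ .populated = true },
};

/* Hypothetical helper: mark every populated zone as OOM-locked. */
static void oom_lock_all_zones(void)
{
	size_t i;

	for (i = 0; i < NR_MOCK_ZONES; i++)
		if (mock_zones[i].populated)
			mock_zones[i].oom_locked = true;
}

/* Hypothetical helper: undo oom_lock_all_zones(). */
static void oom_unlock_all_zones(void)
{
	size_t i;

	for (i = 0; i < NR_MOCK_ZONES; i++)
		if (mock_zones[i].populated)
			mock_zones[i].oom_locked = false;
}

/* Test aid: how many zones currently carry the flag. */
static size_t count_oom_locked(void)
{
	size_t i, n = 0;

	for (i = 0; i < NR_MOCK_ZONES; i++)
		if (mock_zones[i].oom_locked)
			n++;
	return n;
}
```

Keeping the flag handling behind two named functions would let the hibernation code call them around the preallocation without open-coding the zone loops; where such helpers would actually live is the question raised in the message.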


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-08  1:16                                                                                                   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 580+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-08  1:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Rientjes, Andrew Morton,
	fengguang.wu-ral2JQCrhuEAvxtiuMwx3w,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA

On Fri, 8 May 2009 01:11:30 +0200
"Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org> wrote:

> +	for_each_populated_zone(zone)
> +		zone_set_flag(zone, ZONE_OOM_LOCKED);

> +	for_each_populated_zone(zone)
> +		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> +

Isn't it better to make above 2 be functions and move to mm/oom_kill.c ?

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-08  9:50                                                                                                   ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-08  9:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Fri, May 08, 2009 at 07:11:30AM +0800, Rafael J. Wysocki wrote:
> On Friday 08 May 2009, David Rientjes wrote:
> > On Thu, 7 May 2009, Andrew Morton wrote:
> > 
> > > The setting and clearing of that thing looks gruesomely racy..
> > > 
> > 
> > It's not racy currently because zone_scan_lock ensures ZONE_OOM_LOCKED 
> > gets test/set and cleared atomically for the entire zonelist (the clear 
> > happens for the same zonelist that was test/set).
> > 
> > Using it for hibernation in the way I've proposed will open it up to the 
> > race I earlier described: when a kthread is in the oom killer and 
> > subsequently clears its zonelist of ZONE_OOM_LOCKED (all other tasks are 
> > frozen so they can't be in the oom killer).  That's perfectly acceptable, 
> > however, since the system is by definition already oom if kthreads can't 
> > get memory so it will end up killing a user task even though it's stuck in 
> > D state and will exit on thaw; we aren't concerned about killing 
> > needlessly because the oom killer becomes a no-op when it finds a task 
> > that has already been killed but hasn't exited by way of TIF_MEMDIE.
> 
> OK there.
> 
> So everyone seems to agree we can do something like in the patch below?
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM/Hibernate: Rework shrinking of memory
> 
> Rework swsusp_shrink_memory() so that it calls shrink_all_memory()
> just once to make some room for the image and then allocates memory
> to apply more pressure to the memory management subsystem, if
> necessary.

Thanks! Reducing this to a single pass helps laptops with plenty of memory considerably :)

> Unfortunately, we don't seem to be able to drop shrink_all_memory()
> entirely just yet, because that would lead to huge performance
> regressions in some test cases.

Yes, but it's not the fault of this patch. In fact, some regressions
may even put useful pressure on the page allocation/reclaim code ;)

> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  kernel/power/snapshot.c |  151 +++++++++++++++++++++++++++++++++---------------
>  1 file changed, 104 insertions(+), 47 deletions(-)
> 
> Index: linux-2.6/kernel/power/snapshot.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/snapshot.c
> +++ linux-2.6/kernel/power/snapshot.c
> @@ -1066,69 +1066,126 @@ void swsusp_free(void)
>  	buffer = NULL;
>  }
>  
> +/* Helper functions used for the shrinking of memory. */
> +
>  /**
> - *	swsusp_shrink_memory -  Try to free as much memory as needed
> - *
> - *	... but do not OOM-kill anyone
> + * preallocate_image_memory - Allocate given number of page frames
> + * @nr_pages: Number of page frames to allocate
>   *
> - *	Notice: all userland should be stopped before it is called, or
> - *	livelock is possible.
> + * Return value: Number of page frames actually allocated
>   */
> -
> -#define SHRINK_BITE	10000
> -static inline unsigned long __shrink_memory(long tmp)
> +static unsigned long preallocate_image_memory(unsigned long nr_pages)
>  {
> -	if (tmp > SHRINK_BITE)
> -		tmp = SHRINK_BITE;
> -	return shrink_all_memory(tmp);
> +	unsigned long nr_alloc = 0;
> +
> +	while (nr_pages > 0) {
> +		if (!alloc_image_page(GFP_KERNEL | __GFP_NOWARN))
> +			break;
> +		nr_pages--;
> +		nr_alloc++;
> +	}
> +
> +	return nr_alloc;
>  }
>  
> +/**
> + * swsusp_shrink_memory -  Make the kernel release as much memory as needed
> + *
> + * To create a hibernation image it is necessary to make a copy of every page
> + * frame in use.  We also need a number of page frames to be free during
> + * hibernation for allocations made while saving the image and for device
> + * drivers, in case they need to allocate memory from their hibernation
> + * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
> + * respectively, both of which are rough estimates).  To make this happen, we
> + * compute the total number of available page frames and allocate at least
> + *
> + * ([page frames total] + PAGES_FOR_IO + [metadata pages]) / 2 + 2 * SPARE_PAGES
> + *
> + * of them, which corresponds to the maximum size of a hibernation image.
> + *
> + * If image_size is set below the number following from the above formula,
> + * the preallocation of memory is continued until the total number of page
> + * frames in use is below the requested image size or it is impossible to
> + * allocate more memory, whichever happens first.
> + */
>  int swsusp_shrink_memory(void)
>  {
> -	long tmp;
>  	struct zone *zone;
> -	unsigned long pages = 0;
> -	unsigned int i = 0;
> -	char *p = "-\\|/";
> +	unsigned long saveable, size, max_size, count, pages = 0;
>  	struct timeval start, stop;
> +	int error = 0;
>  
> -	printk(KERN_INFO "PM: Shrinking memory...  ");
> +	printk(KERN_INFO "PM: Shrinking memory ... ");
>  	do_gettimeofday(&start);
> -	do {
> -		long size, highmem_size;
>  
> -		highmem_size = count_highmem_pages();
> -		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
> -		tmp = size;
> -		size += highmem_size;
> -		for_each_populated_zone(zone) {
> -			tmp += snapshot_additional_pages(zone);
> -			if (is_highmem(zone)) {
> -				highmem_size -=
> -					zone_page_state(zone, NR_FREE_PAGES);
> -			} else {
> -				tmp -= zone_page_state(zone, NR_FREE_PAGES);
> -				tmp += zone->lowmem_reserve[ZONE_NORMAL];
> -			}
> -		}
> +	/* Count the number of saveable data pages. */
> +	saveable = count_data_pages() + count_highmem_pages();
>  
> -		if (highmem_size < 0)
> -			highmem_size = 0;
> +	/*
> +	 * Compute the total number of page frames we can use (count) and the
> +	 * number of pages needed for image metadata (size).
> +	 */
> +	count = saveable;
> +	size = 0;
> +	for_each_populated_zone(zone) {
> +		size += snapshot_additional_pages(zone);
> +		count += zone_page_state(zone, NR_FREE_PAGES);
> +		count -= zone->pages_min;

I'd prefer to be safer, by removing the above line...

> +	}

...and adding another line here:

        count -= totalreserve_pages;
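
To make the difference concrete, here is a user-space sketch comparing the two ways of computing 'count'. The struct and function names are made up for illustration, the per-zone numbers are passed in rather than read from the VM, and the clamp to zero is an addition of this sketch, not something in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative per-zone numbers; not the kernel's struct zone. */
struct zone_stats {
	unsigned long free_pages;	/* stand-in for NR_FREE_PAGES */
	unsigned long pages_min;	/* stand-in for zone->pages_min */
};

/* As in the quoted patch: subtract pages_min zone by zone. */
static unsigned long count_with_pages_min(unsigned long saveable,
					  const struct zone_stats *z, size_t n)
{
	unsigned long count = saveable;
	size_t i;

	assert(z != NULL || n == 0);
	for (i = 0; i < n; i++) {
		count += z[i].free_pages;
		count -= z[i].pages_min;
	}
	return count;
}

/* As suggested above: subtract totalreserve_pages once, after the loop. */
static unsigned long count_with_totalreserve(unsigned long saveable,
					     const struct zone_stats *z,
					     size_t n,
					     unsigned long totalreserve_pages)
{
	unsigned long count = saveable;
	size_t i;

	assert(z != NULL || n == 0);
	for (i = 0; i < n; i++)
		count += z[i].free_pages;

	/* Clamp instead of wrapping around (an assumption of this sketch). */
	return count > totalreserve_pages ? count - totalreserve_pages : 0;
}
```

Since totalreserve_pages should normally be at least as large as the sum of the per-zone pages_min values, the second variant yields the smaller, more conservative 'count'.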


But hey, that 'count' counts "saveable+free" memory.
We don't have a counter that estimates "free+freeable" memory,
i.e. the threshold above which we are sure we cannot preallocate.

One situation where this matters is when there is 800M of anonymous
memory, but image_size is only 500M and there is no swap space.

Without such a limit we will end up in the oom code path in that case.
Sure, oom is (and shall be) reliably disabled during hibernation, but we
should still be careful not to create a low memory situation, which will hurt:
- hibernation speed
  (vmscan goes mad trying to squeeze out the last free page)
- user experience after resume
  (all *active* file data and metadata have to be reloaded)

The current code simply tries *too hard* to meet image_size.
I'd rather treat image_size as a mild hint, and free only
"free+freeable-margin" pages when image_size cannot be reached.

The safety margin can be totalreserve_pages, plus enough pages for
retaining the "hard core working set".
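
As a rough sketch of that idea, assuming we had an estimate of freeable pages (which, as said above, we currently don't): the function and parameter names below are hypothetical, and all quantities are in pages.

```c
#include <assert.h>

/*
 * How many pages to try to take away from the saveable set:
 * what image_size asks for, but never more than
 * free + freeable - margin.
 */
static unsigned long preallocation_target(unsigned long saveable,
					  unsigned long image_pages,
					  unsigned long free_pages,
					  unsigned long freeable,
					  unsigned long margin)
{
	unsigned long wanted, available, limit;

	/* Guard against wrap-around of the sum (sketch-level sanity check). */
	assert(free_pages + freeable >= free_pages);

	wanted = saveable > image_pages ? saveable - image_pages : 0;
	available = free_pages + freeable;
	limit = available > margin ? available - margin : 0;

	return wanted < limit ? wanted : limit;
}
```

Taking the example above with 4K pages and some made-up extra numbers: 800M of anonymous memory plus 200M of page cache gives saveable = 256000 pages, image_size = 500M is 128000 pages, and with no swap only the 51200 page cache pages are freeable. With nothing free and a 100M (25600 page) margin, preallocation_target(256000, 128000, 0, 51200, 25600) gives 25600 pages instead of the 128000 that image_size alone would demand.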

Thanks,
Fengguang

> -		tmp += highmem_size;
> -		if (tmp > 0) {
> -			tmp = __shrink_memory(tmp);
> -			if (!tmp)
> -				return -ENOMEM;
> -			pages += tmp;
> -		} else if (size > image_size / PAGE_SIZE) {
> -			tmp = __shrink_memory(size - (image_size / PAGE_SIZE));
> -			pages += tmp;
> -		}
> -		printk("\b%c", p[i++%4]);
> -	} while (tmp > 0);
> +	/* Compute the maximum number of saveable pages to leave in memory. */
> +	max_size = (count - (size + PAGES_FOR_IO)) / 2 - 2 * SPARE_PAGES;
> +	size = DIV_ROUND_UP(image_size, PAGE_SIZE);
> +	if (size > max_size)
> +		size = max_size;
> +	/*
> +	 * If the maximum is not less than the current number of saveable pages
> +	 * in memory, we don't need to do anything more.
> +	 */
> +	if (size >= saveable)
> +		goto out;
> +
> +	/*
> +	 * Let the memory management subsystem know that we're going to need a
> +	 * large number of page frames to allocate and make it free some memory.
> +	 * NOTE: If this is not done, performance is heavily affected in some
> +	 * test cases.
> +	 */
> +	shrink_all_memory(saveable - size);
> +
> +	/*
> +	 * Prevent the OOM killer from triggering while we're allocating image
> +	 * memory.
> +	 */
> +	for_each_populated_zone(zone)
> +		zone_set_flag(zone, ZONE_OOM_LOCKED);
> +	/*
> +	 * The number of saveable pages in memory was too high, so apply some
> +	 * pressure to decrease it.  First, make room for the largest possible
> +	 * image and fail if that doesn't work.  Next, try to decrease the size
> +	 * of the image as much as indicated by image_size.
> +	 */
> +	count -= max_size;
> +	pages = preallocate_image_memory(count);
> +	if (pages < count)
> +		error = -ENOMEM;
> +	else
> +		pages += preallocate_image_memory(max_size - size);
> +
> +	for_each_populated_zone(zone)
> +		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> +
> +	/* Release all of the preallocated page frames. */
> +	swsusp_free();
> +
> +	if (error) {
> +		printk(KERN_CONT "\n");
> +		return error;
> +	}
> +
> + out:
>  	do_gettimeofday(&stop);
> -	printk("\bdone (%lu pages freed)\n", pages);
> +	printk(KERN_CONT "done (preallocated %lu free pages)\n", pages);
>  	swsusp_show_speed(&start, &stop, pages, "Freed");
>  
>  	return 0;

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-08 13:42                                                                                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-08 13:42 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: David Rientjes, Andrew Morton, fengguang.wu, linux-pm, pavel,
	torvalds, jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, KAMEZAWA Hiroyuki wrote:
> On Fri, 8 May 2009 01:11:30 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > +	for_each_populated_zone(zone)
> > +		zone_set_flag(zone, ZONE_OOM_LOCKED);
> 
> > +	for_each_populated_zone(zone)
> > +		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> > +
> 
> Isn't it better to make above 2 be functions and move to mm/oom_kill.c ?

Hmm, OK.  I'll do it.

Well, in fact snapshot.c is all about memory management, so perhaps it's a good
idea to move it into mm as a whole. ;-)

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-08  9:50                                                                                                   ` Wu Fengguang
@ 2009-05-08 13:51                                                                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-08 13:51 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, Wu Fengguang wrote:
> On Fri, May 08, 2009 at 07:11:30AM +0800, Rafael J. Wysocki wrote:
> > On Friday 08 May 2009, David Rientjes wrote:
> > > On Thu, 7 May 2009, Andrew Morton wrote:
> > > 
> > > > The setting and clearing of that thing looks gruesomely racy..
> > > > 
> > > 
> > > It's not racy currently because zone_scan_lock ensures ZONE_OOM_LOCKED 
> > > gets test/set and cleared atomically for the entire zonelist (the clear 
> > > happens for the same zonelist that was test/set).
> > > 
> > > Using it for hibernation in the way I've proposed will open it up to the 
> > > race I earlier described: when a kthread is in the oom killer and 
> > > subsequently clears its zonelist of ZONE_OOM_LOCKED (all other tasks are 
> > > frozen so they can't be in the oom killer).  That's perfectly acceptable, 
> > > however, since the system is by definition already oom if kthreads can't 
> > > get memory so it will end up killing a user task even though it's stuck in 
> > > D state and will exit on thaw; we aren't concerned about killing 
> > > needlessly because the oom killer becomes a no-op when it finds a task 
> > > that has already been killed but hasn't exited by way of TIF_MEMDIE.
> > 
> > OK there.
> > 
> > So everyone seems to agree we can do something like in the patch below?
> > 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM/Hibernate: Rework shrinking of memory
> > 
> > Rework swsusp_shrink_memory() so that it calls shrink_all_memory()
> > just once to make some room for the image and then allocates memory
> > to apply more pressure to the memory management subsystem, if
> > necessary.
> 
> Thanks! Reducing to single-pass helps memory bounty laptops considerably :)
> 
> > Unfortunately, we don't seem to be able to drop shrink_all_memory()
> > entirely just yet, because that would lead to huge performance
> > regressions in some test cases.
> 
> Yes, but it's not the fault of this patch. In fact some regressions
> may even be positive pressures to the page allocate/reclaim code ;)
> 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  kernel/power/snapshot.c |  151 +++++++++++++++++++++++++++++++++---------------
> >  1 file changed, 104 insertions(+), 47 deletions(-)
> > 
> > Index: linux-2.6/kernel/power/snapshot.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/snapshot.c
> > +++ linux-2.6/kernel/power/snapshot.c
> > @@ -1066,69 +1066,126 @@ void swsusp_free(void)
> >  	buffer = NULL;
> >  }
> >  
> > +/* Helper functions used for the shrinking of memory. */
> > +
> >  /**
> > - *	swsusp_shrink_memory -  Try to free as much memory as needed
> > - *
> > - *	... but do not OOM-kill anyone
> > + * preallocate_image_memory - Allocate given number of page frames
> > + * @nr_pages: Number of page frames to allocate
> >   *
> > - *	Notice: all userland should be stopped before it is called, or
> > - *	livelock is possible.
> > + * Return value: Number of page frames actually allocated
> >   */
> > -
> > -#define SHRINK_BITE	10000
> > -static inline unsigned long __shrink_memory(long tmp)
> > +static unsigned long preallocate_image_memory(unsigned long nr_pages)
> >  {
> > -	if (tmp > SHRINK_BITE)
> > -		tmp = SHRINK_BITE;
> > -	return shrink_all_memory(tmp);
> > +	unsigned long nr_alloc = 0;
> > +
> > +	while (nr_pages > 0) {
> > +		if (!alloc_image_page(GFP_KERNEL | __GFP_NOWARN))
> > +			break;
> > +		nr_pages--;
> > +		nr_alloc++;
> > +	}
> > +
> > +	return nr_alloc;
> >  }
> >  
> > +/**
> > + * swsusp_shrink_memory -  Make the kernel release as much memory as needed
> > + *
> > + * To create a hibernation image it is necessary to make a copy of every page
> > + * frame in use.  We also need a number of page frames to be free during
> > + * hibernation for allocations made while saving the image and for device
> > + * drivers, in case they need to allocate memory from their hibernation
> > + * callbacks (these two numbers are given by PAGES_FOR_IO and SPARE_PAGES,
> > + * respectively, both of which are rough estimates).  To make this happen, we
> > + * compute the total number of available page frames and allocate at least
> > + *
> > + * ([page frames total] + PAGES_FOR_IO + [metadata pages]) / 2 + 2 * SPARE_PAGES
> > + *
> > + * of them, which corresponds to the maximum size of a hibernation image.
> > + *
> > + * If image_size is set below the number following from the above formula,
> > + * the preallocation of memory is continued until the total number of page
> > + * frames in use is below the requested image size or it is impossible to
> > + * allocate more memory, whichever happens first.
> > + */
> >  int swsusp_shrink_memory(void)
> >  {
> > -	long tmp;
> >  	struct zone *zone;
> > -	unsigned long pages = 0;
> > -	unsigned int i = 0;
> > -	char *p = "-\\|/";
> > +	unsigned long saveable, size, max_size, count, pages = 0;
> >  	struct timeval start, stop;
> > +	int error = 0;
> >  
> > -	printk(KERN_INFO "PM: Shrinking memory...  ");
> > +	printk(KERN_INFO "PM: Shrinking memory ... ");
> >  	do_gettimeofday(&start);
> > -	do {
> > -		long size, highmem_size;
> >  
> > -		highmem_size = count_highmem_pages();
> > -		size = count_data_pages() + PAGES_FOR_IO + SPARE_PAGES;
> > -		tmp = size;
> > -		size += highmem_size;
> > -		for_each_populated_zone(zone) {
> > -			tmp += snapshot_additional_pages(zone);
> > -			if (is_highmem(zone)) {
> > -				highmem_size -=
> > -					zone_page_state(zone, NR_FREE_PAGES);
> > -			} else {
> > -				tmp -= zone_page_state(zone, NR_FREE_PAGES);
> > -				tmp += zone->lowmem_reserve[ZONE_NORMAL];
> > -			}
> > -		}
> > +	/* Count the number of saveable data pages. */
> > +	saveable = count_data_pages() + count_highmem_pages();
> >  
> > -		if (highmem_size < 0)
> > -			highmem_size = 0;
> > +	/*
> > +	 * Compute the total number of page frames we can use (count) and the
> > +	 * number of pages needed for image metadata (size).
> > +	 */
> > +	count = saveable;
> > +	size = 0;
> > +	for_each_populated_zone(zone) {
> > +		size += snapshot_additional_pages(zone);
> > +		count += zone_page_state(zone, NR_FREE_PAGES);
> > +		count -= zone->pages_min;
> 
> I'd prefer to be more safe, by removing the above line...
> 
> > +	}
> 
> ...and add another line here:
> 
>         count -= totalreserve_pages;

OK
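
For reference, the computation with that change applied can be sketched in
user space as below.  This is only a model under stated assumptions: struct
zone_stats, struct usable_pages and count_usable_pages() are hypothetical
names, the per-zone numbers are passed in rather than read from the kernel,
and the clamp at zero is an addition of the sketch (count is unsigned), not
something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-zone numbers; the kernel reads these from struct zone. */
struct zone_stats {
	unsigned long nr_free;		/* stands in for NR_FREE_PAGES */
	unsigned long metadata_pages;	/* stands in for snapshot_additional_pages() */
};

struct usable_pages {
	unsigned long count;	/* page frames we may use */
	unsigned long size;	/* pages needed for image metadata */
};

static struct usable_pages
count_usable_pages(const struct zone_stats *zones, size_t nr_zones,
		   unsigned long saveable, unsigned long totalreserve_pages)
{
	struct usable_pages u = { .count = saveable, .size = 0 };
	size_t i;

	assert(zones || !nr_zones);
	for (i = 0; i < nr_zones; i++) {
		u.size += zones[i].metadata_pages;
		u.count += zones[i].nr_free;
	}

	/* Subtract the reserve once, after the loop; clamp to avoid wrap-around. */
	u.count = u.count > totalreserve_pages ? u.count - totalreserve_pages : 0;
	return u;
}
```

With two zones holding 100 and 50 free pages, 1000 saveable pages and a
reserve of 30 pages, the sketch gives count = 1120.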

> But hey, that 'count' counts "savable+free" memory.
> We don't have a counter for an estimation of "free+freeable" memory,
> ie. we are sure we cannot preallocate above that threshold. 
> 
> One applicable situation is, when there are 800M anonymous memory,
> but only 500M image_size and no swap space.
> 
> In that case we will otherwise goto the oom code path. Sure oom is
> (and shall be) reliably disabled in hibernation, but still we shall be
> cautious enough not to create a low memory situation, which will hurt:
> - hibernation speed
>   (vmscan goes mad trying to squeeze the last free page)
> - user experiences after resume
>   (all *active* file data and metadata have to reloaded)

Strangely enough, my recent testing with this patch doesn't confirm the
theory. :-)  Namely, I set image_size too low on purpose and it only caused
alloc_image_page() in preallocate_image_memory() to return NULL at one point
and that was it.

It didn't even take too much time.

I'll carry out more testing to verify this observation.
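
The behaviour described above (stop at the first failed allocation and report
how many page frames were obtained) can be modelled in user space as follows.
The loop has the same shape as preallocate_image_memory() in the patch, but
struct page_pool, alloc_image_page_model() and preallocate_model() are
hypothetical stand-ins for the real allocator.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical allocator model: hands out pages until the pool is empty. */
struct page_pool {
	unsigned long available;
};

static bool alloc_image_page_model(struct page_pool *pool)
{
	if (!pool->available)
		return false;	/* models alloc_image_page() returning NULL */
	pool->available--;
	return true;
}

/* Try to allocate nr_pages; return the number actually allocated. */
static unsigned long preallocate_model(struct page_pool *pool,
				       unsigned long nr_pages)
{
	unsigned long nr_alloc = 0;

	assert(pool);
	while (nr_pages > 0) {
		if (!alloc_image_page_model(pool))
			break;	/* first failure ends the loop, no retry */
		nr_pages--;
		nr_alloc++;
	}

	return nr_alloc;
}
```

A caller comparing the return value with the requested count can tell a
partial preallocation from a complete one, which is what the
"pages < count" test in swsusp_shrink_memory() relies on.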

> The current code simply tries *too hard* to meet image_size.
> I'd rather take that as a mild advice, and to only free
> "free+freeable-margin" pages when image_size is not approachable.
> 
> The safety margin can be totalreserve_pages, plus enough pages for
> retaining the "hard core working set".

How to compute the size of the "hard core working set", then?

Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-08 23:55                                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-08 23:55 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Thursday 07 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> 
> > OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> > I'll use the freezer-based approach instead.
> > 
> 
> Third time I'm going to suggest this, and I'd like a response on why it's 
> not possible instead of being ignored.
> 
> All of your tasks are in D state other than kthreads, right?  That means 
> they won't be in the oom killer (thus no zones are oom locked), so you can 
> easily do this
> 
> 	struct zone *z;
> 	for_each_populated_zone(z)
> 		zone_set_flag(z, ZONE_OOM_LOCKED);
> 
> and then
> 
> 	for_each_populated_zone(z)
> 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> 
> The serialization is done with trylocks so this will never invoke the oom 
> killer because all zones in the allocator's zonelist will be oom locked.

Well, that might have been a good idea if it actually had worked. :-(

> Why does this not work for you?

If I set image_size to something below "hard core working set" +
totalreserve_pages, preallocate_image_memory() hangs the
box (please refer to the last patch I sent,
http://patchwork.kernel.org/patch/22423/).

However, with the freezer-based disabling of the OOM killer it doesn't hang
under the same test conditions.

The difference appears to be that using your approach makes
__alloc_pages_internal() loop forever between the !try_set_zone_oom() test and
restart:, while it should go to nopage: in that situation.
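
A heavily simplified user-space model of that difference is sketched below.
It assumes that every freelist scan fails and that all zones in the zonelist
are already OOM-locked; where exactly a __GFP_NO_OOM_KILL check would sit in
the real allocator is an assumption of the sketch.  The enum, the function
name and the max_restarts cap are hypothetical (the cap exists only so the
model terminates, the kernel has no such limit).

```c
#include <assert.h>
#include <stdbool.h>

enum alloc_result {
	ALLOC_NOPAGE,	/* allocation failed cleanly (the nopage: path) */
	ALLOC_STUCK	/* still retrying when the model's cap was reached */
};

/*
 * Model of the slow path with all zones pre-marked ZONE_OOM_LOCKED.
 * no_oom_kill models a request that must not invoke the OOM killer.
 */
static enum alloc_result
alloc_with_zones_oom_locked(bool no_oom_kill, unsigned int max_restarts)
{
	unsigned int restarts;

	assert(max_restarts > 0);
	for (restarts = 0; restarts < max_restarts; restarts++) {
		/* restart: the freelist scan is assumed to fail every time. */
		if (no_oom_kill)
			return ALLOC_NOPAGE;	/* give up instead of retrying */

		/* try_set_zone_oom() fails here, so sleep and go to restart:. */
	}

	return ALLOC_STUCK;
}
```

In this model the pre-locked zones alone never lead to a failed allocation,
only to more retries, whereas the flag-based variant returns on the first
pass.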

So, I think I'll stick to Andrew's approach of using __GFP_NO_OOM_KILL.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-07 20:25                                                                                 ` David Rientjes
                                                                                                   ` (4 preceding siblings ...)
  (?)
@ 2009-05-08 23:55                                                                                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-08 23:55 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel-testers, linux-kernel, alan-jenkins, jens.axboe,
	Andrew Morton, fengguang.wu, torvalds, linux-pm

On Thursday 07 May 2009, David Rientjes wrote:
> On Thu, 7 May 2009, Rafael J. Wysocki wrote:
> 
> > OK, let's try with __GFP_NO_OOM_KILL first.  If there's too much disagreement,
> > I'll use the freezer-based approach instead.
> > 
> 
> Third time I'm going to suggest this, and I'd like a response on why it's 
> not possible instead of being ignored.
> 
> All of your tasks are in D state other than kthreads, right?  That means 
> they won't be in the oom killer (thus no zones are oom locked), so you can 
> easily do this
> 
> 	struct zone *z;
> 	for_each_populated_zone(z)
> 		zone_set_flag(z, ZONE_OOM_LOCKED);
> 
> and then
> 
> 	for_each_populated_zone(z)
> 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> 
> The serialization is done with trylocks so this will never invoke the oom 
> killer because all zones in the allocator's zonelist will be oom locked.

Well, that might have been a good idea if it actually had worked. :-(

> Why does this not work for you?

If I set image_size to something below "hard core working set" +
totalreserve_pages, preallocate_image_memory() hangs the
box (please refer to the last patch I sent,
http://patchwork.kernel.org/patch/22423/).

However, with the freezer-based disabling of the OOM killer it doesn't hang
under the same test conditions.

The difference appears to be that using your approach makes
__alloc_pages_internal() loop forever between the !try_set_zone_oom() test and
restart:, while it should go to nopage: in that situation.

So, I think I'll stick with Andrew's approach of using __GFP_NO_OOM_KILL.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-08 13:51                                                                                                     ` Rafael J. Wysocki
@ 2009-05-09  0:08                                                                                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-09  0:08 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Friday 08 May 2009, Rafael J. Wysocki wrote:
> On Friday 08 May 2009, Wu Fengguang wrote:
[--snip--]
> > But hey, that 'count' counts "savable+free" memory.
> > We don't have a counter for an estimation of "free+freeable" memory,
> > ie. we are sure we cannot preallocate above that threshold. 
> > 
> > One applicable situation is, when there are 800M anonymous memory,
> > but only 500M image_size and no swap space.
> > 
> > In that case we will otherwise goto the oom code path. Sure oom is
> > (and shall be) reliably disabled in hibernation, but still we shall be
> > cautious enough not to create a low memory situation, which will hurt:
> > - hibernation speed
> >   (vmscan goes mad trying to squeeze the last free page)
> > - user experiences after resume
> >   (all *active* file data and metadata have to reloaded)
> 
> Strangely enough, my recent testing with this patch doesn't confirm the
> theory. :-)  Namely, I set image_size too low on purpose and it only caused
> preallocate_image_memory() to return NULL at one point and that was it.
> 
> It didn't even took too much time.
> 
> I'll carry out more testing to verify this observation.

I can confirm that even if image_size is below the minimum we can get,
the second preallocate_image_memory() just returns after allocating fewer pages
than it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
approach, as I wrote in the previous message in this thread) and nothing bad
happens.

That may be because we freeze the mm kernel threads, but I've also tested
without freezing them and it still worked the same way.

> > The current code simply tries *too hard* to meet image_size.
> > I'd rather take that as a mild advice, and to only free
> > "free+freeable-margin" pages when image_size is not approachable.
> > 
> > The safety margin can be totalreserve_pages, plus enough pages for
> > retaining the "hard core working set".
> 
> How to compute the size of the "hard core working set", then?

Well, I'm still interested in the answer here. ;-)

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-04-21 23:21                 ` Laurent Pinchart
  2009-05-09  3:28                   ` Ming Lei
@ 2009-05-09  3:28                   ` Ming Lei
  2009-05-09 16:24                       ` Linus Torvalds
  1 sibling, 1 reply; 580+ messages in thread
From: Ming Lei @ 2009-05-09  3:28 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Adrian Bunk, Andrew Morton, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, video4linux-list, mchehab

On Wed, 22 Apr 2009 01:21:10 +0200
Laurent Pinchart <laurent.pinchart@skynet.be> wrote:

> Hi,
> 
> On Tuesday 21 April 2009 03:47:34 Ming Lei wrote:
> > 2009/4/21 Laurent Pinchart <laurent.pinchart@skynet.be>:
> > > On Saturday 18 April 2009 06:51:11 leiming wrote:

> > >> From a3b3d72cdd57a0699fb643b41b78eb7beb211ff5 Mon Sep 17
> > >> 00:00:00 2001 From: Ming Lei <tom.leiming@gmail.com>
> > >> Date: Wed, 15 Apr 2009 22:32:51 +0800
> > >> Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed(v2)
> > >>
> > >> Now urb buffers is not freed before suspend, so
> > >> uvc_alloc_urb_buffers should return packet counts allocated
> > >> originally during uvc resume , instead of zero.
> > >>
> > >> This version uses round down to return packet counts on Linus's
> > >> suggestions, or else may lead to buffer destructed if packet size
> > >> is changed before calling uvc_alloc_urb_buffers() in this kind of
> > >> case.
> > >
> > > The comment is misleading. If the packet size changes we need to
> > > reallocate the buffers anyway. Have you checked if the packet
> > > size (which depends on the endpoint being selected) can be
> > > changed between suspend and resume, either by the uvcvideo driver
> > > (I don't think it can) or the USB core ?
> >
> > The packet size does not change between suspend and resume.  I mean
> > uvc_alloc_urb_buffers() still can be used in other cases if buffers
> > was not freed and is reuesed in future. It seems there is no such
> > cases in uvcvideo now, but uvc_alloc_urb_buffers() really __can__
> > work in such case, isn't it?
> >
> > IMHO It is only used to allocate or reserve UVC_URBS usb buffers,
> > which size is video->urb_size, and npackets can be shortened or
> > enlarged if psize is changed, after all.
> 
> You're right. Patch applied, thanks.

Rc5 was released today; why hasn't this patch been merged upstream
yet?  It really is a bug fix.

Thanks.

> 
> Best regards,
> 
> Laurent Pinchart
> 

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-09  7:34                                                                                                         ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-09  7:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Sat, May 09, 2009 at 08:08:43AM +0800, Rafael J. Wysocki wrote:
> On Friday 08 May 2009, Rafael J. Wysocki wrote:
> > On Friday 08 May 2009, Wu Fengguang wrote:
> [--snip--]
> > > But hey, that 'count' counts "savable+free" memory.
> > > We don't have a counter for an estimation of "free+freeable" memory,
> > > ie. we are sure we cannot preallocate above that threshold. 
> > > 
> > > One applicable situation is, when there are 800M anonymous memory,
> > > but only 500M image_size and no swap space.
> > > 
> > > In that case we will otherwise goto the oom code path. Sure oom is
> > > (and shall be) reliably disabled in hibernation, but still we shall be
> > > cautious enough not to create a low memory situation, which will hurt:
> > > - hibernation speed
> > >   (vmscan goes mad trying to squeeze the last free page)
> > > - user experiences after resume
> > >   (all *active* file data and metadata have to reloaded)
> > 
> > Strangely enough, my recent testing with this patch doesn't confirm the
> > theory. :-)  Namely, I set image_size too low on purpose and it only caused
> > preallocate_image_memory() to return NULL at one point and that was it.
> > 
> > It didn't even took too much time.
> > 
> > I'll carry out more testing to verify this observation.
> 
> I can confirm that even if image_size is below the minimum we can get,

Which minimum please?

> the second preallocate_image_memory() just returns after allocating fewer pages
> that it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
> approach, as I wrote in the previous message in this thread) and nothing bad
> happens.
>
> That may be because we freeze the mm kernel threads, but I've also tested
> without freezing them and it's still worked the same way.
> 
> > > The current code simply tries *too hard* to meet image_size.
> > > I'd rather take that as a mild advice, and to only free
> > > "free+freeable-margin" pages when image_size is not approachable.
> > > 
> > > The safety margin can be totalreserve_pages, plus enough pages for
> > > retaining the "hard core working set".
> > 
> > How to compute the size of the "hard core working set", then?
> 
> Well, I'm still interested in the answer here. ;-)

A tough question ;-)

We can start with the following formula; it should be called *after*
the initial memory shrinking.

/* a typical desktop does not have more than 100MB of mapped pages */
#define MAX_MMAP_PAGES  (100 << (20 - PAGE_SHIFT))
unsigned long hard_core_working_set(void)
{
        unsigned long nr;

        /*
         * mapped pages are normally small and precious,
         * but shall be bounded for safety.
         */
        nr = global_page_state(NR_FILE_MAPPED);
        nr = min_t(unsigned long, nr, MAX_MMAP_PAGES);

        /*
         * if no swap space, this is a hard request;
         * otherwise this is an optimization.
         * (the disk image IO can be much faster than swap IO)
         */
        nr += global_page_state(NR_ACTIVE_ANON);
        nr += global_page_state(NR_INACTIVE_ANON);

        /* hard (but normally small) memory requests */
        nr += global_page_state(NR_SLAB_UNRECLAIMABLE);
        nr += global_page_state(NR_UNEVICTABLE);
        nr += global_page_state(NR_PAGETABLE);

        return nr;
}


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-05-09  3:28                   ` Ming Lei
@ 2009-05-09 16:24                       ` Linus Torvalds
  0 siblings, 0 replies; 580+ messages in thread
From: Linus Torvalds @ 2009-05-09 16:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: Adrian Bunk, Linux SCSI List, Network Development,
	Linux Kernel Mailing List, Natalie Protasevich, mchehab,
	Linux ACPI, video4linux-list, Laurent Pinchart, Andrew Morton,
	Kernel Testers List, Linux PM List



On Sat, 9 May 2009, Ming Lei wrote:
> 
> Rc5 has been released today, why isn't this patch accepted by upstream
> now?  It is really a bug fix.

I can take it directly, but was hoping to get it through the regular DVB 
tree. Haven't had a DVB update request yet (or maybe it got lost?)

		Linus

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-09  7:34                                                                                                         ` Wu Fengguang
@ 2009-05-09 19:22                                                                                                           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-09 19:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Saturday 09 May 2009, Wu Fengguang wrote:
> On Sat, May 09, 2009 at 08:08:43AM +0800, Rafael J. Wysocki wrote:
> > On Friday 08 May 2009, Rafael J. Wysocki wrote:
> > > On Friday 08 May 2009, Wu Fengguang wrote:
> > [--snip--]
> > > > But hey, that 'count' counts "savable+free" memory.
> > > > We don't have a counter for an estimation of "free+freeable" memory,
> > > > ie. we are sure we cannot preallocate above that threshold. 
> > > > 
> > > > One applicable situation is, when there are 800M anonymous memory,
> > > > but only 500M image_size and no swap space.
> > > > 
> > > > In that case we will otherwise goto the oom code path. Sure oom is
> > > > (and shall be) reliably disabled in hibernation, but still we shall be
> > > > cautious enough not to create a low memory situation, which will hurt:
> > > > - hibernation speed
> > > >   (vmscan goes mad trying to squeeze the last free page)
> > > > - user experiences after resume
> > > >   (all *active* file data and metadata have to reloaded)
> > > 
> > > Strangely enough, my recent testing with this patch doesn't confirm the
> > > theory. :-)  Namely, I set image_size too low on purpose and it only caused
> > > preallocate_image_memory() to return NULL at one point and that was it.
> > > 
> > > It didn't even took too much time.
> > > 
> > > I'll carry out more testing to verify this observation.
> > 
> > I can confirm that even if image_size is below the minimum we can get,
> 
> Which minimum please?

That was supposed to be an alternative way of saying "below any reasonable
value", but it wasn't very precise indeed.

I should have said that for a given system there was a minimum number of saveable
pages that hibernate_preallocate_memory() left in memory and it just couldn't
go below that limit.  If image_size is set below this number, the
preallocate_image_memory(max_size - size) call returns fewer pages than it's
been requested to allocate and that's it.  No disasters, nothing going wrong.

> > the second preallocate_image_memory() just returns after allocating fewer pages
> > that it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
> > approach, as I wrote in the previous message in this thread) and nothing bad
> > happens.
> >
> > That may be because we freeze the mm kernel threads, but I've also tested
> > without freezing them and it's still worked the same way.
> > 
> > > > The current code simply tries *too hard* to meet image_size.
> > > > I'd rather take that as a mild advice, and to only free
> > > > "free+freeable-margin" pages when image_size is not approachable.
> > > > 
> > > > The safety margin can be totalreserve_pages, plus enough pages for
> > > > retaining the "hard core working set".
> > > 
> > > How to compute the size of the "hard core working set", then?
> > 
> > Well, I'm still interested in the answer here. ;-)
> 
> A tough question ;-)
> 
> We can start with the following formula, this should be called *after*
> the initial memory shrinking.

OK

> /* a typical desktop do not have more than 100MB mapped pages */
> #define MAX_MMAP_PAGES  (100 << (20 - PAGE_SHIFT))
> unsigned long hard_core_working_set(void)
> {
>         unsigned long nr;
> 
>         /*
>          * mapped pages are normally small and precious,
>          * but shall be bounded for safety.
>          */
>         nr = global_page_state(NR_FILE_MAPPED);
>         nr = min_t(unsigned long, nr, MAX_MMAP_PAGES);
> 
>         /*
>          * if no swap space, this is a hard request;
>          * otherwise this is an optimization.
>          * (the disk image IO can be much faster than swap IO)

Well, if there's no swap space at this point, we won't be able to save the
image anyway, so this always is an optimization IMO. :-)

>          */
>         nr += global_page_state(NR_ACTIVE_ANON);
>         nr += global_page_state(NR_INACTIVE_ANON);
> 
>         /* hard (but normally small) memory requests */
>         nr += global_page_state(NR_SLAB_UNRECLAIMABLE);
>         nr += global_page_state(NR_UNEVICTABLE);
>         nr += global_page_state(NR_PAGETABLE);
> 
>         return nr;
> }

OK, thanks.

I'll create a separate patch adding this function and we'll see how it works.

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-09  7:34                                                                                                         ` Wu Fengguang
  (?)
@ 2009-05-09 19:22                                                                                                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-09 19:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: David Rientjes, linux-kernel, alan-jenkins, jens.axboe,
	Andrew Morton, kernel-testers, torvalds, linux-pm

On Saturday 09 May 2009, Wu Fengguang wrote:
> On Sat, May 09, 2009 at 08:08:43AM +0800, Rafael J. Wysocki wrote:
> > On Friday 08 May 2009, Rafael J. Wysocki wrote:
> > > On Friday 08 May 2009, Wu Fengguang wrote:
> > [--snip--]
> > > > But hey, that 'count' counts "savable+free" memory.
> > > > We don't have a counter for an estimation of "free+freeable" memory,
> > > > ie. we are sure we cannot preallocate above that threshold. 
> > > > 
> > > > One applicable situation is, when there are 800M anonymous memory,
> > > > but only 500M image_size and no swap space.
> > > > 
> > > > In that case we will otherwise go to the oom code path. Sure oom is
> > > > (and shall be) reliably disabled in hibernation, but still we shall be
> > > > cautious enough not to create a low memory situation, which will hurt:
> > > > - hibernation speed
> > > >   (vmscan goes mad trying to squeeze the last free page)
> > > > - user experience after resume
> > > >   (all *active* file data and metadata have to be reloaded)
> > > 
> > > Strangely enough, my recent testing with this patch doesn't confirm the
> > > theory. :-)  Namely, I set image_size too low on purpose and it only caused
> > > preallocate_image_memory() to return NULL at one point and that was it.
> > > 
> > > It didn't even take too much time.
> > > 
> > > I'll carry out more testing to verify this observation.
> > 
> > I can confirm that even if image_size is below the minimum we can get,
> 
> Which minimum please?

That was supposed to be an alternative way of saying "below any reasonable
value", but it wasn't very precise indeed.

I should have said that for a given system there was a minimum number of saveable
pages that hibernate_preallocate_memory() left in memory and it just couldn't
go below that limit.  If image_size is set below this number, the
preallocate_image_memory(max_size - size) call returns fewer pages than it's
been requested to allocate and that's it.  No disasters, nothing goes wrong.
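
The behaviour described above (ask for N pages, get back however many could
actually be allocated) can be sketched in userspace C.  This is only a toy
model: struct toy_pool, toy_alloc_page() and toy_preallocate() are hypothetical
names made up for illustration, not the code from the patch.

```c
#include <stddef.h>

/* Toy page pool: a hypothetical stand-in for the real page allocator. */
struct toy_pool {
	unsigned long free_pages;	/* pages that can still be handed out */
};

/* Returns 1 on success, 0 once the pool is exhausted (the "NULL" case). */
static int toy_alloc_page(struct toy_pool *pool)
{
	if (pool->free_pages == 0)
		return 0;
	pool->free_pages--;
	return 1;
}

/*
 * Try to preallocate nr_pages pages and stop quietly at the first failure.
 * The caller learns about the shortage from the returned count.
 */
static unsigned long toy_preallocate(struct toy_pool *pool,
				     unsigned long nr_pages)
{
	unsigned long nr_alloc = 0;

	while (nr_alloc < nr_pages && toy_alloc_page(pool))
		nr_alloc++;

	return nr_alloc;
}
```

With a pool of 1000 pages, a request for 1500 returns 1000 and nothing else
happens, which is the "fewer pages than requested" outcome reported above.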

> > the second preallocate_image_memory() just returns after allocating fewer pages
> > than it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
> > approach, as I wrote in the previous message in this thread) and nothing bad
> > happens.
> >
> > That may be because we freeze the mm kernel threads, but I've also tested
> > without freezing them and it still worked the same way.
> > 
> > > > The current code simply tries *too hard* to meet image_size.
> > > > I'd rather take that as a mild advice, and to only free
> > > > "free+freeable-margin" pages when image_size is not approachable.
> > > > 
> > > > The safety margin can be totalreserve_pages, plus enough pages for
> > > > retaining the "hard core working set".
> > > 
> > > How to compute the size of the "hard core working set", then?
> > 
> > Well, I'm still interested in the answer here. ;-)
> 
> A tough question ;-)
> 
> We can start with the following formula, this should be called *after*
> the initial memory shrinking.

OK

> /* a typical desktop does not have more than 100MB mapped pages */
> #define MAX_MMAP_PAGES  (100 << (20 - PAGE_SHIFT))
> unsigned long hard_core_working_set(void)
> {
>         unsigned long nr;
> 
>         /*
>          * mapped pages are normally small and precious,
>          * but shall be bounded for safety.
>          */
>         nr = global_page_state(NR_FILE_MAPPED);
>         nr = min_t(unsigned long, nr, MAX_MMAP_PAGES);
> 
>         /*
>          * if no swap space, this is a hard request;
>          * otherwise this is an optimization.
>          * (the disk image IO can be much faster than swap IO)

Well, if there's no swap space at this point, we won't be able to save the
image anyway, so this always is an optimization IMO. :-)

>          */
>         nr += global_page_state(NR_ACTIVE_ANON);
>         nr += global_page_state(NR_INACTIVE_ANON);
> 
>         /* hard (but normally small) memory requests */
>         nr += global_page_state(NR_SLAB_UNRECLAIMABLE);
>         nr += global_page_state(NR_UNEVICTABLE);
>         nr += global_page_state(NR_PAGETABLE);
> 
>         return nr;
> }

OK, thanks.

I'll create a separate patch adding this function and we'll see how it works.
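
To get a feel for the numbers, the quoted formula can be tried out in userspace.
This is a sketch under assumptions: PAGE_SHIFT is taken to be 12 (4 KiB pages),
and struct vm_counters and toy_hard_core_working_set() are hypothetical
stand-ins for the global_page_state() counters, not kernel interfaces.

```c
#include <stddef.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define MAX_MMAP_PAGES	(100UL << (20 - PAGE_SHIFT))	/* 100 MB in pages */

/* Hypothetical snapshot of the counters the quoted function reads. */
struct vm_counters {
	unsigned long file_mapped;
	unsigned long active_anon;
	unsigned long inactive_anon;
	unsigned long slab_unreclaimable;
	unsigned long unevictable;
	unsigned long pagetable;
};

/* Same arithmetic as the quoted function, on a snapshot instead of live state. */
static unsigned long toy_hard_core_working_set(const struct vm_counters *c)
{
	unsigned long nr;

	/* mapped file pages, bounded by MAX_MMAP_PAGES */
	nr = c->file_mapped < MAX_MMAP_PAGES ? c->file_mapped : MAX_MMAP_PAGES;

	/* anonymous pages */
	nr += c->active_anon + c->inactive_anon;

	/* hard (but normally small) memory requests */
	nr += c->slab_unreclaimable + c->unevictable + c->pagetable;

	return nr;
}
```

With 4 KiB pages MAX_MMAP_PAGES is 100 << 8 = 25600.  A made-up snapshot of
40000 mapped, 10000 + 5000 anonymous, 2000 slab, 0 unevictable and 1000 page
table pages gives 25600 + 15000 + 3000 = 43600 pages, about 170 MiB.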

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-09 21:22                                                                                     ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-09 21:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Sat, 9 May 2009, Rafael J. Wysocki wrote:

> > All of your tasks are in D state other than kthreads, right?  That means 
> > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > easily do this
> > 
> > 	struct zone *z;
> > 	for_each_populated_zone(z)
> > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > 
> > and then
> > 
> > 	for_each_populated_zone(z)
> > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > 
> > The serialization is done with trylocks so this will never invoke the oom 
> > killer because all zones in the allocator's zonelist will be oom locked.
> 
> Well, that might have been a good idea if it actually had worked. :-(
> 
> > Why does this not work for you?
> 
> If I set image_size to something below "hard core working set" +
> totalreserve_pages, preallocate_image_memory() hangs the
> box (please refer to the last patch I sent,
> http://patchwork.kernel.org/patch/22423/).
> 

This has been changed in the latest mmotm with Mel's page allocator
patches (and I think yours should be based on mmotm).  Specifically, 
page-allocator-break-up-the-allocator-entry-point-into-fast-and-slow-paths.patch.

Before his patchset, zonelists that had ZONE_OOM_LOCKED set for at least 
one of their zones would unconditionally goto restart.  Now, if
order > PAGE_ALLOC_COSTLY_ORDER, it gives up and returns NULL.  Otherwise, 
it does goto restart.

So if your allocation has order > PAGE_ALLOC_COSTLY_ORDER, using the 
ZONE_OOM_LOCKED approach to locking out the oom killer will work just fine 
in mmotm.
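
The decision described above can be written down as a small userspace model.
This is a sketch of the description in this message only, not the allocator
source: enum toy_action and toy_after_reclaim_failed() are hypothetical names,
and PAGE_ALLOC_COSTLY_ORDER is set to 3, the value the kernel defines.

```c
#include <stdbool.h>

#define PAGE_ALLOC_COSTLY_ORDER	3	/* value defined by the kernel */

enum toy_action {
	TOY_TRY_OOM_KILL,	/* zonelist trylock succeeded: oom killer may run */
	TOY_RETURN_NULL,	/* give up and fail the allocation */
	TOY_RESTART,		/* go back and retry the allocation */
};

/*
 * What happens once reclaim has failed, per the description above:
 * if any zone in the zonelist is oom locked, the oom killer is skipped,
 * and only costly (high-order) requests are allowed to fail.
 */
static enum toy_action toy_after_reclaim_failed(bool any_zone_oom_locked,
						unsigned int order)
{
	if (!any_zone_oom_locked)
		return TOY_TRY_OOM_KILL;
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return TOY_RETURN_NULL;
	return TOY_RESTART;
}
```

In this model a request of order 0 to 3 with the zones oom locked always ends
up in the restart branch; only order 4 and above gets NULL back.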

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29
  2009-05-09 16:24                       ` Linus Torvalds
  (?)
@ 2009-05-09 21:37                       ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 580+ messages in thread
From: Mauro Carvalho Chehab @ 2009-05-09 21:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Laurent Pinchart, Rafael J. Wysocki,
	Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, video4linux-list

On Sat, 9 May 2009 09:24:51 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Sat, 9 May 2009, Ming Lei wrote:
> > 
> > Rc5 has been released today, why isn't this patch accepted by upstream
> > now?  It is really a bug fix.


> 
> I can take it directly, but was hoping to get it through the regular DVB 
> tree. Haven't had a DVB update request yet (or maybe it got lost?)
> 
> 		Linus

The patch was added to my linux-next tree. I'll move it to the tree where I
handle bug fixes and ask Linus to pull from it, together with a few other fixes
I have there, later today or tomorrow.

Cheers,
Mauro

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-09 21:37                                                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-09 21:37 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Saturday 09 May 2009, David Rientjes wrote:
> On Sat, 9 May 2009, Rafael J. Wysocki wrote:
> 
> > > All of your tasks are in D state other than kthreads, right?  That means 
> > > they won't be in the oom killer (thus no zones are oom locked), so you can 
> > > easily do this
> > > 
> > > 	struct zone *z;
> > > 	for_each_populated_zone(z)
> > > 		zone_set_flag(z, ZONE_OOM_LOCKED);
> > > 
> > > and then
> > > 
> > > 	for_each_populated_zone(z)
> > > 		zone_clear_flag(z, ZONE_OOM_LOCKED);
> > > 
> > > The serialization is done with trylocks so this will never invoke the oom 
> > > killer because all zones in the allocator's zonelist will be oom locked.
> > 
> > Well, that might have been a good idea if it actually had worked. :-(
> > 
> > > Why does this not work for you?
> > 
> > If I set image_size to something below "hard core working set" +
> > totalreserve_pages, preallocate_image_memory() hangs the
> > box (please refer to the last patch I sent,
> > http://patchwork.kernel.org/patch/22423/).
> > 
> 
> This has been changed in the latest mmotm with Mel's page allocator
> patches (and I think yours should be based on mmotm).  Specifically, 
> page-allocator-break-up-the-allocator-entry-point-into-fast-and-slow-paths.patch.
> 
> Before his patchset, zonelists that had ZONE_OOM_LOCKED set for at least 
> one of their zones would unconditionally goto restart.  Now, if
> order > PAGE_ALLOC_COSTLY_ORDER, it gives up and returns NULL.  Otherwise, 
> it does goto restart.
> 
> So if your allocation has order > PAGE_ALLOC_COSTLY_ORDER,

It doesn't.  All of my allocations are of order 0.

> using the ZONE_OOM_LOCKED approach to locking out the oom killer will work
> just fine in mmotm.

No, it won't, AFAICT.

Best,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-09 22:39                                                                                         ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-09 22:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Sat, 9 May 2009, Rafael J. Wysocki wrote:

> > This has been changed in the latest mmotm with Mel's page allocator
> > patches (and I think yours should be based on mmotm).  Specifically, 
> > page-allocator-break-up-the-allocator-entry-point-into-fast-and-slow-paths.patch.
> > 
> > Before his patchset, zonelists that had ZONE_OOM_LOCKED set for at least 
> > one of their zones would unconditionally goto restart.  Now, if
> > order > PAGE_ALLOC_COSTLY_ORDER, it gives up and returns NULL.  Otherwise, 
> > it does goto restart.
> > 
> > So if your allocation has order > PAGE_ALLOC_COSTLY_ORDER,
> 
> It doesn't.  All of my allocations are of order 0.
> 

All order 0 allocations are implicitly __GFP_NOFAIL and will loop 
endlessly unless they can't block.  So if you want to simply prohibit the 
oom killer from being invoked and not change the retry behavior, setting 
ZONE_OOM_LOCKED for all zones will do that.  If your machine hangs, it 
means nothing can be reclaimed and you can't free memory via oom killing, 
so there's nothing else the page allocator can do.

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-09 23:03                                                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-09 23:03 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Sunday 10 May 2009, David Rientjes wrote:
> On Sat, 9 May 2009, Rafael J. Wysocki wrote:
> 
> > > This has been changed in the latest mmotm with Mel's page allocator
> > > patches (and I think yours should be based on mmotm).  Specifically, 
> > > page-allocator-break-up-the-allocator-entry-point-into-fast-and-slow-paths.patch.
> > > 
> > > Before his patchset, zonelists that had ZONE_OOM_LOCKED set for at least 
> > > one of their zones would unconditionally goto restart.  Now, if
> > > order > PAGE_ALLOC_COSTLY_ORDER, it gives up and returns NULL.  Otherwise, 
> > > it does goto restart.
> > > 
> > > So if your allocation has order > PAGE_ALLOC_COSTLY_ORDER,
> > 
> > It doesn't.  All of my allocations are of order 0.
> > 
> 
> All order 0 allocations are implicitly __GFP_NOFAIL and will loop 
> endlessly unless they can't block.  So if you want to simply prohibit the 
> oom killer from being invoked and not change the retry behavior, setting 
> ZONE_OOM_LOCKED for all zones will do that.  If your machine hangs, it 
> means nothing can be reclaimed and you can't free memory via oom killing, 
> so there's nothing else the page allocator can do.

But I want it to give up in this case instead of looping forever.
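
The difference between the two behaviours can be shown with a toy retry loop
in userspace C.  Everything here is hypothetical and made up for illustration:
struct toy_state, toy_reclaim(), toy_alloc() and the may_fail switch are not
kernel interfaces, and max_spins exists only so that the model terminates
where the loop described above would keep spinning.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_state {
	unsigned long free_pages;	/* pages available right now */
	unsigned long reclaimable;	/* pages reclaim can still free */
};

/* Reclaim one page if possible; returns true if progress was made. */
static bool toy_reclaim(struct toy_state *s)
{
	if (s->reclaimable == 0)
		return false;
	s->reclaimable--;
	s->free_pages++;
	return true;
}

/*
 * Allocate one page.  With may_fail == true the function returns false as
 * soon as reclaim makes no progress.  With may_fail == false it keeps
 * retrying; *stuck is set when the model hits max_spins without progress.
 */
static bool toy_alloc(struct toy_state *s, bool may_fail,
		      unsigned int max_spins, bool *stuck)
{
	unsigned int spins = 0;

	*stuck = false;
	for (;;) {
		if (s->free_pages > 0) {
			s->free_pages--;
			return true;
		}
		if (toy_reclaim(s))
			continue;
		if (may_fail)
			return false;	/* caller copes with the shortage */
		if (++spins >= max_spins) {
			*stuck = true;	/* model only: stands for "hangs" */
			return false;
		}
	}
}
```

With nothing free and nothing reclaimable, may_fail == true returns at once
with a failure the caller can handle, while may_fail == false only stops
because of the artificial max_spins bound.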

Look.  I have a specific problem at hand that I want to solve and the approach
you suggested _clearly_ _doesn't_ _work_.  I have also tried to explain to you
why it doesn't work, but you're ignoring it, so I really don't know what else
I can say.

OTOH, the approach suggested by Andrew _does_ _work_ regardless of your
opinion about it.  It's been tested and it's done the job 100% of the time.  Go
figure.  And please stop beating the dead horse.

Thanks,
Rafael
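
The retry behaviour argued over in this message can be modelled in userspace
as a small decision function.  This is only a sketch under stated assumptions,
not the kernel's page allocator: the enum, the struct and the function name are
hypothetical, MODEL_COSTLY_ORDER mirrors the mainline value of
PAGE_ALLOC_COSTLY_ORDER (3), and no_oom_kill stands in for the proposed
__GFP_NO_OOM_KILL flag, assumed here to mean "fail instead of retrying or
invoking the OOM killer", which is the behaviour asked for above.

```c
#include <stdbool.h>

/* Mirrors the mainline value of PAGE_ALLOC_COSTLY_ORDER (model only). */
#define MODEL_COSTLY_ORDER 3u

/* What the slow path does once reclaim has made no progress. */
enum alloc_action {
	ALLOC_FAIL,           /* return NULL to the caller */
	ALLOC_RETRY,          /* the "goto restart" case from the thread */
	ALLOC_OOM_KILL_RETRY  /* invoke the OOM killer, then retry */
};

/* Hypothetical request descriptor; these are not real gfp flags. */
struct alloc_request {
	unsigned int order;   /* allocation order */
	bool can_block;       /* caller is allowed to sleep */
	bool no_oom_kill;     /* models the proposed __GFP_NO_OOM_KILL */
};

enum alloc_action no_progress_action(const struct alloc_request *req,
				     bool zone_oom_locked)
{
	/* Allocations that cannot block never loop. */
	if (!req->can_block)
		return ALLOC_FAIL;

	/* Assumed semantics of the proposed flag: give up right away. */
	if (req->no_oom_kill)
		return ALLOC_FAIL;

	/* Costly orders give up rather than loop. */
	if (req->order > MODEL_COSTLY_ORDER)
		return ALLOC_FAIL;

	/* Someone else holds the OOM lock: low orders just try again. */
	if (zone_oom_locked)
		return ALLOC_RETRY;

	return ALLOC_OOM_KILL_RETRY;
}
```

In this model an order-0 request with ZONE_OOM_LOCKED set always yields
ALLOC_RETRY, which is the endless loop objected to above; only the
no_oom_kill case makes the same request return ALLOC_FAIL.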

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-10  4:52                                                                                                             ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-05-10  4:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Sun, May 10, 2009 at 03:22:57AM +0800, Rafael J. Wysocki wrote:
> On Saturday 09 May 2009, Wu Fengguang wrote:
> > On Sat, May 09, 2009 at 08:08:43AM +0800, Rafael J. Wysocki wrote:
> > > On Friday 08 May 2009, Rafael J. Wysocki wrote:
> > > > On Friday 08 May 2009, Wu Fengguang wrote:
> > > [--snip--]
> > > > > But hey, that 'count' counts "savable+free" memory.
> > > > > We don't have a counter for an estimation of "free+freeable" memory,
> > > > > ie. we are sure we cannot preallocate above that threshold. 
> > > > > 
> > > > > One applicable situation is, when there are 800M anonymous memory,
> > > > > but only 500M image_size and no swap space.
> > > > > 
> > > > > In that case we will otherwise goto the oom code path. Sure oom is
> > > > > (and shall be) reliably disabled in hibernation, but still we shall be
> > > > > cautious enough not to create a low memory situation, which will hurt:
> > > > > - hibernation speed
> > > > >   (vmscan goes mad trying to squeeze the last free page)
> > > > > - user experience after resume
> > > > >   (all *active* file data and metadata have to be reloaded)
> > > > 
> > > > Strangely enough, my recent testing with this patch doesn't confirm the
> > > > theory. :-)  Namely, I set image_size too low on purpose and it only caused
> > > > preallocate_image_memory() to return NULL at one point and that was it.
> > > > 
> > > > It didn't even take too much time.
> > > > 
> > > > I'll carry out more testing to verify this observation.
> > > 
> > > I can confirm that even if image_size is below the minimum we can get,
> > 
> > Which minimum please?
> 
> That was supposed to be an alternative way of saying "below any reasonable
> value", but it wasn't very precise indeed.
> 
> I should have said that for a given system there was a minimum number of saveable
> pages that hibernate_preallocate_memory() left in memory and it just couldn't
> go below that limit.  If image_size is set below this number, the
> preallocate_image_memory(max_size - size) call returns fewer pages than it's
> been requested to allocate and that's it.  No disasters, nothing goes wrong.

"preallocate_image_memory(max_size - size) returning fewer pages"
had better be avoided, and possibly can be avoided by checking
hard_core_working_set(), right?

> > > the second preallocate_image_memory() just returns after allocating fewer pages
> > > than it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
> > > approach, as I wrote in the previous message in this thread) and nothing bad
> > > happens.
> > >
> > > That may be because we freeze the mm kernel threads, but I've also tested
> > > without freezing them and it's still worked the same way.
> > > 
> > > > > The current code simply tries *too hard* to meet image_size.
> > > > > I'd rather take that as mild advice, and only free
> > > > > "free+freeable-margin" pages when image_size is not approachable.
> > > > > 
> > > > > The safety margin can be totalreserve_pages, plus enough pages for
> > > > > retaining the "hard core working set".
> > > > 
> > > > How to compute the size of the "hard core working set", then?
> > > 
> > > Well, I'm still interested in the answer here. ;-)
> > 
> > A tough question ;-)
> > 
> > We can start with the following formula, this should be called *after*
> > the initial memory shrinking.
> 
> OK
> 
> > /* a typical desktop does not have more than 100MB mapped pages */
> > #define MAX_MMAP_PAGES  (100 << (20 - PAGE_SHIFT))
> > unsigned long hard_core_working_set(void)
> > {
> >         unsigned long nr;
> > 
> >         /*
> >          * mapped pages are normally small and precious,
> >          * but shall be bounded for safety.
> >          */
> >         nr = global_page_state(NR_FILE_MAPPED);
> >         nr = min_t(unsigned long, nr, MAX_MMAP_PAGES);
> > 
> >         /*
> >          * if no swap space, this is a hard request;
> >          * otherwise this is an optimization.
> >          * (the disk image IO can be much faster than swap IO)
> 
> Well, if there's no swap space at this point, we won't be able to save the
> image anyway, so this always is an optimization IMO. :-)

Ah OK. Do you think the anonymous pages optimization should be limited?

My desktop normally consumes 200-400MB anonymous pages, but when some
virtual machine is running, the anonymous pages can go beyond 1GB,
with mapped file pages going slightly beyond 100MB.

The image-write vs. swapout-write speeds should be equal, however the
hibernate tool may be able to compress the dataset.

The image-read will be much faster than swapin-read for *rotational*
disks. It may take more time to resume; however, the user experience
after completion will be much better.

I don't think "populating memory with useless data" would be a major
concern, since we already freed up half of the total memory. It's all
about the speed one can get back to work.

> 
> >          */
> >         nr += global_page_state(NR_ACTIVE_ANON);
> >         nr += global_page_state(NR_INACTIVE_ANON);
> > 
> >         /* hard (but normally small) memory requests */
> >         nr += global_page_state(NR_SLAB_UNRECLAIMABLE);
> >         nr += global_page_state(NR_UNEVICTABLE);
> >         nr += global_page_state(NR_PAGETABLE);
> > 
> >         return nr;
> > }
> 
> OK, thanks.
> 
> I'll create a separate patch adding this function and we'll see how it works.

OK, thanks!

btw, if the shrink_all_memory() function cannot go away because of
performance problems, I can help clean it up.  (FYI: I happened to be
doing so just before you submitted this patchset. :)

Thanks,
Fengguang
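
The formula quoted in this message can be tried out in userspace with plain
numbers in place of the kernel's per-zone counters.  This is a sketch only:
struct vm_counters and hard_core_working_set_model() are hypothetical names,
the fields stand in for the global_page_state() items used above, and
MODEL_PAGE_SHIFT assumes 4 KiB pages, which makes the 100MB cap on mapped
pages equal to 100 << 8 = 25600 pages.

```c
/* Assumption: 4 KiB pages, as on x86. */
#define MODEL_PAGE_SHIFT 12
/* 100 MB worth of pages: 100 << (20 - 12) = 25600. */
#define MODEL_MAX_MMAP_PAGES (100UL << (20 - MODEL_PAGE_SHIFT))

/* Hypothetical snapshot of the counters the formula reads, in pages. */
struct vm_counters {
	unsigned long file_mapped;        /* NR_FILE_MAPPED */
	unsigned long active_anon;        /* NR_ACTIVE_ANON */
	unsigned long inactive_anon;      /* NR_INACTIVE_ANON */
	unsigned long slab_unreclaimable; /* NR_SLAB_UNRECLAIMABLE */
	unsigned long unevictable;        /* NR_UNEVICTABLE */
	unsigned long pagetable;          /* NR_PAGETABLE */
};

unsigned long hard_core_working_set_model(const struct vm_counters *c)
{
	unsigned long nr;

	/* Mapped file pages count in full, but only up to the cap. */
	nr = c->file_mapped;
	if (nr > MODEL_MAX_MMAP_PAGES)
		nr = MODEL_MAX_MMAP_PAGES;

	/* Anonymous pages: cheaper to keep in the image than to swap. */
	nr += c->active_anon;
	nr += c->inactive_anon;

	/* Pages that cannot be reclaimed at all. */
	nr += c->slab_unreclaimable;
	nr += c->unevictable;
	nr += c->pagetable;

	return nr;
}
```

With made-up counters of 30000 mapped, 50000 + 20000 anonymous, 3000 slab,
100 unevictable and 1300 page-table pages, the mapped term is capped at 25600
and the estimate comes to 100000 pages.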

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-10  4:52                                                                                                             ` Wu Fengguang
@ 2009-05-10 12:52                                                                                                               ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-10 12:52 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: David Rientjes, Andrew Morton, linux-pm, pavel, torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers

On Sunday 10 May 2009, Wu Fengguang wrote:
> On Sun, May 10, 2009 at 03:22:57AM +0800, Rafael J. Wysocki wrote:
> > On Saturday 09 May 2009, Wu Fengguang wrote:
> > > On Sat, May 09, 2009 at 08:08:43AM +0800, Rafael J. Wysocki wrote:
> > > > On Friday 08 May 2009, Rafael J. Wysocki wrote:
> > > > > On Friday 08 May 2009, Wu Fengguang wrote:
> > > > [--snip--]
> > > > > > But hey, that 'count' counts "savable+free" memory.
> > > > > > We don't have a counter for an estimation of "free+freeable" memory,
> > > > > > ie. we are sure we cannot preallocate above that threshold. 
> > > > > > 
> > > > > > One applicable situation is, when there are 800M anonymous memory,
> > > > > > but only 500M image_size and no swap space.
> > > > > > 
> > > > > > In that case we will otherwise goto the oom code path. Sure oom is
> > > > > > (and shall be) reliably disabled in hibernation, but still we shall be
> > > > > > cautious enough not to create a low memory situation, which will hurt:
> > > > > > - hibernation speed
> > > > > >   (vmscan goes mad trying to squeeze the last free page)
> > > > > > - user experience after resume
> > > > > >   (all *active* file data and metadata have to be reloaded)
> > > > > 
> > > > > Strangely enough, my recent testing with this patch doesn't confirm the
> > > > > theory. :-)  Namely, I set image_size too low on purpose and it only caused
> > > > > preallocate_image_memory() to return NULL at one point and that was it.
> > > > > 
> > > > > It didn't even take too much time.
> > > > > 
> > > > > I'll carry out more testing to verify this observation.
> > > > 
> > > > I can confirm that even if image_size is below the minimum we can get,
> > > 
> > > Which minimum please?
> > 
> > That was supposed to be an alternative way of saying "below any reasonable
> > value", but it wasn't very precise indeed.
> > 
> > I should have said that for a given system there was a minimum number of saveable
> > pages that hibernate_preallocate_memory() left in memory and it just couldn't
> > go below that limit.  If image_size is set below this number, the
> > preallocate_image_memory(max_size - size) call returns fewer pages than it's
> > been requested to allocate and that's it.  No disasters, nothing goes wrong.
> 
> "preallocate_image_memory(max_size - size) returning fewer pages"
> had better be avoided, and possibly can be avoided by checking
> hard_core_working_set(), right?

Yes, but your formula doesn't seem to be suitable for that, because the number
it returns is too low.

On an x86_64 test box the minimum image size I can get (by setting
image_size=1000) is about 24000 pages, while the formula for the hard core
working set size returns about 12000, so it is not very useful.

On an i386 test box it's even worse, as the minimum image size I can get is
about 48000 pages.
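
To put the page counts above into more familiar units, they can be converted
to sizes.  The sketch below assumes 4 KiB pages (PAGE_SHIFT = 12, the usual
x86 value); pages_to_kib() and estimate_shortfall() are hypothetical helpers
written for this illustration.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86 and x86_64. */
#define MODEL_PAGE_SHIFT 12u

/* Convert a page count to KiB. */
uint64_t pages_to_kib(uint64_t pages)
{
	return (pages << MODEL_PAGE_SHIFT) >> 10;
}

/* How many pages an estimate falls short of the observed minimum. */
uint64_t estimate_shortfall(uint64_t observed_min, uint64_t estimate)
{
	return observed_min > estimate ? observed_min - estimate : 0;
}
```

Under that assumption 24000 pages are 96000 KiB (about 94 MiB) and 48000 pages
are 192000 KiB (about 188 MiB), while the formula's result of about 12000
pages is 48000 KiB, i.e. roughly 12000 pages short of the x86_64 minimum.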

Besides, while testing this I noticed that on i386 preallocate_image_memory()
didn't allocate from highmem, so I changed it to do so.  As a result of this I
had to change the patches, so I'm going to post the new patchset shortly.

> > > > the second preallocate_image_memory() just returns after allocating fewer pages
> > > > than it's been asked for (that's with the original __GFP_NO_OOM_KILL-based
> > > > approach, as I wrote in the previous message in this thread) and nothing bad
> > > > happens.
> > > >
> > > > That may be because we freeze the mm kernel threads, but I've also tested
> > > > without freezing them and it's still worked the same way.
> > > > 
> > > > > > The current code simply tries *too hard* to meet image_size.
> > > > > > I'd rather take that as mild advice, and only free
> > > > > > "free+freeable-margin" pages when image_size is not approachable.
> > > > > > 
> > > > > > The safety margin can be totalreserve_pages, plus enough pages for
> > > > > > retaining the "hard core working set".
> > > > > 
> > > > > How to compute the size of the "hard core working set", then?
> > > > 
> > > > Well, I'm still interested in the answer here. ;-)
> > > 
> > > A tough question ;-)
> > > 
> > > We can start with the following formula, this should be called *after*
> > > the initial memory shrinking.
> > 
> > OK
> > 
> > > /* a typical desktop does not have more than 100MB mapped pages */
> > > #define MAX_MMAP_PAGES  (100 << (20 - PAGE_SHIFT))
> > > unsigned long hard_core_working_set(void)
> > > {
> > >         unsigned long nr;
> > > 
> > >         /*
> > >          * mapped pages are normally small and precious,
> > >          * but shall be bounded for safety.
> > >          */
> > >         nr = global_page_state(NR_FILE_MAPPED);
> > >         nr = min_t(unsigned long, nr, MAX_MMAP_PAGES);
> > > 
> > >         /*
> > >          * if no swap space, this is a hard request;
> > >          * otherwise this is an optimization.
> > >          * (the disk image IO can be much faster than swap IO)
> > 
> > Well, if there's no swap space at this point, we won't be able to save the
> > image anyway, so this always is an optimization IMO. :-)
> 
> Ah OK. Do you think the anonymous pages optimization should be limited?

That depends.

> My desktop normally consumes 200-400MB anonymous pages, but when some
> virtual machine is running, the anonymous pages can go beyond 1GB,

That's too much IMO, so there should be a limit.

> with mapped file pages going slightly beyond 100MB.
> 
> The image-write vs. swapout-write speeds should be equal,

They aren't, really.  Image write is way faster, even without compression.

> however the hibernate tool may be able to compress the dataset.

Sure.

> The image-read will be much faster than swapin-read for *rotational*
> disks. It may take more time to resume; however, the user experience
> after completion will be much better.

Agreed.

> I don't think "populating memory with useless data" would be a major
> concern, since we already freed up half of the total memory. It's all
> about the speed one can get back to work.

Agreed again.

> > 
> > >          */
> > >         nr += global_page_state(NR_ACTIVE_ANON);
> > >         nr += global_page_state(NR_INACTIVE_ANON);
> > > 
> > >         /* hard (but normally small) memory requests */
> > >         nr += global_page_state(NR_SLAB_UNRECLAIMABLE);
> > >         nr += global_page_state(NR_UNEVICTABLE);
> > >         nr += global_page_state(NR_PAGETABLE);
> > > 
> > >         return nr;
> > > }
> > 
> > OK, thanks.
> > 
> > I'll create a separate patch adding this function and we'll see how it works.
> 
> OK, thanks!

Actually it doesn't work too well as I said above.  Arguably that's because the
number of anonymous pages was probably lower than average in my test cases,
but I also think that our hard core working set formula should be suitable for
all test cases.

> btw, if the shrink_all_memory() function cannot go away because of
> performance problems, I can help clean it up.  (FYI: I happened to be
> doing so just before you submitted this patchset. :)

That would be great, thanks a lot!

I'm going to post an updated patchset in a while, let's move the discussion to
the new thread.

Best,
Rafael
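
The "returns fewer pages than requested" behaviour discussed in this message
can be illustrated with a toy model.  It is a sketch only, not the code in
kernel/power/snapshot.c: struct mem_model and preallocate_model() are
hypothetical, reclaim is reduced to moving one page at a time from a
"reclaimable" pool to the free pool, and the allocation is assumed to fail
quietly (no OOM kill, no endless retry) once both pools are empty.

```c
/* Toy memory state, counted in pages (hypothetical). */
struct mem_model {
	unsigned long free_pages;        /* immediately allocatable */
	unsigned long reclaimable_pages; /* can be freed by reclaim */
};

/*
 * Try to preallocate nr_pages one page at a time.
 * Returns the number of pages actually obtained, which may be
 * smaller than nr_pages if memory runs out.
 */
unsigned long preallocate_model(struct mem_model *m, unsigned long nr_pages)
{
	unsigned long got = 0;

	while (got < nr_pages) {
		if (m->free_pages == 0) {
			/* Nothing left to reclaim: stop without killing anything. */
			if (m->reclaimable_pages == 0)
				break;
			/* Reclaim a single page into the free pool. */
			m->reclaimable_pages--;
			m->free_pages++;
		}
		m->free_pages--;
		got++;
	}

	return got;
}
```

Starting from 1000 free and 500 reclaimable pages, a request for 2000 pages
returns 1500 and leaves both pools empty; whatever remains in memory beyond
that point is the floor that image_size cannot push below.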

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-11 20:11                                                                                             ` David Rientjes
  0 siblings, 0 replies; 580+ messages in thread
From: David Rientjes @ 2009-05-11 20:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Sun, 10 May 2009, Rafael J. Wysocki wrote:

> > All order 0 allocations are implicitly __GFP_NOFAIL and will loop 
> > endlessly unless they can't block.  So if you want to simply prohibit the 
> > oom killer from being invoked and not change the retry behavior, setting 
> > ZONE_OOM_LOCKED for all zones will do that.  If your machine hangs, it 
> > means nothing can be reclaimed and you can't free memory via oom killing, 
> > so there's nothing else the page allocator can do.
> 
> But I want it to give up in this case instead of looping forever.
> 
> Look.  I have a specific problem at hand that I want to solve and the approach
> you suggested _clearly_ _doesn't_ _work_.  I have also tried to explain to you
> why it doesn't work, but you're ignoring it, so I really don't know what else
> I can say.
> 
> OTOH, the approach suggested by Andrew _does_ _work_ regardless of your
> opinion about it.  It's been tested and it's done the job 100% of the time.  Go
> figure.  And please stop beating the dead horse.
> 

Which implementation are you talking about?  You've had several:

	http://marc.info/?l=linux-kernel&m=124121728429113
	http://marc.info/?l=linux-kernel&m=124131049223733
	http://marc.info/?l=linux-kernel&m=124165031723627
	http://marc.info/?l=linux-kernel&m=124146681311494

The issue with your approach is that it doesn't address the problem: the
problem is _not_ specific to individual page allocations; it is specific to
the STATE OF THE MACHINE.

If all userspace tasks are uninterruptible when trying to reserve this 
memory and, thus, oom killing is negligent and not going to help, that 
needs to be addressed in the page allocator.  It is a bug for the 
allocator to continuously retry the allocation unless __GFP_NOFAIL is set 
if oom killing will not free memory.
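
The disagreement above is about what the allocator should do once reclaim has
failed.  A minimal sketch of the two policies for order-0 requests, as they
are described in this thread, follows.  It is a user-space model, not kernel
code: the MODEL_* flag bits and both function names are made up for
illustration and do not correspond to real gfp flag values.

```c
#include <stdbool.h>

/* Hypothetical flag bits used only by this model. */
enum {
	MODEL_WAIT   = 1u << 0,	/* the caller is allowed to block */
	MODEL_NOFAIL = 1u << 1,	/* the caller insists on success */
};

enum alloc_action { ALLOC_FAIL, ALLOC_RETRY };

/*
 * Behaviour described at the top of the message: an order-0 request
 * keeps looping unless the caller cannot block.
 */
enum alloc_action described_policy(unsigned int flags)
{
	if (!(flags & MODEL_WAIT))
		return ALLOC_FAIL;
	return ALLOC_RETRY;
}

/*
 * Behaviour argued for above: if OOM killing cannot free memory,
 * only an explicit "no fail" request may keep retrying.
 */
enum alloc_action argued_policy(unsigned int flags, bool oom_kill_can_help)
{
	if (!(flags & MODEL_WAIT))
		return ALLOC_FAIL;
	if (flags & MODEL_NOFAIL)
		return ALLOC_RETRY;
	if (!oom_kill_can_help)
		return ALLOC_FAIL;
	return ALLOC_RETRY;
}
```

In this model the two policies differ only for blocking requests without
MODEL_NOFAIL when OOM killing cannot help: the first retries forever, the
second fails the allocation.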

Adding a new __GFP_NO_OOM_KILL flag to address that isn't helpful since it 
has nothing at all to do with the specific allocation.  It may certainly 
be the easiest way to implement your patchset without doing VM work, but 
it's not going to fix the problem for others.

I just posted a patch series[*] that would fix this problem for you 
without even locking out the oom killer or adding any unnecessary gfp 
flags.  It is based on mmotm since it has Mel's page allocator speedups.  
Any change you do to the allocator at this point should be based on that 
to avoid nasty merge conflicts later, so try my series out and see how it 
works.

Now, I won't engage in your personal attacks because (i) nobody else 
cares, and (ii) it's not going to be productive.  I'll let my code do the 
talking.

 [*] http://lkml.org/lkml/2009/5/10/118

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-11 22:44                                                                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-11 22:44 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, fengguang.wu, linux-pm, pavel, Linus Torvalds,
	jens.axboe, alan-jenkins, linux-kernel, kernel-testers,
	Mel Gorman

On Monday 11 May 2009, David Rientjes wrote:
> On Sun, 10 May 2009, Rafael J. Wysocki wrote:
> 
> > > All order 0 allocations are implicitly __GFP_NOFAIL and will loop 
> > > endlessly unless they can't block.  So if you want to simply prohibit the 
> > > oom killer from being invoked and not change the retry behavior, setting 
> > > ZONE_OOM_LOCKED for all zones will do that.  If your machine hangs, it 
> > > means nothing can be reclaimed and you can't free memory via oom killing, 
> > > so there's nothing else the page allocator can do.
> > 
> > But I want it to give up in this case instead of looping forever.
> > 
> > Look.  I have a specific problem at hand that I want to solve and the approach
> > you suggested _clearly_ _doesn't_ _work_.  I have also tried to explain to you
> > why it doesn't work, but you're ignoring it, so I really don't know what else
> > I can say.
> > 
> > OTOH, the approach suggested by Andrew _does_ _work_ regardless of your
> > opinion about it.  It's been tested and it's done the job 100% of the time.  Go
> > figure.  And please stop beating the dead horse.
> > 
> 
> Which implementation are you talking about?  You've had several:
> 
> 	http://marc.info/?l=linux-kernel&m=124121728429113
> 	http://marc.info/?l=linux-kernel&m=124131049223733
> 	http://marc.info/?l=linux-kernel&m=124165031723627
> 	http://marc.info/?l=linux-kernel&m=124146681311494

The second one.  The first one was too much code, the third one was not
Andrew's favourite, and the last one is wrong because it changes the behaviour
related to __GFP_NORETRY incorrectly.

> The issue with your approach is that it doesn't address the problem: the
> problem is _not_ specific to individual page allocations; it is specific to
> the STATE OF THE MACHINE.

Yes, it is, but have you followed my discussion with Andrew?

> If all userspace tasks are uninterruptible when trying to reserve this 
> memory and, thus, oom killing is negligent and not going to help, that 
> needs to be addressed in the page allocator.  It is a bug for the 
> allocator to continuously retry the allocation unless __GFP_NOFAIL is set 
> if oom killing will not free memory.

That was my argument in the discussion with Andrew, actually.

> Adding a new __GFP_NO_OOM_KILL flag to address that isn't helpful since it 
> has nothing at all to do with the specific allocation.  It may certainly 
> be the easiest way to implement your patchset without doing VM work, but 
> it's not going to fix the problem for others.

I agree, but I didn't even want to fix the problem with OOM killing after
freezing tasks.

> I just posted a patch series[*] that would fix this problem for you 
> without even locking out the oom killer or adding any unnecessary gfp 
> flags.  It is based on mmotm since it has Mel's page allocator speedups.  
> Any change you do to the allocator at this point should be based on that 
> to avoid nasty merge conflicts later, so try my series out and see how it 
> works.
> 
> Now, I won't engage in your personal attacks because (i) nobody else 
> cares, and (ii) it's not going to be productive.

My previous message wasn't meant to be personal, so I'm sorry if it sounded
like it was.

> I'll let my code do the talking.
>
>  [*] http://lkml.org/lkml/2009/5/10/118

OK, so the patch is http://lkml.org/lkml/2009/5/10/127, isn't it?  I'm not
sure it will fly, given Andrew's reply.

In fact the problem is that processes in D state are only legitimately going
to stay in this state when they are _frozen_.  So, the right approach seems to
be to avoid calling the OOM killer at all after freezing processes and instead
fail the allocations that would have triggered it.  Which means this patch:
http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
one).

But Andrew says that it's better to have a __GFP_NO_OOM_KILL flag instead,
because someone else might presumably use it in future for something (I have
no idea who that might be, but whatever) and _surely_ no one else will use a
global switch related to the freezer.

Still _I_ think that since the freezer is the source of the problematic
situation (all tasks are persistently unkillable), using it should change the
behaviour of the page allocator, so that the OOM killer is not activated
while processes are frozen.  And in fact that should not depend on what flags
are used by whoever tries to allocate memory.
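
The difference between the two approaches compared above can be sketched in
a few lines of C.  This is a user-space model under stated assumptions:
model_tasks_frozen, MODEL_NO_OOM_KILL and all function names are
hypothetical, chosen only to contrast a per-allocation flag with a global
switch driven by the freezer.

```c
#include <stdbool.h>

/* Hypothetical per-allocation flag bit, for this model only. */
#define MODEL_NO_OOM_KILL (1u << 0)

enum oom_decision { OOM_FAIL_ALLOCATION, OOM_INVOKE_KILLER };

/* Hypothetical global state: set while all tasks are frozen. */
static bool model_tasks_frozen;

void model_freeze_tasks(void) { model_tasks_frozen = true; }
void model_thaw_tasks(void)   { model_tasks_frozen = false; }

/* Flag-based approach: only callers that pass the flag are protected. */
enum oom_decision per_flag_decision(unsigned int flags)
{
	if (flags & MODEL_NO_OOM_KILL)
		return OOM_FAIL_ALLOCATION;
	return OOM_INVOKE_KILLER;
}

/* Global-switch approach: the decision ignores the caller's flags. */
enum oom_decision global_switch_decision(void)
{
	if (model_tasks_frozen)
		return OOM_FAIL_ALLOCATION;
	return OOM_INVOKE_KILLER;
}
```

With tasks frozen, a caller that does not pass the flag still reaches the OOM
killer in the flag-based model, whereas the global switch covers every caller
until the tasks are thawed.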

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-11 20:11                                                                                             ` David Rientjes
  (?)
  (?)
@ 2009-05-11 22:44                                                                                             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-11 22:44 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel-testers, Mel Gorman, linux-kernel, alan-jenkins,
	jens.axboe, Andrew Morton, fengguang.wu, Linus Torvalds,
	linux-pm

On Monday 11 May 2009, David Rientjes wrote:
> On Sun, 10 May 2009, Rafael J. Wysocki wrote:
> 
> > > All order 0 allocations are implicitly __GFP_NOFAIL and will loop 
> > > endlessly unless they can't block.  So if you want to simply prohibit the 
> > > oom killer from being invoked and not change the retry behavior, setting 
> > > ZONE_OOM_LOCKED for all zones will do that.  If your machine hangs, it 
> > > means nothing can be reclaimed and you can't free memory via oom killing, 
> > > so there's nothing else the page allocator can do.
> > 
> > But I want it to give up in this case instead of looping forever.
> > 
> > Look.  I have a specific problem at hand that I want to solve and the approach
> > you suggested _clearly_ _doesn't_ _work_.  I have also tried to explain to you
> > why it doesn't work, but you're ignoring it, so I really don't know what else
> > I can say.
> > 
> > OTOH, the approach suggested by Andrew _does_ _work_ regardless of your
> > opinion about it.  It's been tested and it's done the job 100% of the time.  Go
> > figure.  And please stop beating the dead horse.
> > 
> 
> Which implementation are you talking about?  You've had several:
> 
> 	http://marc.info/?l=linux-kernel&m=124121728429113
> 	http://marc.info/?l=linux-kernel&m=124131049223733
> 	http://marc.info/?l=linux-kernel&m=124165031723627
> 	http://marc.info/?l=linux-kernel&m=124146681311494

The second one.  The first one was too much code, the third one was not
Andrew's favourite, and the last one is wrong because it changes the behaviour
related to __GFP_NORETRY incorrectly.

> The issue with your approach is that it doesn't address the problem: the
> problem is _not_ specific to individual page allocations; it is specific to
> the STATE OF THE MACHINE.

Yes, it is, but have you followed my discussion with Andrew?

> If all userspace tasks are uninterruptible when trying to reserve this 
> memory and, thus, oom killing is negligent and not going to help, that 
> needs to be addressed in the page allocator.  It is a bug for the 
> allocator to continuously retry the allocation unless __GFP_NOFAIL is set 
> if oom killing will not free memory.

That was my argument in the discussion with Andrew, actually.

> Adding a new __GFP_NO_OOM_KILL flag to address that isn't helpful since it 
> has nothing at all to do with the specific allocation.  It may certainly 
> be the easiest way to implement your patchset without doing VM work, but 
> it's not going to fix the problem for others.

I agree, but I didn't even want to fix the problem with OOM killing after
freezing tasks.

> I just posted a patch series[*] that would fix this problem for you 
> without even locking out the oom killer or adding any unnecessary gfp 
> flags.  It is based on mmotm since it has Mel's page allocator speedups.  
> Any change you do to the allocator at this point should be based on that 
> to avoid nasty merge conflicts later, so try my series out and see how it 
> works.
> 
> Now, I won't engage in your personal attacks because (i) nobody else 
> cares, and (ii) it's not going to be productive.

My previous message wasn't meant to be personal, so I'm sorry if it sounded
like it was.

> I'll let my code do the talking.
>
>  [*] http://lkml.org/lkml/2009/5/10/118

OK, so the patch is http://lkml.org/lkml/2009/5/10/127, isn't it?  I'm not
sure it will fly, given Andrew's reply.

In fact the problem is that processes in D state are only legitimately going
to stay in this state when they are _frozen_.  So, the right approach seems to
be to avoid calling the OOM killer at all after freezing processes and instead
fail the allocations that would have triggered it.  Which means this patch:
http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
one).

But Andrew says that it's better to have a __GFP_NO_OOM_KILL flag instead,
because someone else might presumably use it in future for something (I have
no idea who that might be, but whatever) and _surely_ no one else will use a
global switch related to the freezer.

Still _I_ think that since the freezer is the source of the problematic
situation (all tasks are persistently unkillable), using it should change the
behaviour of the page allocator, so that the OOM killer is not activated
while processes are frozen.  And in fact that should not depend on what flags
are used by whoever tries to allocate memory.
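
To illustrate the difference between the two approaches, here is a minimal
userspace sketch.  It is a model only: the MODEL_ flag values and the
may_oom_kill_*() helpers are hypothetical and do not exist in the kernel.  The
only part taken from the allocator is the shape of the condition (__GFP_FS set
and __GFP_NORETRY clear) under which the OOM killer is considered.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_FS          0x1u
#define MODEL_GFP_NORETRY     0x2u
#define MODEL_GFP_NO_OOM_KILL 0x4u	/* per-allocation flag variant */

/* Global switch variant: describes the state of the machine. */
static bool model_oom_killer_disabled;

/* Flag variant: the outcome depends on what each caller passed in. */
static bool may_oom_kill_flag(unsigned int gfp_mask)
{
	if (!(gfp_mask & MODEL_GFP_FS) || (gfp_mask & MODEL_GFP_NORETRY))
		return false;
	return !(gfp_mask & MODEL_GFP_NO_OOM_KILL);
}

/* Switch variant: the outcome is the same for every caller. */
static bool may_oom_kill_switch(unsigned int gfp_mask)
{
	if (!(gfp_mask & MODEL_GFP_FS) || (gfp_mask & MODEL_GFP_NORETRY))
		return false;
	return !model_oom_killer_disabled;
}
```

In this model a caller that passes plain MODEL_GFP_FS without the extra flag
still reaches the OOM killer in the flag variant, while the switch variant
covers it without the caller having to know about the freezer.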

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread


* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-11 23:07                                                                                                 ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-11 23:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tue, 12 May 2009 00:44:36 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> Which means this patch:
> http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> one).

ho hum, I could live with that ;)

Would it make sense to turn it into something more general?  Instead of
"tasks_frozen/processes_are_frozen()", present it as
"oom_killer_disabled/oom_killer_is_disabled()"?

That would invite other subsystems to use it, if they want to.  Which
might well be a bad thing on their behalf, hard to say..



^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-11 23:28                                                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-11 23:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 00:44:36 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > Which means this patch:
> > http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> > one).
> 
> ho hum, I could live with that ;)
> 
> Would it make sense to turn it into something more general?  Instead of
> "tasks_frozen/processes_are_frozen()", present it as
> "oom_killer_disabled/oom_killer_is_disabled()"?
> 
> That would invite other subsystems to use it, if they want to.  Which
> might well be a bad thing on their behalf, hard to say..

I chose the names this way because the variable is defined in the freezer code.

Alternatively, I can define one in page_alloc.c, add [disable|enable]_oom_killer()
for manipulating it and call them from the freezer code.  Do you think that
would be better?
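
For illustration only, that alternative could look roughly like the userspace
model below.  All model_ names are hypothetical stand-ins, not kernel code, and
the -1 error value is arbitrary.  The sketch assumes the switch is only set
once freezing has succeeded and is cleared again before tasks are thawed.

```c
#include <stdbool.h>

/* Would live next to the page allocator (page_alloc.c in the proposal). */
static bool model_oom_killer_disabled;

static void model_disable_oom_killer(void)
{
	model_oom_killer_disabled = true;
}

static void model_enable_oom_killer(void)
{
	model_oom_killer_disabled = false;
}

/* Freezer side: it only calls the two helpers above. */
static int model_freeze_processes(bool freezing_succeeds)
{
	if (!freezing_succeeds)
		return -1;	/* tasks still run, leave the OOM killer alone */
	model_disable_oom_killer();
	return 0;
}

static void model_thaw_processes(void)
{
	/* Re-enable first, so thawed tasks see the normal behaviour. */
	model_enable_oom_killer();
}
```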

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12  0:11                                                                                                     ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-12  0:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tue, 12 May 2009 01:28:15 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Tuesday 12 May 2009, Andrew Morton wrote:
> > On Tue, 12 May 2009 00:44:36 +0200
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > Which means this patch:
> > > http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> > > one).
> > 
> > ho hum, I could live with that ;)
> > 
> > Would it make sense to turn it into something more general?  Instead of
> > "tasks_frozen/processes_are_frozen()", present it as
> > "oom_killer_disabled/oom_killer_is_disabled()"?
> > 
> > That would invite other subsystems to use it, if they want to.  Which
> > might well be a bad thing on their behalf, hard to say..
> 
> I chose the names this way because the variable is defined in the freezer code.
> 
> Alternatively, I can define one in page_alloc.c, add [disable|enable]_oom_killer()
> for manipulating it and call them from the freezer code.  Do you think that
> would be better?

The choice is:

a) put a general oom-killer interface function into the oom-killer
   code, call that from swsusp.

b) put a swsusp-specific change into the oom-killer, call that from swsusp.


From a cleanliness POV, a) is way better.  But it does need to be a
general function!  If there's some hidden requirement which only makes
the function applicable to swsusp, such as "all tasks must be frozen" then
we'd be kidding ourselves by making it general-looking.

I have a bad feeling that after one week and 12^17 emails, we're back
to your original patch :)


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12 16:52                                                                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 16:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 01:28:15 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Tuesday 12 May 2009, Andrew Morton wrote:
> > > On Tue, 12 May 2009 00:44:36 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > Which means this patch:
> > > > http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> > > > one).
> > > 
> > > ho hum, I could live with that ;)
> > > 
> > > Would it make sense to turn it into something more general?  Instead of
> > > "tasks_frozen/processes_are_frozen()", present it as
> > > "oom_killer_disabled/oom_killer_is_disabled()"?
> > > 
> > > That would invite other subsystems to use it, if they want to.  Which
> > > might well be a bad thing on their behalf, hard to say..
> > 
> > I chose the names this way because the variable is defined in the freezer code.
> > 
> > Alternatively, I can define one in page_alloc.c, add [disable|enable]_oom_killer()
> > for manipulating it and call them from the freezer code.  Do you think that
> > would be better?
> 
> The choice is:
> 
> a) put a general oom-killer interface function into the oom-killer
>    code, call that from swsusp.
> 
> b) put a swsusp-specific change into the oom-killer, call that from swsusp.
> 
> 
> From a cleanliness POV, a) is way better.  But it does need to be a
> general function!  If there's some hidden requirement which only makes
> the function applicable to swsusp, such as "all tasks must be frozen" then
> we'd be kidding ourselves by making it general-looking.

Hmm.  I guess there may be other situations in which it's better to fail
memory allocations than to kill tasks.

> I have a bad feeling that after one week and 12^17 emails, we're back
> to your original patch :)

Well, what about the following?

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: mm, PM/Freezer: Disable OOM killer when tasks are frozen

Currently, the following scenario appears to be possible in theory:
* Tasks are frozen for hibernation or suspend.
* Free pages are almost exhausted.
* A certain piece of code in the suspend code path attempts to allocate
  some memory using GFP_KERNEL and allocation order less than or
  equal to PAGE_ALLOC_COSTLY_ORDER.
* __alloc_pages_internal() cannot find a free page so it invokes the
  OOM killer.
* The OOM killer attempts to kill a task, but the task is frozen, so
  it doesn't die immediately.
* __alloc_pages_internal() jumps to 'restart', unsuccessfully tries
  to find a free page and invokes the OOM killer.
* No progress can be made.
Although it is now hard to trigger during hibernation due to the
memory shrinking carried out by the hibernation code, it is
theoretically possible to trigger during suspend after the memory
shrinking has been removed from that code path.  Moreover, since
memory allocations are going to be used for the hibernation memory
shrinking, it will be even more likely to happen during hibernation.

To prevent it from happening, introduce the oom_killer_disabled
switch that will cause __alloc_pages_internal() to fail in the
situations in which the OOM killer would have been called and make
the freezer set this switch after tasks have been successfully
frozen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/gfp.h    |   12 ++++++++++++
 kernel/power/process.c |    5 +++++
 mm/page_alloc.c        |    5 +++++
 3 files changed, 22 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -175,6 +175,8 @@ static void set_pageblock_migratetype(st
 					PB_migrate, PB_migrate_end);
 }
 
+bool oom_killer_disabled __read_mostly;
+
 #ifdef CONFIG_DEBUG_VM
 static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
 {
@@ -1600,6 +1602,9 @@ nofail_alloc:
 		if (page)
 			goto got_pg;
 	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+		if (oom_killer_disabled)
+			goto nopage;
+
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -245,4 +245,16 @@ void drain_zone_pages(struct zone *zone,
 void drain_all_pages(void);
 void drain_local_pages(void *dummy);
 
+extern bool oom_killer_disabled;
+
+static inline void disable_oom_killer(void)
+{
+	oom_killer_disabled = true;
+}
+
+static inline void enable_oom_killer(void)
+{
+	oom_killer_disabled = false;
+}
+
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/kernel/power/process.c
===================================================================
--- linux-2.6.orig/kernel/power/process.c
+++ linux-2.6/kernel/power/process.c
@@ -117,9 +117,12 @@ int freeze_processes(void)
 	if (error)
 		goto Exit;
 	printk("done.");
+
+	disable_oom_killer();
  Exit:
 	BUG_ON(in_atomic());
 	printk("\n");
+
 	return error;
 }
 
@@ -145,6 +148,8 @@ static void thaw_tasks(bool nosig_only)
 
 void thaw_processes(void)
 {
+	enable_oom_killer();
+
 	printk("Restarting tasks ... ");
 	thaw_tasks(true);
 	thaw_tasks(false);
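
The scenario from the changelog can be modelled in userspace roughly as
follows.  This is a sketch under stated assumptions, not the allocator itself:
struct model_state, model_oom_kill() and model_alloc_page() are hypothetical,
the number of pages released by a killed task is arbitrary, and a restart
budget stands in for the endless loop so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical. */
struct model_state {
	unsigned long free_pages;	/* pages the allocator can hand out */
	bool tasks_frozen;		/* set by the (modelled) freezer */
	bool oom_killer_disabled;	/* the switch the patch introduces */
	unsigned long oom_kills;	/* how often the OOM killer ran */
};

/* A frozen victim cannot run its exit path, so nothing is freed. */
static void model_oom_kill(struct model_state *s)
{
	s->oom_kills++;
	if (!s->tasks_frozen)
		s->free_pages += 16;	/* arbitrary amount for the model */
}

/*
 * Returns 1 on success, 0 on a clean allocation failure, and -1 if the
 * restart budget ran out, which stands in for looping forever.
 */
static int model_alloc_page(struct model_state *s, unsigned int max_restarts)
{
	unsigned int restarts;

	assert(max_restarts > 0);
	for (restarts = 0; restarts < max_restarts; restarts++) {
		if (s->free_pages > 0) {
			s->free_pages--;
			return 1;
		}
		if (s->oom_killer_disabled)
			return 0;	/* mirrors 'goto nopage' */
		model_oom_kill(s);	/* then back to 'restart' */
	}
	return -1;
}
```

With tasks frozen and the switch clear, the model burns its whole restart
budget without making progress.  With the switch set it fails at once, and
with tasks running a single OOM kill is enough for the allocation to succeed.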

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-12  0:11                                                                                                     ` Andrew Morton
  (?)
  (?)
@ 2009-05-12 16:52                                                                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 16:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-testers, mel, rientjes, linux-kernel, alan-jenkins,
	jens.axboe, linux-pm, fengguang.wu, torvalds

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 01:28:15 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Tuesday 12 May 2009, Andrew Morton wrote:
> > > On Tue, 12 May 2009 00:44:36 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > Which means this patch:
> > > > http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> > > > one).
> > > 
> > > ho hum, I could live with that ;)
> > > 
> > > Would it make sense to turn it into something more general?  Instead of
> > > "tasks_frozen/processes_are_frozen()", present it as
> > > "oom_killer_disabled/oom_killer_is_disabled()"?
> > > 
> > > That would invite other subsystems to use it, if they want to.  Which
> > > might well be a bad thing on their behalf, hard to say..
> > 
> > I chose the names this way because the variable is defined in the freezer code.
> > 
> > Alternatively, I can define one in page_alloc.c, add [disable|enable]_oom_killer()
> > for manipulating it and call them from the freezer code.  Do you think that
> > would be better?
> 
> The choice is:
> 
> a) put a general oom-killer interface function into the oom-killer
>    code, call that from swsusp.
> 
> b) put a swsusp-specific change into the oom-killer, call that from swsusp.
> 
> 
> From a cleanliess POV, a) is way better.  But it does need to be a
> general function!  If there's some hidden requirement which only makes
> the function applicable to swsusp, such as "all tasks must be frozen" then
> we'd be kidding ourselves by making it general-looking.

Hmm.  I guess there may be other situations in which it's better to fail
memory allocations than to kill tasks.

> I have a bad feeling that after one week and 12^17 emails, we're back
> to your original patch :)

Well, what about the following?

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: mm, PM/Freezer: Disable OOM killer when tasks are frozen

Currently, the following scenario appears to be possible in theory:
* Tasks are frozen for hibernation or suspend.
* Free pages are almost exhausted.
* Certain piece of code in the suspend code path attempts to allocate
  some memory using GFP_KERNEL and allocation order less than or
  equal to PAGE_ALLOC_COSTLY_ORDER.
* __alloc_pages_internal() cannot find a free page so it invokes the
  OOM killer.
* The OOM killer attempts to kill a task, but the task is frozen, so
  it doesn't die immediately.
* __alloc_pages_internal() jumps to 'restart', unsuccessfully tries
  to find a free page and invokes the OOM killer.
* No progress can be made.
Although it is now hard to trigger during hibernation due to the
memory shrinking carried out by the hibernation code, it is
theoretically possible to trigger during suspend after the memory
shrinking has been removed from that code path.  Moreover, since
memory allocations are going to be used for the hibernation memory
shrinking, it will be even more likely to happen during hibernation.

To prevent it from happening, introduce the oom_killer_disabled
switch that will cause __alloc_pages_internal() to fail in the
situations in which the OOM killer would have been called and make
the freezer set this switch after tasks have been successfully
frozen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/gfp.h    |   12 ++++++++++++
 kernel/power/process.c |    5 +++++
 mm/page_alloc.c        |    5 +++++
 3 files changed, 22 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -175,6 +175,8 @@ static void set_pageblock_migratetype(st
 					PB_migrate, PB_migrate_end);
 }
 
+bool oom_killer_disabled __read_mostly;
+
 #ifdef CONFIG_DEBUG_VM
 static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
 {
@@ -1600,6 +1602,9 @@ nofail_alloc:
 		if (page)
 			goto got_pg;
 	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+		if (oom_killer_disabled)
+			goto nopage;
+
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -245,4 +245,16 @@ void drain_zone_pages(struct zone *zone,
 void drain_all_pages(void);
 void drain_local_pages(void *dummy);
 
+extern bool oom_killer_disabled;
+
+static inline void disable_oom_killer(void)
+{
+	oom_killer_disabled = true;
+}
+
+static inline void enable_oom_killer(void)
+{
+	oom_killer_disabled = false;
+}
+
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/kernel/power/process.c
===================================================================
--- linux-2.6.orig/kernel/power/process.c
+++ linux-2.6/kernel/power/process.c
@@ -117,9 +117,12 @@ int freeze_processes(void)
 	if (error)
 		goto Exit;
 	printk("done.");
+
+	disable_oom_killer();
  Exit:
 	BUG_ON(in_atomic());
 	printk("\n");
+
 	return error;
 }
 
@@ -145,6 +148,8 @@ static void thaw_tasks(bool nosig_only)
 
 void thaw_processes(void)
 {
+	enable_oom_killer();
+
 	printk("Restarting tasks ... ");
 	thaw_tasks(true);
 	thaw_tasks(false);
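
The scenario in the changelog above can be sketched as a small user-space
model.  This is only an illustration under stated assumptions, not the
kernel allocator: toy_alloc_page(), toy_oom_kill(), tasks_frozen,
free_pages and the retry budget are all hypothetical names made up for the
sketch.  Only oom_killer_disabled and the disable/enable helpers mirror
names from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real allocator. */
static bool oom_killer_disabled;	/* mirrors the switch added by the patch */
static bool tasks_frozen;		/* true while the freezer holds all tasks */
static int free_pages;			/* pages the toy allocator can hand out */

static void disable_oom_killer(void) { oom_killer_disabled = true; }
static void enable_oom_killer(void)  { oom_killer_disabled = false; }

/* A frozen victim cannot run its exit path, so it releases nothing. */
static void toy_oom_kill(void)
{
	if (!tasks_frozen)
		free_pages += 1;
}

/*
 * Toy version of the retry loop described in the changelog.
 * Returns 1 if a page was allocated, 0 if the allocation failed cleanly
 * because the OOM killer is disabled, and -1 if the retry budget ran out
 * (the model's stand-in for "no progress can be made").
 */
static int toy_alloc_page(int max_retries)
{
	int tries;

	assert(max_retries > 0);

	for (tries = 0; tries < max_retries; tries++) {
		if (free_pages > 0) {
			free_pages--;
			return 1;
		}
		if (oom_killer_disabled)
			return 0;	/* corresponds to "goto nopage" */
		toy_oom_kill();		/* then loop, like "goto restart" */
	}
	return -1;
}
```

With free_pages == 0 and tasks_frozen set, toy_alloc_page() exhausts its
retry budget unless disable_oom_killer() has been called first, in which
case it fails immediately and the caller can handle the error.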

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12 16:52                                                                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 16:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	fengguang.wu-ral2JQCrhuEAvxtiuMwx3w,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	mel-wPRd99KPJ+uzQB+pC5nmwQ

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 01:28:15 +0200
> "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
> 
> > On Tuesday 12 May 2009, Andrew Morton wrote:
> > > On Tue, 12 May 2009 00:44:36 +0200
> > > "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
> > > 
> > > > Which means this patch:
> > > > http://marc.info/?l=linux-kernel&m=124165031723627 (it also is my favourite
> > > > one).
> > > 
> > > ho hum, I could live with that ;)
> > > 
> > > Would it make sense to turn it into something more general?  Instead of
> > > "tasks_frozen/processes_are_frozen()", present it as
> > > "oom_killer_disabled/oom_killer_is_disabled()"?
> > > 
> > > That would invite other subsystems to use it, if they want to.  Which
> > > might well be a bad thing on their behalf, hard to say..
> > 
> > I chose the names this way because the variable is defined in the freezer code.
> > 
> > Alternatively, I can define one in page_alloc.c, add [disable|enable]_oom_killer()
> > for manipulating it and call them from the freezer code.  Do you think that
> > would be better?
> 
> The choice is:
> 
> a) put a general oom-killer interface function into the oom-killer
>    code, call that from swsusp.
> 
> b) put a swsusp-specific change into the oom-killer, call that from swsusp.
> 
> 
> From a cleanliness POV, a) is way better.  But it does need to be a
> general function!  If there's some hidden requirement which only makes
> the function applicable to swsusp, such as "all tasks must be frozen" then
> we'd be kidding ourselves by making it general-looking.

Hmm.  I guess there may be other situations in which it's better to fail
memory allocations than to kill tasks.

> I have a bad feeling that after one week and 12^17 emails, we're back
> to your original patch :)

Well, what about the following?

---
From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
Subject: mm, PM/Freezer: Disable OOM killer when tasks are frozen

Currently, the following scenario appears to be possible in theory:
* Tasks are frozen for hibernation or suspend.
* Free pages are almost exhausted.
* A certain piece of code in the suspend code path attempts to allocate
  some memory using GFP_KERNEL and allocation order less than or
  equal to PAGE_ALLOC_COSTLY_ORDER.
* __alloc_pages_internal() cannot find a free page so it invokes the
  OOM killer.
* The OOM killer attempts to kill a task, but the task is frozen, so
  it doesn't die immediately.
* __alloc_pages_internal() jumps to 'restart', unsuccessfully tries
  to find a free page and invokes the OOM killer again.
* No progress can be made.

Although it is now hard to trigger during hibernation due to the
memory shrinking carried out by the hibernation code, it is
theoretically possible to trigger during suspend after the memory
shrinking has been removed from that code path.  Moreover, since
memory allocations are going to be used for the hibernation memory
shrinking, it will be even more likely to happen during hibernation.

To prevent it from happening, introduce the oom_killer_disabled
switch that will cause __alloc_pages_internal() to fail in the
situations in which the OOM killer would have been called and make
the freezer set this switch after tasks have been successfully
frozen.

Signed-off-by: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
---
 include/linux/gfp.h    |   12 ++++++++++++
 kernel/power/process.c |    5 +++++
 mm/page_alloc.c        |    5 +++++
 3 files changed, 22 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -175,6 +175,8 @@ static void set_pageblock_migratetype(st
 					PB_migrate, PB_migrate_end);
 }
 
+bool oom_killer_disabled __read_mostly;
+
 #ifdef CONFIG_DEBUG_VM
 static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
 {
@@ -1600,6 +1602,9 @@ nofail_alloc:
 		if (page)
 			goto got_pg;
 	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+		if (oom_killer_disabled)
+			goto nopage;
+
 		if (!try_set_zone_oom(zonelist, gfp_mask)) {
 			schedule_timeout_uninterruptible(1);
 			goto restart;
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -245,4 +245,16 @@ void drain_zone_pages(struct zone *zone,
 void drain_all_pages(void);
 void drain_local_pages(void *dummy);
 
+extern bool oom_killer_disabled;
+
+static inline void disable_oom_killer(void)
+{
+	oom_killer_disabled = true;
+}
+
+static inline void enable_oom_killer(void)
+{
+	oom_killer_disabled = false;
+}
+
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/kernel/power/process.c
===================================================================
--- linux-2.6.orig/kernel/power/process.c
+++ linux-2.6/kernel/power/process.c
@@ -117,9 +117,12 @@ int freeze_processes(void)
 	if (error)
 		goto Exit;
 	printk("done.");
+
+	disable_oom_killer();
  Exit:
 	BUG_ON(in_atomic());
 	printk("\n");
+
 	return error;
 }
 
@@ -145,6 +148,8 @@ static void thaw_tasks(bool nosig_only)
 
 void thaw_processes(void)
 {
+	enable_oom_killer();
+
 	printk("Restarting tasks ... ");
 	thaw_tasks(true);
 	thaw_tasks(false);

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-12 16:52                                                                                                       ` Rafael J. Wysocki
@ 2009-05-12 17:50                                                                                                         ` Andrew Morton
  -1 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-12 17:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tue, 12 May 2009 18:52:36 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> Well, what about the following?

It has the virtue of simplicity.

> +static inline void disable_oom_killer(void)
> +{
> +	oom_killer_disabled = true;
> +}
> +
> +static inline void enable_oom_killer(void)
> +{
> +	oom_killer_disabled = false;
> +}

I'll change these to oom_killer_disable() and oom_killer_enable(), OK?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12 17:50                                                                                                         ` Andrew Morton
  0 siblings, 0 replies; 580+ messages in thread
From: Andrew Morton @ 2009-05-12 17:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: kernel-testers, mel, rientjes, linux-kernel, alan-jenkins,
	jens.axboe, linux-pm, fengguang.wu, torvalds

On Tue, 12 May 2009 18:52:36 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> Well, what about the following?

It has the virtue of simplicity.

> +static inline void disable_oom_killer(void)
> +{
> +	oom_killer_disabled = true;
> +}
> +
> +static inline void enable_oom_killer(void)
> +{
> +	oom_killer_disabled = false;
> +}

I'll change these to oom_killer_disable() and oom_killer_enable(), OK?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12 20:40                                                                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 20:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes, fengguang.wu, linux-pm, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers, mel

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 18:52:36 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > Well, what about the following?
> 
> It has the virtue of simplicity.
> 
> > +static inline void disable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = true;
> > +}
> > +
> > +static inline void enable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = false;
> > +}
> 
> I'll change these to oom_killer_disable() and oom_killer_enable(), OK?

Works for me, thanks!

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
  2009-05-12 17:50                                                                                                         ` Andrew Morton
  (?)
@ 2009-05-12 20:40                                                                                                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 20:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-testers, mel, rientjes, linux-kernel, alan-jenkins,
	jens.axboe, linux-pm, fengguang.wu, torvalds

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 18:52:36 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > Well, what about the following?
> 
> It has the virtue of simplicity.
> 
> > +static inline void disable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = true;
> > +}
> > +
> > +static inline void enable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = false;
> > +}
> 
> I'll change these to oom_killer_disable() and oom_killer_enable(), OK?

Works for me, thanks!

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag
@ 2009-05-12 20:40                                                                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-12 20:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	fengguang.wu-ral2JQCrhuEAvxtiuMwx3w,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pavel-+ZI9xUNit7I, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	mel-wPRd99KPJ+uzQB+pC5nmwQ

On Tuesday 12 May 2009, Andrew Morton wrote:
> On Tue, 12 May 2009 18:52:36 +0200 "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
> 
> > Well, what about the following?
> 
> It has the virtue of simplicity.
> 
> > +static inline void disable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = true;
> > +}
> > +
> > +static inline void enable_oom_killer(void)
> > +{
> > +	oom_killer_disabled = false;
> > +}
> 
> I'll change these to oom_killer_disable() and oom_killer_enable(), OK?

Works for me, thanks!

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
@ 2009-05-17  8:12       ` Pekka Enberg
  0 siblings, 0 replies; 580+ messages in thread
From: Pekka Enberg @ 2009-05-17  8:12 UTC (permalink / raw)
  To: Adrian McMenamin
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Manuel Lauss

On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.29.  Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=13069
>> Subject               : regression in 2.6.29-git3 on SH/Dreamcast
>> Submitter     : Adrian McMenamin <adrian@newgolddream.dyndns.info>
>> Date          : 2009-03-29 19:04 (19 days old)
>> References    : http://marc.info/?l=linux-kernel&m=123835353115372&w=4

On Fri, Apr 24, 2009 at 8:37 PM, Adrian McMenamin
<adrian@newgolddream.info> wrote:
> At this point it *looks* as though it was simply a question of
> insufficient memory to boot, but I think it needs further testing.
> Nobody else seems to have picked it up though.

Yeah, it looks like there's not enough memory to boot. It could well
be that the kernel memory footprint got bigger but I don't think the
bisected commit is at fault here. Perhaps we should just close the
bug?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
@ 2009-05-17  8:12       ` Pekka Enberg
  0 siblings, 0 replies; 580+ messages in thread
From: Pekka Enberg @ 2009-05-17  8:12 UTC (permalink / raw)
  To: Adrian McMenamin
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Manuel Lauss

On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.29.  Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=13069
>> Subject               : regression in 2.6.29-git3 on SH/Dreamcast
>> Submitter     : Adrian McMenamin <adrian-TSF8l6Tg6afpT6hvJLqO3fkIJyZcjF/4@public.gmane.org>
>> Date          : 2009-03-29 19:04 (19 days old)
>> References    : http://marc.info/?l=linux-kernel&m=123835353115372&w=4

On Fri, Apr 24, 2009 at 8:37 PM, Adrian McMenamin
<adrian-TSF8l6Tg6ad3zSCowOiEO2GXanvQGlWp@public.gmane.org> wrote:
> At this point it *looks* as though it was simply a question of
> insufficient memory to boot, but I think it needs further testing.
> Nobody else seems to have picked it up though.

Yeah, it looks like there's not enough memory to boot. It could well
be that the kernel memory footprint got bigger but I don't think the
bisected commit is at fault here. Perhaps we should just close the
bug?

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
@ 2009-05-17 10:28         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-17 10:28 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Adrian McMenamin, Linux Kernel Mailing List, Kernel Testers List,
	Manuel Lauss

On Sunday 17 May 2009, Pekka Enberg wrote:
> On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.29.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=13069
> >> Subject               : regression in 2.6.29-git3 on SH/Dreamcast
> >> Submitter     : Adrian McMenamin <adrian@newgolddream.dyndns.info>
> >> Date          : 2009-03-29 19:04 (19 days old)
> >> References    : http://marc.info/?l=linux-kernel&m=123835353115372&w=4
> 
> On Fri, Apr 24, 2009 at 8:37 PM, Adrian McMenamin
> <adrian@newgolddream.info> wrote:
> > At this point it *looks* as though it was simply a question of
> > insufficient memory to boot, but I think it needs further testing.
> > Nobody else seems to have picked it up though.
> 
> Yeah, it looks like there's not enough memory to boot. It could well
> be that the kernel memory footprint got bigger but I don't think the
> bisected commit is at fault here. Perhaps we should just close the
> bug?

Done.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
@ 2009-05-17 10:28         ` Rafael J. Wysocki
  0 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-05-17 10:28 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Adrian McMenamin, Linux Kernel Mailing List, Kernel Testers List,
	Manuel Lauss

On Sunday 17 May 2009, Pekka Enberg wrote:
> On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.29.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=13069
> >> Subject               : regression in 2.6.29-git3 on SH/Dreamcast
> >> Submitter     : Adrian McMenamin <adrian-TSF8l6Tg6afpT6hvJLqO3U8SxdOydiOw@public.gmane.org>
> >> Date          : 2009-03-29 19:04 (19 days old)
> >> References    : http://marc.info/?l=linux-kernel&m=123835353115372&w=4
> 
> On Fri, Apr 24, 2009 at 8:37 PM, Adrian McMenamin
> <adrian-TSF8l6Tg6ad3zSCowOiEO2GXanvQGlWp@public.gmane.org> wrote:
> > At this point it *looks* as though it was simply a question of
> > insufficient memory to boot, but I think it needs further testing.
> > Nobody else seems to have picked it up though.
> 
> Yeah, it looks like there's not enough memory to boot. It could well
> be that the kernel memory footprint got bigger but I don't think the
> bisected commit is at fault here. Perhaps we should just close the
> bug?

Done.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast
  2009-05-17  8:12       ` Pekka Enberg
  (?)
  (?)
@ 2009-05-17 10:38       ` Adrian McMenamin
  -1 siblings, 0 replies; 580+ messages in thread
From: Adrian McMenamin @ 2009-05-17 10:38 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Manuel Lauss

On Sun, 2009-05-17 at 11:12 +0300, Pekka Enberg wrote:
> On Thu, 2009-04-16 at 23:45 +0200, Rafael J. Wysocki wrote:
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.29.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=13069
> >> Subject               : regression in 2.6.29-git3 on SH/Dreamcast
> >> Submitter     : Adrian McMenamin <adrian@newgolddream.dyndns.info>
> >> Date          : 2009-03-29 19:04 (19 days old)
> >> References    : http://marc.info/?l=linux-kernel&m=123835353115372&w=4
> 
> On Fri, Apr 24, 2009 at 8:37 PM, Adrian McMenamin
> <adrian@newgolddream.info> wrote:
> > At this point it *looks* as though it was simply a question of
> > insufficient memory to boot, but I think it needs further testing.
> > Nobody else seems to have picked it up though.
> 
> Yeah, it looks like there's not enough memory to boot. It could well
> be that the kernel memory footprint got bigger but I don't think the
> bisected commit is at fault here. Perhaps we should just close the
> bug?

I think that is probably the best course of action for now, yes.


^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-08-16 13:46                                                     ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-08-16 13:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.

The 10-fold slowdown may be related to swap I/O:
shrink_all_memory() tends to reclaim fewer anon pages.

Is this box running on SSD? (Which can be slow on random writes.)

Thanks,
Fengguang
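
The figures in the quoted log lines can be cross-checked with a few lines
of C.  This is a sketch under assumptions: a 4 kB page size (consistent
with 383900 pages being reported as 1535600 kbytes) and truncating integer
arithmetic, which would explain why the log shows 447.69 rather than a
rounded 447.70.  The helper names are hypothetical and are not taken from
the hibernation code.

```c
#include <assert.h>

#define TOY_PAGE_KB 4UL	/* assumption: 4 kB pages, as the quoted log implies */

/* Size in kbytes of a given number of pages. */
static unsigned long pages_to_kbytes(unsigned long pages)
{
	return pages * TOY_PAGE_KB;
}

/*
 * Throughput in kB/s from a size in kbytes and an elapsed time in
 * centiseconds, using integer division (the result is truncated).
 */
static unsigned long kbytes_per_second(unsigned long kbytes,
				       unsigned long centisecs)
{
	assert(centisecs > 0);
	return kbytes * 100UL / centisecs;
}
```

For the quoted run, pages_to_kbytes(383900) gives 1535600, and
kbytes_per_second(1535600, 343) gives 447696 kB/s, which prints as
447.69 MB/s when split into kps / 1000 and (kps % 1000) / 10.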

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-05-05 23:05                                                   ` Rafael J. Wysocki
                                                                     ` (7 preceding siblings ...)
  (?)
@ 2009-08-16 13:46                                                   ` Wu Fengguang
  -1 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-08-16 13:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, alan-jenkins, jens.axboe, linux-pm, kernel-testers,
	torvalds, Andrew Morton

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.

The 10-fold slowdown may be related to swap I/O:
shrink_all_memory() tends to reclaim fewer anon pages.

Is this box running on SSD? (Which can be slow on random writes.)

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
@ 2009-08-16 13:46                                                     ` Wu Fengguang
  0 siblings, 0 replies; 580+ messages in thread
From: Wu Fengguang @ 2009-08-16 13:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Andrew Morton, pavel-+ZI9xUNit7I,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA,
	alan-jenkins-cCz0Lq7MMjm9FHfhHBbuYA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA

On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> On Tuesday 05 May 2009, Wu Fengguang wrote:
> > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org>
> > > 
> > > Since the hibernation code is now going to use allocations of memory
> > > to create enough room for the image, it can also use the page frames
> > > allocated at this stage as image page frames.  The low-level
> > > hibernation code needs to be rearranged for this purpose, but it
> > > allows us to avoid freeing a great number of pages and allocating
> > > these same pages once again later, so it generally is worth doing.
> > > 
> > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > >  many pages as needed to get the right image size in one shot (the
> > >  excessive allocated pages are released afterwards).]
> > 
> > Rafael, I tried out your patches and found doubled memory shrink speed!
> >
> > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> 
> Unfortunately, I'm observing a regression and a huge one.
> 
> On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> and that takes ~2 s with the old code and ~15 s with the new one.
> 
> It helps to call shrink_all_memory() once with a sufficiently large argument
> before the preallocation.

The 10-fold slowdown may be related to swap I/O:
shrink_all_memory() tends to reclaim fewer anon pages.

Is this box running on SSD? (Which can be slow on random writes.)

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 580+ messages in thread

* Re: [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  2009-08-16 13:46                                                     ` Wu Fengguang
@ 2009-08-16 22:48                                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 580+ messages in thread
From: Rafael J. Wysocki @ 2009-08-16 22:48 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-pm, Andrew Morton, pavel, torvalds, jens.axboe,
	alan-jenkins, linux-kernel, kernel-testers

On Sunday 16 August 2009, Wu Fengguang wrote:
> On Wed, May 06, 2009 at 07:05:09AM +0800, Rafael J. Wysocki wrote:
> > On Tuesday 05 May 2009, Wu Fengguang wrote:
> > > On Mon, May 04, 2009 at 08:22:38AM +0800, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > Since the hibernation code is now going to use allocations of memory
> > > > to create enough room for the image, it can also use the page frames
> > > > allocated at this stage as image page frames.  The low-level
> > > > hibernation code needs to be rearranged for this purpose, but it
> > > > allows us to avoid freeing a great number of pages and allocating
> > > > these same pages once again later, so it generally is worth doing.
> > > > 
> > > > [rev. 2: Change the strategy of preallocating memory to allocate as
> > > >  many pages as needed to get the right image size in one shot (the
> > > >  excessive allocated pages are released afterwards).]
> > > 
> > > Rafael, I tried out your patches and found doubled memory shrink speed!
> > >
> > > [  579.641781] PM: Preallocating image memory ... done (allocated 383900 pages, 128000 image pages kept)
> > > [  583.087875] PM: Allocated 1535600 kbytes in 3.43 seconds (447.69 MB/s)
> > 
> > Unfortunately, I'm observing a regression and a huge one.
> > 
> > On my Atom-based test box with 1 GB of RAM after a fresh boot and starting X
> > with KDE 4 there are ~256 MB free.  To create an image we need to free ~300 MB
> > and that takes ~2 s with the old code and ~15 s with the new one.
> > 
> > It helps to call shrink_all_memory() once with a sufficiently large argument
> > before the preallocation.
> 
> The 10-fold slowdown may be related to swap I/O:

I guess it is.

> shrink_all_memory() tends to reclaim fewer anon pages.
> 
> Is this box running on SSD? (Which can be slow on random writes.)

No, on a normal spinning-plate HDD (2.5'').

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 580+ messages in thread

Thread overview: 580+ messages
2009-04-16 21:42 2.6.30-rc2-git2: Reported regressions from 2.6.29 Rafael J. Wysocki
2009-04-16 21:42 ` Rafael J. Wysocki
2009-04-16 21:42 ` [Bug #13031] Deadlock/hang in SATA probe Rafael J. Wysocki
2009-04-16 21:42   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13048] /sys/class/backlight/acpi_video0/* is gone on vaio laptop with Intel GM45 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13044] 2.6.30-rc1 can't find the root fs Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13058] First hibernation attempt fails Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  6:30   ` Jens Axboe
2009-04-17  6:30     ` Jens Axboe
2009-04-17  8:28     ` Alan Jenkins
2009-04-17  8:28       ` Alan Jenkins
2009-04-17  9:13       ` Jens Axboe
2009-04-17  9:13         ` Jens Axboe
2009-04-17  9:34         ` Jens Axboe
2009-04-17  9:34           ` Jens Axboe
2009-04-17  9:38           ` Alan Jenkins
2009-04-17  9:45             ` Jens Axboe
2009-04-17  9:45               ` Jens Axboe
2009-04-17 10:46               ` Alan Jenkins
2009-04-17 10:46                 ` Alan Jenkins
2009-04-17 16:00                 ` Linus Torvalds
2009-04-17 16:00                   ` Linus Torvalds
2009-04-17 17:46                   ` Alan Jenkins
2009-04-17 17:46                     ` Alan Jenkins
2009-04-17 20:58                     ` Rafael J. Wysocki
2009-04-17 20:58                       ` Rafael J. Wysocki
2009-04-17 21:12                       ` Linus Torvalds
2009-04-17 21:12                         ` Linus Torvalds
2009-04-18  8:16                         ` Alan Jenkins
2009-04-18  8:16                           ` Alan Jenkins
2009-04-18 12:38                           ` Rafael J. Wysocki
2009-04-18 12:38                             ` Rafael J. Wysocki
2009-04-18 12:57                             ` Alan Jenkins
2009-04-18 12:57                               ` Alan Jenkins
2009-04-18 15:23                               ` [PATCH] PM/Hibernate: Fix memory shrinking (Re: [Bug #13058] First hibernation attempt fails) Rafael J. Wysocki
2009-04-18 15:23                                 ` Rafael J. Wysocki
2009-04-17 15:55         ` [Bug #13058] First hibernation attempt fails Linus Torvalds
2009-04-17 15:55           ` Linus Torvalds
2009-04-07  8:06           ` Pavel Machek
2009-04-07  8:06             ` Pavel Machek
2009-04-20 19:20             ` Andrew Morton
2009-04-20 19:20               ` Andrew Morton
2009-04-20 19:49               ` Rafael J. Wysocki
2009-04-20 19:49                 ` Rafael J. Wysocki
2009-04-20 19:53               ` Pavel Machek
2009-04-20 19:53                 ` Pavel Machek
2009-04-20 20:04                 ` Andrew Morton
2009-04-20 20:04                   ` Andrew Morton
2009-04-20 23:37                   ` Andrew Morton
2009-04-20 23:37                     ` Andrew Morton
2009-04-21 18:53                     ` Rafael J. Wysocki
2009-04-22 13:07                     ` Pavel Machek
2009-04-22 20:11                       ` Rafael J. Wysocki
2009-04-22 20:19                         ` Andrew Morton
2009-04-22 20:19                           ` Andrew Morton
2009-05-01 22:26                           ` [PATCH 0/3] PM: Drop shrink_all_memory (was: Re: [Bug #13058] First hibernation attempt fails) Rafael J. Wysocki
2009-05-01 22:26                           ` Rafael J. Wysocki
2009-05-01 22:26                             ` Rafael J. Wysocki
2009-05-01 22:27                             ` [PATCH 1/3] PM: Disable OOM killer during system-wide power transitions Rafael J. Wysocki
2009-05-01 22:27                             ` Rafael J. Wysocki
2009-05-01 22:27                               ` Rafael J. Wysocki
2009-05-01 23:09                               ` Andrew Morton
2009-05-01 23:09                               ` Andrew Morton
2009-05-01 23:09                                 ` Andrew Morton
2009-05-02 11:34                                 ` Rafael J. Wysocki
2009-05-02 11:34                                 ` Rafael J. Wysocki
2009-05-02 11:34                                   ` Rafael J. Wysocki
2009-05-03  9:47                                   ` Pavel Machek
2009-05-03  9:47                                   ` Pavel Machek
2009-05-03  9:47                                     ` Pavel Machek
2009-05-01 22:28                             ` [PATCH 2/3] PM/Hibernate: Move memory shrinking to snapshot.c Rafael J. Wysocki
2009-05-01 22:28                             ` Rafael J. Wysocki
2009-05-01 22:28                               ` Rafael J. Wysocki
2009-05-01 22:29                             ` [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory Rafael J. Wysocki
2009-05-01 22:29                               ` Rafael J. Wysocki
2009-05-01 23:14                               ` Andrew Morton
2009-05-01 23:14                                 ` Andrew Morton
2009-05-02 11:46                                 ` Rafael J. Wysocki
2009-05-02 11:46                                   ` Rafael J. Wysocki
2009-05-02 17:49                                   ` Andrew Morton
2009-05-02 17:49                                   ` Andrew Morton
2009-05-02 17:49                                     ` Andrew Morton
2009-05-03  0:20                                     ` [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory) Rafael J. Wysocki
2009-05-03  0:20                                       ` Rafael J. Wysocki
2009-05-03  0:22                                       ` [PATCH 1/4] mm: Add __GFP_NO_OOM_KILL flag Rafael J. Wysocki
2009-05-03  0:22                                         ` Rafael J. Wysocki
2009-05-03 11:54                                         ` Wu Fengguang
2009-05-03 11:54                                         ` Wu Fengguang
2009-05-03 11:54                                           ` Wu Fengguang
2009-05-03  0:22                                       ` Rafael J. Wysocki
2009-05-03  0:23                                       ` [PATCH 2/4] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2) Rafael J. Wysocki
2009-05-03  0:23                                       ` Rafael J. Wysocki
2009-05-03  0:23                                         ` Rafael J. Wysocki
2009-05-03  0:24                                       ` [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory " Rafael J. Wysocki
2009-05-03  0:24                                       ` Rafael J. Wysocki
2009-05-03  0:24                                         ` Rafael J. Wysocki
2009-05-03  3:06                                         ` Linus Torvalds
2009-05-03  3:06                                         ` Linus Torvalds
2009-05-03  3:06                                           ` Linus Torvalds
2009-05-03  9:36                                           ` Pavel Machek
2009-05-03  9:36                                           ` Pavel Machek
2009-05-03  9:36                                             ` Pavel Machek
2009-05-03 16:35                                             ` Rafael J. Wysocki
2009-05-03 16:35                                             ` Rafael J. Wysocki
2009-05-03 16:35                                               ` Rafael J. Wysocki
2009-05-04  9:36                                               ` Pavel Machek
2009-05-04  9:36                                               ` Pavel Machek
2009-05-04  9:36                                                 ` Pavel Machek
2009-05-03 16:15                                           ` Rafael J. Wysocki
2009-05-03 16:15                                           ` Rafael J. Wysocki
2009-05-03 16:15                                             ` Rafael J. Wysocki
2009-05-03 11:51                                         ` Wu Fengguang
2009-05-03 11:51                                         ` Wu Fengguang
2009-05-03 11:51                                           ` Wu Fengguang
2009-05-03 16:22                                           ` Rafael J. Wysocki
2009-05-03 16:22                                           ` Rafael J. Wysocki
2009-05-03 16:22                                             ` Rafael J. Wysocki
2009-05-04  9:31                                             ` Pavel Machek
2009-05-04  9:31                                               ` Pavel Machek
2009-05-04 19:52                                               ` Rafael J. Wysocki
2009-05-04 19:52                                               ` Rafael J. Wysocki
2009-05-04 19:52                                                 ` Rafael J. Wysocki
2009-05-04  9:31                                             ` Pavel Machek
2009-05-03  0:25                                       ` [PATCH 4/4] PM/Hibernate: Do not release preallocated memory unnecessarily Rafael J. Wysocki
2009-05-03  0:25                                       ` Rafael J. Wysocki
2009-05-03  0:25                                         ` Rafael J. Wysocki
2009-05-03 13:08                                       ` [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory) Wu Fengguang
2009-05-03 13:08                                       ` Wu Fengguang
2009-05-03 13:08                                         ` Wu Fengguang
2009-05-03 16:30                                         ` Rafael J. Wysocki
2009-05-03 16:30                                         ` Rafael J. Wysocki
2009-05-03 16:30                                           ` Rafael J. Wysocki
2009-05-04  0:08                                           ` [PATCH 0/5] PM: Drop shrink_all_memory (rev. 3) Rafael J. Wysocki
2009-05-04  0:08                                           ` Rafael J. Wysocki
2009-05-04  0:08                                             ` Rafael J. Wysocki
2009-05-04  0:10                                             ` [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag Rafael J. Wysocki
2009-05-04  0:10                                               ` Rafael J. Wysocki
2009-05-04  0:38                                               ` David Rientjes
2009-05-04  0:38                                               ` David Rientjes
2009-05-04 15:02                                                 ` Rafael J. Wysocki
2009-05-04 15:02                                                 ` Rafael J. Wysocki
2009-05-04 15:02                                                   ` Rafael J. Wysocki
2009-05-04 16:44                                                   ` David Rientjes
2009-05-04 16:44                                                   ` David Rientjes
2009-05-04 16:44                                                     ` David Rientjes
2009-05-04 19:51                                                     ` Rafael J. Wysocki
2009-05-04 19:51                                                     ` Rafael J. Wysocki
2009-05-04 19:51                                                       ` Rafael J. Wysocki
2009-05-04 20:02                                                       ` David Rientjes
2009-05-04 20:02                                                       ` David Rientjes
2009-05-04 20:02                                                         ` David Rientjes
2009-05-04 22:23                                                         ` Rafael J. Wysocki
2009-05-04 22:23                                                         ` Rafael J. Wysocki
2009-05-04 22:23                                                           ` Rafael J. Wysocki
2009-05-05  0:37                                                           ` David Rientjes
2009-05-05  0:37                                                             ` David Rientjes
2009-05-05 22:19                                                             ` Rafael J. Wysocki
2009-05-05 22:19                                                             ` Rafael J. Wysocki
2009-05-05 22:19                                                               ` Rafael J. Wysocki
2009-05-05 22:37                                                               ` Andrew Morton
2009-05-05 22:37                                                                 ` Andrew Morton
2009-05-05 23:20                                                                 ` Rafael J. Wysocki
2009-05-05 23:20                                                                 ` Rafael J. Wysocki
2009-05-05 23:20                                                                   ` Rafael J. Wysocki
2009-05-05 23:40                                                                   ` Andrew Morton
2009-05-05 23:40                                                                     ` Andrew Morton
2009-05-07 18:09                                                                     ` Rafael J. Wysocki
2009-05-07 18:09                                                                     ` Rafael J. Wysocki
2009-05-07 18:09                                                                       ` Rafael J. Wysocki
2009-05-07 18:48                                                                       ` Andrew Morton
2009-05-07 18:48                                                                       ` Andrew Morton
2009-05-07 18:48                                                                         ` Andrew Morton
2009-05-07 19:33                                                                         ` Rafael J. Wysocki
2009-05-07 19:33                                                                           ` Rafael J. Wysocki
2009-05-07 20:02                                                                           ` Andrew Morton
2009-05-07 20:02                                                                           ` Andrew Morton
2009-05-07 20:02                                                                             ` Andrew Morton
2009-05-07 20:18                                                                             ` Rafael J. Wysocki
2009-05-07 20:18                                                                               ` Rafael J. Wysocki
2009-05-07 20:25                                                                               ` David Rientjes
2009-05-07 20:25                                                                               ` David Rientjes
2009-05-07 20:25                                                                                 ` David Rientjes
2009-05-07 20:35                                                                                 ` Pavel Machek
2009-05-07 20:35                                                                                   ` Pavel Machek
2009-05-07 20:40                                                                                   ` David Rientjes
2009-05-07 20:40                                                                                     ` David Rientjes
2009-05-07 20:40                                                                                   ` David Rientjes
2009-05-07 20:35                                                                                 ` Pavel Machek
2009-05-07 20:38                                                                                 ` Rafael J. Wysocki
2009-05-07 20:38                                                                                   ` Rafael J. Wysocki
2009-05-07 20:42                                                                                   ` David Rientjes
2009-05-07 20:42                                                                                     ` David Rientjes
2009-05-07 20:42                                                                                   ` David Rientjes
2009-05-07 20:56                                                                                   ` Andrew Morton
2009-05-07 20:56                                                                                   ` Andrew Morton
2009-05-07 20:56                                                                                     ` Andrew Morton
2009-05-07 21:25                                                                                     ` David Rientjes
2009-05-07 21:25                                                                                       ` David Rientjes
2009-05-07 21:36                                                                                       ` Rafael J. Wysocki
2009-05-07 21:36                                                                                         ` Rafael J. Wysocki
2009-05-07 21:46                                                                                         ` David Rientjes
2009-05-07 21:46                                                                                           ` David Rientjes
2009-05-07 22:05                                                                                           ` Rafael J. Wysocki
2009-05-07 22:05                                                                                           ` Rafael J. Wysocki
2009-05-07 22:05                                                                                             ` Rafael J. Wysocki
2009-05-07 21:46                                                                                         ` David Rientjes
2009-05-07 21:36                                                                                       ` Rafael J. Wysocki
2009-05-07 21:50                                                                                       ` Andrew Morton
2009-05-07 21:50                                                                                       ` Andrew Morton
2009-05-07 21:50                                                                                         ` Andrew Morton
2009-05-07 22:14                                                                                         ` Rafael J. Wysocki
2009-05-07 22:14                                                                                         ` Rafael J. Wysocki
2009-05-07 22:14                                                                                           ` Rafael J. Wysocki
2009-05-07 22:38                                                                                           ` Andrew Morton
2009-05-07 22:38                                                                                           ` Andrew Morton
2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
2009-05-07 23:15                                                                                               ` Andrew Morton
2009-05-07 23:15                                                                                               ` Andrew Morton
2009-05-07 23:15                                                                                                 ` Andrew Morton
2009-05-07 23:24                                                                                                 ` Rafael J. Wysocki
2009-05-07 23:24                                                                                                   ` Rafael J. Wysocki
2009-05-07 23:24                                                                                                 ` Rafael J. Wysocki
2009-05-07 22:50                                                                                             ` Rafael J. Wysocki
2009-05-07 22:16                                                                                         ` David Rientjes
2009-05-07 22:16                                                                                         ` David Rientjes
2009-05-07 22:16                                                                                           ` David Rientjes
2009-05-07 22:45                                                                                           ` Andrew Morton
2009-05-07 22:45                                                                                             ` Andrew Morton
2009-05-07 22:59                                                                                             ` David Rientjes
2009-05-07 22:59                                                                                             ` David Rientjes
2009-05-07 22:59                                                                                               ` David Rientjes
2009-05-07 23:11                                                                                               ` Rafael J. Wysocki
2009-05-07 23:11                                                                                                 ` Rafael J. Wysocki
2009-05-08  1:16                                                                                                 ` KAMEZAWA Hiroyuki
2009-05-08  1:16                                                                                                 ` KAMEZAWA Hiroyuki
2009-05-08  1:16                                                                                                   ` KAMEZAWA Hiroyuki
2009-05-08 13:42                                                                                                   ` Rafael J. Wysocki
2009-05-08 13:42                                                                                                     ` Rafael J. Wysocki
2009-05-08 13:42                                                                                                   ` Rafael J. Wysocki
2009-05-08  9:50                                                                                                 ` Wu Fengguang
2009-05-08  9:50                                                                                                 ` Wu Fengguang
2009-05-08  9:50                                                                                                   ` Wu Fengguang
2009-05-08 13:51                                                                                                   ` Rafael J. Wysocki
2009-05-08 13:51                                                                                                   ` Rafael J. Wysocki
2009-05-08 13:51                                                                                                     ` Rafael J. Wysocki
2009-05-09  0:08                                                                                                     ` Rafael J. Wysocki
2009-05-09  0:08                                                                                                       ` Rafael J. Wysocki
2009-05-09  7:34                                                                                                       ` Wu Fengguang
2009-05-09  7:34                                                                                                       ` Wu Fengguang
2009-05-09  7:34                                                                                                         ` Wu Fengguang
2009-05-09 19:22                                                                                                         ` Rafael J. Wysocki
2009-05-09 19:22                                                                                                         ` Rafael J. Wysocki
2009-05-09 19:22                                                                                                           ` Rafael J. Wysocki
2009-05-10  4:52                                                                                                           ` Wu Fengguang
2009-05-10  4:52                                                                                                           ` Wu Fengguang
2009-05-10  4:52                                                                                                             ` Wu Fengguang
2009-05-10 12:52                                                                                                             ` Rafael J. Wysocki
2009-05-10 12:52                                                                                                               ` Rafael J. Wysocki
2009-05-10 12:52                                                                                                             ` Rafael J. Wysocki
2009-05-07 23:11                                                                                               ` Rafael J. Wysocki
2009-05-07 22:45                                                                                           ` Andrew Morton
2009-05-07 21:25                                                                                     ` David Rientjes
2009-05-07 20:38                                                                                 ` Rafael J. Wysocki
2009-05-08 23:55                                                                                 ` Rafael J. Wysocki
2009-05-08 23:55                                                                                 ` Rafael J. Wysocki
2009-05-08 23:55                                                                                   ` Rafael J. Wysocki
2009-05-09 21:22                                                                                   ` David Rientjes
2009-05-09 21:22                                                                                     ` David Rientjes
2009-05-09 21:37                                                                                     ` Rafael J. Wysocki
2009-05-09 21:37                                                                                     ` Rafael J. Wysocki
2009-05-09 21:37                                                                                       ` Rafael J. Wysocki
2009-05-09 22:39                                                                                       ` David Rientjes
2009-05-09 22:39                                                                                         ` David Rientjes
2009-05-09 23:03                                                                                         ` Rafael J. Wysocki
2009-05-09 23:03                                                                                           ` Rafael J. Wysocki
2009-05-11 20:11                                                                                           ` David Rientjes
2009-05-11 20:11                                                                                           ` David Rientjes
2009-05-11 20:11                                                                                             ` David Rientjes
2009-05-11 22:44                                                                                             ` Rafael J. Wysocki
2009-05-11 22:44                                                                                               ` Rafael J. Wysocki
2009-05-11 23:07                                                                                               ` Andrew Morton
2009-05-11 23:07                                                                                               ` Andrew Morton
2009-05-11 23:07                                                                                                 ` Andrew Morton
2009-05-11 23:28                                                                                                 ` Rafael J. Wysocki
2009-05-11 23:28                                                                                                 ` Rafael J. Wysocki
2009-05-11 23:28                                                                                                   ` Rafael J. Wysocki
2009-05-12  0:11                                                                                                   ` Andrew Morton
2009-05-12  0:11                                                                                                     ` Andrew Morton
2009-05-12 16:52                                                                                                     ` Rafael J. Wysocki
2009-05-12 16:52                                                                                                       ` Rafael J. Wysocki
2009-05-12 17:50                                                                                                       ` Andrew Morton
2009-05-12 17:50                                                                                                         ` Andrew Morton
2009-05-12 20:40                                                                                                         ` Rafael J. Wysocki
2009-05-12 20:40                                                                                                         ` Rafael J. Wysocki
2009-05-12 20:40                                                                                                           ` Rafael J. Wysocki
2009-05-12 16:52                                                                                                     ` Rafael J. Wysocki
2009-05-12  0:11                                                                                                   ` Andrew Morton
2009-05-11 22:44                                                                                             ` Rafael J. Wysocki
2009-05-09 23:03                                                                                         ` Rafael J. Wysocki
2009-05-09 22:39                                                                                       ` David Rientjes
2009-05-09 21:22                                                                                   ` David Rientjes
2009-05-07 20:18                                                                             ` Rafael J. Wysocki
2009-05-07 19:33                                                                         ` Rafael J. Wysocki
2009-05-07 18:50                                                                       ` David Rientjes
2009-05-07 18:50                                                                         ` David Rientjes
2009-05-07 18:50                                                                       ` David Rientjes
2009-05-05 23:40                                                                   ` Andrew Morton
2009-05-05 22:37                                                               ` Andrew Morton
2009-05-05  0:37                                                           ` David Rientjes
2009-05-04 19:01                                                   ` Andrew Morton
2009-05-04 19:01                                                     ` Andrew Morton
2009-05-04 19:01                                                   ` Andrew Morton
2009-05-04  0:10                                             ` Rafael J. Wysocki
2009-05-04  0:11                                             ` [PATCH 2/5] PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2) Rafael J. Wysocki
2009-05-04  0:11                                             ` Rafael J. Wysocki
2009-05-04  0:11                                               ` Rafael J. Wysocki
2009-05-04 13:35                                               ` Pavel Machek
2009-05-04 13:35                                                 ` Pavel Machek
2009-05-04 13:35                                               ` Pavel Machek
2009-05-04  0:12                                             ` [PATCH 3/5] PM/Suspend: Do not shrink memory before suspend Rafael J. Wysocki
2009-05-04  0:12                                               ` Rafael J. Wysocki
2009-05-04  0:12                                             ` Rafael J. Wysocki
2009-05-04  0:20                                             ` [PATCH 4/5] PM/Hibernate: Use memory allocations to free memory (rev. 3) Rafael J. Wysocki
2009-05-04  0:20                                             ` Rafael J. Wysocki
2009-05-04  0:20                                               ` Rafael J. Wysocki
2009-05-04  0:22                                             ` [PATCH 5/5] PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2) Rafael J. Wysocki
2009-05-04  0:22                                               ` Rafael J. Wysocki
2009-05-05  2:24                                               ` Wu Fengguang
2009-05-05  2:24                                               ` Wu Fengguang
2009-05-05  2:24                                                 ` Wu Fengguang
2009-05-05  2:46                                                 ` Wu Fengguang
2009-05-05  2:46                                                 ` Wu Fengguang
2009-05-05  2:46                                                   ` Wu Fengguang
2009-05-05 23:07                                                   ` Rafael J. Wysocki
2009-05-05 23:07                                                     ` Rafael J. Wysocki
2009-05-05 23:40                                                     ` Wu Fengguang
2009-05-05 23:40                                                     ` Wu Fengguang
2009-05-05 23:40                                                       ` Wu Fengguang
2009-05-05 23:07                                                   ` Rafael J. Wysocki
2009-05-05 23:05                                                 ` Rafael J. Wysocki
2009-05-05 23:05                                                 ` Rafael J. Wysocki
2009-05-05 23:05                                                   ` Rafael J. Wysocki
2009-05-06 13:30                                                   ` Wu Fengguang
2009-05-06 13:30                                                   ` Wu Fengguang
2009-05-06 13:30                                                     ` Wu Fengguang
2009-05-06 13:52                                                   ` Wu Fengguang
2009-05-06 13:52                                                   ` Wu Fengguang
2009-05-06 13:52                                                     ` Wu Fengguang
2009-05-06 13:56                                                   ` Wu Fengguang
2009-05-06 13:56                                                   ` Wu Fengguang
2009-05-06 13:56                                                     ` Wu Fengguang
2009-05-06 20:54                                                     ` Rafael J. Wysocki
2009-05-06 20:54                                                     ` Rafael J. Wysocki
2009-05-06 20:54                                                       ` Rafael J. Wysocki
2009-05-07  1:58                                                       ` Wu Fengguang
2009-05-07  1:58                                                       ` Wu Fengguang
2009-05-07  1:58                                                         ` Wu Fengguang
2009-05-07 12:20                                                         ` Rafael J. Wysocki
2009-05-07 12:20                                                         ` Rafael J. Wysocki
2009-05-07 12:20                                                           ` Rafael J. Wysocki
2009-05-07 12:34                                                           ` Wu Fengguang
2009-05-07 12:34                                                             ` Wu Fengguang
2009-05-07 12:34                                                           ` Wu Fengguang
2009-08-16 13:46                                                   ` Wu Fengguang
2009-08-16 13:46                                                     ` Wu Fengguang
2009-08-16 22:48                                                     ` Rafael J. Wysocki
2009-08-16 22:48                                                       ` Rafael J. Wysocki
2009-08-16 22:48                                                     ` Rafael J. Wysocki
2009-08-16 13:46                                                   ` Wu Fengguang
2009-05-04  0:22                                             ` Rafael J. Wysocki
2009-05-04  9:33                                           ` [PATCH 0/4] PM: Drop shrink_all_memory (rev. 2) (was: Re: [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory) Pavel Machek
2009-05-04  9:33                                           ` Pavel Machek
2009-05-04  9:33                                             ` Pavel Machek
2009-05-04 19:53                                             ` Rafael J. Wysocki
2009-05-04 20:27                                               ` Pavel Machek
2009-05-04 20:27                                               ` Pavel Machek
2009-05-04 20:27                                                 ` Pavel Machek
2009-05-04 19:53                                             ` Rafael J. Wysocki
2009-05-03  0:20                                     ` Rafael J. Wysocki
2009-05-02 11:46                                 ` [PATCH 3/3] PM/Hibernate: Use memory allocations to free memory Rafael J. Wysocki
2009-05-01 23:14                               ` Andrew Morton
2009-05-01 22:29                             ` Rafael J. Wysocki
2009-04-17 20:34           ` [Bug #13058] First hibernation attempt fails Rafael J. Wysocki
2009-04-17 20:34             ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13067] iwl3945: wlan0: beacon loss from AP - sending probe request Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  3:38   ` Justin Madru
2009-04-17  3:38     ` Justin Madru
2009-04-17 21:09     ` Rafael J. Wysocki
2009-04-17 21:09       ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13069] regression in 2.6.29-git3 on SH/Dreamcast Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-24 17:37   ` Adrian McMenamin
2009-04-24 17:37     ` Adrian McMenamin
2009-05-17  8:12     ` Pekka Enberg
2009-05-17  8:12       ` Pekka Enberg
2009-05-17 10:28       ` Rafael J. Wysocki
2009-05-17 10:28         ` Rafael J. Wysocki
2009-05-17 10:38       ` Adrian McMenamin
2009-04-16 21:45 ` [Bug #13068] Lockdep warining in inotify_dev_queue_event Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-19  9:36   ` Sachin Sant
2009-04-19  9:36     ` Sachin Sant
2009-04-19 10:56     ` Rafael J. Wysocki
2009-04-19 10:56       ` Rafael J. Wysocki
2009-04-22  9:50   ` [Bug #13068] Lockdep warning " Sachin Sant
2009-04-22  9:50     ` Sachin Sant
2009-04-16 21:45 ` [Bug #13066] Intel HD Audio oops Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17 16:57   ` Takashi Iwai
2009-04-17 21:07     ` Rafael J. Wysocki
2009-04-17 21:07       ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13096] 2.6.30-rc2 hangs in get_measured_perf on tigerton Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13087] boot hang due to commit ff69f2bba67bd45514923aaedbf40fe351787c59 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13095] thinkpad-acpi: cannot control brightness with hotkeys Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13097] Kernel will freeze network after using a tun/tap device Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  0:44   ` David Miller
2009-04-17  0:44     ` David Miller
2009-04-17  0:54     ` Herbert Xu
2009-04-17  0:54       ` Herbert Xu
2009-04-16 21:45 ` [Bug #13099] net, sky2: BUG: unable to handle kernel NULL pointer dereference, pci_vpd_truncate() Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  0:45   ` Ingo Molnar
2009-04-17  0:45     ` Ingo Molnar
2009-04-17 21:14     ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13101] BUG: scheduling while atomic: swapper/0/0x10000100 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13098] 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  5:24   ` Andi Kleen
2009-04-17  5:24     ` Andi Kleen
2009-04-16 21:45 ` [Bug #13106] 2.6.30-rc1: intel 3945 no wireless Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  0:53   ` Larry Finger
2009-04-17  0:53     ` Larry Finger
2009-04-17  3:21     ` Justin Madru
2009-04-17  3:21       ` Justin Madru
2009-04-17 21:16       ` Rafael J. Wysocki
2009-04-17 21:16         ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13109] High latency on /sys/class/thermal Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13107] LTP 20080131 causes defunct processes w/2.6.30-rc1 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17 16:55   ` Sukadev Bhattiprolu
2009-04-17 16:55     ` Sukadev Bhattiprolu
2009-04-16 21:45 ` [Bug #13108] 2.6.30-rc1: white screen during boot (regression) on spitz Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-25 11:54   ` Pavel Machek
2009-04-26 12:18     ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13111] Linux 2.6.30-rc1 tg3 endian issues with MAC addresses on BCM5701 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  0:43   ` David Miller
2009-04-17  0:43     ` David Miller
2009-04-17  0:58     ` Matt Carlson
2009-04-17  0:58       ` Matt Carlson
2009-04-17 12:21       ` Robin Holt
2009-04-17 12:21         ` Robin Holt
2009-04-19  4:30         ` David Miller
2009-04-19  4:30           ` David Miller
2009-04-20  4:31           ` Michael Chan
2009-04-20  4:31             ` Michael Chan
2009-04-16 21:45 ` [Bug #13113] tiobench read 50% regression with 2.6.30-rc1 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  6:29   ` Jens Axboe
2009-04-17  6:29     ` Jens Axboe
2009-04-17 21:22     ` Rafael J. Wysocki
2009-04-17 21:22       ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13110] 2.6.30-rc1 problems with firmware loading Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13112] Oops in drain_array Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13116] Can't boot with nosmp Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13118] iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49 Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13119] Trouble with make-install from a NFS mount Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13115] microcode driver newly spews warnings Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13114] USB storage (usbstick) automount woes Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  4:01   ` Mike Galbraith
2009-04-17  4:01     ` Mike Galbraith
2009-04-16 21:45 ` [Bug #13124] ioatdma: DMA-API: device driver frees DMA memory with wrong function Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13123] 20 ACPI interrupts per second on EEEPC 4G Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13122] reiserfs_delete_xattrs: Couldn't delete all xattrs (-13) Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13120] BUG: using rootfstype=ext4 causes oops Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-19 18:31   ` Andrew Price
2009-04-19 18:31     ` Andrew Price
2009-04-16 21:45 ` [Bug #13121] commit 1a7c618a3f7bef1a20ae740df512eeba21397fa5 breaks ACPI video Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13126] BUG: MAX_LOCKDEP_ENTRIES too low! when mounting rootfs Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-16 21:45 ` [Bug #13125] active uvcvideo breaks over suspend Rafael J. Wysocki
2009-04-16 21:45   ` Rafael J. Wysocki
2009-04-17  0:40 ` 2.6.30-rc2-git2: Reported regressions from 2.6.29 Linus Torvalds
2009-04-17  0:40   ` Linus Torvalds
2009-04-17  1:25   ` Ingo Molnar
2009-04-17  1:25   ` Ingo Molnar
2009-04-17 21:25     ` Rafael J. Wysocki
     [not found]     ` <20090417012544.GB16126-X9Un+BFzKDI@public.gmane.org>
2009-04-17 21:25       ` Rafael J. Wysocki
2009-04-17 21:25         ` Rafael J. Wysocki
2009-04-17  0:41 ` David Miller
2009-04-17 21:27   ` Rafael J. Wysocki
2009-04-17 21:27   ` Rafael J. Wysocki
2009-04-17  0:41 ` David Miller
2009-04-17  0:46 ` Linus Torvalds
2009-04-17  0:46   ` Linus Torvalds
2009-04-17 21:31   ` Rafael J. Wysocki
2009-04-17 21:31   ` Rafael J. Wysocki
2009-04-17  0:46 ` Linus Torvalds
2009-04-17  1:28 ` Jeff Chua
2009-04-17  1:28 ` Jeff Chua
2009-04-17  1:28   ` Jeff Chua
2009-04-17  1:30 ` Zhang Rui
2009-04-17  2:34   ` yakui_zhao
2009-04-17  2:34     ` yakui_zhao
2009-04-17  2:34   ` yakui_zhao
2009-04-17 21:35   ` Rafael J. Wysocki
2009-04-17 21:35   ` Rafael J. Wysocki
2009-04-17 21:35     ` Rafael J. Wysocki
2009-04-17  1:30 ` Zhang Rui
2009-04-17  1:37 ` Ming Lei
2009-04-17  1:37 ` Ming Lei
2009-04-17  1:37   ` Ming Lei
2009-04-17 21:36   ` Rafael J. Wysocki
2009-04-17 21:36   ` Rafael J. Wysocki
2009-04-17 23:56     ` Laurent Pinchart
2009-04-18 12:29       ` Rafael J. Wysocki
     [not found]       ` <200904180156.24366.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
2009-04-18 12:29         ` Rafael J. Wysocki
2009-04-18 12:29           ` Rafael J. Wysocki
2009-04-17 23:56     ` Laurent Pinchart
2009-04-18  2:32     ` leiming
2009-04-18  2:32     ` leiming
2009-04-18  2:32       ` leiming
2009-04-18  2:32       ` leiming
2009-04-18  2:55       ` Linus Torvalds
2009-04-18  2:55       ` Linus Torvalds
2009-04-18  2:55         ` Linus Torvalds
2009-04-18  3:50         ` leiming
2009-04-18  3:50         ` leiming
2009-04-18  4:51         ` leiming
2009-04-18  4:51         ` leiming
2009-04-18  4:51           ` leiming
2009-04-18 12:33           ` Rafael J. Wysocki
2009-04-18 12:33           ` Rafael J. Wysocki
2009-04-20 20:08           ` Laurent Pinchart
2009-04-20 20:08           ` Laurent Pinchart
2009-04-21  1:47             ` Ming Lei
     [not found]             ` <200904202208.23899.laurent.pinchart-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
2009-04-21  1:47               ` Ming Lei
2009-04-21  1:47                 ` Ming Lei
2009-04-21 23:21                 ` Laurent Pinchart
2009-05-09  3:28                   ` Ming Lei
2009-05-09  3:28                   ` Ming Lei
2009-05-09 16:24                     ` Linus Torvalds
2009-05-09 16:24                       ` Linus Torvalds
2009-05-09 21:37                       ` Mauro Carvalho Chehab
2009-05-09 21:37                       ` Mauro Carvalho Chehab
2009-04-21 23:21                 ` Laurent Pinchart
2009-04-17 17:09 ` Thomas Meyer
2009-04-17 21:38   ` Rafael J. Wysocki
2009-04-17 21:38     ` Rafael J. Wysocki
2009-04-24 13:44 ` Kalle Valo
2009-04-24 13:44 ` Kalle Valo
     [not found]   ` <87ljpqqi89.fsf-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org>
2009-04-25 21:57     ` Rafael J. Wysocki
2009-04-25 21:57       ` Rafael J. Wysocki
2009-04-26  7:06       ` Kalle Valo
2009-04-25 21:57   ` Rafael J. Wysocki
     [not found] ` <200904170752.48078.edt@aei.ca>
     [not found]   ` <200904171648.38172.rjw@sisk.pl>
2009-04-26 13:35     ` Ed Tomlinson
