* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-17 10:45 Voluspa
2004-12-18 23:02 ` Con Kolivas
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-17 10:45 UTC (permalink / raw)
To: akpm; +Cc: nickpiggin, mr, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2532 bytes --]
Sorry about the delay.
On 2004-12-17 0:41:30 Andrew Morton wrote:
> Can you identify the kernel release which caused the problem to start?
My next mail did just that. Sort of. Somewhere between, and including,
2.6.9-rc1 and 2.6.9-rc1-bk4. The latter being the first functional kernel I can
test due to oopses and loss of keyboard in X starting from -rc1.
>> with a default of nice 19 and sucks up every free CPU cycle.
>
> What sucks up all the CPU? The application? kswapd?
The folding client uses all unused CPU, as it should. What kswapd does is
beyond my knowledge.
> How much RAM, how much swap?
256 megabyte ram, about 1 gigabyte swap. You'll find more info in the next
section.
On 2004-12-16 8:14:44 Nick Piggin wrote:
> So please, do the sysrq+m traces with a 2.6.10-rc3 kernel. Thanks.
Ok, done. I can do the same with the last unaffected kernel, 2.6.8.1-bk2, upon
request (didn't want to spam unless told to). Log from dmesg attached. I don't
want to "inline" it since my ISP has changed the webmail program to some POS
java where I have no control over the linebreaks.
Testing explanation: Cold boot. Started the folding client and waited 15
minutes for it to write the first checkpoint (wanted full stability). Started
X. Started Blender. Loaded a scene where I only use the "Sequence Editor"
mode.
In this mode there's a 'preview' window where you can press Alt-a, for animate,
and watch your work in almost real time. Overhead prevents true real time.
Here I let the animation loop until the testing is over.
What happens during animation is that my 500 pictures of 1.2 meg each (ie 20
seconds) are read from /dev/hdb (a slightly better and more modern disk), fill
up memory, and then start using the swap partition on /dev/hda. The read from
/dev/hdb seems to be done only once since neither memory nor swap is released
until I close the scene.
The machine CPU usage, as monitored by Gkrellm, is highest during the initial
phase of swapping, about 50 percent (not counting the niced folding client
usage) and then falls to about 15 percent when all swapping is done. How
high it reaches during the screen freezes I don't know.
The sysrq+m snapshots were taken thusly: 1) Some seconds after the beginning of
swap usage. 2) When the first screen freeze began. 3) In another screen
freeze. 4) In the last minute of swapping, also during a screen freeze.
Total wall clock was about 3 minutes from beginning of animation to when all
swapping had been done and the animation was "stable".
Mvh
Mats Johannesson
[-- Attachment #2: dmesg-2.6.10-rc3.txt --]
[-- Type: text/plain, Size: 12135 bytes --]
Linux version 2.6.10-rc3-sysrq (root@loke) (gcc version 3.4.3) #1 Thu Dec 16 10:38:55 CET 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
BIOS-e820: 000000000fff0000 - 000000000fff3000 (ACPI NVS)
BIOS-e820: 000000000fff3000 - 0000000010000000 (ACPI data)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
255MB LOWMEM available.
On node 0 totalpages: 65520
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 61424 pages, LIFO batch:14
HighMem zone: 0 pages, LIFO batch:1
DMI 2.2 present.
Built 1 zonelists
Kernel command line: root=/dev/hda2 pci=usepirqmask elevator=cfq apic=verbose lapic=lapic
Local APIC disabled by BIOS -- reenabling.
Found and enabled local APIC!
mapped APIC to ffffd000 (fee00000)
Initializing CPU#0
CPU 0 irqstacks, hard=c0348000 soft=c0347000
PID hash table entries: 1024 (order: 10, 16384 bytes)
Detected 1075.368 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 256548k/262080k available (1564k kernel code, 4968k reserved, 405k data, 336k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 2121.72 BogoMIPS (lpj=1060864)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 128K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: Intel Celeron (Coppermine) stepping 06
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
enabled ExtINT on CPU#0
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1074.0766 MHz.
..... host bus clock speed is 134.0345 MHz.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb550, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
Linux Plug and Play Support v0.97 (c) Adam Belay
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00fbf10
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0xbf40, dseg 0xf0000
PnPBIOS: 16 nodes reported by PnP BIOS; 16 recorded by driver
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Transparent bridge - 0000:00:1e.0
PCI: Using IRQ router PIIX/ICH [8086/2440] at 0000:00:1f.0
PCI: Found IRQ 12 for device 0000:00:1f.3
PCI: Sharing IRQ 12 with 0000:02:01.0
pnp: 00:0c: ioport range 0x3f0-0x3f1 has been reserved
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
ub: sizeof ub_scsi_cmd 64 ub_dev 2472
usbcore: registered new driver ub
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH2: IDE controller at PCI slot 0000:00:1f.1
ICH2: chipset revision 1
ICH2: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: IBM-DTLA-307030, ATA DISK drive
hdb: IC35L080AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: Hewlett-Packard CD-Writer Plus 9100, ATAPI CD/DVD-ROM drive
hdd: DVD-ROM DDU220E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 59772900 sectors (30603 MB) w/1916KiB Cache, CHS=59298/16/63, UDMA(100)
hda: cache flushes not supported
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >
hdb: max request size: 128KiB
hdb: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=65535/16/63, UDMA(100)
hdb: cache flushes supported
hdb: hdb1 hdb2 hdb3 hdb4
hdc: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdd: ATAPI DVD-ROM drive, 512kB Cache, DMA
USB Universal Host Controller Interface driver v2.2
PCI: Found IRQ 11 for device 0000:00:1f.2
PCI: Sharing IRQ 11 with 0000:02:03.0
uhci_hcd 0000:00:1f.2: Intel Corp. 82801BA/BAM USB (Hub #1)
PCI: Setting latency timer of device 0000:00:1f.2 to 64
uhci_hcd 0000:00:1f.2: irq 11, io base 0xd000
uhci_hcd 0000:00:1f.2: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
PCI: setting IRQ 5 as level-triggered
PCI: Assigned IRQ 5 for device 0000:00:1f.4
uhci_hcd 0000:00:1f.4: Intel Corp. 82801BA/BAM USB (Hub #2)
PCI: Setting latency timer of device 0000:00:1f.4 to 64
uhci_hcd 0000:00:1f.4: irq 5, io base 0xd400
uhci_hcd 0000:00:1f.4: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
usb 1-2: new low speed USB device using uhci_hcd and address 2
input: USB HID v1.10 Mouse [Logitech USB Trackball] on usb-0000:00:1f.2-2
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: PC Speaker
i2c /dev entries driver
u32 classifier
OLD policer on
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 336k freed
Adding 979924k swap on /dev/hda7. Priority:1 extents:1
PCI: Found IRQ 12 for device 0000:02:01.0
PCI: Sharing IRQ 12 with 0000:00:1f.3
nvidia: module license 'NVIDIA' taints kernel.
NVRM: loading NVIDIA Linux x86 NVIDIA Kernel Module 1.0-6629 Wed Nov 3 13:12:51 PST 2004
8139too Fast Ethernet driver 0.9.27
PCI: Found IRQ 11 for device 0000:02:03.0
PCI: Sharing IRQ 11 with 0000:00:1f.2
eth0: RealTek RTL8139 at 0xd097e000, 00:48:54:66:c8:84, IRQ 11
eth0: Identified 8139 chip type 'RTL-8139B'
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 28, high 84, batch 14
cpu 0 cold: low 0, high 28, batch 14
HighMem per-cpu: empty
Free pages: 2528kB (0kB HighMem)
Active:40862 inactive:16601 dirty:0 writeback:0 unstable:0 free:632 slab:1392 mapped:49585 pagetables:206
DMA free:152kB min:124kB low:152kB high:184kB active:8636kB inactive:4992kB present:16384kB pages_scanned:36 all_unreclaimable? no
protections[]: 0 0 0
Normal free:2376kB min:1916kB low:2392kB high:2872kB active:154812kB inactive:61412kB present:245696kB pages_scanned:99 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 2*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 152kB
Normal: 4*4kB 1*8kB 1*16kB 1*32kB 10*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2376kB
HighMem: empty
Swap cache: add 12914, delete 12031, find 8/36, race 0+0
Free swap: 929164kB
65520 pages of RAM
0 pages of HIGHMEM
4391 reserved pages
7013 pages shared
883 pages swap cached
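
Reading aid for the zone lines in the snapshot above: each zone reports its
free memory next to the min/low/high watermarks. The sketch below is only an
illustration in Python, not anything from the thread. The helper name
zone_state is hypothetical, and it assumes the usual watermark rule that
kswapd is woken once free memory drops below "low" and keeps reclaiming until
free memory is back above "high".

```python
import re

# Zone lines from the first SysRq-M snapshot, trimmed to the watermark fields.
SNAPSHOT_1 = [
    "DMA free:152kB min:124kB low:152kB high:184kB",
    "Normal free:2376kB min:1916kB low:2392kB high:2872kB",
]

ZONE_RE = re.compile(
    r"(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def zone_state(line):
    """Return (zone, state): a coarse reading of free memory vs. watermarks."""
    m = ZONE_RE.match(line)
    if m is None:
        raise ValueError("not a zone line: %r" % line)
    free = int(m.group("free"))
    wm_min = int(m.group("min"))
    wm_low = int(m.group("low"))
    wm_high = int(m.group("high"))
    if free < wm_min:
        state = "below min"
    elif free < wm_low:
        state = "below low"  # assumed: kswapd would be woken here
    elif free < wm_high:
        state = "between low and high"
    else:
        state = "above high"
    return m.group("zone"), state


for line in SNAPSHOT_1:
    print("%s: %s" % zone_state(line))
```

Under that assumption, the Normal zone in the first snapshot sits just below
its low watermark, while the DMA zone sits exactly on it.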
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 28, high 84, batch 14
cpu 0 cold: low 0, high 28, batch 14
HighMem per-cpu: empty
Free pages: 2928kB (0kB HighMem)
Active:38048 inactive:19285 dirty:0 writeback:1403 unstable:0 free:732 slab:1373 mapped:53240 pagetables:255
DMA free:216kB min:124kB low:152kB high:184kB active:5868kB inactive:7620kB present:16384kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
Normal free:2712kB min:1916kB low:2392kB high:2872kB active:146324kB inactive:69520kB present:245696kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 24*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 216kB
Normal: 0*4kB 9*8kB 15*16kB 15*32kB 2*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2712kB
HighMem: empty
Swap cache: add 59988, delete 57469, find 384/488, race 0+0
Free swap: 743204kB
65520 pages of RAM
0 pages of HIGHMEM
4391 reserved pages
5303 pages shared
2519 pages swap cached
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 28, high 84, batch 14
cpu 0 cold: low 0, high 28, batch 14
HighMem per-cpu: empty
Free pages: 2760kB (0kB HighMem)
Active:53024 inactive:4421 dirty:0 writeback:0 unstable:0 free:690 slab:1331 mapped:56992 pagetables:255
DMA free:272kB min:124kB low:152kB high:184kB active:11704kB inactive:1736kB present:16384kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
Normal free:2488kB min:1916kB low:2392kB high:2872kB active:200392kB inactive:15948kB present:245696kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 28*4kB 2*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 272kB
Normal: 0*4kB 21*8kB 1*16kB 0*32kB 0*64kB 2*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2488kB
HighMem: empty
Swap cache: add 102121, delete 82743, find 19904/23544, race 0+0
Free swap: 684360kB
65520 pages of RAM
0 pages of HIGHMEM
4391 reserved pages
4895 pages shared
19378 pages swap cached
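
The per-order free lists in these snapshots ("28*4kB 2*8kB ... = 272kB") can
be checked by hand: each term is a block count times a block size, and the
terms should add up to the total after the equals sign. A small Python sketch
of that arithmetic, using the two lines from the snapshot above (the function
names are made up for this illustration):

```python
import re

# Free-list lines copied from the third SysRq-M snapshot.
DMA_LINE = ("DMA: 28*4kB 2*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB "
            "0*512kB 0*1024kB 0*2048kB 0*4096kB = 272kB")
NORMAL_LINE = ("Normal: 0*4kB 21*8kB 1*16kB 0*32kB 0*64kB 2*128kB 2*256kB "
               "1*512kB 1*1024kB 0*2048kB 0*4096kB = 2488kB")


def buddy_total_kb(line):
    """Sum the 'count*sizekB' terms of a free-list line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)


def reported_total_kb(line):
    """Return the total printed after the '=' sign."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))


for line in (DMA_LINE, NORMAL_LINE):
    zone = line.split(":", 1)[0]
    print(zone, buddy_total_kb(line), reported_total_kb(line))
```

Both sums match the reported totals. The Normal zone has no free 4kB blocks
in this snapshot.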
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 28, high 84, batch 14
cpu 0 cold: low 0, high 28, batch 14
HighMem per-cpu: empty
Free pages: 2768kB (0kB HighMem)
Active:37234 inactive:20167 dirty:0 writeback:0 unstable:0 free:692 slab:1340 mapped:55097 pagetables:255
DMA free:280kB min:124kB low:152kB high:184kB active:8132kB inactive:5268kB present:16384kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
Normal free:2488kB min:1916kB low:2392kB high:2872kB active:140804kB inactive:75400kB present:245696kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 32*4kB 1*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 280kB
Normal: 0*4kB 1*8kB 7*16kB 0*32kB 3*64kB 1*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2488kB
HighMem: empty
Swap cache: add 142576, delete 111546, find 38489/46005, race 0+0
Free swap: 635524kB
65520 pages of RAM
0 pages of HIGHMEM
4391 reserved pages
4945 pages shared
31030 pages swap cached
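
To see how swap use grows over the four snapshots, the "Free swap" values can
be subtracted from the swap size reported at boot ("Adding 979924k swap on
/dev/hda7"). The Python sketch below does only that subtraction. It assumes
the "k" in the boot message means kB and that the swap size did not change
during the test.

```python
SWAP_TOTAL_KB = 979924  # from "Adding 979924k swap on /dev/hda7"

# "Free swap" values from the four SysRq-M snapshots, in order.
FREE_SWAP_KB = [929164, 743204, 684360, 635524]


def swap_used_kb(total_kb, free_kb):
    """Swap in use at each snapshot: total minus free."""
    return [total_kb - free for free in free_kb]


def growth_kb(used):
    """Increase in swap use between consecutive snapshots."""
    return [later - earlier for earlier, later in zip(used, used[1:])]


used = swap_used_kb(SWAP_TOTAL_KB, FREE_SWAP_KB)
for number, kb in enumerate(used, start=1):
    print("snapshot %d: %d kB of swap in use" % (number, kb))
print("growth between snapshots:", growth_kb(used))
```

The largest jump is between the first and second snapshots, that is, between
the start of swap usage and the first screen freeze.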
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-17 10:45 2.6.10-rc3: kswapd eats CPU on start of memory-eating task Voluspa
@ 2004-12-18 23:02 ` Con Kolivas
0 siblings, 0 replies; 46+ messages in thread
From: Con Kolivas @ 2004-12-18 23:02 UTC (permalink / raw)
To: lista4; +Cc: akpm, nickpiggin, mr, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2737 bytes --]
Voluspa wrote:
> Sorry about the delay.
>
> On 2004-12-17 0:41:30 Andrew Morton wrote:
>
>
>>Can you identify the kernel release which caused the problem to start?
>
>
> My next mail did just that. Sort of. Somewhere between, and including,
> 2.6.9-rc1 and 2.6.9-rc1-bk4. The latter being the first functional kernel I can
> test due to oopses and loss of keyboard in X starting from -rc1.
>
>
>>>with a default of nice 19 and sucks up every free CPU cycle.
>>
>>What sucks up all the CPU? The application? kswapd?
>
>
> The folding client uses all unused CPU, as it should. What kswapd does is
> beyond my knowledge.
>
>
>>How much RAM, how much swap?
>
>
> 256 megabyte ram, about 1 gigabyte swap. You'll find more info in the next
> section.
>
> On 2004-12-16 8:14:44 Nick Piggin wrote:
>
>
>>So please, do the sysrq+m traces with a 2.6.10-rc3 kernel. Thanks.
>
>
> Ok, done. I can do the same with the last unaffected kernel, 2.6.8.1-bk2, upon
> request (didn't want to spam unless told to). Log from dmesg attached. I don't
> want to "inline" it since my ISP has changed the webmail program to some POS
> java where I have no control over the linebreaks.
>
> Testing explanation: Cold boot. Started the folding client and waited 15
> minutes for it to write the first checkpoint (wanted full stability). Started
> X. Started Blender. Loaded a scene where I only use the "Sequence Editor"
> mode.
>
> In this mode there's a 'preview' window where you can press Alt-a, for animate,
> and watch your work in almost real time. Overhead prevents true real time.
> Here I let the animation loop until the testing is over.
>
> What happens during animation is that my 500 pictures of 1.2 meg each (ie 20
> seconds) are read from /dev/hdb (a slightly better and more modern disk), fill
> up memory, and then start using the swap partition on /dev/hda. The read from
> /dev/hdb seems to be done only once since neither memory nor swap is released
> until I close the scene.
>
> The machine CPU usage, as monitored by Gkrellm, is highest during the initial
> phase of swapping, about 50 percent (not counting the niced folding client
> usage) and then falls to about 15 percent when all swapping is done. How
> high it reaches during the screen freezes I don't know.
>
> The sysrq+m snapshots were taken thusly: 1) Some seconds after the beginning of
> swap usage. 2) When the first screen freeze began. 3) In another screen
> freeze. 4) In the last minute of swapping, also during a screen freeze.
>
> Total wall clock was about 3 minutes from beginning of animation to when all
> swapping had been done and the animation was "stable".
Try disabling the swap token
echo 0 > /proc/sys/vm/swap_token_timeout
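
For anyone scripting this test, here is a hedged Python sketch of the same
write that also remembers the previous setting so it can be restored
afterwards. The path is the one from the command above. The function name
set_sysctl is made up for this example, and the file may not exist on kernels
without the swap token code, in which case the function does nothing. Writing
to /proc/sys needs root.

```python
import os

# Path taken from the echo command above.
SWAP_TOKEN_TIMEOUT = "/proc/sys/vm/swap_token_timeout"


def set_sysctl(value, path=SWAP_TOKEN_TIMEOUT):
    """Write an integer to a sysctl file.

    Returns the previous contents as a string, or None if the file
    does not exist (nothing is written in that case).
    """
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        previous = handle.read().strip()
    with open(path, "w") as handle:
        handle.write("%d\n" % value)
    return previous


if __name__ == "__main__":
    old = set_sysctl(0)
    if old is None:
        print("no swap_token_timeout sysctl on this kernel")
    else:
        print("swap_token_timeout was %s, now 0" % old)
```

Passing the saved value back to set_sysctl after the test restores the
original timeout.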
Cheers,
Con
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-23 13:26 ` Rik van Riel
@ 2004-12-23 13:28 ` Rik van Riel
0 siblings, 0 replies; 46+ messages in thread
From: Rik van Riel @ 2004-12-23 13:28 UTC (permalink / raw)
To: Zou, Nanhai; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
On Thu, 23 Dec 2004, Rik van Riel wrote:
>>> You need the oneline patch that Andrew Morton posted two
>>> days ago:
>>>
>>> Message-Id: <20041219230754.64c0e52e.akpm@osdl.org>
>>
>> You mean the one that totally disables swap_token?
Oops, wrong thread ;( You need this one:
Message-Id: <20041220125443.091a911b.akpm@osdl.org>
We haven't been incrementing local variable total_scanned since the
scan_control stuff went in. That broke kswapd throttling.
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
25-akpm/mm/vmscan.c | 1 +
1 files changed, 1 insertion(+)
--- linux-2.6.9/mm/vmscan.c.oom 2004-12-21 11:26:20.343790527 -0500
+++ linux-2.6.9/mm/vmscan.c 2004-12-21 11:27:43.514384221 -0500
@@ -1079,6 +1079,7 @@
shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
sc.nr_reclaimed += reclaim_state->reclaimed_slab;
total_reclaimed += sc.nr_reclaimed;
+ total_scanned += sc.nr_scanned;
if (zone->all_unreclaimable)
continue;
if (zone->pages_scanned >= (zone->nr_active +
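
Why one missing line matters can be shown with a toy model. The Python sketch
below is not the kernel code. It assumes, for illustration only, that the
reclaim pass decides whether to back off by checking whether total_scanned is
non-zero after walking the zones; the actual throttle condition is outside
the hunk above. The function name balance_pass and the page counts are
invented.

```python
def balance_pass(scanned_per_zone, accumulate_scanned):
    """Toy model of one reclaim pass over the zones.

    scanned_per_zone: pages scanned in each zone (made-up numbers).
    accumulate_scanned: True models the patched code, False the code
    before the patch, where total_scanned was never incremented.
    Returns (total_scanned, would_throttle).
    """
    total_scanned = 0
    for nr_scanned in scanned_per_zone:
        if accumulate_scanned:
            total_scanned += nr_scanned  # the line the patch adds
    # Assumed rule for this sketch: throttle only if something was scanned.
    would_throttle = total_scanned > 0
    return total_scanned, would_throttle


print("before patch:", balance_pass([32, 64], accumulate_scanned=False))
print("after patch: ", balance_pass([32, 64], accumulate_scanned=True))
```

In this model the unpatched pass always sees a total of zero and never backs
off, which is one way to picture "that broke kswapd throttling".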
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-23 0:26 Zou, Nanhai
@ 2004-12-23 13:26 ` Rik van Riel
2004-12-23 13:28 ` Rik van Riel
0 siblings, 1 reply; 46+ messages in thread
From: Rik van Riel @ 2004-12-23 13:26 UTC (permalink / raw)
To: Zou, Nanhai; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
On Thu, 23 Dec 2004, Zou, Nanhai wrote:
> Rik van Riel wrote:
>>> It seems that vmscan-ignore-swap-token-when-in-trouble.patch +
>>> vm-pageout-throttling.patch do not fix the problem.
>>> I ran a stress test on 2.6.9 + these 2 patches.
>>> The OOM killer was still triggered.
>>
>> You need the oneline patch that Andrew Morton posted two
>> days ago:
>>
>> Message-Id: <20041219230754.64c0e52e.akpm@osdl.org>
>
> You mean the one that totally disables swap_token?
No, the other one. The one from the email with the message-id
above ;)
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-23 0:26 Zou, Nanhai
2004-12-23 13:26 ` Rik van Riel
0 siblings, 1 reply; 46+ messages in thread
From: Zou, Nanhai @ 2004-12-23 0:26 UTC (permalink / raw)
To: Rik van Riel; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
Rik van Riel wrote:
> > It seems that vmscan-ignore-swap-token-when-in-trouble.patch +
> > vm-pageout-throttling.patch do not fix the problem.
> > I ran a stress test on 2.6.9 + these 2 patches.
> > The OOM killer was still triggered.
>
> You need the oneline patch that Andrew Morton posted two
> days ago:
>
> Message-Id: <20041219230754.64c0e52e.akpm@osdl.org>
You mean the one that totally disables swap_token?
I tried it just yesterday on a RHEL4-PRERC kernel, which is based
on 2.6.9.
I still see the OOM killer within a couple of hours.
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-22 8:45 Zou, Nanhai
@ 2004-12-22 14:23 ` Rik van Riel
0 siblings, 0 replies; 46+ messages in thread
From: Rik van Riel @ 2004-12-22 14:23 UTC (permalink / raw)
To: Zou, Nanhai; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
On Wed, 22 Dec 2004, Zou, Nanhai wrote:
>> That's Marcelo's vm-pageout-throttling.patch, which is one
>> of the essential ingredients in avoiding false OOM kills.
>>
>> I'm waiting on some test results for another two patches
>> that I suspect are also needed ...
> It seems that vmscan-ignore-swap-token-when-in-trouble.patch +
> vm-pageout-throttling.patch do not fix the problem.
> I ran a stress test on 2.6.9 + these 2 patches.
> The OOM killer was still triggered.
You need the oneline patch that Andrew Morton posted two
days ago:
Message-Id: <20041219230754.64c0e52e.akpm@osdl.org>
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-22 8:45 Zou, Nanhai
2004-12-22 14:23 ` Rik van Riel
0 siblings, 1 reply; 46+ messages in thread
From: Zou, Nanhai @ 2004-12-22 8:45 UTC (permalink / raw)
To: Rik van Riel; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
> -----Original Message-----
> From: Rik van Riel [mailto:riel@redhat.com]
> Sent: Monday, December 20, 2004 11:08 PM
> To: Zou, Nanhai
> Cc: Nick Piggin; Andrew Morton; lista4@comhem.se;
> linux-kernel@vger.kernel.org; mr@ramendik.ru; kernel@kolivas.org
Rik van Riel wrote:
> That's Marcelo's vm-pageout-throttling.patch, which is one
> of the essential ingredients in avoiding false OOM kills.
>
> I'm waiting on some test results for another two patches
> that I suspect are also needed ...
>
> --
It seems that vmscan-ignore-swap-token-when-in-trouble.patch +
vm-pageout-throttling.patch do not fix the problem.
I ran a stress test on 2.6.9 + these 2 patches.
The OOM killer was still triggered.
Zou Nan hai
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 12:59 Voluspa
@ 2004-12-21 1:46 ` Mikhail Ramendik
0 siblings, 0 replies; 46+ messages in thread
From: Mikhail Ramendik @ 2004-12-21 1:46 UTC (permalink / raw)
To: lista4; +Cc: kernel, nickpiggin, akpm, linux-kernel, riel
Voluspa wrote:
> >This patch should have the desired effect.
>
> Yes, it sure has. And with that I mean, YES. My testcase shows no freezes
> now, and it has the same swapping time as 2.6.8.1-bk2.
Confirmed.
On 2.6.10-rc3 with Con's patch, when I run the memory eater, there is a high
kswapd CPU load for about 10 seconds, then things are OK. The screen never
freezes at that time or at any other moment.
When I add vm-pageout-throttling.patch from -mm, the CPU load in the beginning
is somewhat less constant but remains there.
While it would be nice to fix the high CPU load, the system is usable as it
is.
--
Yours, Mikhail Ramendik
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 17:49 ` Hideo AOKI
@ 2004-12-20 23:51 ` Nick Piggin
0 siblings, 0 replies; 46+ messages in thread
From: Nick Piggin @ 2004-12-20 23:51 UTC (permalink / raw)
To: Hideo AOKI; +Cc: Andrew Morton, lista4, linux-kernel, mr, kernel, riel
Hideo AOKI wrote:
> Nick Piggin wrote:
>
>
>>Andrew Morton wrote:
>
> [snip]
>
>>>Did anyone come up with a simple step-by-step procedure for reproducing
>>>the problem? It would be good if someone could do this, because I don't
>>>think we understand the root cause yet?
>>
>>I admit to generally being in the same boat as you with respect to
>>running complex userspace apps.
>>
>>However, based on this and other scattered reports, I'd say it seems
>>quite likely that token based thrashing control is the culprit. Based
>>on the cost/benefit, I wonder if we should disable TBTC by default for
>>2.6.10, rather than trying to fix it, and try again for 2.6.11?
>
>
> Hello,
>
> I imagine that the issue might occur when a single process holds
> almost all memory and keeps the swap token for too long.
>
> However, TBTC has a good effect in my workload.
> So, I think that it is better to keep TBTC available as a VM tunable.
>
> It may be a good idea to set the default swap_token_timeout to 0
> until we find the root cause.
>
Yes, with Con's patch to have TBTC turned off when swap_token_timeout
is set to zero. TBTC causes unacceptable regressions, so that is the
best way to go.
It would be great to get it fixed, but I would be worried about putting
in new patches for it now, right before 2.6.10.
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 7:44 ` Nick Piggin
2004-12-20 8:03 ` Con Kolivas
2004-12-20 12:06 ` Ed Tomlinson
@ 2004-12-20 17:49 ` Hideo AOKI
2004-12-20 23:51 ` Nick Piggin
2 siblings, 1 reply; 46+ messages in thread
From: Hideo AOKI @ 2004-12-20 17:49 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, lista4, linux-kernel, mr, kernel, riel
Nick Piggin wrote:
> Andrew Morton wrote:
[snip]
>> Did anyone come up with a simple step-by-step procedure for reproducing
>> the problem? It would be good if someone could do this, because I don't
>> think we understand the root cause yet?
>
> I admit to generally being in the same boat as you with respect to
> running complex userspace apps.
>
> However, based on this and other scattered reports, I'd say it seems
> quite likely that token based thrashing control is the culprit. Based
> on the cost/benefit, I wonder if we should disable TBTC by default for
> 2.6.10, rather than trying to fix it, and try again for 2.6.11?
Hello,
I imagine that the issue might occur when a single process holds
almost all memory and keeps the swap token for too long.
However, TBTC has a good effect in my workload.
So, I think that it is better to keep TBTC available as a VM tunable.
It may be a good idea to set the default swap_token_timeout to 0
until we find the root cause.
Best regards,
Hideo AOKI
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 9:22 Zou, Nanhai
@ 2004-12-20 15:08 ` Rik van Riel
0 siblings, 0 replies; 46+ messages in thread
From: Rik van Riel @ 2004-12-20 15:08 UTC (permalink / raw)
To: Zou, Nanhai; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel
On Mon, 20 Dec 2004, Zou, Nanhai wrote:
> With 2.6.9 + vmscan-ignore-swap-token-when-in-trouble.patch
> the OOM killer is invoked after around 30 hours.
>
> 2.6.10-rc3-mm1, meanwhile, seems to be much more stable.
> At least for the test I was running, it passed a 48-hour test.
That's Marcelo's vm-pageout-throttling.patch, which is one
of the essential ingredients in avoiding false OOM kills.
I'm waiting on some test results for another two patches
that I suspect are also needed ...
--
He did not think of himself as a tourist; he was a traveler. The difference is
partly one of time, he would explain. Whereas the tourist generally hurries
back home at the end of a few weeks or months, the traveler belonging no more
to one place than to the next, moves slowly, over periods of years, from one
part of the earth to another. Indeed, he would have found it difficult to tell,
among the many places he had lived, precisely where it was he had felt most at
home. -- Paul Bowles
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 9:07 ` mr
@ 2004-12-20 15:06 ` Rik van Riel
0 siblings, 0 replies; 46+ messages in thread
From: Rik van Riel @ 2004-12-20 15:06 UTC (permalink / raw)
To: mr; +Cc: Andrew Morton, lista4, linux-kernel, nickpiggin, kernel
On Mon, 20 Dec 2004 mr@ramendik.ru wrote:
> - Enjoy :) "eatmemory" will slowly eat up more and more RAM (visible in
> top as RSS); under 2.6.8.1 no screen freezes come, and under 2.6.9 and
> 2.6.10-rc3 they do come; under 2.6.10-rc3 I also see high CPU periods for
> kswapd.
The high cpu use for kswapd should be fixed by applying
the vm-pageout-throttling.patch patch from -mm.
I'll also come up with a patch to not have the swap token
used when the system is not under a swapin load...
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-20 12:59 Voluspa
2004-12-21 1:46 ` Mikhail Ramendik
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-20 12:59 UTC (permalink / raw)
To: kernel; +Cc: nickpiggin, akpm, linux-kernel, mr, riel
Con Kolivas wrote:
>> Logistically what makes sense is if a timeout of 0 is used as a test
>> that completely disables it (avoids another sysctl too). In time for
>> 2.6.10 we should disable it by default until the regressions are better
>> understood. Tuning it into a useful "on" position can happen later and I
>> suspect requires more code.
>
>This patch should have the desired effect.
Yes, it sure has. And with that I mean, YES. My testcase shows no freezes now,
and it has the same swapping time as 2.6.8.1-bk2.
Thanks Con,
Mats Johannesson
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 8:58 ` Con Kolivas
@ 2004-12-20 12:55 ` Andrea Arcangeli
0 siblings, 0 replies; 46+ messages in thread
From: Andrea Arcangeli @ 2004-12-20 12:55 UTC (permalink / raw)
To: Con Kolivas; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, riel
On Mon, Dec 20, 2004 at 07:58:40PM +1100, Con Kolivas wrote:
> This patch should have the desired effect.
Looks great, Con, thanks.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 12:06 ` Ed Tomlinson
@ 2004-12-20 12:29 ` Con Kolivas
0 siblings, 0 replies; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 12:29 UTC (permalink / raw)
To: Ed Tomlinson
Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, kernel, riel
Ed Tomlinson writes:
> On Monday 20 December 2004 02:44, Nick Piggin wrote:
>> Andrew Morton wrote:
>> > Voluspa <lista4@comhem.se> wrote:
>> >
>> >>Would be nice though if someone else could verify...
>> >
>> >
>> > Well I'd love to, but afaik the only workloads which we currently know of
>> > involve complex userspace apps which I have no experience running.
>> >
>> > Did anyone come up with a simple step-by-step procedure for reproducing the
>> > problem? It would be good if someone could do this, because I don't think
>> > we understand the root cause yet?
>> >
>>
>> I admit to generally being in the same boat as you with respect to
>> running complex userspace apps.
>>
>> However, based on this and other scattered reports, I'd say it seems
>> quite likely that token based thrashing control is the culprit. Based
>> on the cost/benefit, I wonder if we should disable TBTC by default for
>> 2.6.10, rather than trying to fix it, and try again for 2.6.11?
>>
>> Rik? Andrew?
>>
>> Also, it would be nice to have a sysctl to *completely* disable TBTC,
>> that would make testing easier.
>
> Except that disabling it (with 0) reportedly did not solve the problem. There is
> a possibility that it's a more complex issue...
Disabling it is more than setting it to 0. Removing the patch disables it
and this does fix the problem. We need it to be truly possible to disable
it. See the patch I posted on this thread later.
Con
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 7:44 ` Nick Piggin
2004-12-20 8:03 ` Con Kolivas
@ 2004-12-20 12:06 ` Ed Tomlinson
2004-12-20 12:29 ` Con Kolivas
2004-12-20 17:49 ` Hideo AOKI
2 siblings, 1 reply; 46+ messages in thread
From: Ed Tomlinson @ 2004-12-20 12:06 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, lista4, linux-kernel, mr, kernel, riel
On Monday 20 December 2004 02:44, Nick Piggin wrote:
> Andrew Morton wrote:
> > Voluspa <lista4@comhem.se> wrote:
> >
> >>Would be nice though if someone else could verify...
> >
> >
> > Well I'd love to, but afaik the only workloads which we currently know of
> > involve complex userspace apps which I have no experience running.
> >
> > Did anyone come up with a simple step-by-step procedure for reproducing the
> > problem? It would be good if someone could do this, because I don't think
> > we understand the root cause yet?
> >
>
> I admit to generally being in the same boat as you with respect to
> running complex userspace apps.
>
> However, based on this and other scattered reports, I'd say it seems
> quite likely that token based thrashing control is the culprit. Based
> on the cost/benefit, I wonder if we should disable TBTC by default for
> 2.6.10, rather than trying to fix it, and try again for 2.6.11?
>
> Rik? Andrew?
>
> Also, it would be nice to have a sysctl to *completely* disable TBTC,
> that would make testing easier.
Except that disabling it (with 0) reportedly did not solve the problem. There is
a possibility that it's a more complex issue...
Ed Tomlinson
^ permalink raw reply [flat|nested] 46+ messages in thread
* RE: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-20 9:22 Zou, Nanhai
2004-12-20 15:08 ` Rik van Riel
0 siblings, 1 reply; 46+ messages in thread
From: Zou, Nanhai @ 2004-12-20 9:22 UTC (permalink / raw)
To: Nick Piggin, Andrew Morton; +Cc: lista4, linux-kernel, mr, kernel, riel
> However, based on this and other scattered reports, I'd say it seems
> quite likely that token based thrashing control is the culprit. Based
> on the cost/benefit, I wonder if we should disable TBTC by default for
> 2.6.10, rather than trying to fix it, and try again for 2.6.11?
>
> Rik? Andrew?
>
> Also, it would be nice to have a sysctl to *completely* disable TBTC,
> that would make testing easier.
>
> Nick
I have run some stress tests against 2.6.9,
2.6.9 + vmscan-ignore-swap-token-when-in-trouble.patch
and 2.6.10-rc3-mm1 on an Itanium2 with 4G memory.
With 2.6.9, the OOM killer is invoked within a few hours of the stress
test running.
With 2.6.9 + vmscan-ignore-swap-token-when-in-trouble.patch, the OOM
killer is invoked after around 30 hours.
2.6.10-rc3-mm1 seems to be much more stable: at least for the test I
was running, it passed a 48-hour run.
Zou Nan hai
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 7:12 ` Andrew Morton
2004-12-20 7:44 ` Nick Piggin
@ 2004-12-20 9:07 ` mr
2004-12-20 15:06 ` Rik van Riel
1 sibling, 1 reply; 46+ messages in thread
From: mr @ 2004-12-20 9:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: lista4, linux-kernel, nickpiggin, mr, kernel, riel
Hello,
> Did anyone come up with a simple step-by-step procedure for reproducing the
> problem? It would be good if someone could do this, because I don't think
> we understand the root cause yet?
Here's a step-by-step explanation of the way I test this:
- Get the Memory Eater and compile it:
http://lkml.org/lkml/2004/12/13/272
- Do a clean boot
- Start top, and some app that has a clock and preferably a CPU graph (to
monitor screen freezes and CPU load; it's IceWM for me)
- Start the Memory Eater
- Give it an amount of megabytes that is more than the actual RAM size. I
use a value of 300, as my computer has 256 M RAM.
- Enjoy :) "eatmemory" will slowly eat up more and more RAM (visible in
top as RSS); under 2.6.8.1 no screen freezes come, and under 2.6.9 and
2.6.10-rc3 they do come; under 2.6.10-rc3 I also see high CPU periods for
kswapd.
Yours, Mikhail Ramendik
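For readers who cannot fetch the linked program, a memory eater of the kind used in the procedure above can be sketched roughly as follows. This is not the eatmemory source from the URL; the names eat_memory and release_memory, the 1 MB chunk size and the fill pattern are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MB (1024UL * 1024UL)

/*
 * Hypothetical sketch, not the linked "eatmemory" program.
 * Allocate `megabytes` MB in 1 MB chunks and write to every byte so the
 * pages become resident (RSS grows in top). Returns the number of MB
 * actually allocated and touched. The chunk pointers are stored in
 * `chunks`, which must have room for `megabytes` entries.
 */
size_t eat_memory(size_t megabytes, char **chunks)
{
	size_t done;

	assert(chunks != NULL);
	for (done = 0; done < megabytes; done++) {
		chunks[done] = malloc(MB);
		if (chunks[done] == NULL)
			break;	/* allocation failed: stop and report progress */
		memset(chunks[done], 0xA5, MB);	/* touch every page */
	}
	return done;
}

/* Free the first `count` chunks obtained from eat_memory(). */
void release_memory(char **chunks, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		free(chunks[i]);
}
```

With an array of 300 pointers, eat_memory(300, chunks) would correspond to the value of 300 used on the 256 MB machine in the steps above. The real program reportedly grows slowly, which this sketch does not attempt to imitate.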
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 8:03 ` Con Kolivas
@ 2004-12-20 8:58 ` Con Kolivas
2004-12-20 12:55 ` Andrea Arcangeli
0 siblings, 1 reply; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 8:58 UTC (permalink / raw)
To: Con Kolivas; +Cc: Nick Piggin, Andrew Morton, lista4, linux-kernel, mr, riel
[-- Attachment #1.1: Type: text/plain, Size: 393 bytes --]
Con Kolivas wrote:
> Logistically what makes sense is if a timeout of 0 is used as a test
> that completely disables it (avoids another sysctl too). In time for
> 2.6.10 we should disable it by default until the regressions are better
> understood. Tuning it into a useful "on" position can happen later and I
> suspect requires more code.
This patch should have the desired effect.
Con
[-- Attachment #1.2: disable_thrash_control.patch --]
[-- Type: text/x-diff, Size: 1047 bytes --]
Index: linux-2.6.10-rc3/mm/rmap.c
===================================================================
--- linux-2.6.10-rc3.orig/mm/rmap.c 2004-12-06 13:14:01.000000000 +1100
+++ linux-2.6.10-rc3/mm/rmap.c 2004-12-20 19:54:42.416058897 +1100
@@ -395,6 +395,9 @@ int page_referenced(struct page *page, i
 {
 	int referenced = 0;
 
+	if (!swap_token_default_timeout)
+		ignore_token = 1;
+
 	if (page_test_and_clear_young(page))
 		referenced++;
 
Index: linux-2.6.10-rc3/mm/thrash.c
===================================================================
--- linux-2.6.10-rc3.orig/mm/thrash.c 2004-12-06 13:14:01.000000000 +1100
+++ linux-2.6.10-rc3/mm/thrash.c 2004-12-20 19:56:01.594602700 +1100
@@ -19,7 +19,10 @@ unsigned long swap_token_check;
 struct mm_struct * swap_token_mm = &init_mm;
 
 #define SWAP_TOKEN_CHECK_INTERVAL (HZ * 2)
-#define SWAP_TOKEN_TIMEOUT (HZ * 300)
+#define SWAP_TOKEN_TIMEOUT 0
+/*
+ * Currently disabled; Needs further code to work at HZ * 300.
+ */
 unsigned long swap_token_default_timeout = SWAP_TOKEN_TIMEOUT;
 
 /*
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 7:44 ` Nick Piggin
@ 2004-12-20 8:03 ` Con Kolivas
2004-12-20 8:58 ` Con Kolivas
2004-12-20 12:06 ` Ed Tomlinson
2004-12-20 17:49 ` Hideo AOKI
2 siblings, 1 reply; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 8:03 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, lista4, linux-kernel, mr, riel
[-- Attachment #1: Type: text/plain, Size: 1345 bytes --]
Nick Piggin wrote:
> Andrew Morton wrote:
>
>> Voluspa <lista4@comhem.se> wrote:
>>
>>> Would be nice though if someone else could verify...
>>
>>
>>
>> Well I'd love to, but afaik the only workloads which we currently know of
>> involve complex userspace apps which I have no experience running.
>>
>> Did anyone come up with a simple step-by-step procedure for reproducing the
>> problem? It would be good if someone could do this, because I don't think
>> we understand the root cause yet?
>>
>
> I admit to generally being in the same boat as you with respect to
> running complex userspace apps.
>
> However, based on this and other scattered reports, I'd say it seems
> quite likely that token based thrashing control is the culprit. Based
> on the cost/benefit, I wonder if we should disable TBTC by default for
> 2.6.10, rather than trying to fix it, and try again for 2.6.11?
>
> Rik? Andrew?
>
> Also, it would be nice to have a sysctl to *completely* disable TBTC,
> that would make testing easier.
Logistically what makes sense is if a timeout of 0 is used as a test
that completely disables it (avoids another sysctl too). In time for
2.6.10 we should disable it by default until the regressions are better
understood. Tuning it into a useful "on" position can happen later and I
suspect requires more code.
Con
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 7:12 ` Andrew Morton
@ 2004-12-20 7:44 ` Nick Piggin
2004-12-20 8:03 ` Con Kolivas
` (2 more replies)
2004-12-20 9:07 ` mr
1 sibling, 3 replies; 46+ messages in thread
From: Nick Piggin @ 2004-12-20 7:44 UTC (permalink / raw)
To: Andrew Morton; +Cc: lista4, linux-kernel, mr, kernel, riel
Andrew Morton wrote:
> Voluspa <lista4@comhem.se> wrote:
>
>>Would be nice though if someone else could verify...
>
>
> Well I'd love to, but afaik the only workloads which we currently know of
> involve complex userspace apps which I have no experience running.
>
> Did anyone come up with a simple step-by-step procedure for reproducing the
> problem? It would be good if someone could do this, because I don't think
> we understand the root cause yet?
>
I admit to generally being in the same boat as you with respect to
running complex userspace apps.
However, based on this and other scattered reports, I'd say it seems
quite likely that token based thrashing control is the culprit. Based
on the cost/benefit, I wonder if we should disable TBTC by default for
2.6.10, rather than trying to fix it, and try again for 2.6.11?
Rik? Andrew?
Also, it would be nice to have a sysctl to *completely* disable TBTC,
that would make testing easier.
Nick
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 6:51 Voluspa
@ 2004-12-20 7:12 ` Andrew Morton
2004-12-20 7:44 ` Nick Piggin
2004-12-20 9:07 ` mr
0 siblings, 2 replies; 46+ messages in thread
From: Andrew Morton @ 2004-12-20 7:12 UTC (permalink / raw)
To: lista4; +Cc: linux-kernel, nickpiggin, mr, kernel, riel
Voluspa <lista4@comhem.se> wrote:
>
> Would be nice though if someone else could verify...
Well I'd love to, but afaik the only workloads which we currently know of
involve complex userspace apps which I have no experience running.
Did anyone come up with a simple step-by-step procedure for reproducing the
problem? It would be good if someone could do this, because I don't think
we understand the root cause yet?
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 4:33 ` Nick Piggin
@ 2004-12-20 7:07 ` Andrew Morton
0 siblings, 0 replies; 46+ messages in thread
From: Andrew Morton @ 2004-12-20 7:07 UTC (permalink / raw)
To: Nick Piggin; +Cc: riel, kernel, mr, akpm, lista4, linux-kernel
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> On Sun, 2004-12-19 at 22:21 -0500, Rik van Riel wrote:
> > On Mon, 20 Dec 2004, Con Kolivas wrote:
> >
> > > I still suspect the thrash token patch even with the swap token timeout
> > > at 0. Is it completely disabled at 0 or does it still do something?
> >
> > It makes it harder to page out pages from the task holding the
> > token. I wonder if kswapd should try to steal the token away
> > from the task holding it, so in effect nobody holds the token
> > when the system isn't under a heavy swapping load.
> >
>
> In that case, the first thing we need to do is disable thrash token
> completely, and retest that. We still don't know for sure that it is
> the problem.
>
> I don't have the code in front of me at the moment, but I'll be able
> to send a patch to do that in a couple of hours, if nobody beats me
> to it.
This should disable the thrashing control code?
--- 25/mm/rmap.c~a 2004-12-19 23:05:58.759420936 -0800
+++ 25-akpm/mm/rmap.c 2004-12-19 23:06:43.105679280 -0800
@@ -395,6 +395,8 @@ int page_referenced(struct page *page, i
 {
 	int referenced = 0;
 
+	ignore_token = 1;
+
 	if (page_test_and_clear_young(page))
 		referenced++;
 
_
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-20 6:51 Voluspa
2004-12-20 7:12 ` Andrew Morton
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-20 6:51 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm, nickpiggin, mr, kernel, riel
Bingo.
[PATCH] token based thrashing control
http://marc.theaimsgroup.com/?l=bk-commits-head&m=109330925227996&w=2
With that one backed out, 2.6.9-rc1 behaves just like 2.6.8.1-bk2, i.e. no
freezes and swapping done in 1 minute in my testcase. Tested both with and
without lapic=lapic (due to my own mind demons ;-)
If someone doubts my ability to back out a patch, here's how it looked:
root:loke:/usr/src/debug/1-mydebug/linux-2.6.9-rc1-debug-notoken# patch -Rp1 -i ../token.patch
patching file include/linux/sched.h
patching file include/linux/swap.h
patching file kernel/fork.c
patching file mm/Makefile
patching file mm/filemap.c
Hunk #1 succeeded at 1246 (offset 51 lines).
patching file mm/memory.c
patching file mm/rmap.c
patching file mm/thrash.c
Then I diffed the original tree and this notoken-tree and eyeball-compared it
with the patch (had to first delete a mm/filemap.c~ backup left by the patch
program). Was all OK.
Would be nice though if someone else could verify...
This is also that time of the year when no strict timetables can be made. I'll
be available on and off for the next 48 hours if some testing needs to be
done.
Mvh
Mats Johannesson
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 3:21 ` Rik van Riel
2004-12-20 4:13 ` Con Kolivas
@ 2004-12-20 4:33 ` Nick Piggin
2004-12-20 7:07 ` Andrew Morton
1 sibling, 1 reply; 46+ messages in thread
From: Nick Piggin @ 2004-12-20 4:33 UTC (permalink / raw)
To: Rik van Riel
Cc: Con Kolivas, Mikhail Ramendik, Andrew Morton, lista4, linux-kernel
On Sun, 2004-12-19 at 22:21 -0500, Rik van Riel wrote:
> On Mon, 20 Dec 2004, Con Kolivas wrote:
>
> > I still suspect the thrash token patch even with the swap token timeout
> > at 0. Is it completely disabled at 0 or does it still do something?
>
> It makes it harder to page out pages from the task holding the
> token. I wonder if kswapd should try to steal the token away
> from the task holding it, so in effect nobody holds the token
> when the system isn't under a heavy swapping load.
>
In that case, the first thing we need to do is disable thrash token
completely, and retest that. We still don't know for sure that it is
the problem.
I don't have the code in front of me at the moment, but I'll be able
to send a patch to do that in a couple of hours, if nobody beats me
to it.
Nick
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 4:18 ` Rik van Riel
@ 2004-12-20 4:21 ` Con Kolivas
0 siblings, 0 replies; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 4:21 UTC (permalink / raw)
To: Rik van Riel
Cc: Mikhail Ramendik, Andrew Morton, Nick Piggin, lista4, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 386 bytes --]
Rik van Riel wrote:
> On Mon, 20 Dec 2004, Con Kolivas wrote:
>
>> What if the token isn't handed out at all until a heavy swapping load
>> starts? A slight delay in thrash control would be worth it.
>
>
> How do you define "heavy swapping" ?
>
> How would you measure it ?
>
> How would you relinquish the token after the "heavy swapping"
> load stopped ?
>
N F I
Cheers,
Con
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 4:13 ` Con Kolivas
@ 2004-12-20 4:18 ` Rik van Riel
2004-12-20 4:21 ` Con Kolivas
0 siblings, 1 reply; 46+ messages in thread
From: Rik van Riel @ 2004-12-20 4:18 UTC (permalink / raw)
To: Con Kolivas
Cc: Mikhail Ramendik, Andrew Morton, Nick Piggin, lista4, linux-kernel
On Mon, 20 Dec 2004, Con Kolivas wrote:
> What if the token isn't handed out at all until a heavy swapping load
> starts? A slight delay in thrash control would be worth it.
How do you define "heavy swapping" ?
How would you measure it ?
How would you relinquish the token after the "heavy swapping"
load stopped ?
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 3:21 ` Rik van Riel
@ 2004-12-20 4:13 ` Con Kolivas
2004-12-20 4:18 ` Rik van Riel
2004-12-20 4:33 ` Nick Piggin
1 sibling, 1 reply; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 4:13 UTC (permalink / raw)
To: Rik van Riel
Cc: Mikhail Ramendik, Andrew Morton, Nick Piggin, lista4, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 608 bytes --]
Rik van Riel wrote:
> On Mon, 20 Dec 2004, Con Kolivas wrote:
>
>> I still suspect the thrash token patch even with the swap token
>> timeout at 0. Is it completely disabled at 0 or does it still do
>> something?
>
>
> It makes it harder to page out pages from the task holding the
> token. I wonder if kswapd should try to steal the token away
> from the task holding it, so in effect nobody holds the token
> when the system isn't under a heavy swapping load.
>
What if the token isn't handed out at all until a heavy swapping load
starts? A slight delay in thrash control would be worth it.
Con
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 3:02 ` Con Kolivas
@ 2004-12-20 3:21 ` Rik van Riel
2004-12-20 4:13 ` Con Kolivas
2004-12-20 4:33 ` Nick Piggin
0 siblings, 2 replies; 46+ messages in thread
From: Rik van Riel @ 2004-12-20 3:21 UTC (permalink / raw)
To: Con Kolivas
Cc: Mikhail Ramendik, Andrew Morton, Nick Piggin, lista4, linux-kernel
On Mon, 20 Dec 2004, Con Kolivas wrote:
> I still suspect the thrash token patch even with the swap token timeout
> at 0. Is it completely disabled at 0 or does it still do something?
It makes it harder to page out pages from the task holding the
token. I wonder if kswapd should try to steal the token away
from the task holding it, so in effect nobody holds the token
when the system isn't under a heavy swapping load.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-20 0:03 ` Mikhail Ramendik
@ 2004-12-20 3:02 ` Con Kolivas
2004-12-20 3:21 ` Rik van Riel
0 siblings, 1 reply; 46+ messages in thread
From: Con Kolivas @ 2004-12-20 3:02 UTC (permalink / raw)
To: Mikhail Ramendik; +Cc: Andrew Morton, Nick Piggin, lista4, linux-kernel, riel
[-- Attachment #1: Type: text/plain, Size: 689 bytes --]
Mikhail Ramendik wrote:
> Andrew Morton wrote:
>
>
>>- Ask Voluspa to do
>>
>> echo 0 > /proc/sys/vm/swap_token_timeout
>>
>> on 2.6.10-rc3 and retest.
>
>
> He did, and I did (but I have not sent my report to lkml). In both cases,
> screen freezes remained but were now shorter (up to 10-20 sec). In
> my case I also monitored CPU load, and the big load peaks were there (the
> biggest one was in the beginning).
>
>
>>(We still don't know why it chews tons of CPU, do we?)
>
>
> It does! Any way to dig into this?
>
I still suspect the thrash token patch even with the swap token timeout
at 0. Is it completely disabled at 0 or does it still do something?
Con
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-19 23:57 ` Andrew Morton
@ 2004-12-20 0:03 ` Mikhail Ramendik
2004-12-20 3:02 ` Con Kolivas
0 siblings, 1 reply; 46+ messages in thread
From: Mikhail Ramendik @ 2004-12-20 0:03 UTC (permalink / raw)
To: Andrew Morton; +Cc: Nick Piggin, lista4, linux-kernel, kernel
Andrew Morton wrote:
> - Ask Voluspa to do
>
> echo 0 > /proc/sys/vm/swap_token_timeout
>
> on 2.6.10-rc3 and retest.
He did, and I did (but I have not sent my report to lkml). In both cases,
screen freezes remained but were now shorter (up to 10-20 sec). In
my case I also monitored CPU load, and the big load peaks were there (the
biggest one was in the beginning).
> (We still don't know why it chews tons of CPU, do we?)
It does! Any way to dig into this?
--
Yours, Mikhail Ramendik
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-19 22:56 ` Nick Piggin
2004-12-19 23:08 ` Mikhail Ramendik
@ 2004-12-19 23:57 ` Andrew Morton
2004-12-20 0:03 ` Mikhail Ramendik
1 sibling, 1 reply; 46+ messages in thread
From: Andrew Morton @ 2004-12-19 23:57 UTC (permalink / raw)
To: Nick Piggin; +Cc: lista4, linux-kernel, mr, kernel
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> Andrew, what should we do?
- Ask Voluspa to do
echo 0 > /proc/sys/vm/swap_token_timeout
on 2.6.10-rc3 and retest.
- Dig out Rik's token-timeout-autotuning patch, make it apply, test it,
then ask Voluspa and others to test that.
Have you time to look into the latter?
(We still don't know why it chews tons of CPU, do we?)
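For test scripts that need to flip the setting from a program rather than from the shell, the echo command suggested above amounts to a plain file write. A minimal sketch follows; write_sysctl_value() is a hypothetical helper name, and the only thing taken from the thread is the /proc/sys/vm/swap_token_timeout path, which exists only on kernels carrying the swap token code.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write an unsigned value followed by a newline to
 * `path`, as "echo VALUE > path" would. Returns 0 on success and -1 on
 * failure.
 */
int write_sysctl_value(const char *path, unsigned long value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (f == NULL)
		return -1;	/* e.g. not root, or the file does not exist */
	ok = fprintf(f, "%lu\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;		/* the write may only fail when flushed */
	return ok ? 0 : -1;
}
```

Run as root, write_sysctl_value("/proc/sys/vm/swap_token_timeout", 0) would be the equivalent of the command above. Checking the return value matters here, since a silent failure would make a retest look like it ran with the token disabled when it did not.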
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-19 23:12 Voluspa
0 siblings, 0 replies; 46+ messages in thread
From: Voluspa @ 2004-12-19 23:12 UTC (permalink / raw)
To: nickpiggin; +Cc: linux-kernel, akpm, mr, kernel
NP wrote:
> It would be nice to find out what is going on before 2.6.10 gets released,
> but Mats isn't going to be able to do any more testing for the moment.
> Andrew, what should we do?
I do have a window open 12 hours from now and going 12 hours forward. Anything
suggested in that period I can test.
Mvh
Mats Johannesson
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-19 22:56 ` Nick Piggin
@ 2004-12-19 23:08 ` Mikhail Ramendik
2004-12-19 23:57 ` Andrew Morton
1 sibling, 0 replies; 46+ messages in thread
From: Mikhail Ramendik @ 2004-12-19 23:08 UTC (permalink / raw)
To: Nick Piggin; +Cc: lista4, linux-kernel, akpm, kernel
Nick Piggin wrote:
> It would be nice to find out what is going on before 2.6.10 gets released,
> but Mats isn't going to be able to do any more testing for the moment.
> Andrew, what should we do?
I am ready to do the testing with the memory eater, and with "a compile in the
background plus the memory eater".
I'm not as good at kernel code management as Mats and did not do the
regression tests on -bk and past -rc versions, nor can I duplicate the X hack
on 2.6.9-rc1. But if you give me a patch against any numbered version
(2.6.8.1, 2.6.9) or against 2.6.10-rc3 , I'll gladly test it, with probably
no more than 24 h response time ;)
--
Yours, Mikhail Ramendik
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-19 22:40 Voluspa
@ 2004-12-19 22:56 ` Nick Piggin
2004-12-19 23:08 ` Mikhail Ramendik
2004-12-19 23:57 ` Andrew Morton
0 siblings, 2 replies; 46+ messages in thread
From: Nick Piggin @ 2004-12-19 22:56 UTC (permalink / raw)
To: lista4; +Cc: linux-kernel, akpm, mr, kernel
Voluspa wrote:
> Found the first kernel version with the regression. It's linux-2.6.9-rc1
>
Thanks!
"[PATCH] token based thrashing control" would be a prime suspect.
None of my infamous VM patches (which did cause random problems) had gone
into 2.6.9-rc1. The first ones were in 2.6.9-rc2.
Well, "[PATCH] make shrinker_sem an rwsem" was in -rc1; I guess that would
be worthwhile testing, if only because it touches vmscan.c
It would be nice to find out what is going on before 2.6.10 gets released,
but Mats isn't going to be able to do any more testing for the moment.
Andrew, what should we do?
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-19 22:40 Voluspa
2004-12-19 22:56 ` Nick Piggin
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-19 22:40 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm, nickpiggin, mr, kernel
Found the first kernel version with the regression. It's linux-2.6.9-rc1
Perusing lkml from August there was a short thread about the oopses and loss
of keyboard in X. Applying that information in a crude hack I was able to
test the affected 2.6.9-rc1 and three -bk forward:
http://marc.theaimsgroup.com/?t=109357291300002&r=1&w=2
diff -Naur linux-2.6.9-rc1/net/sunrpc/svcauth_unix.c linux-2.6.9-rc1-debug/net/sunrpc/svcauth_unix.c
--- linux-2.6.9-rc1/net/sunrpc/svcauth_unix.c 2004-12-15 18:39:28.000000000 +0100
+++ linux-2.6.9-rc1-debug/net/sunrpc/svcauth_unix.c 2004-12-19 19:01:53.000000000 +0100
@@ -104,7 +104,6 @@
 		if (test_bit(CACHE_VALID, &item->flags) &&
 		    !test_bit(CACHE_NEGATIVE, &item->flags))
 			auth_domain_put(&im->m_client->h);
-		kfree(im->m_class);
 		kfree(im);
 	}
 }
I've since tested and retested for several hours on the different kernels. At one
point I thought the usage of lapic=lapic made a difference, but it turned
out to be a red herring.
2.6.8.1-bk2 is without doubt the last kernel to handle my testcase "properly". There
are no freezes whatsoever and the swapping is finished within 1 minute and
some seconds.
2.6.9-rc1 and forward all have the freezes. Swapping and readback takes from
3 to 6 minutes. I can't find a pattern in the time differences.
What's left now is to find some repository which has the gargantuan 2.6.9-rc1
broken out in its pieces (and I guess 2.6.8.1-bk1 and 2 must be subtracted
from that). Then reverting patches. A process where I'd need some handholding
as to what would be likely candidates.
An innocent one is Ingo's "context-switching overhead in X, ioport()" patch.
I added it to 2.6.8.1-bk2 and it didn't break my testcase.
Ah, well. It's that time of the year, so I won't be able to do any testing until
the madness is over.
Mvh
Mats Johannesson
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-19 14:08 Voluspa
0 siblings, 0 replies; 46+ messages in thread
From: Voluspa @ 2004-12-19 14:08 UTC (permalink / raw)
To: kernel; +Cc: akpm, nickpiggin, mr, linux-kernel
ARGH... I hate my ISP's new webmail. Disregard previous change of Topic.
On 2004-12-18 23:02:33 Con Kolivas wrote:
> Try disabling the swap token
>
> echo 0 > /proc/sys/vm/swap_token_timeout
Hi Con. It changes the behaviour of my testcase, yes, but it doesn't cure the
problem. When swap_token_timeout is the default 300 the screen freezes are
longer in duration, about 30 seconds. With a swap_token_timeout of 0, the max
screen freeze is about 10 seconds, interleaved with shorter freezes.
On a positive note, the _total_ time of "instability" is equal in both cases,
which in my animation test means 6 minutes. I said 3 previously, but that
was wrong. Didn't let it run long enough.
Cheers,
Mats Johannesson
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-15 14:02 Voluspa
@ 2004-12-17 0:41 ` Andrew Morton
0 siblings, 0 replies; 46+ messages in thread
From: Andrew Morton @ 2004-12-17 0:41 UTC (permalink / raw)
To: lista4; +Cc: mr, nickpiggin, linux-kernel
Voluspa <lista4@comhem.se> wrote:
>
> I've now booted all -rc kernels from 2.6.8 to 2.6.10-rc3 and examined the
> behaviour of a heavy session with the 3D program Blender with regards to
> screen freezes and mouse unresponsiveness during memory swap.
Can you identify the kernel release which caused the problem to start?
> I find no problem when blender is the sole (large) application, but when a
> distributed computing client is running in the background the reported problems
> surface. I use http://folding.stanford.edu for protein folding. It runs
> with a default of nice 19 and sucks up every free CPU cycle.
What sucks up all the CPU? The application? kswapd?
How much RAM, how much swap?
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-16 8:03 ` Nick Piggin
@ 2004-12-16 8:14 ` Nick Piggin
0 siblings, 0 replies; 46+ messages in thread
From: Nick Piggin @ 2004-12-16 8:14 UTC (permalink / raw)
To: lista4; +Cc: mr, linux-kernel
Nick Piggin wrote:
> Voluspa wrote:
>
>> Earlier today I wrote:
>>
>>> I find no problem when blender is the sole (large) application, but
>>> when a distributed computing client is running in the background the
>>> reported problems surface. I use http://folding.stanford.edu for
>>> protein folding. It runs with a default of nice 19 and sucks up every
>>> free CPU cycle. I've never seen it interfere with anything prior to
>>> this swap issue - been running it since 2000.
>>
>> More testing done to find the breaking point. Running the folding
>> client and blender:
>>
>> 2.6.8.1-bk2 is the last kernel without _any_ swapping problem (no
>> screen freezes etc)
>> |
>> | 2.6.9-rc1 and three -bk forward have oopses and loss of keyboard in
>> | X. Can't test them.
>> |
>> 2.6.9-rc1-bk4 is the first functional kernel where the freezes show up.
>>
>> So it is a real regression.
>>
>
> Can you turn on magic sysrq in the kernel hacking menu, and press
> alt+sysrq+m a few times while kswapd is using lots of memory, please?
>
> Then run `dmesg -s 1000000 > dmesg.out`, and send the dmesg over,
> please?
>
By the way, I think the only relevant VM patches that went in between
2.6.8 and 2.6.9-rc2 are the following:
<nickpiggin@yahoo.com.au>
[PATCH] vm: writeout watermark tuning
<nickpiggin@yahoo.com.au>
[PATCH] vm: alloc_pages watermark fixes
<akpm@osdl.org>
[PATCH] alloc_pages priority tuning
The first one shouldn't do much, and the last two should definitely
be improving things rather than anything else, because they cause
kswapd to properly start freeing in the background rather than force
the app to do the memory freeing itself.
This did expose a couple of bugs in kswapd, which have since been fixed,
but the fixes are not in the 2.6.9-rc1-bk4 kernel.
So please, do the sysrq+m traces with a 2.6.10-rc3 kernel. Thanks.
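The background-versus-foreground freeing described above can be pictured with a toy model. This is only a sketch under stated assumptions, not kernel code: the names `zone_marks`, `on_allocation` and `kswapd_should_continue` are hypothetical, and the three thresholds are loosely modelled on the per-zone min/low/high watermarks of 2.6 kernels. It shows why a correctly tuned low watermark lets kswapd do the freeing before the allocating application is forced to reclaim memory itself.

```c
#include <assert.h>

/* Hypothetical per-zone thresholds, in pages (illustrative only). */
struct zone_marks {
    unsigned long pages_min;   /* below this, the allocating task must reclaim itself */
    unsigned long pages_low;   /* below this, kswapd is woken to free in the background */
    unsigned long pages_high;  /* kswapd keeps freeing until this level is reached */
};

enum reclaim_action { RECLAIM_NONE, RECLAIM_WAKE_KSWAPD, RECLAIM_DIRECT };

/* Decide what an allocation attempt triggers for a given free-page count. */
enum reclaim_action on_allocation(const struct zone_marks *z, unsigned long free_pages)
{
    /* the model only makes sense if the thresholds are ordered */
    assert(z->pages_min <= z->pages_low && z->pages_low <= z->pages_high);
    if (free_pages < z->pages_min)
        return RECLAIM_DIRECT;       /* the application pays for the freeing */
    if (free_pages < z->pages_low)
        return RECLAIM_WAKE_KSWAPD;  /* freeing happens in the background */
    return RECLAIM_NONE;
}

/* Once woken, kswapd keeps going until the high threshold is restored. */
int kswapd_should_continue(const struct zone_marks *z, unsigned long free_pages)
{
    return free_pages < z->pages_high;
}
```

In this model, the wider the gap between `pages_min` and `pages_low`, the more head start kswapd gets before an allocation falls through to direct reclaim.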
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-15 22:49 Voluspa
@ 2004-12-16 8:03 ` Nick Piggin
2004-12-16 8:14 ` Nick Piggin
0 siblings, 1 reply; 46+ messages in thread
From: Nick Piggin @ 2004-12-16 8:03 UTC (permalink / raw)
To: lista4; +Cc: mr, linux-kernel
Voluspa wrote:
> Earlier today I wrote:
>
>>I find no problem when blender is the sole (large) application, but when a
>>distributed computing client is running in the background the reported
>>problems surface. I use http://folding.stanford.edu for protein folding. It
>>runs with a default of nice 19 and sucks up every free CPU cycle. I've never
>>seen it interfere with anything prior to this swap issue - been running
>>it since 2000.
>
> More testing done to find the breaking point. Running the folding client and
> blender:
>
> 2.6.8.1-bk2 is the last kernel without _any_ swapping problem (no screen
> freezes etc)
> |
> | 2.6.9-rc1 and three -bk forward have oopses and loss of keyboard in X.
> | Can't test them.
> |
> 2.6.9-rc1-bk4 is the first functional kernel where the freezes show up.
>
> So it is a real regression.
>
Can you turn on magic sysrq in the kernel hacking menu, and press
alt+sysrq+m a few times while kswapd is using lots of memory, please?
Then run `dmesg -s 1000000 > dmesg.out`, and send the dmesg over,
please?
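For anyone who cannot hit the key combination while the machine is frozen, the memory dump can in principle be requested from a program instead. The following is a minimal sketch under assumptions: it relies on the kernel exposing `/proc/sysrq-trigger` (which needs magic SysRq support and root privileges), and the helper name `sysrq_trigger` is hypothetical. The path is a parameter so the helper can be tried against an ordinary file first.

```c
#include <assert.h>
#include <stdio.h>

/* Write a single SysRq command character to a trigger file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int sysrq_trigger(const char *trigger_path, char command)
{
    FILE *f;
    int ok;

    assert(trigger_path != NULL);
    f = fopen(trigger_path, "w");
    if (!f)
        return -1;
    ok = (fputc(command, f) != EOF);
    if (fclose(f) != 0)  /* the write may only fail when the buffer is flushed */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling `sysrq_trigger("/proc/sysrq-trigger", 'm')` as root should have the same effect as alt+sysrq+m; the dmesg step stays the same.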
Thanks,
Nick
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-15 22:49 Voluspa
2004-12-16 8:03 ` Nick Piggin
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-15 22:49 UTC (permalink / raw)
To: mr; +Cc: nickpiggin, linux-kernel
Earlier today I wrote:
>I find no problem when blender is the sole (large) application, but when a
>distributed computing client is running in the background the reported
>problems surface. I use http://folding.stanford.edu for protein folding. It
>runs with a default of nice 19 and sucks up every free CPU cycle. I've never
>seen it interfere with anything prior to this swap issue - been running
>it since 2000.
More testing done to find the breaking point. Running the folding client and
blender:
2.6.8.1-bk2 is the last kernel without _any_ swapping problem (no screen
freezes etc)
|
| 2.6.9-rc1 and three -bk forward have oopses and loss of keyboard in X.
| Can't test them.
|
2.6.9-rc1-bk4 is the first functional kernel where the freezes show up.
So it is a real regression.
Regards
Mats Johannesson
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-15 14:02 Voluspa
2004-12-17 0:41 ` Andrew Morton
0 siblings, 1 reply; 46+ messages in thread
From: Voluspa @ 2004-12-15 14:02 UTC (permalink / raw)
To: mr; +Cc: nickpiggin, linux-kernel
I've now booted all -rc kernels from 2.6.8 to 2.6.10-rc3 and examined the
behaviour of a heavy session with the 3D program Blender with regards to
screen freezes and mouse unresponsiveness during memory swap.
I find no problem when blender is the sole (large) application, but when a
distributed computing client is running in the background the reported problems
surface. I use http://folding.stanford.edu for protein folding. It runs
with a default of nice 19 and sucks up every free CPU cycle. I've never
seen it interfere with anything prior to this swap issue - been running
it since 2000.
Guess kernel people will say "don't do that then"...
Regards
Mats Johannesson
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-14 7:24 Voluspa
0 siblings, 0 replies; 46+ messages in thread
From: Voluspa @ 2004-12-14 7:24 UTC (permalink / raw)
To: mr; +Cc: linux-kernel
At 2004-12-14 2:28:59 Mikhail Ramendik wrote:
> BTW, somebody told me in a private email to try the oomkiller patch, but I
> could not extract it from the Web archive, so I don't have the latest version
> of that :( I'd appreciate if anyone emailed that to me, or gave me a link, or
> a pointer to instructions on getting it right from one of the Web archives.
Final incarnation can be picked up at
http://marc.theaimsgroup.com/?l=linux-kernel&m=110269783227867&w=2
But on my machine it doesn't address the issue you speak of. When I run
something as demanding as that (end of memory, eating a large chunk of swap)
it behaves like yours. Gkrellm stops - no screen updates, mouse becomes very
unresponsive etc. Though I saw that as "normal" for the workload.
In this apartment there's no difference between 2.6.9 patched with the kswapd
fix and the oomkill patch, or 2.6.10-rc3 with or without oomkill patch. Can't
comment on 2.6.8 since I didn't exhaust memory with applications back then.
Regards
Mats Johannesson
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-14 0:51 ` Nick Piggin
@ 2004-12-14 2:28 ` Mikhail Ramendik
0 siblings, 0 replies; 46+ messages in thread
From: Mikhail Ramendik @ 2004-12-14 2:28 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel, Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 2081 bytes --]
Nick Piggin wrote:
> > With kernel 2.6.10-rc3 and 256 M RAM, when I start a task that eats a lot
> > of RAM (for example, viewing a big TIFF file; also tested with a
> > synthetic "eater"), in the resulting swapping process kswapd takes quite
> > a bit of CPU time. The computer becomes extremely unresponsive, the clock
> > (in icewm) stops for periods of time up to a minute. And the task
> > startup itself is somewhat slow.
> >
> > I have checked both 2.6.8.1 and 2.6.9 for comparison, and they fare a lot
> > better. The CPU hogging is not there, the computer is much more
> > responsive, and the task starts faster.
> I'm not quite sure what the problem would be. Please check that you are
> using the same config for each kernel, and both kernels have detected the
> same amount of memory.
Seems so.
> Then, can you start by posting /proc/vmstat before and after running the
> synthetic "eater" for some amount of time, with both 2.6.9 and 2.6.10-rc3;
I have rerun the tests to record the data, and this time 2.6.9 behaved
differently. There was no CPU hog for kswapd, but at some point the computer
went un-interactive, and after about 20 seconds the task was killed.
2.6.8.1 hummed along nicely and remained interactive (somewhat jerky as one
would expect under heavy swapping, but at least the clock always ran)
2.6.10-rc3 started with some kswapd CPU hogging, and then became more and more
unresponsive. It took me some minutes to become able to simply get the vmstat
data!
The requested files (cat's of /proc/meminfo and /proc/slabinfo before the run,
and /proc/vmstat before and during the run ["after" for 2.6.9 which killed
the process]) are attached, as well as the eater code. I told the eater to
eat 300 MB.
BTW, somebody told me in a private email to try the oomkiller patch, but I
could not extract it from the Web archive, so I don't have the latest version
of that :( I'd appreciate if anyone emailed that to me, or gave me a link, or
a pointer to instructions on getting it right from one of the Web archives.
--
Yours, Mikhail Ramendik
[-- Attachment #2: meminfo.2.6.10-rc3 --]
[-- Type: text/plain, Size: 598 bytes --]
MemTotal: 255352 kB
MemFree: 5304 kB
Buffers: 6808 kB
Cached: 123708 kB
SwapCached: 0 kB
Active: 188196 kB
Inactive: 34092 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 255352 kB
LowFree: 5304 kB
SwapTotal: 2048244 kB
SwapFree: 2048188 kB
Dirty: 708 kB
Writeback: 0 kB
Mapped: 154868 kB
Slab: 12308 kB
CommitLimit: 2175920 kB
Committed_AS: 330276 kB
PageTables: 2704 kB
VmallocTotal: 778200 kB
VmallocUsed: 7092 kB
VmallocChunk: 770004 kB
[-- Attachment #3: vmstat.2.6.8.1.post --]
[-- Type: text/plain, Size: 679 bytes --]
nr_dirty 3
nr_writeback 4426
nr_unstable 0
nr_page_table_pages 653
nr_mapped 48547
nr_slab 3220
pgpgin 915374
pgpgout 837415
pswpin 150114
pswpout 201304
pgalloc_high 0
pgalloc_normal 1026690
pgalloc_dma 89983
pgfree 1117246
pgactivate 66409
pgdeactivate 283601
pgfault 794596
pgmajfault 21963
pgrefill_high 0
pgrefill_normal 803261
pgrefill_dma 111308
pgsteal_high 0
pgsteal_normal 238280
pgsteal_dma 30691
pgscan_kswapd_high 0
pgscan_kswapd_normal 563277
pgscan_kswapd_dma 121163
pgscan_direct_high 0
pgscan_direct_normal 98736
pgscan_direct_dma 12320
pginodesteal 0
slabs_scanned 50307
kswapd_steal 206788
kswapd_inodesteal 26
pageoutrun 1331
allocstall 1707
pgrotated 196485
[-- Attachment #4: meminfo.2.6.8.1 --]
[-- Type: text/plain, Size: 572 bytes --]
MemTotal: 255380 kB
MemFree: 5896 kB
Buffers: 3828 kB
Cached: 121096 kB
SwapCached: 0 kB
Active: 177576 kB
Inactive: 38528 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 255380 kB
LowFree: 5896 kB
SwapTotal: 2048244 kB
SwapFree: 2048188 kB
Dirty: 124 kB
Writeback: 0 kB
Mapped: 154204 kB
Slab: 18624 kB
Committed_AS: 329732 kB
PageTables: 2232 kB
VmallocTotal: 778200 kB
VmallocUsed: 6900 kB
VmallocChunk: 771164 kB
[-- Attachment #5: slabinfo.2.6.10-rc3 --]
[-- Type: text/plain, Size: 11977 bytes --]
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ip_fib_alias 20 226 16 226 1 : tunables 120 60 0 : slabdata 1 1 0
ip_fib_hash 19 119 32 119 1 : tunables 120 60 0 : slabdata 1 1 0
ip_conntrack_expect 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
ip_conntrack 2 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
fat_inode_cache 2 11 348 11 1 : tunables 54 27 0 : slabdata 1 1 0
fat_cache 3 185 20 185 1 : tunables 120 60 0 : slabdata 1 1 0
ext3_inode_cache 1933 2896 476 8 1 : tunables 54 27 0 : slabdata 362 362 0
ext3_xattr 0 0 48 81 1 : tunables 120 60 0 : slabdata 0 0 0
journal_handle 4 185 20 185 1 : tunables 120 60 0 : slabdata 1 1 0
journal_head 205 648 48 81 1 : tunables 120 60 0 : slabdata 8 8 0
revoke_table 4 290 12 290 1 : tunables 120 60 0 : slabdata 1 1 0
revoke_record 0 0 16 226 1 : tunables 120 60 0 : slabdata 0 0 0
unix_sock 179 190 384 10 1 : tunables 54 27 0 : slabdata 19 19 0
ip_mrt_cache 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
tcp_tw_bucket 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
tcp_bind_bucket 6 226 16 226 1 : tunables 120 60 0 : slabdata 1 1 0
tcp_open_request 0 0 64 61 1 : tunables 120 60 0 : slabdata 0 0 0
inet_peer_cache 1 61 64 61 1 : tunables 120 60 0 : slabdata 1 1 0
secpath_cache 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
xfrm_dst_cache 0 0 256 15 1 : tunables 120 60 0 : slabdata 0 0 0
ip_dst_cache 13 15 256 15 1 : tunables 120 60 0 : slabdata 1 1 0
arp_cache 4 31 128 31 1 : tunables 120 60 0 : slabdata 1 1 0
raw_sock 2 7 512 7 1 : tunables 54 27 0 : slabdata 1 1 0
udp_sock 11 14 512 7 1 : tunables 54 27 0 : slabdata 2 2 0
tcp_sock 10 12 1024 4 1 : tunables 54 27 0 : slabdata 3 3 0
flow_cache 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
uhci_urb_priv 2 88 44 88 1 : tunables 120 60 0 : slabdata 1 1 0
cfq_ioc_pool 0 0 24 156 1 : tunables 120 60 0 : slabdata 0 0 0
cfq_pool 0 0 104 38 1 : tunables 120 60 0 : slabdata 0 0 0
crq_pool 0 0 56 70 1 : tunables 120 60 0 : slabdata 0 0 0
deadline_drq 0 0 52 75 1 : tunables 120 60 0 : slabdata 0 0 0
as_arq 132 183 64 61 1 : tunables 120 60 0 : slabdata 3 3 0
ext2_inode_cache 1 9 420 9 1 : tunables 54 27 0 : slabdata 1 1 0
ext2_xattr 0 0 48 81 1 : tunables 120 60 0 : slabdata 0 0 0
dnotify_cache 100 185 20 185 1 : tunables 120 60 0 : slabdata 1 1 0
dquot 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
eventpoll_pwq 0 0 36 107 1 : tunables 120 60 0 : slabdata 0 0 0
eventpoll_epi 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
kioctx 0 0 256 15 1 : tunables 120 60 0 : slabdata 0 0 0
kiocb 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
fasync_cache 2 226 16 226 1 : tunables 120 60 0 : slabdata 1 1 0
shmem_inode_cache 6 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
posix_timers_cache 0 0 96 41 1 : tunables 120 60 0 : slabdata 0 0 0
uid_cache 5 61 64 61 1 : tunables 120 60 0 : slabdata 1 1 0
sgpool-128 32 32 2048 2 1 : tunables 24 12 0 : slabdata 16 16 0
sgpool-64 32 32 1024 4 1 : tunables 54 27 0 : slabdata 8 8 0
sgpool-32 32 32 512 8 1 : tunables 54 27 0 : slabdata 4 4 0
sgpool-16 32 45 256 15 1 : tunables 120 60 0 : slabdata 3 3 0
sgpool-8 32 62 128 31 1 : tunables 120 60 0 : slabdata 2 2 0
blkdev_ioc 63 156 24 156 1 : tunables 120 60 0 : slabdata 1 1 0
blkdev_queue 26 33 352 11 1 : tunables 54 27 0 : slabdata 3 3 0
blkdev_requests 135 135 148 27 1 : tunables 120 60 0 : slabdata 5 5 0
biovec-(256) 256 256 3072 2 2 : tunables 24 12 0 : slabdata 128 128 0
biovec-128 256 260 1536 5 2 : tunables 24 12 0 : slabdata 52 52 0
biovec-64 256 260 768 5 1 : tunables 54 27 0 : slabdata 52 52 0
biovec-16 259 270 256 15 1 : tunables 120 60 0 : slabdata 18 18 0
biovec-4 256 305 64 61 1 : tunables 120 60 0 : slabdata 5 5 0
biovec-1 354 452 16 226 1 : tunables 120 60 0 : slabdata 2 2 0
bio 332 403 128 31 1 : tunables 120 60 0 : slabdata 13 13 0
file_lock_cache 16 45 88 45 1 : tunables 120 60 0 : slabdata 1 1 0
sock_inode_cache 270 270 384 10 1 : tunables 54 27 0 : slabdata 27 27 0
skbuff_head_cache 480 480 256 15 1 : tunables 120 60 0 : slabdata 32 32 0
sock 6 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
proc_inode_cache 36 195 308 13 1 : tunables 54 27 0 : slabdata 15 15 0
sigqueue 8 27 148 27 1 : tunables 120 60 0 : slabdata 1 1 0
radix_tree_node 2502 3724 276 14 1 : tunables 54 27 0 : slabdata 266 266 0
bdev_cache 9 14 512 7 1 : tunables 54 27 0 : slabdata 2 2 0
mnt_cache 21 31 128 31 1 : tunables 120 60 0 : slabdata 1 1 0
inode_cache 1222 1235 292 13 1 : tunables 54 27 0 : slabdata 95 95 0
dentry_cache 2964 6670 136 29 1 : tunables 120 60 0 : slabdata 230 230 0
filp 1545 1545 256 15 1 : tunables 120 60 0 : slabdata 103 103 0
names_cache 9 9 4096 1 1 : tunables 24 12 0 : slabdata 9 9 0
idr_layer_cache 82 87 136 29 1 : tunables 120 60 0 : slabdata 3 3 0
buffer_head 1772 4500 52 75 1 : tunables 120 60 0 : slabdata 60 60 0
mm_struct 114 114 640 6 1 : tunables 54 27 0 : slabdata 19 19 0
vm_area_struct 4823 5123 84 47 1 : tunables 120 60 0 : slabdata 109 109 0
fs_cache 110 119 32 119 1 : tunables 120 60 0 : slabdata 1 1 0
files_cache 109 112 512 7 1 : tunables 54 27 0 : slabdata 16 16 0
signal_cache 126 135 256 15 1 : tunables 120 60 0 : slabdata 9 9 0
sighand_cache 125 125 1408 5 2 : tunables 24 12 0 : slabdata 25 25 0
task_struct 186 189 1248 3 1 : tunables 24 12 0 : slabdata 63 63 0
anon_vma 1336 1628 8 407 1 : tunables 120 60 0 : slabdata 4 4 0
pgd 110 117 4096 1 1 : tunables 24 12 0 : slabdata 110 117 0
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-65536 1 1 65536 1 16 : tunables 8 4 0 : slabdata 1 1 0
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-32768 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-16384 1 1 16384 1 4 : tunables 8 4 0 : slabdata 1 1 0
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-8192 191 191 8192 1 2 : tunables 8 4 0 : slabdata 191 191 0
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 0 : slabdata 0 0 0
size-4096 62 62 4096 1 1 : tunables 24 12 0 : slabdata 62 62 0
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 0 : slabdata 0 0 0
size-2048 401 402 2048 2 1 : tunables 24 12 0 : slabdata 201 201 0
size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 0 : slabdata 0 0 0
size-1024 184 184 1024 4 1 : tunables 54 27 0 : slabdata 46 46 0
size-512(DMA) 0 0 512 8 1 : tunables 54 27 0 : slabdata 0 0 0
size-512 277 296 512 8 1 : tunables 54 27 0 : slabdata 37 37 0
size-256(DMA) 0 0 256 15 1 : tunables 120 60 0 : slabdata 0 0 0
size-256 150 150 256 15 1 : tunables 120 60 0 : slabdata 10 10 0
size-128(DMA) 0 0 128 31 1 : tunables 120 60 0 : slabdata 0 0 0
size-128 1736 1736 128 31 1 : tunables 120 60 0 : slabdata 56 56 0
size-64(DMA) 0 0 64 61 1 : tunables 120 60 0 : slabdata 0 0 0
size-64 5367 5368 64 61 1 : tunables 120 60 0 : slabdata 88 88 0
size-32(DMA) 0 0 32 119 1 : tunables 120 60 0 : slabdata 0 0 0
size-32 2916 2975 32 119 1 : tunables 120 60 0 : slabdata 25 25 0
kmem_cache 124 124 128 31 1 : tunables 120 60 0 : slabdata 4 4 0
[-- Attachment #6: meminfo.2.6.9 --]
[-- Type: text/plain, Size: 572 bytes --]
MemTotal: 255308 kB
MemFree: 4920 kB
Buffers: 6416 kB
Cached: 123404 kB
SwapCached: 56 kB
Active: 181336 kB
Inactive: 40544 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 255308 kB
LowFree: 4920 kB
SwapTotal: 2048244 kB
SwapFree: 2048188 kB
Dirty: 100 kB
Writeback: 0 kB
Mapped: 154648 kB
Slab: 13068 kB
Committed_AS: 330532 kB
PageTables: 2716 kB
VmallocTotal: 778200 kB
VmallocUsed: 7108 kB
VmallocChunk: 770004 kB
[-- Attachment #7: vmstat.2.6.9.post --]
[-- Type: text/plain, Size: 679 bytes --]
nr_dirty 1
nr_writeback 0
nr_unstable 0
nr_page_table_pages 679
nr_mapped 3323
nr_slab 2948
pgpgin 1225220
pgpgout 359216
pswpin 53954
pswpout 81905
pgalloc_high 0
pgalloc_normal 1153804
pgalloc_dma 85580
pgfree 1285435
pgactivate 437029
pgdeactivate 517802
pgfault 692436
pgmajfault 11168
pgrefill_high 0
pgrefill_normal 43823473
pgrefill_dma 3836086
pgsteal_high 0
pgsteal_normal 220020
pgsteal_dma 41131
pgscan_kswapd_high 0
pgscan_kswapd_normal 1205256
pgscan_kswapd_dma 591734
pgscan_direct_high 0
pgscan_direct_normal 65736
pgscan_direct_dma 7624
pginodesteal 0
slabs_scanned 58624
kswapd_steal 249560
kswapd_inodesteal 18718
pageoutrun 8900
allocstall 297
pgrotated 81962
[-- Attachment #8: vmstat.2.6.9.pre --]
[-- Type: text/plain, Size: 645 bytes --]
nr_dirty 14
nr_writeback 0
nr_unstable 0
nr_page_table_pages 679
nr_mapped 38687
nr_slab 3262
pgpgin 853252
pgpgout 30300
pswpin 0
pswpout 0
pgalloc_high 0
pgalloc_normal 721655
pgalloc_dma 46151
pgfree 768897
pgactivate 46217
pgdeactivate 12894
pgfault 566874
pgmajfault 2524
pgrefill_high 0
pgrefill_normal 51165
pgrefill_dma 18693
pgsteal_high 0
pgsteal_normal 83771
pgsteal_dma 20152
pgscan_kswapd_high 0
pgscan_kswapd_normal 91641
pgscan_kswapd_dma 22007
pgscan_direct_high 0
pgscan_direct_normal 0
pgscan_direct_dma 0
pginodesteal 0
slabs_scanned 53760
kswapd_steal 103923
kswapd_inodesteal 16533
pageoutrun 3709
allocstall 0
pgrotated 45
[-- Attachment #9: vmstat.2.6.8.1.pre --]
[-- Type: text/plain, Size: 646 bytes --]
nr_dirty 49
nr_writeback 0
nr_unstable 0
nr_page_table_pages 558
nr_mapped 38552
nr_slab 4650
pgpgin 277610
pgpgout 30723
pswpin 0
pswpout 14
pgalloc_high 0
pgalloc_normal 505535
pgalloc_dma 31361
pgfree 538419
pgactivate 45364
pgdeactivate 23508
pgfault 558801
pgmajfault 2575
pgrefill_high 0
pgrefill_normal 63405
pgrefill_dma 17101
pgsteal_high 0
pgsteal_normal 31213
pgsteal_dma 3057
pgscan_kswapd_high 0
pgscan_kswapd_normal 29337
pgscan_kswapd_dma 3102
pgscan_direct_high 0
pgscan_direct_normal 8316
pgscan_direct_dma 990
pginodesteal 0
slabs_scanned 22702
kswapd_steal 26509
kswapd_inodesteal 25
pageoutrun 165
allocstall 183
pgrotated 43
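As a reading aid for the four vmstat attachments above, here is a small sketch that subtracts each "pre" snapshot from its "post" snapshot and relates pages scanned to pages actually reclaimed. The figures are copied from the attachments, summed over the DMA and Normal zones (the HighMem counters are all 0); the type and function names are hypothetical. Note that the two runs are not strictly comparable: the 2.6.8.1 "post" snapshot was taken during the run, the 2.6.9 one after the task had been killed.

```c
#include <assert.h>

/* Reclaim counters summed over the DMA and Normal zones. */
struct reclaim_sample {
    unsigned long long scanned;   /* pgscan_kswapd_* + pgscan_direct_* */
    unsigned long long stolen;    /* pgsteal_* */
    unsigned long long refilled;  /* pgrefill_* */
};

/* Figures copied from the vmstat attachments in this message. */
static const struct reclaim_sample v2681_pre  = { 29337 + 3102 + 8316 + 990,
                                                  31213 + 3057, 63405 + 17101 };
static const struct reclaim_sample v2681_post = { 563277 + 121163 + 98736 + 12320,
                                                  238280 + 30691, 803261 + 111308 };
static const struct reclaim_sample v269_pre   = { 91641 + 22007 + 0 + 0,
                                                  83771 + 20152, 51165 + 18693 };
static const struct reclaim_sample v269_post  = { 1205256 + 591734 + 65736 + 7624,
                                                  220020 + 41131, 43823473ULL + 3836086ULL };

/* Counter growth between two snapshots of the same boot. */
struct reclaim_sample sample_delta(struct reclaim_sample pre, struct reclaim_sample post)
{
    struct reclaim_sample d;

    /* the counters only ever grow within one boot */
    assert(post.scanned >= pre.scanned && post.stolen >= pre.stolen &&
           post.refilled >= pre.refilled);
    d.scanned = post.scanned - pre.scanned;
    d.stolen = post.stolen - pre.stolen;
    d.refilled = post.refilled - pre.refilled;
    return d;
}

/* Pages scanned for every page reclaimed; higher means more wasted work. */
double scanned_per_stolen(struct reclaim_sample d)
{
    return d.stolen ? (double)d.scanned / (double)d.stolen : 0.0;
}
```

With these inputs the pgrefill growth is 834063 pages on 2.6.8.1 against 47589701 on 2.6.9, which is the most striking difference between the two sets of counters.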
[-- Attachment #10: eatmemory.c --]
[-- Type: text/x-csrc, Size: 3007 bytes --]
/* eatmemory.c created sometime early 2003 by billy@gonoph.net */
/* released in the public domain for anyone silly enough to run it */
/* use at your own risk! */
#include <stdio.h>
#include <stdlib.h>
#ifdef POSIX
#include <sys/select.h>
#else
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#endif
#include <string.h>
#include <strings.h> /* bzero() */
const int CHUNK=1024*1024;
char **CreateLargeArray(unsigned long megs);
char **CreateLargeChunk(unsigned long chunks,char **largearray);
int main(int argc,char **argv)
{
  char **bigarray=NULL;
  unsigned long megabytes;
  unsigned long old_megabytes=0;
  unsigned long looper1;
  fd_set empty;
  struct timeval waittv;
  printf("Enter Megabytes to chew: ");
  /* bail out on non-numeric input instead of using an uninitialised size */
  if (fscanf(stdin,"%lu",&megabytes)!=1) {
    fprintf(stderr,"[error] Expected a number of megabytes.\n");
    exit(-1);
  }
  printf("\n");
  /* the memory eating portion was in a loop.
   * I've since removed the loop, but kept this
   * part in case anyone wants to put it back in */
  if (bigarray) {
    for(looper1=0;looper1<old_megabytes;looper1++) { free(bigarray[looper1]); bigarray[looper1]=0; }
    free(bigarray);
    bigarray=0;
  }
  old_megabytes=megabytes;
  /* end loop handling code */
  bigarray=CreateLargeArray(megabytes); /* create my array of chunks */
  if (!bigarray) { exit(-1); }
  /* bzero seems faster than memset - I like it more than memset
   * still calloc appears faster on objects larger than 100kb
   * probably due to mmaping from the OS, so this may be OS dependent */
  bzero(bigarray,megabytes*sizeof(char*)); /* clear every pointer slot, not just the first few bytes */
  if (!CreateLargeChunk(megabytes,bigarray)) { exit(-1); } /* fill in the array with actual chunks */
  /* loop the memory to keep it out of swap
   * wait 100 microseconds per chunk (tv_usec = 100) to keep
   * machine from freaking out with max processor usage -
   * especially if it starts to swap */
  for (;;)
  {
    for (looper1=0;looper1<megabytes;looper1++)
    {
      memset(bigarray[looper1],48+(512 % 10),CHUNK); /* ascii '2'; just picked something random to throw in there */
      FD_ZERO(&empty);
      waittv.tv_sec = 0;
      waittv.tv_usec = 100;
      select(0,&empty,&empty,&empty,&waittv);
    }
  }
  return(0);
}
char **CreateLargeArray(unsigned long megs)
{
  char **largearray;
  /* one pointer slot per 1 MB chunk */
  largearray=malloc(megs*sizeof(char*));
  if (!largearray)
  {
    fprintf(stderr,"[warn] Unable to malloc() an array of %lu chunk pointers.\n",megs);
    perror("[error]");
    return(0);
  }
  return(largearray);
}
char **CreateLargeChunk(unsigned long chunks,char **largearray)
{
  unsigned long looper1;
  /* Loop the largearray and create the CHUNKS
   * I did it this way as in theory, this app
   * should be able to consume >4GB of memory. */
  for(looper1=0;looper1<chunks;looper1++)
  {
    largearray[looper1]=malloc((CHUNK)+1);
    if (!largearray[looper1])
    {
      fprintf(stderr,"[warn] Unable to malloc() %lu chunks.\n",chunks);
      perror("[error]");
      return(0);
    }
    /* set the memory to ascii(48) '0' */
    memset(largearray[looper1],'0',CHUNK);
  }
  return(largearray);
}
// vim: sw=2 cindent :
* Re: 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
2004-12-12 14:28 Mikhail Ramendik
@ 2004-12-14 0:51 ` Nick Piggin
2004-12-14 2:28 ` Mikhail Ramendik
0 siblings, 1 reply; 46+ messages in thread
From: Nick Piggin @ 2004-12-14 0:51 UTC (permalink / raw)
To: Mikhail Ramendik; +Cc: linux-kernel, Andrew Morton
Mikhail Ramendik wrote:
> Hello,
>
> With kernel 2.6.10-rc3 and 256 M RAM, when I start a task that eats a lot of
> RAM (for example, viewing a big TIFF file; also tested with a synthetic
> "eater"), in the resulting swapping process kswapd takes quite a bit of CPU
> time. The computer becomes extremely unresponsive, the clock (in icewm) stops
> for periods of time up to a minute. And the task startup itself is somewhat
> slow.
>
> I have checked both 2.6.8.1 and 2.6.9 for comparison, and they fare a lot
> better. The CPU hogging is not there, the computer is much more responsive,
> and the task starts faster.
>
Hi Mikhail,
I'm not quite sure what the problem would be. Please check that you are using
the same config for each kernel, and both kernels have detected the same amount
of memory.
Then, can you start by posting /proc/vmstat before and after running the
synthetic "eater" for some amount of time, with both 2.6.9 and 2.6.10-rc3; so:
boot 2.6.9
cat /proc/vmstat > 2.6.9-pre ; ./eater ; cat /proc/vmstat > 2.6.9-post
and the same for 2.6.10-rc3.
Also, /proc/meminfo and /proc/slabinfo output for each kernel before running
eater may give some clues.
Oh, and can you post the source code for the "eater" as well, please?
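Comparing two saved /proc/vmstat snapshots by eye is tedious, so the subtraction can be scripted. The following is a minimal sketch, assuming the snapshot files contain the usual "name value" lines; the function names `vmstat_lookup`, `vmstat_delta` and `read_snapshot` are hypothetical helpers, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up one counter in a /proc/vmstat-style snapshot held in memory.
 * Returns 1 and stores the value on success, 0 if the name is absent. */
int vmstat_lookup(const char *snapshot, const char *name, unsigned long long *value)
{
    size_t name_len = strlen(name);
    const char *line = snapshot;

    while (line && *line) {
        /* require the full name followed by a space, so that
         * "pgsteal" does not match "pgsteal_normal" */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ' ')
            return sscanf(line + name_len, "%llu", value) == 1;
        line = strchr(line, '\n');
        if (line)
            line++;  /* step past the newline to the next line */
    }
    return 0;
}

/* Growth of one counter between two snapshots; 0 if it is missing in either. */
unsigned long long vmstat_delta(const char *pre, const char *post, const char *name)
{
    unsigned long long before, after;

    if (!vmstat_lookup(pre, name, &before) || !vmstat_lookup(post, name, &after))
        return 0;
    return after >= before ? after - before : 0;
}

/* Read a whole snapshot file into buf (NUL-terminated); returns 0 on success. */
int read_snapshot(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "r");
    size_t n;

    assert(size > 0);
    if (!f)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

Typical use would be to load the 2.6.9-pre and 2.6.9-post files with `read_snapshot` into two buffers of a few kilobytes each and then call `vmstat_delta` for counters such as pswpout or allocstall.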
Thanks,
Nick
* 2.6.10-rc3: kswapd eats CPU on start of memory-eating task
@ 2004-12-12 14:28 Mikhail Ramendik
2004-12-14 0:51 ` Nick Piggin
0 siblings, 1 reply; 46+ messages in thread
From: Mikhail Ramendik @ 2004-12-12 14:28 UTC (permalink / raw)
To: linux-kernel
Hello,
With kernel 2.6.10-rc3 and 256 M RAM, when I start a task that eats a lot of
RAM (for example, viewing a big TIFF file; also tested with a synthetic
"eater"), in the resulting swapping process kswapd takes quite a bit of CPU
time. The computer becomes extremely unresponsive, the clock (in icewm) stops
for periods of time up to a minute. And the task startup itself is somewhat
slow.
I have checked both 2.6.8.1 and 2.6.9 for comparison, and they fare a lot
better. The CPU hogging is not there, the computer is much more responsive,
and the task starts faster.
--
Yours, Mikhail Ramendik
end of thread, other threads:[~2004-12-23 13:29 UTC | newest]
Thread overview: 46+ messages
2004-12-17 10:45 2.6.10-rc3: kswapd eats CPU on start of memory-eating task Voluspa
2004-12-18 23:02 ` Con Kolivas
-- strict thread matches above, loose matches on Subject: below --
2004-12-23 0:26 Zou, Nanhai
2004-12-23 13:26 ` Rik van Riel
2004-12-23 13:28 ` Rik van Riel
2004-12-22 8:45 Zou, Nanhai
2004-12-22 14:23 ` Rik van Riel
2004-12-20 12:59 Voluspa
2004-12-21 1:46 ` Mikhail Ramendik
2004-12-20 9:22 Zou, Nanhai
2004-12-20 15:08 ` Rik van Riel
2004-12-20 6:51 Voluspa
2004-12-20 7:12 ` Andrew Morton
2004-12-20 7:44 ` Nick Piggin
2004-12-20 8:03 ` Con Kolivas
2004-12-20 8:58 ` Con Kolivas
2004-12-20 12:55 ` Andrea Arcangeli
2004-12-20 12:06 ` Ed Tomlinson
2004-12-20 12:29 ` Con Kolivas
2004-12-20 17:49 ` Hideo AOKI
2004-12-20 23:51 ` Nick Piggin
2004-12-20 9:07 ` mr
2004-12-20 15:06 ` Rik van Riel
2004-12-19 23:12 Voluspa
2004-12-19 22:40 Voluspa
2004-12-19 22:56 ` Nick Piggin
2004-12-19 23:08 ` Mikhail Ramendik
2004-12-19 23:57 ` Andrew Morton
2004-12-20 0:03 ` Mikhail Ramendik
2004-12-20 3:02 ` Con Kolivas
2004-12-20 3:21 ` Rik van Riel
2004-12-20 4:13 ` Con Kolivas
2004-12-20 4:18 ` Rik van Riel
2004-12-20 4:21 ` Con Kolivas
2004-12-20 4:33 ` Nick Piggin
2004-12-20 7:07 ` Andrew Morton
2004-12-19 14:08 Voluspa
2004-12-15 22:49 Voluspa
2004-12-16 8:03 ` Nick Piggin
2004-12-16 8:14 ` Nick Piggin
2004-12-15 14:02 Voluspa
2004-12-17 0:41 ` Andrew Morton
2004-12-14 7:24 Voluspa
2004-12-12 14:28 Mikhail Ramendik
2004-12-14 0:51 ` Nick Piggin
2004-12-14 2:28 ` Mikhail Ramendik