* [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
@ 2009-04-29 23:25 Mathieu Desnoyers
  2009-04-29 23:56 ` Mathieu Desnoyers
                   ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-29 23:25 UTC (permalink / raw)
  To: Linus Torvalds, akpm, Nick Piggin
  Cc: Ingo Molnar, KOSAKI Motohiro, Peter Zijlstra, thomas.pi,
	Yuriy Lalym, linux-kernel, ltt-dev

Basically, the following execution:

dd if=/dev/zero of=/tmp/testfile

will slowly fill _all_ available RAM without taking memory pressure
into account.

This is because the dirty page accounting is incorrect in
redirty_page_for_writepage().

This patch adds the missing dirty page accounting in redirty_page_for_writepage().
This should fix a _lot_ of issues involving machines becoming slow under heavy
write I/O. No surprise: eventually the system starts swapping.

This applies to Linux kernel 2.6.30-rc2.

The /proc/meminfo picture before applying this patch, taken after filling my
memory with the dd run, was:

MemTotal:       16433732 kB
MemFree:        10919700 kB
Buffers:           12492 kB
Cached:          5262508 kB
SwapCached:            0 kB
Active:            37096 kB
Inactive:        5254384 kB
Active(anon):      16716 kB
Inactive(anon):        0 kB
Active(file):      20380 kB
Inactive(file):  5254384 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      19535024 kB
SwapFree:       19535024 kB
Dirty:           2125956 kB
Writeback:         50476 kB
AnonPages:         16660 kB
Mapped:             9560 kB
Slab:             189692 kB
SReclaimable:     166688 kB
SUnreclaim:        23004 kB
PageTables:         3396 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    27751888 kB
Committed_AS:      53904 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10764 kB
VmallocChunk:   34359726963 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        3456 kB
DirectMap2M:    16773120 kB

After applying my patch, the same test case steadily leaves between 8
and 500 MB of RAM free once the steady state (memory pressure) is reached.

MemTotal:       16433732 kB
MemFree:           85144 kB
Buffers:           23148 kB
Cached:         15766280 kB
SwapCached:            0 kB
Active:            51500 kB
Inactive:       15755140 kB
Active(anon):      15540 kB
Inactive(anon):     1824 kB
Active(file):      35960 kB
Inactive(file): 15753316 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      19535024 kB
SwapFree:       19535024 kB
Dirty:           2501644 kB
Writeback:         33280 kB
AnonPages:         17280 kB
Mapped:             9272 kB
Slab:             505524 kB
SReclaimable:     485596 kB
SUnreclaim:        19928 kB
PageTables:         3396 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    27751888 kB
Committed_AS:      54508 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10764 kB
VmallocChunk:   34359726715 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        3456 kB
DirectMap2M:    16773120 kB

The pressure pattern I see with the patch applied (16GB RAM total):

- Inactive(file) fills up to 15.7GB.
- Dirty fills up to 1.7GB.
- Writeback varies between 0 and 600MB.

sync() behavior:

- Dirty drops to ~6MB.
- Writeback increases to 1.6GB, then shrinks down to ~0MB.

References:
This insanely huge
http://bugzilla.kernel.org/show_bug.cgi?id=12309
[Bug 12309] Large I/O operations result in slow performance and high iowait times
(yes, I've been in CC all along)

Special thanks to Linus Torvalds and Nick Piggin and Thomas Pi for their
suggestions on previous patch iterations.

Special thanks to the LTTng community, which helped me get LTTng to its
current usability level. It has been tremendously useful in understanding
these problematic I/O workloads and generating fio test cases.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: akpm@linux-foundation.org
CC: Nick Piggin <nickpiggin@yahoo.com.au>
CC: Ingo Molnar <mingo@elte.hu>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: thomas.pi@arcor.dea
CC: Yuriy Lalym <ylalym@gmail.com>
---
 mm/page-writeback.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-2.6-lttng/mm/page-writeback.c
===================================================================
--- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
+++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
@@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
 		if (!mapping)
 			return 1;
 
+		/*
+		 * Take care of setting back page accounting correctly.
+		 */
+		inc_zone_page_state(page, NR_FILE_DIRTY);
+		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
+
 		spin_lock_irq(&mapping->tree_lock);
 		mapping2 = page_mapping(page);
 		if (mapping2) { /* Race with truncate? */

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-29 23:25 [PATCH] Fix dirty page accounting in redirty_page_for_writepage() Mathieu Desnoyers
@ 2009-04-29 23:56 ` Mathieu Desnoyers
  2009-04-29 23:59 ` Andrew Morton
  2009-04-30  0:06 ` Linus Torvalds
  2 siblings, 0 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-29 23:56 UTC (permalink / raw)
  To: Linus Torvalds, akpm, Nick Piggin
  Cc: Ingo Molnar, KOSAKI Motohiro, Peter Zijlstra, thomas.pi,
	Yuriy Lalym, linux-kernel, ltt-dev

* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> Basically, the following execution :
> 
> dd if=/dev/zero of=/tmp/testfile
> 
> will slowly fill _all_ ram available without taking into account memory
> pressure.
> 
> This is because the dirty page accounting is incorrect in
> redirty_page_for_writepage.
> 
> This patch adds missing dirty page accounting in redirty_page_for_writepage().
> This should fix a _lot_ of issues involving machines becoming slow under heavy
> write I/O. No surprise : eventually the system starts swapping.
> 
> Linux kernel 2.6.30-rc2
> 
> The /proc/meminfo picture I had before applying this patch after filling my
> memory with the dd execution was :
> 
> MemTotal:       16433732 kB
> MemFree:        10919700 kB

Darn, I did not take this meminfo snapshot at the appropriate moment.

I actually have to double-check whether 2.6.30-rc still shows the bogus
behavior I identified in the 2.6.28-2.6.29 days. Then I'll check with
earlier 2.6.29.x. I know there has been some improvement on the ext3
side since then. I'll come back when I have that information.

Sorry.

Mathieu

> [...]

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-29 23:25 [PATCH] Fix dirty page accounting in redirty_page_for_writepage() Mathieu Desnoyers
  2009-04-29 23:56 ` Mathieu Desnoyers
@ 2009-04-29 23:59 ` Andrew Morton
  2009-04-30  2:34   ` Mathieu Desnoyers
  2009-04-30  0:06 ` Linus Torvalds
  2 siblings, 1 reply; 62+ messages in thread
From: Andrew Morton @ 2009-04-29 23:59 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: torvalds, nickpiggin, mingo, kosaki.motohiro, a.p.zijlstra,
	thomas.pi, ylalym, linux-kernel, ltt-dev

On Wed, 29 Apr 2009 19:25:46 -0400
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Basically, the following execution :
> 
> dd if=/dev/zero of=/tmp/testfile
> 
> will slowly fill _all_ ram available without taking into account memory
> pressure.
> 
> This is because the dirty page accounting is incorrect in
> redirty_page_for_writepage.
> 
> This patch adds missing dirty page accounting in redirty_page_for_writepage().

The patch changes __set_page_dirty_nobuffers(), not
redirty_page_for_writepage().

__set_page_dirty_nobuffers() has a huge number of callers.

> --- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
> +++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
> @@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
>  		if (!mapping)
>  			return 1;
>  
> +		/*
> +		 * Take care of setting back page accounting correctly.
> +		 */
> +		inc_zone_page_state(page, NR_FILE_DIRTY);
> +		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> +
>  		spin_lock_irq(&mapping->tree_lock);
>  		mapping2 = page_mapping(page);
>  		if (mapping2) { /* Race with truncate? */
> 

But __set_page_dirty_nobuffers() calls account_page_dirtied(), which
already does the above two operations.  AFAICT we're now
double-accounting.

Now, it's possible that the accounting goes wrong very occasionally in
the "/* Race with truncate?  */" case.  If the truncate path clears the
page's dirty bit then it will decrement the dirty-page accounting, but
this code path will fail to perform the increment of the dirty-page
accounting.  IOW, once this function has set PG_Dirty, it is committed
to altering some or all of the page-dirty accounting.

But AFAICT your test case will not trigger the race-with-truncate anyway?

Can you determine at approximately what frequency (pages-per-second)
this accounting leak is occurring in your test?



* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-29 23:25 [PATCH] Fix dirty page accounting in redirty_page_for_writepage() Mathieu Desnoyers
  2009-04-29 23:56 ` Mathieu Desnoyers
  2009-04-29 23:59 ` Andrew Morton
@ 2009-04-30  0:06 ` Linus Torvalds
  2009-04-30  2:43   ` Mathieu Desnoyers
  2 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2009-04-30  0:06 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Andrew Morton, Nick Piggin, Ingo Molnar, KOSAKI Motohiro,
	Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev



On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
>
> This patch adds missing dirty page accounting in redirty_page_for_writepage().
> This should fix a _lot_ of issues involving machines becoming slow under heavy
> write I/O. No surprise : eventually the system starts swapping.

That patch (and description) is odd.

The patch actually adds the dirty page accounting not to 
redirty_page_for_writepage(), but to __set_page_dirty_nobuffers().

And __set_page_dirty_nobuffers() will later (just a few lines down) call 
down to account_page_dirtied(), which in turn does all that 
same accounting (assuming the "mapping" is marked to account for dirty).

So the description seems to be wrong, but so does the patch. 

Did you attach the wrong patch (explaining both problems)?

Or if the patch is what you really meant to do, then you need to fix your 
explanation, and also explain why the double-dirty accounting is a good 
idea.

Or is the real problem perhaps that your /tmp is ramdisk, and not marked 
to do dirty accounting?

		Linus


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-29 23:59 ` Andrew Morton
@ 2009-04-30  2:34   ` Mathieu Desnoyers
  0 siblings, 0 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30  2:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: torvalds, nickpiggin, mingo, kosaki.motohiro, a.p.zijlstra,
	thomas.pi, ylalym, linux-kernel, ltt-dev

* Andrew Morton (akpm@linux-foundation.org) wrote:
> On Wed, 29 Apr 2009 19:25:46 -0400
> Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > Basically, the following execution :
> > 
> > dd if=/dev/zero of=/tmp/testfile
> > 
> > will slowly fill _all_ ram available without taking into account memory
> > pressure.
> > 
> > This is because the dirty page accounting is incorrect in
> > redirty_page_for_writepage.
> > 
> > This patch adds missing dirty page accounting in redirty_page_for_writepage().
> 
> The patch changes __set_page_dirty_nobuffers(), not
> redirty_page_for_writepage().
> 
> __set_page_dirty_nobuffers() has a huge number of callers.
> 

Right.

> > --- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
> > +++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
> > @@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
> >  		if (!mapping)
> >  			return 1;
> >  
> > +		/*
> > +		 * Take care of setting back page accounting correctly.
> > +		 */
> > +		inc_zone_page_state(page, NR_FILE_DIRTY);
> > +		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> > +
> >  		spin_lock_irq(&mapping->tree_lock);
> >  		mapping2 = page_mapping(page);
> >  		if (mapping2) { /* Race with truncate? */
> > 
> 
> But __set_page_dirty_nobuffers() calls account_page_dirtied(), which
> already does the above two operations.  afacit we're now
> double-accounting.
> 

Yes, you are right.

> Now, it's possible that the accounting goes wrong very occasionally in
> the "/* Race with truncate?  */" case.  If the truncate path clears the
> page's dirty bit then it will decrement the dirty-page accounting, but
> this code path will fail to perform the increment of the dirty-page
> accounting.  IOW, once this function has set PG_Dirty, it is committed
> to altering some or all of the page-dirty accounting.
> 
> But afacit your test case will not trigger the race-with-truncate anyway?
> 
> Can you determine at approximately what frequency (pages-per-second)
> this accounting leak is occurring in your test?
> 

0 per minute actually. I've tried adding a printk when the

if (mapping2) {

} else {
  <--
}

case is hit, and it never triggered in my tests.

I am currently trying to figure out if I can reproduce the OOM problems
I had experienced with 2.6.29-rc3. I am investigating memory accounting
by turning the memory accounting code into a slow cache-line bouncing
version and by adding assertions that the per-zone global counters never
go below zero. Unbalanced accounting could have some nasty long-term
effects on memory pressure accounting.

But so far the memory accounting code looks solid. It's my bad then. I
cannot reproduce the behavior I noticed with 2.6.29-rc3, so I guess we
should consider this a non-issue (or code 9 if you prefer). ;)

Thanks for looking into this.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  0:06 ` Linus Torvalds
@ 2009-04-30  2:43   ` Mathieu Desnoyers
  2009-04-30  6:21     ` Ingo Molnar
  2009-04-30 13:22     ` Christoph Lameter
  0 siblings, 2 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30  2:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Nick Piggin, Ingo Molnar, KOSAKI Motohiro,
	Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Christoph Lameter

* Linus Torvalds (torvalds@linux-foundation.org) wrote:
> 
> 
> On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
> >
> > This patch adds missing dirty page accounting in redirty_page_for_writepage().
> > This should fix a _lot_ of issues involving machines becoming slow under heavy
> > write I/O. No surprise : eventually the system starts swapping.
> 
> That patch (and description) is odd.
> 
> The patch actually adds the dirty page accounting not to 
> redirty_page_for_writepage(), but to __set_page_dirty_nobuffers().
> 
> And __set_page_dirty_nobuffers() will later (just a few lines down) call 
> down to account_page_dirtied(), which in turn does all that 
> same accounting (assuming the "mapping" is marked to account for dirty.
> 
> So the description seems to be wrong, but so does the patch. 
> 
> Did you attach the wrong patch (explaining both problems)?
> 

Nope, I was just trying to get back into the OOM problem I experienced
when testing the "heavy I/O latency" bug. But it looks like I can't
reproduce it anyway.

> Or if the patch is what you really meant to do, then you need to fix your 
> explanation, and also explain why the double-dirty accounting is a good 
> idea.

Nope, I just "messed this one up" completely. :-)

> 
> Or is the real problem perhaps that your /tmp is ramdisk, and not marked 
> to do dirty accounting?

Nope, I don't think so. /tmp is on an ext3 partition, on RAID-1 SATA
disks.

And thanks for the review! This exercise only convinced me that the
kernel memory accounting works as expected. All this gave me the chance
to have a good look at the memory accounting code. We could probably
benefit from Christoph Lameter's cpu ops (using segment registers to
address per-cpu variables with atomic inc/dec) in there. Or at least,
removing interrupt disabling by using preempt disable and local_t
variables for the per-cpu counters could bring some benefit.

Mathieu

> 
> 		Linus

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  2:43   ` Mathieu Desnoyers
@ 2009-04-30  6:21     ` Ingo Molnar
  2009-04-30  6:33       ` [ltt-dev] " Mathieu Desnoyers
  2009-05-03  2:40       ` Tejun Heo
  2009-04-30 13:22     ` Christoph Lameter
  1 sibling, 2 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30  6:21 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Linus Torvalds, Andrew Morton, Nick Piggin, KOSAKI Motohiro,
	Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Christoph Lameter


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> And thanks for the review! This excercise only convinced me that 
> the kernel memory accounting works as expected. All this gave me 
> the chance to have a good look at the memory accounting code. We 
> could probably benefit of Christoph Lameter's cpu ops (using 
> segment registers to address per-cpu variables with atomic 
> inc/dec) in there. Or at least removing interrupt disabling by 
> using preempt disable and local_t variables for the per-cpu 
> counters could bring some benefit.

Note, optimized per cpu ops are already implemented upstream, by 
Tejun Heo's percpu patches in .30:

 #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
 #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
 #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
 #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
 #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
 #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
 #define percpu_xor(var, val)	percpu_to_op("xor", per_cpu__##var, val)

See:

  6dbde35: percpu: add optimized generic percpu accessors

From the changelog:

    [...]
    The advantage is that for example to read a local percpu variable,
    instead of this sequence:
    
     return __get_cpu_var(var);
    
     ffffffff8102ca2b:  48 8b 14 fd 80 09 74    mov    -0x7e8bf680(,%rdi,8),%rdx
     ffffffff8102ca32:  81
     ffffffff8102ca33:  48 c7 c0 d8 59 00 00    mov    $0x59d8,%rax
     ffffffff8102ca3a:  48 8b 04 10             mov    (%rax,%rdx,1),%rax
    
    We can get a single instruction by using the optimized variants:
    
     return percpu_read(var);
    
     ffffffff8102ca3f:  65 48 8b 05 91 8f fd    mov    %gs:0x7efd8f91(%rip),%rax
    [...]

So if you want to make use of it, percpu_add()/percpu_sub() would be 
the place to start.

	Ingo


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  6:21     ` Ingo Molnar
@ 2009-04-30  6:33       ` Mathieu Desnoyers
  2009-04-30  6:50         ` Ingo Molnar
  2009-05-03  2:40       ` Tejun Heo
  1 sibling, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30  6:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Andrew Morton, thomas.pi,
	Linus Torvalds, Christoph Lameter

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > And thanks for the review! This excercise only convinced me that 
> > the kernel memory accounting works as expected. All this gave me 
> > the chance to have a good look at the memory accounting code. We 
> > could probably benefit of Christoph Lameter's cpu ops (using 
> > segment registers to address per-cpu variables with atomic 
> > inc/dec) in there. Or at least removing interrupt disabling by 
> > using preempt disable and local_t variables for the per-cpu 
> > counters could bring some benefit.
> 
> Note, optimized per cpu ops are already implemented upstream, by 
> Tejun Heo's percpu patches in .30:
> 
>  #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
>  #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
>  #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
>  #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
>  #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
>  #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
>  #define percpu_xor(var, val)	percpu_to_op("xor", per_cpu__##var, val)
> 
> See:
> 
>   6dbde35: percpu: add optimized generic percpu accessors
> 
> From the changelog:
> 
>     [...]
> 
> So if you want to make use of it, percpu_add()/percpu_sub() would be 
> the place to start.
> 

Great !

I see, however, that it's only guaranteed to be atomic wrt preemption.
What would be even better would be to have atomic ops wrt local irqs
(as local.h does) available in this percpu flavor. That way we could
have interrupt- and nmi-safe per-cpu counters without even needing to
disable preemption.

In terms of counters, except maybe for tri-values for some
architectures, I don't see how we could manage synchronization in a
better way.

Mathieu

> 	Ingo
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  6:33       ` [ltt-dev] " Mathieu Desnoyers
@ 2009-04-30  6:50         ` Ingo Molnar
  2009-04-30 13:38           ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30  6:50 UTC (permalink / raw)
  To: Mathieu Desnoyers, Tejun Heo
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Andrew Morton, thomas.pi,
	Linus Torvalds, Christoph Lameter


* Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote:

> * Ingo Molnar (mingo@elte.hu) wrote:
> > [...]
> 
> Great !
> 
> I see however that it's only guaranteed to be atomic wrt preemption.

That's really only true for the non-x86 fallback defines. If we so 
decide, we could make the fallbacks in asm-generic/percpu.h irq-safe 
...

> What would be even better would be to have the atomic ops wrt local irqs
> (as local.h does) available in this percpu flavor. By doing this, we
> could have interrupt and nmi-safe per-cpu counters, without even the
> need to disable preemption.

nmi-safe isn't a big issue (we have no NMI code that interacts with 
MM counters) - and we could make them irq-safe by fixing the 
wrapper. (And on x86 they are NMI-safe too.)

	Ingo


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  2:43   ` Mathieu Desnoyers
  2009-04-30  6:21     ` Ingo Molnar
@ 2009-04-30 13:22     ` Christoph Lameter
  2009-04-30 13:38       ` Ingo Molnar
  1 sibling, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 13:22 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Linus Torvalds, Andrew Morton, Nick Piggin, Ingo Molnar,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev

On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:

> to have a good look at the memory accounting code. We could probably
> benefit of Christoph Lameter's cpu ops (using segment registers to
> address per-cpu variables with atomic inc/dec) in there. Or at least
> removing interrupt disabling by using preempt disable and local_t
> variables for the per-cpu counters could bring some benefit.

Guess we are ready for atomic per cpu ops now that the new per cpu
allocator is in? Segment register issues with the PDA are also solved
right?


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  6:50         ` Ingo Molnar
@ 2009-04-30 13:38           ` Christoph Lameter
  2009-04-30 14:10             ` Ingo Molnar
  2009-04-30 14:12             ` Mathieu Desnoyers
  0 siblings, 2 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 13:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Tejun Heo, Nick Piggin, Peter Zijlstra,
	Yuriy Lalym, Linux Kernel Mailing List, ltt-dev, Andrew Morton,
	thomas.pi, Linus Torvalds

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> > I see however that it's only guaranteed to be atomic wrt preemption.
>
> That's really only true for the non-x86 fallback defines. If we so
> decide, we could make the fallbacks in asm-generic/percpu.h irq-safe

The fallbacks have different semantics and therefore we cannot rely on
irq safeness in the core code when using the x86 cpu ops.

> nmi-safe isn't a big issue (we have no NMI code that interacts with
> MM counters) - and we could make them irq-safe by fixing the
> wrapper. (and on x86 they are NMI-safe too.)

There are also contexts in which you already are preempt-safe and where the
per cpu ops do not need to go through the preemption hoops.

This means it would be best to have 3 variants for 3 different contexts in
the core code:

1. Need irq safety
2. Need preempt safety
3. We know the operation is safe because preemption has already been
disabled or irqs are already disabled.

The 3 variants on x86 generate the same instructions. On other platforms
they would need to be able to fallback in various way depending on the
availability of instructions that are atomic vs. preempt or irqs.

http://thread.gmane.org/gmane.linux.kernel.cross-arch/1124
http://lwn.net/Articles/284526/


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:22     ` Christoph Lameter
@ 2009-04-30 13:38       ` Ingo Molnar
  2009-04-30 13:40         ` Christoph Lameter
  2009-04-30 13:50         ` Mathieu Desnoyers
  0 siblings, 2 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 13:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev


* Christoph Lameter <cl@linux.com> wrote:

> On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
> 
> > to have a good look at the memory accounting code. We could 
> > probably benefit of Christoph Lameter's cpu ops (using segment 
> > registers to address per-cpu variables with atomic inc/dec) in 
> > there. Or at least removing interrupt disabling by using preempt 
> > disable and local_t variables for the per-cpu counters could 
> > bring some benefit.
> 
> Guess we are ready for atomic per cpu ops now that the new per cpu 
> allocator is in? Segment register issues with the PDA are also 
> solved right?

it's all done, implemented and upstream already. You are a bit late 
to the party ;-)

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:38       ` Ingo Molnar
@ 2009-04-30 13:40         ` Christoph Lameter
  2009-04-30 14:14           ` Ingo Molnar
  2009-04-30 13:50         ` Mathieu Desnoyers
  1 sibling, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 13:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> it's all done, implemented and upstream already. You are a bit late
> to the party ;-)

Just looked over it. Yep, now we just need to fix the things that were
messed up.




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:38       ` Ingo Molnar
  2009-04-30 13:40         ` Christoph Lameter
@ 2009-04-30 13:50         ` Mathieu Desnoyers
  2009-04-30 13:55           ` Christoph Lameter
  2009-04-30 14:32           ` Ingo Molnar
  1 sibling, 2 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30 13:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Lameter, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Christoph Lameter <cl@linux.com> wrote:
> 
> > On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
> > 
> > > to have a good look at the memory accounting code. We could 
> > > probably benefit of Christoph Lameter's cpu ops (using segment 
> > > registers to address per-cpu variables with atomic inc/dec) in 
> > > there. Or at least removing interrupt disabling by using preempt 
> > > disable and local_t variables for the per-cpu counters could 
> > > bring some benefit.
> > 
> > Guess we are ready for atomic per cpu ops now that the new per cpu 
> > allocator is in? Segment register issues with the PDA are also 
> > solved right?
> 
> it's all done, implemented and upstream already. You are a bit late 
> to the party ;-)
> 
> 	Ingo

Or way too early, depending on the point of view. :-)

e.g.
http://lkml.org/lkml/2008/5/30/3

I think Christoph deserves credits for pioneering this area with fresh
ideas.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:50         ` Mathieu Desnoyers
@ 2009-04-30 13:55           ` Christoph Lameter
  2009-04-30 14:32           ` Ingo Molnar
  1 sibling, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 13:55 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev

On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:

> I think Christoph deserves credits for pioneering this area with fresh
> ideas.

Thanks. I like ideas, but not so much the follow-through on them (which
also got a bit problematic in this case because I was changing companies
while this was going on). The work that Tejun has done on this is quite
good as far as I can see.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:38           ` Christoph Lameter
@ 2009-04-30 14:10             ` Ingo Molnar
  2009-04-30 14:12             ` Mathieu Desnoyers
  1 sibling, 0 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 14:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Tejun Heo, Nick Piggin, Peter Zijlstra,
	Yuriy Lalym, Linux Kernel Mailing List, ltt-dev, Andrew Morton,
	thomas.pi, Linus Torvalds


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > > I see however that it's only guaranteed to be atomic wrt preemption.
> >
> > That's really only true for the non-x86 fallback defines. If we so
> > decide, we could make the fallbacks in asm-generic/percpu.h irq-safe
> 
> The fallbacks have different semantics and therefore we cannot 
> rely on irq safeness in the core code when using the x86 cpu ops.

Well it's irq and preempt safe on x86.

It's preempt-safe on other architectures - but the fallback is not 
irq-safe on other architectures. That is remedied easily via the 
patch below. (Note: totally untested)

	Ingo

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 1581ff2..6b3984a 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -139,17 +139,23 @@ static inline void free_percpu(void *p)
 #ifndef percpu_read
 # define percpu_read(var)						\
   ({									\
+	unsigned long flags;						\
 	typeof(per_cpu_var(var)) __tmp_var__;				\
-	__tmp_var__ = get_cpu_var(var);					\
-	put_cpu_var(var);						\
+									\
+	local_irq_save(flags);						\
+	__tmp_var__ = __get_cpu_var(var);				\
+	local_irq_restore(flags);					\
 	__tmp_var__;							\
   })
 #endif
 
 #define __percpu_generic_to_op(var, val, op)				\
 do {									\
-	get_cpu_var(var) op val;					\
-	put_cpu_var(var);						\
+	unsigned long flags;						\
+									\
+	local_irq_save(flags);						\
+	__get_cpu_var(var) op val;					\
+	local_irq_restore(flags);					\
 } while (0)
 
 #ifndef percpu_write

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:38           ` Christoph Lameter
  2009-04-30 14:10             ` Ingo Molnar
@ 2009-04-30 14:12             ` Mathieu Desnoyers
  2009-04-30 14:12               ` Christoph Lameter
  1 sibling, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30 14:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ingo Molnar, Nick Piggin, Peter Zijlstra,
	Linux Kernel Mailing List, Yuriy Lalym, Tejun Heo, ltt-dev,
	Andrew Morton, thomas.pi, Linus Torvalds

* Christoph Lameter (cl@linux.com) wrote:
> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > > I see however that it's only guaranteed to be atomic wrt preemption.
> >
> > That's really only true for the non-x86 fallback defines. If we so
> > decide, we could make the fallbacks in asm-generic/percpu.h irq-safe
> 
> The fallbacks have different semantics and therefore we cannot rely on
> irq safeness in the core code when using the x86 cpu ops.
> 
> > nmi-safe isn't a big issue (we have no NMI code that interacts with
> > MM counters) - and we could make them irq-safe by fixing the
> > wrapper. (and on x86 they are NMI-safe too.)
> 
> There are also contexts in which you already are preempt-safe and where the
> per cpu ops do not need to go through the preemption hoops.
> 
> This means it would be best to have 3 variants for 3 different contexts in
> the core code:
> 
> 1. Need irq safety
> 2. Need preempt safety
> 3. We know the operation is safe because preemption has already been
> disabled or irqs are already disabled.
> 
> The 3 variants on x86 generate the same instructions. On other platforms
> they would need to be able to fallback in various way depending on the
> availability of instructions that are atomic vs. preempt or irqs.
> 

The problem here, as we figured out a while ago when working on the
atomic slub patches, is that if we have the following code :

local_irq_save
var++
var++
local_irq_restore

that we would like to turn into irq-safe percpu variant with this
semantic :

percpu_add_irqsafe(var)
percpu_add_irqsafe(var)

We are generating two irq save/restore in the fallback, which will be
slow.

However, we could do the following trick :

percpu_irqsave(flags);
percpu_add_irq(var);
percpu_add_irq(var);
percpu_irqrestore(flags);

And we could require that percpu_*_irq operations are put within an
irq-safe section. The fallback would disable interrupts, but
arch-specific irq-safe atomic implementations would replace this by
nops.

And if interrupts are already disabled, percpu_add_irq could be used
directly. There is no need to duplicate the primitives (no
_percpu_add_irq() needed). Same could apply to preempt-safety :

percpu_preempt_disable();
percpu_add(var);
percpu_add(var);
percpu_preempt_enable();

The requirement on percpu_add would be that it is called within a
percpu_preempt_disable/percpu_preempt_enable section, or with preemption
already known to be disabled.

Same thing could apply to bh. But I don't see any difference between
percpu_add_bh and percpu_add_irq, except maybe on architectures which
would use tri-values :

percpu_bh_disable();
percpu_add_bh(var);
percpu_add_bh(var);
percpu_bh_enable();

Thoughts ?

Mathieu

> http://thread.gmane.org/gmane.linux.kernel.cross-arch/1124
> http://lwn.net/Articles/284526/
> 
> 
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:12             ` Mathieu Desnoyers
@ 2009-04-30 14:12               ` Christoph Lameter
  2009-04-30 19:41                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 14:12 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Nick Piggin, Peter Zijlstra,
	Linux Kernel Mailing List, Yuriy Lalym, Tejun Heo, ltt-dev,
	Andrew Morton, thomas.pi, Linus Torvalds

On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:

> > The 3 variants on x86 generate the same instructions. On other platforms
> > they would need to be able to fallback in various way depending on the
> > availability of instructions that are atomic vs. preempt or irqs.
> >
>
> The problem here, as we figured out a while ago when working on the
> atomic slub patches, is that if we have the following code :
>
> local_irq_save
> var++
> var++
> local_irq_restore
>
> that we would like to turn into irq-safe percpu variant with this
> semantic :
>
> percpu_add_irqsafe(var)
> percpu_add_irqsafe(var)
>
> We are generating two irq save/restore in the fallback, which will be
> slow.
>
> However, we could do the following trick :
>
> percpu_irqsave(flags);
> percpu_add_irq(var);
> percpu_add_irq(var);
> percpu_irqrestore(flags);

Hmmm. I do not remember any of those double ops in the patches that I did a
while back for this. It does not make sense either because atomic per cpu
ops are only atomic for a single instruction. You are trying to extend
that so that multiple "atomic" instructions are now atomic.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:40         ` Christoph Lameter
@ 2009-04-30 14:14           ` Ingo Molnar
  2009-04-30 14:15             ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 14:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > it's all done, implemented and upstream already. You are a bit late
> > to the party ;-)
> 
> Just looked over it. Yep, now we just need to fix the things that 
> were messed up.

Could we please skip over the lengthy flamewar and bad-mouthing of 
other people's work and go straight to the constructive portion of 
the discussion? ;-)

The patch below makes the fallback/slowpath irq safe.

	Ingo

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 1581ff2..6b3984a 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -139,17 +139,23 @@ static inline void free_percpu(void *p)
 #ifndef percpu_read
 # define percpu_read(var)						\
   ({									\
+	unsigned long flags;						\
 	typeof(per_cpu_var(var)) __tmp_var__;				\
-	__tmp_var__ = get_cpu_var(var);					\
-	put_cpu_var(var);						\
+									\
+	local_irq_save(flags);						\
+	__tmp_var__ = __get_cpu_var(var);				\
+	local_irq_restore(flags);					\
 	__tmp_var__;							\
   })
 #endif
 
 #define __percpu_generic_to_op(var, val, op)				\
 do {									\
-	get_cpu_var(var) op val;					\
-	put_cpu_var(var);						\
+	unsigned long flags;						\
+									\
+	local_irq_save(flags);						\
+	__get_cpu_var(var) op val;					\
+	local_irq_restore(flags);					\
 } while (0)
 
 #ifndef percpu_write

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:14           ` Ingo Molnar
@ 2009-04-30 14:15             ` Christoph Lameter
  2009-04-30 14:38               ` Ingo Molnar
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 14:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> The patch below makes the fallback/slowpath irq safe.

Yes, but sometimes you are already irq-safe, and such a fallback would
create significant irq enable/disable and stack operation overhead for
architectures that are using the fallback.

I think we really need another __xxx op here. Especially since these
operations are often in critical code paths.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 13:50         ` Mathieu Desnoyers
  2009-04-30 13:55           ` Christoph Lameter
@ 2009-04-30 14:32           ` Ingo Molnar
  2009-04-30 14:42             ` Christoph Lameter
  2009-04-30 16:03             ` [ltt-dev] " Mathieu Desnoyers
  1 sibling, 2 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 14:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Christoph Lameter, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> * Ingo Molnar (mingo@elte.hu) wrote:
> > 
> > * Christoph Lameter <cl@linux.com> wrote:
> > 
> > > On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
> > > 
> > > > to have a good look at the memory accounting code. We could 
> > > > probably benefit of Christoph Lameter's cpu ops (using segment 
> > > > registers to address per-cpu variables with atomic inc/dec) in 
> > > > there. Or at least removing interrupt disabling by using preempt 
> > > > disable and local_t variables for the per-cpu counters could 
> > > > bring some benefit.
> > > 
> > > Guess we are ready for atomic per cpu ops now that the new per cpu 
> > > allocator is in? Segment register issues with the PDA are also 
> > > solved right?
> > 
> > it's all done, implemented and upstream already. You are a bit late 
> > to the party ;-)
> > 
> > 	Ingo
> 
> Or way too early, depending on the point of view. :-)
> 
> e.g.
> http://lkml.org/lkml/2008/5/30/3
> 
> I think Christoph deserves credits for pioneering this area with fresh
> ideas.

Ok, i didn't want to go there - but let me correct this version of 
history.

Christoph's zero-based x86 percpu patches were incomplete and never 
worked reliably - Christoph unfortunately never addressed the 
bugs/crashes i reported. (and Mike Travis and I injected quite a 
bit of testing into it) There were two failed attempts to productize 
them and the patches just bitrotted for more than a year.

Tejun on the other hand fixed those problems (four of Christoph's 
patches survived more or less and were credited to Christoph) and 
did more than 50 highly delicate patches of far larger complexity to 
solve the _whole_ problem range - within a two months timeframe.

Ideas and half-done patches covering <10% of the work needed are not 
enough. Being able to implement it and productize it is the real 
deal, in my book.

Thanks goes to Christoph (and Rusty) for coming up with the idea, 
but it would be manifestly unfair to not send 90% of the kudos to 
Tejun for turning it all into reality and fixing all the other 
problems and redesigning almost all the x86 percpu code in the 
process! ;-)

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:15             ` Christoph Lameter
@ 2009-04-30 14:38               ` Ingo Molnar
  2009-04-30 14:45                 ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 14:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > The patch below makes the fallback/slowpath irq safe.
> 
> Yes, but sometimes you are already irq-safe, and such a fallback
> would create significant irq enable/disable and stack operation
> overhead for architectures that are using the fallback.

It's a fallback slowpath - non-x86 architectures should still fill 
in a real implementation of course.

> I think we really need another __xxx op here. Especially since 
> these operations are often in critical code paths.

That's a recipe for fragility: using __xxx will still be 
irq-safe on x86, and 95% of the testing is done on x86, so this 
opens up the path to non-x86 bugs.

So we first have to see the list of architectures that _cannot_ 
implement an irq-safe op here via a single machine instruction.
x86, ia64 and powerpc should be fine.

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:32           ` Ingo Molnar
@ 2009-04-30 14:42             ` Christoph Lameter
  2009-04-30 14:59               ` Ingo Molnar
  2009-04-30 16:03             ` [ltt-dev] " Mathieu Desnoyers
  1 sibling, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 14:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> Christoph's zero-based x86 percpu patches were incomplete and never
> worked reliably - Christoph unfortunately never addressed the
> bugs/crashes i reported. (and Mike Travis and me injected quite a
> bit of testing into it) There were two failed attempts to productize
> them and the patches just bitrotted for more than a year.

Sorry not my issue. Mike took over that work on the x86 arch.

I am fine with how things developed. The core patches fell off the
priority list after my job change. We just need to get the things that
are not right straightened out now.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:38               ` Ingo Molnar
@ 2009-04-30 14:45                 ` Christoph Lameter
  2009-04-30 15:01                   ` Ingo Molnar
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 14:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> > Yes, but sometimes you are already irq-safe, and such a fallback
> > would create significant irq enable/disable and stack operation
> > overhead for architectures that are using the fallback.
>
> It's a fallback slowpath - non-x86 architectures should still fill
> in a real implementation of course.

Arch code cannot provide an effective implementation, since it
always has to assume that interrupts need to be disabled, if we stay with
the current implementation.

> So we first have to see the list of architectures that _cannot_
> implement an irq-safe op here via a single machine instruction.
> x86, ia64 and powerpc should be fine.

Look at ia64, sparc, s390, powerpc. They can fall back to atomic ops, but
those are very inefficient on some of these platforms. Since these are
performance critical, they will need to be optimized depending on the
context of their use in the core.





^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:42             ` Christoph Lameter
@ 2009-04-30 14:59               ` Ingo Molnar
  0 siblings, 0 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 14:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> I am fine with how things developed. The core patches fell off the 
> priority list after job change. Just get the things that are not 
> right now straightened out.

Regarding the __xxx ops i'm on two minds really. If there's a 
significant architecture that really cannot live without them (if
it does not have the proper machine instructions _and_ local irq 
disabling is unreasonably expensive) then i guess we could (and 
should) add them.

Otherwise we should just try to be as simple as possible.

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:45                 ` Christoph Lameter
@ 2009-04-30 15:01                   ` Ingo Molnar
  2009-04-30 15:25                     ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 15:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > > Yes, but sometimes you are already irq-safe, and such a fallback
> > > would create significant irq enable/disable and stack operation
> > > overhead for architectures that are using the fallback.
> >
> > It's a fallback slowpath - non-x86 architectures should still fill
> > in a real implementation of course.
> 
> Arch code cannot provide an effective implementation, since it
> always has to assume that interrupts need to be disabled, if we stay with
> the current implementation.
> 
> > So we first have to see the list of architectures that _cannot_
> > implement an irq-safe op here via a single machine instruction.
> > x86, ia64 and powerpc should be fine.
> 
> Look at ia64, sparc, s390, powerpc. They can fall back to atomic
> ops, but those are very inefficient on some of these platforms.
> Since these are performance critical, they will need to be
> optimized depending on the context of their use in the core.

Could you cite a specific example / situation where you'd use __xxx 
ops?

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:01                   ` Ingo Molnar
@ 2009-04-30 15:25                     ` Christoph Lameter
  2009-04-30 15:42                       ` Ingo Molnar
                                         ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 15:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> Could you cite a specific example / situation where you'd use __xxx
> ops?

Well the cpu alloc v3 patchset (already cited) included numerous
locations. Sure wish we could move much of the core stats over to percpu
ops. Here is a list of some stats patches that do not need irq disable.

http://article.gmane.org/gmane.linux.kernel.cross-arch/1128
http://article.gmane.org/gmane.linux.kernel.cross-arch/1132
http://article.gmane.org/gmane.linux.kernel.cross-arch/1134
http://article.gmane.org/gmane.linux.kernel.cross-arch/1138
http://article.gmane.org/gmane.linux.kernel.cross-arch/1139
http://article.gmane.org/gmane.linux.kernel.cross-arch/1145 VM stats
http://article.gmane.org/gmane.linux.kernel.cross-arch/1160 NFS stats
http://article.gmane.org/gmane.linux.kernel.cross-arch/1161 Genhd stats
http://article.gmane.org/gmane.linux.kernel.cross-arch/1163 Socket inuse counter
http://article.gmane.org/gmane.linux.kernel.cross-arch/1164 SRCU


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:25                     ` Christoph Lameter
@ 2009-04-30 15:42                       ` Ingo Molnar
  2009-04-30 15:44                         ` Christoph Lameter
  2009-04-30 16:13                         ` Linus Torvalds
  2009-04-30 15:54                       ` Ingo Molnar
  2009-04-30 16:00                       ` Ingo Molnar
  2 siblings, 2 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 15:42 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > Could you cite a specific example / situation where you'd use __xxx
> > ops?
> 
> Well the cpu alloc v3 patchset (already cited) included numerous 
> locations. Sure wish we could move much of the core stats over to 
> percpu ops. Here is a list of some stats patches that do not need 
> irq disable.
> 
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1128
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1132
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1134
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1138
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1139
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1145 VM stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1160 NFS stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1161 Genhd stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1163 Socket inuse counter
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1164 SRCU

nice. Do these all get called with irqs off, all the time?

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:42                       ` Ingo Molnar
@ 2009-04-30 15:44                         ` Christoph Lameter
  2009-04-30 16:06                           ` Ingo Molnar
  2009-04-30 16:13                         ` Linus Torvalds
  1 sibling, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 15:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> nice. Do these all get called with irqs off, all the time?

In most cases, yes. The VM stats patch includes an example where we may not
care too much about an event counter missing a beat once in a while on
platforms not supporting atomic per cpu ops. I know this affects ia64. The
cost of an atomic operation for an event counter update (which would have
avoided the potential of a concurrent update) was not justifiable.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:25                     ` Christoph Lameter
  2009-04-30 15:42                       ` Ingo Molnar
@ 2009-04-30 15:54                       ` Ingo Molnar
  2009-04-30 16:00                       ` Ingo Molnar
  2 siblings, 0 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 15:54 UTC (permalink / raw)
  To: Christoph Lameter, Eric Dumazet
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> http://article.gmane.org/gmane.linux.kernel.cross-arch/1163 Socket inuse counter

btw., this one is already converted upstream, via:

  4e69489: socket: use percpu_add() while updating sockets_in_use

done by Eric Dumazet.

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:25                     ` Christoph Lameter
  2009-04-30 15:42                       ` Ingo Molnar
  2009-04-30 15:54                       ` Ingo Molnar
@ 2009-04-30 16:00                       ` Ingo Molnar
  2009-04-30 16:08                         ` Christoph Lameter
  2 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 16:00 UTC (permalink / raw)
  To: Christoph Lameter, Eric Dumazet
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> http://article.gmane.org/gmane.linux.kernel.cross-arch/1128
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1132
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1134
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1138
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1139
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1145 VM stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1160 NFS stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1161 Genhd stats
> http://article.gmane.org/gmane.linux.kernel.cross-arch/1164 SRCU

The new percpu APIs could be used in most of these places already, 
straight away. This is a really good TODO list for places to 
enhance.

Then a second set of patches could convert percpu_add() / etc. uses 
to __percpu_add() ... but that should be done by those architectures 
that need it (and to the extent they need it), because it's not 
really testable on x86.

I don't really like the PER_CPU / CPU_INC etc. type of all-capitals 
APIs you introduced in the patches above:


+		__CPU_INC(bt->sequence);
+	CPU_FREE(bt->sequence);

was there any strong reason to go outside the well-established 
percpu_* name space and call these primitives as if they were 
macros?

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:32           ` Ingo Molnar
  2009-04-30 14:42             ` Christoph Lameter
@ 2009-04-30 16:03             ` Mathieu Desnoyers
  1 sibling, 0 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30 16:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Christoph Lameter,
	thomas.pi, Linus Torvalds, Andrew Morton

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > * Ingo Molnar (mingo@elte.hu) wrote:
> > > 
> > > * Christoph Lameter <cl@linux.com> wrote:
> > > 
> > > > On Wed, 29 Apr 2009, Mathieu Desnoyers wrote:
> > > > 
> > > > > to have a good look at the memory accounting code. We could 
> > > > > probably benefit of Christoph Lameter's cpu ops (using segment 
> > > > > registers to address per-cpu variables with atomic inc/dec) in 
> > > > > there. Or at least removing interrupt disabling by using preempt 
> > > > > disable and local_t variables for the per-cpu counters could 
> > > > > bring some benefit.
> > > > 
> > > > Guess we are ready for atomic per cpu ops now that the new per cpu 
> > > > allocator is in? Segment register issues with the PDA are also 
> > > > solved right?
> > > 
> > > it's all done, implemented and upstream already. You are a bit late 
> > > to the party ;-)
> > > 
> > > 	Ingo
> > 
> > Or way too early, depending on the point of view. :-)
> > 
> > e.g.
> > http://lkml.org/lkml/2008/5/30/3
> > 
> > I think Christoph deserves credit for pioneering this area with fresh
> > ideas.
> 
> OK, I didn't want to go there - but let me correct this version of 
> history.
> 
> Christoph's zero-based x86 percpu patches were incomplete and never 
> worked reliably - Christoph unfortunately never addressed the 
> bugs/crashes I reported (and Mike Travis and I injected quite a 
> bit of testing into it). There were two failed attempts to productize 
> them and the patches just bitrotted for more than a year.
> 
> Tejun on the other hand fixed those problems (four of Christoph's 
> patches survived more or less and were credited to Christoph) and 
> did more than 50 highly delicate patches of far larger complexity to 
> solve the _whole_ problem range - within a two-month timeframe.
> 
> Ideas and half-done patches covering <10% of the work needed are not 
> enough. Being able to implement it and productize it is the real 
> deal, in my book.
> 
> Thanks goes to Christoph (and Rusty) for coming up with the idea, 
> but it would be manifestly unfair to not send 90% of the kudos to 
> Tejun for turning it all into reality and fixing all the other 
> problems and redesigning almost all the x86 percpu code in the 
> process! ;-)
> 

Then thanks to Christoph, Rusty and Tejun for their respective efforts.
My goal is simply to recognise the credit each of them deserves.

Mathieu

> 	Ingo
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:44                         ` Christoph Lameter
@ 2009-04-30 16:06                           ` Ingo Molnar
  2009-04-30 16:11                             ` Christoph Lameter
  2009-04-30 16:16                             ` Linus Torvalds
  0 siblings, 2 replies; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 16:06 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > nice. Do these all get called with irqs off, all the time?
> 
> In most cases, yes. The VMstat patch includes an example where we 
> may not care too much about an event counter missing a beat once in 
> a while on platforms not supporting atomic per-cpu ops. I know 
> this affects IA64. The cost of an atomic operation for an event 
> counter update (which would have avoided the potential for a 
> concurrent update) was not justifiable.

when you say "atomics", do you mean the classic meaning of atomics? 
Because there are no classic atomics involved. This is the 
before/after disassembly from Eric's commit 4e69489a0:

before:

c0436274:       b8 01 00 00 00          mov    $0x1,%eax
c0436279:       e8 42 40 df ff          call   c022a2c0 <add_preempt_count>
c043627e:       bb 20 4f 6a c0          mov    $0xc06a4f20,%ebx
c0436283:       e8 18 ca f0 ff          call   c0342ca0 <debug_smp_processor_id>
c0436288:       03 1c 85 60 4a 65 c0    add    -0x3f9ab5a0(,%eax,4),%ebx
c043628f:       ff 03                   incl   (%ebx)
c0436291:       b8 01 00 00 00          mov    $0x1,%eax
c0436296:       e8 75 3f df ff          call   c022a210 <sub_preempt_count>
c043629b:       89 e0                   mov    %esp,%eax
c043629d:       25 00 e0 ff ff          and    $0xffffe000,%eax
c04362a2:       f6 40 08 08             testb  $0x8,0x8(%eax)
c04362a6:       75 07                   jne    c04362af <sock_alloc+0x7f>
c04362a8:       8d 46 d8                lea    -0x28(%esi),%eax
c04362ab:       5b                      pop    %ebx
c04362ac:       5e                      pop    %esi
c04362ad:       c9                      leave
c04362ae:       c3                      ret
c04362af:       e8 cc 5d 09 00          call   c04cc080 <preempt_schedule>
c04362b4:       8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
c04362b8:       eb ee                   jmp    c04362a8 <sock_alloc+0x78>

after:

c0436275:   64 83 05 20 5f 6a c0    addl   $0x1,%fs:0xc06a5f20

There are no atomic instructions at all - the counters here are only 
accessed locally. They are local-irq-atomic, but not 
cacheline-atomic.

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 16:00                       ` Ingo Molnar
@ 2009-04-30 16:08                         ` Christoph Lameter
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 16:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Eric Dumazet, Mathieu Desnoyers, Linus Torvalds, Andrew Morton,
	Nick Piggin, KOSAKI Motohiro, Peter Zijlstra, thomas.pi,
	Yuriy Lalym, Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> The new percpu APIs could be used in most of these places already,
> straight away. This is a really good TODO list for places to
> enhance.

Please look at the full list in the cpu alloc v3 patchset and not only
at those that I listed here.

> Then a second set of patches could convert percpu_add() / etc. uses
> to __percpu_add() ... but that should be done by those architectures
> that need it (and to the extent they need it), because it's not
> really testable on x86.

OK, so we convert it and wait until the arch maintainers complain? I
definitely know that there is an IA64 issue with vm statistics.

> I dont really like the PER_CPU / CPU_INC etc. type of all-capitals
> APIs you introduced in the patches above:

I know. Patches would have to be redone against whatever API we agree on.

>
> +		__CPU_INC(bt->sequence);
> +	CPU_FREE(bt->sequence);
>
> was there any strong reason to go outside the well-established
> percpu_* name space and call these primitives as if they were
> macros?

They are macros and may do weird things with the variables. This goes back
to our disagreement last year on caps/lower case. I still think this kind
of preprocessor magic should be uppercase.

The reason not to use the percpu_* names was that they were x86
arch-specific (and thus not generally available) and did not differentiate
in terms of the irq/preemption context.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 16:06                           ` Ingo Molnar
@ 2009-04-30 16:11                             ` Christoph Lameter
  2009-04-30 16:16                             ` Linus Torvalds
  1 sibling, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> > may not care too much about a event counter missing a beat once in
> > a while for platforms not supporting atomic per cpu ops. I know
> > this affects IA64. The cost of an atomic operations for an event
> > counter update (which would have avoided the potential of a
> > concurrent update) was not justifiable.
>
> when you say "atomics", do you mean the classic meaning of atomics?
> Because there are no classic atomics involved. This is the
> before/after disassembly from Eric's commit 4e69489a0:

The fallback for IA64 would be to use full (classic) atomic operations
(fetchadd) instead of the fast irq-atomic per-cpu ops available on x86.

> c0436275:   64 83 05 20 5f 6a c0    addl   $0x1,%fs:0xc06a5f20
>
> There's no atomic instructions at all - the counters here are only
> accessed locally. They are local-irq-atomic, but not
> cacheline-atomic.

Right, but that is not available on IA64. So for the fallback one must
choose between manually disabling interrupts and then incrementing the
counter (a long code sequence) and a classic atomic operation.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 15:42                       ` Ingo Molnar
  2009-04-30 15:44                         ` Christoph Lameter
@ 2009-04-30 16:13                         ` Linus Torvalds
  1 sibling, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2009-04-30 16:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Lameter, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo



On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> nice. Do these all get called with irqs off, all the time?

There are lots of non-irq-off cases where non-atomic counters are safe. A 
couple of reasons:

 - counters that just don't care enough. Some statistics are very 
   important to always be exact, others are "helpful" but performance is 
   more important than being exactly right.

 - counters simply not accessed (or at least changed) from interrupts at 
   all. This is very common for some cases, notably "user event" stats. 
   They may need preemption support, but nothing fancier.

 - counters that are "idempotent" wrt interrupts. For example, anything 
   that will always count up and then down again in a paired fashion 
   (think kernel lock counter, preemption counter etc) is inherently safe 
   as a per-cpu counter without needing any protection at all, since any 
   interrupts may _change_ them, but will always change them back, so even 
   if a non-atomic sequence gets interrupted, it doesn't matter.

In fact, I'd argue that for a per-cpu counter, the whole _point_ of the 
exercise is almost always performance, so locked sequences would be bad to 
assume. The fact that x86 can do "atomic" per-cpu accesses with basically 
zero cost (by virtue of its rmw memory ops) is unusual.

Most other architectures will have a very hard time making such counters 
cheap, since for them it's either "irq atomicity" (expensive irq disable or 
store-conditional) or "SMP atomicity" (the same fairly expensive 
store-conditional, or something even worse).

				Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 16:06                           ` Ingo Molnar
  2009-04-30 16:11                             ` Christoph Lameter
@ 2009-04-30 16:16                             ` Linus Torvalds
  2009-04-30 17:23                               ` Ingo Molnar
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2009-04-30 16:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Lameter, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo



On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> c0436275:   64 83 05 20 5f 6a c0    addl   $0x1,%fs:0xc06a5f20
> 
> There are no atomic instructions at all - the counters here are only 
> accessed locally. They are local-irq-atomic, but not 
> cacheline-atomic.

On other architectures, you need the whole "disable preemption, 
load-locked, store-conditional, test-and-loop, enable preemption" thing. 

Or "disable interrupts, load, store, restore interrupts".

There really aren't very many architectures that can do almost 
unrestricted ALU ops in a single instruction (and thus automatically safe 
from preemption and interrupts).

			Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 16:16                             ` Linus Torvalds
@ 2009-04-30 17:23                               ` Ingo Molnar
  2009-04-30 18:07                                 ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-04-30 17:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Lameter, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> > 
> > c0436275:   64 83 05 20 5f 6a c0    addl   $0x1,%fs:0xc06a5f20
> > 
> > There are no atomic instructions at all - the counters here are 
> > only accessed locally. They are local-irq-atomic, but not 
> > cacheline-atomic.
> 
> On other architectures, you need the whole "disable preemption, 
> load-locked, store-conditional, test-and-loop, enable preemption" 
> thing.
> 
> Or "disable interrupts, load, store, restore interrupts".
> 
> There really aren't very many architectures that can do almost 
> unrestricted ALU ops in a single instruction (and thus 
> automatically safe from preemption and interrupts).

Maybe what we should do then is go back to the very first version of 
commit 6dbde35308: declaredly make percpu_arith_op() non-irq-atomic 
(and non-preempt-atomic) everywhere. The commit's internal changelog 
still says:

        * made generic percpu ops atomic against preemption

So we introduced preemption-safety in the v2 version of that commit.

This non-atomicity will 1) either not matter, 2) be irq-atomic by 
virtue of being within a critical section, or 3) be made atomic 
in the few remaining cases.

And maybe, at most, introduce an opt-in API: percpu_add_irqsafe().

Right?

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 17:23                               ` Ingo Molnar
@ 2009-04-30 18:07                                 ` Christoph Lameter
  2009-05-01 19:59                                   ` Ingo Molnar
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 18:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Thu, 30 Apr 2009, Ingo Molnar wrote:

> So we introduced preemption-safety in the v2 version of that commit.
>
> This non-atomicity will 1) either not matter 2) will be irq-atomic
> by virtue of being within a critical section 3) can be made atomic
> in the few remaining cases.
>
> And maybe, at most, introduce an opt-in API: percpu_add_irqsafe().

Plus percpu_add_preemptsafe().



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 14:12               ` Christoph Lameter
@ 2009-04-30 19:41                 ` Mathieu Desnoyers
  2009-04-30 20:17                   ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30 19:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ingo Molnar, Nick Piggin, Peter Zijlstra,
	Linux Kernel Mailing List, Yuriy Lalym, Tejun Heo, ltt-dev,
	Andrew Morton, Linus Torvalds

* Christoph Lameter (cl@linux.com) wrote:
> On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:
> 
> > > The 3 variants on x86 generate the same instructions. On other platforms
> > > they would need to be able to fall back in various ways depending on the
> > > availability of instructions that are atomic vs. preemption or irqs.
> > >
> >
> > The problem here, as we did figure out a while ago with the atomic
> > slub we worked on a while ago, is that if we have the following code :
> >
> > local_irq_save
> > var++
> > var++
> > local_irq_restore
> >
> > that we would like to turn into irq-safe percpu variant with this
> > semantic :
> >
> > percpu_add_irqsafe(var)
> > percpu_add_irqsafe(var)
> >
> > We are generating two irq save/restore in the fallback, which will be
> > slow.
> >
> > However, we could do the following trick :
> >
> > percpu_irqsave(flags);
> > percpu_add_irq(var);
> > percpu_add_irq(var);
> > percpu_irqrestore(flags);
> 
> Hmmm. I do not remember any of those double ops in the patches that I did a
> while back for this. It does not make sense either, because atomic per-cpu
> ops are only atomic for a single instruction. You are trying to extend
> that so that multiple "atomic" instructions are atomic as a group.
> 

Hrm, not exactly. So I probably chose the naming of the primitives
poorly here if my idea seems unclear. Here is what I am trying to do:

On architectures with an irq-safe percpu_add:
- No need to disable interrupts at all.

On archs lacking such an irq-safe percpu_add:
- Disable interrupts only once for a whole sequence of percpu counter operations.

I tried to come up with an example in vmstat where multiple percpu ops
would be required, but I figured out that the code needs to be changed
to support percpu ops correctly. However, separating
percpu_irqsave/restore from percpu_add_return_irq lets us express
__inc_zone_state and inc_zone_state cleanly, which would be difficult
otherwise.

Let's assume we change the stat_threshold values in mm/vmstat.c so they
become powers of 2 (so we don't care when the u8 overflow occurs; the
counter becomes free-running). This model does not support
"overstep" (yet).

Then assume:

        u8 stat_threshold_mask = pcp->stat_threshold - 1;

mm/vmstat.c :

void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
        ... assuming p references percpu "u8" counters ...
        u8 p_new;

        p_new = percpu_add_return_irq(p, 1);

        if (unlikely(!(p_new & pcp->stat_threshold_mask)))
                zone_page_state_add(pcp->stat_threshold, zone, item);
}


void inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
        unsigned long flags;

        /*
         * Disabling interrupts _only_ on architectures lacking atomic
         * percpu_*_irq ops.
         */
        percpu_irqsave(flags);
        __inc_zone_state(zone, item);
        percpu_irqrestore(flags);
}

void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
{
        ... assuming p references percpu "u8" counters ...
        u8 p_new;

        p_new = percpu_sub_return_irq(p, 1);

        if (unlikely(!(p_new & pcp->stat_threshold_mask)))
                zone_page_state_add(-(pcp->stat_threshold), zone, item);
}

void dec_zone_state(struct zone *zone, enum zone_stat_item item)
{
        unsigned long flags;

        /*
         * Disabling interrupts _only_ on architectures lacking atomic
         * percpu_*_irq ops.
         */
        percpu_irqsave(flags);
        __dec_zone_state(zone, item);
        percpu_irqrestore(flags);
}

void __mod_zone_state(struct zone *zone, enum zone_stat_item item, long delta)
{
        ... assuming p references percpu "u8" counters ...

        u8 p_new;
        long overflow_delta;

        p_new = percpu_add_return_irq(p, delta);

        /*
         * We must count the number of threshold overflows generated by
         * "delta". I know, this looks rather odd.
         */
        overflow_delta = ((long)p_new & ~(long)pcp->stat_threshold_mask)
                         - (((long)p_new - delta)
                           & ~(long)pcp->stat_threshold_mask);

        if (unlikely(abs(overflow_delta) > pcp->stat_threshold_mask))
                zone_page_state_add(overflow_delta, zone, item);
}

void mod_zone_state(struct zone *zone, enum zone_stat_item item, long delta)
{
        unsigned long flags;

        /*
         * Disabling interrupts _only_ on architectures lacking atomic
         * percpu_*_irq ops.
         */
        percpu_irqsave(flags);
        __mod_zone_state(zone, item, delta);
        percpu_irqrestore(flags);
}

Note that all the fast-path would execute with preemption enabled if the
architecture supports irqsave percpu atomic ops.

So as we can see, if per-cpu ops are used on _different_ atomic counters,
a sequence of multiple percpu ops may be required. However, in the
vmstat case, given that the version currently in mainline uses a sequence
of operations on the same variable, this requires re-engineering the
structure, because otherwise races with preemption would occur.

Disclaimer: the code above has been written in an email client and may
not compile/work/etc.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 19:41                 ` Mathieu Desnoyers
@ 2009-04-30 20:17                   ` Christoph Lameter
  2009-04-30 21:17                     ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-04-30 20:17 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Nick Piggin, Peter Zijlstra,
	Linux Kernel Mailing List, Yuriy Lalym, Tejun Heo, ltt-dev,
	Andrew Morton, Linus Torvalds

On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:

> On architectures with irq-safe percpu_add :
> - No need to disable interrupts at all

They may have other options, like falling back to classic atomic ops.

> Let's assume we change the stat_threshold values in mm/vmstat.c so they
> become power of 2 (so we don't care when the u8 overflow occurs so it
> becomes a free running counter). This model does not support
> "overstep" (yet).

The problem with such an approach is that the counters may wrap if
sufficiently many processors are in a ZVC update. Let's say a new interrupt
that increments the same counter occurs while irqs are held off. When
interrupts are re-enabled, we increment again from the new interrupt.
We then check for overflow only after the new interrupt that modified the
counter has completed. The differentials must be constant while the ZVC
update functions are running, and with your modifications that is no longer
guaranteed.

> Note that all the fast-path would execute with preemption enabled if the
> architecture supports irqsave percpu atomic ops.

What would work is to use a percpu cmpxchg for the ZVC updates.
I wrote a patch like that over a year ago. There is no percpu_cmpxchg
coming with the percpu ops as of today so we'd need to add that first.

Not sure if this is a performance win though. Complexity increases
somewhat.

---
 include/linux/vmstat.h |   17 ++++--
 mm/vmstat.c            |  122 ++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 127 insertions(+), 12 deletions(-)

Index: linux-2.6/include/linux/vmstat.h
===================================================================
--- linux-2.6.orig/include/linux/vmstat.h	2007-11-07 18:36:03.000000000 -0800
+++ linux-2.6/include/linux/vmstat.h	2007-11-07 18:38:27.000000000 -0800
@@ -202,15 +202,22 @@ extern void inc_zone_state(struct zone *
 void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
 void __inc_zone_page_state(struct page *, enum zone_stat_item);
 void __dec_zone_page_state(struct page *, enum zone_stat_item);
+void __inc_zone_state(struct zone *, enum zone_stat_item);
+void __dec_zone_state(struct zone *, enum zone_stat_item);

+#ifdef CONFIG_FAST_CMPXCHG_LOCAL
+#define inc_zone_page_state __inc_zone_page_state
+#define dec_zone_page_state __dec_zone_page_state
+#define mod_zone_page_state __mod_zone_page_state
+#define inc_zone_state __inc_zone_state
+#define dec_zone_state __dec_zone_state
+#else
 void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
 void inc_zone_page_state(struct page *, enum zone_stat_item);
 void dec_zone_page_state(struct page *, enum zone_stat_item);
-
-extern void inc_zone_state(struct zone *, enum zone_stat_item);
-extern void __inc_zone_state(struct zone *, enum zone_stat_item);
-extern void dec_zone_state(struct zone *, enum zone_stat_item);
-extern void __dec_zone_state(struct zone *, enum zone_stat_item);
+void inc_zone_state(struct zone *, enum zone_stat_item);
+void dec_zone_state(struct zone *, enum zone_stat_item);
+#endif

 void refresh_cpu_vm_stats(int);
 #else /* CONFIG_SMP */
Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c	2007-11-07 17:20:16.000000000 -0800
+++ linux-2.6/mm/vmstat.c	2007-11-07 18:55:57.000000000 -0800
@@ -151,8 +151,109 @@ static void refresh_zone_stat_thresholds
 	}
 }

+#ifdef CONFIG_FAST_CMPXCHG_LOCAL
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+				int delta)
+{
+	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	s8 *p = pcp->vm_stat_diff + item;
+	s8 old;
+	unsigned long new;
+	unsigned long add;
+
+	do {
+		add = 0;
+		old = *p;
+		new = old + delta;
+
+		if (unlikely(new > pcp->stat_threshold ||
+				new < -pcp->stat_threshold)) {
+			add = new;
+			new = 0;
+		}
+
+	} while (cmpxchg_local(p, old, new) != old);
+
+	if (add)
+		zone_page_state_add(add, zone, item);
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+/*
+ * Optimized increment and decrement functions implemented using
+ * cmpxchg_local. These do not require interrupts to be disabled
+ * and enabled.
+ */
+void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
+{
+	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	s8 *p = pcp->vm_stat_diff + item;
+	int add;
+	unsigned long old;
+	unsigned long new;
+
+	do {
+		add = 0;
+		old = *p;
+		new = old + 1;
+
+		if (unlikely(new > pcp->stat_threshold)) {
+			add = new + pcp->stat_threshold / 2;
+			new = -(pcp->stat_threshold / 2);
+		}
+	} while (cmpxchg_local(p, old, new) != old);
+
+	if (add)
+		zone_page_state_add(add, zone, item);
+}
+
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+	__inc_zone_state(page_zone(page), item);
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
+{
+	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	s8 *p = pcp->vm_stat_diff + item;
+	int sub;
+	unsigned long old;
+	unsigned long new;
+
+	do {
+		sub = 0;
+		old = *p;
+		new = old - 1;
+
+		if (unlikely(new < - pcp->stat_threshold)) {
+			sub = new - pcp->stat_threshold / 2;
+			new = pcp->stat_threshold / 2;
+		}
+	} while (cmpxchg_local(p, old, new) != old);
+
+	if (sub)
+		zone_page_state_add(sub, zone, item);
+}
+
+void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+	__dec_zone_state(page_zone(page), item);
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+static inline void sync_diff(struct zone *zone, struct per_cpu_pageset *p, int i)
+{
+	/*
+	 * xchg_local() would be useful here but that does not exist.
+	 */
+	zone_page_state_add(xchg(&p->vm_stat_diff[i], 0), zone, i);
+}
+
+#else /* CONFIG_FAST_CMPXCHG_LOCAL */
+
 /*
- * For use when we know that interrupts are disabled.
+ * Functions that do not rely on cmpxchg_local
  */
 void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 				int delta)
@@ -281,6 +382,17 @@ void dec_zone_page_state(struct page *pa
 }
 EXPORT_SYMBOL(dec_zone_page_state);

+static inline void sync_diff(struct zone *zone, struct per_cpu_pageset *p, int i)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	zone_page_state_add(p->vm_stat_diff[i], zone, i);
+	p->vm_stat_diff[i] = 0;
+	local_irq_restore(flags);
+}
+#endif /* !CONFIG_FAST_CMPXCHG_LOCAL */
+
 /*
  * Update the zone counters for one cpu.
  *
@@ -299,7 +411,6 @@ void refresh_cpu_vm_stats(int cpu)
 {
 	struct zone *zone;
 	int i;
-	unsigned long flags;

 	for_each_zone(zone) {
 		struct per_cpu_pageset *p;
@@ -311,15 +422,12 @@ void refresh_cpu_vm_stats(int cpu)

 		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
 			if (p->vm_stat_diff[i]) {
-				local_irq_save(flags);
-				zone_page_state_add(p->vm_stat_diff[i],
-					zone, i);
-				p->vm_stat_diff[i] = 0;
+				sync_diff(zone, p, i);
+
 #ifdef CONFIG_NUMA
 				/* 3 seconds idle till flush */
 				p->expire = 3;
 #endif
-				local_irq_restore(flags);
 			}
 #ifdef CONFIG_NUMA
 		/*

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 20:17                   ` Christoph Lameter
@ 2009-04-30 21:17                     ` Mathieu Desnoyers
  2009-05-01 13:44                       ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-04-30 21:17 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:
> 
> > On architectures with irq-safe percpu_add :
> > - No need to disable interrupts at all
> 
> They may have other options like fall back to classic atomic ops.
> 
> > Let's assume we change the stat_threshold values in mm/vmstat.c so they
> > become power of 2 (so we don't care when the u8 overflow occurs so it
> > becomes a free running counter). This model does not support
> > "overstep" (yet).
> 
> The problem with such an approach is that the counters may wrap if
> sufficiently many processors are in a ZVC update.

By ZVC update, you mean Zone ... Counter update ? (which code exactly ?)

>  Let's say a new interrupt
> is occurring while irq is held off that increments the same counter. When
> interrupts are reenabled, we increment again from the new interrupt.
> We then check for overflow only after the new interrupt that modified the
> counter has completed. The differentials must be constant while the ZVC
> update functions are running and with your modifications that is no longer
> guaranteed.
> 

Hrm, I must admit I'm not sure I follow how your reasoning applies to my
code. I am using a percpu_add_return_irq() exactly for this reason : it
only ever touches the percpu variable once and atomically. The test for
overflow is done on the value returned by percpu_add_return_irq().

Therefore, an interrupt scenario that would be close to what I
understand from your concerns would be :

* Thread A

inc_zone_page_state()
  p_ret = percpu_add_return(p, 1); (let's suppose this increment
                                    overflows the threshold, therefore
                                    (p_ret & mask) == 0)

----> interrupt comes in, preempts the current thread, execution in a
      different thread context (* Thread B) :

     inc_zone_page_state()
       p_ret = percpu_add_return(p, 1);  ((p_ret & mask) == 1)
       if (!(p_ret & mask))
         increment global zone count. (not executed)

----> interrupt comes in, preempts the current thread, execution back to
      the original thread context (Thread A), on the same or on a
      different CPU :

  if (!(p_ret & mask))
    increment global zone count.   -----> will therefore increment the
                                          global zone count only after
                                          scheduling back the original
                                          thread.

So I guess what you say here is that if Thread B is preempted for too
long, we will have to wait until it gets scheduled back before the
global count is incremented. Do we really need such degree of precision
for those counters ?

(I fear I'm not understanding your concern fully though)
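
For illustration, the scenario above can be modeled sequentially in
user-space C. "Thread A" obtains its return value from the atomic percpu
add, is interrupted before testing it, "Thread B" runs a complete
increment, then A resumes. The threshold (4) and all names here are
illustrative stand-ins for the example, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

#define THRESHOLD 4u		/* power of two, made up for the example */
#define MASK (THRESHOLD - 1)

static uint8_t diff;		/* stands in for the percpu differential */
static long global_count;	/* stands in for the global zone counter */

/* Models percpu_add_return(p, 1): one atomic touch of the percpu var. */
static uint8_t percpu_add_return_sim(void)
{
	return ++diff;
}

/* The deferred half of inc_zone_page_state(): test the returned value. */
static void finish_update(uint8_t p_ret)
{
	if (!(p_ret & MASK))
		global_count += THRESHOLD;
}
```

Even with B's increment interleaved between A's add and A's test, the
global counter is bumped exactly once, just later than usual.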

> > Note that all the fast-path would execute with preemption enabled if the
> > architecture supports irqsave percpu atomic ops.
> 
> What would work is to use a percpu cmpxchg for the ZVC updates.
> I wrote a patch like that over a year ago. There is no percpu_cmpxchg
> coming with the percpu ops as of today so we'd need to add that first.
> 
> Not sure if this is a performance win though. Complexity increases
> somewhat.
> 

hrm, yes, the patch below clearly does not win the low-complexity
contest. :)

Mathieu

> ---
>  include/linux/vmstat.h |   17 ++++--
>  mm/vmstat.c            |  122 ++++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 127 insertions(+), 12 deletions(-)
> 
> Index: linux-2.6/include/linux/vmstat.h
> ===================================================================
> --- linux-2.6.orig/include/linux/vmstat.h	2007-11-07 18:36:03.000000000 -0800
> +++ linux-2.6/include/linux/vmstat.h	2007-11-07 18:38:27.000000000 -0800
> @@ -202,15 +202,22 @@ extern void inc_zone_state(struct zone *
>  void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
>  void __inc_zone_page_state(struct page *, enum zone_stat_item);
>  void __dec_zone_page_state(struct page *, enum zone_stat_item);
> +void __inc_zone_state(struct zone *, enum zone_stat_item);
> +void __dec_zone_state(struct zone *, enum zone_stat_item);
> 
> +#ifdef CONFIG_FAST_CMPXCHG_LOCAL
> +#define inc_zone_page_state __inc_zone_page_state
> +#define dec_zone_page_state __dec_zone_page_state
> +#define mod_zone_page_state __mod_zone_page_state
> +#define inc_zone_state __inc_zone_state
> +#define dec_zone_state __dec_zone_state
> +#else
>  void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
>  void inc_zone_page_state(struct page *, enum zone_stat_item);
>  void dec_zone_page_state(struct page *, enum zone_stat_item);
> -
> -extern void inc_zone_state(struct zone *, enum zone_stat_item);
> -extern void __inc_zone_state(struct zone *, enum zone_stat_item);
> -extern void dec_zone_state(struct zone *, enum zone_stat_item);
> -extern void __dec_zone_state(struct zone *, enum zone_stat_item);
> +void inc_zone_state(struct zone *, enum zone_stat_item);
> +void dec_zone_state(struct zone *, enum zone_stat_item);
> +#endif
> 
>  void refresh_cpu_vm_stats(int);
>  #else /* CONFIG_SMP */
> Index: linux-2.6/mm/vmstat.c
> ===================================================================
> --- linux-2.6.orig/mm/vmstat.c	2007-11-07 17:20:16.000000000 -0800
> +++ linux-2.6/mm/vmstat.c	2007-11-07 18:55:57.000000000 -0800
> @@ -151,8 +151,109 @@ static void refresh_zone_stat_thresholds
>  	}
>  }
> 
> +#ifdef CONFIG_FAST_CMPXCHG_LOCAL
> +void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> +				int delta)
> +{
> +	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
> +	s8 *p = pcp->vm_stat_diff + item;
> +	s8 old;
> +	unsigned long new;
> +	unsigned long add;
> +
> +	do {
> +		add = 0;
> +		old = *p;
> +		new = old + delta;
> +
> +		if (unlikely(new > pcp->stat_threshold ||
> +				new < -pcp->stat_threshold)) {
> +			add = new;
> +			new = 0;
> +		}
> +
> +	} while (cmpxchg_local(p, old, new) != old);
> +
> +	if (add)
> +		zone_page_state_add(add, zone, item);
> +}
> +EXPORT_SYMBOL(__mod_zone_page_state);
> +
> +/*
> + * Optimized increment and decrement functions implemented using
> + * cmpxchg_local. These do not require interrupts to be disabled
> + * and enabled.
> + */
> +void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> +{
> +	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
> +	s8 *p = pcp->vm_stat_diff + item;
> +	int add;
> +	unsigned long old;
> +	unsigned long new;
> +
> +	do {
> +		add = 0;
> +		old = *p;
> +		new = old + 1;
> +
> +		if (unlikely(new > pcp->stat_threshold)) {
> +			add = new + pcp->stat_threshold / 2;
> +			new = -(pcp->stat_threshold / 2);
> +		}
> +	} while (cmpxchg_local(p, old, new) != old);
> +
> +	if (add)
> +		zone_page_state_add(add, zone, item);
> +}
> +
> +void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> +	__inc_zone_state(page_zone(page), item);
> +}
> +EXPORT_SYMBOL(__inc_zone_page_state);
> +
> +void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
> +{
> +	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
> +	s8 *p = pcp->vm_stat_diff + item;
> +	int sub;
> +	unsigned long old;
> +	unsigned long new;
> +
> +	do {
> +		sub = 0;
> +		old = *p;
> +		new = old - 1;
> +
> +		if (unlikely(new < - pcp->stat_threshold)) {
> +			sub = new - pcp->stat_threshold / 2;
> +			new = pcp->stat_threshold / 2;
> +		}
> +	} while (cmpxchg_local(p, old, new) != old);
> +
> +	if (sub)
> +		zone_page_state_add(sub, zone, item);
> +}
> +
> +void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> +	__dec_zone_state(page_zone(page), item);
> +}
> +EXPORT_SYMBOL(__dec_zone_page_state);
> +
> +static inline void sync_diff(struct zone *zone, struct per_cpu_pageset *p, int i)
> +{
> +	/*
> +	 * xchg_local() would be useful here but that does not exist.
> +	 */
> +	zone_page_state_add(xchg(&p->vm_stat_diff[i], 0), zone, i);
> +}
> +
> +#else /* CONFIG_FAST_CMPXCHG_LOCAL */
> +
>  /*
> - * For use when we know that interrupts are disabled.
> + * Functions that do not rely on cmpxchg_local
>   */
>  void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
>  				int delta)
> @@ -281,6 +382,17 @@ void dec_zone_page_state(struct page *pa
>  }
>  EXPORT_SYMBOL(dec_zone_page_state);
> 
> +static inline void sync_diff(struct zone *zone, struct per_cpu_pageset *p, int i)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	zone_page_state_add(p->vm_stat_diff[i], zone, i);
> +	p->vm_stat_diff[i] = 0;
> +	local_irq_restore(flags);
> +}
> +#endif /* !CONFIG_FAST_CMPXCHG_LOCAL */
> +
>  /*
>   * Update the zone counters for one cpu.
>   *
> @@ -299,7 +411,6 @@ void refresh_cpu_vm_stats(int cpu)
>  {
>  	struct zone *zone;
>  	int i;
> -	unsigned long flags;
> 
>  	for_each_zone(zone) {
>  		struct per_cpu_pageset *p;
> @@ -311,15 +422,12 @@ void refresh_cpu_vm_stats(int cpu)
> 
>  		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
>  			if (p->vm_stat_diff[i]) {
> -				local_irq_save(flags);
> -				zone_page_state_add(p->vm_stat_diff[i],
> -					zone, i);
> -				p->vm_stat_diff[i] = 0;
> +				sync_diff(zone, p, i);
> +
>  #ifdef CONFIG_NUMA
>  				/* 3 seconds idle till flush */
>  				p->expire = 3;
>  #endif
> -				local_irq_restore(flags);
>  			}
>  #ifdef CONFIG_NUMA
>  		/*
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 21:17                     ` Mathieu Desnoyers
@ 2009-05-01 13:44                       ` Christoph Lameter
  2009-05-01 19:21                         ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-01 13:44 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:

> By ZVC update, you mean Zone ... Counter update ? (which code exactly ?)

The code that you were modifying in vmstat.c.

> Hrm, I must admit I'm not sure I follow how your reasoning applies to my
> code. I am using a percpu_add_return_irq() exactly for this reason : it
> only ever touches the percpu variable once and atomically. The test for
> overflow is done on the value returned by percpu_add_return_irq().

If the percpu differential goes over a certain boundary then the
differential would be updated twice.

> Therefore, an interrupt scenario that would be close to what I
> understand from your concerns would be :
>
> * Thread A
>
> inc_zone_page_state()
>   p_ret = percpu_add_return(p, 1); (let's suppose this increment
>                                     overflows the threshold, therefore
>                                     (p_ret & mask) == 0)
>
> ----> interrupt comes in, preempts the current thread, execution in a
>       different thread context (* Thread B) :
>
>      inc_zone_page_state()
>        p_ret = percpu_add_return(p, 1);  ((p_ret & mask) == 1)
>        if (!(p_ret & mask))
>          increment global zone count. (not executed)
>
> ----> interrupt comes in, preempts the current thread, execution back to
>       the original thread context (Thread A), on the same or on a
>       different CPU :
>
>   if (!(p_ret & mask))
>     increment global zone count.   -----> will therefore increment the
>                                           global zone count only after
>                                           scheduling back the original
>                                           thread.
>
> So I guess what you say here is that if Thread B is preempted for too
> long, we will have to wait until it gets scheduled back before the
> global count is incremented. Do we really need such degree of precision
> for those counters ?
>
> (I fear I'm not understanding your concern fully though)

Inc_zone_page_state modifies the differential which is u8 and can easily
overflow.

Hmmm. But if you check for overflow to zero this way it may work without
the need for cmpxchg. But if you rely on overflow then we only update the
global count after 256 counts on the percpu differential. The tuning of
the accuracy of the counter won't work anymore. The global counter could
become wildly inaccurate with a lot of processors.




* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 13:44                       ` Christoph Lameter
@ 2009-05-01 19:21                         ` Mathieu Desnoyers
  2009-05-01 19:31                           ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-01 19:21 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:
> 
> > By ZVC update, you mean Zone ... Counter update ? (which code exactly ?)
> 
> The code that you were modifying in vmstat.c.
> 
> > Hrm, I must admit I'm not sure I follow how your reasoning applies to my
> > code. I am using a percpu_add_return_irq() exactly for this reason : it
> > only ever touches the percpu variable once and atomically. The test for
> > overflow is done on the value returned by percpu_add_return_irq().
> 
> If the percpu differential goes over a certain boundary then the
> differential would be updated twice.
> 

Not with my approach, which tests for == 0, as you point out below,

> > Therefore, an interrupt scenario that would be close to what I
> > understand from your concerns would be :
> >
> > * Thread A
> >
> > inc_zone_page_state()
> >   p_ret = percpu_add_return(p, 1); (let's suppose this increment
> >                                     overflows the threshold, therefore
> >                                     (p_ret & mask) == 0)
> >
> > ----> interrupt comes in, preempts the current thread, execution in a
> >       different thread context (* Thread B) :
> >
> >      inc_zone_page_state()
> >        p_ret = percpu_add_return(p, 1);  ((p_ret & mask) == 1)
> >        if (!(p_ret & mask))
> >          increment global zone count. (not executed)
> >
> > ----> interrupt comes in, preempts the current thread, execution back to
> >       the original thread context (Thread A), on the same or on a
> >       different CPU :
> >
> >   if (!(p_ret & mask))
> >     increment global zone count.   -----> will therefore increment the
> >                                           global zone count only after
> >                                           scheduling back the original
> >                                           thread.
> >
> > So I guess what you say here is that if Thread B is preempted for too
> > long, we will have to wait until it gets scheduled back before the
> > global count is incremented. Do we really need such degree of precision
> > for those counters ?
> >
> > (I fear I'm not understanding your concern fully though)
> 
> Inc_zone_page_state modifies the differential which is u8 and can easily
> overflow.
> 
> Hmmm. But if you check for overflow to zero this way it may work without
> the need for cmpxchg. But if you rely on overflow then we only update the
> global count after 256 counts on the percpu differential. The tuning of
> the accuracy of the counter won't work anymore. The global counter could
> become wildly inaccurate with a lot of processors.
> 

I see that we are getting on the same page here. Good :) About the
overflow :

What I do here is to let those u8 counters increment as free-running
counters. Yes, they will periodically overflow the 8 bits. But I don't
rely on this for counting the number of increments we need between
global counter updates : I use the bitmask taken from the threshold
value (which is now required to be a power of two) to detect 0, 1, 2, 3,
4, 5, 6 or 7-bit counter overflow. Therefore we can still have the kind
of granularity currently provided. The only limitation is that we have
to use powers of two for the threshold, so we end up counting in power
of two modulo, which will be unaffected by the u8 overflow.
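
A small user-space sketch of this scheme (the threshold value of 32 and
the names are illustrative, not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Free-running u8 differential with a power-of-two threshold: every
 * THRESHOLD-th increment has its low bits equal to zero, and that test
 * keeps working across the u8 wraparound. */

#define THRESHOLD 32u
#define MASK (THRESHOLD - 1)

static uint8_t vm_stat_diff;	/* free-running percpu differential */
static long global_count;	/* global zone counter */

static void inc_zone_state_sketch(void)
{
	uint8_t ret = ++vm_stat_diff;	/* percpu_add_return(p, 1) */

	if (!(ret & MASK))
		global_count += THRESHOLD;
}
```

After any number of increments the global counter lags the true count by
less than THRESHOLD, no matter how often the u8 has wrapped.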

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 19:21                         ` Mathieu Desnoyers
@ 2009-05-01 19:31                           ` Christoph Lameter
  2009-05-01 20:24                             ` Mathieu Desnoyers
  2009-05-02 21:01                             ` Mathieu Desnoyers
  0 siblings, 2 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-05-01 19:31 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

On Fri, 1 May 2009, Mathieu Desnoyers wrote:

> What I do here is to let those u8 counters increment as free-running
> counters. Yes, they will periodically overflow the 8 bits. But I don't
> rely on this for counting the number of increments we need between
> global counter updates : I use the bitmask taken from the threshold
> value (which is now required to be a power of two) to detect 0, 1, 2, 3,
> 4, 5, 6 or 7-bit counter overflow. Therefore we can still have the kind
> of granularity currently provided. The only limitation is that we have
> to use powers of two for the threshold, so we end up counting in power
> of two modulo, which will be unaffected by the u8 overflow.

Ack. Got it. Looks good.



* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30 18:07                                 ` Christoph Lameter
@ 2009-05-01 19:59                                   ` Ingo Molnar
  2009-05-01 20:35                                     ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-05-01 19:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Thu, 30 Apr 2009, Ingo Molnar wrote:
> 
> > So we introduced preemption-safety in the v2 version of that commit.
> >
> > This non-atomicity will 1) either not matter 2) will be 
> > irq-atomic by virtue of being within a critical section 3) can 
> > be made atomic in the few remaining cases.
> >
> > And maybe, at most, introduce an opt-in API: 
> > percpu_add_irqsafe().
> 
> Plus percpu_add_preemptsafe().

Ok - that would be easy. Would you be interested in doing that and 
would you be interested in resurrecting your old patches? They are 
genuinely useful IMHO.

	Ingo


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 19:31                           ` Christoph Lameter
@ 2009-05-01 20:24                             ` Mathieu Desnoyers
  2009-05-01 20:28                               ` Christoph Lameter
  2009-05-02 21:01                             ` Mathieu Desnoyers
  1 sibling, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-01 20:24 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > What I do here is to let those u8 counters increment as free-running
> > counters. Yes, they will periodically overflow the 8 bits. But I don't
> > rely on this for counting the number of increments we need between
> > global counter updates : I use the bitmask taken from the threshold
> > value (which is now required to be a power of two) to detect 0, 1, 2, 3,
> > 4, 5, 6 or 7-bit counter overflow. Therefore we can still have the kind
> > of granularity currently provided. The only limitation is that we have
> > to use powers of two for the threshold, so we end up counting in power
> > of two modulo, which will be unaffected by the u8 overflow.
> 
> Ack. Got it. Looks good.
> 

Super ! :)

So, back to my original point : do you agree on the usefulness of
separating fallback irq-disabling from the per-cpu atomic construct ?

e.g. :

__inc_zone_state
  percpu_add_return_irq(var);

inc_zone_state
  percpu_irqsave(flags);
  __inc_zone_state()
  percpu_irqrestore(flags);

This would require that percpu_add_return_irq should always be called
either in :
  - irq disabled code paths
  - in code paths surrounded by percpu_irqsave/restore.

In this example :

x86 would map :

percpu_irqsave/restore to "nothing".
percpu_add_return_irq to xadd instruction. It is irq-safe by design.

Other architectures (fallback) would map

percpu_irqsave/restore to local_irq_save/restore.
percpu_add_return_irq to var += value; return var;
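
In user-space pseudo-kernel C, the proposed mapping could look roughly
like the following. Everything here is a sketch of the proposal, not
existing kernel API; the irq state is modeled by a flag and the arch
symbol is made up:

```c
#include <assert.h>

/* On an arch with an irq-safe percpu add (x86 xadd),
 * percpu_irqsave/restore would compile to nothing; on the fallback,
 * they map to local_irq_save/restore, modeled here by a flag. */

static int irqs_disabled;	/* stand-in for the hardware irq state */
static long percpu_var;

#ifdef ARCH_HAS_IRQSAFE_PERCPU_ADD
#define percpu_irqsave(flags)		((void)(flags))
#define percpu_irqrestore(flags)	((void)(flags))
#else
#define percpu_irqsave(flags)		do { (void)(flags); irqs_disabled = 1; } while (0)
#define percpu_irqrestore(flags)	do { (void)(flags); irqs_disabled = 0; } while (0)
#endif

/* Must run either with irqs off or inside percpu_irqsave/restore. */
static long percpu_add_return_irq(long *var, long value)
{
	*var += value;
	return *var;
}
```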

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 20:24                             ` Mathieu Desnoyers
@ 2009-05-01 20:28                               ` Christoph Lameter
  2009-05-01 20:43                                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-01 20:28 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

On Fri, 1 May 2009, Mathieu Desnoyers wrote:

> So, back to my original point : do you agree on the usefulness of
> separating fallback irq-disabling from the per-cpu atomic construct ?

No. Percpu operations are used for statistics and are like atomic
operations. Aggregation of these leads to a can of worms that we better
leave unopened.

> x86 would map :
>
> percpu_irqsave/restore to "nothing".
> percpu_add_return_irq to xadd instruction. It is irq-safe by design.
>
> Other architectures (fallback) would map
>
> percpu_irqsave/restore to local_irq_save/restore.
> percpu_add_return_irq to var += value; return var;

Shudder.... We have explored those types of macros before (while doing
fastpath optimization for SLUB) and it significantly increases the
complexity. People may add additional instructions in between and now
interrupts could be on or off depending on the architecture. Sometimes
percpu_irqsave does nothing. Very difficult to ensure that the usage is
correct.

And we have barely any usage case for such macros.






* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 19:59                                   ` Ingo Molnar
@ 2009-05-01 20:35                                     ` Christoph Lameter
  2009-05-01 21:07                                       ` Ingo Molnar
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-01 20:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Fri, 1 May 2009, Ingo Molnar wrote:

> > Plus percpu_add_preemptsafe().
>
> Ok - that would be easy. Would you be interested in doing that and
> would you be interested in resurrecting your old patches? They are
> genuinely useful IMHO.

I see but it may take some time for me to get that finished (probably 2
weeks or so). Don't feel that you have to merge my old patches or a variant
thereof.


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 20:43                                 ` Mathieu Desnoyers
@ 2009-05-01 20:42                                   ` Christoph Lameter
  2009-05-01 21:19                                     ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-01 20:42 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

On Fri, 1 May 2009, Mathieu Desnoyers wrote:

> Then do you have a better idea on how to deal with
> __inc_zone_state/inc_zone_state without duplicating the code and without
> adding useless local_irq_save/restore on x86 ?

We are already using __inc_zone_state to avoid local_irq_save/restore.



* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 20:28                               ` Christoph Lameter
@ 2009-05-01 20:43                                 ` Mathieu Desnoyers
  2009-05-01 20:42                                   ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-01 20:43 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > So, back to my original point : do you agree on the usefulness of
> > separating fallback irq-disabling from the per-cpu atomic construct ?
> 
> No. Percpu operations are used for statistics and are like atomic
> operations. Aggregation of these leads to a can of worms that we better
> leave unopened.
> 
> > x86 would map :
> >
> > percpu_irqsave/restore to "nothing".
> > percpu_add_return_irq to xadd instruction. It is irq-safe by design.
> >
> > Other architectures (fallback) would map
> >
> > percpu_irqsave/restore to local_irq_save/restore.
> > percpu_add_return_irq to var += value; return var;
> 
> Shudder.... We have explored those types of macros before (while doing
> fastpath optimization for SLUB) and it significantly increases the
> complexity. People may add additional instructions in between and now
> interrupts could be on or off depending on the architecture. Sometimes
> percpu_irqsave does nothing. Very difficult to ensure that the usage is
> correct.
> 
> And we have barely any usage case for such macros.
> 

Then do you have a better idea on how to deal with
__inc_zone_state/inc_zone_state without duplicating the code and without
adding useless local_irq_save/restore on x86 ?

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 20:35                                     ` Christoph Lameter
@ 2009-05-01 21:07                                       ` Ingo Molnar
  2009-05-02  3:06                                         ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-05-01 21:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Fri, 1 May 2009, Ingo Molnar wrote:
> 
> > > Plus percpu_add_preemptsafe().
> >
> > Ok - that would be easy. Would you be interested in doing that 
> > and would you be interested in resurrecting your old patches? 
> > They are genuinely useful IMHO.
> 
> I see but it may take some time for me to get that finished 
> (probably 2 weeks or so). Don't feel that you have to merge my old
> patches or a variant thereof.

it's your baby really and i don't want to interfere in any negative
way. The patches look fairly complete and you did all the hard work 
with them already.

Can help out with review and testing (as always) but i suspect it's 
more appropriate for -mm to carry them in the end, because of the MM 
and tree-wide impact, and because the fine differences between those 
APIs will only be truly visible on non-x86.

( There's also Tejun's latest improvements of the dynamic allocator
  that he made based on your feedback - are those in any tree yet? )

	Ingo


* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 20:42                                   ` Christoph Lameter
@ 2009-05-01 21:19                                     ` Mathieu Desnoyers
  2009-05-02  3:00                                       ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-01 21:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > Then do you have a better idea on how to deal with
> > __inc_zone_state/inc_zone_state without duplicating the code and without
> > adding useless local_irq_save/restore on x86 ?
> 
> We are already using __inc_zone_state to avoid local_irq_save/restore.
> 

Then, if I understand you correctly, you propose :

void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
        ... assuming p references percpu "u8" counters ...
        u8 p_new;

        p_new = percpu_add_return(p, 1);

        if (unlikely(!(p_new & pcp->stat_threshold_mask)))
                zone_page_state_add(pcp->stat_threshold, zone, item);
}

void inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
        ... assuming p references percpu "u8" counters ...
        u8 p_new;

        p_new = percpu_add_return_irqsafe(p, 1);

        if (unlikely(!(p_new & pcp->stat_threshold_mask)))
                zone_page_state_add(pcp->stat_threshold, zone, item);
}

(therefore opting for code duplication)

Am I correct ?

It can indeed be argued to be more straightforward than adding irq
disabling primitives which behave differently depending on the
architecture. And it certainly makes sense as long as duplicated
functions are small enough and as long as we never touch more than one
counter in the same function.

A compromise would be to include the appropriate irq disabling or
preempt disabling in the "standard version" of these functions, but
create underscore prefixed versions which would need
percpu_local_irqsave and friends to be done separately. Therefore, the
standard usage would be easy to deal with and review, and it would still
allow using the underscore-prefixed version when code would otherwise
have to be duplicated or when multiple counters must be updated at once.

And the rule would be simple enough :

- percpu_add_return_irqsafe is safe wrt interrupts.

- _always_ disable interrupts or use percpu_local_irqsave/restore
  around __percpu_add_return_irqsafe().

For the example above, this would let us write :

void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
        ... assuming p references percpu "u8" counters ...
        u8 p_new;

        p_new = __percpu_add_return_irqsafe(p, 1);

        if (unlikely(!(p_new & pcp->stat_threshold_mask)))
                zone_page_state_add(pcp->stat_threshold, zone, item);
}

void inc_zone_state(struct zone *zone, enum zone_stat_item item)
{
	unsigned long flags;

	percpu_local_irqsave(flags);
	__inc_zone_state(zone, item);
	percpu_local_irqrestore(flags);
}

Which is more compact and does not duplicate code.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 21:19                                     ` Mathieu Desnoyers
@ 2009-05-02  3:00                                       ` Christoph Lameter
  2009-05-02  7:01                                         ` Mathieu Desnoyers
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-02  3:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

On Fri, 1 May 2009, Mathieu Desnoyers wrote:

> Then, if I understand you correctly, you propose :
>
> void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> {
>         ... assuming p references percpu "u8" counters ...
>         u8 p_new;
>
>         p_new = percpu_add_return(p, 1);
>
>         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
>                 zone_page_state_add(pcp->stat_threshold, zone, item);
> }
>
> void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> {
>         ... assuming p references percpu "u8" counters ...
>         u8 p_new;
>
>         p_new = percpu_add_return_irqsafe(p, 1);
>
>         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
>                 zone_page_state_add(pcp->stat_threshold, zone, item);
> }
>
> (therefore opting for code duplication)
>
> Am I correct ?

Well __inc_zone_state is fine by itself. inc_zone_state will currently
disable irqs. But we can do it your way and duplicate the code.

> void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> {
>         ... assuming p references percpu "u8" counters ...
>         u8 p_new;
>
>         p_new = __percpu_add_return_irqsafe(p, 1);
>
>         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
>                 zone_page_state_add(pcp->stat_threshold, zone, item);
> }
>
> void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> {
> 	unsigned long flags;
>
> 	percpu_local_irqsave(flags);
> 	__inc_zone_state(zone, item);
> 	percpu_local_irqrestore(flags);
> }
>
> Which is more compact and does not duplicate code.

This is almost like the current code. But lets avoid percpu_local_irqs
etc if we can.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 21:07                                       ` Ingo Molnar
@ 2009-05-02  3:06                                         ` Christoph Lameter
  2009-05-02  9:03                                           ` Ingo Molnar
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Lameter @ 2009-05-02  3:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Fri, 1 May 2009, Ingo Molnar wrote:

> > I see but it may take some time for me to get that finished
> > (probably 2 weeks or so). Dont feel that you have to merge my old
> > patches or a variant thereof.
>
> it's your baby really and i dont want to interfere in any negative
> way. The patches look fairly complete and you did all the hard work
> with them already.

Ah I remember, you drowned that baby when you asked me to do lowercase
macros. That plus the job change made me lose most of my drive for the
patchset. No employer anymore that wants the stuff... Then there is the
scheduler regression and the network latency issues that prevent my new
project from running kernels after 2.6.22. The time that I have will mostly be
spent on that in the coming weeks. Cannot really push much into newer
kernels until we are able to run upstream kernels without significant loss
in performance.

We just tried 2.6.9. And hey performance is better. Looks like going back
in kernel versions is a way to improve performance. Simpler kernel,
lower cache footprint and it becomes faster. And the app is simply
doing network I/O and some number crunching. Sigh.

> Can help out with review and testing (as always) but i suspect it's
> more appropriate for -mm to carry them in the end, because of the MM
> and tree-wide impact, and because the fine differences between those
> APIs will only be truly visible on non-x86.

I'd be glad to review. But this is not far up in the priority list given
the unresolved regressions.

> ( There's also Tejun's latest improvements of the dynamic allocator
>   that he made based on your feedback - are those in any tree yet? )

No idea. Do not see them upstream.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-02  3:00                                       ` Christoph Lameter
@ 2009-05-02  7:01                                         ` Mathieu Desnoyers
  0 siblings, 0 replies; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-02  7:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo, Ingo Molnar,
	Linus Torvalds, Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > Then, if I understand you correctly, you propose :
> >
> > void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = percpu_add_return(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = percpu_add_return_irqsafe(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > (therefore opting for code duplication)
> >
> > Am I correct ?
> 
> Well __inc_zone_state is fine by itself. inc_zone_state will currently
> disable irqs. But we can do it your way and duplicate the code.
> 

OK, I think I see the source of disagreement here. Leaving
local_irq_save in place in inc_zone_state as you propose will degrade
performance significantly. x86 is not the only architecture that is
slow at disabling interrupts.

Let me get my hands on the numbers I've prepared for my upcoming paper
about tracer locking... here they are. These are in cycles.

Architecture                  Speedup           CAS        Interrupts
                        (cli+sti)/(local CAS)  local sync   sti   cli

AMD Athlon(tm)64 X2            4.57              7    17     17    15
Intel Core2                    6.33              6    30     20    18
Intel Xeon E5405               5.25              8    20     20    22
PowerPC G5                     4.00              1     2      3     1
PowerPC POWER6 4.2 GHz         1.77              9    17     14     2
Itanium 2 (single and
  dual-core, 1.6GHz)           1.33              3     3      2     2
UltraSPARC-IIIi (1.2GHz) (a)   0.64            0.394 0.394  0.094 0.159

        (a) in system bus clock cycles

I've also got numbers for atomic add return... it's usually a bit faster
than local CAS, except on architectures where atomic add return must be
emulated by a CAS (it's then very slightly slower). But in any case,
disabling/enabling interrupts is _WAY_ slower than local CAS or local
add return.

> > void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = __percpu_add_return_irqsafe(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> > 	unsigned long flags;
> >
> > 	percpu_local_irqsave(flags);
> > 	__inc_zone_state(zone, item);
> > 	percpu_local_irqrestore(flags);
> > }
> >
> > Which is more compact and does not duplicate code.
> 
> This is almost like the current code. But lets avoid percpu_local_irqs
> etc if we can.

If we want to get good performance out of those percpu ops, we have to
leave interrupts enabled. It's a factor ~5 slowdown otherwise on x86. So
the choice really comes down to : either we duplicate code, or we create
those percpu_local_irqsave-like primitives. BTW if the name is bad, we
may have to just find something better.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-02  3:06                                         ` Christoph Lameter
@ 2009-05-02  9:03                                           ` Ingo Molnar
  2009-05-04 14:48                                             ` Christoph Lameter
  0 siblings, 1 reply; 62+ messages in thread
From: Ingo Molnar @ 2009-05-02  9:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo


* Christoph Lameter <cl@linux.com> wrote:

> On Fri, 1 May 2009, Ingo Molnar wrote:
> 
> > > I see but it may take some time for me to get that finished
> > > (probably 2 weeks or so). Dont feel that you have to merge my old
> > > patches or a variant thereof.
> >
> > it's your baby really and i dont want to interfere in any negative
> > way. The patches look fairly complete and you did all the hard work
> > with them already.
> 
> Ah I remember, you drowned that baby when you asked me to do 
> lowercase macros. [...]

heh, had i not asked for that during review i would have been 
drowned by Linus later on. So it was a basic survival reflex really 
;-)

	Ingo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-01 19:31                           ` Christoph Lameter
  2009-05-01 20:24                             ` Mathieu Desnoyers
@ 2009-05-02 21:01                             ` Mathieu Desnoyers
  2009-05-04 14:08                               ` Christoph Lameter
  1 sibling, 1 reply; 62+ messages in thread
From: Mathieu Desnoyers @ 2009-05-02 21:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > What I do here is to let those u8 counters increment as free-running
> > counters. Yes, they will periodically overflow the 8 bits. But I don't
> > rely on this for counting the number of increments we need between
> > global counter updates : I use the bitmask taken from the threshold
> > value (which is now required to be a power of two) to detect 0, 1, 2, 3,
> > 4, 5, 6 or 7-bit counter overflow. Therefore we can still have the kind
> > of granularity currently provided. The only limitation is that we have
> > to use powers of two for the threshold, so we end up counting in power
> > of two modulo, which will be unaffected by the u8 overflow.
> 
> Ack. Got it. Looks good.
> 

Modifying mmzone.h "struct per_cpu_pageset" so it uses percpu dynamic
allocation seems to be the far-reaching part of the modification (of a
subsystem I'm not completely familiar with, including NUMA special
cases). Is there any patch already doing this kind of modification
floating around ?

I'd be glad to give a try at some percpu_add_return counters experiments
if percpu struct per_cpu_pageset allocation happens to be already
available.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-04-30  6:21     ` Ingo Molnar
  2009-04-30  6:33       ` [ltt-dev] " Mathieu Desnoyers
@ 2009-05-03  2:40       ` Tejun Heo
  2009-05-04 14:10         ` Christoph Lameter
  1 sibling, 1 reply; 62+ messages in thread
From: Tejun Heo @ 2009-05-03  2:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Linus Torvalds, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Christoph Lameter

Hello, Ingo.

Ingo Molnar wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
>> And thanks for the review! This exercise only convinced me that 
>> the kernel memory accounting works as expected. All this gave me 
>> the chance to have a good look at the memory accounting code. We 
>> could probably benefit of Christoph Lameter's cpu ops (using 
>> segment registers to address per-cpu variables with atomic 
>> inc/dec) in there. Or at least removing interrupt disabling by 
>> using preempt disable and local_t variables for the per-cpu 
>> counters could bring some benefit.
> 
> Note, optimized per cpu ops are already implemented upstream, by 
> Tejun Heo's percpu patches in .30:
> 
>  #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
>  #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
>  #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
>  #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
>  #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
>  #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
>  #define percpu_xor(var, val)	percpu_to_op("xor", per_cpu__##var, val)
> 
> See:
> 
>   6dbde35: percpu: add optimized generic percpu accessors

One problem I have with the above api is that those take the variable
symbol not pointer to it, so they can't be used with dynamic
variables.  The api needs major revisions anyway regarding atomicity
and I was planning on getting back to it once all archs have been
converted, hoping that it wouldn't be used widely before that.
But then again, it's not like changing the percpu api and its users is a
difficult thing to do, and there are a lot of benefits in testing how
things work as soon as possible.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-02 21:01                             ` Mathieu Desnoyers
@ 2009-05-04 14:08                               ` Christoph Lameter
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-05-04 14:08 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Nick Piggin, Peter Zijlstra, Linux Kernel Mailing List,
	Yuriy Lalym, ltt-dev, Tejun Heo, Ingo Molnar, Linus Torvalds,
	Andrew Morton

On Sat, 2 May 2009, Mathieu Desnoyers wrote:

> Modifying mmzone.h "struct per_cpu_pageset" so it uses percpu dynamic
> allocation seems to be the far-reaching part of the modification (of a
> subsystem I'm not completely familiar with, including NUMA special
> cases). Is there any patch already doing this kind of modification
> floating around ?

Yes. It was part of the original cpu alloc patchset. One reason to do cpu
alloc was to clean up the current problem of having to allocate for
NR_CPUS in the page allocator. That in turn would result in the ability to
use the fast percpu ops to optimize the allocator paths. Hopefully we can
end up with page allocator alloc/free operations that do not need to
disable interrupts.

> I'd be glad to give a try at some percpu_add_return counters experiments
> if percpu struct per_cpu_pageset allocation happens to be already
> available.

Look at the cpu alloc patchset following the link that I posted.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-03  2:40       ` Tejun Heo
@ 2009-05-04 14:10         ` Christoph Lameter
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-05-04 14:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ingo Molnar, Mathieu Desnoyers, Linus Torvalds, Andrew Morton,
	Nick Piggin, KOSAKI Motohiro, Peter Zijlstra, thomas.pi,
	Yuriy Lalym, Linux Kernel Mailing List, ltt-dev

On Sun, 3 May 2009, Tejun Heo wrote:

> One problem I have with the above api is that those take the variable
> symbol not pointer to it, so they can't be used with dynamic
> variables.  The api needs major revisions anyway regarding atomicity

Thanks. I missed making that point.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
  2009-05-02  9:03                                           ` Ingo Molnar
@ 2009-05-04 14:48                                             ` Christoph Lameter
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2009-05-04 14:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Mathieu Desnoyers, Andrew Morton, Nick Piggin,
	KOSAKI Motohiro, Peter Zijlstra, thomas.pi, Yuriy Lalym,
	Linux Kernel Mailing List, ltt-dev, Tejun Heo

On Sat, 2 May 2009, Ingo Molnar wrote:

> heh, had i not asked for that during review i would have been
> drowned by Linus later on. So it was a basic survival reflex really

There seemed to be basic agreement before on upper case being the
right solution. You were not the first to bring the issue up.

It is a matter of weighing the shouting effect of uppercase against the
alert it gives that wild things are going on in the percpu macros.


^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2009-05-04 14:59 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-29 23:25 [PATCH] Fix dirty page accounting in redirty_page_for_writepage() Mathieu Desnoyers
2009-04-29 23:56 ` Mathieu Desnoyers
2009-04-29 23:59 ` Andrew Morton
2009-04-30  2:34   ` Mathieu Desnoyers
2009-04-30  0:06 ` Linus Torvalds
2009-04-30  2:43   ` Mathieu Desnoyers
2009-04-30  6:21     ` Ingo Molnar
2009-04-30  6:33       ` [ltt-dev] " Mathieu Desnoyers
2009-04-30  6:50         ` Ingo Molnar
2009-04-30 13:38           ` Christoph Lameter
2009-04-30 14:10             ` Ingo Molnar
2009-04-30 14:12             ` Mathieu Desnoyers
2009-04-30 14:12               ` Christoph Lameter
2009-04-30 19:41                 ` Mathieu Desnoyers
2009-04-30 20:17                   ` Christoph Lameter
2009-04-30 21:17                     ` Mathieu Desnoyers
2009-05-01 13:44                       ` Christoph Lameter
2009-05-01 19:21                         ` Mathieu Desnoyers
2009-05-01 19:31                           ` Christoph Lameter
2009-05-01 20:24                             ` Mathieu Desnoyers
2009-05-01 20:28                               ` Christoph Lameter
2009-05-01 20:43                                 ` Mathieu Desnoyers
2009-05-01 20:42                                   ` Christoph Lameter
2009-05-01 21:19                                     ` Mathieu Desnoyers
2009-05-02  3:00                                       ` Christoph Lameter
2009-05-02  7:01                                         ` Mathieu Desnoyers
2009-05-02 21:01                             ` Mathieu Desnoyers
2009-05-04 14:08                               ` Christoph Lameter
2009-05-03  2:40       ` Tejun Heo
2009-05-04 14:10         ` Christoph Lameter
2009-04-30 13:22     ` Christoph Lameter
2009-04-30 13:38       ` Ingo Molnar
2009-04-30 13:40         ` Christoph Lameter
2009-04-30 14:14           ` Ingo Molnar
2009-04-30 14:15             ` Christoph Lameter
2009-04-30 14:38               ` Ingo Molnar
2009-04-30 14:45                 ` Christoph Lameter
2009-04-30 15:01                   ` Ingo Molnar
2009-04-30 15:25                     ` Christoph Lameter
2009-04-30 15:42                       ` Ingo Molnar
2009-04-30 15:44                         ` Christoph Lameter
2009-04-30 16:06                           ` Ingo Molnar
2009-04-30 16:11                             ` Christoph Lameter
2009-04-30 16:16                             ` Linus Torvalds
2009-04-30 17:23                               ` Ingo Molnar
2009-04-30 18:07                                 ` Christoph Lameter
2009-05-01 19:59                                   ` Ingo Molnar
2009-05-01 20:35                                     ` Christoph Lameter
2009-05-01 21:07                                       ` Ingo Molnar
2009-05-02  3:06                                         ` Christoph Lameter
2009-05-02  9:03                                           ` Ingo Molnar
2009-05-04 14:48                                             ` Christoph Lameter
2009-04-30 16:13                         ` Linus Torvalds
2009-04-30 15:54                       ` Ingo Molnar
2009-04-30 16:00                       ` Ingo Molnar
2009-04-30 16:08                         ` Christoph Lameter
2009-04-30 13:50         ` Mathieu Desnoyers
2009-04-30 13:55           ` Christoph Lameter
2009-04-30 14:32           ` Ingo Molnar
2009-04-30 14:42             ` Christoph Lameter
2009-04-30 14:59               ` Ingo Molnar
2009-04-30 16:03             ` [ltt-dev] " Mathieu Desnoyers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.