* [RFC PATCH] mm: let the bdi_writeout fraction respond more quickly
@ 2010-06-14 13:58 Richard Kennedy
  2010-06-14 14:44   ` Richard Kennedy
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Kennedy @ 2010-06-14 13:58 UTC (permalink / raw)
  To: Jens Axboe, Peter Zijlstra, Andrew Morton, Wu Fengguang; +Cc: lkml, linux-mm


Hi all,
The fraction of vm cache allowed to each BDI, as calculated by
get_dirty_limits (mm/page-writeback.c), responds very slowly to changes
in workload.

Running a simple test that alternately writes 1GB to sda then sdb,
twice, shows the bdi_threshold taking approximately 15 seconds to reach
a steady state value. This prevents an application from using all of the
available cache and forces it to write to the physical disk earlier than
strictly necessary.
As you can see from the attached graph, bdi_thresh_before.png, our
current control system responds to this kind of workload very slowly.

The patch below speeds up the recalculation and lets it reach a steady
state value in a couple of seconds. See bdi_thresh_after.png.

I get better throughput with this patch applied and have been running
some variation of this on and off for some months without any obvious
problems.

(These tests were all run on 2.6.35-rc3,
where dm-2 is a SATA drive with LVM/ext4 and sdb is an IDE drive with ext4.
I've got lots more results and graphs but won't bore you all with
them ;) )

I see this as a considerable improvement, but I found the magic
number of -4 empirically, so it may just be tuned to my system. I'm not
sure how to decide on a value that is suitable for everyone.

Does anyone have any suggestions or thoughts?

Unfortunately I don't have any other hardware to try this on, so I would
be very interested to hear if anyone tries this on their favourite
workload.

regards
Richard
 
patch against 2.6.35-rc3

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2fdda90..315dd04 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -144,7 +144,7 @@ static int calc_period_shift(void)
 	else
 		dirty_total = (vm_dirty_ratio * determine_dirtyable_memory()) /
 				100;
-	return 2 + ilog2(dirty_total - 1);
+	return ilog2(dirty_total - 1) - 4;
 }
 
 /*


[-- Attachment #2: bdi_thresh_before.png --]
[-- Type: image/png, Size: 4098 bytes --]

[-- Attachment #3: bdi_thresh_after.png --]
[-- Type: image/png, Size: 2398 bytes --]


Thread overview: 11+ messages
2010-06-14 13:58 [RFC PATCH] mm: let the bdi_writeout fraction respond more quickly Richard Kennedy
2010-06-14 14:44 ` Richard Kennedy
2010-06-16 18:54   ` Peter Zijlstra
2010-06-17 11:39     ` Richard Kennedy
2010-06-17 11:41       ` Jens Axboe
2010-06-17 18:45         ` Richard Kennedy
