linux-kernel.vger.kernel.org archive mirror
* [PATCH][1/2] adjust dirty threshold for lowmem-only mappings
@ 2004-12-20 15:15 Rik van Riel
From: Rik van Riel @ 2004-12-20 15:15 UTC
  To: Andrew Morton; +Cc: linux-kernel, Robert_Hentosh

Simply running "dd if=/dev/zero of=/dev/hd<one you can miss>" will
result in OOM kills, with the dirty pagecache completely filling up
lowmem.  This patch is part 1 of fixing that problem.

This patch effectively lowers the dirty limit for mappings which cannot
be cached in highmem, counting the dirty limit as a percentage of lowmem
instead.  This should prevent heavy block device writers from pushing
the VM over the edge and triggering OOM kills.

Signed-off-by: Rik van Riel <riel@redhat.com>


--- linux-2.6.9/mm/page-writeback.c.highmem	2004-12-16 11:22:48.193641312 -0500
+++ linux-2.6.9/mm/page-writeback.c	2004-12-16 11:30:00.565676290 -0500
@@ -133,18 +133,28 @@
   * clamping level.
   */
  static void
-get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty)
+get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty, struct address_space *mapping)
  {
  	int background_ratio;		/* Percentages */
  	int dirty_ratio;
  	int unmapped_ratio;
  	long background;
  	long dirty;
+	unsigned long available_memory = total_pages;
  	struct task_struct *tsk;

  	get_writeback_state(wbs);

-	unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
+#ifdef CONFIG_HIGHMEM
+	/*
+	 * If this mapping can only allocate from low memory,
+	 * we exclude high memory from our count.
+	 */
+	if (mapping && !(mapping_gfp_mask(mapping) & __GFP_HIGHMEM))
+		available_memory -= totalhigh_pages;
+#endif
+
+	unmapped_ratio = 100 - (wbs->nr_mapped * 100) / available_memory;

  	dirty_ratio = vm_dirty_ratio;
  	if (dirty_ratio > unmapped_ratio / 2)
@@ -194,7 +204,8 @@
  			.nr_to_write	= write_chunk,
  		};

-		get_dirty_limits(&wbs, &background_thresh, &dirty_thresh);
+		get_dirty_limits(&wbs, &background_thresh,
+					&dirty_thresh, mapping);
  		nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
  		if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
  			break;
@@ -210,7 +221,7 @@
  		if (nr_reclaimable) {
  			writeback_inodes(&wbc);
  			get_dirty_limits(&wbs, &background_thresh,
-					&dirty_thresh);
+					&dirty_thresh, mapping);
  			nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
  			if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
  				break;
@@ -283,7 +294,7 @@
  	long dirty_thresh;

          for ( ; ; ) {
-		get_dirty_limits(&wbs, &background_thresh, &dirty_thresh);
+		get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);

                  /*
                   * Boost the allowable dirty threshold a bit for page
@@ -318,7 +329,7 @@
  		long background_thresh;
  		long dirty_thresh;

-		get_dirty_limits(&wbs, &background_thresh, &dirty_thresh);
+		get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
  		if (wbs.nr_dirty + wbs.nr_unstable < background_thresh
  				&& min_pages <= 0)
  			break;
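
For readers who want to try the effect of the change outside the kernel,
here is a minimal userspace sketch of the calculation the patch adds to
get_dirty_limits().  It is a model, not kernel code: struct model_mapping,
HYPOTHETICAL_GFP_HIGHMEM and the two helper functions are names invented
for this illustration, standing in for struct address_space,
__GFP_HIGHMEM and the inline logic in the hunk above.

```c
#include <stddef.h>

/* Hypothetical stand-in for __GFP_HIGHMEM; the real value is irrelevant here. */
#define HYPOTHETICAL_GFP_HIGHMEM 0x02u

/* Hypothetical stand-in for struct address_space: only the gfp mask matters. */
struct model_mapping {
	unsigned int gfp_mask;
};

/*
 * Pages the dirty limits should be computed against.  A NULL mapping
 * (as passed by the callers that have no mapping at hand) keeps the
 * old behaviour and counts all memory.
 */
static unsigned long available_memory(const struct model_mapping *mapping,
				      unsigned long total_pages,
				      unsigned long totalhigh_pages)
{
	unsigned long available = total_pages;

	/* A mapping that cannot allocate from highmem only gets lowmem. */
	if (mapping && !(mapping->gfp_mask & HYPOTHETICAL_GFP_HIGHMEM))
		available -= totalhigh_pages;
	return available;
}

/* Same integer arithmetic as the patched unmapped_ratio line; available must be nonzero. */
static int unmapped_ratio(unsigned long nr_mapped, unsigned long available)
{
	return 100 - (int)((nr_mapped * 100) / available);
}
```

With the same number of mapped pages, a lowmem-only mapping sees a
smaller available_memory and therefore a lower unmapped_ratio, which in
turn can clamp dirty_ratio further down.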

* RE: [PATCH][1/2] adjust dirty threshold for lowmem-only mappings
@ 2004-12-20 16:46 Robert_Hentosh
From: Robert_Hentosh @ 2004-12-20 16:46 UTC
  To: riel, akpm; +Cc: linux-kernel



> On Mon, 20 Dec 2004, Rik van Riel wrote:
>
>> Simply running "dd if=/dev/zero of=/dev/hd<one you can miss>"
>> will result in OOM kills, with the dirty pagecache
>> completely filling up lowmem.  This patch is part 1 to
>> fixing that problem.
>
> What I forgot to say is that in order to trigger this OOM
> kill the dirty_limit of 40% needs to be more memory than
> what fits in low memory.  So this will work on x86 with
> 4GB RAM, since the dirty_limit is 1.6GB, but the block
> device cache cannot grow that big because it is restricted
> to low memory.
>
> This has the effect of all low memory being tied up in
> dirty page cache and userspace try_to_free_pages() skipping
> the writeout of these pages because the block device is
> congested.
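
The arithmetic in the quoted explanation can be checked with a small
sketch.  The 4GB and 40% figures come from the text above; the 896MB
lowmem size is an assumption (the usual figure for the default 3G/1G
split on 32-bit x86), and the MB() macro and both helper functions are
hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Megabytes to bytes, kept in 64 bits so 4GB does not overflow. */
#define MB(x) ((uint64_t)(x) << 20)

/* Dirty limit in bytes when taken as ratio_pct percent of base bytes. */
static uint64_t dirty_limit_bytes(uint64_t base, unsigned int ratio_pct)
{
	return base / 100 * ratio_pct;
}

/*
 * Nonzero when a limit computed from total memory is larger than
 * lowmem, i.e. a lowmem-only mapping can fill lowmem with dirty
 * pages before the limit is ever reached.
 */
static int limit_exceeds_lowmem(uint64_t total, uint64_t lowmem,
				unsigned int ratio_pct)
{
	return dirty_limit_bytes(total, ratio_pct) > lowmem;
}
```

Under these assumptions limit_exceeds_lowmem(MB(4096), MB(896), 40) is
true, since 40% of 4GB is about 1.6GB.  Computing the limit against
lowmem instead, as the patch does, gives roughly 358MB, which fits.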

I am just confirming that this is a real problem.  It shows up
more frequently when dd is run with block sizes above 4k, and it
also showed up on some platforms with just a mke2fs on a slower
device such as a USB hard drive.

Rik's patch has solved the issue and has been running under
stress (via ctcs) over the weekend without failure.  

Regards,
Robert



