From: Hideo AOKI <haoki@redhat.com>
To: akpm@osdl.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
Date: Wed, 05 Apr 2006 19:47:27 -0400 [thread overview]
Message-ID: <4434570F.9030507@redhat.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 562 bytes --]
Hello Andrew,
Could you apply my patches to your tree?
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory(). The detailed description is in attached patch.
Actually, these are the revised patch which I sent to lkml in the last
year.
http://marc.theaimsgroup.com/?l=linux-kernel&m=112993489022427&w=2
I wrote a test kernel module to show the result of the patches.
For your information, I also would like to send the module in later e-mail.
Best regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: mm-add-totalreserve_pages.patch --]
[-- Type: text/x-patch, Size: 4960 bytes --]
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory().
- why the kernel needed patching
When the kernel can't allocate anonymous pages in practice, currnet
OVERCOMMIT_GUESS could return success. This implementation might be
the cause of oom kill in memory pressure situation.
If the Linux runs with page reservation features like
/proc/sys/vm/lowmem_reserve_ratio and without swap region, I think
the oom kill occurs easily.
- the overall design approach in the patch
When the OVERCOMMET_GUESS algorithm calculates number of free pages,
the reserved free pages are regarded as non-free pages.
This change helps to avoid the pitfall that the number of free pages
become less than the number which the kernel tries to keep free.
- testing results
I tested the patches using my test kernel module.
If the patches aren't applied to the kernel, __vm_enough_memory()
returns success in the situation but autual page allocation is
failed.
On the other hand, if the patches are applied to the kernel, memory
allocation failure is avoided since __vm_enough_memory() returns
failure in the situation.
I checked that on i386 SMP 16GB memory machine. I haven't tested on
nommu environment currently.
- changelog
v5:
- updated to 2.6.17-rc1-mm1
- did more strict tests.
- added the enhancement to mm/nommu.c too
v4:
- dealing with pages_high as reserved pages
- updated the code for 2.6.14-rc4-mm1
v3 (private):
- enhanced error handling in __vm_enough_memory
- fixed an issue related calculation of totalreserve_pages
v2 (private):
- fixed error handling bug
- updated test results
- updated the code for 2.6.14-rc2-mm2
This patch adds totalreserve_pages for __vm_enough_memory().
Calculate_totalreserve_pages() checks maximum lowmem_reserve pages and
pages_high in each zone. Finally, the function stores the sum of each
zone to totalreserve_pages.
The totalreserve_pages is calculated when the VM is initilized.
And the variable is updated when /proc/sys/vm/lowmem_reserve_raito
or /proc/sys/vm/min_free_kbytes are changed.
Signed-off-by: Hideo Aoki <haoki@redhat.com>
---
include/linux/swap.h | 1 +
mm/page_alloc.c | 39 +++++++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff -purN linux-2.6.17-rc1-mm1/include/linux/swap.h linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h
--- linux-2.6.17-rc1-mm1/include/linux/swap.h 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h 2006-04-04 15:13:26.000000000 -0400
@@ -155,6 +155,7 @@ extern void swapin_readahead(swp_entry_t
/* linux/mm/page_alloc.c */
extern unsigned long totalram_pages;
extern unsigned long totalhigh_pages;
+extern unsigned long totalreserve_pages;
extern long nr_swap_pages;
extern unsigned int nr_free_pages(void);
extern unsigned int nr_free_pages_pgdat(pg_data_t *pgdat);
diff -purN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c 2006-04-04 15:13:26.000000000 -0400
@@ -51,6 +51,7 @@ nodemask_t node_possible_map __read_most
EXPORT_SYMBOL(node_possible_map);
unsigned long totalram_pages __read_mostly;
unsigned long totalhigh_pages __read_mostly;
+unsigned long totalreserve_pages __read_mostly;
long nr_swap_pages;
int percpu_pagelist_fraction;
@@ -2548,6 +2549,38 @@ void __init page_alloc_init(void)
}
/*
+ * calculate_totalreserve_pages - called when sysctl_lower_zone_reserve_ratio
+ * or min_free_kbytes changes.
+ */
+static void calculate_totalreserve_pages(void)
+{
+ struct pglist_data *pgdat;
+ unsigned long reserve_pages = 0;
+ int i, j;
+
+ for_each_online_pgdat(pgdat) {
+ for (i = 0; i < MAX_NR_ZONES; i++) {
+ struct zone *zone = pgdat->node_zones + i;
+ unsigned long max = 0;
+
+ /* Find valid and maximum lowmem_reserve in the zone */
+ for (j = i; j < MAX_NR_ZONES; j++) {
+ if (zone->lowmem_reserve[j] > max)
+ max = zone->lowmem_reserve[j];
+ }
+
+ /* we treat pages_high as reserved pages. */
+ max += zone->pages_high;
+
+ if (max > zone->present_pages)
+ max = zone->present_pages;
+ reserve_pages += max;
+ }
+ }
+ totalreserve_pages = reserve_pages;
+}
+
+/*
* setup_per_zone_lowmem_reserve - called whenever
* sysctl_lower_zone_reserve_ratio changes. Ensures that each zone
* has a correct pages reserved value, so an adequate number of
@@ -2578,6 +2611,9 @@ static void setup_per_zone_lowmem_reserv
}
}
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
@@ -2632,6 +2668,9 @@ void setup_per_zone_pages_min(void)
zone->pages_high = zone->pages_min + tmp / 2;
spin_unlock_irqrestore(&zone->lru_lock, flags);
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
next reply other threads:[~2006-04-05 23:47 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-04-05 23:47 Hideo AOKI [this message]
2006-04-05 23:51 ` A test kernel module of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2006-04-06 7:20 ` Hideo AOKI
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
2006-04-07 11:49 ` Hideo AOKI
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4434570F.9030507@redhat.com \
--to=haoki@redhat.com \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).