* [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
@ 2006-04-05 23:47 Hideo AOKI
2006-04-05 23:51 ` A test kernel module " Hideo AOKI
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:47 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 562 bytes --]
Hello Andrew,
Could you apply my patches to your tree?
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory(). The detailed description is in attached patch.
These are a revised version of the patches which I sent to lkml last
year.
http://marc.theaimsgroup.com/?l=linux-kernel&m=112993489022427&w=2
I wrote a test kernel module to show the effect of the patches.
For your information, I will send the module in a later e-mail.
Best regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: mm-add-totalreserve_pages.patch --]
[-- Type: text/x-patch, Size: 4960 bytes --]
These patches enhance the OVERCOMMIT_GUESS algorithm in
__vm_enough_memory().
- why the kernel needed patching
The current OVERCOMMIT_GUESS can return success even when the kernel
cannot actually allocate anonymous pages. This behavior might be
the cause of OOM kills under memory pressure.
If Linux runs with page reservation features like
/proc/sys/vm/lowmem_reserve_ratio and without a swap region, I think
OOM kills occur easily.
- the overall design approach in the patch
When the OVERCOMMIT_GUESS algorithm calculates the number of free pages,
the reserved free pages are regarded as non-free pages.
This change helps to avoid the pitfall that the number of free pages
becomes less than the number which the kernel tries to keep free.
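A rough user-space sketch of this idea follows. It is an illustration
only: the function name guess_has_room() and its arguments are
hypothetical, and the real check in __vm_enough_memory() also counts
page cache, swap and reclaimable slab.

```c
/*
 * Hypothetical user-space model, not kernel code.
 * Returns 1 if the request fits into the free pages that remain
 * after the reserved pages are treated as non-free, 0 otherwise.
 */
static int guess_has_room(unsigned long free_pages,
			  unsigned long reserve_pages,
			  unsigned long request)
{
	/* Reserved free pages are regarded as non-free pages. */
	unsigned long usable = free_pages > reserve_pages ?
			       free_pages - reserve_pages : 0;

	return usable >= request;
}
```

With made-up numbers, guess_has_room(2816, 0, 100) reports room, while
guess_has_room(2816, 2800, 100) does not, because only 16 pages remain
above the reserve.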
- testing results
I tested the patches using my test kernel module.
If the patches aren't applied to the kernel, __vm_enough_memory()
returns success in this situation but the actual page allocation
fails.
On the other hand, if the patches are applied to the kernel, the memory
allocation failure is avoided since __vm_enough_memory() returns
failure in this situation.
I checked this on an i386 SMP machine with 16GB of memory. I haven't yet
tested a nommu environment.
- changelog
v5:
- updated to 2.6.17-rc1-mm1
- ran stricter tests
- added the enhancement to mm/nommu.c too
v4:
- dealing with pages_high as reserved pages
- updated the code for 2.6.14-rc4-mm1
v3 (private):
- enhanced error handling in __vm_enough_memory
- fixed an issue in the calculation of totalreserve_pages
v2 (private):
- fixed error handling bug
- updated test results
- updated the code for 2.6.14-rc2-mm2
This patch adds totalreserve_pages for __vm_enough_memory().
calculate_totalreserve_pages() takes the maximum lowmem_reserve value
and pages_high in each zone, and stores the sum over all zones in
totalreserve_pages.
totalreserve_pages is calculated when the VM is initialized.
The variable is also updated whenever /proc/sys/vm/lowmem_reserve_ratio
or /proc/sys/vm/min_free_kbytes is changed.
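The per-zone arithmetic can be modelled in user space as below. This is
a sketch under stated assumptions, not the patch itself: struct
model_zone, MODEL_NR_ZONES and model_totalreserve() are hypothetical
names, only a single node is modelled, and the figures in the note
after the code are made up.

```c
#include <stddef.h>

#define MODEL_NR_ZONES 3	/* hypothetical: DMA, Normal, HighMem */

/* Hypothetical stand-in for the struct zone fields the function reads. */
struct model_zone {
	unsigned long lowmem_reserve[MODEL_NR_ZONES];
	unsigned long pages_high;
	unsigned long present_pages;
};

/* Per zone: max(lowmem_reserve[i..]) + pages_high, capped at present_pages. */
static unsigned long model_totalreserve(const struct model_zone *zones)
{
	unsigned long total = 0;
	size_t i, j;

	for (i = 0; i < MODEL_NR_ZONES; i++) {
		unsigned long max = 0;

		/* Find the largest lowmem_reserve entry for this zone. */
		for (j = i; j < MODEL_NR_ZONES; j++)
			if (zones[i].lowmem_reserve[j] > max)
				max = zones[i].lowmem_reserve[j];

		/* pages_high is treated as reserved as well. */
		max += zones[i].pages_high;

		/* A zone cannot reserve more pages than it has. */
		if (max > zones[i].present_pages)
			max = zones[i].present_pages;
		total += max;
	}
	return total;
}
```

For example, three zones with (max lowmem_reserve, pages_high) of
(200, 30), (1000, 1400) and (0, 1400) contribute 230 + 2400 + 1400 =
4030 pages, provided each zone has at least that many present pages.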
Signed-off-by: Hideo Aoki <haoki@redhat.com>
---
include/linux/swap.h | 1 +
mm/page_alloc.c | 39 +++++++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff -purN linux-2.6.17-rc1-mm1/include/linux/swap.h linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h
--- linux-2.6.17-rc1-mm1/include/linux/swap.h 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h 2006-04-04 15:13:26.000000000 -0400
@@ -155,6 +155,7 @@ extern void swapin_readahead(swp_entry_t
/* linux/mm/page_alloc.c */
extern unsigned long totalram_pages;
extern unsigned long totalhigh_pages;
+extern unsigned long totalreserve_pages;
extern long nr_swap_pages;
extern unsigned int nr_free_pages(void);
extern unsigned int nr_free_pages_pgdat(pg_data_t *pgdat);
diff -purN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c 2006-04-04 15:13:26.000000000 -0400
@@ -51,6 +51,7 @@ nodemask_t node_possible_map __read_most
EXPORT_SYMBOL(node_possible_map);
unsigned long totalram_pages __read_mostly;
unsigned long totalhigh_pages __read_mostly;
+unsigned long totalreserve_pages __read_mostly;
long nr_swap_pages;
int percpu_pagelist_fraction;
@@ -2548,6 +2549,38 @@ void __init page_alloc_init(void)
}
/*
+ * calculate_totalreserve_pages - called when sysctl_lower_zone_reserve_ratio
+ * or min_free_kbytes changes.
+ */
+static void calculate_totalreserve_pages(void)
+{
+ struct pglist_data *pgdat;
+ unsigned long reserve_pages = 0;
+ int i, j;
+
+ for_each_online_pgdat(pgdat) {
+ for (i = 0; i < MAX_NR_ZONES; i++) {
+ struct zone *zone = pgdat->node_zones + i;
+ unsigned long max = 0;
+
+ /* Find valid and maximum lowmem_reserve in the zone */
+ for (j = i; j < MAX_NR_ZONES; j++) {
+ if (zone->lowmem_reserve[j] > max)
+ max = zone->lowmem_reserve[j];
+ }
+
+ /* we treat pages_high as reserved pages. */
+ max += zone->pages_high;
+
+ if (max > zone->present_pages)
+ max = zone->present_pages;
+ reserve_pages += max;
+ }
+ }
+ totalreserve_pages = reserve_pages;
+}
+
+/*
* setup_per_zone_lowmem_reserve - called whenever
* sysctl_lower_zone_reserve_ratio changes. Ensures that each zone
* has a correct pages reserved value, so an adequate number of
@@ -2578,6 +2611,9 @@ static void setup_per_zone_lowmem_reserv
}
}
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
@@ -2632,6 +2668,9 @@ void setup_per_zone_pages_min(void)
zone->pages_high = zone->pages_min + tmp / 2;
spin_unlock_irqrestore(&zone->lru_lock, flags);
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
* A test kernel module of OVERCOMMIT_GUESS
2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
@ 2006-04-05 23:51 ` Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:51 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 978 bytes --]
Hello Andrew,
This is a kernel module patch which I developed to test my patches.
The module creates a memory pressure situation and then tests whether
OVERCOMMIT_GUESS detects the overcommit.
The module has a "mode" option. If you specify "mode=1", the module
tries to allocate pages in the test phase.
Here is the result of the "mode=1" test on my machine.
* 2.6.17-rc1-mm1
kernel: Test MAY be <failed>.
kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
kernel: Test SURELY was <FAILED>.
* 2.6.17-rc1-mm1 + my patches
kernel: Test was <PASSED>.
Unfortunately, this kernel module needs another kernel patch.
I will send it in a later e-mail.
Please note that I don't intend to propose the module for inclusion in
the kernel tree.
Best regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: mm-test_overcommit.patch --]
[-- Type: text/x-patch, Size: 14341 bytes --]
A kernel module to test OVERCOMMIT_GUESS.
lib/Kconfig.debug | 11 +
mm/Makefile | 1
mm/test_overcommit.c | 515 +++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 527 insertions(+)
diff -purN linux-2.6.17-rc1-mm1/lib/Kconfig.debug linux-2.6.17-rc1-mm1-test1/lib/Kconfig.debug
--- linux-2.6.17-rc1-mm1/lib/Kconfig.debug 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/lib/Kconfig.debug 2006-04-04 16:30:31.000000000 -0400
@@ -282,6 +282,17 @@ config RCU_TORTURE_TEST
Say M if you want the RCU torture tests to build as a module.
Say N if you are unsure.
+config OVERCOMMIT_GUESS_TEST
+ tristate "An overcommit guess testing module"
+ depends on DEBUG_KERNEL && X86
+ default n
+ help
+ This option provides a kernel module that can test OVERCOMMIT_GUESS
+ in __vm_enough_memory().
+
+ You should say N or M here. Say M if you want to build the module.
+ Say N if you are unsure.
+
config WANT_EXTRA_DEBUG_INFORMATION
bool
select DEBUG_INFO
diff -purN linux-2.6.17-rc1-mm1/mm/Makefile linux-2.6.17-rc1-mm1-test1/mm/Makefile
--- linux-2.6.17-rc1-mm1/mm/Makefile 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/Makefile 2006-04-04 16:23:09.000000000 -0400
@@ -24,4 +24,5 @@ obj-$(CONFIG_SLAB) += slab.o
obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
obj-$(CONFIG_FS_XIP) += filemap_xip.o
obj-$(CONFIG_MIGRATION) += migrate.o
+obj-$(CONFIG_OVERCOMMIT_GUESS_TEST) += test_overcommit.o
diff -purN linux-2.6.17-rc1-mm1/mm/test_overcommit.c linux-2.6.17-rc1-mm1-test1/mm/test_overcommit.c
--- linux-2.6.17-rc1-mm1/mm/test_overcommit.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6.17-rc1-mm1-test1/mm/test_overcommit.c 2006-04-04 16:23:09.000000000 -0400
@@ -0,0 +1,515 @@
+/*
+ * A kernel module for testing OVERCOMMIT_GUESS. ver 0.0.1
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ * MA 02110-1301, USA.
+ */
+/*
+ * This kernel module tests the latter part of the OVERCOMMIT_GUESS algorithm.
+ * To test the algorithm, the module creates a memory pressure situation.
+ * After that, the module tests whether the algorithm detects overcommit.
+ *
+ * You can specify by "mode" module option if the module actually tries to
+ * allocate pages in the situation.
+ *
+ * Currently you have to apply a kernel patch to use the module, since the
+ * module needs to refer to some internal symbols for testing.
+ *
+ * The module was tested only on i386 SMP machines.
+ */
+#include <linux/module.h> /* Needed by all modules */
+#include <linux/moduleparam.h>
+#include <linux/kernel.h> /* Needed for KERN_INFO */
+
+#include <linux/gfp.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/vmalloc.h>
+
+#include <asm/kmap_types.h>
+#include <asm/system.h>
+#include <linux/blkdev.h>
+#include <linux/string.h>
+
+/*
+ * Get rid of taint message by declaring code as GPL v2.
+ */
+MODULE_LICENSE("GPL v2");
+
+/*
+ * Module documentation
+ */
+MODULE_AUTHOR("Hideo Aoki");
+MODULE_DESCRIPTION("test module of overcommit guess");
+
+/*
+ * Module parameter(s)
+ */
+enum { MODE_DEFALUT = 0, MODE_DOLASTALLOC = 1 };
+static int mode = MODE_DEFALUT;
+
+module_param(mode, int, S_IRUSR);
+MODULE_PARM_DESC(mode, "test mode. If mode is 1, the module executes actual page allocation in concrete test.");
+
+/*
+ * declarations
+ */
+#define MOD_MM 1 /* modify kernel */
+#define DEF_PRT_LV KERN_DEBUG /* default print level in this mod. */
+
+/*
+ * Page management: We manage allocated pages. When the module is unloaded,
+ * it releases them.
+ */
+struct page_mng_t {
+ struct page *listp; /* list of allocated pages */
+ int zone; /* which zone manage */
+};
+
+/* We use head of allocated pages for managing page list */
+struct mypage_t {
+ struct page *next_page;
+ int order; /* page order */
+};
+
+
+enum {
+ NORMAL, HIGH
+};
+
+static struct page_mng_t low_pg; /* LOWMEM pages management */
+static struct page_mng_t high_pg;/* HIGHMEM pages management */
+static unsigned long committed_pages;/* number of pages committed in the mod.*/
+
+void prepare_test_env(void);
+void final_test(int mode);
+int commit_pages(unsigned long target, struct page_mng_t *mng, int gfp,
+ char *msg);
+int try_to_commit_pages(int order, int gfp, struct page_mng_t *mng,
+ unsigned long *n_alloced);
+int alloc_last_pages(unsigned long npage);
+
+/* allocated page management */
+int page_mng_init(struct page_mng_t *mng, int gfp);
+void page_mng_add(struct page_mng_t *mng, struct page *newp, int order);
+void page_mng_freeall(struct page_mng_t *mng);
+
+/* zone */
+struct zone *get_zone(char *name);
+void print_zone_info(struct zone *zone, char *msg);
+
+
+/*
+ * initialize module and invoke test functions
+ */
+int init_module(void)
+{
+ int ret;
+
+ printk(DEF_PRT_LV "Test module was loaded. <mode %d>\n", mode);
+
+ if (mode != MODE_DEFALUT && mode != MODE_DOLASTALLOC) {
+ printk(KERN_ERR "invalid mode. <mode %d>\n", mode);
+ return 1;
+ }
+
+ /*
+ * making a memory pressure situation
+ */
+ printk(DEF_PRT_LV "init ...");
+ committed_pages = 0;
+ ret = page_mng_init(&low_pg, GFP_KERNEL);
+ if (ret == 1) {
+ printk(KERN_ERR "failed to init lowmem mng\n");
+ return 1;
+ }
+ ret = page_mng_init(&high_pg, GFP_HIGHUSER);
+ if (ret == 1) {
+ printk(KERN_ERR "failed to init highmem mng\n");
+ return 1;
+ }
+ printk(DEF_PRT_LV "done\n");
+
+ prepare_test_env();
+
+ /*
+ * concrete test
+ */
+ printk(DEF_PRT_LV "concrete test ...\n");
+ final_test(mode);
+ printk(DEF_PRT_LV "concrete test ...done.\n");
+
+ /*
+ * A non 0 return means init_module failed; module can't be loaded.
+ */
+ return 0;
+}
+
+
+/*
+ * destructor of module
+ */
+void cleanup_module(void)
+{
+ printk(DEF_PRT_LV "Unloading module ...\n");
+ page_mng_freeall(&low_pg);
+ page_mng_freeall(&high_pg);
+ vm_unacct_memory(committed_pages);
+}
+
+
+/*
+ * To prepare the test environment, this function repeatedly allocates pages
+ * in ZONE_HIGHMEM and ZONE_NORMAL until the number of free pages in each
+ * zone drops to pages_high.
+ */
+void prepare_test_env(void)
+{
+ struct zone *high_zone; /* high mem zone */
+ struct zone *normal_zone; /* normal zone */
+ int i;
+ int ret;
+ unsigned long target;
+
+ high_zone = get_zone("HighMem");
+ if (high_zone == NULL) {
+ printk(KERN_ERR "failed to get highmem zone\n");
+ return ;
+ }
+ for (i = 0; i < 100; i++) {
+ if (high_zone->free_pages <= high_zone->pages_high) {
+ printk(DEF_PRT_LV "already satisfied\n");
+ break;
+ }
+
+ spin_lock_irq(&high_zone->lru_lock);
+ target = high_zone->free_pages - high_zone->pages_high;
+ spin_unlock_irq(&high_zone->lru_lock);
+ print_zone_info(high_zone, "HIGH");
+ ret = commit_pages(target, &high_pg, GFP_HIGHUSER, "HighMem");
+ if (ret < 0) {
+ printk(KERN_ERR "error high %i\n", i);
+ goto error;
+ }
+
+ print_zone_info(high_zone, "HIGH");
+ /* printk(DEF_PRT_LV "%i\n", i); */
+ blk_congestion_wait(WRITE, HZ/2);
+ }
+
+ normal_zone = get_zone("Normal");
+ if (normal_zone == NULL) {
+ printk(KERN_ERR "fail to get normal zone\n");
+ return ;
+ }
+ for (i = 0; i < 100; i++) {
+ if (normal_zone->free_pages <= normal_zone->pages_high) {
+ printk(DEF_PRT_LV "already satisfied\n");
+ break;
+ }
+
+ spin_lock_irq(&normal_zone->lru_lock);
+ target = normal_zone->free_pages - normal_zone->pages_high;
+ spin_unlock_irq(&normal_zone->lru_lock);
+ print_zone_info(normal_zone, "NORMAL");
+ ret = commit_pages(target, &low_pg, GFP_KERNEL, "Normal");
+ if (ret < 0) {
+ printk(KERN_ERR "error %i\n", i);
+ goto error;
+ }
+
+ print_zone_info(normal_zone, "NORMAL");
+ /* printk(DEF_PRT_LV "%i\n", i); */
+ blk_congestion_wait(WRITE, HZ/2);
+ }
+
+error:
+ return;
+}
+
+/*
+ * main test function
+ */
+void final_test(int mode)
+{
+ int ret;
+ unsigned long n;
+ unsigned long nbuffer;
+ unsigned long ncache;
+ unsigned long nmargin;
+#if MOD_MM
+ unsigned long nslabrec;
+ unsigned long nswap;
+#endif
+ struct sysinfo info;
+
+
+ for (nmargin = 1; nmargin < 100000; nmargin *= 10) {
+ si_meminfo(&info);
+ nbuffer = info.bufferram;
+ ncache = get_page_cache_size();
+#if MOD_MM
+ nslabrec = atomic_read(&slab_reclaim_pages);
+ nswap = nr_swap_pages;
+ n = ncache + nslabrec + nmargin + nswap;
+ printk(DEF_PRT_LV "<buf %lu><cache %lu><slab reclaim %lu><swap %lu> <+ %lu> <target %lu>\n",
+ nbuffer, ncache, nslabrec, nswap, nmargin, n);
+#else
+ n = ncache + nmargin;
+ printk(DEF_PRT_LV "<buf %lu> <cache %lu> <+ %lu> <target %lu>\n",
+ nbuffer, ncache, nmargin, n);
+#endif
+
+ ret = __vm_enough_memory(n, 0);
+ if (ret != 0) {
+ printk(KERN_ERR "Test was <PASSED>.\n");
+ break ;
+ }
+
+ /* unexpected result */
+ committed_pages += n;
+ if (mode == MODE_DOLASTALLOC) {
+ printk(KERN_ERR "Test MAY be <failed>.\n");
+ ret = alloc_last_pages(n);
+ if (ret == 0) {
+ printk(KERN_ERR "Test module has a problem\n");
+ } else {
+ printk(KERN_ERR "Test SURELY was <FAILED>.\n");
+ break ;
+ }
+ } else {
+ printk(KERN_ERR "Test was <FAILED>\n");
+ }
+ }
+}
+
+int commit_pages(unsigned long target, struct page_mng_t *mng, int gfp,
+ char *msg)
+{
+ int ret;
+ unsigned long total;
+ unsigned long npage;
+
+ if (target == 0 || mng == NULL)
+ goto error;
+
+ /*
+ * try to commit anonymous pages
+ */
+ total = 0;
+ while (total < target) {
+ ret = try_to_commit_pages(0, gfp, mng, &npage);
+ if (ret == 1) {
+ printk(DEF_PRT_LV "%s test stopped. <target %lu>\n",
+ msg, target);
+ return 1;
+ } else if (ret == 0) {
+ total += npage;
+ /* printk(KERN_ERR "p"); */
+ } else if (ret == -1) {
+ printk(KERN_ERR "error %s overcommit.\n", msg);
+ goto error;
+ } else {
+ printk(KERN_ERR "error %s test environment.\n", msg);
+ goto error;
+ }
+ }
+ printk(DEF_PRT_LV "%s <target %lu>, ", msg, target);
+
+ return 0;
+
+ error:
+ return -1;
+}
+
+/*
+ * ret: 1: success (overcommit was detected)
+ * 0: success
+ * -1: error (overcommit was not detected)
+ */
+int try_to_commit_pages(int order, int gfp, struct page_mng_t *mng,
+ unsigned long *n_alloced)
+{
+ int ret;
+ long n_pages;
+ struct page *p;
+
+ n_pages = 1L << order;
+ /* printk(KERN_ERR "<order %d>, <pages %ld>\n,", order, n_pages); */
+
+ *n_alloced = 0;
+
+ ret = __vm_enough_memory(n_pages, 0);
+ if (ret != 0) {
+ printk(KERN_ERR "<order %d>, <pages %ld> ", order, n_pages);
+ printk(KERN_ERR "overcommit was detected.\n");
+ return 1;
+ }
+ committed_pages += n_pages;
+
+ p = alloc_pages(gfp, order);
+ if (p == NULL) {
+ /* error */
+ printk(KERN_ERR "<order %d>, <pages %ld> ", order, n_pages);
+ printk(KERN_ERR "allocation failed\n");
+ return -1;
+ } else {
+ page_mng_add(mng, p, order);
+ *n_alloced += n_pages;
+ /*
+ * printk(KERN_ERR "<order %d>, <pages %ld> <alloced %lu> ",
+ * order, n_pages, *n_alloced);
+ * printk(KERN_ERR " succeed\n");
+ */
+ return 0;
+ }
+}
+
+/*
+ * 0: success, 1: failure
+ */
+int alloc_last_pages(unsigned long npage)
+{
+ int i;
+ int ret;
+ void *mem;
+ struct mypage2_t {
+ struct mypage2_t *next;
+ } *endp, *listp, *p, *nextp;
+
+ /*
+ * allocation scenario 1
+ */
+ mem = vmalloc(npage * PAGE_SIZE);
+ if (mem != NULL) {
+ printk(KERN_ERR "TEST MODULE HAS PROBLEMS.\n");
+ vfree(mem);
+ return 0;
+ }
+
+ /*
+ * allocation scenario 2
+ */
+ for (listp = endp = NULL, i = 0; i < npage; i++) {
+ p = (struct mypage2_t *)vmalloc(PAGE_SIZE);
+ if (p == NULL) {
+ ret = 1;
+ goto release;
+ }
+
+ p->next = NULL;
+ if (listp == NULL)
+ listp = p;
+ else
+ endp->next = p;
+
+ endp = p;
+ }
+
+ printk(KERN_ERR "TEST MODULE HAS PROBLEMS.\n");
+ ret = 0;
+
+release:
+ for ( ; listp != NULL; listp = nextp) {
+ nextp = listp->next;
+ vfree(listp);
+ }
+
+ return ret;
+}
+
+/*
+ * ret: 1: error, 0: success
+ */
+int page_mng_init(struct page_mng_t *mng, int gfp)
+{
+ mng->listp = NULL;
+ if (gfp == GFP_HIGHUSER)
+ mng->zone = HIGH;
+ else if (gfp == GFP_KERNEL || gfp == GFP_USER)
+ mng->zone = NORMAL;
+ else
+ return 1;
+
+ return 0;
+}
+
+void page_mng_add(struct page_mng_t *mng, struct page *newp, int order)
+{
+ struct mypage_t *p;
+
+ if (mng->zone != NORMAL && mng->zone != HIGH) {
+ printk(KERN_ERR "PAGE_MNG: ERROR \n");
+ return ;
+ }
+
+ p = (struct mypage_t *)kmap_atomic(newp, KM_TYPE_NR);
+ p->order = order;
+ p->next_page = mng->listp;
+ mng->listp = newp;
+
+ kunmap_atomic((void *)p, KM_TYPE_NR);
+}
+
+void page_mng_freeall(struct page_mng_t *mng)
+{
+ int order;
+ struct page *next;
+ struct mypage_t *p;
+
+ for ( ; mng->listp != NULL; mng->listp = next) {
+ p = (struct mypage_t *)kmap_atomic(mng->listp, KM_TYPE_NR);
+ next = p->next_page;
+ order = p->order;
+ kunmap_atomic((void *)p, KM_TYPE_NR);
+ __free_pages(mng->listp, order);
+ }
+}
+
+struct zone *get_zone(char *name)
+{
+ struct zone *zone;
+ for_each_zone(zone) {
+ if (strcmp(zone->name, name) == 0)
+ return zone;
+ }
+
+ return NULL;
+}
+
+void print_zone_info(struct zone *zone, char *msg)
+{
+ unsigned long free_pages;
+ unsigned long nr_active;
+ unsigned long nr_inactive;
+ unsigned long present_pages;
+
+ spin_lock_irq(&zone->lru_lock);
+ free_pages = zone->free_pages;
+ nr_active = zone->nr_active;
+ nr_inactive = zone->nr_inactive;
+ present_pages = zone->present_pages;
+ spin_unlock_irq(&zone->lru_lock);
+
+ printk(DEF_PRT_LV "\n%s: <active %lu><inactive %lu><free %lu><sum %lu><present %lu>\n",
+ msg,
+ nr_active,
+ nr_inactive,
+ free_pages,
+ nr_active + nr_inactive + free_pages,
+ present_pages);
+}
* A patch for test_overcommit module
2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:51 ` A test kernel module " Hideo AOKI
@ 2006-04-05 23:52 ` Hideo AOKI
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:52 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 161 bytes --]
The test module needs this kernel patch.
I don't intend to propose this patch for inclusion in the kernel tree.
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: mm-debug.patch --]
[-- Type: text/x-patch, Size: 2086 bytes --]
This patch exports the symbols which the test kernel module has to refer to.
include/linux/mman.h | 1 +
mm/page_alloc.c | 1 +
mm/slab.c | 1 +
mm/swap.c | 1 +
4 files changed, 4 insertions(+)
diff -pruN linux-2.6.17-rc1-mm1/include/linux/mman.h linux-2.6.17-rc1-mm1-test1/include/linux/mman.h
--- linux-2.6.17-rc1-mm1/include/linux/mman.h 2006-03-24 12:40:19.000000000 -0500
+++ linux-2.6.17-rc1-mm1-test1/include/linux/mman.h 2006-04-04 12:53:52.000000000 -0400
@@ -24,6 +24,7 @@ static inline void vm_acct_memory(long p
{
atomic_add(pages, &vm_committed_space);
}
+EXPORT_SYMBOL(vm_acct_memory);
#endif
static inline void vm_unacct_memory(long pages)
diff -pruN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-test1/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/page_alloc.c 2006-04-04 12:53:52.000000000 -0400
@@ -52,6 +52,7 @@ EXPORT_SYMBOL(node_possible_map);
unsigned long totalram_pages __read_mostly;
unsigned long totalhigh_pages __read_mostly;
long nr_swap_pages;
+EXPORT_SYMBOL(nr_swap_pages);
int percpu_pagelist_fraction;
static void __free_pages_ok(struct page *page, unsigned int order);
diff -pruN linux-2.6.17-rc1-mm1/mm/slab.c linux-2.6.17-rc1-mm1-test1/mm/slab.c
--- linux-2.6.17-rc1-mm1/mm/slab.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/slab.c 2006-04-04 12:53:52.000000000 -0400
@@ -692,6 +692,7 @@ static struct list_head cache_chain;
* SLAB_RECLAIM_ACCOUNT turns this on per-slab
*/
atomic_t slab_reclaim_pages;
+EXPORT_SYMBOL(slab_reclaim_pages);
/*
* chicken and egg problem: delay the per-cpu array allocation
diff -pruN linux-2.6.17-rc1-mm1/mm/swap.c linux-2.6.17-rc1-mm1-test1/mm/swap.c
--- linux-2.6.17-rc1-mm1/mm/swap.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/swap.c 2006-04-04 12:53:52.000000000 -0400
@@ -499,6 +499,7 @@ void vm_acct_memory(long pages)
}
preempt_enable();
}
+EXPORT_SYMBOL(vm_acct_memory);
#ifdef CONFIG_HOTPLUG_CPU
* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:51 ` A test kernel module " Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
@ 2006-04-06 0:45 ` KAMEZAWA Hiroyuki
2006-04-06 7:20 ` Hideo AOKI
2 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-04-06 0:45 UTC (permalink / raw)
To: Hideo AOKI; +Cc: akpm, linux-kernel, linux-mm
Hi, AOKI-san
On Wed, 05 Apr 2006 19:47:27 -0400
Hideo AOKI <haoki@redhat.com> wrote:
> Hello Andrew,
>
> Could you apply my patches to your tree?
>
> These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
> __vm_enough_memory(). The detailed description is in attached patch.
>
I think adding a function like this is a simpler way.
(Call this instead of nr_free_pages().)
==
unsigned long nr_available_memory(void)
{
	struct zone *zone;
	unsigned long sum = 0;

	/* Count only the free pages above each zone's pages_high. */
	for_each_zone(zone) {
		if (zone->free_pages > zone->pages_high)
			sum += zone->free_pages - zone->pages_high;
	}
	return sum;
}
==
BTW, vm_enough_memory() doesn't eat cpuset information ?
-Kame
* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
@ 2006-04-06 7:20 ` Hideo AOKI
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 7+ messages in thread
From: Hideo AOKI @ 2006-04-06 7:20 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: akpm, linux-kernel, linux-mm
Hi Kamezawa-san,
Thank you for your comments.
KAMEZAWA Hiroyuki wrote:
> Hi, AOKI-san
>
> On Wed, 05 Apr 2006 19:47:27 -0400
> Hideo AOKI <haoki@redhat.com> wrote:
>
>
>>Hello Andrew,
>>
>>Could you apply my patches to your tree?
>>
>>These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
>>__vm_enough_memory(). The detailed description is in attached patch.
>
> I think adding a function like this is a simpler way.
> (Call this instead of nr_free_pages().)
> ==
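> unsigned long nr_available_memory(void)
> {
> 	struct zone *zone;
> 	unsigned long sum = 0;
>
> 	/* Count only the free pages above each zone's pages_high. */
> 	for_each_zone(zone) {
> 		if (zone->free_pages > zone->pages_high)
> 			sum += zone->free_pages - zone->pages_high;
> 	}
> 	return sum;
> }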
> ==
I like your idea. But, in the function, I think we need to take
lowmem_reserve into account too.
Since __vm_enough_memory() doesn't know zone and cpuset information,
we have to guess a proper value of lowmem_reserve in each zone
like I did in calculate_totalreserve_pages() in my patch.
Do you think that we can do this calculation every time?
If it is good enough, I'll make a revised patch.
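One way to picture such a combined calculation is the user-space sketch
below. It is only an assumption about how the two ideas could be
merged, not code from either patch: struct sketch_zone,
SKETCH_NR_ZONES and both function names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_NR_ZONES 3	/* hypothetical zone count for the model */

/* Hypothetical stand-in for the struct zone fields discussed here. */
struct sketch_zone {
	unsigned long free_pages;
	unsigned long pages_high;
	unsigned long lowmem_reserve[SKETCH_NR_ZONES];
};

/* Largest lowmem_reserve entry of zone idx, looking at idx and above. */
static unsigned long sketch_max_lowmem_reserve(const struct sketch_zone *z,
					       size_t idx)
{
	unsigned long max = 0;
	size_t j;

	for (j = idx; j < SKETCH_NR_ZONES; j++)
		if (z->lowmem_reserve[j] > max)
			max = z->lowmem_reserve[j];
	return max;
}

/* Sum of free pages above pages_high plus the guessed lowmem_reserve. */
static unsigned long sketch_nr_available(const struct sketch_zone *zones)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < SKETCH_NR_ZONES; i++) {
		unsigned long reserve = zones[i].pages_high +
			sketch_max_lowmem_reserve(&zones[i], i);

		/* A zone at or below its reserve contributes nothing. */
		if (zones[i].free_pages > reserve)
			sum += zones[i].free_pages - reserve;
	}
	return sum;
}
```

The cost is a walk over every zone and its lowmem_reserve array on
each call, which is the trade-off behind the question above about
doing this calculation every time.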
> BTW, vm_enough_memory() doesn't eat cpuset information ?
I think this is another point which we should improve.
Best regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
2006-04-06 7:20 ` Hideo AOKI
@ 2006-04-06 8:08 ` KAMEZAWA Hiroyuki
2006-04-07 11:49 ` Hideo AOKI
0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-04-06 8:08 UTC (permalink / raw)
To: Hideo AOKI; +Cc: akpm, linux-kernel, linux-mm
On Thu, 06 Apr 2006 03:20:10 -0400
Hideo AOKI <haoki@redhat.com> wrote:
> Hi Kamezawa-san,
>
> Thank you for your comments.
>
> KAMEZAWA Hiroyuki wrote:
> > Hi, AOKI-san
> I like your idea. But, in the function, I think we need to care
> lowmem_reserve too.
>
Ah, I see.
> Since __vm_enough_memory() doesn't know zone and cpuset information,
> we have to guess proper value of lowmem_reserve in each zone
> like I did in calculate_totalreserve_pages() in my patch.
> Do you think that we can do this calculation every time?
>
> If it is good enough, I'll make revised patch.
>
I just thought that showing "how to calculate" in a unified way would be better.
But if things get ugly, please ignore my comment.
Do you have a detailed comparison of test results with and without this patch?
I'm interested in it.
I'm sorry if I missed your post of the results.
Cheers!
-Kame
* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
@ 2006-04-07 11:49 ` Hideo AOKI
0 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-07 11:49 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: akpm, linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 1869 bytes --]
Hi Kamezawa-san,
Thank you for your quick response, and sorry for my slow one.
KAMEZAWA Hiroyuki wrote:
> Hideo AOKI <haoki@redhat.com> wrote:
>
>>Since __vm_enough_memory() doesn't know zone and cpuset information,
>>we have to guess proper value of lowmem_reserve in each zone
>>like I did in calculate_totalreserve_pages() in my patch.
>>Do you think that we can do this calculation every time?
>>
>>If it is good enough, I'll make revised patch.
>>
>
> I just thought to show "how to calculate" in unified way is better.
I got it.
> Do you have a detailed comparison of test result with and without this patch ?
Yes. I have test logs and have attached them to this e-mail.
The logs are the verbose output of my test kernel module, which I already
sent to lkml.
http://marc.theaimsgroup.com/?l=linux-kernel&m=114428121522349&w=2
The test machine was an i386 PC with 4GB of memory. I didn't use a swap region.
Let me explain a few things about the log.
* 2.6.17-rc1-mm1
HIGH: <active 18220><inactive 12278><free 1419><sum 31917><present 622220>
NORMAL: <active 1618><inactive 2293><free 1397><sum 5308><present 225280>
The test module consumes free pages until the number of free pages
is less than pages_high.
<buf 3916><cache 31785><slab reclaim 1550><swap 0> <+ 1> <target 33336>
This line shows the status of memory just before the module calls
__vm_enough_memory(). The meaning of each item is as follows.
buf: bufferram
cache: page cache
slab reclaim: slab_reclaim_pages
swap: nr_swap_pages
+: margin
target: the number of pages requested from __vm_enough_memory()
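As a cross-check, the target is the sum that final_test() in the test
module computes (n = ncache + nslabrec + nmargin + nswap); buf is
printed but not added. The helper below is a hypothetical user-space
restatement of that arithmetic, not part of the module.

```c
/*
 * Hypothetical helper mirroring the MOD_MM case of final_test():
 * target = page cache + reclaimable slab + margin + swap.
 * bufferram is reported in the log but is not part of the sum.
 */
static unsigned long test_target(unsigned long cache,
				 unsigned long slab_reclaim,
				 unsigned long swap,
				 unsigned long margin)
{
	return cache + slab_reclaim + margin + swap;
}
```

With the values from the log line above, test_target(31785, 1550, 0, 1)
gives 33336, which matches <target 33336>.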
Test MAY be <failed>.
This line shows __vm_enough_memory() returned success.
Please let me know if you have any questions or suggestions.
Regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: log-2.6.17-rc1-mm1.txt --]
[-- Type: text/plain, Size: 2233 bytes --]
* 2.6.17-rc1-mm1
Apr 6 20:33:33 dhcp1 kernel: Test module was loaded. <mode 1>
Apr 6 20:33:33 dhcp1 kernel: init ...<3>done
Apr 6 20:33:33 dhcp1 kernel:
Apr 6 20:33:33 dhcp1 kernel: HIGH: <active 18238><inactive 12278><free 590698><sum 621214><present 622220>
Apr 6 20:33:34 dhcp1 kernel: HighMem <target 589272>, <3>
Apr 6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1512><sum 32010><present 622220>
Apr 6 20:33:34 dhcp1 kernel:
Apr 6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1512><sum 32010><present 622220>
Apr 6 20:33:34 dhcp1 kernel: HighMem <target 86>, <3>
Apr 6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1419><sum 31917><present 622220>
Apr 6 20:33:34 dhcp1 kernel: already satisfied
Apr 6 20:33:34 dhcp1 kernel:
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2277><free 205532><sum 209427><present 225280>
Apr 6 20:33:34 dhcp1 kernel: Normal <target 204124>, <3>
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2291><free 1490><sum 5399><present 225280>
Apr 6 20:33:34 dhcp1 kernel:
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1490><sum 5401><present 225280>
Apr 6 20:33:34 dhcp1 kernel: Normal <target 82>, <3>
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1428><sum 5339><present 225280>
Apr 6 20:33:34 dhcp1 kernel:
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1428><sum 5339><present 225280>
Apr 6 20:33:34 dhcp1 kernel: Normal <target 20>, <3>
Apr 6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1397><sum 5308><present 225280>
Apr 6 20:33:34 dhcp1 kernel: already satisfied
Apr 6 20:33:34 dhcp1 kernel: concrete test ...
Apr 6 20:33:34 dhcp1 kernel: <buf 3916><cache 31785><slab reclaim 1550><swap 0> <+ 1> <target 33336>
Apr 6 20:33:34 dhcp1 kernel: Test MAY be <failed>.
Apr 6 20:33:34 dhcp1 kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
Apr 6 20:33:35 dhcp1 kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
Apr 6 20:33:35 dhcp1 kernel: Test SURELY was <FAILED>.
Apr 6 20:33:35 dhcp1 kernel: concrete test ...done.
[-- Attachment #3: log-2.6.17-rc1-mm1+patch.txt --]
[-- Type: text/plain, Size: 4053 bytes --]
* 2.6.17-rc1-mm1 + patches
Apr 6 20:56:36 dhcp1 kernel: Test module was loaded. <mode 1>
Apr 6 20:56:36 dhcp1 kernel: init ...<3>done
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 590727><sum 621228><present 622220>
Apr 6 20:56:36 dhcp1 kernel: HighMem <target 589301>, <3>
Apr 6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1479><sum 31980><present 622220>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1479><sum 31980><present 622220>
Apr 6 20:56:36 dhcp1 kernel: HighMem <target 53>, <3>
Apr 6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1417><sum 31918><present 622220>
Apr 6 20:56:36 dhcp1 kernel: already satisfied
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2248><free 205669><sum 209543><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 204261>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2262><free 1441><sum 5329><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1441><sum 5331><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 33>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1410><sum 5300><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1410><sum 5300><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel:
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr 6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr 6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1379><sum 5270><present 225280>
Apr 6 20:56:36 dhcp1 kernel: already satisfied
Apr 6 20:56:36 dhcp1 kernel: concrete test ...
Apr 6 20:56:36 dhcp1 kernel: <buf 3902><cache 31720><slab reclaim 1538><swap 0> <+ 1> <target 33259>
Apr 6 20:56:36 dhcp1 kernel: Test was <PASSED>.
Apr 6 20:56:36 dhcp1 kernel: concrete test ...done.
Apr 6 20:56:48 dhcp1 kernel: Unloading module ...
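The numeric fields in both logs can be cross-checked with a few lines of arithmetic. The sketch below is inferred from the logged values only, not taken from the test module's source, and the function names are hypothetical. It assumes that <sum> is active + inactive + free, that the concrete test's <target> is cache + slab reclaim + swap + 1 (the <buf> value does not appear to be added separately), and that each per-zone <target> is the zone's free count minus a fixed per-zone floor (1426 for HighMem and 1408 for Normal in these runs).

```c
/* <sum ...> field: pages accounted for in one zone. */
long zone_sum(long active, long inactive, long free_pages)
{
    return active + inactive + free_pages;
}

/* Per-zone <target ...>: free pages above an assumed per-zone floor. */
long zone_target(long free_pages, long zone_floor)
{
    return free_pages - zone_floor;
}

/*
 * Concrete test <target ...>: one page more than the reclaimable estimate,
 * so the request cannot be satisfied from cache, slab and swap alone.
 */
long concrete_target(long cache, long slab_reclaim, long swap)
{
    return cache + slab_reclaim + swap + 1;
}
```

For example, the unpatched log gives 31785 + 1550 + 0 + 1 = 33336 and the patched log gives 31720 + 1538 + 0 + 1 = 33259, matching the logged targets, so both runs ask for one page more than the reclaimable estimate.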
Thread overview: 7+ messages
2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:51 ` A test kernel module " Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2006-04-06 7:20 ` Hideo AOKI
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
2006-04-07 11:49 ` Hideo AOKI