linux-kernel.vger.kernel.org archive mirror
* [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
@ 2006-04-05 23:47 Hideo AOKI
  2006-04-05 23:51 ` A test kernel module " Hideo AOKI
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:47 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 562 bytes --]

Hello Andrew,

Could you apply my patches to your tree?

These patches are an enhancement of the OVERCOMMIT_GUESS algorithm in
__vm_enough_memory(). The detailed description is in the attached patch.

Actually, these are revised versions of the patches which I sent to lkml
last year.
http://marc.theaimsgroup.com/?l=linux-kernel&m=112993489022427&w=2

I wrote a test kernel module to show the result of the patches.
For your information, I will also send the module in a later e-mail.

Best regards,
Hideo Aoki

---
Hideo Aoki, Hitachi Computer Products (America) Inc.

[-- Attachment #2: mm-add-totalreserve_pages.patch --]
[-- Type: text/x-patch, Size: 4960 bytes --]

These patches are an enhancement of the OVERCOMMIT_GUESS algorithm in
__vm_enough_memory().

- why the kernel needed patching

  Even when the kernel can't actually allocate anonymous pages, the
  current OVERCOMMIT_GUESS can return success. This behavior might
  cause OOM kills under memory pressure.

  If Linux runs with page reservation features such as
  /proc/sys/vm/lowmem_reserve_ratio and without a swap region, I think
  OOM kills occur easily.


- the overall design approach in the patch

  When the OVERCOMMIT_GUESS algorithm calculates the number of free
  pages, the reserved free pages are regarded as non-free pages.

  This change helps to avoid the pitfall that the number of free pages
  becomes less than the number which the kernel tries to keep free.
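
  As a rough user-space illustration of this design (not the kernel
  code itself), the sketch below models the free-page step of the
  heuristic. The function names, the parameters, and the numbers are
  hypothetical and only show the effect of treating reserved pages as
  non-free.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model of the free-page step of the heuristic.
 * All names here are hypothetical; this is not __vm_enough_memory().
 */
bool guess_allows(unsigned long request, unsigned long reclaimable,
		  unsigned long free_pages, unsigned long reserved)
{
	/* Reserved free pages are regarded as non-free pages. */
	unsigned long usable = free_pages > reserved ? free_pages - reserved : 0;

	return reclaimable + usable > request;
}

/* Made-up numbers: 2816 free pages, but 3000 pages are reserved. */
void guess_example(void)
{
	/* Counting every free page, the request seems to fit. */
	assert(guess_allows(33336, 33335, 2816, 0));
	/* With the reservation subtracted, the request is refused. */
	assert(!guess_allows(33336, 33335, 2816, 3000));
}
```

  Passing 0 as the reserved value models the behavior without the
  change, so the two calls in guess_example() contrast both cases.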


- testing results

  I tested the patches using my test kernel module.

  If the patches aren't applied to the kernel, __vm_enough_memory()
  returns success in this situation but the actual page allocation
  fails.

  On the other hand, if the patches are applied to the kernel, the
  memory allocation failure is avoided since __vm_enough_memory()
  returns failure in this situation.

  I checked that on an i386 SMP machine with 16GB of memory. I haven't
  tested on a nommu environment yet.


- changelog

  v5:
    - updated to 2.6.17-rc1-mm1
    - did stricter tests.
    - added the enhancement to mm/nommu.c too

  v4:
    - dealing with pages_high as reserved pages
    - updated the code for 2.6.14-rc4-mm1

  v3 (private): 
    - enhanced error handling in __vm_enough_memory
    - fixed an issue related to the calculation of totalreserve_pages

  v2 (private):
    - fixed error handling bug
    - updated test results
    - updated the code for 2.6.14-rc2-mm2 


This patch adds totalreserve_pages for __vm_enough_memory().

calculate_totalreserve_pages() checks the maximum lowmem_reserve pages
and pages_high in each zone. Finally, the function stores the sum over
all zones in totalreserve_pages.

totalreserve_pages is calculated when the VM is initialized.
The variable is also updated when /proc/sys/vm/lowmem_reserve_ratio
or /proc/sys/vm/min_free_kbytes is changed.
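
The calculation described above can be modeled in user space as
follows. This is only a sketch for a single node: struct zone_model,
MODEL_NR_ZONES, and model_totalreserve_pages() are hypothetical names,
and the three-zone layout is an assumption.

```c
#include <assert.h>

#define MODEL_NR_ZONES 3	/* assumption: DMA, Normal, HighMem */

/* Hypothetical stand-in for the few struct zone fields that are used. */
struct zone_model {
	unsigned long present_pages;
	unsigned long pages_high;
	unsigned long lowmem_reserve[MODEL_NR_ZONES];
};

/* Sum, over all zones of one node, the pages treated as reserved. */
unsigned long model_totalreserve_pages(const struct zone_model *zones,
				       int nr_zones)
{
	unsigned long total = 0;

	assert(nr_zones >= 0 && nr_zones <= MODEL_NR_ZONES);
	for (int i = 0; i < nr_zones; i++) {
		unsigned long max = 0;

		/* Largest lowmem_reserve entry for this zone and above. */
		for (int j = i; j < nr_zones; j++)
			if (zones[i].lowmem_reserve[j] > max)
				max = zones[i].lowmem_reserve[j];

		/* pages_high is treated as reserved as well. */
		max += zones[i].pages_high;

		/* A zone cannot reserve more pages than it has. */
		if (max > zones[i].present_pages)
			max = zones[i].present_pages;
		total += max;
	}
	return total;
}
```

For instance, a zone with present_pages 100 and pages_high 200
contributes only 100 pages because of the final clamp.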


Signed-off-by: Hideo Aoki <haoki@redhat.com>
---

 include/linux/swap.h |    1 +
 mm/page_alloc.c      |   39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff -purN linux-2.6.17-rc1-mm1/include/linux/swap.h linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h
--- linux-2.6.17-rc1-mm1/include/linux/swap.h	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h	2006-04-04 15:13:26.000000000 -0400
@@ -155,6 +155,7 @@ extern void swapin_readahead(swp_entry_t
 /* linux/mm/page_alloc.c */
 extern unsigned long totalram_pages;
 extern unsigned long totalhigh_pages;
+extern unsigned long totalreserve_pages;
 extern long nr_swap_pages;
 extern unsigned int nr_free_pages(void);
 extern unsigned int nr_free_pages_pgdat(pg_data_t *pgdat);
diff -purN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c	2006-04-04 15:13:26.000000000 -0400
@@ -51,6 +51,7 @@ nodemask_t node_possible_map __read_most
 EXPORT_SYMBOL(node_possible_map);
 unsigned long totalram_pages __read_mostly;
 unsigned long totalhigh_pages __read_mostly;
+unsigned long totalreserve_pages __read_mostly;
 long nr_swap_pages;
 int percpu_pagelist_fraction;
 
@@ -2548,6 +2549,38 @@ void __init page_alloc_init(void)
 }
 
 /*
+ * calculate_totalreserve_pages - called when sysctl_lower_zone_reserve_ratio
+ *	or min_free_kbytes changes. 
+ */
+static void calculate_totalreserve_pages(void)
+{
+	struct pglist_data *pgdat;
+	unsigned long reserve_pages = 0;
+	int i, j;
+
+	for_each_online_pgdat(pgdat) {
+		for (i = 0; i < MAX_NR_ZONES; i++) {
+			struct zone *zone = pgdat->node_zones + i;
+			unsigned long max = 0;
+			
+			/* Find valid and maximum lowmem_reserve in the zone */
+			for (j = i; j < MAX_NR_ZONES; j++) {
+				if (zone->lowmem_reserve[j] > max)
+					max = zone->lowmem_reserve[j];
+			}
+
+			/* we treat pages_high as reserved pages. */
+			max += zone->pages_high;
+
+			if (max > zone->present_pages)
+				max = zone->present_pages;
+			reserve_pages += max;
+		}
+	}
+	totalreserve_pages = reserve_pages;
+}
+
+/*
  * setup_per_zone_lowmem_reserve - called whenever
  *	sysctl_lower_zone_reserve_ratio changes.  Ensures that each zone
  *	has a correct pages reserved value, so an adequate number of
@@ -2578,6 +2611,9 @@ static void setup_per_zone_lowmem_reserv
 			}
 		}
 	}
+
+	/* update totalreserve_pages */
+	calculate_totalreserve_pages();
 }
 
 /*
@@ -2632,6 +2668,9 @@ void setup_per_zone_pages_min(void)
 		zone->pages_high  = zone->pages_min + tmp / 2;
 		spin_unlock_irqrestore(&zone->lru_lock, flags);
 	}
+
+	/* update totalreserve_pages */
+	calculate_totalreserve_pages();
 }
 
 /*


* A test kernel module of OVERCOMMIT_GUESS
  2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
@ 2006-04-05 23:51 ` Hideo AOKI
  2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
  2006-04-06  0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
  2 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:51 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]

Hello Andrew,

This is a kernel module patch which I developed to test my patches.

The module creates a memory pressure situation. After that, the
module tests whether OVERCOMMIT_GUESS detects overcommit.

The module has a "mode" option. If you specify "mode=1", the module
tries to allocate pages in the test phase.

Here is the test result when I ran the "mode=1" test on my machine.

* 2.6.17-rc1-mm1

   kernel: Test MAY be <failed>.
   kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
   kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
   kernel: Test SURELY was <FAILED>.


* 2.6.17-rc1-mm1 + my patches

   kernel: Test was <PASSED>.


Unfortunately, this kernel module needs another kernel patch.
I will send it in a later e-mail.

Please note that I don't intend to propose merging the module into
the kernel tree.

Best regards,
Hideo Aoki

---
Hideo Aoki, Hitachi Computer Products (America) Inc.

[-- Attachment #2: mm-test_overcommit.patch --]
[-- Type: text/x-patch, Size: 14341 bytes --]

A kernel module to test OVERCOMMIT_GUESS.
 
 lib/Kconfig.debug    |   11 +
 mm/Makefile          |    1 
 mm/test_overcommit.c |  515 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 527 insertions(+)

diff -purN linux-2.6.17-rc1-mm1/lib/Kconfig.debug linux-2.6.17-rc1-mm1-test1/lib/Kconfig.debug
--- linux-2.6.17-rc1-mm1/lib/Kconfig.debug	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/lib/Kconfig.debug	2006-04-04 16:30:31.000000000 -0400
@@ -282,6 +282,17 @@ config RCU_TORTURE_TEST
 	  Say M if you want the RCU torture tests to build as a module.
 	  Say N if you are unsure.
 
+config OVERCOMMIT_GUESS_TEST
+	tristate "An overcommit guess testing module"
+	depends on DEBUG_KERNEL && X86
+	default n
+	help
+	  This option provides a kernel module that can test OVERCOMMIT_GUESS
+	  in __vm_enough_memory(). 
+ 
+	  You should say N or M here. Say M if you want to build the module.
+	  Say N if you are unsure.
+
 config WANT_EXTRA_DEBUG_INFORMATION
 	bool
 	select DEBUG_INFO
diff -purN linux-2.6.17-rc1-mm1/mm/Makefile linux-2.6.17-rc1-mm1-test1/mm/Makefile
--- linux-2.6.17-rc1-mm1/mm/Makefile	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/Makefile	2006-04-04 16:23:09.000000000 -0400
@@ -24,4 +24,5 @@ obj-$(CONFIG_SLAB) += slab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
+obj-$(CONFIG_OVERCOMMIT_GUESS_TEST) += test_overcommit.o
 
diff -purN linux-2.6.17-rc1-mm1/mm/test_overcommit.c linux-2.6.17-rc1-mm1-test1/mm/test_overcommit.c
--- linux-2.6.17-rc1-mm1/mm/test_overcommit.c	1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6.17-rc1-mm1-test1/mm/test_overcommit.c	2006-04-04 16:23:09.000000000 -0400
@@ -0,0 +1,515 @@
+/*
+ * A kernel module for testing OVERCOMMIT_GUESS. ver 0.0.1
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
+ * MA  02110-1301, USA.
+ */
+/*
+ * This kernel module tests the latter part of the OVERCOMMIT_GUESS algorithm.
+ * To test the algorithm, the module creates a memory pressure situation.
+ * After that, the module tests whether the algorithm detects overcommit.
+ *
+ * With the "mode" module option, you can specify whether the module actually
+ * tries to allocate pages in that situation.
+ *
+ * Currently, you have to apply a kernel patch to use the module, since the
+ * module needs to refer to some internal symbols for testing.
+ *
+ * The module was tested only on i386 SMP machines.
+ */
+#include <linux/module.h>  /* Needed by all modules */
+#include <linux/moduleparam.h>
+#include <linux/kernel.h>  /* Needed for KERN_INFO */
+
+#include <linux/gfp.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/vmalloc.h>
+
+#include <asm/kmap_types.h>
+#include <asm/system.h>
+#include <linux/blkdev.h>
+#include <linux/string.h>
+
+/*
+ * Get rid of taint message by declaring code as GPL v2.
+ */
+MODULE_LICENSE("GPL v2");
+
+/*
+ * Module documentation
+ */
+MODULE_AUTHOR("Hideo Aoki");
+MODULE_DESCRIPTION("test module of overcommit guess");
+
+/*
+ * Module parameter(s)
+ */
+enum { MODE_DEFALUT = 0, MODE_DOLASTALLOC = 1 };
+static int mode = MODE_DEFALUT;
+
+module_param(mode, int, S_IRUSR);
+MODULE_PARM_DESC(mode, "test mode. If mode is 1, the module executes actual page allocation in concrete test.");
+
+/*
+ * declarations 
+ */
+#define MOD_MM 1                        /* modify kernel */
+#define DEF_PRT_LV KERN_DEBUG           /* default print level in this mod. */
+
+/*
+ * Page management: We manage allocated pages. When the module is
+ * unloaded, the module releases them.
+ */
+struct page_mng_t {
+	struct page *listp;     /* list of allocated pages */
+	int zone;               /* which zone manage */
+};
+
+/* We use head of allocated pages for managing page list */ 
+struct mypage_t {
+	struct page *next_page;
+	int order;              /* page order */
+};
+
+
+enum {
+	NORMAL, HIGH
+};
+
+static struct page_mng_t low_pg; /* LOWMEM pages management */  
+static struct page_mng_t high_pg;/* HIGHMEM pages management */
+static unsigned long committed_pages;/* number of pages committed in the mod.*/
+
+void prepare_test_env(void);
+void final_test(int mode);
+int commit_pages(unsigned long target, struct page_mng_t *mng, int gfp,
+		 char *msg);
+int try_to_commit_pages(int order, int gfp, struct page_mng_t *mng, 
+			unsigned long *n_alloced);
+int alloc_last_pages(unsigned long npage);
+
+/* allocated page management */ 
+int page_mng_init(struct page_mng_t *mng, int gfp);
+void page_mng_add(struct page_mng_t *mng, struct page *newp, int order);
+void page_mng_freeall(struct page_mng_t *mng);
+
+/* zone */
+struct zone *get_zone(char *name);
+void print_zone_info(struct zone *zone, char *msg);
+
+
+/*
+ * initialize module and invoke test functions
+ */
+int init_module(void)
+{
+	int ret; 
+
+	printk(DEF_PRT_LV "Test module was loaded. <mode %d>\n", mode);
+
+	if (mode != MODE_DEFALUT && mode != MODE_DOLASTALLOC) {
+		printk(KERN_ERR "invalid mode. <mode %d>\n", mode);
+		return 1;
+	}
+
+	/*
+	 * making a memory pressure situation
+	 */
+	printk(DEF_PRT_LV "init ...");
+	committed_pages = 0;
+	ret = page_mng_init(&low_pg, GFP_KERNEL);
+	if (ret == 1) {
+		printk(KERN_ERR "failed to init lowmem mng\n");
+		return 1;
+	}
+	ret = page_mng_init(&high_pg, GFP_HIGHUSER);
+	if (ret == 1) {
+		printk(KERN_ERR "failed to init highmem mng\n");
+		return 1;
+	}
+	printk(DEF_PRT_LV "done\n");
+
+	prepare_test_env();
+
+	/*
+	 * concrete test
+	 */
+	printk(DEF_PRT_LV "concrete test ...\n");
+	final_test(mode);
+	printk(DEF_PRT_LV "concrete test ...done.\n");
+
+	/*
+	 * A non 0 return means init_module failed; module can't be loaded.
+	 */
+	return 0;
+}
+
+
+/*
+ * destructor of module
+ */
+void cleanup_module(void)
+{
+	printk(DEF_PRT_LV "Unloading module ...\n");
+	page_mng_freeall(&low_pg);
+	page_mng_freeall(&high_pg);
+	vm_unacct_memory(committed_pages);
+}
+
+
+/*
+ * To prepare the test environment, this function repeatedly allocates
+ * pages in ZONE_HIGHMEM and ZONE_NORMAL until the number of free pages
+ * in each zone drops to pages_high.
+ */
+void prepare_test_env(void)
+{
+	struct zone *high_zone; /* high mem zone */
+	struct zone *normal_zone; /* normal zone */
+	int i;
+	int ret;
+	unsigned long target;
+
+	high_zone = get_zone("HighMem");
+	if (high_zone == NULL) {
+		printk(KERN_ERR "failed to get highmem zone\n");
+		return ;
+	}
+	for (i = 0; i < 100; i++) {
+		if (high_zone->free_pages <= high_zone->pages_high) {
+			printk(DEF_PRT_LV "already satisfied\n");
+			break;
+		}
+
+		spin_lock_irq(&high_zone->lru_lock);
+		target = high_zone->free_pages - high_zone->pages_high;
+		spin_unlock_irq(&high_zone->lru_lock);
+		print_zone_info(high_zone, "HIGH");
+		ret = commit_pages(target, &high_pg, GFP_HIGHUSER, "HighMem");
+		if (ret < 0) {
+			printk(KERN_ERR "error high %i\n", i);
+			goto error;
+		}
+
+		print_zone_info(high_zone, "HIGH");
+		/* printk(DEF_PRT_LV "%i\n", i); */
+		blk_congestion_wait(WRITE, HZ/2);
+	}
+
+	normal_zone = get_zone("Normal");
+	if (normal_zone == NULL) {
+		printk(KERN_ERR "failed to get normal zone\n");
+		return ;
+	}
+	for (i = 0; i < 100; i++) {
+		if (normal_zone->free_pages <= normal_zone->pages_high) {
+			printk(DEF_PRT_LV "already satisfied\n");
+			break;
+		}
+
+		spin_lock_irq(&normal_zone->lru_lock);
+		target = normal_zone->free_pages - normal_zone->pages_high;
+		spin_unlock_irq(&normal_zone->lru_lock);
+		print_zone_info(normal_zone, "NORMAL");
+		ret = commit_pages(target, &low_pg, GFP_KERNEL, "Normal");
+		if (ret < 0) {
+			printk(KERN_ERR "error %i\n", i);
+			goto error;
+		}
+	
+		print_zone_info(normal_zone, "NORMAL");
+		/* printk(DEF_PRT_LV "%i\n", i); */
+		blk_congestion_wait(WRITE, HZ/2);
+	}
+
+error:
+	return;
+}
+
+/*
+ * main test function
+ */
+void final_test(int mode)
+{
+	int ret;
+	unsigned long n;
+	unsigned long nbuffer;
+	unsigned long ncache;
+	unsigned long nmargin;
+#if MOD_MM
+	unsigned long nslabrec;
+	unsigned long nswap;
+#endif
+	struct sysinfo info;
+	
+	
+	for (nmargin = 1; nmargin < 100000; nmargin *= 10) {
+		si_meminfo(&info);
+		nbuffer = info.bufferram;
+		ncache = get_page_cache_size();
+#if MOD_MM
+		nslabrec = atomic_read(&slab_reclaim_pages);
+		nswap = nr_swap_pages;
+		n = ncache + nslabrec + nmargin + nswap;
+		printk(DEF_PRT_LV "<buf %lu><cache %lu><slab reclaim %lu><swap %lu> <+ %lu> <target %lu>\n",
+		       nbuffer, ncache, nslabrec, nswap, nmargin, n);
+#else
+		n = ncache + nmargin;
+		printk(DEF_PRT_LV "<buf %lu> <cache %lu> <+ %lu> <target %lu>\n",
+		       nbuffer, ncache, nmargin, n);
+#endif
+
+		ret = __vm_enough_memory(n, 0);
+		if (ret != 0) {
+			printk(KERN_ERR "Test was <PASSED>.\n"); 
+			break ;
+		} 
+
+		/* unexpected result */ 
+		committed_pages += n;
+		if (mode == MODE_DOLASTALLOC) {
+			printk(KERN_ERR "Test MAY be <failed>.\n");
+			ret = alloc_last_pages(n);
+			if (ret == 0) {
+				printk(KERN_ERR "Test module has a problem\n");
+			} else {
+				printk(KERN_ERR "Test SURELY was <FAILED>.\n");
+				break ;
+			}
+		} else {
+			printk(KERN_ERR "Test was <FAILED>\n");
+		}
+	}
+}
+
+int commit_pages(unsigned long target, struct page_mng_t *mng, int gfp,
+		 char *msg)
+{
+	int ret;
+	unsigned long total;
+	unsigned long npage;
+
+	if (target == 0 || mng == NULL)
+		goto error;
+
+	/* 
+	 * try to commit anonymous pages 
+	 */
+	total = 0;
+	while (total < target) {
+		ret = try_to_commit_pages(0, gfp, mng, &npage);
+		if (ret == 1) {
+			printk(DEF_PRT_LV "%s test stopped. <target %lu>\n",
+			       msg, target);
+			return 1;
+		} else if (ret == 0) {
+			total += npage;
+			/* printk(KERN_ERR "p"); */
+		} else if (ret == -1) {
+			printk(KERN_ERR "error %s overcommit.\n", msg);
+			goto error;
+		} else {
+			printk(KERN_ERR "error %s test environment.\n", msg);
+			goto error;
+		}
+	}
+	printk(DEF_PRT_LV "%s <target %lu>, ", msg, target);
+
+	return 0;
+
+ error:
+	return -1;
+}
+
+/*
+ * ret:  1: success (overcommit was detected)
+ *       0: success
+ *      -1: error   (overcommit was not detected)
+ */
+int try_to_commit_pages(int order, int gfp, struct page_mng_t *mng, 
+			unsigned long *n_alloced)
+{
+	int ret;
+	long n_pages;
+	struct page *p;
+	
+	n_pages = 1L << order;
+	/* printk(KERN_ERR "<order %d>, <pages %ld>\n,", order, n_pages); */
+
+	*n_alloced = 0;
+
+	ret = __vm_enough_memory(n_pages, 0);
+	if (ret != 0) {
+		printk(KERN_ERR "<order %d>, <pages %ld> ", order, n_pages);
+		printk(KERN_ERR "overcommit was detected.\n");
+		return 1;
+	}
+	committed_pages += n_pages;
+
+	p = alloc_pages(gfp, order);
+	if (p ==  NULL) {
+		/* error */
+		printk(KERN_ERR "<order %d>, <pages %ld> ", order, n_pages);
+		printk(KERN_ERR "allocation failed\n");
+		return -1;
+	} else {
+		page_mng_add(mng, p, order);
+		*n_alloced += n_pages;
+		/*
+		 * printk(KERN_ERR "<order %d>, <pages %ld> <alloced %lu> ",
+		 *        order, n_pages, *n_alloced);
+		 * printk(KERN_ERR " succeed\n");
+		 */
+		return 0;
+	}
+}
+
+/*
+ * 0: success, 1: failure  
+ */
+int alloc_last_pages(unsigned long npage)
+{
+	int i;
+	int ret;
+	void *mem;
+	struct mypage2_t {
+		struct mypage2_t *next;
+	} *endp, *listp, *p, *nextp;
+
+	/*
+	 * allocation scenario 1
+	 */
+	mem = vmalloc(npage * PAGE_SIZE);
+	if (mem != NULL) {
+		printk(KERN_ERR "TEST MODULE HAS PROBLEMS.\n");
+		vfree(mem);
+		return 0;
+	}
+
+	/*
+	 * allocation scenario 2
+	 */
+	for (listp = endp = NULL, i = 0; i < npage; i++) {
+		p = (struct mypage2_t *)vmalloc(PAGE_SIZE);
+		if (p == NULL) {
+			ret = 1;
+			goto release;
+		}
+
+		p->next = NULL;
+		if (listp == NULL)
+			listp = p;
+		else
+			endp->next = p;
+		
+		endp = p;
+	}
+	
+	printk(KERN_ERR "TEST MODULE HAS PROBLEMS.\n");
+	ret = 0;
+	
+release: 
+	for ( ; listp != NULL; listp = nextp) {
+		nextp = listp->next;
+		vfree(listp);
+	}
+
+	return ret;
+}
+
+/*
+ * ret: 1: error, 0: success
+ */
+int page_mng_init(struct page_mng_t *mng, int gfp)
+{
+	mng->listp = NULL;
+	if (gfp == GFP_HIGHUSER)
+		mng->zone = HIGH;
+	else if (gfp == GFP_KERNEL || gfp == GFP_USER)
+		mng->zone = NORMAL;
+	else
+		return 1;
+
+	return 0;
+}
+
+void page_mng_add(struct page_mng_t *mng, struct page *newp, int order)
+{
+	struct mypage_t *p;
+
+	if (mng->zone != NORMAL && mng->zone != HIGH) {
+		printk(KERN_ERR "PAGE_MNG: ERROR \n");
+		return ;
+	}
+
+	p = (struct mypage_t *)kmap_atomic(newp, KM_TYPE_NR);
+	p->order = order;
+	p->next_page = mng->listp;
+	mng->listp = newp;
+
+	kunmap_atomic((void *)p, KM_TYPE_NR);
+}
+
+void page_mng_freeall(struct page_mng_t *mng)
+{
+	int order; 
+	struct page *next;
+	struct mypage_t *p;
+
+	for ( ; mng->listp != NULL; mng->listp = next) {
+		p = (struct mypage_t *)kmap_atomic(mng->listp, KM_TYPE_NR);
+		next = p->next_page;
+		order = p->order;
+		kunmap_atomic((void *)p, KM_TYPE_NR);
+		__free_pages(mng->listp, order);
+	}
+}
+
+struct zone *get_zone(char *name)
+{
+	struct zone *zone;
+	for_each_zone(zone) {
+		if (strcmp(zone->name, name) == 0)
+			return zone;
+	}	
+
+	return NULL;
+}
+
+void print_zone_info(struct zone *zone, char *msg)
+{
+	unsigned long free_pages;
+	unsigned long nr_active;
+	unsigned long nr_inactive;
+	unsigned long present_pages;
+
+	spin_lock_irq(&zone->lru_lock);
+	free_pages = zone->free_pages;
+	nr_active = zone->nr_active;
+	nr_inactive = zone->nr_inactive;
+	present_pages = zone->present_pages;
+	spin_unlock_irq(&zone->lru_lock);
+
+	printk(DEF_PRT_LV "\n%s: <active %lu><inactive %lu><free %lu><sum %lu><present %lu>\n",
+	       msg,
+	       nr_active,
+	       nr_inactive,
+	       free_pages,
+	       nr_active + nr_inactive + free_pages,
+	       present_pages);
+}


* A patch for test_overcommit module
  2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
  2006-04-05 23:51 ` A test kernel module " Hideo AOKI
@ 2006-04-05 23:52 ` Hideo AOKI
  2006-04-06  0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
  2 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-05 23:52 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 161 bytes --]

The test module needs this kernel patch.
I don't intend to propose merging this patch into the kernel tree.

---
Hideo Aoki, Hitachi Computer Products (America) Inc.

[-- Attachment #2: mm-debug.patch --]
[-- Type: text/x-patch, Size: 2086 bytes --]

This patch exports the symbols which the test kernel module has to refer to.

 include/linux/mman.h |    1 +
 mm/page_alloc.c      |    1 +
 mm/slab.c            |    1 +
 mm/swap.c            |    1 +
 4 files changed, 4 insertions(+)

diff -pruN linux-2.6.17-rc1-mm1/include/linux/mman.h linux-2.6.17-rc1-mm1-test1/include/linux/mman.h
--- linux-2.6.17-rc1-mm1/include/linux/mman.h	2006-03-24 12:40:19.000000000 -0500
+++ linux-2.6.17-rc1-mm1-test1/include/linux/mman.h	2006-04-04 12:53:52.000000000 -0400
@@ -24,6 +24,7 @@ static inline void vm_acct_memory(long p
 {
 	atomic_add(pages, &vm_committed_space);
 }
+EXPORT_SYMBOL(vm_acct_memory);
 #endif
 
 static inline void vm_unacct_memory(long pages)
diff -pruN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-test1/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/page_alloc.c	2006-04-04 12:53:52.000000000 -0400
@@ -52,6 +52,7 @@ EXPORT_SYMBOL(node_possible_map);
 unsigned long totalram_pages __read_mostly;
 unsigned long totalhigh_pages __read_mostly;
 long nr_swap_pages;
+EXPORT_SYMBOL(nr_swap_pages);
 int percpu_pagelist_fraction;
 
 static void __free_pages_ok(struct page *page, unsigned int order);
diff -pruN linux-2.6.17-rc1-mm1/mm/slab.c linux-2.6.17-rc1-mm1-test1/mm/slab.c
--- linux-2.6.17-rc1-mm1/mm/slab.c	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/slab.c	2006-04-04 12:53:52.000000000 -0400
@@ -692,6 +692,7 @@ static struct list_head cache_chain;
  * SLAB_RECLAIM_ACCOUNT turns this on per-slab
  */
 atomic_t slab_reclaim_pages;
+EXPORT_SYMBOL(slab_reclaim_pages);
 
 /*
  * chicken and egg problem: delay the per-cpu array allocation
diff -pruN linux-2.6.17-rc1-mm1/mm/swap.c linux-2.6.17-rc1-mm1-test1/mm/swap.c
--- linux-2.6.17-rc1-mm1/mm/swap.c	2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-test1/mm/swap.c	2006-04-04 12:53:52.000000000 -0400
@@ -499,6 +499,7 @@ void vm_acct_memory(long pages)
 	}
 	preempt_enable();
 }
+EXPORT_SYMBOL(vm_acct_memory);
 
 #ifdef CONFIG_HOTPLUG_CPU
 


* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
  2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
  2006-04-05 23:51 ` A test kernel module " Hideo AOKI
  2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
@ 2006-04-06  0:45 ` KAMEZAWA Hiroyuki
  2006-04-06  7:20   ` Hideo AOKI
  2 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-04-06  0:45 UTC (permalink / raw)
  To: Hideo AOKI; +Cc: akpm, linux-kernel, linux-mm


Hi, AOKI-san

On Wed, 05 Apr 2006 19:47:27 -0400
Hideo AOKI <haoki@redhat.com> wrote:

> Hello Andrew,
> 
> Could you apply my patches to your tree?
> 
> These patches are an enhancement of the OVERCOMMIT_GUESS algorithm in
> __vm_enough_memory(). The detailed description is in the attached patch.
> 

I think adding a function like this is a simpler way.
(call this instead of nr_free_pages().)
==
unsigned long nr_available_memory(void)
{
	struct zone *zone;
	unsigned long sum = 0;

	for_each_zone(zone) {
		if (zone->free_pages > zone->pages_high)
			sum += zone->free_pages - zone->pages_high;
	}
	return sum;
}
==

BTW, doesn't vm_enough_memory() take cpuset information into account?

-Kame



* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
  2006-04-06  0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
@ 2006-04-06  7:20   ` Hideo AOKI
  2006-04-06  8:08     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 7+ messages in thread
From: Hideo AOKI @ 2006-04-06  7:20 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: akpm, linux-kernel, linux-mm

Hi Kamezawa-san,

Thank you for your comments.

KAMEZAWA Hiroyuki wrote:
> Hi, AOKI-san
> 
> On Wed, 05 Apr 2006 19:47:27 -0400
> Hideo AOKI <haoki@redhat.com> wrote:
> 
> 
>>Hello Andrew,
>>
>>Could you apply my patches to your tree?
>>
>>These patches are an enhancement of the OVERCOMMIT_GUESS algorithm in
>>__vm_enough_memory(). The detailed description is in the attached patch.
> 
> I think adding a function like this is a simpler way.
> (call this instead of nr_free_pages().)
> ==
> unsigned long nr_available_memory(void)
> {
> 	struct zone *zone;
> 	unsigned long sum = 0;
>
> 	for_each_zone(zone) {
> 		if (zone->free_pages > zone->pages_high)
> 			sum += zone->free_pages - zone->pages_high;
> 	}
> 	return sum;
> }
> ==

I like your idea. But I think the function also needs to take
lowmem_reserve into account.

Since __vm_enough_memory() doesn't know zone and cpuset information,
we have to guess proper value of lowmem_reserve in each zone
like I did in calculate_totalreserve_pages() in my patch.
Do you think that we can do this calculation every time?

If it is good enough, I'll make revised patch.
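
To make the idea concrete, here is a rough user-space sketch of a
per-zone calculation that subtracts both pages_high and a guessed
lowmem_reserve. It is only an illustration under assumptions: struct
zone_snapshot, NR_MODEL_ZONES, and model_nr_available_memory() are
hypothetical names, and guessing the reserve as the largest
lowmem_reserve entry is one possible choice, not a settled design.

```c
#include <assert.h>

#define NR_MODEL_ZONES 3	/* assumption: DMA, Normal, HighMem */

/* Hypothetical snapshot of the per-zone counters that are needed. */
struct zone_snapshot {
	unsigned long free_pages;
	unsigned long pages_high;
	unsigned long lowmem_reserve[NR_MODEL_ZONES];
};

/* Count only the free pages above each zone's guessed reserve. */
unsigned long model_nr_available_memory(const struct zone_snapshot *zones,
					int nr_zones)
{
	unsigned long sum = 0;

	assert(nr_zones >= 0 && nr_zones <= NR_MODEL_ZONES);
	for (int i = 0; i < nr_zones; i++) {
		unsigned long reserve = 0;

		/* Guess the reserve as the largest lowmem_reserve entry. */
		for (int j = i; j < nr_zones; j++)
			if (zones[i].lowmem_reserve[j] > reserve)
				reserve = zones[i].lowmem_reserve[j];
		reserve += zones[i].pages_high;

		if (zones[i].free_pages > reserve)
			sum += zones[i].free_pages - reserve;
	}
	return sum;
}
```

Unlike a value cached at sysctl time, this walks every zone on each
call, which is the cost the question above is about.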


> BTW, doesn't vm_enough_memory() take cpuset information into account?

I think this is another point which we should improve.

Best regards,
Hideo Aoki

---
Hideo Aoki, Hitachi Computer Products (America) Inc.


* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
  2006-04-06  7:20   ` Hideo AOKI
@ 2006-04-06  8:08     ` KAMEZAWA Hiroyuki
  2006-04-07 11:49       ` Hideo AOKI
  0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-04-06  8:08 UTC (permalink / raw)
  To: Hideo AOKI; +Cc: akpm, linux-kernel, linux-mm

On Thu, 06 Apr 2006 03:20:10 -0400
Hideo AOKI <haoki@redhat.com> wrote:

> Hi Kamezawa-san,
> 
> Thank you for your comments.
> 
> KAMEZAWA Hiroyuki wrote:
> > Hi, AOKI-san
> I like your idea. But I think the function also needs to take
> lowmem_reserve into account.
> 
Ah, I see.

> Since __vm_enough_memory() doesn't know zone and cpuset information,
> we have to guess proper value of lowmem_reserve in each zone
> like I did in calculate_totalreserve_pages() in my patch.
> Do you think that we can do this calculation every time?
> 
> If it is good enough, I'll make revised patch.
> 
I just thought that showing "how to calculate" in a unified way would be better.
But if things get ugly, please ignore my comment.

Do you have a detailed comparison of test results with and without this patch?
I'm interested in it.
I'm sorry if I missed your post of the results.


Cheers!
-Kame


* Re: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
  2006-04-06  8:08     ` KAMEZAWA Hiroyuki
@ 2006-04-07 11:49       ` Hideo AOKI
  0 siblings, 0 replies; 7+ messages in thread
From: Hideo AOKI @ 2006-04-07 11:49 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: akpm, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1869 bytes --]

Hi Kamezawa-san,

Thank you for your quick response. And sorry for my slow response.

KAMEZAWA Hiroyuki wrote:

> Hideo AOKI <haoki@redhat.com> wrote:
> 
>>Since __vm_enough_memory() doesn't know zone and cpuset information,
>>we have to guess proper value of lowmem_reserve in each zone
>>like I did in calculate_totalreserve_pages() in my patch.
>>Do you think that we can do this calculation every time?
>>
>>If it is good enough, I'll make revised patch.
>>
> 
> I just thought that showing "how to calculate" in a unified way would be better.

I got it.

> Do you have a detailed comparison of test results with and without this patch?

Yes. I have test logs and have attached them to this e-mail.

The logs are the verbose output of my test kernel module, which I already
sent to lkml.
http://marc.theaimsgroup.com/?l=linux-kernel&m=114428121522349&w=2

The test machine was an i386 PC with 4GB of memory. I didn't use a swap region.


Let me explain a few things about the log.

* 2.6.17-rc1-mm1

HIGH: <active 18220><inactive 12278><free 1419><sum 31917><present 622220>
NORMAL: <active 1618><inactive 2293><free 1397><sum 5308><present 225280>

   The test module consumes free pages until the number of free pages
   is less than pages_high.


<buf 3916><cache 31785><slab reclaim 1550><swap 0> <+ 1> <target 33336>

   This line shows the status of memory just before the module calls
   __vm_enough_memory(). The meaning of each item is given below.

     buf:            bufferram
     cache:          page cache
     slab reclaim:   slab_reclaim_pages
     swap:           nr_swap_pages
     +:              margin
     target:         the number of pages to ask __vm_enough_memory()
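
   As a cross-check of the line above, the target can be recomputed from
   the logged items. The sketch below follows the sum used in the test
   module's final_test() (cache + slab reclaim + margin + swap; buf is
   printed but not added). The function names are hypothetical.

```c
#include <assert.h>

/* Pages requested from __vm_enough_memory(), as summed in final_test(). */
unsigned long model_target(unsigned long cache, unsigned long slab_reclaim,
			   unsigned long swap, unsigned long margin)
{
	return cache + slab_reclaim + margin + swap;
}

/* Re-check: <cache 31785><slab reclaim 1550><swap 0> <+ 1> <target 33336> */
void check_logged_target(void)
{
	assert(model_target(31785, 1550, 0, 1) == 33336);
}
```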


Test MAY be <failed>.

   This line shows that __vm_enough_memory() returned success.


Please let me know if you have any questions or suggestions.

Regards,
Hideo Aoki

---
Hideo Aoki, Hitachi Computer Products (America) Inc.

[-- Attachment #2: log-2.6.17-rc1-mm1.txt --]
[-- Type: text/plain, Size: 2233 bytes --]

* 2.6.17-rc1-mm1

Apr  6 20:33:33 dhcp1 kernel: Test module was loaded. <mode 1>
Apr  6 20:33:33 dhcp1 kernel: init ...<3>done
Apr  6 20:33:33 dhcp1 kernel:
Apr  6 20:33:33 dhcp1 kernel: HIGH: <active 18238><inactive 12278><free 590698><sum 621214><present 622220>
Apr  6 20:33:34 dhcp1 kernel: HighMem <target 589272>, <3>
Apr  6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1512><sum 32010><present 622220>
Apr  6 20:33:34 dhcp1 kernel:
Apr  6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1512><sum 32010><present 622220>
Apr  6 20:33:34 dhcp1 kernel: HighMem <target 86>, <3>
Apr  6 20:33:34 dhcp1 kernel: HIGH: <active 18220><inactive 12278><free 1419><sum 31917><present 622220>
Apr  6 20:33:34 dhcp1 kernel: already satisfied
Apr  6 20:33:34 dhcp1 kernel:
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2277><free 205532><sum 209427><present 225280>
Apr  6 20:33:34 dhcp1 kernel: Normal <target 204124>, <3>
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2291><free 1490><sum 5399><present 225280>
Apr  6 20:33:34 dhcp1 kernel:
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1490><sum 5401><present 225280>
Apr  6 20:33:34 dhcp1 kernel: Normal <target 82>, <3>
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1428><sum 5339><present 225280>
Apr  6 20:33:34 dhcp1 kernel:
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1428><sum 5339><present 225280>
Apr  6 20:33:34 dhcp1 kernel: Normal <target 20>, <3>
Apr  6 20:33:34 dhcp1 kernel: NORMAL: <active 1618><inactive 2293><free 1397><sum 5308><present 225280>
Apr  6 20:33:34 dhcp1 kernel: already satisfied
Apr  6 20:33:34 dhcp1 kernel: concrete test ...
Apr  6 20:33:34 dhcp1 kernel: <buf 3916><cache 31785><slab reclaim 1550><swap 0> <+ 1> <target 33336>
Apr  6 20:33:34 dhcp1 kernel: Test MAY be <failed>.
Apr  6 20:33:34 dhcp1 kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
Apr  6 20:33:35 dhcp1 kernel: allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
Apr  6 20:33:35 dhcp1 kernel: Test SURELY was <FAILED>.
Apr  6 20:33:35 dhcp1 kernel: concrete test ...done.

[-- Attachment #3: log-2.6.17-rc1-mm1+patch.txt --]
[-- Type: text/plain, Size: 4053 bytes --]

* 2.6.17-rc1-mm1 + patches

Apr  6 20:56:36 dhcp1 kernel: Test module was loaded. <mode 1>
Apr  6 20:56:36 dhcp1 kernel: init ...<3>done
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 590727><sum 621228><present 622220>
Apr  6 20:56:36 dhcp1 kernel: HighMem <target 589301>, <3>
Apr  6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1479><sum 31980><present 622220>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1479><sum 31980><present 622220>
Apr  6 20:56:36 dhcp1 kernel: HighMem <target 53>, <3>
Apr  6 20:56:36 dhcp1 kernel: HIGH: <active 17074><inactive 13427><free 1417><sum 31918><present 622220>
Apr  6 20:56:36 dhcp1 kernel: already satisfied
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2248><free 205669><sum 209543><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 204261>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2262><free 1441><sum 5329><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1441><sum 5331><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 33>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1410><sum 5300><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2264><free 1410><sum 5300><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel:
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1410><sum 5301><present 225280>
Apr  6 20:56:36 dhcp1 kernel: Normal <target 2>, <3>
Apr  6 20:56:36 dhcp1 kernel: NORMAL: <active 1626><inactive 2265><free 1379><sum 5270><present 225280>
Apr  6 20:56:36 dhcp1 kernel: already satisfied
Apr  6 20:56:36 dhcp1 kernel: concrete test ...
Apr  6 20:56:36 dhcp1 kernel: <buf 3902><cache 31720><slab reclaim 1538><swap 0> <+ 1> <target 33259>
Apr  6 20:56:36 dhcp1 kernel: Test was <PASSED>.
Apr  6 20:56:36 dhcp1 kernel: concrete test ...done.
Apr  6 20:56:48 dhcp1 kernel: Unloading module ...


end of thread, other threads:[~2006-04-07 11:50 UTC | newest]

Thread overview: 7+ messages
2006-04-05 23:47 [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:51 ` A test kernel module " Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
2006-04-06  0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2006-04-06  7:20   ` Hideo AOKI
2006-04-06  8:08     ` KAMEZAWA Hiroyuki
2006-04-07 11:49       ` Hideo AOKI
