linux-mm.kvack.org archive mirror
* [PATCH v4 1/2] mm: factor commit limit calculation
@ 2013-10-18 12:56 Jerome Marchand
  2013-10-18 12:56 ` [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely Jerome Marchand
  2013-11-05 23:51 ` [PATCH v4 1/2] mm: factor commit limit calculation Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Jerome Marchand @ 2013-10-18 12:56 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, dave.hansen

Change since v3:
 - rebase on 3.12-rc5

The same calculation is currently done in three different places.
Factor that code out so that future changes have to be made in only
one place.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
---
 fs/proc/meminfo.c    |    5 +----
 include/linux/mman.h |   12 ++++++++++++
 mm/mmap.c            |    4 +---
 mm/nommu.c           |    3 +--
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 59d85d6..c805d5b 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -24,7 +24,6 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 {
 	struct sysinfo i;
 	unsigned long committed;
-	unsigned long allowed;
 	struct vmalloc_info vmi;
 	long cached;
 	unsigned long pages[NR_LRU_LISTS];
@@ -37,8 +36,6 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	si_meminfo(&i);
 	si_swapinfo(&i);
 	committed = percpu_counter_read_positive(&vm_committed_as);
-	allowed = ((totalram_pages - hugetlb_total_pages())
-		* sysctl_overcommit_ratio / 100) + total_swap_pages;
 
 	cached = global_page_state(NR_FILE_PAGES) -
 			total_swapcache_pages() - i.bufferram;
@@ -147,7 +144,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		K(global_page_state(NR_UNSTABLE_NFS)),
 		K(global_page_state(NR_BOUNCE)),
 		K(global_page_state(NR_WRITEBACK_TEMP)),
-		K(allowed),
+		K(vm_commit_limit()),
 		K(committed),
 		(unsigned long)VMALLOC_TOTAL >> 10,
 		vmi.used >> 10,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 92dc257..d622d34 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,9 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+#include <linux/hugetlb.h>
+#include <linux/swap.h>
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern struct percpu_counter vm_committed_as;
@@ -87,4 +90,13 @@ calc_vm_flag_bits(unsigned long flags)
 	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    );
 }
+
+/*
+ * Commited memory limit enforced when OVERCOMMIT_NEVER policy is used
+ */
+static inline unsigned long vm_commit_limit()
+{
+	return ((totalram_pages - hugetlb_total_pages())
+		* sysctl_overcommit_ratio / 100) + total_swap_pages;
+}
 #endif /* _LINUX_MMAN_H */
diff --git a/mm/mmap.c b/mm/mmap.c
index 9d54851..7755953 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -179,14 +179,12 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 		goto error;
 	}
 
-	allowed = (totalram_pages - hugetlb_total_pages())
-	       	* sysctl_overcommit_ratio / 100;
+	allowed = vm_commit_limit();
 	/*
 	 * Reserve some for root
 	 */
 	if (!cap_sys_admin)
 		allowed -= sysctl_admin_reserve_kbytes >> (PAGE_SHIFT - 10);
-	allowed += total_swap_pages;
 
 	/*
 	 * Don't let a single process grow so big a user can't recover
diff --git a/mm/nommu.c b/mm/nommu.c
index ecd1f15..d8a957b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1948,13 +1948,12 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 		goto error;
 	}
 
-	allowed = totalram_pages * sysctl_overcommit_ratio / 100;
+	allowed = vm_commit_limit();
 	/*
 	 * Reserve some 3% for root
 	 */
 	if (!cap_sys_admin)
 		allowed -= sysctl_admin_reserve_kbytes >> (PAGE_SHIFT - 10);
-	allowed += total_swap_pages;
 
 	/*
 	 * Don't let a single process grow so big a user can't recover
-- 
1.7.7.6
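
For readers who want to check the arithmetic, the limit that this patch
factors into vm_commit_limit() can be modelled in userspace roughly as
follows. This is only a sketch: commit_limit_pages() and its parameters
are invented for illustration and stand in for the kernel globals
totalram_pages, hugetlb_total_pages(), sysctl_overcommit_ratio and
total_swap_pages.

```c
#include <assert.h>

/*
 * Userspace model of the limit computed by vm_commit_limit() in the patch.
 * All quantities are page counts, except ratio_pct, which is a percentage.
 * The function name and parameters are hypothetical stand-ins for the
 * kernel globals used by the real code.
 */
static unsigned long commit_limit_pages(unsigned long totalram,
					unsigned long hugetlb,
					int ratio_pct,
					unsigned long swap)
{
	/* Sanity checks for the model only; the kernel code has none. */
	assert(hugetlb <= totalram);
	assert(ratio_pct >= 0);

	/* Ratio applies to RAM minus huge pages; swap is added in full. */
	return (totalram - hugetlb) * (unsigned long)ratio_pct / 100 + swap;
}
```

With 1000 pages of RAM, 100 of them huge pages, a ratio of 50 and 200
pages of swap, the sketch gives (1000 - 100) * 50 / 100 + 200 = 650 pages.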

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-10-18 12:56 [PATCH v4 1/2] mm: factor commit limit calculation Jerome Marchand
@ 2013-10-18 12:56 ` Jerome Marchand
  2013-11-05 23:53   ` Andrew Morton
  2013-12-03 13:33   ` [PATCH v5] mm: add overcommit_kbytes sysctl variable Jerome Marchand
  2013-11-05 23:51 ` [PATCH v4 1/2] mm: factor commit limit calculation Andrew Morton
  1 sibling, 2 replies; 11+ messages in thread
From: Jerome Marchand @ 2013-10-18 12:56 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, dave.hansen

Changes since v3:
 - rebase on 3.12-rc5
Changes since v2:
 - updates documentation
Changes since v1:
 - use overcommit_ratio_ppm instead of overcommit_kbytes
 - keep both variables in sync

Some applications that run on HPC clusters are designed around the
availability of RAM, and the overcommit ratio is fine-tuned to get the
maximum usage of memory without swapping. With growing memory, the
1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
for these workloads (on a 2TB machine it represents no less than
20GB).

This patch adds the new overcommit_ratio_ppm sysctl variable, which
allows the overcommit ratio to be set with parts-per-million precision.
The old overcommit_ratio variable can still be used to set and read
the ratio with 1% precision. That way, the overcommit_ratio interface
isn't broken in any way that I can imagine.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
---
 Documentation/sysctl/vm.txt            |   10 +++++
 Documentation/vm/overcommit-accounting |    7 ++--
 include/linux/mman.h                   |    6 ++--
 include/linux/sysctl.h                 |    2 +
 kernel/sysctl.c                        |   63 ++++++++++++++++++++++++++++++--
 mm/mmap.c                              |    2 +-
 mm/nommu.c                             |    2 +-
 7 files changed, 81 insertions(+), 11 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 79a797e..a25943e 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -49,6 +49,7 @@ Currently, these files are in /proc/sys/vm:
 - oom_kill_allocating_task
 - overcommit_memory
 - overcommit_ratio
+- overcommit_ratio_ppm
 - page-cluster
 - panic_on_oom
 - percpu_pagelist_fraction
@@ -599,6 +600,15 @@ overcommit_ratio:
 When overcommit_memory is set to 2, the committed address
 space is not permitted to exceed swap plus this percentage
 of physical RAM.  See above.
+If overcommit_ratio_ppm has been set, overcommit_ratio shows a
+rounded value.
+
+==============================================================
+
+overcommit_ratio_ppm:
+
+Same as overcommit_ratio, but allows the ratio to be set with a finer
+grain (parts per million).
 
 ==============================================================
 
diff --git a/Documentation/vm/overcommit-accounting b/Documentation/vm/overcommit-accounting
index 8eaa2fc..15b5ecb 100644
--- a/Documentation/vm/overcommit-accounting
+++ b/Documentation/vm/overcommit-accounting
@@ -14,8 +14,8 @@ The Linux kernel supports the following overcommit handling modes
 
 2	-	Don't overcommit. The total address space commit
 		for the system is not permitted to exceed swap + a
-		configurable percentage (default is 50) of physical RAM.
-		Depending on the percentage you use, in most situations
+		configurable ratio (default is 50%) of physical RAM.
+		Depending on the ratio you use, in most situations
 		this means a process will not be killed while accessing
 		pages but will receive errors on memory allocation as
 		appropriate.
@@ -26,7 +26,8 @@ The Linux kernel supports the following overcommit handling modes
 
 The overcommit policy is set via the sysctl `vm.overcommit_memory'.
 
-The overcommit percentage is set via `vm.overcommit_ratio'.
+The overcommit percentage is set via `vm.overcommit_ratio' or
+`vm.overcommit_ratio_ppm'.
 
 The current overcommit limit and amount committed are viewable in
 /proc/meminfo as CommitLimit and Committed_AS respectively.
diff --git a/include/linux/mman.h b/include/linux/mman.h
index d622d34..24f9c12 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -11,7 +11,7 @@
 #include <linux/swap.h>
 
 extern int sysctl_overcommit_memory;
-extern int sysctl_overcommit_ratio;
+extern int sysctl_overcommit_ratio_ppm;
 extern struct percpu_counter vm_committed_as;
 
 #ifdef CONFIG_SMP
@@ -96,7 +96,7 @@ calc_vm_flag_bits(unsigned long flags)
  */
 static inline unsigned long vm_commit_limit()
 {
-	return ((totalram_pages - hugetlb_total_pages())
-		* sysctl_overcommit_ratio / 100) + total_swap_pages;
+	return ((u64) (totalram_pages - hugetlb_total_pages())
+		* sysctl_overcommit_ratio_ppm / 1000000) + total_swap_pages;
 }
 #endif /* _LINUX_MMAN_H */
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 14a8ff2..2e2389c 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -51,6 +51,8 @@ extern int proc_dointvec_userhz_jiffies(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 extern int proc_dointvec_ms_jiffies(struct ctl_table *, int,
 				    void __user *, size_t *, loff_t *);
+extern int proc_dointvec_percent_ppm(struct ctl_table *, int,
+				     void __user *, size_t *, loff_t *);
 extern int proc_doulongvec_minmax(struct ctl_table *, int,
 				  void __user *, size_t *, loff_t *);
 extern int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b2f06f3..ecb22f4 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -96,7 +96,7 @@
 
 /* External variables not in a header file. */
 extern int sysctl_overcommit_memory;
-extern int sysctl_overcommit_ratio;
+extern int sysctl_overcommit_ratio_ppm;
 extern int max_threads;
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
@@ -1116,8 +1116,15 @@ static struct ctl_table vm_table[] = {
 	},
 	{
 		.procname	= "overcommit_ratio",
-		.data		= &sysctl_overcommit_ratio,
-		.maxlen		= sizeof(sysctl_overcommit_ratio),
+		.data		= &sysctl_overcommit_ratio_ppm,
+		.maxlen		= sizeof(sysctl_overcommit_ratio_ppm),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_percent_ppm,
+	},
+	{
+		.procname	= "overcommit_ratio_ppm",
+		.data		= &sysctl_overcommit_ratio_ppm,
+		.maxlen		= sizeof(sysctl_overcommit_ratio_ppm),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
@@ -2433,6 +2440,56 @@ int proc_dointvec_ms_jiffies(struct ctl_table *table, int write,
 				do_proc_dointvec_ms_jiffies_conv, NULL);
 }
 
+static int do_proc_dointvec_percent_ppm_conv(bool *negp, unsigned long *lvalp,
+					     int *valp,
+					     int write, void *data)
+{
+	if (write) {
+		unsigned long ppm = (*negp ? -*lvalp : *lvalp) * 10000;
+
+		if (ppm > INT_MAX)
+			return 1;
+		*valp = (int)ppm;
+	} else {
+		int val = *valp;
+		unsigned long lval;
+		if (val < 0) {
+			*negp = true;
+			lval = (unsigned long)-val;
+		} else {
+			*negp = false;
+			lval = (unsigned long)val;
+		}
+		*lvalp = lval / 10000;
+		if (lval % 10000 >= 5000)
+			(*lvalp)++;
+	}
+	return 0;
+}
+
+/**
+ * proc_dointvec_percent_ppm - read a vector of integers as percent and convert it to ppm
+ * @table: the sysctl table
+ * @write: %TRUE if this is a write to the sysctl file
+ * @buffer: the user buffer
+ * @lenp: the size of the user buffer
+ * @ppos: file position
+ * @ppos: the current position in the file
+ *
+ * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
+ * values from/to the user buffer, treated as an ASCII string.
+ * The values read are assumed to be in percents, and are converted
+ * into parts per million.
+ *
+ * Returns 0 on success.
+ */
+int proc_dointvec_percent_ppm(struct ctl_table *table, int write,
+			      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	return do_proc_dointvec(table, write, buffer, lenp, ppos,
+				do_proc_dointvec_percent_ppm_conv, NULL);
+}
+
 static int proc_do_cad_pid(struct ctl_table *table, int write,
 			   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
diff --git a/mm/mmap.c b/mm/mmap.c
index 7755953..3096d9d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -85,7 +85,7 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
 EXPORT_SYMBOL(vm_get_page_prot);
 
 int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;  /* heuristic overcommit */
-int sysctl_overcommit_ratio __read_mostly = 50;	/* default is 50% */
+int sysctl_overcommit_ratio_ppm __read_mostly = 500000;	/* default is 50% */
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
 unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
diff --git a/mm/nommu.c b/mm/nommu.c
index d8a957b..cf10a9b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -59,7 +59,7 @@ unsigned long max_mapnr;
 unsigned long highest_memmap_pfn;
 struct percpu_counter vm_committed_as;
 int sysctl_overcommit_memory = OVERCOMMIT_GUESS; /* heuristic overcommit */
-int sysctl_overcommit_ratio = 50; /* default is 50% */
+int sysctl_overcommit_ratio_ppm = 500000; /* default is 50% */
 int sysctl_max_map_count = DEFAULT_MAX_MAP_COUNT;
 int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
-- 
1.7.7.6
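
As a rough illustration of the percent/ppm handling described in this
patch, here is a userspace sketch of the two conversions and of the
limit calculation. The helper names (percent_to_ppm, ppm_to_percent,
ratio_ppm_pages) are made up for this example. Two things are
assumptions of the sketch rather than statements about the patch: the
range check is done before the multiplication, and the divisor is one
million, which is what "parts per million" implies.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define PPM_PER_PERCENT 10000UL

/*
 * Write path: a percentage written by userspace is stored as ppm.
 * Returns -1 if the result would not fit in an int, 0 otherwise.
 */
static int percent_to_ppm(unsigned long percent, int *ppm)
{
	if (percent > (unsigned long)INT_MAX / PPM_PER_PERCENT)
		return -1;
	*ppm = (int)(percent * PPM_PER_PERCENT);
	return 0;
}

/* Read path: ppm is reported as a percentage, rounded to nearest. */
static unsigned long ppm_to_percent(int ppm)
{
	unsigned long v;

	assert(ppm >= 0);
	v = (unsigned long)ppm;
	return v / PPM_PER_PERCENT + (v % PPM_PER_PERCENT >= PPM_PER_PERCENT / 2);
}

/*
 * Share of 'pages' allowed by a ppm ratio. The product is formed in
 * 64 bits so that it cannot wrap where unsigned long is 32 bits wide.
 */
static uint64_t ratio_ppm_pages(unsigned long pages, int ppm)
{
	assert(ppm >= 0);
	return (uint64_t)pages * (uint64_t)ppm / 1000000;
}
```

On a 2TB machine with 4KB pages (536870912 pages), one percent
(10000 ppm) corresponds to 5368709 pages, roughly 20GB, while a single
ppm corresponds to 536 pages, roughly 2MB. Note that with
round-to-nearest, any value below 5000 ppm reads back as 0 through the
percent interface.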


* Re: [PATCH v4 1/2] mm: factor commit limit calculation
  2013-10-18 12:56 [PATCH v4 1/2] mm: factor commit limit calculation Jerome Marchand
  2013-10-18 12:56 ` [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely Jerome Marchand
@ 2013-11-05 23:51 ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2013-11-05 23:51 UTC (permalink / raw)
  To: Jerome Marchand; +Cc: linux-mm, linux-kernel, dave.hansen

On Fri, 18 Oct 2013 14:56:58 +0200 Jerome Marchand <jmarchan@redhat.com> wrote:

> Change since v3:
>  - rebase on 3.12-rc5
> 
> The same calculation is currently done in three differents places.
> Factor that code so future changes has to be made at only one place.
> 

lgtm.

> --- a/include/linux/mman.h
> +++ b/include/linux/mman.h
> @@ -7,6 +7,9 @@
>  #include <linux/atomic.h>
>  #include <uapi/linux/mman.h>
>  
> +#include <linux/hugetlb.h>
> +#include <linux/swap.h>
> +
>  extern int sysctl_overcommit_memory;
>  extern int sysctl_overcommit_ratio;
>  extern struct percpu_counter vm_committed_as;
> @@ -87,4 +90,13 @@ calc_vm_flag_bits(unsigned long flags)
>  	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
>  	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    );
>  }
> +
> +/*
> + * Commited memory limit enforced when OVERCOMMIT_NEVER policy is used
> + */
> +static inline unsigned long vm_commit_limit()
> +{
> +	return ((totalram_pages - hugetlb_total_pages())
> +		* sysctl_overcommit_ratio / 100) + total_swap_pages;
> +}

Not sure I like this part much.  This function is large and slow and
doesn't merit inlining, plus it requires worsening our nested-include
mess.  This?

Also, it should be vm_commit_limit(void).



From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-factor-commit-limit-calculation-fix

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mman.h |   12 +-----------
 mm/mmap.c            |    9 +++++++++
 2 files changed, 10 insertions(+), 11 deletions(-)

diff -puN fs/proc/meminfo.c~mm-factor-commit-limit-calculation-fix fs/proc/meminfo.c
diff -puN include/linux/mman.h~mm-factor-commit-limit-calculation-fix include/linux/mman.h
--- a/include/linux/mman.h~mm-factor-commit-limit-calculation-fix
+++ a/include/linux/mman.h
@@ -7,9 +7,6 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
-#include <linux/hugetlb.h>
-#include <linux/swap.h>
-
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern struct percpu_counter vm_committed_as;
@@ -91,12 +88,5 @@ calc_vm_flag_bits(unsigned long flags)
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    );
 }
 
-/*
- * Commited memory limit enforced when OVERCOMMIT_NEVER policy is used
- */
-static inline unsigned long vm_commit_limit()
-{
-	return ((totalram_pages - hugetlb_total_pages())
-		* sysctl_overcommit_ratio / 100) + total_swap_pages;
-}
+unsigned long vm_commit_limit(void);
 #endif /* _LINUX_MMAN_H */
diff -puN mm/mmap.c~mm-factor-commit-limit-calculation-fix mm/mmap.c
--- a/mm/mmap.c~mm-factor-commit-limit-calculation-fix
+++ a/mm/mmap.c
@@ -110,6 +110,15 @@ unsigned long vm_memory_committed(void)
 EXPORT_SYMBOL_GPL(vm_memory_committed);
 
 /*
+ * Commited memory limit enforced when OVERCOMMIT_NEVER policy is used
+ */
+unsigned long vm_commit_limit(void)
+{
+	return ((totalram_pages - hugetlb_total_pages())
+		* sysctl_overcommit_ratio / 100) + total_swap_pages;
+}
+
+/*
  * Check that a process has enough memory to allocate a new virtual
  * mapping. 0 means there is enough memory for the allocation to
  * succeed and -ENOMEM implies there is not.
diff -puN mm/nommu.c~mm-factor-commit-limit-calculation-fix mm/nommu.c
diff -puN mm/util.c~mm-factor-commit-limit-calculation-fix mm/util.c
diff -puN include/linux/mm.h~mm-factor-commit-limit-calculation-fix include/linux/mm.h
_


* Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-10-18 12:56 ` [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely Jerome Marchand
@ 2013-11-05 23:53   ` Andrew Morton
  2013-11-06  8:42     ` Jerome Marchand
  2013-12-03 13:33   ` [PATCH v5] mm: add overcommit_kbytes sysctl variable Jerome Marchand
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2013-11-05 23:53 UTC (permalink / raw)
  To: Jerome Marchand; +Cc: linux-mm, linux-kernel, dave.hansen

On Fri, 18 Oct 2013 14:56:59 +0200 Jerome Marchand <jmarchan@redhat.com> wrote:

> Some applications that run on HPC clusters are designed around the
> availability of RAM and the overcommit ratio is fine tuned to get the
> maximum usage of memory without swapping. With growing memory, the 1%
> of all RAM grain provided by overcommit_ratio has become too coarse
> for these workload (on a 2TB machine it represents no less than
> 20GB).
> 
> This patch adds the new overcommit_ratio_ppm sysctl variable that
> allow to set overcommit ratio with a part per million precision.
> The old overcommit_ratio variable can still be used to set and read
> the ratio with a 1% precision. That way, overcommit_ratio interface
> isn't broken in any way that I can imagine.

The way we've permanently squished this mistake in the past is to
switch to "bytes".  See /proc/sys/vm/*bytes.

Would that approach work in this case?


* Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-11-05 23:53   ` Andrew Morton
@ 2013-11-06  8:42     ` Jerome Marchand
  2013-11-06 22:33       ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Jerome Marchand @ 2013-11-06  8:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, dave hansen



----- Original Message -----
> From: "Andrew Morton" <akpm@linux-foundation.org>
> To: "Jerome Marchand" <jmarchan@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, "dave hansen" <dave.hansen@intel.com>
> Sent: Wednesday, November 6, 2013 12:53:19 AM
> Subject: Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
> 
> On Fri, 18 Oct 2013 14:56:59 +0200 Jerome Marchand <jmarchan@redhat.com>
> wrote:
> 
> > Some applications that run on HPC clusters are designed around the
> > availability of RAM and the overcommit ratio is fine tuned to get the
> > maximum usage of memory without swapping. With growing memory, the 1%
> > of all RAM grain provided by overcommit_ratio has become too coarse
> > for these workload (on a 2TB machine it represents no less than
> > 20GB).
> > 
> > This patch adds the new overcommit_ratio_ppm sysctl variable that
> > allow to set overcommit ratio with a part per million precision.
> > The old overcommit_ratio variable can still be used to set and read
> > the ratio with a 1% precision. That way, overcommit_ratio interface
> > isn't broken in any way that I can imagine.
> 
> The way we've permanently squished this mistake in the past is to
> switch to "bytes".  See /proc/sys/vm/*bytes.
> 
> Would that approach work in this case?
> 

That was my first version of this patch (actually "kbytes" to avoid
overflow).
Dave raised the issue that it silently breaks the user interface:
overcommit_ratio is zero while the system behaves differently.

Jerome


* Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-11-06  8:42     ` Jerome Marchand
@ 2013-11-06 22:33       ` Andrew Morton
  2013-11-06 23:49         ` Dave Hansen
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2013-11-06 22:33 UTC (permalink / raw)
  To: Jerome Marchand; +Cc: linux-mm, linux-kernel, dave hansen

On Wed, 6 Nov 2013 03:42:20 -0500 (EST) Jerome Marchand <jmarchan@redhat.com> wrote:

> 
> 
> ----- Original Message -----
> > From: "Andrew Morton" <akpm@linux-foundation.org>
> > To: "Jerome Marchand" <jmarchan@redhat.com>
> > Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, "dave hansen" <dave.hansen@intel.com>
> > Sent: Wednesday, November 6, 2013 12:53:19 AM
> > Subject: Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
> > 
> > On Fri, 18 Oct 2013 14:56:59 +0200 Jerome Marchand <jmarchan@redhat.com>
> > wrote:
> > 
> > > Some applications that run on HPC clusters are designed around the
> > > availability of RAM and the overcommit ratio is fine tuned to get the
> > > maximum usage of memory without swapping. With growing memory, the 1%
> > > of all RAM grain provided by overcommit_ratio has become too coarse
> > > for these workload (on a 2TB machine it represents no less than
> > > 20GB).
> > > 
> > > This patch adds the new overcommit_ratio_ppm sysctl variable that
> > > allow to set overcommit ratio with a part per million precision.
> > > The old overcommit_ratio variable can still be used to set and read
> > > the ratio with a 1% precision. That way, overcommit_ratio interface
> > > isn't broken in any way that I can imagine.
> > 
> > The way we've permanently squished this mistake in the past is to
> > switch to "bytes".  See /proc/sys/vm/*bytes.
> > 
> > Would that approach work in this case?
> > 
> 
> That was my first version of this patch (actually "kbytes" to avoid
> overflow).
> Dave raised the issue that it silently breaks the user interface:
> overcommit_ratio is zero while the system behaves differently.

I don't understand that at all.  We keep overcommit_ratio as-is, with
the same default values and add a different way of altering it.  That
should be back-compatible?


* Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-11-06 22:33       ` Andrew Morton
@ 2013-11-06 23:49         ` Dave Hansen
  2013-11-07 10:43           ` Jerome Marchand
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2013-11-06 23:49 UTC (permalink / raw)
  To: Andrew Morton, Jerome Marchand; +Cc: linux-mm, linux-kernel

On 11/06/2013 02:33 PM, Andrew Morton wrote:
> On Wed, 6 Nov 2013 03:42:20 -0500 (EST) Jerome Marchand <jmarchan@redhat.com> wrote:
>> That was my first version of this patch (actually "kbytes" to avoid
>> overflow).
>> Dave raised the issue that it silently breaks the user interface:
>> overcommit_ratio is zero while the system behaves differently.
> 
> I don't understand that at all.  We keep overcommit_ratio as-is, with
> the same default values and add a different way of altering it.  That
> should be back-compatible?

Reading the old thread, I think my main point was that we shouldn't
output overcommit_ratio=0 when overcommit_bytes>0.  We need to round up
for numbers less than 1 so that folks don't think overcommit_ratio is _off_.

I was really just trying to talk you in to cramming the extra precision
in to the _existing_ sysctl. :)  I don't think bytes vs. ratio is really
that big of a deal.
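
The rounding suggested above could look something like the sketch
below. It is purely hypothetical: ratio_for_display() is not a function
from any posted version of the patch, it ignores huge pages, and it
only shows how a percentage derived from a byte-based limit can be
rounded up so that a small non-zero limit never reads back as 0.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of RAM that limit_kb represents,
 * rounded up. A non-zero limit therefore never displays as 0.
 * Both arguments are in kilobytes.
 */
static unsigned long ratio_for_display(unsigned long limit_kb,
				       unsigned long totalram_kb)
{
	unsigned long long scaled;

	assert(totalram_kb > 0);

	/* Ceiling division, done in 64 bits to avoid wrapping. */
	scaled = (unsigned long long)limit_kb * 100ULL + totalram_kb - 1;
	return (unsigned long)(scaled / totalram_kb);
}
```

With 1000 kB of RAM, a 1 kB limit would be shown as 1% rather than 0%,
and a 501 kB limit as 51%.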


* Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
  2013-11-06 23:49         ` Dave Hansen
@ 2013-11-07 10:43           ` Jerome Marchand
  0 siblings, 0 replies; 11+ messages in thread
From: Jerome Marchand @ 2013-11-07 10:43 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Andrew Morton, linux-mm, linux-kernel



----- Original Message -----
> From: "Dave Hansen" <dave.hansen@intel.com>
> To: "Andrew Morton" <akpm@linux-foundation.org>, "Jerome Marchand" <jmarchan@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> Sent: Thursday, November 7, 2013 12:49:54 AM
> Subject: Re: [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely
> 
> On 11/06/2013 02:33 PM, Andrew Morton wrote:
> > On Wed, 6 Nov 2013 03:42:20 -0500 (EST) Jerome Marchand
> > <jmarchan@redhat.com> wrote:
> >> That was my first version of this patch (actually "kbytes" to avoid
> >> overflow).
> >> Dave raised the issue that it silently breaks the user interface:
> >> overcommit_ratio is zero while the system behaves differently.
> > 
> > I don't understand that at all.  We keep overcommit_ratio as-is, with
> > the same default values and add a different way of altering it.  That
> > should be back-compatible?
> 
> Reading the old thread, I think my main point was that we shouldn't
> output overcommit_ratio=0 when overcommit_bytes>0. We need to round up
> for numbers less than 1 so that folks don't think overcommit_ratio is _off_.

This is not how the current *bytes variables work. Also, the *ratio and
*bytes values would diverge if the amount of memory changes (e.g. memory
hotplug).

> 
> I was really just trying to talk you in to cramming the extra precision
> in to the _existing_ sysctl. :)  I don't think bytes vs. ratio is really
> that big of a deal.
> 

If everybody agrees on overcommit_kbytes, I can resend my original patch.
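
To make the hotplug point concrete, here is a small userspace model. It
assumes the usual *_ratio/*_bytes convention in which a non-zero kbytes
value takes precedence and the ratio is used otherwise; limit_pages(),
its parameters and the 4KB page size are assumptions made for the
example, and huge pages are left out for brevity.

```c
#include <assert.h>

#define PAGE_SHIFT_ASSUMED 12	/* assumption: 4KB pages */

/*
 * Hypothetical model of a commit limit that can be configured either
 * as a percentage of RAM or as an absolute amount in kilobytes.
 * A non-zero kbytes value wins; otherwise the ratio is applied.
 */
static unsigned long limit_pages(unsigned long totalram_pages,
				 unsigned long swap_pages,
				 int ratio_pct,
				 unsigned long kbytes)
{
	unsigned long allowed;

	assert(ratio_pct >= 0);

	if (kbytes)
		allowed = kbytes >> (PAGE_SHIFT_ASSUMED - 10);	/* kB to pages */
	else
		allowed = totalram_pages * (unsigned long)ratio_pct / 100;

	return allowed + swap_pages;
}
```

With 1000000 pages of RAM and no swap, a ratio of 50 and a setting of
2000000 kbytes both give 500000 pages. If RAM is then doubled, the
ratio-based limit follows it to 1000000 pages while the kbytes-based
limit stays at 500000, so a percentage computed from the byte value
would no longer match what was configured.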


* [PATCH v5] mm: add overcommit_kbytes sysctl variable
  2013-10-18 12:56 ` [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely Jerome Marchand
  2013-11-05 23:53   ` Andrew Morton
@ 2013-12-03 13:33   ` Jerome Marchand
  2013-12-03 22:14     ` Andrew Morton
  2013-12-19  7:36     ` Olof Johansson
  1 sibling, 2 replies; 11+ messages in thread
From: Jerome Marchand @ 2013-12-03 13:33 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, dave.hansen, Andrew Morton


Changes since v4:
 - revert to my initial overcommit_kbytes design as it is more
 consistent with current *_ratio/*_bytes implementation for other
 variables.

Some applications that run on HPC clusters are designed around the
availability of RAM and the overcommit ratio is fine tuned to get the
maximum usage of memory without swapping. With growing memory, the
1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
for these workloads (on a 2TB machine it represents no less than
20GB).

This patch adds the new overcommit_kbytes sysctl variable, which allows
a much finer grain.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
---
 Documentation/sysctl/vm.txt            |   12 ++++++++++++
 Documentation/vm/overcommit-accounting |    7 ++++---
 include/linux/mm.h                     |    5 +++++
 include/linux/mman.h                   |    1 +
 kernel/sysctl.c                        |   10 +++++++++-
 mm/mmap.c                              |   25 +++++++++++++++++++++++++
 mm/nommu.c                             |    1 +
 mm/util.c                              |   12 ++++++++++--
 8 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 1fbd4eb..739c21e 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -47,6 +47,7 @@ Currently, these files are in /proc/sys/vm:
 - numa_zonelist_order
 - oom_dump_tasks
 - oom_kill_allocating_task
+- overcommit_kbytes
 - overcommit_memory
 - overcommit_ratio
 - page-cluster
@@ -574,6 +575,17 @@ The default value is 0.
 
 ==============================================================
 
+overcommit_kbytes:
+
+When overcommit_memory is set to 2, the committed address space is not
+permitted to exceed swap plus this amount of physical RAM. See below.
+
+Note: overcommit_kbytes is the counterpart of overcommit_ratio. Only one
+of them may be specified at a time. Setting one disables the other (which
+then appears as 0 when read).
+
+==============================================================
+
 overcommit_memory:
 
 This value contains a flag that enables memory overcommitment.
diff --git a/Documentation/vm/overcommit-accounting b/Documentation/vm/overcommit-accounting
index 8eaa2fc..cbfaaa6 100644
--- a/Documentation/vm/overcommit-accounting
+++ b/Documentation/vm/overcommit-accounting
@@ -14,8 +14,8 @@ The Linux kernel supports the following overcommit handling modes
 
 2	-	Don't overcommit. The total address space commit
 		for the system is not permitted to exceed swap + a
-		configurable percentage (default is 50) of physical RAM.
-		Depending on the percentage you use, in most situations
+		configurable amount (default is 50%) of physical RAM.
+		Depending on the amount you use, in most situations
 		this means a process will not be killed while accessing
 		pages but will receive errors on memory allocation as
 		appropriate.
@@ -26,7 +26,8 @@ The Linux kernel supports the following overcommit handling modes
 
 The overcommit policy is set via the sysctl `vm.overcommit_memory'.
 
-The overcommit percentage is set via `vm.overcommit_ratio'.
+The overcommit amount can be set via `vm.overcommit_ratio' (percentage)
+or `vm.overcommit_kbytes' (absolute value).
 
 The current overcommit limit and amount committed are viewable in
 /proc/meminfo as CommitLimit and Committed_AS respectively.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1cedd00..8f17978 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -57,6 +57,11 @@ extern int sysctl_legacy_va_layout;
 extern unsigned long sysctl_user_reserve_kbytes;
 extern unsigned long sysctl_admin_reserve_kbytes;
 
+extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
+				    size_t *, loff_t *);
+extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
+				    size_t *, loff_t *);
+
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
 /* to align the pointer to the (next) page boundary */
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 7f7f8da..16373c8 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -9,6 +9,7 @@
 
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
+extern unsigned long sysctl_overcommit_kbytes;
 extern struct percpu_counter vm_committed_as;
 
 #ifdef CONFIG_SMP
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 34a6047..7877929 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -97,6 +97,7 @@
 /* External variables not in a header file. */
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
+extern unsigned long sysctl_overcommit_kbytes;
 extern int max_threads;
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
@@ -1128,7 +1129,14 @@ static struct ctl_table vm_table[] = {
 		.data		= &sysctl_overcommit_ratio,
 		.maxlen		= sizeof(sysctl_overcommit_ratio),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
+		.proc_handler	= overcommit_ratio_handler,
+	},
+	{
+		.procname	= "overcommit_kbytes",
+		.data		= &sysctl_overcommit_kbytes,
+		.maxlen		= sizeof(sysctl_overcommit_kbytes),
+		.mode		= 0644,
+		.proc_handler	= overcommit_kbytes_handler,
 	},
 	{
 		.procname	= "page-cluster", 
diff --git a/mm/mmap.c b/mm/mmap.c
index 834b2d7..b25167d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -86,6 +86,7 @@ EXPORT_SYMBOL(vm_get_page_prot);
 
 int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;  /* heuristic overcommit */
 int sysctl_overcommit_ratio __read_mostly = 50;	/* default is 50% */
+unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
 unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
@@ -95,6 +96,30 @@ unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
  */
 struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp;
 
+int overcommit_ratio_handler(struct ctl_table *table, int write,
+			     void __user *buffer, size_t *lenp,
+			     loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec(table, write, buffer, lenp, ppos);
+	if (ret == 0 && write)
+		sysctl_overcommit_kbytes = 0;
+	return ret;
+}
+
+int overcommit_kbytes_handler(struct ctl_table *table, int write,
+			     void __user *buffer, size_t *lenp,
+			     loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret == 0 && write)
+		sysctl_overcommit_ratio = 0;
+	return ret;
+}
+
 /*
  * The global memory commitment made in the system can be a metric
  * that can be used to drive ballooning decisions when Linux is hosted
diff --git a/mm/nommu.c b/mm/nommu.c
index fec093a..319ab8f 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -60,6 +60,7 @@ unsigned long highest_memmap_pfn;
 struct percpu_counter vm_committed_as;
 int sysctl_overcommit_memory = OVERCOMMIT_GUESS; /* heuristic overcommit */
 int sysctl_overcommit_ratio = 50; /* default is 50% */
+unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
 int sysctl_max_map_count = DEFAULT_MAX_MAP_COUNT;
 int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
diff --git a/mm/util.c b/mm/util.c
index f7bc209..73cf802 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -406,8 +406,16 @@ struct address_space *page_mapping(struct page *page)
  */
 unsigned long vm_commit_limit(void)
 {
-	return ((totalram_pages - hugetlb_total_pages())
-		* sysctl_overcommit_ratio / 100) + total_swap_pages;
+	unsigned long allowed;
+
+	if (sysctl_overcommit_kbytes)
+		allowed = sysctl_overcommit_kbytes >> (PAGE_SHIFT - 10);
+	else
+		allowed = ((totalram_pages - hugetlb_total_pages())
+			   * sysctl_overcommit_ratio / 100);
+	allowed += total_swap_pages;
+
+	return allowed;
 }
 
 
-- 
1.7.7.6
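
As a reading aid, the patched vm_commit_limit() logic can be modelled in
userspace roughly as follows. This is a sketch, not kernel code:
commit_limit_pages() and its parameters are hypothetical stand-ins for the
kernel globals used in the patch, and PAGE_SHIFT is assumed to be 12
(4 KiB pages).

```c
#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Userspace model of the patched vm_commit_limit().
 * All arguments are in pages except overcommit_kbytes, which is in KiB.
 * The function name and parameter list are illustrative only.
 */
static unsigned long commit_limit_pages(unsigned long totalram_pages,
					unsigned long hugetlb_pages,
					unsigned long total_swap_pages,
					int overcommit_ratio,
					unsigned long overcommit_kbytes)
{
	unsigned long allowed;

	if (overcommit_kbytes)
		/* KiB to pages: divide by (page size / 1 KiB) */
		allowed = overcommit_kbytes >> (PAGE_SHIFT - 10);
	else
		allowed = (totalram_pages - hugetlb_pages)
			  * overcommit_ratio / 100;

	/* Swap is added in both cases. */
	return allowed + total_swap_pages;
}
```

With these assumptions, a nonzero overcommit_kbytes takes precedence over
the ratio, and the swap size is added to the result either way.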





* Re: [PATCH v5] mm: add overcommit_kbytes sysctl variable
  2013-12-03 13:33   ` [PATCH v5] mm: add overcommit_kbytes sysctl variable Jerome Marchand
@ 2013-12-03 22:14     ` Andrew Morton
  2013-12-19  7:36     ` Olof Johansson
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2013-12-03 22:14 UTC (permalink / raw)
  To: Jerome Marchand; +Cc: linux-mm, linux-kernel, dave.hansen

On Tue, 03 Dec 2013 14:33:35 +0100 Jerome Marchand <jmarchan@redhat.com> wrote:

> 
> Changes since v4:
>  - revert to my initial overcommit_kbytes design as it is more
>  consistent with current *_ratio/*_bytes implementation for other
>  variables.
> 
> Some applications that run on HPC clusters are designed around the
> availability of RAM and the overcommit ratio is fine-tuned to get the
> maximum usage of memory without swapping. With growing memory, the
> 1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
> for these workloads (on a 2TB machine it represents no less than
> 20GB).
> 
> This patch adds the new overcommit_kbytes sysctl variable that allows a
> much finer grain.

Seems OK to me.
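
To put numbers on the granularity argument quoted above, here is a small
sketch in C. It is illustrative only: ratio_step_kb() is a hypothetical
helper, not a kernel function, and "2TB" is taken to mean 2 TiB.

```c
/*
 * Size of one overcommit_ratio step (1% of RAM), in KiB.
 * Hypothetical helper for illustration; not part of the patch.
 */
static unsigned long long ratio_step_kb(unsigned long long ram_kb)
{
	return ram_kb / 100;
}

/* Convert KiB to whole GiB (rounds down). */
static unsigned long long kb_to_gib(unsigned long long kb)
{
	return kb >> 20;
}
```

For ram_kb = 2ULL << 30 (2 TiB expressed in KiB), one ratio step is
21474836 KiB, or about 20 GiB, whereas overcommit_kbytes can be adjusted
in 1 KiB steps (rounded down to whole pages when the limit is computed).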

> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -574,6 +575,17 @@ The default value is 0.
>  
>  ==============================================================
>  
> +overcommit_kbytes:
> +
> +When overcommit_memory is set to 2, the committed address space is not
> +permitted to exceed swap plus this amount of physical RAM. See below.
> +
> +Note: overcommit_kbytes is the counterpart of overcommit_ratio. Only one
> +of them may be specified at a time. Setting one disable the other (which


--- a/Documentation/sysctl/vm.txt~mm-add-overcommit_kbytes-sysctl-variable-fix
+++ a/Documentation/sysctl/vm.txt
@@ -581,7 +581,7 @@ When overcommit_memory is set to 2, the
 permitted to exceed swap plus this amount of physical RAM. See below.
 
 Note: overcommit_kbytes is the counterpart of overcommit_ratio. Only one
-of them may be specified at a time. Setting one disable the other (which
+of them may be specified at a time. Setting one disables the other (which
 then appears as 0 when read).
 
 ==============================================================
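
The "setting one disables the other" rule documented above can be modelled
in userspace along these lines. This is a sketch under stated assumptions:
struct overcommit_knobs, set_ratio() and set_kbytes() are hypothetical
names that mimic what overcommit_ratio_handler() and
overcommit_kbytes_handler() do after a successful write.

```c
/* Hypothetical model of the two sysctl variables. */
struct overcommit_knobs {
	int ratio;		/* models sysctl_overcommit_ratio */
	unsigned long kbytes;	/* models sysctl_overcommit_kbytes */
};

/* Models a successful write to vm.overcommit_ratio. */
static void set_ratio(struct overcommit_knobs *k, int ratio)
{
	k->ratio = ratio;
	k->kbytes = 0;	/* the counterpart then reads back as 0 */
}

/* Models a successful write to vm.overcommit_kbytes. */
static void set_kbytes(struct overcommit_knobs *k, unsigned long kbytes)
{
	k->kbytes = kbytes;
	k->ratio = 0;	/* the counterpart then reads back as 0 */
}
```

One consequence of the handlers as posted: the counterpart is cleared on
any successful write, so writing 0 to overcommit_kbytes leaves both
values at 0.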



Please do use checkpatch.

From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes

WARNING: Non-standard signature: Signed-of-by:
#13: 
Signed-of-by: Jerome Marchand <jmarchan@redhat.com>

WARNING: externs should be avoided in .c files
#115: FILE: kernel/sysctl.c:100:
+extern unsigned long sysctl_overcommit_kbytes;

ERROR: do not initialise globals to 0 or NULL
#142: FILE: mm/mmap.c:89:
+unsigned long sysctl_overcommit_kbytes __read_mostly = 0;

ERROR: do not initialise globals to 0 or NULL
#184: FILE: mm/nommu.c:63:
+unsigned long sysctl_overcommit_kbytes __read_mostly = 0;

total: 2 errors, 2 warnings, 145 lines checked

./patches/mm-add-overcommit_kbytes-sysctl-variable.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |    4 ++++
 kernel/sysctl.c    |    3 ---
 mm/mmap.c          |    2 +-
 mm/nommu.c         |    2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff -puN include/linux/mm.h~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes include/linux/mm.h
--- a/include/linux/mm.h~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes
+++ a/include/linux/mm.h
@@ -57,6 +57,10 @@ extern int sysctl_legacy_va_layout;
 extern unsigned long sysctl_user_reserve_kbytes;
 extern unsigned long sysctl_admin_reserve_kbytes;
 
+extern int sysctl_overcommit_memory;
+extern int sysctl_overcommit_ratio;
+extern unsigned long sysctl_overcommit_kbytes;
+
 extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
 extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
diff -puN kernel/sysctl.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes kernel/sysctl.c
--- a/kernel/sysctl.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes
+++ a/kernel/sysctl.c
@@ -95,9 +95,6 @@
 #if defined(CONFIG_SYSCTL)
 
 /* External variables not in a header file. */
-extern int sysctl_overcommit_memory;
-extern int sysctl_overcommit_ratio;
-extern unsigned long sysctl_overcommit_kbytes;
 extern int max_threads;
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
diff -puN mm/mmap.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes mm/mmap.c
--- a/mm/mmap.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes
+++ a/mm/mmap.c
@@ -86,7 +86,7 @@ EXPORT_SYMBOL(vm_get_page_prot);
 
 int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;  /* heuristic overcommit */
 int sysctl_overcommit_ratio __read_mostly = 50;	/* default is 50% */
-unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
+unsigned long sysctl_overcommit_kbytes __read_mostly;
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
 unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
diff -puN mm/nommu.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes mm/nommu.c
--- a/mm/nommu.c~mm-add-overcommit_kbytes-sysctl-variable-checkpatch-fixes
+++ a/mm/nommu.c
@@ -60,7 +60,7 @@ unsigned long highest_memmap_pfn;
 struct percpu_counter vm_committed_as;
 int sysctl_overcommit_memory = OVERCOMMIT_GUESS; /* heuristic overcommit */
 int sysctl_overcommit_ratio = 50; /* default is 50% */
-unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
+unsigned long sysctl_overcommit_kbytes __read_mostly;
 int sysctl_max_map_count = DEFAULT_MAX_MAP_COUNT;
 int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;
 unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
_



* Re: [PATCH v5] mm: add overcommit_kbytes sysctl variable
  2013-12-03 13:33   ` [PATCH v5] mm: add overcommit_kbytes sysctl variable Jerome Marchand
  2013-12-03 22:14     ` Andrew Morton
@ 2013-12-19  7:36     ` Olof Johansson
  1 sibling, 0 replies; 11+ messages in thread
From: Olof Johansson @ 2013-12-19  7:36 UTC (permalink / raw)
  To: Jerome Marchand; +Cc: linux-mm, linux-kernel, dave.hansen, Andrew Morton

Hi,

On Tue, Dec 3, 2013 at 5:33 AM, Jerome Marchand <jmarchan@redhat.com> wrote:

[...]

> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 34a6047..7877929 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -97,6 +97,7 @@
>  /* External variables not in a header file. */
>  extern int sysctl_overcommit_memory;
>  extern int sysctl_overcommit_ratio;
> +extern unsigned long sysctl_overcommit_kbytes;
>  extern int max_threads;
>  extern int suid_dumpable;
>  #ifdef CONFIG_COREDUMP
> @@ -1128,7 +1129,14 @@ static struct ctl_table vm_table[] = {
>                 .data           = &sysctl_overcommit_ratio,
>                 .maxlen         = sizeof(sysctl_overcommit_ratio),
>                 .mode           = 0644,
> -               .proc_handler   = proc_dointvec,
> +               .proc_handler   = overcommit_ratio_handler,
> +       },
> +       {
> +               .procname       = "overcommit_kbytes",
> +               .data           = &sysctl_overcommit_kbytes,
> +               .maxlen         = sizeof(sysctl_overcommit_kbytes),
> +               .mode           = 0644,
> +               .proc_handler   = overcommit_kbytes_handler,
>         },
>         {
>                 .procname       = "page-cluster",
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 834b2d7..b25167d 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -86,6 +86,7 @@ EXPORT_SYMBOL(vm_get_page_prot);
>
>  int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;  /* heuristic overcommit */
>  int sysctl_overcommit_ratio __read_mostly = 50;        /* default is 50% */
> +unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
>  int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
>  unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
>  unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
> @@ -95,6 +96,30 @@ unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
>   */
>  struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp;
>
> +int overcommit_ratio_handler(struct ctl_table *table, int write,
> +                            void __user *buffer, size_t *lenp,
> +                            loff_t *ppos)
> +{
> +       int ret;
> +
> +       ret = proc_dointvec(table, write, buffer, lenp, ppos);
> +       if (ret == 0 && write)
> +               sysctl_overcommit_kbytes = 0;
> +       return ret;
> +}
> +
> +int overcommit_kbytes_handler(struct ctl_table *table, int write,
> +                            void __user *buffer, size_t *lenp,
> +                            loff_t *ppos)
> +{
> +       int ret;
> +
> +       ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
> +       if (ret == 0 && write)
> +               sysctl_overcommit_ratio = 0;
> +       return ret;
> +}
> +
>  /*
>   * The global memory commitment made in the system can be a metric
>   * that can be used to drive ballooning decisions when Linux is hosted
> diff --git a/mm/nommu.c b/mm/nommu.c
> index fec093a..319ab8f 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -60,6 +60,7 @@ unsigned long highest_memmap_pfn;
>  struct percpu_counter vm_committed_as;
>  int sysctl_overcommit_memory = OVERCOMMIT_GUESS; /* heuristic overcommit */
>  int sysctl_overcommit_ratio = 50; /* default is 50% */
> +unsigned long sysctl_overcommit_kbytes __read_mostly = 0;
>  int sysctl_max_map_count = DEFAULT_MAX_MAP_COUNT;
>  int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;
>  unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */

You add the variable on the nommu side, but not the functions to
handle the sysctl. So things fail to compile on !MMU builds with:

kernel/built-in.o:(.data+0x4e0): undefined reference to `overcommit_ratio_handler'
kernel/built-in.o:(.data+0x504): undefined reference to `overcommit_kbytes_handler'

I don't know mm well enough to tell whether copying the code over
verbatim is the right thing to do, or whether there is a preferred
shared location to move it to.


-Olof



end of thread, other threads:[~2013-12-19  7:36 UTC | newest]

Thread overview: 11+ messages
2013-10-18 12:56 [PATCH v4 1/2] mm: factor commit limit calculation Jerome Marchand
2013-10-18 12:56 ` [PATCH v4 2/2] mm: allow to set overcommit ratio more precisely Jerome Marchand
2013-11-05 23:53   ` Andrew Morton
2013-11-06  8:42     ` Jerome Marchand
2013-11-06 22:33       ` Andrew Morton
2013-11-06 23:49         ` Dave Hansen
2013-11-07 10:43           ` Jerome Marchand
2013-12-03 13:33   ` [PATCH v5] mm: add overcommit_kbytes sysctl variable Jerome Marchand
2013-12-03 22:14     ` Andrew Morton
2013-12-19  7:36     ` Olof Johansson
2013-11-05 23:51 ` [PATCH v4 1/2] mm: factor commit limit calculation Andrew Morton
