* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
@ 2020-11-27  7:14 Joerg Vehlow
  2020-11-30  8:01 ` Richard Palethorpe
  2020-12-03 15:30 ` Cyril Hrubis
  0 siblings, 2 replies; 7+ messages in thread
From: Joerg Vehlow @ 2020-11-27  7:14 UTC (permalink / raw)
  To: ltp

From: Joerg Vehlow <joerg.vehlow@aox-tech.de>

Changes to v2:
 - Removed useless call to get_total_batch_size_bytes
 - Calculate batch size before reading /proc/meminfo
   This should reduce memory allocations after meminfo was read

The test sets the overcommit policy to never overcommit and then tries
to allocate the commit limit reported by /proc/meminfo. This value is the
exact amount of memory (at least at that point in time) that can be
allocated according to the policy and ratio settings. The allocation should
fail, since some memory is already allocated for running the test program,
but due to inaccurate memory accounting in mm/util.c __vm_enough_memory(),
it can still succeed.

The committed memory is stored in a percpu counter that consists of
1 + ncpus variables. Small allocations and deallocations are counted in a
per-cpu counter, without locking. Only when this per-cpu counter reaches a
threshold is its value folded into a global counter, which is why the
global counter can even become negative. This global counter is the only
thing taken into account in __vm_enough_memory(), probably for performance
reasons, because otherwise a lock would be required.
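
As an illustration only, here is a minimal user-space model of the batched
accounting (a sketch, not the actual kernel code; names and the flush
condition are simplified):

	/*
	 * Each cpu accumulates small deltas in its own local counter and
	 * only folds them into the global counter once the local value
	 * exceeds the batch threshold. __vm_enough_memory() reads just the
	 * global counter, so it can lag the true total by up to
	 * ncpus * batch pages.
	 */
	#define NCPUS	4
	#define BATCH	64

	static long global_pages;
	static long local_pages[NCPUS];

	static void account_pages(int cpu, long pages)
	{
		local_pages[cpu] += pages;
		if (local_pages[cpu] > BATCH || local_pages[cpu] < -BATCH) {
			global_pages += local_pages[cpu];
			local_pages[cpu] = 0;
		}
	}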

Because of this inaccuracy the system can overcommit by up to the number
of cpus times the threshold value. By adding this value to the exact commit
limit reported by /proc/meminfo, we can be sure that we really always hit
the memory limit.
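
For a rough sense of the magnitude (hypothetical numbers: 4 cpus, 8 GiB of
RAM, 4 KiB pages; see mm_compute_batch() in mm/mm_init.c, the INT32_MAX
clamp does not matter at this size):

	batch     = max(2 * ncpus, max(32, (totalram / pagesize) / ncpus / 256))
	          = max(8, max(32, 2097152 / 4 / 256)) = 2048 pages
	max error = ncpus * batch * pagesize = 4 * 2048 * 4 KiB = 32 MiB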

Signed-off-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
---
 .../kernel/mem/tunable/overcommit_memory.c    | 56 +++++++++++++------
 1 file changed, 39 insertions(+), 17 deletions(-)

diff --git a/testcases/kernel/mem/tunable/overcommit_memory.c b/testcases/kernel/mem/tunable/overcommit_memory.c
index f77939908..918b4e68e 100644
--- a/testcases/kernel/mem/tunable/overcommit_memory.c
+++ b/testcases/kernel/mem/tunable/overcommit_memory.c
@@ -1,18 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * Copyright (c) Linux Test Project, 2012-2020
- * Copyright (C) 2012-2017  Red Hat, Inc.
- *
- * This program is free software;  you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY;  without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
- * the GNU General Public License for more details.
- *
- * Descriptions:
+ * Copyright (c) 2012-2020 Linux Test Project
+ * Copyright (c) 2012-2017 Red Hat, Inc.
  *
  * There are two tunables overcommit_memory and overcommit_ratio under
  * /proc/sys/vm/, which can control memory overcommitment.
@@ -53,12 +42,16 @@
  * the system is limited to CommitLimit(Swap+RAM*overcommit_ratio)
  * commit_left(allocatable memory) = CommitLimit - Committed_AS
  * a. less than commit_left:    commit_left / 2, alloc should pass
- * b. greater than commit_left: commit_left * 2, alloc should fail
- * c. overcommit limit:         CommitLimit,     alloc should fail
+ * b. overcommit limit:         CommitLimit + TotalBatchSize, should fail
+ * c. greater than commit_left: commit_left * 2, alloc should fail
  * *note: CommitLimit is the current overcommit limit.
  *        Committed_AS is the amount of memory that system has used.
  * it couldn't choose 'equal to commit_left' as a case, because
  * commit_left rely on Committed_AS, but the Committed_AS is not stable.
+ * *note2: TotalBatchSize is the total number of bytes, that can be
+ *         accounted for in the per cpu counters for the vm_committed_as
+ *         counter. Since the check used by malloc only looks at the
+ *         global counter of vm_committed_as, it can overallocate a bit.
  *
  * References:
  * - Documentation/sysctl/vm.txt
@@ -89,11 +82,13 @@ static long sum_total;
 static long free_total;
 static long commit_limit;
 static long commit_left;
+static long total_batch_size;
 
 static int heavy_malloc(long size);
 static void alloc_and_check(long size, int expect_result);
 static void update_mem(void);
 static void update_mem_commit(void);
+static long get_total_batch_size_bytes(void);
 
 static void setup(void)
 {
@@ -154,7 +149,7 @@ static void overcommit_memory_test(void)
 
 	update_mem_commit();
 	alloc_and_check(commit_left * 2, EXPECT_FAIL);
-	alloc_and_check(commit_limit, EXPECT_FAIL);
+	alloc_and_check(commit_limit + total_batch_size, EXPECT_FAIL);
 	update_mem_commit();
 	alloc_and_check(commit_left / 2, EXPECT_PASS);
 
@@ -228,6 +223,8 @@ static void update_mem_commit(void)
 {
 	long committed;
 
+	total_batch_size = get_total_batch_size_bytes();
+
 	commit_limit = SAFE_READ_MEMINFO("CommitLimit:");
 	committed = SAFE_READ_MEMINFO("Committed_AS:");
 	commit_left = commit_limit - committed;
@@ -247,6 +244,31 @@ static void update_mem_commit(void)
 	}
 }
 
+static long get_total_batch_size_bytes(void)
+{
+	struct sysinfo info;
+	long ncpus = tst_ncpus_conf();
+	long pagesize = getpagesize();
+	SAFE_SYSINFO(&info);
+
+	/* see linux source mm/mm_init.c mm_compute_batch() (This is in pages) */
+	long batch_size = MAX(
+			ncpus * 2,
+			MAX(
+				32,
+				MIN(
+					INT32_MAX,
+					(long)(info.totalram / pagesize) / ncpus / 256
+				)
+			)
+		);
+
+	/* there are ncpu separate counters, that can all grow up to
+	 * batch_size. So the maximum error for __vm_enough_memory is
+	 * batch_size * ncpus. */
+	return batch_size * ncpus * pagesize;
+}
+
 static struct tst_test test = {
 	.needs_root = 1,
 	.options = options,
-- 
2.25.1



* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-11-27  7:14 [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest Joerg Vehlow
@ 2020-11-30  8:01 ` Richard Palethorpe
  2020-11-30  8:30   ` Joerg Vehlow
  2020-12-03 15:30 ` Cyril Hrubis
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Palethorpe @ 2020-11-30  8:01 UTC (permalink / raw)
  To: ltp

Hello,

Joerg Vehlow <lkml@jv-coder.de> writes:

> From: Joerg Vehlow <joerg.vehlow@aox-tech.de>
>
> Changes to v2:
>  - Removed useless call to get_total_batch_size_bytes
>  - Calculate batch size before reading /proc/meminfo
>    This should reduce memory allocations after meminfo was read

Still mostly LGTM, but I have one question about CPU conf below.

>
> The test sets the overcommit policy to never overcommit and then tries
> to allocate the commit limit reported by /proc/meminfo. This value is the
> exact amount of memory (at least at that point in time) that can be
> allocated according to the policy and ratio settings. The allocation should
> fail, since some memory is already allocated for running the test program,
> but due to inaccurate memory accounting in mm/util.c __vm_enough_memory(),
> it can still succeed.
>
> The committed memory is stored in a percpu counter that consists of
> 1 + ncpus variables. Small allocations and deallocations are counted in a
> per-cpu counter, without locking. Only when this per-cpu counter reaches a
> threshold is its value folded into a global counter, which is why the
> global counter can even become negative. This global counter is the only
> thing taken into account in __vm_enough_memory(), probably for performance
> reasons, because otherwise a lock would be required.
>
> Because of this inaccuracy the system can overcommit by up to the number
> of cpus times the threshold value. By adding this value to the exact commit
> limit reported by /proc/meminfo, we can be sure that we really always hit
> the memory limit.
>
> Signed-off-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
> ---
>  .../kernel/mem/tunable/overcommit_memory.c    | 56 +++++++++++++------
>  1 file changed, 39 insertions(+), 17 deletions(-)
>
> diff --git a/testcases/kernel/mem/tunable/overcommit_memory.c b/testcases/kernel/mem/tunable/overcommit_memory.c
> index f77939908..918b4e68e 100644
> --- a/testcases/kernel/mem/tunable/overcommit_memory.c
> +++ b/testcases/kernel/mem/tunable/overcommit_memory.c
> @@ -1,18 +1,7 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
>  /*
> - * Copyright (c) Linux Test Project, 2012-2020
> - * Copyright (C) 2012-2017  Red Hat, Inc.
> - *
> - * This program is free software;  you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2 of the License, or
> - * (at your option) any later version.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY;  without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
> - * the GNU General Public License for more details.
> - *
> - * Descriptions:
> + * Copyright (c) 2012-2020 Linux Test Project
> + * Copyright (c) 2012-2017 Red Hat, Inc.
>   *
>   * There are two tunables overcommit_memory and overcommit_ratio under
>   * /proc/sys/vm/, which can control memory overcommitment.
> @@ -53,12 +42,16 @@
>   * the system is limited to CommitLimit(Swap+RAM*overcommit_ratio)
>   * commit_left(allocatable memory) = CommitLimit - Committed_AS
>   * a. less than commit_left:    commit_left / 2, alloc should pass
> - * b. greater than commit_left: commit_left * 2, alloc should fail
> - * c. overcommit limit:         CommitLimit,     alloc should fail
> + * b. overcommit limit:         CommitLimit + TotalBatchSize, should fail
> + * c. greater than commit_left: commit_left * 2, alloc should fail
>   * *note: CommitLimit is the current overcommit limit.
>   *        Committed_AS is the amount of memory that system has used.
>   * it couldn't choose 'equal to commit_left' as a case, because
>   * commit_left rely on Committed_AS, but the Committed_AS is not stable.
> + * *note2: TotalBatchSize is the total number of bytes, that can be
> + *         accounted for in the per cpu counters for the vm_committed_as
> + *         counter. Since the check used by malloc only looks at the
> + *         global counter of vm_committed_as, it can overallocate a bit.
>   *
>   * References:
>   * - Documentation/sysctl/vm.txt
> @@ -89,11 +82,13 @@ static long sum_total;
>  static long free_total;
>  static long commit_limit;
>  static long commit_left;
> +static long total_batch_size;
>  
>  static int heavy_malloc(long size);
>  static void alloc_and_check(long size, int expect_result);
>  static void update_mem(void);
>  static void update_mem_commit(void);
> +static long get_total_batch_size_bytes(void);
>  
>  static void setup(void)
>  {
> @@ -154,7 +149,7 @@ static void overcommit_memory_test(void)
>  
>  	update_mem_commit();
>  	alloc_and_check(commit_left * 2, EXPECT_FAIL);
> -	alloc_and_check(commit_limit, EXPECT_FAIL);
> +	alloc_and_check(commit_limit + total_batch_size, EXPECT_FAIL);
>  	update_mem_commit();
>  	alloc_and_check(commit_left / 2, EXPECT_PASS);
>  
> @@ -228,6 +223,8 @@ static void update_mem_commit(void)
>  {
>  	long committed;
>  
> +	total_batch_size = get_total_batch_size_bytes();
> +
>  	commit_limit = SAFE_READ_MEMINFO("CommitLimit:");
>  	committed = SAFE_READ_MEMINFO("Committed_AS:");
>  	commit_left = commit_limit - committed;
> @@ -247,6 +244,31 @@ static void update_mem_commit(void)
>  	}
>  }
>  
> +static long get_total_batch_size_bytes(void)
> +{
> +	struct sysinfo info;
> +	long ncpus = tst_ncpus_conf();

I'm not completely sure whether this is the same value as num_present_cpus()
in the kernel. I'm just wondering if CPU hotplugging could result in the
wrong value being calculated (other than if a CPU is hotplugged while the
test is running).

-- 
Thank you,
Richard.


* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-11-30  8:01 ` Richard Palethorpe
@ 2020-11-30  8:30   ` Joerg Vehlow
  2020-11-30  8:54     ` Richard Palethorpe
  0 siblings, 1 reply; 7+ messages in thread
From: Joerg Vehlow @ 2020-11-30  8:30 UTC (permalink / raw)
  To: ltp

Hi,
>> +static long get_total_batch_size_bytes(void)
>> +{
>> +	struct sysinfo info;
>> +	long ncpus = tst_ncpus_conf();
> I'm not completely sure whether this is the same value as num_present_cpus()
> in the kernel. I'm just wondering if CPU hotplugging could result in the
> wrong value being calculated (other than if a CPU is hotplugged while the
> test is running).

I was thinking about this as well when I implemented this. Here is my 
reasoning:

If hotplug is disabled, possible == present and both equal the number of
cpus at boot. Otherwise present is the real number of existing (not
necessarily enabled) cpus, and possible == NR_CPUS.
In both cases present is the number of cpus installed in the system,
enabled or not.

tst_ncpus_conf is _SC_NPROCESSORS_CONF, which is documented as "returns 
the number of processors the operating system configured. But it might 
be possible for the operating system to disable individual processors 
and so the call", in contrast to _SC_NPROCESSORS_ONLN "returns the 
number of processors which are currently online (i.e., available).".

I would interpret _SC_NPROCESSORS_CONF as equal to present and 
_SC_NPROCESSORS_ONLN as equal to online.
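
For illustration, the two values can be compared directly with a small
sysconf() check (just a sketch):

	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		/* configured (roughly "present") vs. currently online cpus */
		printf("conf: %ld online: %ld\n",
		       sysconf(_SC_NPROCESSORS_CONF),
		       sysconf(_SC_NPROCESSORS_ONLN));
		return 0;
	}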

Any flaw in my logic?

Jörg


* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-11-30  8:30   ` Joerg Vehlow
@ 2020-11-30  8:54     ` Richard Palethorpe
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Palethorpe @ 2020-11-30  8:54 UTC (permalink / raw)
  To: ltp

Hello,

Joerg Vehlow <lkml@jv-coder.de> writes:

> Hi,
>>> +static long get_total_batch_size_bytes(void)
>>> +{
>>> +	struct sysinfo info;
>>> +	long ncpus = tst_ncpus_conf();
>> I'm not completely sure whether this is the same value as num_present_cpus()
>> in the kernel. I'm just wondering if CPU hotplugging could result in the
>> wrong value being calculated (other than if a CPU is hotplugged while the
>> test is running).
>
> I was thinking about this as well when I implemented this. Here is my
> reasoning:
>
> If hotplug is disabled, possible == present and both equal the number of
> cpus at boot. Otherwise present is the real number of existing (not
> necessarily enabled) cpus, and possible == NR_CPUS.
> In both cases present is the number of cpus installed in the system,
> enabled or not.
>
> tst_ncpus_conf is _SC_NPROCESSORS_CONF, which is documented as
> "returns the number of processors the operating system configured. But
> it might be possible for the operating system to disable individual
> processors and so the call", in contrast to _SC_NPROCESSORS_ONLN
> "returns the number of processors which are currently online (i.e.,
> available).".
>
> I would interpret _SC_NPROCESSORS_CONF as equal to present and
> _SC_NPROCESSORS_ONLN as equal to online.
>
> Any flaw in my logic?
>
> Jörg

This sounds correct, it would be strange if _SC_NPROCESSORS_CONF
returned num_possible_cpus().

-- 
Thank you,
Richard.


* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-11-27  7:14 [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest Joerg Vehlow
  2020-11-30  8:01 ` Richard Palethorpe
@ 2020-12-03 15:30 ` Cyril Hrubis
  2020-12-04  5:35   ` Joerg Vehlow
  1 sibling, 1 reply; 7+ messages in thread
From: Cyril Hrubis @ 2020-12-03 15:30 UTC (permalink / raw)
  To: ltp

Hi!
> +static long get_total_batch_size_bytes(void)
> +{
> +	struct sysinfo info;
> +	long ncpus = tst_ncpus_conf();
> +	long pagesize = getpagesize();
> +	SAFE_SYSINFO(&info);
> +
> +	/* see linux source mm/mm_init.c mm_compute_batch() (This is in pages) */
> +	long batch_size = MAX(
> +			ncpus * 2,
> +			MAX(
> +				32,
> +				MIN(
> +					INT32_MAX,
> +					(long)(info.totalram / pagesize) / ncpus / 256
> +				)
> +			)
> +		);
> +
> +	/* there are ncpu separate counters, that can all grow up to
> +	 * batch_size. So the maximum error for __vm_enough_memory is
> +	 * batch_size * ncpus. */
> +	return batch_size * ncpus * pagesize;
> +}

This version looks better. However two (minor) comments for this
function still apply.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-12-03 15:30 ` Cyril Hrubis
@ 2020-12-04  5:35   ` Joerg Vehlow
  2020-12-11 14:29     ` Cyril Hrubis
  0 siblings, 1 reply; 7+ messages in thread
From: Joerg Vehlow @ 2020-12-04  5:35 UTC (permalink / raw)
  To: ltp

Hi Cyril,
> This version looks better. However two (minor) comments for this
> function still apply.
I'll respond to the comments here:
> Why don't we put the first argument on the same line as the MAX( or MIN(
> here? That would be much more compact but still nicely readable.
Will change that
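E.g., roughly like this (just a sketch of the more compact formatting):

	long batch_size = MAX(ncpus * 2,
			      MAX(32,
				  MIN(INT32_MAX,
				      (long)(info.totalram / pagesize) / ncpus / 256)));
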
> I do wonder, are the counters flushed if task is migrated between CPUs?
> If so we wouldn't need the multiplication by bcpus.
IIRC, no, they are not flushed, and there wouldn't be any reason to do so.
These counters are not supposed to provide per-cpu statistics; they are
just used to prevent lock contention. If a task is migrated to another
cpu and its memory is freed there, the counter may become negative,
which is perfectly fine.
Additionally, this total batch size is the maximum amount the global
counter can deviate from the real value. It takes into account all
processes running on all cpus, because the overcommit limit is a global
limit, not a per-task one.
> Can we please call the get_total_batch_size_bytes() in the test setup and
> store the value.
I think I left this in the test function because there is a slight
chance that the value changes over time due to hotplug. If cpus are
added (really plugged into the system, not only enabled) or if memory is
added, the value changes. In VMs with memory ballooning, I think this is
more likely to happen. The cost is not very high, so why add the
possibility of breaking this?

If you are ok with me only changing the formatting of the MIN/MAX stuff, 
I'll send a v4.

Jörg


* [LTP] [PATCH v3] overcommit_memory: Fix unstable subtest
  2020-12-04  5:35   ` Joerg Vehlow
@ 2020-12-11 14:29     ` Cyril Hrubis
  0 siblings, 0 replies; 7+ messages in thread
From: Cyril Hrubis @ 2020-12-11 14:29 UTC (permalink / raw)
  To: ltp

Hi!
> > I do wonder, are the counters flushed if task is migrated between CPUs?
> > If so we wouldn't need the multiplication by bcpus.
> IIRC, no, they are not flushed, and there wouldn't be any reason to do so.
> These counters are not supposed to provide per-cpu statistics; they are
> just used to prevent lock contention. If a task is migrated to another
> cpu and its memory is freed there, the counter may become negative,
> which is perfectly fine.
> Additionally, this total batch size is the maximum amount the global
> counter can deviate from the real value. It takes into account all
> processes running on all cpus, because the overcommit limit is a global
> limit, not a per-task one.

Sounds reasonable.

Btw I wonder if we can get into a situation where
commit_left < total_batch_size, but I guess that it's unlikely.

> > Can we please call the get_total_batch_size_bytes() in the test setup and
> > store the value.
> I think I left this in the test function because there is a slight
> chance that the value changes over time due to hotplug. If cpus are
> added (really plugged into the system, not only enabled) or if memory is
> added, the value changes. In VMs with memory ballooning, I think this is
> more likely to happen. The cost is not very high, so why add the
> possibility of breaking this?

Well, this does not fix the race; in fact it does not even shrink the
race window that much. If the hotplug happens anywhere after the call to
tst_ncpus_conf(), it will still fail, since the change will not be
reflected in the test anyway.

So I would prefer simply stating that the test results are undefined if
hotplug happens while the test is running and putting the call into the
test setup. There are a couple of NUMA tests in LTP that are not CPU
hotplug safe anyway.

-- 
Cyril Hrubis
chrubis@suse.cz


