linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes
@ 2020-03-12 17:32 Chris Down
  2020-03-12 17:32 ` [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing Chris Down
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team
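
For readers less familiar with the problem class: "tearing" is the
compiler splitting (or duplicating) a plain word-sized load or store.
The sketch below is illustrative only -- it is not code from this
series -- but shows the pattern these patches apply, using
READ_ONCE()/WRITE_ONCE() from <linux/compiler.h> so that every access
to a tunable is a single, untorn load or store:

	/* Hypothetical example, not kernel code from this series. */
	struct tunables {
		unsigned long high;	/* written from a control-file handler */
	};

	/* writer side, e.g. a memory.high write handler */
	static void set_high(struct tunables *t, unsigned long new_high)
	{
		WRITE_ONCE(t->high, new_high);	/* one untorn store */
	}

	/* reader side, e.g. reclaim checking whether usage exceeds the limit */
	static bool over_high(struct tunables *t, unsigned long usage)
	{
		return usage > READ_ONCE(t->high);	/* one untorn load */
	}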

Chris Down (6):
  mm, memcg: Prevent memory.high load/store tearing
  mm, memcg: Prevent memory.max load tearing
  mm, memcg: Prevent memory.low load/store tearing
  mm, memcg: Prevent memory.min load/store tearing
  mm, memcg: Prevent memory.swap.max load tearing
  mm, memcg: Prevent mem_cgroup_protected store tearing

 mm/memcontrol.c   | 42 ++++++++++++++++++++++--------------------
 mm/page_counter.c | 17 +++++++++++------
 2 files changed, 33 insertions(+), 26 deletions(-)

-- 
2.25.1


* [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
@ 2020-03-12 17:32 ` Chris Down
  2020-03-16 14:54   ` Michal Hocko
  2020-03-12 17:32 ` [PATCH 2/6] mm, memcg: Prevent memory.max load tearing Chris Down
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

A mem_cgroup's high attribute can be set concurrently while we are
reading it -- for example, if memory_high_write() runs at the same time
as high reclaim.

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 63bb6a2aab81..d32d3c0a16d4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2228,7 +2228,7 @@ static void reclaim_high(struct mem_cgroup *memcg,
 			 gfp_t gfp_mask)
 {
 	do {
-		if (page_counter_read(&memcg->memory) <= memcg->high)
+		if (page_counter_read(&memcg->memory) <= READ_ONCE(memcg->high))
 			continue;
 		memcg_memory_event(memcg, MEMCG_HIGH);
 		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
@@ -2545,7 +2545,7 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 * reclaim, the cost of mismatch is negligible.
 	 */
 	do {
-		if (page_counter_read(&memcg->memory) > memcg->high) {
+		if (page_counter_read(&memcg->memory) > READ_ONCE(memcg->high)) {
 			/* Don't bother a random interrupted task */
 			if (in_interrupt()) {
 				schedule_work(&memcg->high_work);
@@ -4257,7 +4257,8 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 	*pheadroom = PAGE_COUNTER_MAX;
 
 	while ((parent = parent_mem_cgroup(memcg))) {
-		unsigned long ceiling = min(memcg->memory.max, memcg->high);
+		unsigned long ceiling = min(memcg->memory.max,
+					    READ_ONCE(memcg->high));
 		unsigned long used = page_counter_read(&memcg->memory);
 
 		*pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
@@ -4978,7 +4979,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	if (!memcg)
 		return ERR_PTR(error);
 
-	memcg->high = PAGE_COUNTER_MAX;
+	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
 	memcg->soft_limit = PAGE_COUNTER_MAX;
 	if (parent) {
 		memcg->swappiness = mem_cgroup_swappiness(parent);
@@ -5131,7 +5132,7 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
 	page_counter_set_max(&memcg->tcpmem, PAGE_COUNTER_MAX);
 	page_counter_set_min(&memcg->memory, 0);
 	page_counter_set_low(&memcg->memory, 0);
-	memcg->high = PAGE_COUNTER_MAX;
+	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
 	memcg->soft_limit = PAGE_COUNTER_MAX;
 	memcg_wb_domain_size_changed(memcg);
 }
@@ -5947,7 +5948,7 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
 	if (err)
 		return err;
 
-	memcg->high = high;
+	WRITE_ONCE(memcg->high, high);
 
 	for (;;) {
 		unsigned long nr_pages = page_counter_read(&memcg->memory);
-- 
2.25.1


* [PATCH 2/6] mm, memcg: Prevent memory.max load tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
  2020-03-12 17:32 ` [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing Chris Down
@ 2020-03-12 17:32 ` Chris Down
  2020-03-16 14:56   ` Michal Hocko
  2020-03-12 17:33 ` [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing Chris Down
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

This one is a bit more nuanced because we have memcg_max_mutex, which is
mostly just used for enforcing invariants, but we still need to
READ_ONCE since (despite its name) it doesn't really protect memory.max
access.

On write (page_counter_set_max() and memory_max_write()) we use xchg(),
which uses smp_mb(), so that's already fine.
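
As a concrete (and purely illustrative) sketch of that pattern -- the
real code lives in mm/page_counter.c and mm/memcontrol.c and looks
different -- the writer publishes the limit with xchg(), a single
atomic read-modify-write that cannot tear and is fully ordered, while
readers use READ_ONCE() so the compiler emits exactly one load:

	/* Illustrative only -- not the actual page_counter implementation. */
	struct limit {
		unsigned long max;
	};

	static void limit_set_max(struct limit *l, unsigned long nr_pages)
	{
		xchg(&l->max, nr_pages);	/* atomic RMW: no store tearing */
	}

	static unsigned long limit_get_max(struct limit *l)
	{
		return READ_ONCE(l->max);	/* single load: no load tearing */
	}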

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d32d3c0a16d4..aca2964ea494 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1507,7 +1507,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 
 	pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
 		K((u64)page_counter_read(&memcg->memory)),
-		K((u64)memcg->memory.max), memcg->memory.failcnt);
+		K((u64)READ_ONCE(memcg->memory.max)), memcg->memory.failcnt);
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		pr_info("swap: usage %llukB, limit %llukB, failcnt %lu\n",
 			K((u64)page_counter_read(&memcg->swap)),
@@ -1538,7 +1538,7 @@ unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg)
 {
 	unsigned long max;
 
-	max = memcg->memory.max;
+	max = READ_ONCE(memcg->memory.max);
 	if (mem_cgroup_swappiness(memcg)) {
 		unsigned long memsw_max;
 		unsigned long swap_max;
@@ -3006,7 +3006,7 @@ static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
 		 * Make sure that the new limit (memsw or memory limit) doesn't
 		 * break our basic invariant rule memory.max <= memsw.max.
 		 */
-		limits_invariant = memsw ? max >= memcg->memory.max :
+		limits_invariant = memsw ? max >= READ_ONCE(memcg->memory.max) :
 					   max <= memcg->memsw.max;
 		if (!limits_invariant) {
 			mutex_unlock(&memcg_max_mutex);
@@ -3753,8 +3753,8 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 	/* Hierarchical information */
 	memory = memsw = PAGE_COUNTER_MAX;
 	for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) {
-		memory = min(memory, mi->memory.max);
-		memsw = min(memsw, mi->memsw.max);
+		memory = min(memory, READ_ONCE(mi->memory.max));
+		memsw = min(memsw, READ_ONCE(mi->memsw.max));
 	}
 	seq_printf(m, "hierarchical_memory_limit %llu\n",
 		   (u64)memory * PAGE_SIZE);
@@ -4257,7 +4257,7 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 	*pheadroom = PAGE_COUNTER_MAX;
 
 	while ((parent = parent_mem_cgroup(memcg))) {
-		unsigned long ceiling = min(memcg->memory.max,
+		unsigned long ceiling = min(READ_ONCE(memcg->memory.max),
 					    READ_ONCE(memcg->high));
 		unsigned long used = page_counter_read(&memcg->memory);
 
-- 
2.25.1


* [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
  2020-03-12 17:32 ` [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing Chris Down
  2020-03-12 17:32 ` [PATCH 2/6] mm, memcg: Prevent memory.max load tearing Chris Down
@ 2020-03-12 17:33 ` Chris Down
  2020-03-16 14:57   ` Michal Hocko
  2020-06-12 17:03   ` Michal Koutný
  2020-03-12 17:33 ` [PATCH 4/6] mm, memcg: Prevent memory.min " Chris Down
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

This can be set concurrently with reads, which may cause the wrong value
to be propagated.

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c   | 4 ++--
 mm/page_counter.c | 9 ++++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index aca2964ea494..c85a304fa4a1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6262,7 +6262,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 		return MEMCG_PROT_NONE;
 
 	emin = memcg->memory.min;
-	elow = memcg->memory.low;
+	elow = READ_ONCE(memcg->memory.low);
 
 	parent = parent_mem_cgroup(memcg);
 	/* No parent means a non-hierarchical mode on v1 memcg */
@@ -6291,7 +6291,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (elow && parent_elow) {
 		unsigned long low_usage, siblings_low_usage;
 
-		low_usage = min(usage, memcg->memory.low);
+		low_usage = min(usage, READ_ONCE(memcg->memory.low));
 		siblings_low_usage = atomic_long_read(
 			&parent->memory.children_low_usage);
 
diff --git a/mm/page_counter.c b/mm/page_counter.c
index 50184929b61f..18b7f779f2e2 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -17,6 +17,7 @@ static void propagate_protected_usage(struct page_counter *c,
 				      unsigned long usage)
 {
 	unsigned long protected, old_protected;
+	unsigned long low;
 	long delta;
 
 	if (!c->parent)
@@ -34,8 +35,10 @@ static void propagate_protected_usage(struct page_counter *c,
 			atomic_long_add(delta, &c->parent->children_min_usage);
 	}
 
-	if (c->low || atomic_long_read(&c->low_usage)) {
-		if (usage <= c->low)
+	low = READ_ONCE(c->low);
+
+	if (low || atomic_long_read(&c->low_usage)) {
+		if (usage <= low)
 			protected = usage;
 		else
 			protected = 0;
@@ -231,7 +234,7 @@ void page_counter_set_low(struct page_counter *counter, unsigned long nr_pages)
 {
 	struct page_counter *c;
 
-	counter->low = nr_pages;
+	WRITE_ONCE(counter->low, nr_pages);
 
 	for (c = counter; c; c = c->parent)
 		propagate_protected_usage(c, atomic_long_read(&c->usage));
-- 
2.25.1


* [PATCH 4/6] mm, memcg: Prevent memory.min load/store tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
                   ` (2 preceding siblings ...)
  2020-03-12 17:33 ` [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing Chris Down
@ 2020-03-12 17:33 ` Chris Down
  2020-03-16 14:58   ` Michal Hocko
  2020-03-12 17:33 ` [PATCH 5/6] mm, memcg: Prevent memory.swap.max load tearing Chris Down
  2020-03-12 17:33 ` [PATCH 6/6] mm, memcg: Prevent mem_cgroup_protected store tearing Chris Down
  5 siblings, 1 reply; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

This can be set concurrently with reads, which may cause the wrong value
to be propagated.

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c   |  4 ++--
 mm/page_counter.c | 10 ++++++----
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c85a304fa4a1..e0ed790a2a8c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6261,7 +6261,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (!usage)
 		return MEMCG_PROT_NONE;
 
-	emin = memcg->memory.min;
+	emin = READ_ONCE(memcg->memory.min);
 	elow = READ_ONCE(memcg->memory.low);
 
 	parent = parent_mem_cgroup(memcg);
@@ -6277,7 +6277,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (emin && parent_emin) {
 		unsigned long min_usage, siblings_min_usage;
 
-		min_usage = min(usage, memcg->memory.min);
+		min_usage = min(usage, READ_ONCE(memcg->memory.min));
 		siblings_min_usage = atomic_long_read(
 			&parent->memory.children_min_usage);
 
diff --git a/mm/page_counter.c b/mm/page_counter.c
index 18b7f779f2e2..ae471c7d255f 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -17,14 +17,16 @@ static void propagate_protected_usage(struct page_counter *c,
 				      unsigned long usage)
 {
 	unsigned long protected, old_protected;
-	unsigned long low;
+	unsigned long low, min;
 	long delta;
 
 	if (!c->parent)
 		return;
 
-	if (c->min || atomic_long_read(&c->min_usage)) {
-		if (usage <= c->min)
+	min = READ_ONCE(c->min);
+
+	if (min || atomic_long_read(&c->min_usage)) {
+		if (usage <= min)
 			protected = usage;
 		else
 			protected = 0;
@@ -217,7 +219,7 @@ void page_counter_set_min(struct page_counter *counter, unsigned long nr_pages)
 {
 	struct page_counter *c;
 
-	counter->min = nr_pages;
+	WRITE_ONCE(counter->min, nr_pages);
 
 	for (c = counter; c; c = c->parent)
 		propagate_protected_usage(c, atomic_long_read(&c->usage));
-- 
2.25.1


* [PATCH 5/6] mm, memcg: Prevent memory.swap.max load tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
                   ` (3 preceding siblings ...)
  2020-03-12 17:33 ` [PATCH 4/6] mm, memcg: Prevent memory.min " Chris Down
@ 2020-03-12 17:33 ` Chris Down
  2020-03-16 14:58   ` Michal Hocko
  2020-03-12 17:33 ` [PATCH 6/6] mm, memcg: Prevent mem_cgroup_protected store tearing Chris Down
  5 siblings, 1 reply; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

The write side of this is xchg()/smp_mb(), so that's all good. Just a
few sites missing a READ_ONCE.

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e0ed790a2a8c..57048a38c75d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1511,7 +1511,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		pr_info("swap: usage %llukB, limit %llukB, failcnt %lu\n",
 			K((u64)page_counter_read(&memcg->swap)),
-			K((u64)memcg->swap.max), memcg->swap.failcnt);
+			K((u64)READ_ONCE(memcg->swap.max)), memcg->swap.failcnt);
 	else {
 		pr_info("memory+swap: usage %llukB, limit %llukB, failcnt %lu\n",
 			K((u64)page_counter_read(&memcg->memsw)),
@@ -1544,7 +1544,7 @@ unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg)
 		unsigned long swap_max;
 
 		memsw_max = memcg->memsw.max;
-		swap_max = memcg->swap.max;
+		swap_max = READ_ONCE(memcg->swap.max);
 		swap_max = min(swap_max, (unsigned long)total_swap_pages);
 		max = min(max + swap_max, memsw_max);
 	}
@@ -7025,7 +7025,8 @@ bool mem_cgroup_swap_full(struct page *page)
 		return false;
 
 	for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg))
-		if (page_counter_read(&memcg->swap) * 2 >= memcg->swap.max)
+		if (page_counter_read(&memcg->swap) * 2 >=
+		    READ_ONCE(memcg->swap.max))
 			return true;
 
 	return false;
-- 
2.25.1


* [PATCH 6/6] mm, memcg: Prevent mem_cgroup_protected store tearing
  2020-03-12 17:32 [PATCH 0/6] mm, memcg: cgroup v2 tunable load/store tearing fixes Chris Down
                   ` (4 preceding siblings ...)
  2020-03-12 17:33 ` [PATCH 5/6] mm, memcg: Prevent memory.swap.max load tearing Chris Down
@ 2020-03-12 17:33 ` Chris Down
  2020-03-16 14:59   ` Michal Hocko
  5 siblings, 1 reply; 15+ messages in thread
From: Chris Down @ 2020-03-12 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm, cgroups,
	linux-kernel, kernel-team

The read side of this is all protected, but we can still tear if
multiple invocations of mem_cgroup_protected() run at the same time.

There's some intentional racing in mem_cgroup_protected which is ok, but
load/store tearing should be avoided.

Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 57048a38c75d..e9af606238ab 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6301,8 +6301,8 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	}
 
 exit:
-	memcg->memory.emin = emin;
-	memcg->memory.elow = elow;
+	WRITE_ONCE(memcg->memory.emin, emin);
+	WRITE_ONCE(memcg->memory.elow, elow);
 
 	if (usage <= emin)
 		return MEMCG_PROT_MIN;
-- 
2.25.1


* Re: [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing
  2020-03-12 17:32 ` [PATCH 1/6] mm, memcg: Prevent memory.high load/store tearing Chris Down
@ 2020-03-16 14:54   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:54 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:32:51, Chris Down wrote:
> A mem_cgroup's high attribute can be set concurrently while we are
> reading it -- for example, if memory_high_write() runs at the same time
> as high reclaim.

I assume this is a replace-all kind of patch because css_alloc shouldn't
really be subject to races. I am not sure about css_reset but it sounds
safe as well.

That being said I do not object because this cannot be harmful, but it
would be nice to mention that in the changelog just in case somebody
wonders about this in the future.
 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 63bb6a2aab81..d32d3c0a16d4 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2228,7 +2228,7 @@ static void reclaim_high(struct mem_cgroup *memcg,
>  			 gfp_t gfp_mask)
>  {
>  	do {
> -		if (page_counter_read(&memcg->memory) <= memcg->high)
> +		if (page_counter_read(&memcg->memory) <= READ_ONCE(memcg->high))
>  			continue;
>  		memcg_memory_event(memcg, MEMCG_HIGH);
>  		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
> @@ -2545,7 +2545,7 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	 * reclaim, the cost of mismatch is negligible.
>  	 */
>  	do {
> -		if (page_counter_read(&memcg->memory) > memcg->high) {
> +		if (page_counter_read(&memcg->memory) > READ_ONCE(memcg->high)) {
>  			/* Don't bother a random interrupted task */
>  			if (in_interrupt()) {
>  				schedule_work(&memcg->high_work);
> @@ -4257,7 +4257,8 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>  	*pheadroom = PAGE_COUNTER_MAX;
>  
>  	while ((parent = parent_mem_cgroup(memcg))) {
> -		unsigned long ceiling = min(memcg->memory.max, memcg->high);
> +		unsigned long ceiling = min(memcg->memory.max,
> +					    READ_ONCE(memcg->high));
>  		unsigned long used = page_counter_read(&memcg->memory);
>  
>  		*pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
> @@ -4978,7 +4979,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  	if (!memcg)
>  		return ERR_PTR(error);
>  
> -	memcg->high = PAGE_COUNTER_MAX;
> +	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
>  	memcg->soft_limit = PAGE_COUNTER_MAX;
>  	if (parent) {
>  		memcg->swappiness = mem_cgroup_swappiness(parent);
> @@ -5131,7 +5132,7 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
>  	page_counter_set_max(&memcg->tcpmem, PAGE_COUNTER_MAX);
>  	page_counter_set_min(&memcg->memory, 0);
>  	page_counter_set_low(&memcg->memory, 0);
> -	memcg->high = PAGE_COUNTER_MAX;
> +	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
>  	memcg->soft_limit = PAGE_COUNTER_MAX;
>  	memcg_wb_domain_size_changed(memcg);
>  }
> @@ -5947,7 +5948,7 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>  	if (err)
>  		return err;
>  
> -	memcg->high = high;
> +	WRITE_ONCE(memcg->high, high);
>  
>  	for (;;) {
>  		unsigned long nr_pages = page_counter_read(&memcg->memory);
> -- 
> 2.25.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 2/6] mm, memcg: Prevent memory.max load tearing
  2020-03-12 17:32 ` [PATCH 2/6] mm, memcg: Prevent memory.max load tearing Chris Down
@ 2020-03-16 14:56   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:56 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:32:56, Chris Down wrote:
> This one is a bit more nuanced because we have memcg_max_mutex, which is
> mostly just used for enforcing invariants, but we still need to
> READ_ONCE since (despite its name) it doesn't really protect memory.max
> access.
> 
> On write (page_counter_set_max() and memory_max_write()) we use xchg(),
> which uses smp_mb(), so that's already fine.
> 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d32d3c0a16d4..aca2964ea494 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1507,7 +1507,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
>  
>  	pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
>  		K((u64)page_counter_read(&memcg->memory)),
> -		K((u64)memcg->memory.max), memcg->memory.failcnt);
> +		K((u64)READ_ONCE(memcg->memory.max)), memcg->memory.failcnt);
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
>  		pr_info("swap: usage %llukB, limit %llukB, failcnt %lu\n",
>  			K((u64)page_counter_read(&memcg->swap)),
> @@ -1538,7 +1538,7 @@ unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg)
>  {
>  	unsigned long max;
>  
> -	max = memcg->memory.max;
> +	max = READ_ONCE(memcg->memory.max);
>  	if (mem_cgroup_swappiness(memcg)) {
>  		unsigned long memsw_max;
>  		unsigned long swap_max;
> @@ -3006,7 +3006,7 @@ static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
>  		 * Make sure that the new limit (memsw or memory limit) doesn't
>  		 * break our basic invariant rule memory.max <= memsw.max.
>  		 */
> -		limits_invariant = memsw ? max >= memcg->memory.max :
> +		limits_invariant = memsw ? max >= READ_ONCE(memcg->memory.max) :
>  					   max <= memcg->memsw.max;
>  		if (!limits_invariant) {
>  			mutex_unlock(&memcg_max_mutex);
> @@ -3753,8 +3753,8 @@ static int memcg_stat_show(struct seq_file *m, void *v)
>  	/* Hierarchical information */
>  	memory = memsw = PAGE_COUNTER_MAX;
>  	for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) {
> -		memory = min(memory, mi->memory.max);
> -		memsw = min(memsw, mi->memsw.max);
> +		memory = min(memory, READ_ONCE(mi->memory.max));
> +		memsw = min(memsw, READ_ONCE(mi->memsw.max));
>  	}
>  	seq_printf(m, "hierarchical_memory_limit %llu\n",
>  		   (u64)memory * PAGE_SIZE);
> @@ -4257,7 +4257,7 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>  	*pheadroom = PAGE_COUNTER_MAX;
>  
>  	while ((parent = parent_mem_cgroup(memcg))) {
> -		unsigned long ceiling = min(memcg->memory.max,
> +		unsigned long ceiling = min(READ_ONCE(memcg->memory.max),
>  					    READ_ONCE(memcg->high));
>  		unsigned long used = page_counter_read(&memcg->memory);
>  
> -- 
> 2.25.1

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing
  2020-03-12 17:33 ` [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing Chris Down
@ 2020-03-16 14:57   ` Michal Hocko
  2020-06-12 17:03   ` Michal Koutný
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:57 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:33:01, Chris Down wrote:
> This can be set concurrently with reads, which may cause the wrong value
> to be propagated.
> 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c   | 4 ++--
>  mm/page_counter.c | 9 ++++++---
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index aca2964ea494..c85a304fa4a1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6262,7 +6262,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  		return MEMCG_PROT_NONE;
>  
>  	emin = memcg->memory.min;
> -	elow = memcg->memory.low;
> +	elow = READ_ONCE(memcg->memory.low);
>  
>  	parent = parent_mem_cgroup(memcg);
>  	/* No parent means a non-hierarchical mode on v1 memcg */
> @@ -6291,7 +6291,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  	if (elow && parent_elow) {
>  		unsigned long low_usage, siblings_low_usage;
>  
> -		low_usage = min(usage, memcg->memory.low);
> +		low_usage = min(usage, READ_ONCE(memcg->memory.low));
>  		siblings_low_usage = atomic_long_read(
>  			&parent->memory.children_low_usage);
>  
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index 50184929b61f..18b7f779f2e2 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -17,6 +17,7 @@ static void propagate_protected_usage(struct page_counter *c,
>  				      unsigned long usage)
>  {
>  	unsigned long protected, old_protected;
> +	unsigned long low;
>  	long delta;
>  
>  	if (!c->parent)
> @@ -34,8 +35,10 @@ static void propagate_protected_usage(struct page_counter *c,
>  			atomic_long_add(delta, &c->parent->children_min_usage);
>  	}
>  
> -	if (c->low || atomic_long_read(&c->low_usage)) {
> -		if (usage <= c->low)
> +	low = READ_ONCE(c->low);
> +
> +	if (low || atomic_long_read(&c->low_usage)) {
> +		if (usage <= low)
>  			protected = usage;
>  		else
>  			protected = 0;
> @@ -231,7 +234,7 @@ void page_counter_set_low(struct page_counter *counter, unsigned long nr_pages)
>  {
>  	struct page_counter *c;
>  
> -	counter->low = nr_pages;
> +	WRITE_ONCE(counter->low, nr_pages);
>  
>  	for (c = counter; c; c = c->parent)
>  		propagate_protected_usage(c, atomic_long_read(&c->usage));
> -- 
> 2.25.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 4/6] mm, memcg: Prevent memory.min load/store tearing
  2020-03-12 17:33 ` [PATCH 4/6] mm, memcg: Prevent memory.min " Chris Down
@ 2020-03-16 14:58   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:58 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:33:07, Chris Down wrote:
> This can be set concurrently with reads, which may cause the wrong value
> to be propagated.
> 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c   |  4 ++--
>  mm/page_counter.c | 10 ++++++----
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c85a304fa4a1..e0ed790a2a8c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6261,7 +6261,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  	if (!usage)
>  		return MEMCG_PROT_NONE;
>  
> -	emin = memcg->memory.min;
> +	emin = READ_ONCE(memcg->memory.min);
>  	elow = READ_ONCE(memcg->memory.low);
>  
>  	parent = parent_mem_cgroup(memcg);
> @@ -6277,7 +6277,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  	if (emin && parent_emin) {
>  		unsigned long min_usage, siblings_min_usage;
>  
> -		min_usage = min(usage, memcg->memory.min);
> +		min_usage = min(usage, READ_ONCE(memcg->memory.min));
>  		siblings_min_usage = atomic_long_read(
>  			&parent->memory.children_min_usage);
>  
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index 18b7f779f2e2..ae471c7d255f 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -17,14 +17,16 @@ static void propagate_protected_usage(struct page_counter *c,
>  				      unsigned long usage)
>  {
>  	unsigned long protected, old_protected;
> -	unsigned long low;
> +	unsigned long low, min;
>  	long delta;
>  
>  	if (!c->parent)
>  		return;
>  
> -	if (c->min || atomic_long_read(&c->min_usage)) {
> -		if (usage <= c->min)
> +	min = READ_ONCE(c->min);
> +
> +	if (min || atomic_long_read(&c->min_usage)) {
> +		if (usage <= min)
>  			protected = usage;
>  		else
>  			protected = 0;
> @@ -217,7 +219,7 @@ void page_counter_set_min(struct page_counter *counter, unsigned long nr_pages)
>  {
>  	struct page_counter *c;
>  
> -	counter->min = nr_pages;
> +	WRITE_ONCE(counter->min, nr_pages);
>  
>  	for (c = counter; c; c = c->parent)
>  		propagate_protected_usage(c, atomic_long_read(&c->usage));
> -- 
> 2.25.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 5/6] mm, memcg: Prevent memory.swap.max load tearing
  2020-03-12 17:33 ` [PATCH 5/6] mm, memcg: Prevent memory.swap.max load tearing Chris Down
@ 2020-03-16 14:58   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:58 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:33:11, Chris Down wrote:
> The write side of this is xchg()/smp_mb(), so that's all good. Just a
> few sites missing a READ_ONCE.
> 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e0ed790a2a8c..57048a38c75d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1511,7 +1511,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
>  		pr_info("swap: usage %llukB, limit %llukB, failcnt %lu\n",
>  			K((u64)page_counter_read(&memcg->swap)),
> -			K((u64)memcg->swap.max), memcg->swap.failcnt);
> +			K((u64)READ_ONCE(memcg->swap.max)), memcg->swap.failcnt);
>  	else {
>  		pr_info("memory+swap: usage %llukB, limit %llukB, failcnt %lu\n",
>  			K((u64)page_counter_read(&memcg->memsw)),
> @@ -1544,7 +1544,7 @@ unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg)
>  		unsigned long swap_max;
>  
>  		memsw_max = memcg->memsw.max;
> -		swap_max = memcg->swap.max;
> +		swap_max = READ_ONCE(memcg->swap.max);
>  		swap_max = min(swap_max, (unsigned long)total_swap_pages);
>  		max = min(max + swap_max, memsw_max);
>  	}
> @@ -7025,7 +7025,8 @@ bool mem_cgroup_swap_full(struct page *page)
>  		return false;
>  
>  	for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg))
> -		if (page_counter_read(&memcg->swap) * 2 >= memcg->swap.max)
> +		if (page_counter_read(&memcg->swap) * 2 >=
> +		    READ_ONCE(memcg->swap.max))
>  			return true;
>  
>  	return false;
> -- 
> 2.25.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 6/6] mm, memcg: Prevent mem_cgroup_protected store tearing
  2020-03-12 17:33 ` [PATCH 6/6] mm, memcg: Prevent mem_cgroup_protected store tearing Chris Down
@ 2020-03-16 14:59   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2020-03-16 14:59 UTC (permalink / raw)
  To: Chris Down
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

On Thu 12-03-20 17:33:16, Chris Down wrote:
> The read side of this is all protected, but we can still tear if
> multiple invocations of mem_cgroup_protected() run at the same time.
> 
> There's some intentional racing in mem_cgroup_protected which is ok, but
> load/store tearing should be avoided.
> 
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@fb.com

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 57048a38c75d..e9af606238ab 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6301,8 +6301,8 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  	}
>  
>  exit:
> -	memcg->memory.emin = emin;
> -	memcg->memory.elow = elow;
> +	WRITE_ONCE(memcg->memory.emin, emin);
> +	WRITE_ONCE(memcg->memory.elow, elow);
>  
>  	if (usage <= emin)
>  		return MEMCG_PROT_MIN;
> -- 
> 2.25.1

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing
  2020-03-12 17:33 ` [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing Chris Down
  2020-03-16 14:57   ` Michal Hocko
@ 2020-06-12 17:03   ` Michal Koutný
  2020-06-12 17:13     ` Chris Down
  1 sibling, 1 reply; 15+ messages in thread
From: Michal Koutný @ 2020-06-12 17:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chris Down, Johannes Weiner, Tejun Heo, Roman Gushchin, linux-mm,
	cgroups, linux-kernel, kernel-team

Hello.

I see a suspicious asymmetry in the current mainline:
>	WRITE_ONCE(memcg->memory.emin, effective_protection(usage, parent_usage,
>			READ_ONCE(memcg->memory.min),
>			READ_ONCE(parent->memory.emin),
>			atomic_long_read(&parent->memory.children_min_usage)));
>
>	WRITE_ONCE(memcg->memory.elow, effective_protection(usage, parent_usage,
>			memcg->memory.low, READ_ONCE(parent->memory.elow),
>			atomic_long_read(&parent->memory.children_low_usage)));

On Thu, Mar 12, 2020 at 05:33:01PM +0000, Chris Down <chris@chrisdown.name> wrote:
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index aca2964ea494..c85a304fa4a1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6262,7 +6262,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  		return MEMCG_PROT_NONE;
>  
>  	emin = memcg->memory.min;
> -	elow = memcg->memory.low;
> +	elow = READ_ONCE(memcg->memory.low);
>  
>  	parent = parent_mem_cgroup(memcg);
>  	/* No parent means a non-hierarchical mode on v1 memcg */
> @@ -6291,7 +6291,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
>  	if (elow && parent_elow) {
>  		unsigned long low_usage, siblings_low_usage;
>  
> -		low_usage = min(usage, memcg->memory.low);
> +		low_usage = min(usage, READ_ONCE(memcg->memory.low));
>  		siblings_low_usage = atomic_long_read(
>  			&parent->memory.children_low_usage);
Is it possible that these hunks were lost during rebase/merge?

IMHO it should apply as:

-- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6428,7 +6428,8 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
                        atomic_long_read(&parent->memory.children_min_usage)));

        WRITE_ONCE(memcg->memory.elow, effective_protection(usage, parent_usage,
-                       memcg->memory.low, READ_ONCE(parent->memory.elow),
+                       READ_ONCE(memcg->memory.low),
+                       READ_ONCE(parent->memory.elow),
                        atomic_long_read(&parent->memory.children_low_usage)));

 out:


Michal

* Re: [PATCH 3/6] mm, memcg: Prevent memory.low load/store tearing
  2020-06-12 17:03   ` Michal Koutný
@ 2020-06-12 17:13     ` Chris Down
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Down @ 2020-06-12 17:13 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin,
	linux-mm, cgroups, linux-kernel, kernel-team

Hi Michal,

Good catch! Andrew and I must have missed these when massaging the commits with
other stuff in -mm, which is totally understandable considering the number of
places being touched by this and other patch series at the same time. It just
goes to show how complex this can get, since I even double-checked these and
still didn't see that missed hunk :-)

The good news is that these are belt-and-suspenders: this avoids a rare 
theoretical case, not something likely to be practically dangerous. But yes, we 
should fix it up. Since the commit is already out, I'll just submit a new one.

Thanks for the report!

Chris
