linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] mm: page_counter: relayout structure to reduce false sharing
@ 2020-12-29 14:35 Feng Tang
  2020-12-29 14:35 ` [PATCH 2/2] mm: memcg: add a new MEMCG_UPDATE_BATCH Feng Tang
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Feng Tang @ 2020-12-29 14:35 UTC (permalink / raw)
  To: Andrew Morton, Michal Hocko, Johannes Weiner, Vladimir Davydov, linux-mm
  Cc: linux-kernel, andi.kleen, tim.c.chen, dave.hansen, ying.huang,
	Feng Tang, Roman Gushchin

While investigating a memory-cgroup-related performance regression [1],
the perf c2c profiling data showed heavy false sharing between accesses
to 'usage' and 'parent'.

On 64-bit systems, 'usage' and 'parent' sit close together in the
structure and easily end up in the same cache line (for cache line
sizes of 64 bytes or more). 'usage' is mostly written, while 'parent'
is mostly read, due to the cgroup's hierarchical counting nature.

So move 'parent' to the end of the structure to make sure the two
fields land in different cache lines.

Following are some performance data with the patch, against
v5.11-rc1, on several generations of Xeon platforms. Most of the
results are improvements; only one malloc case on one platform
shows a -4.0% regression. Each category below has several subcases
run on different platforms, and only the worst and best scores are
listed:

fio:				 +1.8% ~  +8.3%
will-it-scale/malloc1:		 -4.0% ~  +8.9%
will-it-scale/page_fault1:	 no change
will-it-scale/page_fault2:	 +2.4% ~  +20.2%

[1] https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/page_counter.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
index 85bd413..6795913 100644
--- a/include/linux/page_counter.h
+++ b/include/linux/page_counter.h
@@ -12,7 +12,6 @@ struct page_counter {
 	unsigned long low;
 	unsigned long high;
 	unsigned long max;
-	struct page_counter *parent;
 
 	/* effective memory.min and memory.min usage tracking */
 	unsigned long emin;
@@ -27,6 +26,14 @@ struct page_counter {
 	/* legacy */
 	unsigned long watermark;
 	unsigned long failcnt;
+
+	/*
+	 * 'parent' is placed here to be far from 'usage' to reduce
+	 * cache false sharing: 'usage' is mostly written, while
+	 * 'parent' is frequently read due to the cgroup's
+	 * hierarchical counting nature.
+	 */
+	struct page_counter *parent;
 };
 
 #if BITS_PER_LONG == 32
-- 
2.7.4



end of thread, other threads:[~2021-01-06  4:46 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-29 14:35 [PATCH 1/2] mm: page_counter: relayout structure to reduce false sharing Feng Tang
2020-12-29 14:35 ` [PATCH 2/2] mm: memcg: add a new MEMCG_UPDATE_BATCH Feng Tang
2020-12-29 17:13   ` Roman Gushchin
2021-01-04  2:53     ` Feng Tang
2021-01-04  7:46   ` [mm] 4d8191276e: vm-scalability.throughput 43.4% improvement kernel test robot
2021-01-04 13:15   ` [PATCH 2/2] mm: memcg: add a new MEMCG_UPDATE_BATCH Michal Hocko
2021-01-05  1:57     ` Feng Tang
2021-01-06  0:47   ` Shakeel Butt
2021-01-06  2:12     ` Feng Tang
2021-01-06  3:43       ` Chris Down
2021-01-06  3:45         ` Chris Down
2021-01-06  4:45         ` Feng Tang
2020-12-29 16:56 ` [PATCH 1/2] mm: page_counter: relayout structure to reduce false sharing Roman Gushchin
2020-12-30 14:19   ` Feng Tang
2021-01-04 13:03 ` Michal Hocko
2021-01-04 13:34   ` Feng Tang
2021-01-04 14:11     ` Michal Hocko
2021-01-04 14:44       ` Feng Tang
2021-01-04 15:34         ` Michal Hocko
