* [PATCH 0/3] memcg: optimize charge codepath
From: Shakeel Butt @ 2022-08-22  0:17 UTC
  To: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song
  Cc: Michal Koutný,
	Eric Dumazet, Soheil Hassas Yeganeh, Feng Tang, Oliver Sang,
	Andrew Morton, lkp, cgroups, linux-mm, netdev, linux-kernel,
	Shakeel Butt

Recently the Linux networking stack moved from a very old per-socket
pre-charge caching scheme to per-cpu caching to avoid pre-charge
fragmentation and unwarranted OOMs. One impact of this change is that,
for network traffic workloads, the memcg charging codepath can become a
bottleneck. The kernel test robot has also reported this regression.
This patch series tries to improve memcg charging for such workloads.

This patch series implements three optimizations:
(A) Reduce atomic ops in page counter update path.
(B) Change layout of struct page_counter to eliminate false sharing
    between usage and high.
(C) Increase the memcg charge batch to 64.
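
As a rough userspace illustration of (B) and (C), and not the kernel
code itself, the sketch below keeps a frequently written field and a
mostly read field on separate cache lines, and serves charges from a
local pre-charged stock so that the shared atomic is only touched once
per batch. The names demo_counter, demo_stock and demo_charge, the
64-byte cache line size, and the omission of all limit checks are
assumptions made for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHELINE	64	/* assumed cache line size */
#define CHARGE_BATCH	64	/* batch size from item (C) */

/* Hypothetical counter; NOT the kernel's struct page_counter. */
struct demo_counter {
	/* Written whenever a charge reaches the shared counter. */
	alignas(CACHELINE) atomic_long usage;
	/* Mostly read; placed on its own cache line, away from usage. */
	alignas(CACHELINE) long high;
	long max;
};

/* Idea (B): usage and high must not land on the same cache line. */
static_assert(offsetof(struct demo_counter, usage) / CACHELINE !=
	      offsetof(struct demo_counter, high) / CACHELINE,
	      "usage and high share a cache line");

/* Hypothetical local cache of pages already charged to the counter. */
struct demo_stock {
	long cached;
};

/*
 * Idea (C): charge nr pages, preferring the local stock.
 * Returns 1 if the shared atomic counter had to be updated, 0 otherwise.
 * Limit enforcement and hierarchy walking are deliberately left out.
 */
static inline int demo_charge(struct demo_counter *c, struct demo_stock *s,
			      long nr)
{
	long batch;

	if (s->cached >= nr) {
		s->cached -= nr;	/* fast path: no shared write */
		return 0;
	}

	/* Slow path: charge a whole batch, keep the surplus locally. */
	batch = nr > CHARGE_BATCH ? nr : CHARGE_BATCH;
	atomic_fetch_add(&c->usage, batch);
	s->cached += batch - nr;
	return 1;
}
```

With these assumptions, charging 128 single pages from an empty stock
updates the shared counter only twice; a smaller batch would update it
proportionally more often.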

To evaluate the impact of these optimizations, on a 72-CPU machine, we
ran the following workload in the root memcg and compared it with a
scenario where the workload runs in a three-level cgroup hierarchy with
min and low set up appropriately on the top level.

 $ netserver -6
 # 36 instances of netperf with the following parameters
 $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K

Results (average throughput of netperf):
1. root memcg		21694.8
2. 6.0-rc1		10482.7 (-51.6%)
3. 6.0-rc1 + (A)	14542.5 (-32.9%)
4. 6.0-rc1 + (B)	12413.7 (-42.7%)
5. 6.0-rc1 + (C)	17063.7 (-21.3%)
6. 6.0-rc1 + (A+B+C)	20120.3 (-7.2%)

With all three optimizations, the memcg overhead of this workload has
been reduced from 51.6% to just 7.2%.
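
For reference, the percentages in the table are the relative change in
throughput against the root memcg run. The helper below is a minimal
sketch of that arithmetic; the name overhead_pct is made up for this
example. Recomputing from the table suggests the reported figures are
truncated to one decimal rather than rounded (for instance, row 2 works
out to roughly -51.68).

```c
#include <assert.h>

/*
 * Relative throughput change versus a baseline, in percent.
 * A negative result means the measured run is slower than the baseline.
 */
static inline double overhead_pct(double baseline, double measured)
{
	return (measured - baseline) / baseline * 100.0;
}
```

Calling overhead_pct(21694.8, 20120.3) gives a value between -7.3 and
-7.2, consistent with row 6.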

Shakeel Butt (3):
  mm: page_counter: remove unneeded atomic ops for low/min
  mm: page_counter: rearrange struct page_counter fields
  memcg: increase MEMCG_CHARGE_BATCH to 64

 include/linux/memcontrol.h   |  7 ++++---
 include/linux/page_counter.h | 34 +++++++++++++++++++++++-----------
 mm/page_counter.c            | 13 ++++++-------
 3 files changed, 33 insertions(+), 21 deletions(-)

-- 
2.37.1.595.g718a3a8f04-goog


Thread overview:
2022-08-22  0:17 [PATCH 0/3] memcg: optimize charge codepath Shakeel Butt
2022-08-22  0:17 ` [PATCH 1/3] mm: page_counter: remove unneeded atomic ops for low/min Shakeel Butt
2022-08-22  0:20   ` Soheil Hassas Yeganeh
2022-08-22  2:39   ` Feng Tang
2022-08-22  9:55   ` Michal Hocko
2022-08-22 10:18     ` Michal Hocko
2022-08-22 14:55       ` Shakeel Butt
2022-08-22 15:20         ` Michal Hocko
2022-08-22 16:06           ` Shakeel Butt
2022-08-23  9:42           ` Michal Hocko
2022-08-22 18:23   ` Roman Gushchin
2022-08-22  0:17 ` [PATCH 2/3] mm: page_counter: rearrange struct page_counter fields Shakeel Butt
2022-08-22  0:24   ` Soheil Hassas Yeganeh
2022-08-22  4:55     ` Shakeel Butt
2022-08-22 13:06       ` Soheil Hassas Yeganeh
2022-08-22  2:10   ` Feng Tang
2022-08-22  4:59     ` Shakeel Butt
2022-08-22 10:23   ` Michal Hocko
2022-08-22 15:06     ` Shakeel Butt
2022-08-22 15:15       ` Michal Hocko
2022-08-22 16:04         ` Shakeel Butt
2022-08-22 18:27           ` Roman Gushchin
2022-08-22  0:17 ` [PATCH 3/3] memcg: increase MEMCG_CHARGE_BATCH to 64 Shakeel Butt
2022-08-22  0:24   ` Soheil Hassas Yeganeh
2022-08-22  2:30   ` Feng Tang
2022-08-22 10:47   ` Michal Hocko
2022-08-22 15:09     ` Shakeel Butt
2022-08-22 15:22       ` Michal Hocko
2022-08-22 16:07         ` Shakeel Butt
2022-08-22 18:37   ` Roman Gushchin
2022-08-22 19:34     ` Michal Hocko
2022-08-23  2:22       ` Roman Gushchin
2022-08-23  4:49         ` Michal Hocko
