From: Andrew Morton <akpm@linux-foundation.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Muchun Song" <songmuchun@bytedance.com>,
	" Michal Koutný " <mkoutny@suse.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Soheil Hassas Yeganeh" <soheil@google.com>,
	"Feng Tang" <feng.tang@intel.com>,
	"Oliver Sang" <oliver.sang@intel.com>,
	lkp@lists.01.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/3] mm: page_counter: rearrange struct page_counter fields
Date: Wed, 24 Aug 2022 17:33:30 -0700
Message-ID: <20220824173330.2a15bcda24d2c3c248bc43c7@linux-foundation.org>
In-Reply-To: <20220825000506.239406-3-shakeelb@google.com>

On Thu, 25 Aug 2022 00:05:05 +0000 Shakeel Butt <shakeelb@google.com> wrote:

> With memcg v2 enabled, memcg->memory.usage is a very hot member for
> workloads doing memcg charging on multiple CPUs concurrently,
> particularly network-intensive workloads. In addition, there is false
> cache sharing between memory.usage and memory.high on the charge path.
> This patch moves usage into its own cacheline and moves all the
> read-mostly fields into another separate cacheline.
> 
> To evaluate the impact of this optimization, we ran the following
> workload on a 72-CPU machine, in a three-level cgroup hierarchy.
> 
>  $ netserver -6
>  # 36 instances of netperf with following params
>  $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K
> 
> Results (average throughput of netperf):
> Without (6.0-rc1)	10482.7 Mbps
> With patch		12413.7 Mbps (18.4% improvement)
> 
> With the patch, the throughput improved by 18.4%.
> 
> One side-effect of this patch is an increase in the size of struct
> mem_cgroup. For example, with this patch on a 64-bit build, the size of
> struct mem_cgroup increased from 4032 bytes to 4416 bytes. However, the
> additional size is worth it for the performance improvement. In
> addition, there are opportunities to reduce the size of struct
> mem_cgroup, such as deprecating the kmem and tcpmem page counters and
> packing the fields better.
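
For orientation, the layout change being described could look roughly
like the sketch below. The field subset shown and the use of
____cacheline_internodealigned_in_smp are illustrative assumptions, not
the actual patch:

	struct page_counter {
		/* hot: written on every charge/uncharge */
		atomic_long_t usage ____cacheline_internodealigned_in_smp;

		/* charge-time bookkeeping, kept off the usage line */
		unsigned long watermark;
		unsigned long failcnt;

		/* read-mostly limits, pushed onto their own cacheline */
		unsigned long min ____cacheline_internodealigned_in_smp;
		unsigned long low;
		unsigned long high;
		unsigned long max;
		struct page_counter *parent;
	};

Separating the field every charge writes (usage) from the limits the
charge path only reads (high, max) is what removes the false sharing;
the padding that such alignment inserts is also largely where the
quoted growth of struct mem_cgroup (4032 to 4416 bytes) comes from.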

Did you evaluate the effects of using a per-cpu counter of some form?
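
For context, one form such a per-cpu counter could take is the kernel's
existing percpu_counter; a minimal, purely illustrative sketch of its
use (not memcg code) follows:

	#include <linux/gfp.h>
	#include <linux/percpu_counter.h>

	static struct percpu_counter nr_charged;

	static int nr_charged_setup(void)
	{
		/* allocates the per-CPU slots; 0 is the initial value */
		return percpu_counter_init(&nr_charged, 0, GFP_KERNEL);
	}

	static void account_charge(long nr_pages)
	{
		/*
		 * Updates a per-CPU delta; the shared counter is touched
		 * only when the delta exceeds the internal batch, so
		 * concurrent chargers rarely contend on one cacheline.
		 */
		percpu_counter_add(&nr_charged, nr_pages);
	}

	static s64 nr_charged_fast(void)
	{
		/* cheap but approximate: ignores unflushed per-CPU deltas */
		return percpu_counter_read(&nr_charged);
	}

	static s64 nr_charged_exact(void)
	{
		/* exact, but walks every CPU's delta */
		return percpu_counter_sum(&nr_charged);
	}

The trade-off is that the aggregate becomes approximate unless the more
expensive sum is taken, which matters when every charge has to be
compared against a limit such as memory.high or memory.max.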


Thread overview: 34+ messages
2022-08-25  0:05 [PATCH v2 0/3] memcg: optimize charge codepath Shakeel Butt
2022-08-25  0:05 ` [PATCH v2 1/3] mm: page_counter: remove unneeded atomic ops for low/min Shakeel Butt
2022-08-25  6:43   ` Michal Hocko
2022-08-25  0:05 ` [PATCH v2 2/3] mm: page_counter: rearrange struct page_counter fields Shakeel Butt
2022-08-25  0:33   ` Andrew Morton [this message]
2022-08-25  4:41     ` Shakeel Butt
2022-08-25  5:21       ` Andrew Morton
2022-08-25 15:24         ` Shakeel Butt
2022-08-25  6:47   ` Michal Hocko
2022-08-25 15:25     ` Shakeel Butt
2022-08-25  0:05 ` [PATCH v2 3/3] memcg: increase MEMCG_CHARGE_BATCH to 64 Shakeel Butt
2022-08-25  6:49   ` Michal Hocko
2022-08-25  8:30   ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the raw message as an mbox file, import it into your mail client,
  and reply-to-all from there.

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220824173330.2a15bcda24d2c3c248bc43c7@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=edumazet@google.com \
    --cc=feng.tang@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@lists.01.org \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=oliver.sang@intel.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=soheil@google.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line
before the message body.