From: Feng Tang <feng.tang@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>
Cc: kernel test robot <oliver.sang@intel.com>,
	Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeelb@google.com>,
	Michal Koutný <mkoutny@suse.com>,
	Balbir Singh <bsingharora@gmail.com>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, kernel test robot <lkp@intel.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
Date: Mon, 16 Aug 2021 11:28:55 +0800	[thread overview]
Message-ID: <20210816032855.GB72770@shbuild999.sh.intel.com> (raw)
In-Reply-To: <20210812031910.GA63920@shbuild999.sh.intel.com>

On Thu, Aug 12, 2021 at 11:19:10AM +0800, Feng Tang wrote:
> On Tue, Aug 10, 2021 at 07:59:53PM -1000, Linus Torvalds wrote:
[SNIP]

> And it seems there is some cache false sharing when accessing the
> mem_cgroup member 'struct cgroup_subsys_state'. Judging from the offsets
> (0x0 and 0x10 here) and the calling sites, the false sharing could
> happen between:
> 
>     cgroup_rstat_updated (read memcg->css.cgroup, offset 0x0)
> and 
>     get_mem_cgroup_from_mm
> 	css_tryget(&memcg->css) (read/write memcg->css.refcnt, offset 0x10)
> 
> (This could be wrong as many of the functions are inlined, and the
> exact calling site isn't shown)
> 
> And to verify this, we did a test by adding padding between
> memcg->css.cgroup and memcg->css.refcnt to push them into 2
> different cache lines, and the performance is partly restored:
> 
> dc26532aed0ab25c 2d146aa3aa842d7f5065802556b 73371bf27a8a8ea68df2fbf456b 
> ---------------- --------------------------- --------------------------- 
>   65523232 ±  4%     -40.8%   38817332 ±  5%     -19.6%   52701654 ±  3%  vm-scalability.throughput
>
> We are still checking more, and will update if there is new data. 
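
For reference, the layout described above can be sketched in user space
with a simplified stand-in struct (the field types are illustrative only,
and the 64-byte line size is an assumed value for the test machine):

/* Minimal user-space sketch, not kernel code: a stand-in for the start
 * of struct cgroup_subsys_state, with the 'cgroup' pointer at offset
 * 0x0 and the refcount at offset 0x10, as reported above. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct fake_css {
	void *cgroup;			/* 0x0, read by the rstat update path */
	void *ss;			/* 0x8 */
	struct { long count; } refcnt;	/* 0x10, written by css_tryget()/css_put() */
};

int main(void)
{
	size_t line = 64;	/* assumed L1 cache line size */

	printf("cgroup at 0x%zx, refcnt at 0x%zx\n",
	       offsetof(struct fake_css, cgroup),
	       offsetof(struct fake_css, refcnt));

	/* Both offsets fall into the same 64-byte line, so a read-mostly
	 * field and a heavily written field share one cache line. */
	assert(offsetof(struct fake_css, cgroup) / line ==
	       offsetof(struct fake_css, refcnt) / line);
	return 0;
}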

This seems to be the second case where we hit 'adjacent cacheline
prefetch'; the first time we saw it was also related to mem_cgroup:
https://lore.kernel.org/lkml/20201125062445.GA51005@shbuild999.sh.intel.com/

In the previous debug patch, 'css.cgroup' and 'css.refcnt' were
separated into 2 cache lines, but those are still adjacent (2N and 2N+1)
cachelines. With more padding (128 bytes added in between), the
performance is restored and even improved (test run 3 times):

dc26532aed0ab25c 2d146aa3aa842d7f5065802556b 2e34d6daf5fbab0fb286dcdb3bc 
---------------- --------------------------- --------------------------- 
  65523232 ±  4%     -40.8%   38817332 ±  5%     +23.4%   80862243 ±  3%  vm-scalability.throughput
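
To illustrate why 64-byte separation alone is not enough on such machines,
here is a toy user-space experiment (a sketch only, not the vm-scalability
test; thread pinning is left out, and the effect depends on the
adjacent-line prefetcher being enabled):

/* Two threads hammer two fields.  With the default PAD the fields sit on
 * two adjacent 64-byte lines (a 2N/2N+1 pair); building with -DPAD=184
 * moves them into different 128-byte pairs, mimicking the extra padding
 * tried above. */
#include <pthread.h>
#include <stdio.h>

#ifndef PAD
#define PAD 56			/* rebuild with -DPAD=184 to compare */
#endif

struct shared {
	volatile long a;	/* read-mostly field, like css.cgroup */
	char pad[PAD];
	volatile long b;	/* write-heavy field, like css.refcnt */
} __attribute__((aligned(128)));

static struct shared s;
static const long iters = 200 * 1000 * 1000L;

static void *reader(void *arg)
{
	long sum = 0;

	(void)arg;
	for (long i = 0; i < iters; i++)
		sum += s.a;
	return (void *)sum;
}

static void *writer(void *arg)
{
	(void)arg;
	for (long i = 0; i < iters; i++)
		s.b++;
	return NULL;
}

int main(void)
{
	pthread_t r, w;

	pthread_create(&r, NULL, reader, NULL);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(r, NULL);
	pthread_join(w, NULL);
	printf("PAD=%d done, b=%ld\n", PAD, s.b);
	return 0;
}

Timing the two builds (e.g. with 'time', ideally with each thread pinned
to its own core) is expected to show the PAD=184 layout running faster on
CPUs where the prefetcher pulls in both lines of a 2N/2N+1 pair.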

The debug patch is:
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -142,6 +142,8 @@ struct cgroup_subsys_state {
 	/* PI: the cgroup subsystem that this css is attached to */
 	struct cgroup_subsys *ss;
 
+	unsigned long pad[16];
+
 	/* reference count - access via css_[try]get() and css_put() */
 	struct percpu_ref refcnt;
 
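(For context, pad[16] above is 16 * sizeof(unsigned long) = 128 bytes on
a 64-bit build, i.e. one full 2N/2N+1 cacheline pair, which moves refcnt
from offset 0x10 to 0x90 and out of the pair holding 'cgroup'.  The
resulting layout can be double-checked with something like
"pahole -C cgroup_subsys_state vmlinux" when debug info is available.)
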
Thanks,
Feng

> Btw, the test platform is a 2-socket, 4-node, 96C/192T Cascade Lake AP,
> and if we run the same case on a 2-socket/2-node/48C/96T Cascade Lake SP
> box, the regression is about -22.3%.
> 
> Thanks,
> Feng
>       
> > Anybody?
> > 
> >               Linus

Thread overview: 20+ messages
2021-08-11  3:17 [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression kernel test robot
2021-08-11  5:59 ` Linus Torvalds
2021-08-11 20:12   ` Johannes Weiner
2021-08-12  3:19   ` Feng Tang
2021-08-16  3:28     ` Feng Tang [this message]
2021-08-16 21:41       ` Johannes Weiner
2021-08-17  2:45         ` Feng Tang
2021-08-17 16:47           ` Michal Koutný
2021-08-17 17:10             ` Shakeel Butt
2021-08-18  2:30             ` Feng Tang
2021-08-30 14:51               ` Michal Koutný
2021-08-31  6:30                 ` Feng Tang
2021-08-31  9:23                   ` Michal Koutný
2021-09-01  4:50                     ` Feng Tang
2021-09-01 15:12                       ` Andi Kleen
2021-09-02  1:35                         ` Feng Tang
2021-09-02  2:23                           ` Andi Kleen
2021-09-02  3:46                             ` Feng Tang
2021-09-02 10:53                               ` Michal Koutný
2021-09-02 13:39                                 ` Feng Tang
