linux-kernel.vger.kernel.org archive mirror
From: Feng Tang <feng.tang@intel.com>
To: Michal Koutný <mkoutny@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	andi.kleen@intel.com, kernel test robot <oliver.sang@intel.com>,
	Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeelb@google.com>,
	Balbir Singh <bsingharora@gmail.com>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, kernel test robot <lkp@intel.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
Date: Wed, 1 Sep 2021 12:50:32 +0800	[thread overview]
Message-ID: <20210901045032.GA21937@shbuild999.sh.intel.com> (raw)
In-Reply-To: <20210831092304.GA17119@blackbody.suse.cz>

On Tue, Aug 31, 2021 at 11:23:04AM +0200, Michal Koutný wrote:
> On Tue, Aug 31, 2021 at 02:30:36PM +0800, Feng Tang <feng.tang@intel.com> wrote:
> > Yes, I tried many re-arrangement of the members of cgroup_subsys_state,
> > and even close members of memcg, but there were no obvious changes.
> What can recover the regression is adding 128 bytes padding in the css,
> > no matter at the start, end or in the middle.
> 
> Do you mean the padding added outside the .cgroup--.refcnt members area
> also restores the benchmark results? (Or you refer to paddings that move
> .cgroup and .refcnt across a cacheline border ?) I'm asking to be sure
> we have correct understanding of what members are contended (what's the
> frequent writer).

Yes. In the tests I did, no matter where the 128B padding is added,
the performance can be restored and even improved.

struct cgroup_subsys_state {
				   <----------------- padding
	struct cgroup *cgroup;
	struct cgroup_subsys *ss;
				   <----------------- padding
	struct percpu_ref refcnt;
	struct list_head sibling;
	struct list_head children;
	struct list_head rstat_css_node;
	int id;
	unsigned int flags;
	u64 serial_nr;
	atomic_t online_cnt;
	struct work_struct destroy_work;
	struct rcu_work destroy_rwork;
	struct cgroup_subsys_state *parent;
				   <----------------- padding
};

Other things I tried were moving the untouched members around to
separate the several hottest members, but with little effect.

From the perf-tool data, 3 members are frequently accessed (read,
actually): 'cgroup', 'refcnt' and 'flags'.
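
For reference, below is a small userspace sketch (stand-in types and
assumed sizes, NOT the real kernel definitions) of how one can check
which 64B cache line each of these members lands on, and how 128 bytes
of leading padding shifts them; pahole on the real vmlinux gives the
authoritative layout:

/*
 * Illustrative only: the stand-in fields roughly follow the layout
 * above; the size of 'refcnt' is an assumption for struct percpu_ref.
 */
#include <stdio.h>
#include <stddef.h>

struct list_head_stub { void *next, *prev; };

struct css_sketch {
#ifdef ADD_PAD
	char dbg_pad[128];		/* the 128B test padding */
#endif
	void *cgroup;			/* hot, read-mostly */
	void *ss;
	char refcnt[16];		/* stand-in for struct percpu_ref */
	struct list_head_stub sibling, children, rstat_css_node;
	int id;
	unsigned int flags;		/* hot, read-mostly */
	/* ... remaining members omitted ... */
};

#define LINE_OF(m)	(offsetof(struct css_sketch, m) / 64)

int main(void)
{
	printf("cgroup on line %zu, refcnt on line %zu, flags on line %zu\n",
	       LINE_OF(cgroup), LINE_OF(refcnt), LINE_OF(flags));
	return 0;
}

Building it with and without -DADD_PAD shows the hot group moving down
by two 64B lines; whether 'refcnt' really shares a line with 'cgroup'
in the kernel depends on sizeof(struct percpu_ref), which is assumed
here.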

I also used the 'perf mem' command to try to catch reads/writes to
the css, and haven't found any _write_ operation; reading the code
doesn't show one either.

That led me to go check the "HW cache prefetcher", as in my last
email. All these test results make me think this is a performance
change caused by how the data access pattern interacts with the HW
prefetcher.
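
To make that suspicion a bit more concrete, here is a toy userspace
sketch (definitely not the vm-scalability workload; the names and
sizes are made up for illustration). It walks an array of objects and
reads one hot field per object, once with 64B objects and once with
128 extra padding bytes, so the two runs present different strides to
the HW prefetchers:

/*
 * Toy stride comparison only -- the real regression comes from a
 * multi-process benchmark, so this is just a rough way to poke at
 * prefetcher sensitivity to object layout.
 */
#define _POSIX_C_SOURCE 199309L		/* for clock_gettime() */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NOBJS	(1 << 20)
#define PASSES	100

struct obj_small  { long hot; char rest[56]; };		/* 64B: one cache line */
struct obj_padded { long hot; char rest[56 + 128]; };	/* 192B: three lines   */

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

#define WALK(arr) do {						\
	long sum = 0;						\
	double t0 = now_sec();					\
	for (int pass = 0; pass < PASSES; pass++)		\
		for (int i = 0; i < NOBJS; i++)			\
			sum += (arr)[i].hot;			\
	printf("%-6s: %.3fs (sum=%ld)\n", #arr, now_sec() - t0, sum); \
} while (0)

int main(void)
{
	struct obj_small  *small  = calloc(NOBJS, sizeof(*small));
	struct obj_padded *padded = calloc(NOBJS, sizeof(*padded));

	if (!small || !padded)
		return 1;
	WALK(small);
	WALK(padded);
	free(small);
	free(padded);
	return 0;
}

Whether this shows anything interesting depends on the machine;
disabling the HW prefetchers and re-running the real benchmark is
still the more conclusive check.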

Thanks,
Feng


> Thanks,
> Michal


Thread overview: 20+ messages
2021-08-11  3:17 [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression kernel test robot
2021-08-11  5:59 ` Linus Torvalds
2021-08-11 20:12   ` Johannes Weiner
2021-08-12  3:19   ` Feng Tang
2021-08-16  3:28     ` Feng Tang
2021-08-16 21:41       ` Johannes Weiner
2021-08-17  2:45         ` Feng Tang
2021-08-17 16:47           ` Michal Koutný
2021-08-17 17:10             ` Shakeel Butt
2021-08-18  2:30             ` Feng Tang
2021-08-30 14:51               ` Michal Koutný
2021-08-31  6:30                 ` Feng Tang
2021-08-31  9:23                   ` Michal Koutný
2021-09-01  4:50                     ` Feng Tang [this message]
2021-09-01 15:12                       ` Andi Kleen
2021-09-02  1:35                         ` Feng Tang
2021-09-02  2:23                           ` Andi Kleen
2021-09-02  3:46                             ` Feng Tang
2021-09-02 10:53                               ` Michal Koutný
2021-09-02 13:39                                 ` Feng Tang
