From: Feng Tang <feng.tang@intel.com>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	andi.kleen@intel.com, tim.c.chen@intel.com,
	dave.hansen@intel.com, ying.huang@intel.com
Subject: Re: [PATCH 1/2] mm: page_counter: relayout structure to reduce false sharing
Date: Wed, 30 Dec 2020 22:19:23 +0800
Message-ID: <20201230141923.GA43248@shbuild999.sh.intel.com>
In-Reply-To: <20201229165642.GA371241@carbon.dhcp.thefacebook.com>

On Tue, Dec 29, 2020 at 08:56:42AM -0800, Roman Gushchin wrote:
> On Tue, Dec 29, 2020 at 10:35:13PM +0800, Feng Tang wrote:
> > When checking a memory cgroup related performance regression [1],
> > from the perf c2c profiling data, we found high false sharing for
> > accessing 'usage' and 'parent'.
> > 
> > On 64-bit systems, 'usage' and 'parent' are close to each other
> > and can easily end up in the same cacheline (for cacheline sizes
> > of 64 bytes or more). 'usage' is usually written, while 'parent' is
> > usually read because of the cgroup's hierarchical counting.
> > 
> > So move the 'parent' to the end of the structure to make sure they
> > are in different cache lines.
> > 
> > Following are some performance data with the patch, against
> > v5.11-rc1, on several generations of Xeon platforms. Most of the
> > results are improvements, with only one malloc case on one platform
> > showing a -4.0% regression. Each category below has several subcases
> > run on different platforms, and only the worst and best scores are
> > listed:
> > 
> > fio:				 +1.8% ~  +8.3%
> > will-it-scale/malloc1:		 -4.0% ~  +8.9%
> > will-it-scale/page_fault1:	 no change
> > will-it-scale/page_fault2:	 +2.4% ~  +20.2%
> > 
> > [1] https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
> > Signed-off-by: Feng Tang <feng.tang@intel.com>
> > Cc: Roman Gushchin <guro@fb.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  include/linux/page_counter.h | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
> > index 85bd413..6795913 100644
> > --- a/include/linux/page_counter.h
> > +++ b/include/linux/page_counter.h
> > @@ -12,7 +12,6 @@ struct page_counter {
> >  	unsigned long low;
> >  	unsigned long high;
> >  	unsigned long max;
> > -	struct page_counter *parent;
> >  
> >  	/* effective memory.min and memory.min usage tracking */
> >  	unsigned long emin;
> > @@ -27,6 +26,14 @@ struct page_counter {
> >  	/* legacy */
> >  	unsigned long watermark;
> >  	unsigned long failcnt;
> > +
> > +	/*
> > +	 * 'parent' is placed here to be far from 'usage' to reduce
> > +	 * cache false sharing, as 'usage' is written mostly while
> > +	 * parent is frequently read for cgroup's hierarchical
> > +	 * counting nature.
> > +	 */
> > +	struct page_counter *parent;
> >  };
> 
> LGTM!
> 
> Reviewed-by: Roman Gushchin <guro@fb.com>

Thanks for the review!

> I wonder if we have the same problem with min/low/high/max?
> Maybe try to group all mostly-read-only fields (min, low, high,
> max and parent) and separate them with some padding?

Yep, we thought about it too. From the current perf c2c profiling
data, I haven't noticed obvious false-sharing hot spots for
min/low/high/max (which are read mostly).

As for padding, we had a proposal before: the current page_counter
on 64-bit platforms is 112 bytes, so padding it to 2 cachelines
will only cost 16 bytes more. If this is fine, I can send another
patch or fold it into this one.
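
To make the numbers above concrete, here is a minimal userspace sketch,
not the kernel code. It assumes an LP64 system and a 64-byte cacheline,
uses plain long in place of atomic_long_t, and takes the field list from
the v5.11-rc1 header as I understand it. The struct names, line_of() and
check_layouts() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64	/* assumed cacheline size in bytes */

/* Mock of the layout before the patch: 'parent' sits right after 'max'. */
struct pc_before {
	long usage;			/* mostly written */
	unsigned long min;
	unsigned long low;
	unsigned long high;
	unsigned long max;
	struct pc_before *parent;	/* mostly read */
	unsigned long emin;
	long min_usage;
	long children_min_usage;
	unsigned long elow;
	long low_usage;
	long children_low_usage;
	unsigned long watermark;
	unsigned long failcnt;
};

/* Mock of the layout after the patch: 'parent' moved to the end. */
struct pc_after {
	long usage;
	unsigned long min;
	unsigned long low;
	unsigned long high;
	unsigned long max;
	unsigned long emin;
	long min_usage;
	long children_min_usage;
	unsigned long elow;
	long low_usage;
	long children_low_usage;
	unsigned long watermark;
	unsigned long failcnt;
	struct pc_after *parent;
};

/*
 * Hypothetical padded variant: aligning the first member to the
 * cacheline size makes the compiler round sizeof() up to a multiple
 * of CACHELINE.
 */
struct pc_padded {
	_Alignas(CACHELINE) long usage;
	unsigned long min;
	unsigned long low;
	unsigned long high;
	unsigned long max;
	unsigned long emin;
	long min_usage;
	long children_min_usage;
	unsigned long elow;
	long low_usage;
	long children_low_usage;
	unsigned long watermark;
	unsigned long failcnt;
	struct pc_padded *parent;
};

/* Cacheline index of a member offset, counted from the struct start. */
static size_t line_of(size_t offset)
{
	return offset / CACHELINE;
}

/* Check the claims from this thread under the assumptions above. */
static void check_layouts(void)
{
	/* Before: 'usage' and 'parent' share the first 64 bytes. */
	assert(line_of(offsetof(struct pc_before, usage)) ==
	       line_of(offsetof(struct pc_before, parent)));
	/* After: they fall into different 64-byte blocks. */
	assert(line_of(offsetof(struct pc_after, usage)) !=
	       line_of(offsetof(struct pc_after, parent)));
	/* 14 fields of 8 bytes each; padding to 2 cachelines adds 16 bytes. */
	assert(sizeof(struct pc_after) == 112);
	assert(sizeof(struct pc_padded) - sizeof(struct pc_after) == 16);
}
```

Note that line_of() counts from the start of the structure, so it only
matches real cachelines when the object itself starts on a cacheline
boundary. The padded variant gets that from its alignment; the unpadded
ones depend on where the containing object is allocated.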

Thanks,
Feng

> Thank you!
