All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: "Tejun Heo" <tj@kernel.org>, "Josef Bacik" <josef@toxicpanda.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Vasily Averin" <vasily.averin@linux.dev>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH v1 5/9] memcg: replace stats_flush_lock with an atomic
Date: Tue, 28 Mar 2023 11:52:50 -0700	[thread overview]
Message-ID: <CAJD7tkYo=CeXJPUi_KxjzC0QCxC2qd_J2_FQi_aXh7svD8u60A@mail.gmail.com> (raw)
In-Reply-To: <20230328141523.txyhl7wt7wtvssea@google.com>

On Tue, Mar 28, 2023 at 7:15 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Tue, Mar 28, 2023 at 06:16:34AM +0000, Yosry Ahmed wrote:
> [...]
> > @@ -585,8 +585,8 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
> >   */
> >  static void flush_memcg_stats_dwork(struct work_struct *w);
> >  static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork);
> > -static DEFINE_SPINLOCK(stats_flush_lock);
> >  static DEFINE_PER_CPU(unsigned int, stats_updates);
> > +static atomic_t stats_flush_ongoing = ATOMIC_INIT(0);
> >  static atomic_t stats_flush_threshold = ATOMIC_INIT(0);
> >  static u64 flush_next_time;
> >
> > @@ -636,15 +636,18 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
> >
> >  static void __mem_cgroup_flush_stats(void)
> >  {
> > -     unsigned long flag;
> > -
> > -     if (!spin_trylock_irqsave(&stats_flush_lock, flag))
> > +     /*
> > +      * We always flush the entire tree, so concurrent flushers can just
> > +      * skip. This avoids a thundering herd problem on the rstat global lock
> > +      * from memcg flushers (e.g. reclaim, refault, etc).
> > +      */
> > +     if (atomic_xchg(&stats_flush_ongoing, 1))
>
> Have you profiled this? I wonder if we should replace the above with
>
>         if (atomic_read(&stats_flush_ongoing) || atomic_xchg(&stats_flush_ongoing, 1))

I profiled the entire series with perf and I haven't noticed a notable
difference between before and after the patch series -- but maybe some
specific access patterns cause a regression, not sure.

Does an atomic_cmpxchg() satisfy the same purpose? it's easier to read
/ more concise I guess.

Something like

    if (atomic_cmpxchg(&stats_flush_ongoing, 0, 1))

WDYT?




>
> to not always dirty the cacheline. This would not be an issue if there
> is no cacheline sharing but I suspect percpu stats_updates is sharing
> the cacheline with it and may cause false sharing with the parallel stat
> updaters (updaters only need to read the base percpu pointer).
>
> Other than that the patch looks good.

WARNING: multiple messages have this Message-ID (diff)
From: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: "Tejun Heo" <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Josef Bacik" <josef-DigfWCa+lFGyeJad7bwFQA@public.gmane.org>,
	"Jens Axboe" <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
	"Zefan Li" <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	"Johannes Weiner"
	<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	"Michal Hocko" <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Roman Gushchin"
	<roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>,
	"Muchun Song"
	<muchun.song-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>,
	"Andrew Morton"
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	"Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>,
	"Vasily Averin"
	<vasily.averin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	bpf-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v1 5/9] memcg: replace stats_flush_lock with an atomic
Date: Tue, 28 Mar 2023 11:52:50 -0700	[thread overview]
Message-ID: <CAJD7tkYo=CeXJPUi_KxjzC0QCxC2qd_J2_FQi_aXh7svD8u60A@mail.gmail.com> (raw)
In-Reply-To: <20230328141523.txyhl7wt7wtvssea-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On Tue, Mar 28, 2023 at 7:15 AM Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>
> On Tue, Mar 28, 2023 at 06:16:34AM +0000, Yosry Ahmed wrote:
> [...]
> > @@ -585,8 +585,8 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
> >   */
> >  static void flush_memcg_stats_dwork(struct work_struct *w);
> >  static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork);
> > -static DEFINE_SPINLOCK(stats_flush_lock);
> >  static DEFINE_PER_CPU(unsigned int, stats_updates);
> > +static atomic_t stats_flush_ongoing = ATOMIC_INIT(0);
> >  static atomic_t stats_flush_threshold = ATOMIC_INIT(0);
> >  static u64 flush_next_time;
> >
> > @@ -636,15 +636,18 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
> >
> >  static void __mem_cgroup_flush_stats(void)
> >  {
> > -     unsigned long flag;
> > -
> > -     if (!spin_trylock_irqsave(&stats_flush_lock, flag))
> > +     /*
> > +      * We always flush the entire tree, so concurrent flushers can just
> > +      * skip. This avoids a thundering herd problem on the rstat global lock
> > +      * from memcg flushers (e.g. reclaim, refault, etc).
> > +      */
> > +     if (atomic_xchg(&stats_flush_ongoing, 1))
>
> Have you profiled this? I wonder if we should replace the above with
>
>         if (atomic_read(&stats_flush_ongoing) || atomic_xchg(&stats_flush_ongoing, 1))

I profiled the entire series with perf and I haven't noticed a notable
difference between before and after the patch series -- but maybe some
specific access patterns cause a regression, not sure.

Does an atomic_cmpxchg() satisfy the same purpose? it's easier to read
/ more concise I guess.

Something like

    if (atomic_cmpxchg(&stats_flush_ongoing, 0, 1))

WDYT?




>
> to not always dirty the cacheline. This would not be an issue if there
> is no cacheline sharing but I suspect percpu stats_updates is sharing
> the cacheline with it and may cause false sharing with the parallel stat
> updaters (updaters only need to read the base percpu pointer).
>
> Other than that the patch looks good.

  reply	other threads:[~2023-03-28 18:53 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28  6:16 [PATCH v1 0/9] memcg: make rstat flushing irq and sleep friendly Yosry Ahmed
2023-03-28  6:16 ` Yosry Ahmed
2023-03-28  6:16 ` [PATCH v1 1/9] cgroup: rename cgroup_rstat_flush_"irqsafe" to "atomic" Yosry Ahmed
2023-03-28  6:16   ` Yosry Ahmed
2023-03-28 13:24   ` Shakeel Butt
2023-03-28 13:24     ` Shakeel Butt
2023-03-28 17:42   ` Johannes Weiner
2023-03-28 17:42     ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 2/9] memcg: rename mem_cgroup_flush_stats_"delayed" to "ratelimited" Yosry Ahmed
2023-03-28 13:25   ` Shakeel Butt
2023-03-28 13:25     ` Shakeel Butt
2023-03-28 17:42   ` Johannes Weiner
2023-03-28 17:42     ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 3/9] memcg: do not flush stats in irq context Yosry Ahmed
2023-03-28 13:26   ` Shakeel Butt
2023-03-28 13:26     ` Shakeel Butt
2023-03-28 17:43   ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 4/9] cgroup: rstat: add WARN_ON_ONCE() if flushing outside task context Yosry Ahmed
2023-03-28 14:59   ` Shakeel Butt
2023-03-28 14:59     ` Shakeel Butt
2023-03-28 17:49   ` Johannes Weiner
2023-03-28 18:59     ` Yosry Ahmed
2023-03-28 18:59       ` Yosry Ahmed
2023-03-28 22:18       ` Yosry Ahmed
2023-03-28 22:18         ` Yosry Ahmed
2023-03-28  6:16 ` [PATCH v1 5/9] memcg: replace stats_flush_lock with an atomic Yosry Ahmed
2023-03-28 14:15   ` Shakeel Butt
2023-03-28 14:15     ` Shakeel Butt
2023-03-28 18:52     ` Yosry Ahmed [this message]
2023-03-28 18:52       ` Yosry Ahmed
2023-03-28 19:28       ` Shakeel Butt
2023-03-28 19:28         ` Shakeel Butt
2023-03-28 19:34         ` Yosry Ahmed
2023-03-28 19:34           ` Yosry Ahmed
2023-03-28 19:42           ` Yosry Ahmed
2023-03-28 17:53   ` Johannes Weiner
2023-03-28 17:53     ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 6/9] memcg: sleep during flushing stats in safe contexts Yosry Ahmed
2023-03-28 15:09   ` Shakeel Butt
2023-03-28 15:09     ` Shakeel Butt
2023-03-28 18:35   ` Johannes Weiner
2023-03-28 18:45     ` Yosry Ahmed
2023-03-28 18:45       ` Yosry Ahmed
2023-03-28 19:06       ` Johannes Weiner
2023-03-28 19:06         ` Johannes Weiner
2023-03-28 19:26         ` Yosry Ahmed
2023-03-28 19:26           ` Yosry Ahmed
2023-03-28  6:16 ` [PATCH v1 7/9] workingset: memcg: sleep when flushing stats in workingset_refault() Yosry Ahmed
2023-03-28 15:18   ` Shakeel Butt
2023-03-28 15:18     ` Shakeel Butt
2023-03-28 18:47     ` Johannes Weiner
2023-03-28 18:47       ` Johannes Weiner
2023-03-28 19:25     ` Yosry Ahmed
2023-03-28 18:43   ` Johannes Weiner
2023-03-28 18:43     ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 8/9] vmscan: memcg: sleep when flushing stats during reclaim Yosry Ahmed
2023-03-28  6:16   ` Yosry Ahmed
2023-03-28 15:19   ` Shakeel Butt
2023-03-28 15:19     ` Shakeel Butt
2023-03-28 19:01     ` Yosry Ahmed
2023-03-28 19:01       ` Yosry Ahmed
2023-03-28 19:29       ` Shakeel Butt
2023-03-28 19:29         ` Shakeel Butt
2023-03-28 18:49   ` Johannes Weiner
2023-03-28  6:16 ` [PATCH v1 9/9] memcg: do not modify rstat tree for zero updates Yosry Ahmed
2023-03-28 15:20   ` Shakeel Butt
2023-03-28 15:20     ` Shakeel Butt
2023-03-28 18:50   ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJD7tkYo=CeXJPUi_KxjzC0QCxC2qd_J2_FQi_aXh7svD8u60A@mail.gmail.com' \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=bpf@vger.kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan.x@bytedance.com \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    --cc=vasily.averin@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.