Date: Mon, 22 Aug 2022 11:23:56 -0700
From: Roman Gushchin
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Muchun Song, Michal Koutný,
	Eric Dumazet, Soheil Hassas Yeganeh, Feng Tang, Oliver Sang,
	Andrew Morton, lkp@lists.01.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] mm: page_counter: remove unneeded atomic ops for low/min
In-Reply-To: <20220822001737.4120417-2-shakeelb@google.com>
References: <20220822001737.4120417-1-shakeelb@google.com>
	<20220822001737.4120417-2-shakeelb@google.com>

On Mon, Aug 22, 2022 at 12:17:35AM +0000, Shakeel Butt wrote:
> For cgroups using low or min protections, the function
> propagate_protected_usage() was doing an atomic xchg() operation
> unconditionally. It only needs to do that operation if the new value
> of protection is different from the old one. This patch does that.
>
> To evaluate the impact of this optimization, on a 72 CPU machine, we
> ran the following workload in a three-level cgroup hierarchy, with the
> top level having min and low set up appropriately: memory.min equal to
> the size of the netperf binary and memory.low double that.
>
> $ netserver -6
> # 36 instances of netperf with the following params
> $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K
>
> Results (average throughput of netperf):
> Without (6.0-rc1)  10482.7 Mbps
> With patch         14542.5 Mbps (38.7% improvement)
>
> With the patch, the throughput improved by 38.7%.

Nice savings!
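
As a side note for anyone reproducing the setup: here is a minimal
sketch of the protection configuration described above, written as a
small C program. The cgroup v2 mount point, the cgroup name "top", and
the netperf install path are assumptions for illustration, not taken
from the patch.

/* Hypothetical setup helper: set memory.min to the size of the
 * netperf binary and memory.low to twice that on the top-level
 * cgroup. All paths below are assumptions, not part of the patch. */
#include <stdio.h>
#include <sys/stat.h>

static int write_val(const char *path, long val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%ld\n", val);
	return fclose(f);
}

int main(void)
{
	struct stat st;

	if (stat("/usr/bin/netperf", &st))	/* assumed binary path */
		return 1;
	if (write_val("/sys/fs/cgroup/top/memory.min", (long)st.st_size))
		return 1;
	return write_val("/sys/fs/cgroup/top/memory.low",
			 (long)st.st_size * 2) ? 1 : 0;
}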
>
> Signed-off-by: Shakeel Butt
> Reported-by: kernel test robot
> ---
>  mm/page_counter.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index eb156ff5d603..47711aa28161 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -17,24 +17,23 @@ static void propagate_protected_usage(struct page_counter *c,
>  				      unsigned long usage)
>  {
>  	unsigned long protected, old_protected;
> -	unsigned long low, min;
>  	long delta;
>
>  	if (!c->parent)
>  		return;
>
> -	min = READ_ONCE(c->min);
> -	if (min || atomic_long_read(&c->min_usage)) {
> -		protected = min(usage, min);
> +	protected = min(usage, READ_ONCE(c->min));
> +	old_protected = atomic_long_read(&c->min_usage);
> +	if (protected != old_protected) {
>  		old_protected = atomic_long_xchg(&c->min_usage, protected);
>  		delta = protected - old_protected;
>  		if (delta)
>  			atomic_long_add(delta, &c->parent->children_min_usage);

What if there is a concurrent update of c->min_usage? Then the patched
version can miss an update. I can't imagine a case where that would
lead to bad consequences, so it's probably ok, but it's not super
obvious.

I think the way to think of it is that a missed update will be fixed
by the next one, so it's ok to run for some time with old numbers (see
the sketch below).

Acked-by: Roman Gushchin

Thanks!
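
To make the "missed update" scenario concrete, here is a user-space
model of the patched min-protection path, using C11 atomics in place
of the kernel's atomic_long_* helpers. The names, values, and main()
driver are illustrative only, not kernel code:

/* Model of the patched propagate_protected_usage() min path.
 * The race: if another thread changes min_usage after our
 * atomic_load() but we still observe protected == old, we skip the
 * exchange and leave a stale value behind; the next call with a
 * different 'protected' value re-synchronizes both counters. */
#include <stdatomic.h>
#include <stdio.h>

static atomic_long min_usage;		/* this counter's protected usage */
static atomic_long children_min_usage;	/* parent-side aggregate */

static void propagate(long usage, long min)
{
	long protected = usage < min ? usage : min;
	long old = atomic_load(&min_usage);

	if (protected != old) {		/* the new early-exit check */
		old = atomic_exchange(&min_usage, protected);
		long delta = protected - old;

		if (delta)
			atomic_fetch_add(&children_min_usage, delta);
	}
}

int main(void)
{
	propagate(100, 50);	/* protected = 50: exchange + propagate */
	propagate(100, 50);	/* unchanged: exchange and add skipped */
	printf("min_usage=%ld children_min_usage=%ld\n",
	       atomic_load(&min_usage), atomic_load(&children_min_usage));
	return 0;
}

The point of the patch is visible in the second call: the pre-patch
code would execute the exchange whenever any protection was set, while
here the common no-change case costs only the one atomic read.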