From: Eric Dumazet <edumazet@google.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Xin Long <lucien.xin@gmail.com>,
	 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	kernel test robot <oliver.sang@intel.com>,
	 Shakeel Butt <shakeelb@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	 LKML <linux-kernel@vger.kernel.org>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	network dev <netdev@vger.kernel.org>,
	linux-s390@vger.kernel.org,
	 MPTCP Upstream <mptcp@lists.linux.dev>,
	 "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>,
	lkp@lists.01.org,  kbuild test robot <lkp@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	"Tang, Feng" <feng.tang@intel.com>,
	 zhengjun.xing@linux.intel.com, fengwei.yin@intel.com,
	 Ying Xu <yinxu@redhat.com>
Subject: Re: [net] 4890b686f4: netperf.Throughput_Mbps -69.4% regression
Date: Fri, 24 Jun 2022 06:13:51 +0200	[thread overview]
Message-ID: <CANn89iLidqjiiV8vxr7KnUg0JvfoS9+TRGg=8ANZ8NBRjeQxsQ@mail.gmail.com> (raw)
In-Reply-To: <20220623185730.25b88096@kernel.org>

On Fri, Jun 24, 2022 at 3:57 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 23 Jun 2022 18:50:07 -0400 Xin Long wrote:
> > From the perf data, we can see that __sk_mem_reduce_allocated() is
> > consuming far more CPU than before, and that mem_cgroup APIs are
> > called from this function. That means mem cgroup must be enabled in
> > the test environment, which may explain why I couldn't reproduce it.
> >
> > Commit 4890b686f4 ("net: keep sk->sk_forward_alloc as small as
> > possible") uses sk_mem_reclaim() (checking reclaimable >= PAGE_SIZE)
> > to reclaim memory, so __sk_mem_reduce_allocated() is called *much
> > more frequently* than before (when the check was reclaimable >=
> > SK_RECLAIM_THRESHOLD). That may be cheap when
> > mem_cgroup_sockets_enabled is false, but I'm not sure it is still
> > cheap when mem_cgroup_sockets_enabled is true.
> >
> > I think the SCTP netperf test can trigger this, since CPU is the
> > bottleneck for SCTP netperf testing, which makes it more sensitive
> > to the extra function calls than TCP.
> >
> > Can we re-run this testing without mem cgroup enabled?
>
> FWIW I defer to Eric, thanks a lot for double checking the report
> and digging in!

I ran tests with TCP + memcg and noticed a very small additional cost
in the memcg functions, caused by a suboptimal structure layout:

Extract of an internal Google bug, update from June 9th:

--------------------------------
I have noticed minor false sharing when fetching (struct
mem_cgroup)->css.parent, at offset 0xc0, because it shares the cache
line containing struct mem_cgroup.memory, at offset 0xd0.

Ideally, memcg->socket_pressure and memcg->parent should sit in a
read-mostly cache line.
-----------------------
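
Something along these lines is what I have in mind (only an illustration
of the grouping; the surrounding fields are omitted and whether
____cacheline_aligned_in_smp is the right tool is for the memcg folks to
decide):

struct mem_cgroup {
	struct cgroup_subsys_state css; /* css.parent is read on every (un)charge */

	/* ... */

	/* Read-mostly field tested in the socket charge path; keeping it
	 * on a read-mostly cache line avoids bouncing with the hot
	 * counters below.
	 */
	unsigned long socket_pressure ____cacheline_aligned_in_smp;

	/* Frequently written on charge/uncharge; give it its own cache
	 * line so it no longer shares one with css.parent.
	 */
	struct page_counter memory ____cacheline_aligned_in_smp;

	/* ... */
};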

But nothing there could explain a "-69.4% regression".

memcg uses a very similar strategy of per-cpu reserves, with
MEMCG_CHARGE_BATCH being 32 pages per CPU.
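
The fast path works roughly like this (heavily simplified from my reading
of mm/memcontrol.c, not the exact upstream code):

/* Serve a charge from the per-cpu stock when possible; only a miss
 * falls back to the atomic page_counter operations.
 */
static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock;
	unsigned long flags;
	bool ret = false;

	if (nr_pages > MEMCG_CHARGE_BATCH)
		return false;

	local_irq_save(flags);

	stock = this_cpu_ptr(&memcg_stock);
	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
		stock->nr_pages -= nr_pages;
		ret = true;
	}

	local_irq_restore(flags);
	return ret;
}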

It is not clear why SCTP with 10K writes would overflow this reserve constantly.

Presumably memcg experts will have to rework the structure alignments
to make sure memcg can cope better with more frequent charge/uncharge
operations, because we are not going back to gigantic per-socket
reserves; that simply does not scale.

