From: Eric Dumazet <edumazet@google.com>
To: Olof Johansson <olof@lixom.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
David Miller <davem@davemloft.net>,
Neil Horman <nhorman@tuxdriver.com>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
Vladislav Yasevich <vyasevich@gmail.com>,
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
linux-sctp@vger.kernel.org, netdev <netdev@vger.kernel.org>,
linux-decnet-user@lists.sourceforge.net,
kernel-team <kernel-team@fb.com>,
Yuchung Cheng <ycheng@google.com>,
Neal Cardwell <ncardwell@google.com>
Subject: Re: [PATCH] net/sock: move memory_allocated over to percpu_counter variables
Date: Fri, 7 Sep 2018 00:21:46 -0700
Message-ID: <CANn89iKgZkfwQ8nAGEfOzubOh69y285TNKB5Q518Wf_phbq2Yg@mail.gmail.com>
In-Reply-To: <CANn89iKJcgMWb2Kmk6L9k=NkfBUKZ6BwriWr3O+N5Y0u5dy=9g@mail.gmail.com>
On Fri, Sep 7, 2018 at 12:03 AM Eric Dumazet <edumazet@google.com> wrote:
> The problem is that we have platforms with more than 100 cpus, and
> the cost of sk_memory_allocated() will be too high,
> especially if the host is under memory pressure, since reading the
> total then touches the private counter of every cpu.
>
> per cpu variables do not really scale, they were ok 10 years ago when
> no more than 16 cpus were the norm.
>
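
To make the cost argument above concrete, here is a minimal user-space
sketch, not kernel code. The names pcpu_counter_model, model_add and
model_sum are hypothetical, and NR_CPUS = 128 is only an assumption
standing in for "more than 100 cpus". It shows that an update touches one
slot, while an exact read has to walk one slot per cpu.

```c
#define NR_CPUS 128	/* assumption: a host with "more than 100 cpus" */

/* Hypothetical user-space model of a per-cpu counter: one slot per cpu. */
struct pcpu_counter_model {
	long slot[NR_CPUS];
};

/* Update path: touches only the slot of the calling cpu, so it is cheap. */
static void model_add(struct pcpu_counter_model *c, int cpu, long amount)
{
	c->slot[cpu] += amount;
}

/*
 * Exact read path: has to visit every slot, so the cost grows linearly
 * with the number of cpus. slots_touched reports how many were read.
 */
static long model_sum(const struct pcpu_counter_model *c, int *slots_touched)
{
	long total = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		total += c->slot[cpu];
	*slots_touched = NR_CPUS;
	return total;
}
```

In this model every precise read costs NR_CPUS loads, each potentially a
remote cache line on real hardware, which is the scaling concern raised
in the quoted text.
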
> I would prefer to change TCP so that it does not aggressively call
> __sk_mem_reduce_allocated() from tcp_write_timer().
>
> Ideally only tcp_retransmit_timer() should attempt to reduce forward
> allocations, after recurring timeouts.
>
> Note that after commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d
> ("net: avoid sk_forward_alloc overflows")
> we have better control over sockets having huge forward allocations.
>
> Something like :

Or something less risky:

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 7fdf222a0bdfe9775970082f6b5dcdcc82b2ae1a..0aee80b6966cb2898e46350c761f9eb431ff1206 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -604,7 +604,8 @@ void tcp_write_timer_handler(struct sock *sk)
 	}
 
 out:
-	sk_mem_reclaim(sk);
+	if (tcp_under_memory_pressure(sk))
+		sk_mem_reclaim(sk);
 }
 
 static void tcp_write_timer(struct timer_list *t)
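
For readers who want to see the effect of the change in isolation, the
following is a rough user-space model of the gated reclaim, under stated
assumptions. struct sock_model, model_mem_reclaim and
model_write_timer_exit are hypothetical stand-ins, the under_pressure
flag stands in for tcp_under_memory_pressure(sk), and the reclaim is
simplified to return the whole forward allocation (the real
sk_mem_reclaim() works in larger units).

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct sock; only what the sketch needs. */
struct sock_model {
	long forward_alloc;	/* bytes this socket holds in reserve */
	bool under_pressure;	/* stands in for tcp_under_memory_pressure(sk) */
};

/* Stand-in for the shared, protocol-wide memory_allocated counter. */
static long shared_allocated;
/* Counts writes to the shared counter, i.e. the contended operation. */
static int shared_counter_writes;

/* Simplified reclaim: hand the whole reserve back to the shared pool. */
static void model_mem_reclaim(struct sock_model *sk)
{
	if (sk->forward_alloc > 0) {
		shared_allocated -= sk->forward_alloc;
		shared_counter_writes++;
		sk->forward_alloc = 0;
	}
}

/* Timer exit path as in the patch: reclaim only under memory pressure. */
static void model_write_timer_exit(struct sock_model *sk)
{
	if (sk->under_pressure)
		model_mem_reclaim(sk);
}
```

With the gate in place, a timer firing on a host that is not under
pressure leaves both the socket's reserve and the shared counter
untouched; the shared counter is only written once pressure is reported.
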

Thread overview: 9+ messages
2018-09-06 19:20 [PATCH] net/sock: move memory_allocated over to percpu_counter variables Olof Johansson
2018-09-06 19:33 ` Eric Dumazet
2018-09-07 3:32 ` Herbert Xu
2018-09-07 6:20 ` Olof Johansson
2018-09-07 7:03 ` Eric Dumazet
2018-09-07 7:21 ` Eric Dumazet [this message]
2018-09-08 17:02 ` Olof Johansson
2018-09-09 18:38 ` Eric Dumazet
2018-09-18 9:37 ` [LKP] [net/sock] b99259a614: netperf.Throughput_Mbps -6.6% regression kernel test robot