netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Tom Herbert <tom@herbertland.com>
Cc: Yann Ylavic <ylavic.dev@gmail.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Tolga Ceylan <tolga.ceylan@gmail.com>,
	Craig Gallek <cgallek@google.com>, Josh Snyder <josh@code406.com>,
	Aaron Conole <aconole@bytheb.org>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode
Date: Fri, 25 Mar 2016 06:28:19 +0100	[thread overview]
Message-ID: <20160325052819.GA8548@1wt.eu> (raw)
In-Reply-To: <CALx6S37uZah89sNgH9wuD1J+_WEhd34Z5zmrnG8Qp-AQ7Ew=Jg@mail.gmail.com>

On Thu, Mar 24, 2016 at 04:54:03PM -0700, Tom Herbert wrote:
> On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic <ylavic.dev@gmail.com> wrote:
> > I'll learn how to do this to get the best performances from the
> > server, but having to do so to work around what looks like a defect
> > (for simple/default SMP configurations at least, no NUMA or clever
> > CPU-affinity or queuing policy involved) seems odd in the first place.
> >
> I disagree with your assessment that there is a defect. SO_REUSEPORT
> is designed to spread packets amongst _equivalent_ connections. In the
> server draining case sockets are no longer equivalent, but that is a
> special case.

I partially disagree with you here Tom. Initially SO_REUSEPORT was not
used to spread packets but to allow soft restart in some applications.
I've been using it since 2001 in haproxy on *BSD and linux 2.2. It was
removed during 2.3 and I used to keep a patch to reimplement it in 2.4
(basically 2 or 3 lines, the infrastructure was still present), but the
patch was not accepted. The same patch worked for 2.6 and 3.x, allowing
me to continue to perform soft-restarts on Linux just like I used to do
on *BSD. When SO_REUSEPORT was reimplemented in 3.9 with load balancing,
I was happy because it at last allowed me to drop my patch and I got
the extra benefit of better load balancing of incoming connections.

But the main use we have for it (at least historically) is for soft
restarts, where one process replaces another one. Very few people use
more than one process in our case.

However given the benefits of the load spreading for extreme loads,
I'm willing to find how to achieve the same with BPF, but it's pretty
clear that at this point I have no idea where to start from and that
for a single process replacing a single one, it looks quite complicated.

For me quite frankly the special case is the load balancing which is
a side benefit (and a nice one, don't get me wrong).

That's why I would have found it nice to "fix" the process replacement
to avoid dropping incoming connections, though I don't want it to become
a problem for future improvements on BPF. I don't think the two lines I
proposed could become an issue but I'll live without them (or continue
to apply this patch).

BTW, I have no problem with having to write a little bit of assembly for
fast interfaces if it remains untouched for years, we do already have a
bit in haproxy. It's just a longterm investment.

Best regards,
Willy

  parent reply	other threads:[~2016-03-25  5:28 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-27  0:30 [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode Tolga Ceylan
2015-09-27  1:04 ` Eric Dumazet
2015-09-27  1:37   ` Tolga Ceylan
2015-09-27  1:44 ` Aaron Conole
2015-09-27  2:02   ` Tolga Ceylan
2015-09-27  2:24     ` Eric Dumazet
2015-11-11  5:41       ` Tom Herbert
2015-11-11  6:19         ` Eric Dumazet
2015-11-11 17:05           ` Tom Herbert
2015-11-11 17:23             ` Eric Dumazet
2015-11-11 18:23               ` Tom Herbert
2015-11-11 18:43                 ` Eric Dumazet
2015-11-12  1:09                   ` Eric Dumazet
2015-12-15 16:14                     ` Willy Tarreau
2015-12-15 17:10                       ` Eric Dumazet
2015-12-15 17:43                         ` Willy Tarreau
2015-12-15 18:21                           ` Eric Dumazet
2015-12-15 19:44                             ` Willy Tarreau
2015-12-15 21:21                               ` Eric Dumazet
2015-12-16  7:38                                 ` Willy Tarreau
2015-12-16 16:15                                   ` Willy Tarreau
2015-12-18 16:33                                     ` Josh Snyder
2015-12-18 18:58                                       ` Willy Tarreau
2015-12-19  2:38                                         ` Eric Dumazet
2015-12-19  7:00                                           ` Willy Tarreau
2015-12-21 20:38                                             ` Tom Herbert
2015-12-21 20:41                                               ` Willy Tarreau
2016-03-24  5:10                                                 ` Tolga Ceylan
2016-03-24  6:12                                                   ` Willy Tarreau
2016-03-24 14:13                                                     ` Eric Dumazet
2016-03-24 14:22                                                       ` Willy Tarreau
2016-03-24 14:45                                                         ` Eric Dumazet
2016-03-24 15:30                                                           ` Willy Tarreau
2016-03-24 16:33                                                             ` Eric Dumazet
2016-03-24 16:50                                                               ` Willy Tarreau
2016-03-24 17:01                                                                 ` Eric Dumazet
2016-03-24 17:26                                                                   ` Tom Herbert
2016-03-24 17:55                                                                     ` Daniel Borkmann
2016-03-24 18:20                                                                       ` Tolga Ceylan
2016-03-24 18:24                                                                         ` Willy Tarreau
2016-03-24 18:37                                                                         ` Eric Dumazet
2016-03-24 22:40                                                                       ` Yann Ylavic
2016-03-24 22:49                                                                         ` Eric Dumazet
2016-03-24 23:40                                                                           ` Yann Ylavic
2016-03-24 23:54                                                                             ` Tom Herbert
2016-03-25  0:01                                                                               ` Yann Ylavic
2016-03-25  5:28                                                                               ` Willy Tarreau [this message]
2016-03-25  6:49                                                                                 ` Eric Dumazet
2016-03-25  8:53                                                                                   ` Willy Tarreau
2016-03-25 11:21                                                                                     ` Yann Ylavic
2016-03-25 13:17                                                                                       ` Eric Dumazet
2016-03-25  0:25                                                                           ` David Miller
2016-03-25  0:24                                                                         ` David Miller
2016-03-24 18:00                                                                   ` Willy Tarreau
2016-03-24 18:21                                                                     ` Willy Tarreau
2016-03-24 18:32                                                                     ` Eric Dumazet
2016-03-25 15:29 Craig Gallek
2016-03-25 16:21 ` Alexei Starovoitov
2016-03-25 16:31   ` Craig Gallek
2016-03-25 17:00     ` Eric Dumazet
2016-03-25 18:31       ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160325052819.GA8548@1wt.eu \
    --to=w@1wt.eu \
    --cc=aconole@bytheb.org \
    --cc=cgallek@google.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=josh@code406.com \
    --cc=netdev@vger.kernel.org \
    --cc=tolga.ceylan@gmail.com \
    --cc=tom@herbertland.com \
    --cc=ylavic.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).