From: Julian Anastasov <ja@ssi.bg>
To: Michal Kubecek <mkubecek@suse.cz>
Cc: lvs-devel@vger.kernel.org, Wensong Zhang <wensong@linux-vs.org>,
Simon Horman <horms@verge.net.au>,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Pablo Neira Ayuso <pablo@netfilter.org>,
Patrick McHardy <kaber@trash.net>,
Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH ipvs-next] ipvs: count pre-established TCP states as active
Date: Sun, 12 Jun 2016 18:27:39 +0300 (EEST) [thread overview]
Message-ID: <alpine.LFD.2.11.1606121825110.2021@ja.home.ssi.bg> (raw)
In-Reply-To: <20160603155650.292BEA0E60@unicorn.suse.cz>
Hello,
On Fri, 3 Jun 2016, Michal Kubecek wrote:
> Some users observed that "least connection" distribution algorithm doesn't
> handle well bursts of TCP connections from reconnecting clients after
> a node or network failure.
>
> This is because the algorithm counts active connection as worth 256
> inactive ones where for TCP, "active" only means TCP connections in
> ESTABLISHED state. In case of a connection burst, new connections are
> handled before previous ones have finished the three-way handshake, so
> that all are still counted as "inactive", i.e. cheap ones. They become
> "active" quickly, but by that time all of them are already assigned to one
> real server (or a few), resulting in a highly unbalanced distribution.
>
> Address this by counting the "pre-established" states as "active".
>
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Julian Anastasov <ja@ssi.bg>
Simon, please apply!
> ---
> net/netfilter/ipvs/ip_vs_proto_tcp.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
> index d7024b2ed769..5117bcb7d2f0 100644
> --- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
> +++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
> @@ -395,6 +395,20 @@ static const char *const tcp_state_name_table[IP_VS_TCP_S_LAST+1] = {
> [IP_VS_TCP_S_LAST] = "BUG!",
> };
>
> +static const bool tcp_state_active_table[IP_VS_TCP_S_LAST] = {
> + [IP_VS_TCP_S_NONE] = false,
> + [IP_VS_TCP_S_ESTABLISHED] = true,
> + [IP_VS_TCP_S_SYN_SENT] = true,
> + [IP_VS_TCP_S_SYN_RECV] = true,
> + [IP_VS_TCP_S_FIN_WAIT] = false,
> + [IP_VS_TCP_S_TIME_WAIT] = false,
> + [IP_VS_TCP_S_CLOSE] = false,
> + [IP_VS_TCP_S_CLOSE_WAIT] = false,
> + [IP_VS_TCP_S_LAST_ACK] = false,
> + [IP_VS_TCP_S_LISTEN] = false,
> + [IP_VS_TCP_S_SYNACK] = true,
> +};
> +
> #define sNO IP_VS_TCP_S_NONE
> #define sES IP_VS_TCP_S_ESTABLISHED
> #define sSS IP_VS_TCP_S_SYN_SENT
> @@ -418,6 +432,13 @@ static const char * tcp_state_name(int state)
> return tcp_state_name_table[state] ? tcp_state_name_table[state] : "?";
> }
>
> +static bool tcp_state_active(int state)
> +{
> + if (state >= IP_VS_TCP_S_LAST)
> + return false;
> + return tcp_state_active_table[state];
> +}
> +
> static struct tcp_states_t tcp_states [] = {
> /* INPUT */
> /* sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI, sSA */
> @@ -540,12 +561,12 @@ set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,
>
> if (dest) {
> if (!(cp->flags & IP_VS_CONN_F_INACTIVE) &&
> - (new_state != IP_VS_TCP_S_ESTABLISHED)) {
> + !tcp_state_active(new_state)) {
> atomic_dec(&dest->activeconns);
> atomic_inc(&dest->inactconns);
> cp->flags |= IP_VS_CONN_F_INACTIVE;
> } else if ((cp->flags & IP_VS_CONN_F_INACTIVE) &&
> - (new_state == IP_VS_TCP_S_ESTABLISHED)) {
> + tcp_state_active(new_state)) {
> atomic_inc(&dest->activeconns);
> atomic_dec(&dest->inactconns);
> cp->flags &= ~IP_VS_CONN_F_INACTIVE;
> --
> 2.8.3
Regards
--
Julian Anastasov <ja@ssi.bg>
Thread overview: 4+ messages
2016-06-03 15:56 [PATCH ipvs-next] ipvs: count pre-established TCP states as active Michal Kubecek
2016-06-06 7:23 ` Julian Anastasov
2016-06-12 15:27 ` Julian Anastasov [this message]
2016-06-13 5:20 ` Simon Horman