netfilter-devel.vger.kernel.org archive mirror
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org
Subject: [PATCH 01/25] ipvs: count pre-established TCP states as active
Date: Sat, 23 Jul 2016 13:08:15 +0200
Message-ID: <1469272119-29942-2-git-send-email-pablo@netfilter.org>
In-Reply-To: <1469272119-29942-1-git-send-email-pablo@netfilter.org>

From: Michal Kubecek <mkubecek@suse.cz>

Some users observed that the "least connection" distribution algorithm
does not handle bursts of TCP connections from reconnecting clients well
after a node or network failure.

This is because the algorithm counts one active connection as worth 256
inactive ones and, for TCP, "active" only means connections in the
ESTABLISHED state. In a connection burst, new connections are scheduled
before previous ones have finished the three-way handshake, so all of
them are still counted as "inactive", i.e. cheap. They become "active"
quickly, but by that time all of them have already been assigned to one
real server (or a few), resulting in a highly unbalanced distribution.

Address this by counting the "pre-established" states (SYN_SENT,
SYN_RECV and SYNACK) as "active".

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
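Note: the imbalance described in the commit message can be reproduced
with a small userspace model. The sketch below is not kernel code and is
not part of the patch. The names sim_dest, sim_overhead, sim_pick and
sim_burst are hypothetical, and the cost formula (256 * active + inactive)
is simply the weighting stated in the commit message. It assumes that
every connection in the burst is still in the handshake when the next one
is scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-destination connection counters. */
struct sim_dest {
	int activeconns;
	int inactconns;
};

/* Cost of a destination: one active connection weighs as much as
 * 256 inactive ones (the weighting stated in the commit message). */
static int sim_overhead(const struct sim_dest *d)
{
	return d->activeconns * 256 + d->inactconns;
}

/* Pick the destination with the lowest cost; ties go to the lowest index. */
static size_t sim_pick(const struct sim_dest *dests, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (sim_overhead(&dests[i]) < sim_overhead(&dests[best]))
			best = i;
	return best;
}

/* Schedule `burst` connections that are all still in the handshake.
 * syn_is_active == false models the old accounting (handshake states
 * count as inactive); true models the accounting after this patch. */
static void sim_burst(struct sim_dest *dests, size_t n, int burst,
		      bool syn_is_active)
{
	for (int i = 0; i < burst; i++) {
		struct sim_dest *d = &dests[sim_pick(dests, n)];

		if (syn_is_active)
			d->activeconns++;
		else
			d->inactconns++;
	}
}
```

With one idle destination and one that already holds 10 active
connections, a burst of 100 handshaking connections puts all 100 on the
idle destination under the old accounting. When they are counted as
active, the model ends with 55 connections on each destination.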
 net/netfilter/ipvs/ip_vs_proto_tcp.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index d7024b2..5117bcb 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -395,6 +395,20 @@ static const char *const tcp_state_name_table[IP_VS_TCP_S_LAST+1] = {
 	[IP_VS_TCP_S_LAST]		=	"BUG!",
 };
 
+static const bool tcp_state_active_table[IP_VS_TCP_S_LAST] = {
+	[IP_VS_TCP_S_NONE]		=	false,
+	[IP_VS_TCP_S_ESTABLISHED]	=	true,
+	[IP_VS_TCP_S_SYN_SENT]		=	true,
+	[IP_VS_TCP_S_SYN_RECV]		=	true,
+	[IP_VS_TCP_S_FIN_WAIT]		=	false,
+	[IP_VS_TCP_S_TIME_WAIT]		=	false,
+	[IP_VS_TCP_S_CLOSE]		=	false,
+	[IP_VS_TCP_S_CLOSE_WAIT]	=	false,
+	[IP_VS_TCP_S_LAST_ACK]		=	false,
+	[IP_VS_TCP_S_LISTEN]		=	false,
+	[IP_VS_TCP_S_SYNACK]		=	true,
+};
+
 #define sNO IP_VS_TCP_S_NONE
 #define sES IP_VS_TCP_S_ESTABLISHED
 #define sSS IP_VS_TCP_S_SYN_SENT
@@ -418,6 +432,13 @@ static const char * tcp_state_name(int state)
 	return tcp_state_name_table[state] ? tcp_state_name_table[state] : "?";
 }
 
+static bool tcp_state_active(int state)
+{
+	if (state >= IP_VS_TCP_S_LAST)
+		return false;
+	return tcp_state_active_table[state];
+}
+
 static struct tcp_states_t tcp_states [] = {
 /*	INPUT */
 /*        sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI, sSA	*/
@@ -540,12 +561,12 @@ set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,
 
 		if (dest) {
 			if (!(cp->flags & IP_VS_CONN_F_INACTIVE) &&
-			    (new_state != IP_VS_TCP_S_ESTABLISHED)) {
+			    !tcp_state_active(new_state)) {
 				atomic_dec(&dest->activeconns);
 				atomic_inc(&dest->inactconns);
 				cp->flags |= IP_VS_CONN_F_INACTIVE;
 			} else if ((cp->flags & IP_VS_CONN_F_INACTIVE) &&
-				   (new_state == IP_VS_TCP_S_ESTABLISHED)) {
+				   tcp_state_active(new_state)) {
 				atomic_inc(&dest->activeconns);
 				atomic_dec(&dest->inactconns);
 				cp->flags &= ~IP_VS_CONN_F_INACTIVE;
-- 
2.1.4
