From: "Bendik Rønning Opstad" <bro.devel@gmail.com>
To: "David S. Miller" <davem@davemloft.net>, <netdev@vger.kernel.org>
Cc: "Yuchung Cheng" <ycheng@google.com>,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	"Neal Cardwell" <ncardwell@google.com>,
	"Andreas Petlund" <apetlund@simula.no>,
	"Carsten Griwodz" <griff@simula.no>,
	"Pål Halvorsen" <paalh@simula.no>,
	"Jonas Markussen" <jonassm@ifi.uio.no>,
	"Kristian Evensen" <kristian.evensen@gmail.com>,
	"Kenneth Klette Jonassen" <kennetkl@ifi.uio.no>,
	"Bendik Rønning Opstad" <bro.devel+kernel@gmail.com>
Subject: [PATCH RFC v2 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism
Date: Mon, 23 Nov 2015 17:26:25 +0100
Message-ID: <1448295986-14576-2-git-send-email-bro.devel+kernel@gmail.com>
In-Reply-To: <1448295986-14576-1-git-send-email-bro.devel+kernel@gmail.com>
In-Reply-To: <1445633413-3532-1-git-send-email-bro.devel+kernel@gmail.com>

The existing mechanism for detecting thin streams (tcp_stream_is_thin)
is based on a static limit of fewer than 4 packets in flight. This treats
streams differently depending on the connection's RTT, such that a stream
on a high RTT link may never be considered thin, whereas the same
application would produce a stream that is always thin in a low RTT
scenario (e.g. a data center).

By calculating a dynamic packets in flight limit (DPIFL), the thin stream
detection will be independent of the RTT and treat streams equally based
on the transmission pattern, i.e. the inter-transmission time (ITT).

Cc: Andreas Petlund <apetlund@simula.no>
Cc: Carsten Griwodz <griff@simula.no>
Cc: Pål Halvorsen <paalh@simula.no>
Cc: Jonas Markussen <jonassm@ifi.uio.no>
Cc: Kristian Evensen <kristian.evensen@gmail.com>
Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: Bendik Rønning Opstad <bro.devel+kernel@gmail.com>
---
 Documentation/networking/ip-sysctl.txt |  8 ++++++++
 include/net/tcp.h                      | 21 +++++++++++++++++++++
 net/ipv4/sysctl_net_ipv4.c             |  9 +++++++++
 net/ipv4/tcp.c                         |  2 ++
 4 files changed, 40 insertions(+)
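
The difference between the static limit and the DPIFL test described in
the commit message can be illustrated outside the kernel. The following
is only a user-space sketch, not the patch itself: struct stream_sample,
is_thin_static() and is_thin_dpifl() are hypothetical names, the RTT is
assumed to be already unscaled (the kernel stores srtt_us shifted left
by 3), and the initial slow-start check of tcp_stream_is_thin() is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the per-socket state the patch reads. */
struct stream_sample {
	uint32_t packets_in_flight; /* stands in for tcp_packets_in_flight(tp) */
	uint32_t srtt_us;           /* smoothed RTT in usec, i.e. tp->srtt_us >> 3 */
};

/* Static test: thin if fewer than 4 packets are outstanding.
 * (The kernel helper uses tp->packets_out and also checks for
 * initial slow start; both details are simplified away here.)
 */
static bool is_thin_static(const struct stream_sample *s)
{
	return s->packets_in_flight < 4;
}

/* Dynamic test: PIF < RTT / ITT-lower-bound, rearranged as
 * PIF * ITT-lower-bound < RTT so that no division is needed.
 * The 64-bit cast keeps the product from overflowing.
 */
static bool is_thin_dpifl(const struct stream_sample *s,
			  uint32_t itt_lower_bound_us)
{
	return (uint64_t)s->packets_in_flight * itt_lower_bound_us <
	       s->srtt_us;
}
```

As an example under these assumptions, an application sending one packet
every 25 ms has roughly 8 packets in flight at an RTT of 200 ms. The
static test rejects it (8 is not below 4), while the dynamic test with
the default bound of 10000 usec accepts it (8 * 10000 < 200000). The
same application at an RTT of 50 ms has roughly 2 packets in flight and
passes both tests.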

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 2ea4c45..938ae73 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -700,6 +700,14 @@ tcp_thin_dupack - BOOLEAN
 	Documentation/networking/tcp-thin.txt
 	Default: 0
 
+tcp_thin_dpifl_itt_lower_bound - INTEGER
+	Controls the lower bound inter-transmission time (ITT) threshold
+	for when a stream is considered thin. The value is specified in
+	microseconds, and may not be lower than 10000 (10 ms). Based on
+	this threshold, a dynamic packets in flight limit (DPIFL) is
+	calculated, which is used to classify whether a stream is thin.
+	Default: 10000
+
 tcp_limit_output_bytes - INTEGER
 	Controls TCP Small Queue limit per tcp socket.
 	TCP bulk sender tends to increase packets in flight until it
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4fc457b..deac96f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -215,6 +215,8 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
 
 /* TCP thin-stream limits */
 #define TCP_THIN_LINEAR_RETRIES 6       /* After 6 linear retries, do exp. backoff */
+/* Lowest possible DPIFL lower bound ITT is 10 ms (10000 usec) */
+#define TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN 10000
 
 /* TCP initial congestion window as per draft-hkchu-tcpm-initcwnd-01 */
 #define TCP_INIT_CWND		10
@@ -274,6 +276,7 @@ extern int sysctl_tcp_workaround_signed_windows;
 extern int sysctl_tcp_slow_start_after_idle;
 extern int sysctl_tcp_thin_linear_timeouts;
 extern int sysctl_tcp_thin_dupack;
+extern int sysctl_tcp_thin_dpifl_itt_lower_bound;
 extern int sysctl_tcp_early_retrans;
 extern int sysctl_tcp_limit_output_bytes;
 extern int sysctl_tcp_challenge_ack_limit;
@@ -1631,6 +1634,24 @@ static inline bool tcp_stream_is_thin(struct tcp_sock *tp)
 	return tp->packets_out < 4 && !tcp_in_initial_slowstart(tp);
 }
 
+/**
+ * tcp_stream_is_thin_dpifl() - Tests if the stream is thin based on dynamic PIF
+ *                              limit
+ * @tp: the tcp_sock struct
+ *
+ * Return: true if current packets in flight (PIF) count is lower than
+ *         the dynamic PIF limit, else false
+ */
+static inline bool tcp_stream_is_thin_dpifl(const struct tcp_sock *tp)
+{
+	/* Calculate the maximum allowed PIF limit by dividing the RTT by
+	 * the minimum allowed inter-transmission time (ITT).
+	 * Tests if PIF < RTT / ITT-lower-bound
+	 */
+	return (u64) tcp_packets_in_flight(tp) *
+		sysctl_tcp_thin_dpifl_itt_lower_bound < (tp->srtt_us >> 3);
+}
+
 /* /proc */
 enum tcp_seq_states {
 	TCP_SEQ_STATE_LISTENING,
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a0bd7a5..5b12446 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -42,6 +42,7 @@ static int tcp_syn_retries_min = 1;
 static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
 static int ip_ping_group_range_min[] = { 0, 0 };
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
+static int tcp_thin_dpifl_itt_lower_bound_min = TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN;
 
 /* Update system visible IP port range */
 static void set_local_port_range(struct net *net, int range[2])
@@ -709,6 +710,14 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler   = proc_dointvec
 	},
 	{
+		.procname	= "tcp_thin_dpifl_itt_lower_bound",
+		.data		= &sysctl_tcp_thin_dpifl_itt_lower_bound,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.extra1		= &tcp_thin_dpifl_itt_lower_bound_min,
+	},
+	{
 		.procname	= "tcp_early_retrans",
 		.data		= &sysctl_tcp_early_retrans,
 		.maxlen		= sizeof(int),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c172877..cb3354d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -287,6 +287,8 @@ int sysctl_tcp_min_tso_segs __read_mostly = 2;
 
 int sysctl_tcp_autocorking __read_mostly = 1;
 
+int sysctl_tcp_thin_dpifl_itt_lower_bound __read_mostly = TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN;
+
 struct percpu_counter tcp_orphan_count;
 EXPORT_SYMBOL_GPL(tcp_orphan_count);
 
-- 
1.9.1
