netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Bendik Rønning Opstad" <bro.devel@gmail.com>
To: "David S. Miller" <davem@davemloft.net>, <netdev@vger.kernel.org>
Cc: "Yuchung Cheng" <ycheng@google.com>,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	"Neal Cardwell" <ncardwell@google.com>,
	"Andreas Petlund" <apetlund@simula.no>,
	"Carsten Griwodz" <griff@simula.no>,
	"Pål Halvorsen" <paalh@simula.no>,
	"Jonas Markussen" <jonassm@ifi.uio.no>,
	"Kristian Evensen" <kristian.evensen@gmail.com>,
	"Kenneth Klette Jonassen" <kennetkl@ifi.uio.no>,
	"Bendik Rønning Opstad" <bro.devel+kernel@gmail.com>
Subject: [PATCH v3 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism
Date: Tue,  2 Feb 2016 20:23:14 +0100	[thread overview]
Message-ID: <1454440995-18457-2-git-send-email-bro.devel+kernel@gmail.com> (raw)
In-Reply-To: <1454440995-18457-1-git-send-email-bro.devel+kernel@gmail.com>
In-Reply-To: <1445633413-3532-1-git-send-email-bro.devel+kernel@gmail.com>

The existing mechanism for detecting thin streams (tcp_stream_is_thin)
is based on a static limit of less than 4 packets in flight. This treats
streams differently depending on the connections RTT, such that a stream
on a high RTT link may never be considered thin, whereas the same
application would produce a stream that would always be thin in a low RTT
scenario (e.g. data center).

By calculating a dynamic packets in flight limit (DPIFL), the thin stream
detection will be independent of the RTT and treat streams equally based
on the transmission pattern, i.e. the inter-transmission time (ITT).

Cc: Andreas Petlund <apetlund@simula.no>
Cc: Carsten Griwodz <griff@simula.no>
Cc: Pål Halvorsen <paalh@simula.no>
Cc: Jonas Markussen <jonassm@ifi.uio.no>
Cc: Kristian Evensen <kristian.evensen@gmail.com>
Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: Bendik Rønning Opstad <bro.devel+kernel@gmail.com>
---
 Documentation/networking/ip-sysctl.txt |  8 ++++++++
 include/net/tcp.h                      | 21 +++++++++++++++++++++
 net/ipv4/sysctl_net_ipv4.c             |  9 +++++++++
 net/ipv4/tcp.c                         |  2 ++
 4 files changed, 40 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 73b36d7..eb42853 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -708,6 +708,14 @@ tcp_thin_dupack - BOOLEAN
 	Documentation/networking/tcp-thin.txt
 	Default: 0
 
+tcp_thin_dpifl_itt_lower_bound - INTEGER
+	Controls the lower bound inter-transmission time (ITT) threshold
+	for when a stream is considered thin. The value is specified in
+	microseconds, and may not be lower than 10000 (10 ms). Based on
+	this threshold, a dynamic packets in flight limit (DPIFL) is
+	calculated, which is used to classify whether a stream is thin.
+	Default: 10000
+
 tcp_limit_output_bytes - INTEGER
 	Controls TCP Small Queue limit per tcp socket.
 	TCP bulk sender tends to increase packets in flight until it
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3dd20fe..2d86bd7 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -215,6 +215,8 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
 
 /* TCP thin-stream limits */
 #define TCP_THIN_LINEAR_RETRIES 6       /* After 6 linear retries, do exp. backoff */
+/* Lowest possible DPIFL lower bound ITT is 10 ms (10000 usec) */
+#define TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN 10000
 
 /* TCP initial congestion window as per rfc6928 */
 #define TCP_INIT_CWND		10
@@ -271,6 +273,7 @@ extern int sysctl_tcp_workaround_signed_windows;
 extern int sysctl_tcp_slow_start_after_idle;
 extern int sysctl_tcp_thin_linear_timeouts;
 extern int sysctl_tcp_thin_dupack;
+extern int sysctl_tcp_thin_dpifl_itt_lower_bound;
 extern int sysctl_tcp_early_retrans;
 extern int sysctl_tcp_limit_output_bytes;
 extern int sysctl_tcp_challenge_ack_limit;
@@ -1649,6 +1652,24 @@ static inline bool tcp_stream_is_thin(struct tcp_sock *tp)
 	return tp->packets_out < 4 && !tcp_in_initial_slowstart(tp);
 }
 
+/**
+ * tcp_stream_is_thin_dpifl() - Tests if the stream is thin based on dynamic PIF
+ *                              limit
+ * @tp: the tcp_sock struct
+ *
+ * Return: true if current packets in flight (PIF) count is lower than
+ *         the dynamic PIF limit, else false
+ */
+static inline bool tcp_stream_is_thin_dpifl(const struct tcp_sock *tp)
+{
+	/* Calculate the maximum allowed PIF limit by dividing the RTT by
+	 * the minimum allowed inter-transmission time (ITT).
+	 * Tests if PIF < RTT / ITT-lower-bound
+	 */
+	return (u64) tcp_packets_in_flight(tp) *
+		sysctl_tcp_thin_dpifl_itt_lower_bound < (tp->srtt_us >> 3);
+}
+
 /* /proc */
 enum tcp_seq_states {
 	TCP_SEQ_STATE_LISTENING,
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 4d367b4..6014bc4 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -41,6 +41,7 @@ static int tcp_syn_retries_min = 1;
 static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
 static int ip_ping_group_range_min[] = { 0, 0 };
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
+static int tcp_thin_dpifl_itt_lower_bound_min = TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN;
 
 /* Update system visible IP port range */
 static void set_local_port_range(struct net *net, int range[2])
@@ -687,6 +688,14 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler   = proc_dointvec
 	},
 	{
+		.procname	= "tcp_thin_dpifl_itt_lower_bound",
+		.data		= &sysctl_tcp_thin_dpifl_itt_lower_bound,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.extra1		= &tcp_thin_dpifl_itt_lower_bound_min,
+	},
+	{
 		.procname	= "tcp_early_retrans",
 		.data		= &sysctl_tcp_early_retrans,
 		.maxlen		= sizeof(int),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 19746b3..16087fe 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -288,6 +288,8 @@ int sysctl_tcp_min_tso_segs __read_mostly = 2;
 
 int sysctl_tcp_autocorking __read_mostly = 1;
 
+int sysctl_tcp_thin_dpifl_itt_lower_bound __read_mostly = TCP_THIN_DPIFL_ITT_LOWER_BOUND_MIN;
+
 struct percpu_counter tcp_orphan_count;
 EXPORT_SYMBOL_GPL(tcp_orphan_count);
 
-- 
1.9.1

  parent reply	other threads:[~2016-02-02 19:24 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-23 20:50 [PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) Bendik Rønning Opstad
2015-10-23 20:50 ` [PATCH RFC net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2015-10-23 21:44   ` Eric Dumazet
     [not found]     ` <1445636654.22974.193.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-10-25  5:56       ` Bendik Rønning Opstad
2015-10-23 20:50 ` [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad
2015-10-26 14:50   ` Neal Cardwell
2015-10-26 21:35     ` Andreas Petlund
2015-10-26 21:58       ` Yuchung Cheng
2015-10-27 19:15         ` Jonas Markussen
2015-10-29 22:53         ` Bendik Rønning Opstad
2015-11-02  9:18           ` David Laight
2015-11-02  9:37   ` David Laight
2015-11-05  2:06     ` Bendik Rønning Opstad
2015-10-24  6:11 ` [PATCH RFC net-next 0/2] tcp: " Yuchung Cheng
2015-10-24  8:00   ` Jonas Markussen
     [not found]     ` <61F74109-9FDC-485A-978B-714B7AA27445-6miFZF/5cTBuMpJDpNschA@public.gmane.org>
2015-10-24 12:57       ` Eric Dumazet
2015-11-09 19:40         ` Bendik Rønning Opstad
2015-11-23 16:26 ` [PATCH RFC v2 " Bendik Rønning Opstad
2015-11-23 16:26 ` [PATCH RFC v2 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2015-11-23 16:26 ` [PATCH RFC v2 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad
2015-11-23 17:43   ` Eric Dumazet
2015-11-23 20:05     ` Bendik Rønning Opstad
2016-02-02 19:23 ` [PATCH v3 net-next 0/2] tcp: " Bendik Rønning Opstad
2016-02-02 19:23 ` Bendik Rønning Opstad [this message]
2016-02-02 19:23 ` [PATCH v3 net-next 2/2] tcp: Add " Bendik Rønning Opstad
2016-02-02 20:35   ` Eric Dumazet
2016-02-03 18:17     ` Bendik Rønning Opstad
2016-02-03 19:34       ` Eric Dumazet
     [not found]         ` <CAF8eE=VOuoNLQHtkRwM9ZG+vJ-uH2ufVW5y_pS24rGqWh4Qa2g@mail.gmail.com>
2016-02-08 17:30           ` Bendik Rønning Opstad
2016-02-08 17:38         ` Bendik Rønning Opstad
2016-02-16 13:51 ` [PATCH v4 net-next 0/2] tcp: " Bendik Rønning Opstad
2016-02-16 13:51 ` [PATCH v4 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2016-02-16 13:51 ` [PATCH v4 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad
2016-02-18 15:18   ` Eric Dumazet
2016-02-19 14:12     ` Bendik Rønning Opstad
2016-02-24 21:12 ` [PATCH v5 net-next 0/2] tcp: " Bendik Rønning Opstad
2016-02-24 21:12 ` [PATCH v5 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2016-02-24 21:12 ` [PATCH v5 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad
2016-03-02 19:52   ` David Miller
2016-03-02 22:33     ` Bendik Rønning Opstad
2016-03-03 18:06 ` [PATCH v6 net-next 0/2] tcp: " Bendik Rønning Opstad
2016-03-07 19:36   ` David Miller
2016-03-10  0:20   ` Yuchung Cheng
2016-03-10  1:45     ` Jonas Markussen
2016-03-10  2:27       ` Yuchung Cheng
2016-03-12  9:23         ` Jonas Markussen
2016-03-13 23:18     ` Bendik Rønning Opstad
2016-03-14 21:59       ` Yuchung Cheng
2016-03-18 14:25         ` Bendik Rønning Opstad
2016-03-03 18:06 ` [PATCH v6 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2016-03-03 18:06 ` [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad
2016-03-14 21:15   ` Eric Dumazet
2016-03-15  1:04     ` Rick Jones
2016-03-15 18:09       ` Yuchung Cheng
2016-03-18 17:58     ` Bendik Rønning Opstad
2016-03-14 21:54   ` Yuchung Cheng
2016-03-15  0:40     ` Bill Fink
2016-03-17 23:26     ` Bendik Rønning Opstad
2016-03-21 18:54       ` Yuchung Cheng
2016-06-16 17:12         ` Bendik Rønning Opstad
2016-06-22 14:56 ` [PATCH v7 net-next 0/2] tcp: " Bendik Rønning Opstad
2016-06-22 14:56 ` [PATCH v7 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism Bendik Rønning Opstad
2016-06-22 14:56 ` [PATCH v7 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) Bendik Rønning Opstad

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1454440995-18457-2-git-send-email-bro.devel+kernel@gmail.com \
    --to=bro.devel@gmail.com \
    --cc=apetlund@simula.no \
    --cc=bro.devel+kernel@gmail.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=griff@simula.no \
    --cc=jonassm@ifi.uio.no \
    --cc=kennetkl@ifi.uio.no \
    --cc=kristian.evensen@gmail.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=paalh@simula.no \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).