Netdev Archive on lore.kernel.org
* [PATCH v3 1/2] tcp: Add TCP_INFO counter for packets received out-of-order
@ 2019-09-11 22:31 Thomas Higdon
  2019-09-11 22:31 ` [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO Thomas Higdon
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Higdon @ 2019-09-11 22:31 UTC (permalink / raw)
  To: netdev; +Cc: Jonathan Lemon, Dave Jones, Eric Dumazet, Neal Cardwell

For receive-heavy cases on the server-side, we want to track the
connection quality for individual client IPs. This counter, similar to
the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
tracks out-of-order packet reception. Providing this counter in
TCP_INFO makes it possible to see to what degree receive-heavy
sockets are experiencing out-of-order delivery and packet drops
indicating congestion.

Please note that this is similar to the counter in NetBSD TCP_INFO, and
has the same name.

Signed-off-by: Thomas Higdon <tph@fb.com>
---
 include/linux/tcp.h      | 2 ++
 include/uapi/linux/tcp.h | 2 ++
 net/ipv4/tcp.c           | 2 ++
 net/ipv4/tcp_input.c     | 1 +
 4 files changed, 7 insertions(+)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f3a85a7fb4b1..a01dc78218f1 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -393,6 +393,8 @@ struct tcp_sock {
 	 */
 	struct request_sock *fastopen_rsk;
 	u32	*saved_syn;
+
+	u32 rcv_ooopack; /* Received out-of-order packets, for tcpinfo */
 };
 
 enum tsq_enum {
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index b3564f85a762..20237987ccc8 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -270,6 +270,8 @@ struct tcp_info {
 	__u64	tcpi_bytes_retrans;  /* RFC4898 tcpEStatsPerfOctetsRetrans */
 	__u32	tcpi_dsack_dups;     /* RFC4898 tcpEStatsStackDSACKDups */
 	__u32	tcpi_reord_seen;     /* reordering events seen */
+
+	__u32	tcpi_rcv_ooopack;    /* Out-of-order packets received */
 };
 
 /* netlink attributes types for SCM_TIMESTAMPING_OPT_STATS */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 94df48bcecc2..4cf58208270e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2653,6 +2653,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tp->rx_opt.saw_tstamp = 0;
 	tp->rx_opt.dsack = 0;
 	tp->rx_opt.num_sacks = 0;
+	tp->rcv_ooopack = 0;
 
 
 	/* Clean up fastopen related fields */
@@ -3295,6 +3296,7 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
 	info->tcpi_bytes_retrans = tp->bytes_retrans;
 	info->tcpi_dsack_dups = tp->dsack_dups;
 	info->tcpi_reord_seen = tp->reord_seen;
+	info->tcpi_rcv_ooopack = tp->rcv_ooopack;
 	unlock_sock_fast(sk, slow);
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 706cbb3b2986..2ef333354026 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4555,6 +4555,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 	tp->pred_flags = 0;
 	inet_csk_schedule_ack(sk);
 
+	tp->rcv_ooopack += max_t(u16, 1, skb_shinfo(skb)->gso_segs);
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOQUEUE);
 	seq = TCP_SKB_CB(skb)->seq;
 	end_seq = TCP_SKB_CB(skb)->end_seq;
-- 
2.17.1
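
As a rough userspace illustration of the accounting in the
tcp_data_queue_ofo() hunk above: the counter advances by the number of
segments the skb stands for, and never by less than one. The helper
names ooo_pkts_for_skb() and account_ooo() and the plain-integer types
are assumptions made for this sketch; the patch itself uses
max_t(u16, 1, skb_shinfo(skb)->gso_segs) directly.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for max_t(u16, 1, gso_segs):
 * an skb whose gso_segs is 0 or 1 counts as exactly one packet,
 * an aggregated skb counts as gso_segs packets.
 */
static uint16_t ooo_pkts_for_skb(uint16_t gso_segs)
{
	return gso_segs > 1 ? gso_segs : 1;
}

/* Hypothetical stand-in for "tp->rcv_ooopack += ...".
 * The counter is a u32, so it wraps around on overflow.
 */
static uint32_t account_ooo(uint32_t rcv_ooopack, uint16_t gso_segs)
{
	return rcv_ooopack + ooo_pkts_for_skb(gso_segs);
}
```

The clamp matters because the per-socket counter is meant to count
packets, not skbs: one aggregated skb landing in the out-of-order queue
can represent many wire packets.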



* [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO
  2019-09-11 22:31 [PATCH v3 1/2] tcp: Add TCP_INFO counter for packets received out-of-order Thomas Higdon
@ 2019-09-11 22:31 ` Thomas Higdon
  2019-09-12  0:49   ` Neal Cardwell
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Higdon @ 2019-09-11 22:31 UTC (permalink / raw)
  To: netdev; +Cc: Jonathan Lemon, Dave Jones, Eric Dumazet, Neal Cardwell

Neal Cardwell mentioned that rcv_wnd would be useful for helping
diagnose whether a flow is receive-window-limited at a given instant.

This serves the purpose of adding an additional __u32 to avoid the
would-be hole caused by the addition of the tcpi_rcv_ooopack field.

Signed-off-by: Thomas Higdon <tph@fb.com>
---
 include/uapi/linux/tcp.h | 1 +
 net/ipv4/tcp.c           | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 20237987ccc8..8a0d1d1af622 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -272,6 +272,7 @@ struct tcp_info {
 	__u32	tcpi_reord_seen;     /* reordering events seen */
 
 	__u32	tcpi_rcv_ooopack;    /* Out-of-order packets received */
+	__u32	tcpi_rcv_wnd;        /* Receive window size */
 };
 
 /* netlink attributes types for SCM_TIMESTAMPING_OPT_STATS */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4cf58208270e..c980145c4247 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3297,6 +3297,7 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
 	info->tcpi_dsack_dups = tp->dsack_dups;
 	info->tcpi_reord_seen = tp->reord_seen;
 	info->tcpi_rcv_ooopack = tp->rcv_ooopack;
+	info->tcpi_rcv_wnd = tp->rcv_wnd;
 	unlock_sock_fast(sk, slow);
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
-- 
2.17.1
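
The "would-be hole" mentioned in the commit message can be shown with a
small standalone sketch. The two toy structs below are not the real
struct tcp_info; they only copy the types of its last few members, and
the TAIL_PAD() macro is a helper invented for this illustration. The
sketch assumes an ABI where a 64-bit integer is 8-byte aligned (for
example x86-64); on ABIs with 4-byte alignment for 64-bit integers no
padding appears.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy struct ending in a single new 32-bit member. Because the struct
 * contains a 64-bit member, its size is rounded up to a multiple of 8
 * where alignof(uint64_t) == 8, leaving 4 bytes of tail padding.
 */
struct tail_one_u32 {
	uint64_t bytes_retrans;
	uint32_t dsack_dups;
	uint32_t reord_seen;
	uint32_t rcv_ooopack;
};

/* Same struct with a second 32-bit member filling the padding. */
struct tail_two_u32 {
	uint64_t bytes_retrans;
	uint32_t dsack_dups;
	uint32_t reord_seen;
	uint32_t rcv_ooopack;
	uint32_t rcv_wnd;
};

/* Bytes of padding after the last member of a struct type. */
#define TAIL_PAD(type, last) \
	(sizeof(type) - offsetof(type, last) - sizeof(((type *)0)->last))
```

Under the stated alignment assumption both structs have the same size,
which is why new __u32 fields tend to be added to tcp_info in pairs.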



* Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO
  2019-09-11 22:31 ` [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO Thomas Higdon
@ 2019-09-12  0:49   ` Neal Cardwell
  2019-09-12  9:14     ` Dave Taht
  0 siblings, 1 reply; 6+ messages in thread
From: Neal Cardwell @ 2019-09-12  0:49 UTC (permalink / raw)
  To: Thomas Higdon
  Cc: netdev, Jonathan Lemon, Dave Jones, Eric Dumazet, Yuchung Cheng,
	Soheil Hassas Yeganeh

On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon <tph@fb.com> wrote:
>
> Neal Cardwell mentioned that rcv_wnd would be useful for helping
> diagnose whether a flow is receive-window-limited at a given instant.
>
> This serves the purpose of adding an additional __u32 to avoid the
> would-be hole caused by the addition of the tcpi_rcv_ooopack field.
>
> Signed-off-by: Thomas Higdon <tph@fb.com>
> ---

Thanks, Thomas.

I know that when I mentioned this before I mentioned the idea of both
tp->snd_wnd (send-side receive window) and tp->rcv_wnd (receive-side
receive window) in tcp_info, and did not express a preference between
the two. Now that we are faced with a decision between the two,
personally I think it would be a little more useful to start with
tp->snd_wnd. :-)

Two main reasons:

(1) Usually when we're diagnosing TCP performance problems, we do so
from the sender, since the sender makes most of the
performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
From the sender-side the thing that would be most useful is to see
tp->snd_wnd, the receive window that the receiver has advertised to
the sender.

(2) From the receiver side, "ss" can already show a fair amount of
info about receive-side buffer/window limits, like:
info->tcpi_rcv_ssthresh, info->tcpi_rcv_space,
skmeminfo[SK_MEMINFO_RMEM_ALLOC], skmeminfo[SK_MEMINFO_RCVBUF]. Often
the rwin can be approximated by combining those.

Hopefully Eric, Yuchung, and Soheil can weigh in on the question of
snd_wnd vs rcv_wnd. Or we can perhaps think of another field, and add
the tcpi_rcv_ooopack, snd_wnd, rcv_wnd, and that final field, all
together.

thanks,
neal
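
As a sketch of point (2), the receive-side values that already exist
can be collected from userspace roughly as follows. The struct
rcv_limits and the function read_rcv_limits() are names made up for
this example. SO_RCVBUF is used here on the assumption that it reports
the same receive buffer size that "ss" shows as
skmeminfo[SK_MEMINFO_RCVBUF]. How to combine the values into an
approximate rwin is left open, as in the message above.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_INFO, struct tcp_info */

/* Hypothetical container for the existing receive-side values. */
struct rcv_limits {
	unsigned int rcv_ssthresh;  /* info->tcpi_rcv_ssthresh */
	unsigned int rcv_space;     /* info->tcpi_rcv_space */
	int rcvbuf;                 /* SO_RCVBUF */
};

/* Returns 0 on success, -1 if either getsockopt() call fails. */
static int read_rcv_limits(int fd, struct rcv_limits *out)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);
	socklen_t optlen = sizeof(out->rcvbuf);

	memset(&info, 0, sizeof(info));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &out->rcvbuf, &optlen) != 0)
		return -1;

	out->rcv_ssthresh = info.tcpi_rcv_ssthresh;
	out->rcv_space = info.tcpi_rcv_space;
	return 0;
}
```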


* Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO
  2019-09-12  0:49   ` Neal Cardwell
@ 2019-09-12  9:14     ` Dave Taht
  2019-09-13 14:29       ` Thomas Higdon
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Taht @ 2019-09-12  9:14 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Thomas Higdon, netdev, Jonathan Lemon, Dave Jones, Eric Dumazet,
	Yuchung Cheng, Soheil Hassas Yeganeh

On Thu, Sep 12, 2019 at 1:59 AM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon <tph@fb.com> wrote:
> >
> > Neal Cardwell mentioned that rcv_wnd would be useful for helping
> > diagnose whether a flow is receive-window-limited at a given instant.
> >
> > This serves the purpose of adding an additional __u32 to avoid the
> > would-be hole caused by the addition of the tcpi_rcv_ooopack field.
> >
> > Signed-off-by: Thomas Higdon <tph@fb.com>
> > ---
>
> Thanks, Thomas.
>
> I know that when I mentioned this before I mentioned the idea of both
> tp->snd_wnd (send-side receive window) and tp->rcv_wnd (receive-side
> receive window) in tcp_info, and did not express a preference between
> the two. Now that we are faced with a decision between the two,
> personally I think it would be a little more useful to start with
> tp->snd_wnd. :-)
>
> Two main reasons:
>
> (1) Usually when we're diagnosing TCP performance problems, we do so
> from the sender, since the sender makes most of the
> performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
> From the sender-side the thing that would be most useful is to see
> tp->snd_wnd, the receive window that the receiver has advertised to
> the sender.

I am under the impression that, particularly in the mobile space,
network behavior is often governed by rcv_wnd. At least, there have
been so many papers on this that I'd tended to assume so.

Given a desire to do both vars, is there a *third* u32 we could add to
fill in the next hole? :)
ecn marks?

>
> (2) From the receiver side, "ss" can already show a fair amount of
> info about receive-side buffer/window limits, like:
> info->tcpi_rcv_ssthresh, info->tcpi_rcv_space,
> skmeminfo[SK_MEMINFO_RMEM_ALLOC], skmeminfo[SK_MEMINFO_RCVBUF]. Often
> the rwin can be approximated by combining those.
>
> Hopefully Eric, Yuchung, and Soheil can weigh in on the question of
> snd_wnd vs rcv_wnd. Or we can perhaps think of another field, and add
> the tcpi_rcv_ooopack, snd_wnd, rcv_wnd, and that final field, all
> together.
>
> thanks,
> neal



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO
  2019-09-12  9:14     ` Dave Taht
@ 2019-09-13 14:29       ` Thomas Higdon
  2019-09-13 14:37         ` Neal Cardwell
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Higdon @ 2019-09-13 14:29 UTC (permalink / raw)
  To: Dave Taht
  Cc: Neal Cardwell, netdev, Jonathan Lemon, Dave Jones, Eric Dumazet,
	Yuchung Cheng, Soheil Hassas Yeganeh

On Thu, Sep 12, 2019 at 10:14:33AM +0100, Dave Taht wrote:
> On Thu, Sep 12, 2019 at 1:59 AM Neal Cardwell <ncardwell@google.com> wrote:
> >
> > On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon <tph@fb.com> wrote:
> > >
> > > Neal Cardwell mentioned that rcv_wnd would be useful for helping
> > > diagnose whether a flow is receive-window-limited at a given instant.
> > >
> > > This serves the purpose of adding an additional __u32 to avoid the
> > > would-be hole caused by the addition of the tcpi_rcv_ooopack field.
> > >
> > > Signed-off-by: Thomas Higdon <tph@fb.com>
> > > ---
> >
> > Thanks, Thomas.
> >
> > I know that when I mentioned this before I mentioned the idea of both
> > tp->snd_wnd (send-side receive window) and tp->rcv_wnd (receive-side
> > receive window) in tcp_info, and did not express a preference between
> > the two. Now that we are faced with a decision between the two,
> > personally I think it would be a little more useful to start with
> > tp->snd_wnd. :-)
> >
> > Two main reasons:
> >
> > (1) Usually when we're diagnosing TCP performance problems, we do so
> > from the sender, since the sender makes most of the
> > performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
> > From the sender-side the thing that would be most useful is to see
> > tp->snd_wnd, the receive window that the receiver has advertised to
> > the sender.
> 
> I am under the impression that, particularly in the mobile space,
> network behavior is often governed by rcv_wnd. At least, there have
> been so many papers on this that I'd tended to assume so.
> 
> Given a desire to do both vars, is there a *third* u32 we could add to
> fill in the next hole? :)
> ecn marks?

Neal makes some good points -- there is a fair amount of existing
information for deriving receive window. It seems like snd_wnd would be
more valuable at this moment. For the purpose of pairing up these __u32s
to get something we can commit, I propose that we go with
the rcv_ooopack/snd_wnd pair for now, and when something comes up later,
one might consider pairing up rcv_wnd.


* Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO
  2019-09-13 14:29       ` Thomas Higdon
@ 2019-09-13 14:37         ` Neal Cardwell
  0 siblings, 0 replies; 6+ messages in thread
From: Neal Cardwell @ 2019-09-13 14:37 UTC (permalink / raw)
  To: Thomas Higdon
  Cc: Dave Taht, netdev, Jonathan Lemon, Dave Jones, Eric Dumazet,
	Yuchung Cheng, Soheil Hassas Yeganeh

On Fri, Sep 13, 2019 at 10:29 AM Thomas Higdon <tph@fb.com> wrote:
>
> On Thu, Sep 12, 2019 at 10:14:33AM +0100, Dave Taht wrote:
> > On Thu, Sep 12, 2019 at 1:59 AM Neal Cardwell <ncardwell@google.com> wrote:
> > >
> > > On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon <tph@fb.com> wrote:
> > > >
> > > > Neal Cardwell mentioned that rcv_wnd would be useful for helping
> > > > diagnose whether a flow is receive-window-limited at a given instant.
> > > >
> > > > This serves the purpose of adding an additional __u32 to avoid the
> > > > would-be hole caused by the addition of the tcpi_rcv_ooopack field.
> > > >
> > > > Signed-off-by: Thomas Higdon <tph@fb.com>
> > > > ---
> > >
> > > Thanks, Thomas.
> > >
> > > I know that when I mentioned this before I mentioned the idea of both
> > > tp->snd_wnd (send-side receive window) and tp->rcv_wnd (receive-side
> > > receive window) in tcp_info, and did not express a preference between
> > > the two. Now that we are faced with a decision between the two,
> > > personally I think it would be a little more useful to start with
> > > tp->snd_wnd. :-)
> > >
> > > Two main reasons:
> > >
> > > (1) Usually when we're diagnosing TCP performance problems, we do so
> > > from the sender, since the sender makes most of the
> > > performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
> > > From the sender-side the thing that would be most useful is to see
> > > tp->snd_wnd, the receive window that the receiver has advertised to
> > > the sender.
> >
> > I am under the impression that, particularly in the mobile space,
> > network behavior is often governed by rcv_wnd. At least, there have
> > been so many papers on this that I'd tended to assume so.
> >
> > Given a desire to do both vars, is there a *third* u32 we could add to
> > fill in the next hole? :)
> > ecn marks?
>
> Neal makes some good points -- there is a fair amount of existing
> information for deriving receive window. It seems like snd_wnd would be
> more valuable at this moment. For the purpose of pairing up these __u32s
> to get something we can commit, I propose that we go with
> the rcv_ooopack/snd_wnd pair for now, and when something comes up later,
> one might consider pairing up rcv_wnd.

FWIW that sounds like a great plan to me. Thanks, Thomas!

neal
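
For reference, a minimal sketch of how a userspace program might read
the tcpi_rcv_ooopack counter from patch 1/2. It assumes uapi headers
that already carry the field as proposed there; the helper names
tcp_info_has_field() and read_rcv_ooopack() are invented for this
example. Since getsockopt(TCP_INFO) reports how many bytes the kernel
actually filled in, the returned length is checked before the field is
trusted, so the code also behaves sensibly on a kernel without the
patch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <linux/tcp.h>    /* TCP_INFO, struct tcp_info (uapi version) */

/* True if a kernel that filled in `len` bytes also filled in the
 * member occupying [offset, offset + size).
 */
static int tcp_info_has_field(socklen_t len, size_t offset, size_t size)
{
	return (size_t)len >= offset + size;
}

/* Returns 0 and stores the counter in *out, or -1 if getsockopt()
 * fails or the running kernel's tcp_info is too short for the field.
 */
static int read_rcv_ooopack(int fd, unsigned int *out)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	memset(&info, 0, sizeof(info));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
		return -1;
	if (!tcp_info_has_field(len,
				offsetof(struct tcp_info, tcpi_rcv_ooopack),
				sizeof(info.tcpi_rcv_ooopack)))
		return -1;

	*out = info.tcpi_rcv_ooopack;
	return 0;
}
```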
