All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
@ 2017-03-13 15:59 Josh Hunt
  2017-03-13 16:12 ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Josh Hunt @ 2017-03-13 15:59 UTC (permalink / raw)
  To: davem, arnd
  Cc: edumazet, soheil, willemb, pabeni, linux-arch, netdev, Josh Hunt

Allows application to read the amount of data sitting in the receive
queue.

Signed-off-by: Josh Hunt <johunt@akamai.com>
---

A team here is looking for a way to get the amount of data in a UDP socket's
receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
returns the size of the next pending datagram. I implemented the patch below,
but am wondering if this is the right place for this change? I was debating
between this or a new UDP ioctl.

 include/net/sock.h                | 7 ++++++-
 include/uapi/asm-generic/socket.h | 2 ++
 net/core/sock.c                   | 4 ++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 5e59976..bcfed2ae 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -854,6 +854,11 @@ static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 	skb->next = NULL;
 }
 
+static unsigned int sk_rcvqueue_size(const struct sock *sk)
+{
+	return sk->sk_backlog.len + atomic_read(&sk->sk_rmem_alloc);
+}
+
 /*
  * Take into account size of receive queue and backlog queue
  * Do not take into account this skb truesize,
@@ -861,7 +866,7 @@ static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
  */
 static inline bool sk_rcvqueues_full(const struct sock *sk, unsigned int limit)
 {
-	unsigned int qsize = sk->sk_backlog.len + atomic_read(&sk->sk_rmem_alloc);
+	unsigned int qsize = sk_rcvqueue_size(sk);
 
 	return qsize > limit;
 }
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 2c748dd..36b8cd5 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -94,4 +94,6 @@
 
 #define SCM_TIMESTAMPING_OPT_STATS	54
 
+#define SO_RCVQUEUE_SIZE	55
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index f6fd79f..fa69864 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1269,6 +1269,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sk->sk_incoming_cpu;
 		break;
 
+	case SO_RCVQUEUE_SIZE:
+		v.val = sk_rcvqueue_size(sk);
+		break;
+
 	default:
 		/* We implement the SO_SNDLOWAT etc to not be settable
 		 * (1003.1g 7).
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 15:59 [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt Josh Hunt
@ 2017-03-13 16:12 ` Eric Dumazet
  2017-03-13 17:38   ` Josh Hunt
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2017-03-13 16:12 UTC (permalink / raw)
  To: Josh Hunt
  Cc: David Miller, Arnd Bergmann, Soheil Hassas Yeganeh,
	Willem de Bruijn, Paolo Abeni, linux-arch, netdev

On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
> Allows application to read the amount of data sitting in the receive
> queue.
>
> Signed-off-by: Josh Hunt <johunt@akamai.com>
> ---
>
> A team here is looking for a way to get the amount of data in a UDP socket's
> receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
> returns the size of the next pending datagram. I implemented the patch below,
> but am wondering if this is the right place for this change? I was debating
> between this or a new UDP ioctl.

But what is the 'amount of data' exactly ?
Number of packets, amount of bytes to read from these packets ?

You chose to report kernel memory usage, which is not guaranteed to be
the same among kernels versions (or kernel configs)

If we export these internals, I would export the whole thing, like we
did with netlink

ie tweak sock_diag_put_meminfo() and export the SK_MEMINFO_VARS

So that we avoid adding other options in the future.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 16:12 ` Eric Dumazet
@ 2017-03-13 17:38   ` Josh Hunt
  2017-03-13 19:39     ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Josh Hunt @ 2017-03-13 17:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Arnd Bergmann, Soheil Hassas Yeganeh,
	Willem de Bruijn, Paolo Abeni, linux-arch, netdev

On 03/13/2017 11:12 AM, Eric Dumazet wrote:
> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>> Allows application to read the amount of data sitting in the receive
>> queue.
>>
>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>> ---
>>
>> A team here is looking for a way to get the amount of data in a UDP socket's
>> receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
>> returns the size of the next pending datagram. I implemented the patch below,
>> but am wondering if this is the right place for this change? I was debating
>> between this or a new UDP ioctl.
>
> But what is the 'amount of data' exactly ?
> Number of packets, amount of bytes to read from these packets ?

I meant bytes. I will clarify in the next version.

>
> You chose to report kernel memory usage, which is not guaranteed to be
> the same among kernels versions (or kernel configs)
>
> If we export these internals, I would export the whole thing, like we
> did with netlink
>
> ie tweak sock_diag_put_meminfo() and export the SK_MEMINFO_VARS
>
> So that we avoid adding other options in the future.
>

OK, so you're suggesting we provide all of the SK_MEMINFO_VARS data via 
socket option then? That sounds good to me. I will send that in a v2.

Josh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 17:38   ` Josh Hunt
@ 2017-03-13 19:39     ` David Miller
  2017-03-13 23:34       ` Josh Hunt
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2017-03-13 19:39 UTC (permalink / raw)
  To: johunt; +Cc: edumazet, arnd, soheil, willemb, pabeni, linux-arch, netdev

From: Josh Hunt <johunt@akamai.com>
Date: Mon, 13 Mar 2017 12:38:39 -0500

> On 03/13/2017 11:12 AM, Eric Dumazet wrote:
>> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>>> Allows application to read the amount of data sitting in the receive
>>> queue.
>>>
>>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>>> ---
>>>
>>> A team here is looking for a way to get the amount of data in a UDP
>>> socket's
>>> receive queue. It seems like this should be SIOCINQ, but for UDP
>>> sockets that
>>> returns the size of the next pending datagram. I implemented the patch
>>> below,
>>> but am wondering if this is the right place for this change? I was
>>> debating
>>> between this or a new UDP ioctl.
>>
>> But what is the 'amount of data' exactly ?
>> Number of packets, amount of bytes to read from these packets ?
> 
> I meant bytes. I will clarify in the next version.

As Eric is hinting, the calculation you are using doesn't represent
this.

You need to do something like walk the receive queue and add the
skb->len values together.

sk->sk_rmem_alloc is usually much larger than the sum of the skb->len
values in the socket receive queue.  I don't see how this culmination
of skb->truesize values is useful, whereas I can see how an application
could want the summation of the skb->len values.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 19:39     ` David Miller
@ 2017-03-13 23:34       ` Josh Hunt
  2017-03-14  0:10         ` David Miller
  2017-03-14  1:31         ` Eric Dumazet
  0 siblings, 2 replies; 8+ messages in thread
From: Josh Hunt @ 2017-03-13 23:34 UTC (permalink / raw)
  To: David Miller; +Cc: edumazet, arnd, soheil, willemb, pabeni, linux-arch, netdev

On 03/13/2017 02:39 PM, David Miller wrote:
> From: Josh Hunt <johunt@akamai.com>
> Date: Mon, 13 Mar 2017 12:38:39 -0500
>
>> On 03/13/2017 11:12 AM, Eric Dumazet wrote:
>>> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>>>> Allows application to read the amount of data sitting in the receive
>>>> queue.
>>>>
>>>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>>>> ---
>>>>
>>>> A team here is looking for a way to get the amount of data in a UDP
>>>> socket's
>>>> receive queue. It seems like this should be SIOCINQ, but for UDP
>>>> sockets that
>>>> returns the size of the next pending datagram. I implemented the patch
>>>> below,
>>>> but am wondering if this is the right place for this change? I was
>>>> debating
>>>> between this or a new UDP ioctl.
>>>
>>> But what is the 'amount of data' exactly ?
>>> Number of packets, amount of bytes to read from these packets ?
>>
>> I meant bytes. I will clarify in the next version.
>
> As Eric is hinting, the calculation you are using doesn't represent
> this.
>
> You need to do something like walk the receive queue and add the
> skb->len values together.
>
> sk->sk_rmem_alloc is usually much larger than the sum of the skb->len
> values in the socket receive queue.  I don't see how this culmination
> of skb->truesize values is useful, whereas I can see how an application
> could want the summation of the skb->len values.
>

In this particular case they really do want to know total # of bytes in 
the receive queue, not the data bytes they can consume from an 
application pov. The kernel currently only exposes this value through 
netlink or /proc/net/udp from what I saw.

I believe Eric's suggestion in his previous mail was to export all of 
these meminfo metrics via a single socket option call similar to how its 
done in netlink. We could then use that for both call sites.

I agree that it would be useful to also have the data you and Eric are 
suggesting exposed somewhere, the total # of skb->len bytes sitting in 
the receive queue. I could add that as a second socket option.

Does this sound reasonable?

Josh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 23:34       ` Josh Hunt
@ 2017-03-14  0:10         ` David Miller
  2017-03-14 22:11           ` Josh Hunt
  2017-03-14  1:31         ` Eric Dumazet
  1 sibling, 1 reply; 8+ messages in thread
From: David Miller @ 2017-03-14  0:10 UTC (permalink / raw)
  To: johunt; +Cc: edumazet, arnd, soheil, willemb, pabeni, linux-arch, netdev

From: Josh Hunt <johunt@akamai.com>
Date: Mon, 13 Mar 2017 18:34:41 -0500

> In this particular case they really do want to know total # of bytes
> in the receive queue, not the data bytes they can consume from an
> application pov. The kernel currently only exposes this value through
> netlink or /proc/net/udp from what I saw.

Can you explain in what way this is useful?

The difference between skb->len and skb->truesize is really kernel
internal implementation detail, and I'm trying to figure out why
this would be useful to an application.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-13 23:34       ` Josh Hunt
  2017-03-14  0:10         ` David Miller
@ 2017-03-14  1:31         ` Eric Dumazet
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2017-03-14  1:31 UTC (permalink / raw)
  To: Josh Hunt
  Cc: David Miller, edumazet, arnd, soheil, willemb, pabeni,
	linux-arch, netdev

On Mon, 2017-03-13 at 18:34 -0500, Josh Hunt wrote:

> In this particular case they really do want to know total # of bytes in 
> the receive queue, not the data bytes they can consume from an 
> application pov. The kernel currently only exposes this value through 
> netlink or /proc/net/udp from what I saw.
> 
> I believe Eric's suggestion in his previous mail was to export all of 
> these meminfo metrics via a single socket option call similar to how its 
> done in netlink. We could then use that for both call sites.
> 
> I agree that it would be useful to also have the data you and Eric are 
> suggesting exposed somewhere, the total # of skb->len bytes sitting in 
> the receive queue. I could add that as a second socket option.

Please note that UDP stack does not maintain a per socket sum(skb->len)

Implementing this in a system call would require to lock the receive
queue (blocking BH) and iterating over a potential huge skb list.

Or add a new socket field and add/sub every skb->len of packets
added/removed to/from receive queue.

So I would prefer to not provide this information, this looks quite a
bloat.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt
  2017-03-14  0:10         ` David Miller
@ 2017-03-14 22:11           ` Josh Hunt
  0 siblings, 0 replies; 8+ messages in thread
From: Josh Hunt @ 2017-03-14 22:11 UTC (permalink / raw)
  To: David Miller; +Cc: edumazet, arnd, soheil, willemb, pabeni, linux-arch, netdev

On 03/13/2017 07:10 PM, David Miller wrote:
> From: Josh Hunt <johunt@akamai.com>
> Date: Mon, 13 Mar 2017 18:34:41 -0500
>
>> In this particular case they really do want to know total # of bytes
>> in the receive queue, not the data bytes they can consume from an
>> application pov. The kernel currently only exposes this value through
>> netlink or /proc/net/udp from what I saw.
>
> Can you explain in what way this is useful?
>
> The difference between skb->len and skb->truesize is really kernel
> internal implementation detail, and I'm trying to figure out why
> this would be useful to an application.
>

First, it looks like my original patch was against an old kernel which 
did not have the updated udp accounting code. Not sure how that 
happened. Apologies for that. There's no need to add in the backlog, at 
least for udp now, sk_rmem_alloc is all that is needed for my case.

The application here is interested in monitoring the amount of data in 
the receive buffer. Looking for and identifying overflows, and also 
understanding how full it is. I know we already have SO_RXQ_OVFL, but 
this only shows the # of drops on overflow.

We expose this (skmem) information via /proc and netlink today. It seems 
like unnecessary overhead to require an application to also create a 
netlink socket to get this data.

Creating a socket option to mimic the behavior of 
sock_diag_put_meminfo() and export all meminfo_vars would be great if 
that's something you'd accept.

Josh

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-03-14 22:11 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-13 15:59 [RFC PATCH] sock: add SO_RCVQUEUE_SIZE getsockopt Josh Hunt
2017-03-13 16:12 ` Eric Dumazet
2017-03-13 17:38   ` Josh Hunt
2017-03-13 19:39     ` David Miller
2017-03-13 23:34       ` Josh Hunt
2017-03-14  0:10         ` David Miller
2017-03-14 22:11           ` Josh Hunt
2017-03-14  1:31         ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.