* iproute uses too small of a receive buffer
@ 2009-10-27 23:16 Ben Greear
  2009-10-27 23:24 ` Stephen Hemminger
  0 siblings, 1 reply; 17+ messages in thread
From: Ben Greear @ 2009-10-27 23:16 UTC (permalink / raw)
  To: NetDev

[-- Attachment #1: Type: text/plain, Size: 1065 bytes --]

I have a very busy system with a bunch of xorp router processes (mis)configured.

This thing is rapidly making route changes for whatever reason.

The 'ip monitor route' command was failing:

[root@i7-dqc-1 ]# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated


It is only using a 32k rcv buffer, and it seems the OS was
overdriving it.

Please consider making the rcv buffer larger, perhaps something
like this (inline is white-space damaged...attachment should apply
if deemed useful.):

Signed-off-by:  Ben Greear <greearb@candelatech.com>

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..95a7d1d 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
  {
         socklen_t addr_len;
         int sndbuf = 32768;
-       int rcvbuf = 32768;
+       int rcvbuf = 3276800;

         memset(rth, 0, sizeof(*rth));


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


[-- Attachment #2: iputils.patch --]
[-- Type: text/plain, Size: 343 bytes --]

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..95a7d1d 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;
+	int rcvbuf = 3276800;
 
 	memset(rth, 0, sizeof(*rth));
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-27 23:16 iproute uses too small of a receive buffer Ben Greear
@ 2009-10-27 23:24 ` Stephen Hemminger
  2009-10-27 23:30   ` Ben Greear
  2009-10-28  7:52   ` Eric Dumazet
  0 siblings, 2 replies; 17+ messages in thread
From: Stephen Hemminger @ 2009-10-27 23:24 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev

On Tue, 27 Oct 2009 16:16:52 -0700
Ben Greear <greearb@candelatech.com> wrote:

> I have a very busy system with a bunch of xorp router processes (mis)configured.
> 
> This thing is rapidly making route changes for whatever reason.
> 
> The 'ip monitor route' command was failing:
> 
> [root@i7-dqc-1 ]# ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated
> 
> 
> It is only using a 32k rcv buffer, and it seems the OS was
> overdriving it.
> 
> Please consider making the rcv buffer larger, perhaps something
> like this (inline is white-space damaged...attachment should apply
> if deemed useful.):
> 
> Signed-off-by:  Ben Greear <greearb@candelatech.com>
> 
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index b68e2fd..95a7d1d 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>   {
>          socklen_t addr_len;
>          int sndbuf = 32768;
> -       int rcvbuf = 32768;
> +       int rcvbuf = 3276800;
> 
>          memset(rth, 0, sizeof(*rth));
> 
> 
> Thanks,
> Ben
> 

Just having a larger buffer isn't a guarantee of success. Allocating
a huge buffer is not going to work on embedded.

Why not have it continue after one error?

-- 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-27 23:24 ` Stephen Hemminger
@ 2009-10-27 23:30   ` Ben Greear
  2009-10-28  7:01     ` Eric Dumazet
  2009-10-28  7:52   ` Eric Dumazet
  1 sibling, 1 reply; 17+ messages in thread
From: Ben Greear @ 2009-10-27 23:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: NetDev

On 10/27/2009 04:24 PM, Stephen Hemminger wrote:
> On Tue, 27 Oct 2009 16:16:52 -0700
> Ben Greear<greearb@candelatech.com>  wrote:
>
>> I have a very busy system with a bunch of xorp router processes (mis)configured.
>>
>> This thing is rapidly making route changes for whatever reason.
>>
>> The 'ip monitor route' command was failing:
>>
>> [root@i7-dqc-1 ]# ip monitor route
>> netlink receive error No buffer space available (105)
>> Dump terminated
>>
>>
>> It is only using a 32k rcv buffer, and it seems the OS was
>> overdriving it.
>>
>> Please consider making the rcv buffer larger, perhaps something
>> like this (inline is white-space damaged...attachment should apply
>> if deemed useful.):
>>
>> Signed-off-by:  Ben Greear<greearb@candelatech.com>
>>
>> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
>> index b68e2fd..95a7d1d 100644
>> --- a/lib/libnetlink.c
>> +++ b/lib/libnetlink.c
>> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>>    {
>>           socklen_t addr_len;
>>           int sndbuf = 32768;
>> -       int rcvbuf = 32768;
>> +       int rcvbuf = 3276800;
>>
>>           memset(rth, 0, sizeof(*rth));
>>
>>
>> Thanks,
>> Ben
>>
>
> Just having larger buffer isn't guarantee of success. Allocating
> a huge buffer is not going to work on embedded.
>
> Why not have it continue after one error.

Probably the right way is to give a cmd-line arg to set the buffer size
and also continue if the error is ENOBUFS (but print some error out
so users know they have issues).  I can make the attempt if that
sounds good to you.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-27 23:30   ` Ben Greear
@ 2009-10-28  7:01     ` Eric Dumazet
  2009-10-28  7:09       ` Eric Dumazet
  2009-10-28  7:37       ` Eric Dumazet
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Dumazet @ 2009-10-28  7:01 UTC (permalink / raw)
  To: Ben Greear; +Cc: Stephen Hemminger, NetDev

Ben Greear wrote:
> 
> Probably the right way is to give a cmd-line arg to set the buffer size
> and also continue if the error is ENOBUFS (but print some error out
> so users know they have issues).  I can make the attempt if that
> sounds good to you.

The real fix is to realloc the buffer at receive time; no need for a user setting.

In my testing I saw it reaching 1 Mbyte:
write(2, "REALLOC buflen 8192\n"..., 20) = 20
write(2, "REALLOC buflen 16384\n"..., 21) = 21
write(2, "REALLOC buflen 32768\n"..., 21) = 21
write(2, "REALLOC buflen 65536\n"..., 21) = 21
write(2, "REALLOC buflen 131072\n"..., 22) = 22
write(2, "REALLOC buflen 262144\n"..., 22) = 22
write(2, "REALLOC buflen 524288\n"..., 22) = 22


[iproute2] realloc buffer in rtnl_listen

# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated 

Reported-by: Ben Greear<greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..134ce7f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -392,8 +392,14 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char   buf[8192];
+	char   *buf;
+	size_t buflen = 8192;
 
+	buf = malloc(buflen);
+	if (buf == NULL) {
+		fprintf(stderr, "netlink could not alloc %lu bytes\n", buflen);
+		return -1;
+	}
 	memset(&nladdr, 0, sizeof(nladdr));
 	nladdr.nl_family = AF_NETLINK;
 	nladdr.nl_pid = 0;
@@ -401,12 +407,20 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 
 	iov.iov_base = buf;
 	while (1) {
-		iov.iov_len = sizeof(buf);
+		iov.iov_len = buflen;
 		status = recvmsg(rtnl->fd, &msg, 0);
 
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS) {
+				buf = realloc(buf, buflen * 2);
+				if (buf) {
+					buflen *= 2;
+					iov.iov_base = buf;
+					continue;
+				}
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28  7:01     ` Eric Dumazet
@ 2009-10-28  7:09       ` Eric Dumazet
  2009-10-28  7:37       ` Eric Dumazet
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Dumazet @ 2009-10-28  7:09 UTC (permalink / raw)
  Cc: Ben Greear, Stephen Hemminger, NetDev

Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFS (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 

Then, another problem is that some information can be dropped at kernel level
when the socket rcvbuf is full (ip monitor too slow to read its socket).

That's hard to fix because you need to tweak /proc/sys/net/core/rmem_max



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28  7:01     ` Eric Dumazet
  2009-10-28  7:09       ` Eric Dumazet
@ 2009-10-28  7:37       ` Eric Dumazet
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Dumazet @ 2009-10-28  7:37 UTC (permalink / raw)
  To: Ben Greear, Stephen Hemminger; +Cc: NetDev

Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFS (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 
> In my testings I saw it reaching 1 Mbyte
> write(2, "REALLOC buflen 8192\n"..., 20) = 20
> write(2, "REALLOC buflen 16384\n"..., 21) = 21
> write(2, "REALLOC buflen 32768\n"..., 21) = 21
> write(2, "REALLOC buflen 65536\n"..., 21) = 21
> write(2, "REALLOC buflen 131072\n"..., 22) = 22
> write(2, "REALLOC buflen 262144\n"..., 22) = 22
> write(2, "REALLOC buflen 524288\n"..., 22) = 22
> 
> 
> [iproute2] realloc buffer in rtnl_listen
> 
> # ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated 
> 
> Reported-by: Ben Greear<greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Oops, this was wrong, Ben was right, sorry...

An ENOBUFS error is a flag that tells the user some information was dropped,
not that the buffer supplied by the user at recv() time is too small.

I was surprised that the buffer could reach 1 Mbyte while RCVBUF was 32768 or so.
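
Under that reading of ENOBUFS, a receive loop would treat the error as a
warning rather than a reason to stop or to grow its own buffer. The following
is a minimal sketch of such a policy, not code from libnetlink; the enum and
the function name classify_recv_errno() are hypothetical and exist only for
illustration.

```c
#include <errno.h>

/* Hypothetical names for illustration; not part of libnetlink. */
enum recv_action {
	RECV_RETRY,             /* transient: call recvmsg() again silently */
	RECV_WARN_AND_CONTINUE, /* kernel dropped messages: warn, keep going */
	RECV_FATAL              /* anything else: report and give up */
};

/* Map the errno left by a failed recvmsg() to what the loop should do. */
enum recv_action classify_recv_errno(int err)
{
	switch (err) {
	case EINTR:
	case EAGAIN:
		return RECV_RETRY;
	case ENOBUFS:
		/* Events were lost in the kernel; the socket is still usable. */
		return RECV_WARN_AND_CONTINUE;
	default:
		return RECV_FATAL;
	}
}
```

A caller would invoke this with errno right after recvmsg() returns a negative
value, print a message for the last two cases, and return only for RECV_FATAL.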



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-27 23:24 ` Stephen Hemminger
  2009-10-27 23:30   ` Ben Greear
@ 2009-10-28  7:52   ` Eric Dumazet
  2009-10-28  7:55     ` David Miller
  2009-10-28 19:05     ` Patrick McHardy
  1 sibling, 2 replies; 17+ messages in thread
From: Eric Dumazet @ 2009-10-28  7:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ben Greear, NetDev

Stephen Hemminger wrote:
> 
> Just having larger buffer isn't guarantee of success. Allocating
> a huge buffer is not going to work on embedded.
> 

Please note we do not allocate a big buffer, only allow more small skbs
to be queued on socket receive queue.

If memory is not available, skb allocation will eventually fail
and be reported as well, embedded or not.

I vote for allowing 1024*1024 bytes instead of 32768,
and possibly the user should be warned that it is capped by
/proc/sys/net/core/rmem_max


> Why not have it continue after one error.

Yes, but the caller of 'ip monitor' would just restart it anyway.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28  7:52   ` Eric Dumazet
@ 2009-10-28  7:55     ` David Miller
  2009-10-28 19:05     ` Patrick McHardy
  1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2009-10-28  7:55 UTC (permalink / raw)
  To: eric.dumazet; +Cc: shemminger, greearb, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 08:52:57 +0100

> Stephen Hemminger wrote:
>> 
>> Just having larger buffer isn't guarantee of success. Allocating
>> a huge buffer is not going to work on embedded.
>> 
> 
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
> 
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
> 
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by 
> /proc/sys/net/core/rmem_max

This discussion constantly reminds me of:

/*
 *	skb should fit one page. This choice is good for headerless malloc.
 *	But we should limit to 8K so that userspace does not have to
 *	use enormous buffer sizes on recvmsg() calls just to avoid
 *	MSG_TRUNC when PAGE_SIZE is very large.
 */
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(8192UL)
#endif

#define NLMSG_DEFAULT_SIZE (NLMSG_GOODSIZE - NLMSG_HDRLEN)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28  7:52   ` Eric Dumazet
  2009-10-28  7:55     ` David Miller
@ 2009-10-28 19:05     ` Patrick McHardy
  2009-10-28 19:19       ` Ben Greear
  2009-10-29  8:17       ` David Miller
  1 sibling, 2 replies; 17+ messages in thread
From: Patrick McHardy @ 2009-10-28 19:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Stephen Hemminger, Ben Greear, NetDev

[-- Attachment #1: Type: text/plain, Size: 753 bytes --]

Eric Dumazet wrote:
> Stephen Hemminger wrote:
>> Just having larger buffer isn't guarantee of success. Allocating
>> a huge buffer is not going to work on embedded.
>>
> 
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
> 
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
> 
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by 
> /proc/sys/net/core/rmem_max

How about this? It will double the receive queue limit on ENOBUFS
up to 1024 * 1024b, then bail out with the normal error message on
further ENOBUFS.

Signed-off-by: Patrick McHardy <kaber@trash.net>

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 894 bytes --]

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..e4fda40 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -25,6 +25,8 @@
 
 #include "libnetlink.h"
 
+static int rcvbuf = 32768;
+
 void rtnl_close(struct rtnl_handle *rth)
 {
 	if (rth->fd >= 0) {
@@ -38,7 +40,6 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;
 
 	memset(rth, 0, sizeof(*rth));
 
@@ -407,6 +409,12 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS && rcvbuf < 1024 * 1024) {
+				rcvbuf *= 2;
+				if (setsockopt(rtnl->fd, SOL_SOCKET, SO_RCVBUF,
+					       &rcvbuf, sizeof(rcvbuf)) == 0)
+					continue;
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:05     ` Patrick McHardy
@ 2009-10-28 19:19       ` Ben Greear
  2009-10-28 19:50         ` Patrick McHardy
  2009-10-28 20:38         ` Eric Dumazet
  2009-10-29  8:17       ` David Miller
  1 sibling, 2 replies; 17+ messages in thread
From: Ben Greear @ 2009-10-28 19:19 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 12:05 PM, Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Stephen Hemminger wrote:
>>> Just having larger buffer isn't guarantee of success. Allocating
>>> a huge buffer is not going to work on embedded.
>>>
>>
>> Please note we do not allocate a big buffer, only allow more small skbs
>> to be queued on socket receive queue.
>>
>> If memory is not available, skb allocation will eventually fail
>> and be reported as well, embedded or not.
>>
>> I vote for allowing 1024*1024 bytes instead of 32768,
>> and eventually user should be warned that it is capped by
>> /proc/sys/net/core/rmem_max
>
> How about this? It will double the receive queue limit on ENOBUFS
> up to 1024 * 1024b, then bail out with the normal error message on
> further ENOBUFS.
>
> Signed-off-by: Patrick McHardy<kaber@trash.net>

First:  This still pretty much guarantees that messages will be lost when
the program starts (when messages are coming in too large of chunks for small buffers).
If you are debugging something tricky, having lost messages will be
very annoying!

Second:  Why bail on ENOBUFS at all?  I don't see how it helps the user
since they will probably just have to start it again, and will miss more
messages than keeping going would have.

And, even 1MB may not be enough for some scenarios.  So, probably best to
let users over-ride the initial setting on cmd-line.  If not, then use
a large value to start with.

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:19       ` Ben Greear
@ 2009-10-28 19:50         ` Patrick McHardy
  2009-10-28 20:04           ` Ben Greear
  2009-11-10 17:15           ` Stephen Hemminger
  2009-10-28 20:38         ` Eric Dumazet
  1 sibling, 2 replies; 17+ messages in thread
From: Patrick McHardy @ 2009-10-28 19:50 UTC (permalink / raw)
  To: Ben Greear; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]

Ben Greear wrote:
> On 10/28/2009 12:05 PM, Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Stephen Hemminger wrote:
>>>> Just having larger buffer isn't guarantee of success. Allocating
>>>> a huge buffer is not going to work on embedded.
>>>>
>>>
>>> Please note we do not allocate a big buffer, only allow more small skbs
>>> to be queued on socket receive queue.
>>>
>>> If memory is not available, skb allocation will eventually fail
>>> and be reported as well, embedded or not.
>>>
>>> I vote for allowing 1024*1024 bytes instead of 32768,
>>> and eventually user should be warned that it is capped by
>>> /proc/sys/net/core/rmem_max
>>
>> How about this? It will double the receive queue limit on ENOBUFS
>> up to 1024 * 1024b, then bail out with the normal error message on
>> further ENOBUFS.
>>
>> Signed-off-by: Patrick McHardy<kaber@trash.net>
> 
> First:  This still pretty much guarantees that messages will be lost when
> the program starts (when messages are coming in too large of chunks for
> small buffers)
> If you are debugging something tricky, having lost messages will be
> very annoying!

Yeah, on second thought the probing also doesn't make too much sense
since the memory is only used when it's really needed anyway. And it's
capped by rmem_max.

> Second:  Why bail on ENOBUFS at all?  I don't see how it helps the user
> since they will probably just have to start it again, and will miss more
> messages than keeping going would have.

Agreed.

> And, even 1MB may not be enough for some scenarios.  So, probably best to
> let users over-ride the initial setting on cmd-line.  If not, then use
> a large value to start with.

How about this? It uses 1MB as the receive buf limit by default (unless
/proc/sys/net/core/rmem_max is increased, the effective limit will be
smaller, however) and allows specifying the size manually using "-rcvbuf X"
(-r is already used, so you need to specify at least -rc).

Additionally rtnl_listen() continues on ENOBUFS after printing the
error message.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2170 bytes --]

diff --git a/include/libnetlink.h b/include/libnetlink.h
index 0e02468..61da15b 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -17,6 +17,8 @@ struct rtnl_handle
 	__u32			dump;
 };
 
+extern int rcvbuf;
+
 extern int rtnl_open(struct rtnl_handle *rth, unsigned subscriptions);
 extern int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions, int protocol);
 extern void rtnl_close(struct rtnl_handle *rth);
diff --git a/ip/ip.c b/ip/ip.c
index 2bd54b2..b4c076a 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -50,7 +50,8 @@ static void usage(void)
 "                   tunnel | maddr | mroute | monitor | xfrm }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
-"                    -o[neline] | -t[imestamp] | -b[atch] [filename] }\n");
+"                    -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
+"                    -rc[vbuf] [size]}\n");
 	exit(-1);
 }
 
@@ -213,6 +214,19 @@ int main(int argc, char **argv)
 			if (argc <= 1)
 				usage();
 			batch_file = argv[1];
+		} else if (matches(opt, "-rcvbuf") == 0) {
+			unsigned int size;
+
+			argc--;
+			argv++;
+			if (argc <= 1)
+				usage();
+			if (get_unsigned(&size, argv[1], 0)) {
+				fprintf(stderr, "Invalid rcvbuf size '%s'\n",
+					argv[1]);
+				exit(-1);
+			}
+			rcvbuf = size;
 		} else if (matches(opt, "-help") == 0) {
 			usage();
 		} else {
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..5c716ab 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -25,6 +25,8 @@
 
 #include "libnetlink.h"
 
+int rcvbuf = 1024 * 1024;
+
 void rtnl_close(struct rtnl_handle *rth)
 {
 	if (rth->fd >= 0) {
@@ -38,7 +40,6 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;
 
 	memset(rth, 0, sizeof(*rth));
 
@@ -409,6 +410,8 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 				continue;
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
+			if (errno == ENOBUFS)
+				continue;
 			return -1;
 		}
 		if (status == 0) {

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:50         ` Patrick McHardy
@ 2009-10-28 20:04           ` Ben Greear
  2009-10-28 20:07             ` Patrick McHardy
  2009-11-10 17:15           ` Stephen Hemminger
  1 sibling, 1 reply; 17+ messages in thread
From: Ben Greear @ 2009-10-28 20:04 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 12:50 PM, Patrick McHardy wrote:

>> And, even 1MB may not be enough for some scenarios.  So, probably best to
>> let users over-ride the initial setting on cmd-line.  If not, then use
>> a large value to start with.
>
> How about this? It uses 1MB as receive buf limit by default (without
> increasing /proc/sys/net/core/rmem_max it will be limited by less
> however) and allows to specify the size manually using "-rcvbuf X"
> (-r is already used, so you need to specify at least -rc).
>
> Additionally rtnl_listen() continues on ENOBUFS after printing the
> error message.

Looks good..except:

If rmem_max is smaller than 1M, will that cause setsockopt to
fail and thus fail early out of rtnl_open_byproto?

Maybe we should only print errors but not return in that method
when setsockopt fails?

In another project, I ended up trying ever smaller values until one
worked in order to get near what the user wanted even if rmem_max
was configured smaller.  Not sure if that is worth doing here or not.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 20:04           ` Ben Greear
@ 2009-10-28 20:07             ` Patrick McHardy
  2009-10-28 20:21               ` Ben Greear
  0 siblings, 1 reply; 17+ messages in thread
From: Patrick McHardy @ 2009-10-28 20:07 UTC (permalink / raw)
  To: Ben Greear; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

Ben Greear wrote:
> On 10/28/2009 12:50 PM, Patrick McHardy wrote:
> 
>>> And, even 1MB may not be enough for some scenarios.  So, probably
>>> best to
>>> let users over-ride the initial setting on cmd-line.  If not, then use
>>> a large value to start with.
>>
>> How about this? It uses 1MB as receive buf limit by default (without
>> increasing /proc/sys/net/core/rmem_max it will be limited by less
>> however) and allows to specify the size manually using "-rcvbuf X"
>> (-r is already used, so you need to specify at least -rc).
>>
>> Additionally rtnl_listen() continues on ENOBUFS after printing the
>> error message.
> 
> Looks good..except:
> 
> If rmem_max is smaller than 1M, will that cause setsockopt to
> fail and thus fail early out of rtnl_open_byproto?

No, the kernel takes the value as a hint and caps it at the
maximum allowable value:

	case SO_RCVBUF:
		/* Don't error on this BSD doesn't and if you think
		   about it this is right. Otherwise apps have to
		   play 'guess the biggest size' games. RCVBUF/SNDBUF
		   are treated in BSD as hints */

		if (val > sysctl_rmem_max)
			val = sysctl_rmem_max;

> Maybe we should only print errors but not return in that method
> when setsockopt fails?
> 
> In another project, I ended up trying ever smaller values until one
> worked in order to get near what the user wanted even if rmem_max
> was configured smaller.  Not sure if that is worth doing here or not.

I think it should be fine this way.
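
The clamping described above can be observed from userspace by setting
SO_RCVBUF and reading it back. This is a minimal sketch, not code from
iproute2; request_rcvbuf() is a hypothetical helper name. It assumes Linux
behaviour as documented in socket(7), where the value returned by getsockopt()
is double the (possibly clamped) request because the kernel reserves room for
bookkeeping overhead.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libnetlink): ask for a receive buffer
 * of `requested` bytes and report what the kernel actually granted.
 * Returns 0 on success, -1 if either socket call fails.
 */
int request_rcvbuf(int fd, int requested, int *effective)
{
	socklen_t len = sizeof(*effective);

	/* An oversized request is not an error; the kernel caps it. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
		       &requested, sizeof(requested)) < 0)
		return -1;

	/* Read the value back; it may differ from the request. */
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, effective, &len) < 0)
		return -1;

	return 0;
}
```

Comparing the returned value against the request is one way to warn a user
that rmem_max limited the buffer, without making the open call fail.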

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 20:07             ` Patrick McHardy
@ 2009-10-28 20:21               ` Ben Greear
  0 siblings, 0 replies; 17+ messages in thread
From: Ben Greear @ 2009-10-28 20:21 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 01:07 PM, Patrick McHardy wrote:
> Ben Greear wrote:
>> On 10/28/2009 12:50 PM, Patrick McHardy wrote:
>>
>>>> And, even 1MB may not be enough for some scenarios.  So, probably
>>>> best to
>>>> let users over-ride the initial setting on cmd-line.  If not, then use
>>>> a large value to start with.
>>>
>>> How about this? It uses 1MB as receive buf limit by default (without
>>> increasing /proc/sys/net/core/rmem_max it will be limited by less
>>> however) and allows to specify the size manually using "-rcvbuf X"
>>> (-r is already used, so you need to specify at least -rc).
>>>
>>> Additionally rtnl_listen() continues on ENOBUFS after printing the
>>> error message.
>>
>> Looks good..except:
>>
>> If rmem_max is smaller than 1M, will that cause setsockopt to
>> fail and thus fail early out of rtnl_open_byproto?
>
> No, the kernel takes the value as a hint and only uses the
> maximum allowable value:

Sweet.  No complaints from me then.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:19       ` Ben Greear
  2009-10-28 19:50         ` Patrick McHardy
@ 2009-10-28 20:38         ` Eric Dumazet
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Dumazet @ 2009-10-28 20:38 UTC (permalink / raw)
  To: Ben Greear; +Cc: Patrick McHardy, Stephen Hemminger, NetDev

Ben Greear wrote:

> Second:  Why bail on ENOBUFS at all?  I don't see how it helps the user
> since they will probably just have to start it again, and will miss more
> messages than keeping going would have.
> 
> And, even 1MB may not be enough for some scenarios.  So, probably best to
> let users over-ride the initial setting on cmd-line.  If not, then use
> a large value to start with.
> 

In this case, just don't call setsockopt() at all in "ip" and let the system use the
standard/default value (/proc/sys/net/core/rmem_default), which an admin can change
if he wants to handle one million devices :)
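
For reference, the system-wide default and cap mentioned in this thread can be
read from /proc. The following is a minimal sketch under the assumption that
each file holds a single decimal integer; read_sysctl_long() is a hypothetical
helper name, not an iproute2 function.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a /proc/sys file such as
 * /proc/sys/net/core/rmem_default or /proc/sys/net/core/rmem_max.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
long read_sysctl_long(const char *path)
{
	FILE *f = fopen(path, "r");
	long val = -1;

	if (f == NULL)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

A tool that skips setsockopt() entirely would get the rmem_default value; one
that requests a larger size would be limited by rmem_max.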


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:05     ` Patrick McHardy
  2009-10-28 19:19       ` Ben Greear
@ 2009-10-29  8:17       ` David Miller
  1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2009-10-29  8:17 UTC (permalink / raw)
  To: kaber; +Cc: eric.dumazet, shemminger, greearb, netdev

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 28 Oct 2009 20:05:12 +0100

> How about this? It will double the receive queue limit on ENOBUFS
> up to 1024 * 1024b, then bail out with the normal error message on
> further ENOBUFS.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: iproute uses too small of a receive buffer
  2009-10-28 19:50         ` Patrick McHardy
  2009-10-28 20:04           ` Ben Greear
@ 2009-11-10 17:15           ` Stephen Hemminger
  1 sibling, 0 replies; 17+ messages in thread
From: Stephen Hemminger @ 2009-11-10 17:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Ben Greear, Eric Dumazet, NetDev


> 
> How about this? It uses 1MB as receive buf limit by default (without
> increasing /proc/sys/net/core/rmem_max it will be limited by less
> however) and allows to specify the size manually using "-rcvbuf X"
> (-r is already used, so you need to specify at least -rc).
> 
> Additionally rtnl_listen() continues on ENOBUFS after printing the
> error message.

Applied, seems like the best workaround

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-11-10 17:15 UTC | newest]

Thread overview: 17+ messages
2009-10-27 23:16 iproute uses too small of a receive buffer Ben Greear
2009-10-27 23:24 ` Stephen Hemminger
2009-10-27 23:30   ` Ben Greear
2009-10-28  7:01     ` Eric Dumazet
2009-10-28  7:09       ` Eric Dumazet
2009-10-28  7:37       ` Eric Dumazet
2009-10-28  7:52   ` Eric Dumazet
2009-10-28  7:55     ` David Miller
2009-10-28 19:05     ` Patrick McHardy
2009-10-28 19:19       ` Ben Greear
2009-10-28 19:50         ` Patrick McHardy
2009-10-28 20:04           ` Ben Greear
2009-10-28 20:07             ` Patrick McHardy
2009-10-28 20:21               ` Ben Greear
2009-11-10 17:15           ` Stephen Hemminger
2009-10-28 20:38         ` Eric Dumazet
2009-10-29  8:17       ` David Miller
