All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Hangbin Liu <liuhangbin@gmail.com>, netdev@vger.kernel.org
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	Michal Kubecek <mkubecek@suse.cz>, Phil Sutter <phil@nwl.cc>
Subject: Re: [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: re malloc buff if size is not enough
Date: Tue, 12 Feb 2019 15:32:21 -0800	[thread overview]
Message-ID: <244174ae-3ab4-68c0-6783-f8c91840a7e1@gmail.com> (raw)
In-Reply-To: <1508982107-28474-2-git-send-email-liuhangbin@gmail.com>



On 10/25/2017 06:41 PM, Hangbin Liu wrote:
> With commit 72b365e8e0fd ("libnetlink: Double the dump buffer size")
> we doubled the buffer size to support more VFs. But the VFs number is
> increasing all the time. Some customers even use more than 200 VFs now.
> 
> We could not double it everytime when the buffer is not enough. Let's just
> not hard code the buffer size and malloc the correct number when running.
> 
> Introduce function rtnl_recvmsg() to always return a newly allocated buffer.
> The caller need to free it after using.
> 
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  lib/libnetlink.c | 114 ++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 80 insertions(+), 34 deletions(-)
> 
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index be7ac86..1847c0b 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -402,6 +402,64 @@ static void rtnl_dump_error(const struct rtnl_handle *rth,
>  	}
>  }
>  
> +static int __rtnl_recvmsg(int fd, struct msghdr *msg, int flags)
> +{
> +	int len;
> +
> +	do {
> +		len = recvmsg(fd, msg, flags);
> +	} while (len < 0 && (errno == EINTR || errno == EAGAIN));
> +
> +	if (len < 0) {
> +		fprintf(stderr, "netlink receive error %s (%d)\n",
> +			strerror(errno), errno);
> +		return -errno;
> +	}
> +
> +	if (len == 0) {
> +		fprintf(stderr, "EOF on netlink\n");
> +		return -ENODATA;
> +	}
> +
> +	return len;
> +}
> +
> +static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
> +{
> +	struct iovec *iov = msg->msg_iov;
> +	char *buf;
> +	int len;
> +
> +	iov->iov_base = NULL;
> +	iov->iov_len = 0;
> +
> +	len = __rtnl_recvmsg(fd, msg, MSG_PEEK | MSG_TRUNC);
> +	if (len < 0)
> +		return len;
> +
> +	buf = malloc(len);
> +	if (!buf) {
> +		fprintf(stderr, "malloc error: not enough buffer\n");
> +		return -ENOMEM;
> +	}
> +
> +	iov->iov_base = buf;
> +	iov->iov_len = len;
> +
> +	len = __rtnl_recvmsg(fd, msg, 0);
> +	if (len < 0) {
> +		free(buf);
> +		return len;
> +	}
> +
> +	if (answer)
> +		*answer = buf;
> +	else
> +		free(buf);
> +
> +	return len;
> +}
> +


This patch brings a serious performance penalty.

ss command now uses two system calls per ~4KB worth of data

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000120>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\1\1\0\250\253\276@&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000108>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000086>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\10\2\0002A\266S&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000121>


So we are back to a very pessimistic situation.


  parent reply	other threads:[~2019-02-12 23:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-26  1:41 [PATCHv5 iproute2 net-next 0/2] libnetlink: malloc correct buff at run time Hangbin Liu
2017-10-26  1:41 ` [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: re malloc buff if size is not enough Hangbin Liu
2017-10-26  2:59   ` David Ahern
2017-10-26 10:24     ` Stephen Hemminger
2017-10-26 15:28       ` David Ahern
2017-10-26 15:33         ` Phil Sutter
2017-10-26 15:42           ` David Ahern
2017-10-26 18:31             ` Phil Sutter
2019-02-12 23:32   ` Eric Dumazet [this message]
2019-02-12 23:43     ` Eric Dumazet
2017-10-26  1:41 ` [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: update rtnl_talk to support malloc buff at run time Hangbin Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=244174ae-3ab4-68c0-6783-f8c91840a7e1@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=liuhangbin@gmail.com \
    --cc=mkubecek@suse.cz \
    --cc=netdev@vger.kernel.org \
    --cc=phil@nwl.cc \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.