linux-kernel.vger.kernel.org archive mirror
From: Cezar Sa Espinola <cezarsa@gmail.com>
To: Julian Anastasov <ja@ssi.bg>
Cc: Cezar Sa Espinola <cezarsa@gmail.com>,
	Wensong Zhang <wensong@linux-vs.org>,
	Simon Horman <horms@verge.net.au>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Jozsef Kadlecsik <kadlec@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	netdev@vger.kernel.org (open list:IPVS),
	lvs-devel@vger.kernel.org (open list:IPVS),
	linux-kernel@vger.kernel.org (open list),
	netfilter-devel@vger.kernel.org (open list:NETFILTER),
	coreteam@netfilter.org (open list:NETFILTER)
Subject: [PATCH RFC] ipvs: add genetlink cmd to dump all services and destinations
Date: Fri, 30 Oct 2020 17:27:27 -0300
Message-ID: <20201030202727.1053534-1-cezarsa@gmail.com>

A common operation for userspace applications managing ipvs is to dump
all services and all destinations and then sort out what needs to be
done. Previously this could only be accomplished by issuing 1 netlink
IPVS_CMD_GET_SERVICE dump command followed by N IPVS_CMD_GET_DEST dump
commands, one per service. For a dynamic system with a very large number
of services this could cause a performance impact.

This patch introduces a new way of dumping all services and destinations
with the new IPVS_CMD_GET_SERVICE_DEST command. A dump call for this
command will send each service as an IPVS_CMD_NEW_SERVICE message
immediately followed by its destinations as multiple IPVS_CMD_NEW_DEST
messages. It's also possible to dump a single service and its
destinations by sending an IPVS_CMD_ATTR_SERVICE argument to the dump
command.
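
As an illustration of the ordering described above, a userspace consumer
could group the reply stream roughly as sketched below. This is only a
sketch under assumptions: enum msg_kind, struct svc_summary and
group_dump() are hypothetical names, not part of this patch or of
ipvsadm, and a real client would read the command from the genetlink
header of each message instead of from a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the genetlink command carried by each
 * message of the dump reply (IPVS_CMD_NEW_SERVICE / IPVS_CMD_NEW_DEST).
 */
enum msg_kind { MSG_NEW_SERVICE, MSG_NEW_DEST };

struct svc_summary {
	size_t first_msg;	/* index of the service message in the stream */
	size_t n_dests;		/* destination messages that followed it */
};

/* Walk the stream in order and attach every destination to the most
 * recently seen service. Returns the number of services found, or -1 if
 * a destination arrives before any service or `out` is too small.
 */
static int group_dump(const enum msg_kind *msgs, size_t n_msgs,
		      struct svc_summary *out, size_t out_cap)
{
	size_t n_svcs = 0;
	size_t i;

	assert(msgs != NULL || n_msgs == 0);

	for (i = 0; i < n_msgs; i++) {
		if (msgs[i] == MSG_NEW_SERVICE) {
			if (n_svcs == out_cap)
				return -1;
			out[n_svcs].first_msg = i;
			out[n_svcs].n_dests = 0;
			n_svcs++;
		} else {
			if (n_svcs == 0)
				return -1;
			out[n_svcs - 1].n_dests++;
		}
	}
	return (int)n_svcs;
}
```

The point of the sketch is that no lookup is needed on the receiving
side: a destination always belongs to the last service seen.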

Signed-off-by: Cezar Sa Espinola <cezarsa@gmail.com>
---

To ensure that this patch improves performance I decided to also patch
ipvsadm in order to run some benchmarks comparing 'ipvsadm -Sn' with the
unpatched version. The ipvsadm patch is available on github in [1] for
now but I intend to submit it if this RFC goes forward.

The benchmarks look good. Detailed results and some scripts to reproduce
them are available in another github repository [2]. The summary of the
benchmarks is:

svcs  | dsts | run time compared to unpatched
----- | ---- | ------------------------------
 1000 |    4 | -60.63%
 2000 |    2 | -71.10%
 8000 |    2 | -52.83%
16000 |    1 | -54.13%
  100 |  100 |  -4.76%
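
For reference, the number of dump requests behind each row can be worked
out from the 1 + N description in the commit message. The helpers below
are a hypothetical sketch (the function names are not from the patch or
from ipvsadm) and count only dump requests, not the reply messages or
the recvmsg calls each dump may need.

```c
/* One IPVS_CMD_GET_SERVICE dump plus one IPVS_CMD_GET_DEST dump per
 * service.
 */
static unsigned long dump_requests_unpatched(unsigned long n_services)
{
	return 1UL + n_services;
}

/* A single IPVS_CMD_GET_SERVICE_DEST dump covers everything. */
static unsigned long dump_requests_patched(void)
{
	return 1UL;
}

/* Dump requests saved for a given number of services. */
static unsigned long dump_requests_saved(unsigned long n_services)
{
	return dump_requests_unpatched(n_services) - dump_requests_patched();
}
```

With 16000 services this is 16001 requests against 1, while with 100
services it is 101 against 1. One possible reading of the last row is
that the requests saved there are few compared to the amount of
destination data transferred, but I have not measured that separately.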

[1] - https://github.com/cezarsa/ipvsadm/compare/master...dump-svc-ds
[2] - https://github.com/cezarsa/ipvsadm-validate#benchmark-results
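
To make the resume bookkeeping in the patch below easier to review, here
is a small userspace model of it. It is a sketch under assumptions, not
kernel code: struct model_svc, struct model_ctx, struct model_buf,
emit(), dump_one() and dump_all() are hypothetical names, a fixed
message budget per pass stands in for the skb filling up, and the
service list is a plain array that does not change between passes.

```c
#include <assert.h>
#include <stddef.h>

struct model_svc {
	int id;		/* the service is emitted as id * 100 */
	int n_dests;	/* destination k is emitted as id * 100 + k */
};

struct model_ctx {
	const struct model_svc *last_svc;
	int idx_svc, idx_dest, start_svc, start_dest;
};

struct model_buf {
	int *msgs;	/* emitted values are appended here */
	int n;		/* values emitted so far, across all passes */
	int room;	/* values that still fit in the current pass */
};

static int emit(struct model_buf *b, int v)
{
	if (b->room == 0)
		return -1;	/* plays the role of -EMSGSIZE */
	b->msgs[b->n++] = v;
	b->room--;
	return 0;
}

/* Same control flow as ip_vs_genl_dump_service_destinations(). */
static int dump_one(struct model_buf *b, const struct model_svc *svc,
		    struct model_ctx *ctx)
{
	int k;

	if (++ctx->idx_svc < ctx->start_svc)
		return 0;	/* fully dumped in an earlier pass */
	if (ctx->idx_svc == ctx->start_svc && ctx->last_svc != svc)
		return 0;	/* not the service the last pass stopped in */
	if (ctx->idx_svc > ctx->start_svc) {
		if (emit(b, svc->id * 100) < 0) {
			ctx->idx_svc--;
			return -1;
		}
		ctx->last_svc = svc;
		ctx->start_dest = 0;
	}
	ctx->idx_dest = 0;
	for (k = 1; k <= svc->n_dests; k++) {
		if (++ctx->idx_dest <= ctx->start_dest)
			continue;	/* already sent in an earlier pass */
		if (emit(b, svc->id * 100 + k) < 0) {
			ctx->idx_dest--;
			return -1;
		}
	}
	return 0;
}

/* Run passes of at most `per_pass` messages until a pass emits nothing,
 * carrying state between passes the way cb->args[] does. Returns the
 * total number of messages written to `msgs`.
 */
static int dump_all(const struct model_svc *svcs, size_t n_svcs,
		    int per_pass, int *msgs)
{
	long args[2] = { 0, 0 };
	const struct model_svc *last = NULL;
	struct model_buf b = { msgs, 0, 0 };
	int before;
	size_t i;

	assert(per_pass >= 1);
	do {
		struct model_ctx ctx = { last, 0, 0,
					 (int)args[0], (int)args[1] };

		before = b.n;
		b.room = per_pass;
		for (i = 0; i < n_svcs; i++)
			if (dump_one(&b, &svcs[i], &ctx) < 0)
				break;
		args[0] = ctx.idx_svc;
		args[1] = ctx.idx_dest;
		last = ctx.last_svc;
	} while (b.n > before);
	return b.n;
}
```

In this model the emitted sequence does not depend on the per-pass
budget. It does not model services being added or removed while the
dump is running, which is the case the last_svc pointer is meant to
cover.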

 include/uapi/linux/ip_vs.h     |   2 +
 net/netfilter/ipvs/ip_vs_ctl.c | 109 +++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+)

diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
index 4102ddcb4e14..353548cb7b81 100644
--- a/include/uapi/linux/ip_vs.h
+++ b/include/uapi/linux/ip_vs.h
@@ -331,6 +331,8 @@ enum {
 	IPVS_CMD_ZERO,			/* zero all counters and stats */
 	IPVS_CMD_FLUSH,			/* flush services and dests */
 
+	IPVS_CMD_GET_SERVICE_DEST,	/* get service and destination info */
+
 	__IPVS_CMD_MAX,
 };
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index e279ded4e306..09a7dd823dc0 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3396,6 +3396,109 @@ static int ip_vs_genl_dump_dests(struct sk_buff *skb,
 	return skb->len;
 }
 
+struct dump_services_dests_ctx {
+	struct ip_vs_service	*last_svc;
+	int			idx_svc;
+	int			idx_dest;
+	int			start_svc;
+	int			start_dest;
+};
+
+static int ip_vs_genl_dump_service_destinations(struct sk_buff *skb,
+						struct netlink_callback *cb,
+						struct ip_vs_service *svc,
+						struct dump_services_dests_ctx *ctx)
+{
+	struct ip_vs_dest *dest;
+
+	if (++ctx->idx_svc < ctx->start_svc)
+		return 0;
+
+	if (ctx->idx_svc == ctx->start_svc && ctx->last_svc != svc)
+		return 0;
+
+	if (ctx->idx_svc > ctx->start_svc) {
+		if (ip_vs_genl_dump_service(skb, svc, cb) < 0) {
+			ctx->idx_svc--;
+			return -EMSGSIZE;
+		}
+		ctx->last_svc = svc;
+		ctx->start_dest = 0;
+	}
+
+	ctx->idx_dest = 0;
+	list_for_each_entry(dest, &svc->destinations, n_list) {
+		if (++ctx->idx_dest <= ctx->start_dest)
+			continue;
+		if (ip_vs_genl_dump_dest(skb, dest, cb) < 0) {
+			ctx->idx_dest--;
+			return -EMSGSIZE;
+		}
+	}
+
+	return 0;
+}
+
+static int ip_vs_genl_dump_services_destinations(struct sk_buff *skb,
+						 struct netlink_callback *cb)
+{
+	/* Besides usual index based counters, saving a pointer to the last
+	 * dumped service is useful to ensure we only dump destinations that
+	 * belong to it, even when services are removed while the dump is still
+	 * running causing indexes to shift.
+	 */
+	struct dump_services_dests_ctx ctx = {
+		.idx_svc = 0,
+		.idx_dest = 0,
+		.start_svc = cb->args[0],
+		.start_dest = cb->args[1],
+		.last_svc = (struct ip_vs_service *)(cb->args[2]),
+	};
+	struct net *net = sock_net(skb->sk);
+	struct netns_ipvs *ipvs = net_ipvs(net);
+	struct ip_vs_service *svc = NULL;
+	struct nlattr *attrs[IPVS_CMD_ATTR_MAX + 1];
+	int i;
+
+	mutex_lock(&__ip_vs_mutex);
+
+	if (nlmsg_parse_deprecated(cb->nlh, GENL_HDRLEN, attrs, IPVS_CMD_ATTR_MAX,
+				   ip_vs_cmd_policy, cb->extack) == 0) {
+		svc = ip_vs_genl_find_service(ipvs, attrs[IPVS_CMD_ATTR_SERVICE]);
+
+		if (!IS_ERR_OR_NULL(svc)) {
+			ip_vs_genl_dump_service_destinations(skb, cb, svc, &ctx);
+			goto nla_put_failure;
+		}
+	}
+
+	for (i = 0; i < IP_VS_SVC_TAB_SIZE; i++) {
+		hlist_for_each_entry(svc, &ip_vs_svc_table[i], s_list) {
+			if (svc->ipvs != ipvs)
+				continue;
+			if (ip_vs_genl_dump_service_destinations(skb, cb, svc, &ctx) < 0)
+				goto nla_put_failure;
+		}
+	}
+
+	for (i = 0; i < IP_VS_SVC_TAB_SIZE; i++) {
+		hlist_for_each_entry(svc, &ip_vs_svc_fwm_table[i], s_list) {
+			if (svc->ipvs != ipvs)
+				continue;
+			if (ip_vs_genl_dump_service_destinations(skb, cb, svc, &ctx) < 0)
+				goto nla_put_failure;
+		}
+	}
+
+nla_put_failure:
+	mutex_unlock(&__ip_vs_mutex);
+	cb->args[0] = ctx.idx_svc;
+	cb->args[1] = ctx.idx_dest;
+	cb->args[2] = (long)ctx.last_svc;
+
+	return skb->len;
+}
+
 static int ip_vs_genl_parse_dest(struct ip_vs_dest_user_kern *udest,
 				 struct nlattr *nla, bool full_entry)
 {
@@ -3991,6 +4094,12 @@ static const struct genl_small_ops ip_vs_genl_ops[] = {
 		.flags	= GENL_ADMIN_PERM,
 		.doit	= ip_vs_genl_set_cmd,
 	},
+	{
+		.cmd	= IPVS_CMD_GET_SERVICE_DEST,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.flags	= GENL_ADMIN_PERM,
+		.dumpit	= ip_vs_genl_dump_services_destinations,
+	},
 };
 
 static struct genl_family ip_vs_genl_family __ro_after_init = {
-- 
2.25.1


Thread overview: 8+ messages
2020-10-30 20:27 Cezar Sa Espinola [this message]
2020-11-02 20:56 ` [PATCH RFC] ipvs: add genetlink cmd to dump all services and destinations Julian Anastasov
2020-11-03 16:36   ` Cezar Sá Espinola
2020-11-03 19:18     ` Julian Anastasov
2020-11-06 15:40       ` [PATCH RFC v2] " Cezar Sa Espinola
2020-11-09 21:29         ` Julian Anastasov
2020-11-10 14:45           ` [PATCH RFC v3] " Cezar Sa Espinola
2020-11-15 12:50             ` Julian Anastasov
