All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info
@ 2021-09-29 15:28 Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 01/11] ip: print_rta_if takes ifindex as device argument instead of attribute Nikolay Aleksandrov
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Hi,
This set tries to help with an old ask that we've had for some time
which is to print nexthop information while monitoring or dumping routes.
The core problem is that people cannot follow nexthop changes while
monitoring route changes, by the time they check the nexthop it could be
deleted or updated to something else. In order to help them out I've
added a nexthop cache which is populated (only used if -d / show_details
is specified) while decoding routes and kept up to date while monitoring.
The nexthop information is printed on its own line starting with the
"nh_info" attribute and its embedded inside it if printing JSON. To
cache the nexthop entries I parse them into structures, in order to
reuse most of the code the print helpers have been altered so they rely
on prepared structures. Nexthops are now always parsed into a structure,
even if they won't be cached, that structure is later used to print the
nexthop and destroyed if not going to be cached. New nexthops (not found
in the cache) are retrieved from the kernel using a private netlink
socket so they don't disrupt an ongoing dump, similar to how interfaces
are retrieved and cached.

I have tested the set with the kernel forwarding selftests and also by
stressing it with nexthop create/update/delete in loops while monitoring.

Comments are very welcome as usual. :)

Patch breakdown:
Patches 1-2: update current route helpers to take parsed arguments so we
             can directly pass them from the nh_entry structure later
Patch     3: adds the new nh_entry structure and a helper to parse nhmsg
             into it
Patch     4: adds resilient nexthop group structure and a parser for it
Patch     5: converts current nexthop print code to always parse the
             nhmsg into nh_entry structure and use it for printing
Patch     6: factors out ipnh_get's rtnl talk part and allows to use a
             different rt handle for the communication
Patch     7: adds nexthop cache and helpers to manage it, it uses the
             new __ipnh_get to retrieve nexthops
Patch     8: factors out nh_entry printing into a separate helper called
             __print_nexthop_entry
Patch     9: adds a new helper print_cache_nexthop_id that prints nexthop
             information from its id, if the nexthop is not found in the
             cache it fetches it
Patch    10: the new print_cache_nexthop_id helper is used when printing
             routes with show_details (-d) to output detailed nexthop
             information, the format after nh_info is the same as
             ip nexthop show
Patch    11: changes print_nexthop into print_cache_nexthop which always
             outputs the nexthop information and can also update the cache
             (based on process_cache argument), it's used to keep the
             cache up to date while monitoring

Example outputs (monitor):
[NEXTHOP]id 101 via 169.254.2.22 dev veth2 scope link proto unspec 
[NEXTHOP]id 102 via 169.254.3.23 dev veth4 scope link proto unspec 
[NEXTHOP]id 103 group 101/102 type resilient buckets 512 idle_timer 0 unbalanced_timer 0 unbalanced_time 0 scope global proto unspec 
[ROUTE]unicast 192.0.2.0/24 nhid 203 table 4 proto boot scope global 
	nh_info id 203 group 201/202 type resilient buckets 512 idle_timer 0 unbalanced_timer 0 unbalanced_time 0 scope global proto unspec 
	nexthop via 169.254.2.12 dev veth3 weight 1 
	nexthop via 169.254.3.13 dev veth5 weight 1 

[NEXTHOP]id 204 via fe80:2::12 dev veth3 scope link proto unspec 
[NEXTHOP]id 205 via fe80:3::13 dev veth5 scope link proto unspec 
[NEXTHOP]id 206 group 204/205 type resilient buckets 512 idle_timer 0 unbalanced_timer 0 unbalanced_time 0 scope global proto unspec 
[ROUTE]unicast 2001:db8:1::/64 nhid 206 table 4 proto boot scope global metric 1024 pref medium
	nh_info id 206 group 204/205 type resilient buckets 512 idle_timer 0 unbalanced_timer 0 unbalanced_time 0 scope global proto unspec 
	nexthop via fe80:2::12 dev veth3 weight 1 
	nexthop via fe80:3::13 dev veth5 weight 1 

[NEXTHOP]id 2  encap mpls  200/300 via 10.1.1.1 dev ens20 scope link proto unspec onlink 
[ROUTE]unicast 2.3.4.10 nhid 2 table main proto boot scope global 
	nh_info id 2  encap mpls  200/300 via 10.1.1.1 dev ens20 scope link proto unspec onlink 

JSON:
 {
        "type": "unicast",
        "dst": "198.51.100.0/24",
        "nhid": 103,
        "table": "3",
        "protocol": "boot",
        "scope": "global",
        "flags": [ ],
        "nh_info": {
            "id": 103,
            "group": [ {
                    "id": 101,
                    "weight": 11
                },{
                    "id": 102,
                    "weight": 45
                } ],
            "type": "resilient",
            "resilient_args": {
                "buckets": 512,
                "idle_timer": 0,
                "unbalanced_timer": 0,
                "unbalanced_time": 0
            },
            "scope": "global",
            "protocol": "unspec",
            "flags": [ ]
        },
        "nexthops": [ {
                "gateway": "169.254.2.22",
                "dev": "veth2",
                "weight": 11,
                "flags": [ ]
            },{
                "gateway": "169.254.3.23",
                "dev": "veth4",
                "weight": 45,
                "flags": [ ]
            } ]
  }

Thank you,
 Nik

Nikolay Aleksandrov (11):
  ip: print_rta_if takes ifindex as device argument instead of attribute
  ip: export print_rta_gateway version which outputs prepared gateway
    string
  ip: nexthop: add nh struct and a helper to parse nhmsg into it
  ip: nexthop: parse resilient nexthop group attribute into structure
  ip: nexthop: always parse attributes for printing
  ip: nexthop: pull ipnh_get_id rtnl talk into a helper
  ip: nexthop: add cache helpers
  ip: nexthop: factor out entry printing
  ip: nexthop: add a helper which retrieves and prints cached nh entry
  ip: route: print and cache detailed nexthop information when requested
  ip: nexthop: add print_cache_nexthop which prints and manages the nh
    cache

 ip/ip_common.h |   4 +-
 ip/ipmonitor.c |   3 +-
 ip/ipnexthop.c | 455 +++++++++++++++++++++++++++++++++++++++----------
 ip/iproute.c   |  32 ++--
 ip/nh_common.h |  53 ++++++
 5 files changed, 446 insertions(+), 101 deletions(-)
 create mode 100644 ip/nh_common.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 01/11] ip: print_rta_if takes ifindex as device argument instead of attribute
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 02/11] ip: export print_rta_gateway version which outputs prepared gateway string Nikolay Aleksandrov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

We need print_rta_if() to take ifindex directly so later we can use it
with cached converted nexthop objects.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ip_common.h |  2 +-
 ip/ipnexthop.c |  2 +-
 ip/iproute.c   | 12 ++++++------
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/ip/ip_common.h b/ip/ip_common.h
index ad018183eac0..d3d50cbca74d 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -168,7 +168,7 @@ int name_is_vrf(const char *name);
 
 void print_num(FILE *fp, unsigned int width, uint64_t count);
 void print_rt_flags(FILE *fp, unsigned int flags);
-void print_rta_if(FILE *fp, const struct rtattr *rta, const char *prefix);
+void print_rta_ifidx(FILE *fp, __u32 ifidx, const char *prefix);
 void print_rta_gateway(FILE *fp, unsigned char family,
 		       const struct rtattr *rta);
 #endif /* _IP_COMMON_H_ */
diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 9478aa5298eb..a4048d803325 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -381,7 +381,7 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 		print_rta_gateway(fp, nhm->nh_family, tb[NHA_GATEWAY]);
 
 	if (tb[NHA_OIF])
-		print_rta_if(fp, tb[NHA_OIF], "dev");
+		print_rta_ifidx(fp, rta_getattr_u32(tb[NHA_OIF]), "dev");
 
 	if (nhm->nh_scope != RT_SCOPE_UNIVERSE || show_details > 0) {
 		print_string(PRINT_ANY, "scope", "scope %s ",
diff --git a/ip/iproute.c b/ip/iproute.c
index 1e5e2002d2ed..f2bf4737b958 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -410,13 +410,13 @@ static void print_rt_pref(FILE *fp, unsigned int pref)
 	}
 }
 
-void print_rta_if(FILE *fp, const struct rtattr *rta, const char *prefix)
+void print_rta_ifidx(FILE *fp, __u32 ifidx, const char *prefix)
 {
-	const char *ifname = ll_index_to_name(rta_getattr_u32(rta));
+	const char *ifname = ll_index_to_name(ifidx);
 
-	if (is_json_context())
+	if (is_json_context()) {
 		print_string(PRINT_JSON, prefix, NULL, ifname);
-	else {
+	} else {
 		fprintf(fp, "%s ", prefix);
 		color_fprintf(fp, COLOR_IFNAME, "%s ", ifname);
 	}
@@ -862,7 +862,7 @@ int print_route(struct nlmsghdr *n, void *arg)
 		print_rta_via(fp, tb[RTA_VIA]);
 
 	if (tb[RTA_OIF] && filter.oifmask != -1)
-		print_rta_if(fp, tb[RTA_OIF], "dev");
+		print_rta_ifidx(fp, rta_getattr_u32(tb[RTA_OIF]), "dev");
 
 	if (table && (table != RT_TABLE_MAIN || show_details > 0) && !filter.tb)
 		print_string(PRINT_ANY,
@@ -946,7 +946,7 @@ int print_route(struct nlmsghdr *n, void *arg)
 		print_rta_metrics(fp, tb[RTA_METRICS]);
 
 	if (tb[RTA_IIF] && filter.iifmask != -1)
-		print_rta_if(fp, tb[RTA_IIF], "iif");
+		print_rta_ifidx(fp, rta_getattr_u32(tb[RTA_IIF]), "iif");
 
 	if (tb[RTA_PREF])
 		print_rt_pref(fp, rta_getattr_u8(tb[RTA_PREF]));
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 02/11] ip: export print_rta_gateway version which outputs prepared gateway string
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 01/11] ip: print_rta_if takes ifindex as device argument instead of attribute Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it Nikolay Aleksandrov
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Export a new __print_rta_gateway that takes a prepared gateway string to
print which is also used by print_rta_gateway for consistent format.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ip_common.h |  1 +
 ip/iproute.c   | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/ip/ip_common.h b/ip/ip_common.h
index d3d50cbca74d..a02a3b96f7fd 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -169,6 +169,7 @@ int name_is_vrf(const char *name);
 void print_num(FILE *fp, unsigned int width, uint64_t count);
 void print_rt_flags(FILE *fp, unsigned int flags);
 void print_rta_ifidx(FILE *fp, __u32 ifidx, const char *prefix);
+void __print_rta_gateway(FILE *fp, unsigned char family, const char *gateway);
 void print_rta_gateway(FILE *fp, unsigned char family,
 		       const struct rtattr *rta);
 #endif /* _IP_COMMON_H_ */
diff --git a/ip/iproute.c b/ip/iproute.c
index f2bf4737b958..3c933df4dd29 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -547,13 +547,11 @@ static void print_rta_newdst(FILE *fp, const struct rtmsg *r,
 	}
 }
 
-void print_rta_gateway(FILE *fp, unsigned char family, const struct rtattr *rta)
+void __print_rta_gateway(FILE *fp, unsigned char family, const char *gateway)
 {
-	const char *gateway = format_host_rta(family, rta);
-
-	if (is_json_context())
+	if (is_json_context()) {
 		print_string(PRINT_JSON, "gateway", NULL, gateway);
-	else {
+	} else {
 		fprintf(fp, "via ");
 		print_color_string(PRINT_FP,
 				   ifa_family_color(family),
@@ -561,6 +559,13 @@ void print_rta_gateway(FILE *fp, unsigned char family, const struct rtattr *rta)
 	}
 }
 
+void print_rta_gateway(FILE *fp, unsigned char family, const struct rtattr *rta)
+{
+	const char *gateway = format_host_rta(family, rta);
+
+	__print_rta_gateway(fp, family, gateway);
+}
+
 static void print_rta_via(FILE *fp, const struct rtattr *rta)
 {
 	size_t len = RTA_PAYLOAD(rta) - 2;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 01/11] ip: print_rta_if takes ifindex as device argument instead of attribute Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 02/11] ip: export print_rta_gateway version which outputs prepared gateway string Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-30  3:33   ` David Ahern
  2021-09-29 15:28 ` [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure Nikolay Aleksandrov
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Add a structure which describes a nexthop with all of its properties and
a helper which parses a nhmsg into it. Note the LWT attribute is copied
because there are too many different types with their own structures,
having to follow them all for changes would be overkill. It's much better
to copy the attribtue and pass it for decoding to the lwt code which is
already well maintained.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++-
 ip/nh_common.h | 33 +++++++++++++++++
 2 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 ip/nh_common.h

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index a4048d803325..be8541476fa6 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -13,6 +13,7 @@
 
 #include "utils.h"
 #include "ip_common.h"
+#include "nh_common.h"
 
 static struct {
 	unsigned int flushed;
@@ -212,13 +213,20 @@ out:
 	return rc;
 }
 
+static bool __valid_nh_group_attr(const struct rtattr *g_attr)
+{
+	int num = RTA_PAYLOAD(g_attr) / sizeof(struct nexthop_grp);
+
+	return num && num * sizeof(struct nexthop_grp) == RTA_PAYLOAD(g_attr);
+}
+
 static void print_nh_group(FILE *fp, const struct rtattr *grps_attr)
 {
 	struct nexthop_grp *nhg = RTA_DATA(grps_attr);
 	int num = RTA_PAYLOAD(grps_attr) / sizeof(*nhg);
 	int i;
 
-	if (!num || num * sizeof(*nhg) != RTA_PAYLOAD(grps_attr)) {
+	if (!__valid_nh_group_attr(grps_attr)) {
 		fprintf(fp, "<invalid nexthop group>");
 		return;
 	}
@@ -328,6 +336,93 @@ static void print_nh_res_bucket(FILE *fp, const struct rtattr *res_bucket_attr)
 	close_json_object();
 }
 
+static void ipnh_destroy_entry(struct nh_entry *nhe)
+{
+	if (nhe->nh_encap)
+		free(nhe->nh_encap);
+	if (nhe->nh_groups)
+		free(nhe->nh_groups);
+}
+
+/* parse nhmsg into nexthop entry struct which must be destroyed by
+ * ipnh_destroy_enty when it's not needed anymore
+ */
+static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
+			    struct nh_entry *nhe)
+{
+	struct rtattr *tb[NHA_MAX+1];
+	int err = 0;
+
+	memset(nhe, 0, sizeof(*nhe));
+	parse_rtattr_flags(tb, NHA_MAX, RTM_NHA(nhm), len, NLA_F_NESTED);
+
+	if (tb[NHA_ID])
+		nhe->nh_id = rta_getattr_u32(tb[NHA_ID]);
+
+	if (tb[NHA_OIF])
+		nhe->nh_oif = rta_getattr_u32(tb[NHA_OIF]);
+
+	if (tb[NHA_GROUP_TYPE])
+		nhe->nh_grp_type = rta_getattr_u16(tb[NHA_GROUP_TYPE]);
+
+	if (tb[NHA_GATEWAY]) {
+		if (RTA_PAYLOAD(tb[NHA_GATEWAY]) > sizeof(nhe->nh_gateway)) {
+			fprintf(fp, "<nexthop id %u invalid gateway length %lu>\n",
+				nhe->nh_id, RTA_PAYLOAD(tb[NHA_GATEWAY]));
+			err = EINVAL;
+			goto out_err;
+		}
+		nhe->nh_gateway_len = RTA_PAYLOAD(tb[NHA_GATEWAY]);
+		memcpy(&nhe->nh_gateway, RTA_DATA(tb[NHA_GATEWAY]),
+		       RTA_PAYLOAD(tb[NHA_GATEWAY]));
+	}
+
+	if (tb[NHA_ENCAP]) {
+		nhe->nh_encap = malloc(RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
+		if (!nhe->nh_encap) {
+			err = ENOMEM;
+			goto out_err;
+		}
+		memcpy(nhe->nh_encap, tb[NHA_ENCAP],
+		       RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
+		memcpy(&nhe->nh_encap_type, tb[NHA_ENCAP_TYPE],
+		       sizeof(nhe->nh_encap_type));
+	}
+
+	if (tb[NHA_GROUP]) {
+		if (!__valid_nh_group_attr(tb[NHA_GROUP])) {
+			fprintf(fp, "<nexthop id %u invalid nexthop group>",
+				nhe->nh_id);
+			err = EINVAL;
+			goto out_err;
+		}
+
+		nhe->nh_groups = malloc(RTA_PAYLOAD(tb[NHA_GROUP]));
+		if (!nhe->nh_groups) {
+			err = ENOMEM;
+			goto out_err;
+		}
+		nhe->nh_groups_cnt = RTA_PAYLOAD(tb[NHA_GROUP]) /
+				     sizeof(struct nexthop_grp);
+		memcpy(nhe->nh_groups, RTA_DATA(tb[NHA_GROUP]),
+		       RTA_PAYLOAD(tb[NHA_GROUP]));
+	}
+
+	nhe->nh_blackhole = !!tb[NHA_BLACKHOLE];
+	nhe->nh_fdb = !!tb[NHA_FDB];
+
+	nhe->nh_family = nhm->nh_family;
+	nhe->nh_protocol = nhm->nh_protocol;
+	nhe->nh_scope = nhm->nh_scope;
+	nhe->nh_flags = nhm->nh_flags;
+
+	return 0;
+
+out_err:
+	ipnh_destroy_entry(nhe);
+	return err;
+}
+
 int print_nexthop(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
diff --git a/ip/nh_common.h b/ip/nh_common.h
new file mode 100644
index 000000000000..f2ff0e6532d3
--- /dev/null
+++ b/ip/nh_common.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __NH_COMMON_H__
+#define __NH_COMMON_H__ 1
+
+struct nh_entry {
+	__u32			nh_id;
+	__u32			nh_oif;
+	__u32			nh_flags;
+	__u16			nh_grp_type;
+	__u8			nh_family;
+	__u8			nh_scope;
+	__u8			nh_protocol;
+
+	bool			nh_blackhole;
+	bool			nh_fdb;
+
+	int			nh_gateway_len;
+	union {
+		__be32		ipv4;
+		struct in6_addr	ipv6;
+	}			nh_gateway;
+
+	struct rtattr		*nh_encap;
+	union {
+		struct rtattr	rta;
+		__u8		_buf[RTA_LENGTH(sizeof(__u16))];
+	}			nh_encap_type;
+
+	int			nh_groups_cnt;
+	struct nexthop_grp	*nh_groups;
+};
+
+#endif /* __NH_COMMON_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (2 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-30  3:35   ` David Ahern
  2021-09-29 15:28 ` [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing Nikolay Aleksandrov
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Add a structure which describes resilient nexthop groups and parse such
attributes into it.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 32 ++++++++++++++++++++++++++++++++
 ip/nh_common.h | 10 ++++++++++
 2 files changed, 42 insertions(+)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index be8541476fa6..9340d8941277 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -272,6 +272,33 @@ static void print_nh_group_type(FILE *fp, const struct rtattr *grp_type_attr)
 	print_string(PRINT_ANY, "type", "type %s ", nh_group_type_name(type));
 }
 
+static void parse_nh_res_group_rta(const struct rtattr *res_grp_attr,
+				   struct nha_res_grp *res_grp)
+{
+	struct rtattr *tb[NHA_RES_GROUP_MAX + 1];
+	struct rtattr *rta;
+
+	parse_rtattr_nested(tb, NHA_RES_GROUP_MAX, res_grp_attr);
+
+	if (tb[NHA_RES_GROUP_BUCKETS])
+		res_grp->buckets = rta_getattr_u16(tb[NHA_RES_GROUP_BUCKETS]);
+
+	if (tb[NHA_RES_GROUP_IDLE_TIMER]) {
+		rta = tb[NHA_RES_GROUP_IDLE_TIMER];
+		res_grp->idle_timer = rta_getattr_u32(rta);
+	}
+
+	if (tb[NHA_RES_GROUP_UNBALANCED_TIMER]) {
+		rta = tb[NHA_RES_GROUP_UNBALANCED_TIMER];
+		res_grp->unbalanced_timer = rta_getattr_u32(rta);
+	}
+
+	if (tb[NHA_RES_GROUP_UNBALANCED_TIME]) {
+		rta = tb[NHA_RES_GROUP_UNBALANCED_TIME];
+		res_grp->unbalanced_time = rta_getattr_u64(rta);
+	}
+}
+
 static void print_nh_res_group(FILE *fp, const struct rtattr *res_grp_attr)
 {
 	struct rtattr *tb[NHA_RES_GROUP_MAX + 1];
@@ -408,6 +435,11 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
 		       RTA_PAYLOAD(tb[NHA_GROUP]));
 	}
 
+	if (tb[NHA_RES_GROUP]) {
+		parse_nh_res_group_rta(tb[NHA_RES_GROUP], &nhe->nh_res_grp);
+		nhe->nh_has_res_grp = true;
+	}
+
 	nhe->nh_blackhole = !!tb[NHA_BLACKHOLE];
 	nhe->nh_fdb = !!tb[NHA_FDB];
 
diff --git a/ip/nh_common.h b/ip/nh_common.h
index f2ff0e6532d3..8c96f9993562 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -2,6 +2,13 @@
 #ifndef __NH_COMMON_H__
 #define __NH_COMMON_H__ 1
 
+struct nha_res_grp {
+	__u16			buckets;
+	__u32			idle_timer;
+	__u32			unbalanced_timer;
+	__u64			unbalanced_time;
+};
+
 struct nh_entry {
 	__u32			nh_id;
 	__u32			nh_oif;
@@ -26,6 +33,9 @@ struct nh_entry {
 		__u8		_buf[RTA_LENGTH(sizeof(__u16))];
 	}			nh_encap_type;
 
+	bool			nh_has_res_grp;
+	struct nha_res_grp	nh_res_grp;
+
 	int			nh_groups_cnt;
 	struct nexthop_grp	*nh_groups;
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (3 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-30  3:37   ` David Ahern
  2021-09-29 15:28 ` [RFC iproute2-next 06/11] ip: nexthop: pull ipnh_get_id rtnl talk into a helper Nikolay Aleksandrov
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Always parse nexthop group attributes into structure for printing. Nexthop
print helpers take structures and print them accordingly.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 117 +++++++++++++++++++++----------------------------
 1 file changed, 49 insertions(+), 68 deletions(-)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 9340d8941277..e334b852aa55 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -220,28 +220,22 @@ static bool __valid_nh_group_attr(const struct rtattr *g_attr)
 	return num && num * sizeof(struct nexthop_grp) == RTA_PAYLOAD(g_attr);
 }
 
-static void print_nh_group(FILE *fp, const struct rtattr *grps_attr)
+static void print_nh_group(const struct nh_entry *nhe)
 {
-	struct nexthop_grp *nhg = RTA_DATA(grps_attr);
-	int num = RTA_PAYLOAD(grps_attr) / sizeof(*nhg);
 	int i;
 
-	if (!__valid_nh_group_attr(grps_attr)) {
-		fprintf(fp, "<invalid nexthop group>");
-		return;
-	}
-
 	open_json_array(PRINT_JSON, "group");
 	print_string(PRINT_FP, NULL, "%s", "group ");
-	for (i = 0; i < num; ++i) {
+	for (i = 0; i < nhe->nh_groups_cnt; ++i) {
 		open_json_object(NULL);
 
 		if (i)
 			print_string(PRINT_FP, NULL, "%s", "/");
 
-		print_uint(PRINT_ANY, "id", "%u", nhg[i].id);
-		if (nhg[i].weight)
-			print_uint(PRINT_ANY, "weight", ",%u", nhg[i].weight + 1);
+		print_uint(PRINT_ANY, "id", "%u", nhe->nh_groups[i].id);
+		if (nhe->nh_groups[i].weight)
+			print_uint(PRINT_ANY, "weight", ",%u",
+				   nhe->nh_groups[i].weight + 1);
 
 		close_json_object();
 	}
@@ -261,15 +255,14 @@ static const char *nh_group_type_name(__u16 type)
 	}
 }
 
-static void print_nh_group_type(FILE *fp, const struct rtattr *grp_type_attr)
+static void print_nh_group_type(__u16 nh_grp_type)
 {
-	__u16 type = rta_getattr_u16(grp_type_attr);
-
-	if (type == NEXTHOP_GRP_TYPE_MPATH)
+	if (nh_grp_type == NEXTHOP_GRP_TYPE_MPATH)
 		/* Do not print type in order not to break existing output. */
 		return;
 
-	print_string(PRINT_ANY, "type", "type %s ", nh_group_type_name(type));
+	print_string(PRINT_ANY, "type", "type %s ",
+		     nh_group_type_name(nh_grp_type));
 }
 
 static void parse_nh_res_group_rta(const struct rtattr *res_grp_attr,
@@ -299,39 +292,22 @@ static void parse_nh_res_group_rta(const struct rtattr *res_grp_attr,
 	}
 }
 
-static void print_nh_res_group(FILE *fp, const struct rtattr *res_grp_attr)
+static void print_nh_res_group(const struct nha_res_grp *res_grp)
 {
-	struct rtattr *tb[NHA_RES_GROUP_MAX + 1];
-	struct rtattr *rta;
 	struct timeval tv;
 
-	parse_rtattr_nested(tb, NHA_RES_GROUP_MAX, res_grp_attr);
-
 	open_json_object("resilient_args");
 
-	if (tb[NHA_RES_GROUP_BUCKETS])
-		print_uint(PRINT_ANY, "buckets", "buckets %u ",
-			   rta_getattr_u16(tb[NHA_RES_GROUP_BUCKETS]));
+	print_uint(PRINT_ANY, "buckets", "buckets %u ", res_grp->buckets);
 
-	if (tb[NHA_RES_GROUP_IDLE_TIMER]) {
-		rta = tb[NHA_RES_GROUP_IDLE_TIMER];
-		__jiffies_to_tv(&tv, rta_getattr_u32(rta));
-		print_tv(PRINT_ANY, "idle_timer", "idle_timer %g ", &tv);
-	}
+	__jiffies_to_tv(&tv, res_grp->idle_timer);
+	print_tv(PRINT_ANY, "idle_timer", "idle_timer %g ", &tv);
 
-	if (tb[NHA_RES_GROUP_UNBALANCED_TIMER]) {
-		rta = tb[NHA_RES_GROUP_UNBALANCED_TIMER];
-		__jiffies_to_tv(&tv, rta_getattr_u32(rta));
-		print_tv(PRINT_ANY, "unbalanced_timer", "unbalanced_timer %g ",
-			 &tv);
-	}
+	__jiffies_to_tv(&tv, res_grp->unbalanced_timer);
+	print_tv(PRINT_ANY, "unbalanced_timer", "unbalanced_timer %g ", &tv);
 
-	if (tb[NHA_RES_GROUP_UNBALANCED_TIME]) {
-		rta = tb[NHA_RES_GROUP_UNBALANCED_TIME];
-		__jiffies_to_tv(&tv, rta_getattr_u32(rta));
-		print_tv(PRINT_ANY, "unbalanced_time", "unbalanced_time %g ",
-			 &tv);
-	}
+	__jiffies_to_tv(&tv, res_grp->unbalanced_time);
+	print_tv(PRINT_ANY, "unbalanced_time", "unbalanced_time %g ", &tv);
 
 	close_json_object();
 }
@@ -458,9 +434,9 @@ out_err:
 int print_nexthop(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
-	struct rtattr *tb[NHA_MAX+1];
 	FILE *fp = (FILE *)arg;
-	int len;
+	struct nh_entry nhe;
+	int len, err;
 
 	SPRINT_BUF(b1);
 
@@ -481,56 +457,61 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	if (filter.proto && filter.proto != nhm->nh_protocol)
 		return 0;
 
-	parse_rtattr_flags(tb, NHA_MAX, RTM_NHA(nhm), len, NLA_F_NESTED);
-
+	err = parse_nexthop_rta(fp, nhm, len, &nhe);
+	if (err) {
+		close_json_object();
+		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(err));
+		return -1;
+	}
 	open_json_object(NULL);
 
 	if (n->nlmsg_type == RTM_DELNEXTHOP)
 		print_bool(PRINT_ANY, "deleted", "Deleted ", true);
 
-	if (tb[NHA_ID])
-		print_uint(PRINT_ANY, "id", "id %u ",
-			   rta_getattr_u32(tb[NHA_ID]));
+	print_uint(PRINT_ANY, "id", "id %u ", nhe.nh_id);
 
-	if (tb[NHA_GROUP])
-		print_nh_group(fp, tb[NHA_GROUP]);
+	if (nhe.nh_groups)
+		print_nh_group(&nhe);
 
-	if (tb[NHA_GROUP_TYPE])
-		print_nh_group_type(fp, tb[NHA_GROUP_TYPE]);
+	print_nh_group_type(nhe.nh_grp_type);
 
-	if (tb[NHA_RES_GROUP])
-		print_nh_res_group(fp, tb[NHA_RES_GROUP]);
+	if (nhe.nh_has_res_grp)
+		print_nh_res_group(&nhe.nh_res_grp);
 
-	if (tb[NHA_ENCAP])
-		lwt_print_encap(fp, tb[NHA_ENCAP_TYPE], tb[NHA_ENCAP]);
+	if (nhe.nh_encap)
+		lwt_print_encap(fp, &nhe.nh_encap_type.rta, nhe.nh_encap);
 
-	if (tb[NHA_GATEWAY])
-		print_rta_gateway(fp, nhm->nh_family, tb[NHA_GATEWAY]);
+	if (nhe.nh_gateway_len)
+		__print_rta_gateway(fp, nhe.nh_family,
+				    format_host(nhe.nh_family,
+						nhe.nh_gateway_len,
+						&nhe.nh_gateway));
 
-	if (tb[NHA_OIF])
-		print_rta_ifidx(fp, rta_getattr_u32(tb[NHA_OIF]), "dev");
+	if (nhe.nh_oif)
+		print_rta_ifidx(fp, nhe.nh_oif, "dev");
 
-	if (nhm->nh_scope != RT_SCOPE_UNIVERSE || show_details > 0) {
+	if (nhe.nh_scope != RT_SCOPE_UNIVERSE || show_details > 0) {
 		print_string(PRINT_ANY, "scope", "scope %s ",
-			     rtnl_rtscope_n2a(nhm->nh_scope, b1, sizeof(b1)));
+			     rtnl_rtscope_n2a(nhe.nh_scope, b1, sizeof(b1)));
 	}
 
-	if (tb[NHA_BLACKHOLE])
+	if (nhe.nh_blackhole)
 		print_null(PRINT_ANY, "blackhole", "blackhole ", NULL);
 
-	if (nhm->nh_protocol != RTPROT_UNSPEC || show_details > 0) {
+	if (nhe.nh_protocol != RTPROT_UNSPEC || show_details > 0) {
 		print_string(PRINT_ANY, "protocol", "proto %s ",
-			     rtnl_rtprot_n2a(nhm->nh_protocol, b1, sizeof(b1)));
+			     rtnl_rtprot_n2a(nhe.nh_protocol, b1, sizeof(b1)));
 	}
 
-	print_rt_flags(fp, nhm->nh_flags);
+	print_rt_flags(fp, nhe.nh_flags);
 
-	if (tb[NHA_FDB])
+	if (nhe.nh_fdb)
 		print_null(PRINT_ANY, "fdb", "fdb", NULL);
 
 	print_string(PRINT_FP, NULL, "%s", "\n");
 	close_json_object();
 	fflush(fp);
+	destroy_nexthop_entry(&nhe);
 
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 06/11] ip: nexthop: pull ipnh_get_id rtnl talk into a helper
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (4 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 07/11] ip: nexthop: add cache helpers Nikolay Aleksandrov
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Pull ipnh_get_id's rtnl talk portion into a separate helper which will
be reused later to retrieve nexthops for caching.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index e334b852aa55..0a08230fc278 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -431,6 +431,25 @@ out_err:
 	return err;
 }
 
+static int  __ipnh_get_id(struct rtnl_handle *rthp, __u32 nh_id,
+			  struct nlmsghdr **answer)
+{
+	struct {
+		struct nlmsghdr n;
+		struct nhmsg	nhm;
+		char		buf[1024];
+	} req = {
+		.n.nlmsg_len	= NLMSG_LENGTH(sizeof(struct nhmsg)),
+		.n.nlmsg_flags	= NLM_F_REQUEST,
+		.n.nlmsg_type	= RTM_GETNEXTHOP,
+		.nhm.nh_family	= preferred_family,
+	};
+
+	addattr32(&req.n, sizeof(req), NHA_ID, nh_id);
+
+	return rtnl_talk(rthp, &req.n, answer);
+}
+
 int print_nexthop(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
@@ -820,21 +839,9 @@ static int ipnh_modify(int cmd, unsigned int flags, int argc, char **argv)
 
 static int ipnh_get_id(__u32 id)
 {
-	struct {
-		struct nlmsghdr	n;
-		struct nhmsg	nhm;
-		char		buf[1024];
-	} req = {
-		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct nhmsg)),
-		.n.nlmsg_flags = NLM_F_REQUEST,
-		.n.nlmsg_type  = RTM_GETNEXTHOP,
-		.nhm.nh_family = preferred_family,
-	};
 	struct nlmsghdr *answer;
 
-	addattr32(&req.n, sizeof(req), NHA_ID, id);
-
-	if (rtnl_talk(&rth, &req.n, &answer) < 0)
+	if (__ipnh_get_id(&rth, id, &answer) < 0)
 		return -2;
 
 	new_json_obj(json);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 07/11] ip: nexthop: add cache helpers
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (5 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 06/11] ip: nexthop: pull ipnh_get_id rtnl talk into a helper Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-30  3:39   ` David Ahern
  2021-09-29 15:28 ` [RFC iproute2-next 08/11] ip: nexthop: factor out entry printing Nikolay Aleksandrov
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Add a static nexthop cache in a hash with 1024 buckets and helpers to
manage it (link, unlink, find, add nexthop, del nexthop). Adding new
nexthops is done by creating a new rtnl handle and using it to retrieve
the nexthop so the helper is safe to use while already reading a
response (i.e. using the global rth).

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 112 +++++++++++++++++++++++++++++++++++++++++++++----
 ip/nh_common.h |   6 +++
 2 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 0a08230fc278..6e5ea47ac927 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -34,6 +34,8 @@ enum {
 #define RTM_NHA(h)  ((struct rtattr *)(((char *)(h)) + \
 			NLMSG_ALIGN(sizeof(struct nhmsg))))
 
+static struct hlist_head nh_cache[NH_CACHE_SIZE];
+
 static void usage(void) __attribute__((noreturn));
 
 static void usage(void)
@@ -347,6 +349,41 @@ static void ipnh_destroy_entry(struct nh_entry *nhe)
 		free(nhe->nh_groups);
 }
 
+static struct hlist_head *ipnh_cache_head(__u32 nh_id)
+{
+	nh_id ^= nh_id >> 20;
+	nh_id ^= nh_id >> 10;
+
+	return &nh_cache[nh_id % NH_CACHE_SIZE];
+}
+
+static void ipnh_cache_link_entry(struct nh_entry *nhe)
+{
+	struct hlist_head *head = ipnh_cache_head(nhe->nh_id);
+
+	hlist_add_head(&nhe->nh_hash, head);
+}
+
+static void ipnh_cache_unlink_entry(struct nh_entry *nhe)
+{
+	hlist_del(&nhe->nh_hash);
+}
+
+static struct nh_entry *ipnh_cache_get(__u32 nh_id)
+{
+	struct hlist_head *head = ipnh_cache_head(nh_id);
+	struct nh_entry *nhe;
+	struct hlist_node *n;
+
+	hlist_for_each(n, head) {
+		nhe = container_of(n, struct nh_entry, nh_hash);
+		if (nhe->nh_id == nh_id)
+			return nhe;
+	}
+
+	return NULL;
+}
+
 /* parse nhmsg into nexthop entry struct which must be destroyed by
  * ipnh_destroy_enty when it's not needed anymore
  */
@@ -372,7 +409,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
 		if (RTA_PAYLOAD(tb[NHA_GATEWAY]) > sizeof(nhe->nh_gateway)) {
 			fprintf(fp, "<nexthop id %u invalid gateway length %lu>\n",
 				nhe->nh_id, RTA_PAYLOAD(tb[NHA_GATEWAY]));
-			err = EINVAL;
+			err = -EINVAL;
 			goto out_err;
 		}
 		nhe->nh_gateway_len = RTA_PAYLOAD(tb[NHA_GATEWAY]);
@@ -383,7 +420,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
 	if (tb[NHA_ENCAP]) {
 		nhe->nh_encap = malloc(RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
 		if (!nhe->nh_encap) {
-			err = ENOMEM;
+			err = -ENOMEM;
 			goto out_err;
 		}
 		memcpy(nhe->nh_encap, tb[NHA_ENCAP],
@@ -396,13 +433,13 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
 		if (!__valid_nh_group_attr(tb[NHA_GROUP])) {
 			fprintf(fp, "<nexthop id %u invalid nexthop group>",
 				nhe->nh_id);
-			err = EINVAL;
+			err = -EINVAL;
 			goto out_err;
 		}
 
 		nhe->nh_groups = malloc(RTA_PAYLOAD(tb[NHA_GROUP]));
 		if (!nhe->nh_groups) {
-			err = ENOMEM;
+			err = -ENOMEM;
 			goto out_err;
 		}
 		nhe->nh_groups_cnt = RTA_PAYLOAD(tb[NHA_GROUP]) /
@@ -450,6 +487,67 @@ static int  __ipnh_get_id(struct rtnl_handle *rthp, __u32 nh_id,
 	return rtnl_talk(rthp, &req.n, answer);
 }
 
+static int __ipnh_cache_parse_nlmsg(const struct nlmsghdr *n,
+				    struct nh_entry *nhe)
+{
+	int err, len;
+
+	len = n->nlmsg_len - NLMSG_SPACE(sizeof(struct nhmsg));
+	if (len < 0) {
+		fprintf(stderr, "BUG: wrong nlmsg len %d\n", len);
+		return -EINVAL;
+	}
+
+	err = ipnh_parse_nhmsg(stderr, NLMSG_DATA(n), len, nhe);
+	if (err) {
+		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(-err));
+		return err;
+	}
+
+	return 0;
+}
+
+static struct nh_entry *ipnh_cache_add(__u32 nh_id)
+{
+	struct rtnl_handle cache_rth = { .fd = -1 };
+	struct nlmsghdr *answer = NULL;
+	struct nh_entry *nhe = NULL;
+
+	if (rtnl_open(&cache_rth, 0) < 0)
+		goto out;
+
+	if (__ipnh_get_id(&cache_rth, nh_id, &answer) < 0)
+		goto out;
+
+	nhe = malloc(sizeof(*nhe));
+	if (!nhe)
+		goto out;
+
+	if (__ipnh_cache_parse_nlmsg(answer, nhe))
+		goto out_free_nhe;
+
+	ipnh_cache_link_entry(nhe);
+
+out:
+	if (answer)
+		free(answer);
+	rtnl_close(&cache_rth);
+
+	return nhe;
+
+out_free_nhe:
+	free(nhe);
+	nhe = NULL;
+	goto out;
+}
+
+static void ipnh_cache_del(struct nh_entry *nhe)
+{
+	ipnh_cache_unlink_entry(nhe);
+	ipnh_destroy_entry(nhe);
+	free(nhe);
+}
+
 int print_nexthop(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
@@ -476,10 +574,10 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	if (filter.proto && filter.proto != nhm->nh_protocol)
 		return 0;
 
-	err = parse_nexthop_rta(fp, nhm, len, &nhe);
+	err = ipnh_parse_nhmsg(fp, nhm, len, &nhe);
 	if (err) {
 		close_json_object();
-		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(err));
+		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(-err));
 		return -1;
 	}
 	open_json_object(NULL);
@@ -530,7 +628,7 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	print_string(PRINT_FP, NULL, "%s", "\n");
 	close_json_object();
 	fflush(fp);
-	destroy_nexthop_entry(&nhe);
+	ipnh_destroy_entry(&nhe);
 
 	return 0;
 }
diff --git a/ip/nh_common.h b/ip/nh_common.h
index 8c96f9993562..a34b0d20916e 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -2,6 +2,10 @@
 #ifndef __NH_COMMON_H__
 #define __NH_COMMON_H__ 1
 
+#include <list.h>
+
+#define NH_CACHE_SIZE		1024
+
 struct nha_res_grp {
 	__u16			buckets;
 	__u32			idle_timer;
@@ -10,6 +14,8 @@ struct nha_res_grp {
 };
 
 struct nh_entry {
+	struct hlist_node	nh_hash;
+
 	__u32			nh_id;
 	__u32			nh_oif;
 	__u32			nh_flags;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 08/11] ip: nexthop: factor out entry printing
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (6 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 07/11] ip: nexthop: add cache helpers Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 09/11] ip: nexthop: add a helper which retrieves and prints cached nh entry Nikolay Aleksandrov
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Factor out nexthop entry printing into a separate function.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 103 ++++++++++++++++++++++++++-----------------------
 1 file changed, 55 insertions(+), 48 deletions(-)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 6e5ea47ac927..37b94d6702df 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -548,6 +548,60 @@ static void ipnh_cache_del(struct nh_entry *nhe)
 	free(nhe);
 }
 
+static void __print_nexthop_entry(FILE *fp, const char *jsobj,
+				  struct nh_entry *nhe,
+				  bool deleted)
+{
+	SPRINT_BUF(b1);
+
+	open_json_object(jsobj);
+
+	if (deleted)
+		print_bool(PRINT_ANY, "deleted", "Deleted ", true);
+
+	print_uint(PRINT_ANY, "id", "id %u ", nhe->nh_id);
+
+	if (nhe->nh_groups)
+		print_nh_group(nhe);
+
+	print_nh_group_type(nhe->nh_grp_type);
+
+	if (nhe->nh_has_res_grp)
+		print_nh_res_group(&nhe->nh_res_grp);
+
+	if (nhe->nh_encap)
+		lwt_print_encap(fp, &nhe->nh_encap_type.rta, nhe->nh_encap);
+
+	if (nhe->nh_gateway_len)
+		__print_rta_gateway(fp, nhe->nh_family,
+				    format_host(nhe->nh_family,
+						nhe->nh_gateway_len,
+						&nhe->nh_gateway));
+
+	if (nhe->nh_oif)
+		print_rta_ifidx(fp, nhe->nh_oif, "dev");
+
+	if (nhe->nh_scope != RT_SCOPE_UNIVERSE || show_details > 0) {
+		print_string(PRINT_ANY, "scope", "scope %s ",
+			     rtnl_rtscope_n2a(nhe->nh_scope, b1, sizeof(b1)));
+	}
+
+	if (nhe->nh_blackhole)
+		print_null(PRINT_ANY, "blackhole", "blackhole ", NULL);
+
+	if (nhe->nh_protocol != RTPROT_UNSPEC || show_details > 0) {
+		print_string(PRINT_ANY, "protocol", "proto %s ",
+			     rtnl_rtprot_n2a(nhe->nh_protocol, b1, sizeof(b1)));
+	}
+
+	print_rt_flags(fp, nhe->nh_flags);
+
+	if (nhe->nh_fdb)
+		print_null(PRINT_ANY, "fdb", "fdb", NULL);
+
+	close_json_object();
+}
+
 int print_nexthop(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
@@ -555,8 +609,6 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	struct nh_entry nhe;
 	int len, err;
 
-	SPRINT_BUF(b1);
-
 	if (n->nlmsg_type != RTM_DELNEXTHOP &&
 	    n->nlmsg_type != RTM_NEWNEXTHOP) {
 		fprintf(stderr, "Not a nexthop: %08x %08x %08x\n",
@@ -580,53 +632,8 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(-err));
 		return -1;
 	}
-	open_json_object(NULL);
-
-	if (n->nlmsg_type == RTM_DELNEXTHOP)
-		print_bool(PRINT_ANY, "deleted", "Deleted ", true);
-
-	print_uint(PRINT_ANY, "id", "id %u ", nhe.nh_id);
-
-	if (nhe.nh_groups)
-		print_nh_group(&nhe);
-
-	print_nh_group_type(nhe.nh_grp_type);
-
-	if (nhe.nh_has_res_grp)
-		print_nh_res_group(&nhe.nh_res_grp);
-
-	if (nhe.nh_encap)
-		lwt_print_encap(fp, &nhe.nh_encap_type.rta, nhe.nh_encap);
-
-	if (nhe.nh_gateway_len)
-		__print_rta_gateway(fp, nhe.nh_family,
-				    format_host(nhe.nh_family,
-						nhe.nh_gateway_len,
-						&nhe.nh_gateway));
-
-	if (nhe.nh_oif)
-		print_rta_ifidx(fp, nhe.nh_oif, "dev");
-
-	if (nhe.nh_scope != RT_SCOPE_UNIVERSE || show_details > 0) {
-		print_string(PRINT_ANY, "scope", "scope %s ",
-			     rtnl_rtscope_n2a(nhe.nh_scope, b1, sizeof(b1)));
-	}
-
-	if (nhe.nh_blackhole)
-		print_null(PRINT_ANY, "blackhole", "blackhole ", NULL);
-
-	if (nhe.nh_protocol != RTPROT_UNSPEC || show_details > 0) {
-		print_string(PRINT_ANY, "protocol", "proto %s ",
-			     rtnl_rtprot_n2a(nhe.nh_protocol, b1, sizeof(b1)));
-	}
-
-	print_rt_flags(fp, nhe.nh_flags);
-
-	if (nhe.nh_fdb)
-		print_null(PRINT_ANY, "fdb", "fdb", NULL);
-
+	__print_nexthop_entry(fp, NULL, &nhe, n->nlmsg_type == RTM_DELNEXTHOP);
 	print_string(PRINT_FP, NULL, "%s", "\n");
-	close_json_object();
 	fflush(fp);
 	ipnh_destroy_entry(&nhe);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 09/11] ip: nexthop: add a helper which retrieves and prints cached nh entry
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (7 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 08/11] ip: nexthop: factor out entry printing Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 10/11] ip: route: print and cache detailed nexthop information when requested Nikolay Aleksandrov
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Add a helper which looks for a nexthop in the cache and if not found
reads the entry from the kernel and caches it. Finally the entry is
printed.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ipnexthop.c | 16 ++++++++++++++++
 ip/nh_common.h |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 37b94d6702df..fdd0d0926630 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -640,6 +640,22 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	return 0;
 }
 
+void print_cache_nexthop_id(FILE *fp, const char *fp_prefix, const char *jsobj,
+			    __u32 nh_id)
+{
+	struct nh_entry *nhe = ipnh_cache_get(nh_id);
+
+	if (!nhe) {
+		nhe = ipnh_cache_add(nh_id);
+		if (!nhe)
+			return;
+	}
+
+	if (fp_prefix)
+		print_string(PRINT_FP, NULL, "%s", fp_prefix);
+	__print_nexthop_entry(fp, jsobj, nhe, false);
+}
+
 int print_nexthop_bucket(struct nlmsghdr *n, void *arg)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
diff --git a/ip/nh_common.h b/ip/nh_common.h
index a34b0d20916e..a672d658a9ea 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -46,4 +46,7 @@ struct nh_entry {
 	struct nexthop_grp	*nh_groups;
 };
 
+void print_cache_nexthop_id(FILE *fp, const char *fp_prefix, const char *jsobj,
+			    __u32 nh_id);
+
 #endif /* __NH_COMMON_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 10/11] ip: route: print and cache detailed nexthop information when requested
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (8 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 09/11] ip: nexthop: add a helper which retrieves and prints cached nh entry Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-29 15:28 ` [RFC iproute2-next 11/11] ip: nexthop: add print_cache_nexthop which prints and manages the nh cache Nikolay Aleksandrov
  2021-09-30  3:42 ` [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info David Ahern
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

If -d (show_details) is used when printing/monitoring routes then print
detailed nexthop information in the field "nh_info". The nexthop is also
cached for future searches.

Output looks like:
 unicast 198.51.100.0/24 nhid 103 table 3 proto boot scope global
	 nh_info id 103 group 101/102 type resilient buckets 512 idle_timer 0 unbalanced_timer 0 unbalanced_time 0 scope global proto unspec
	 nexthop via 169.254.2.22 dev veth2 weight 1
	 nexthop via 169.254.3.23 dev veth4 weight 1

The nh_info field has the same format as ip -d nexthop show would've had
for the same nexthop id.

For completeness the JSON version looks like:
 {
        "type": "unicast",
        "dst": "198.51.100.0/24",
        "nhid": 103,
        "table": "3",
        "protocol": "boot",
        "scope": "global",
        "flags": [ ],
        "nh_info": {
            "id": 103,
            "group": [ {
                    "id": 101
                },{
                    "id": 102
                } ],
            "type": "resilient",
            "resilient_args": {
                "buckets": 512,
                "idle_timer": 0,
                "unbalanced_timer": 0,
                "unbalanced_time": 0
            },
            "scope": "global",
            "protocol": "unspec",
            "flags": [ ]
        },
        "nexthops": [ {
                "gateway": "169.254.2.22",
                "dev": "veth2",
                "weight": 1,
                "flags": [ ]
            },{
                "gateway": "169.254.3.23",
                "dev": "veth4",
                "weight": 1,
                "flags": [ ]
            } ]
 }

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/iproute.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/ip/iproute.c b/ip/iproute.c
index 3c933df4dd29..8532b5ce315e 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -28,6 +28,7 @@
 #include "rt_names.h"
 #include "utils.h"
 #include "ip_common.h"
+#include "nh_common.h"
 
 #ifndef RTAX_RTTVAR
 #define RTAX_RTTVAR RTAX_HOPS
@@ -968,6 +969,10 @@ int print_route(struct nlmsghdr *n, void *arg)
 				     propagate ? "enabled" : "disabled");
 	}
 
+	if (tb[RTA_NH_ID] && show_details)
+		print_cache_nexthop_id(fp, "\n\tnh_info ", "nh_info",
+				       rta_getattr_u32(tb[RTA_NH_ID]));
+
 	if (tb[RTA_MULTIPATH])
 		print_rta_multipath(fp, r, tb[RTA_MULTIPATH]);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC iproute2-next 11/11] ip: nexthop: add print_cache_nexthop which prints and manages the nh cache
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (9 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 10/11] ip: route: print and cache detailed nexthop information when requested Nikolay Aleksandrov
@ 2021-09-29 15:28 ` Nikolay Aleksandrov
  2021-09-30  3:42 ` [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info David Ahern
  11 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-29 15:28 UTC (permalink / raw)
  To: netdev; +Cc: roopa, donaldsharp72, dsahern, idosch, Nikolay Aleksandrov

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Add a new helper print_cache_nexthop replacing print_nexthop which can
update the nexthop cache if the process_cache argument is true. It is
used when monitoring netlink messages to keep the nexthop cache up to
date with nexthop changes happening. For the old callers and anyone
who's just dumping nexthops its _nocache version is used which is a
wrapper for print_cache_nexthop.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
 ip/ip_common.h |  1 -
 ip/ipmonitor.c |  3 ++-
 ip/ipnexthop.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++----
 ip/nh_common.h |  1 +
 4 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/ip/ip_common.h b/ip/ip_common.h
index a02a3b96f7fd..ea04c8ff3dea 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -53,7 +53,6 @@ int print_prefix(struct nlmsghdr *n, void *arg);
 int print_rule(struct nlmsghdr *n, void *arg);
 int print_netconf(struct rtnl_ctrl_data *ctrl,
 		  struct nlmsghdr *n, void *arg);
-int print_nexthop(struct nlmsghdr *n, void *arg);
 int print_nexthop_bucket(struct nlmsghdr *n, void *arg);
 void netns_map_init(void);
 void netns_nsid_socket_init(void);
diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index ab1af2ebd6df..0badda4e7812 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -22,6 +22,7 @@
 
 #include "utils.h"
 #include "ip_common.h"
+#include "nh_common.h"
 
 static void usage(void) __attribute__((noreturn));
 static int prefix_banner;
@@ -88,7 +89,7 @@ static int accept_msg(struct rtnl_ctrl_data *ctrl,
 	case RTM_NEWNEXTHOP:
 	case RTM_DELNEXTHOP:
 		print_headers(fp, "[NEXTHOP]", ctrl);
-		print_nexthop(n, arg);
+		print_cache_nexthop(n, arg, true);
 		return 0;
 
 	case RTM_NEWNEXTHOPBUCKET:
diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index fdd0d0926630..43dc238c7725 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -602,7 +602,43 @@ static void __print_nexthop_entry(FILE *fp, const char *jsobj,
 	close_json_object();
 }
 
-int print_nexthop(struct nlmsghdr *n, void *arg)
+/* update, add or delete a nexthop entry based on nlmsghdr */
+static int ipnh_cache_process_nlmsg(const struct nlmsghdr *n,
+				    struct nh_entry *new_nhe)
+{
+	struct nh_entry *nhe;
+
+	nhe = ipnh_cache_get(new_nhe->nh_id);
+	switch (n->nlmsg_type) {
+	case RTM_DELNEXTHOP:
+		if (nhe)
+			ipnh_cache_del(nhe);
+		ipnh_destroy_entry(new_nhe);
+		break;
+	case RTM_NEWNEXTHOP:
+		if (!nhe) {
+			nhe = malloc(sizeof(*nhe));
+			if (!nhe) {
+				ipnh_destroy_entry(new_nhe);
+				return -1;
+			}
+		} else {
+			/* this allows us to save 1 allocation on updates by
+			 * reusing the old nh entry, but we need to cleanup its
+			 * internal storage
+			 */
+			ipnh_cache_unlink_entry(nhe);
+			ipnh_destroy_entry(nhe);
+		}
+		memcpy(nhe, new_nhe, sizeof(*nhe));
+		ipnh_cache_link_entry(nhe);
+		break;
+	}
+
+	return 0;
+}
+
+int print_cache_nexthop(struct nlmsghdr *n, void *arg, bool process_cache)
 {
 	struct nhmsg *nhm = NLMSG_DATA(n);
 	FILE *fp = (FILE *)arg;
@@ -635,11 +671,20 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
 	__print_nexthop_entry(fp, NULL, &nhe, n->nlmsg_type == RTM_DELNEXTHOP);
 	print_string(PRINT_FP, NULL, "%s", "\n");
 	fflush(fp);
-	ipnh_destroy_entry(&nhe);
+
+	if (process_cache)
+		ipnh_cache_process_nlmsg(n, &nhe);
+	else
+		ipnh_destroy_entry(&nhe);
 
 	return 0;
 }
 
+static int print_nexthop_nocache(struct nlmsghdr *n, void *arg)
+{
+	return print_cache_nexthop(n, arg, false);
+}
+
 void print_cache_nexthop_id(FILE *fp, const char *fp_prefix, const char *jsobj,
 			    __u32 nh_id)
 {
@@ -967,7 +1012,7 @@ static int ipnh_get_id(__u32 id)
 
 	new_json_obj(json);
 
-	if (print_nexthop(answer, (void *)stdout) < 0) {
+	if (print_nexthop_nocache(answer, (void *)stdout) < 0) {
 		free(answer);
 		return -1;
 	}
@@ -1052,7 +1097,7 @@ static int ipnh_list_flush(int argc, char **argv, int action)
 
 	new_json_obj(json);
 
-	if (rtnl_dump_filter(&rth, print_nexthop, stdout) < 0) {
+	if (rtnl_dump_filter(&rth, print_nexthop_nocache, stdout) < 0) {
 		fprintf(stderr, "Dump terminated\n");
 		return -2;
 	}
diff --git a/ip/nh_common.h b/ip/nh_common.h
index a672d658a9ea..2ad8b134f891 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -48,5 +48,6 @@ struct nh_entry {
 
 void print_cache_nexthop_id(FILE *fp, const char *fp_prefix, const char *jsobj,
 			    __u32 nh_id);
+int print_cache_nexthop(struct nlmsghdr *n, void *arg, bool process_cache);
 
 #endif /* __NH_COMMON_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it
  2021-09-29 15:28 ` [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it Nikolay Aleksandrov
@ 2021-09-30  3:33   ` David Ahern
  0 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2021-09-30  3:33 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev
  Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
> @@ -328,6 +336,93 @@ static void print_nh_res_bucket(FILE *fp, const struct rtattr *res_bucket_attr)
>  	close_json_object();
>  }
>  
> +static void ipnh_destroy_entry(struct nh_entry *nhe)
> +{
> +	if (nhe->nh_encap)
> +		free(nhe->nh_encap);
> +	if (nhe->nh_groups)
> +		free(nhe->nh_groups);
> +}
> +
> +/* parse nhmsg into nexthop entry struct which must be destroyed by
> + * ipnh_destroy_enty when it's not needed anymore
> + */

I'd rather not have 2 functions interpreting the attributes. You should
be able to get print_nexthop to use the parse function and then print
using the nh_entry.


> +static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
> +			    struct nh_entry *nhe)
> +{
> +	struct rtattr *tb[NHA_MAX+1];
> +	int err = 0;
> +
> +	memset(nhe, 0, sizeof(*nhe));
> +	parse_rtattr_flags(tb, NHA_MAX, RTM_NHA(nhm), len, NLA_F_NESTED);
> +
> +	if (tb[NHA_ID])
> +		nhe->nh_id = rta_getattr_u32(tb[NHA_ID]);
> +
> +	if (tb[NHA_OIF])
> +		nhe->nh_oif = rta_getattr_u32(tb[NHA_OIF]);
> +
> +	if (tb[NHA_GROUP_TYPE])
> +		nhe->nh_grp_type = rta_getattr_u16(tb[NHA_GROUP_TYPE]);
> +
> +	if (tb[NHA_GATEWAY]) {
> +		if (RTA_PAYLOAD(tb[NHA_GATEWAY]) > sizeof(nhe->nh_gateway)) {
> +			fprintf(fp, "<nexthop id %u invalid gateway length %lu>\n",
> +				nhe->nh_id, RTA_PAYLOAD(tb[NHA_GATEWAY]));
> +			err = EINVAL;
> +			goto out_err;
> +		}
> +		nhe->nh_gateway_len = RTA_PAYLOAD(tb[NHA_GATEWAY]);
> +		memcpy(&nhe->nh_gateway, RTA_DATA(tb[NHA_GATEWAY]),
> +		       RTA_PAYLOAD(tb[NHA_GATEWAY]));
> +	}
> +
> +	if (tb[NHA_ENCAP]) {
> +		nhe->nh_encap = malloc(RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
> +		if (!nhe->nh_encap) {
> +			err = ENOMEM;
> +			goto out_err;
> +		}
> +		memcpy(nhe->nh_encap, tb[NHA_ENCAP],
> +		       RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
> +		memcpy(&nhe->nh_encap_type, tb[NHA_ENCAP_TYPE],
> +		       sizeof(nhe->nh_encap_type));
> +	}
> +
> +	if (tb[NHA_GROUP]) {
> +		if (!__valid_nh_group_attr(tb[NHA_GROUP])) {
> +			fprintf(fp, "<nexthop id %u invalid nexthop group>",
> +				nhe->nh_id);
> +			err = EINVAL;
> +			goto out_err;
> +		}
> +
> +		nhe->nh_groups = malloc(RTA_PAYLOAD(tb[NHA_GROUP]));
> +		if (!nhe->nh_groups) {
> +			err = ENOMEM;
> +			goto out_err;
> +		}
> +		nhe->nh_groups_cnt = RTA_PAYLOAD(tb[NHA_GROUP]) /
> +				     sizeof(struct nexthop_grp);
> +		memcpy(nhe->nh_groups, RTA_DATA(tb[NHA_GROUP]),
> +		       RTA_PAYLOAD(tb[NHA_GROUP]));
> +	}
> +
> +	nhe->nh_blackhole = !!tb[NHA_BLACKHOLE];
> +	nhe->nh_fdb = !!tb[NHA_FDB];
> +
> +	nhe->nh_family = nhm->nh_family;
> +	nhe->nh_protocol = nhm->nh_protocol;
> +	nhe->nh_scope = nhm->nh_scope;
> +	nhe->nh_flags = nhm->nh_flags;
> +
> +	return 0;
> +
> +out_err:
> +	ipnh_destroy_entry(nhe);
> +	return err;
> +}
> +



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure
  2021-09-29 15:28 ` [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure Nikolay Aleksandrov
@ 2021-09-30  3:35   ` David Ahern
  0 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2021-09-30  3:35 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev
  Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
> diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
> index be8541476fa6..9340d8941277 100644
> --- a/ip/ipnexthop.c
> +++ b/ip/ipnexthop.c
> @@ -272,6 +272,33 @@ static void print_nh_group_type(FILE *fp, const struct rtattr *grp_type_attr)
>  	print_string(PRINT_ANY, "type", "type %s ", nh_group_type_name(type));
>  }
>  
> +static void parse_nh_res_group_rta(const struct rtattr *res_grp_attr,
> +				   struct nha_res_grp *res_grp)
> +{
> +	struct rtattr *tb[NHA_RES_GROUP_MAX + 1];
> +	struct rtattr *rta;
> +
> +	parse_rtattr_nested(tb, NHA_RES_GROUP_MAX, res_grp_attr);
> +
> +	if (tb[NHA_RES_GROUP_BUCKETS])
> +		res_grp->buckets = rta_getattr_u16(tb[NHA_RES_GROUP_BUCKETS]);
> +
> +	if (tb[NHA_RES_GROUP_IDLE_TIMER]) {
> +		rta = tb[NHA_RES_GROUP_IDLE_TIMER];
> +		res_grp->idle_timer = rta_getattr_u32(rta);
> +	}
> +
> +	if (tb[NHA_RES_GROUP_UNBALANCED_TIMER]) {
> +		rta = tb[NHA_RES_GROUP_UNBALANCED_TIMER];
> +		res_grp->unbalanced_timer = rta_getattr_u32(rta);
> +	}
> +
> +	if (tb[NHA_RES_GROUP_UNBALANCED_TIME]) {
> +		rta = tb[NHA_RES_GROUP_UNBALANCED_TIME];
> +		res_grp->unbalanced_time = rta_getattr_u64(rta);
> +	}
> +}
> +
>  static void print_nh_res_group(FILE *fp, const struct rtattr *res_grp_attr)
>  {
>  	struct rtattr *tb[NHA_RES_GROUP_MAX + 1];

similarly here - have print_nh_res_group use the one parse function and
print based on the outcome.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing
  2021-09-29 15:28 ` [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing Nikolay Aleksandrov
@ 2021-09-30  3:37   ` David Ahern
  2021-09-30  7:10     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2021-09-30  3:37 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev
  Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
> @@ -481,56 +457,61 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
>  	if (filter.proto && filter.proto != nhm->nh_protocol)
>  		return 0;
>  
> -	parse_rtattr_flags(tb, NHA_MAX, RTM_NHA(nhm), len, NLA_F_NESTED);
> -
> +	err = parse_nexthop_rta(fp, nhm, len, &nhe);
> +	if (err) {
> +		close_json_object();
> +		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(err));
> +		return -1;
> +	}

so you are doing that but in reverse order. Re-order the patches such
that existing code is refactored with the parse functions first.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 07/11] ip: nexthop: add cache helpers
  2021-09-29 15:28 ` [RFC iproute2-next 07/11] ip: nexthop: add cache helpers Nikolay Aleksandrov
@ 2021-09-30  3:39   ` David Ahern
  2021-09-30  6:53     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2021-09-30  3:39 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev
  Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
> @@ -372,7 +409,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>  		if (RTA_PAYLOAD(tb[NHA_GATEWAY]) > sizeof(nhe->nh_gateway)) {
>  			fprintf(fp, "<nexthop id %u invalid gateway length %lu>\n",
>  				nhe->nh_id, RTA_PAYLOAD(tb[NHA_GATEWAY]));
> -			err = EINVAL;
> +			err = -EINVAL;
>  			goto out_err;
>  		}
>  		nhe->nh_gateway_len = RTA_PAYLOAD(tb[NHA_GATEWAY]);
> @@ -383,7 +420,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>  	if (tb[NHA_ENCAP]) {
>  		nhe->nh_encap = malloc(RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
>  		if (!nhe->nh_encap) {
> -			err = ENOMEM;
> +			err = -ENOMEM;
>  			goto out_err;
>  		}
>  		memcpy(nhe->nh_encap, tb[NHA_ENCAP],
> @@ -396,13 +433,13 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>  		if (!__valid_nh_group_attr(tb[NHA_GROUP])) {
>  			fprintf(fp, "<nexthop id %u invalid nexthop group>",
>  				nhe->nh_id);
> -			err = EINVAL;
> +			err = -EINVAL;
>  			goto out_err;
>  		}
>  
>  		nhe->nh_groups = malloc(RTA_PAYLOAD(tb[NHA_GROUP]));
>  		if (!nhe->nh_groups) {
> -			err = ENOMEM;
> +			err = -ENOMEM;
>  			goto out_err;
>  		}
>  		nhe->nh_groups_cnt = RTA_PAYLOAD(tb[NHA_GROUP]) /

those should go with previous patches.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info
  2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
                   ` (10 preceding siblings ...)
  2021-09-29 15:28 ` [RFC iproute2-next 11/11] ip: nexthop: add print_cache_nexthop which prints and manages the nh cache Nikolay Aleksandrov
@ 2021-09-30  3:42 ` David Ahern
  2021-09-30  7:17   ` Nikolay Aleksandrov
  11 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2021-09-30  3:42 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev
  Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov <nikolay@nvidia.com>
> 
> Hi,
> This set tries to help with an old ask that we've had for some time
> which is to print nexthop information while monitoring or dumping routes.
> The core problem is that people cannot follow nexthop changes while
> monitoring route changes, by the time they check the nexthop it could be
> deleted or updated to something else. In order to help them out I've
> added a nexthop cache which is populated (only used if -d / show_details
> is specified) while decoding routes and kept up to date while monitoring.
> The nexthop information is printed on its own line starting with the
> "nh_info" attribute and its embedded inside it if printing JSON. To
> cache the nexthop entries I parse them into structures, in order to
> reuse most of the code the print helpers have been altered so they rely
> on prepared structures. Nexthops are now always parsed into a structure,
> even if they won't be cached, that structure is later used to print the
> nexthop and destroyed if not going to be cached. New nexthops (not found
> in the cache) are retrieved from the kernel using a private netlink
> socket so they don't disrupt an ongoing dump, similar to how interfaces
> are retrieved and cached.
> 
> I have tested the set with the kernel forwarding selftests and also by
> stressing it with nexthop create/update/delete in loops while monitoring.
> 
> Comments are very welcome as usual. :)

overall it looks fine and not surprised a cache is needed.

Big comment is to re-order the patches - do all of the refactoring first
to get the code where you need it and then add what is needed for the cache.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 07/11] ip: nexthop: add cache helpers
  2021-09-30  3:39   ` David Ahern
@ 2021-09-30  6:53     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-30  6:53 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 30/09/2021 06:39, David Ahern wrote:
> On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
>> @@ -372,7 +409,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>>  		if (RTA_PAYLOAD(tb[NHA_GATEWAY]) > sizeof(nhe->nh_gateway)) {
>>  			fprintf(fp, "<nexthop id %u invalid gateway length %lu>\n",
>>  				nhe->nh_id, RTA_PAYLOAD(tb[NHA_GATEWAY]));
>> -			err = EINVAL;
>> +			err = -EINVAL;
>>  			goto out_err;
>>  		}
>>  		nhe->nh_gateway_len = RTA_PAYLOAD(tb[NHA_GATEWAY]);
>> @@ -383,7 +420,7 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>>  	if (tb[NHA_ENCAP]) {
>>  		nhe->nh_encap = malloc(RTA_LENGTH(RTA_PAYLOAD(tb[NHA_ENCAP])));
>>  		if (!nhe->nh_encap) {
>> -			err = ENOMEM;
>> +			err = -ENOMEM;
>>  			goto out_err;
>>  		}
>>  		memcpy(nhe->nh_encap, tb[NHA_ENCAP],
>> @@ -396,13 +433,13 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
>>  		if (!__valid_nh_group_attr(tb[NHA_GROUP])) {
>>  			fprintf(fp, "<nexthop id %u invalid nexthop group>",
>>  				nhe->nh_id);
>> -			err = EINVAL;
>> +			err = -EINVAL;
>>  			goto out_err;
>>  		}
>>  
>>  		nhe->nh_groups = malloc(RTA_PAYLOAD(tb[NHA_GROUP]));
>>  		if (!nhe->nh_groups) {
>> -			err = ENOMEM;
>> +			err = -ENOMEM;
>>  			goto out_err;
>>  		}
>>  		nhe->nh_groups_cnt = RTA_PAYLOAD(tb[NHA_GROUP]) /
> 
> those should go with previous patches.
> 

Oops, this one was a last minute change, I didn't notice I've done it in the wrong patch.
Of course I'll fix it for the non-rfc submission.

Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing
  2021-09-30  3:37   ` David Ahern
@ 2021-09-30  7:10     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-30  7:10 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 30/09/2021 06:37, David Ahern wrote:
> On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
>> @@ -481,56 +457,61 @@ int print_nexthop(struct nlmsghdr *n, void *arg)
>>  	if (filter.proto && filter.proto != nhm->nh_protocol)
>>  		return 0;
>>  
>> -	parse_rtattr_flags(tb, NHA_MAX, RTM_NHA(nhm), len, NLA_F_NESTED);
>> -
>> +	err = parse_nexthop_rta(fp, nhm, len, &nhe);
>> +	if (err) {
>> +		close_json_object();
>> +		fprintf(stderr, "Error parsing nexthop: %s\n", strerror(err));
>> +		return -1;
>> +	}
> 
> so you are doing that but in reverse order. Re-order the patches such
> that existing code is refactored with the parse functions first.
> 

I added the parse functions first without any functional changes and converted
the existing printing code to use them in this patch after that. I don't mind
doing the conversion after each parse function is added, too.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info
  2021-09-30  3:42 ` [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info David Ahern
@ 2021-09-30  7:17   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2021-09-30  7:17 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: roopa, donaldsharp72, idosch, Nikolay Aleksandrov

On 30/09/2021 06:42, David Ahern wrote:
> On 9/29/21 9:28 AM, Nikolay Aleksandrov wrote:
>> From: Nikolay Aleksandrov <nikolay@nvidia.com>
>>
>> Hi,
>> This set tries to help with an old ask that we've had for some time
>> which is to print nexthop information while monitoring or dumping routes.
>> The core problem is that people cannot follow nexthop changes while
>> monitoring route changes, by the time they check the nexthop it could be
>> deleted or updated to something else. In order to help them out I've
>> added a nexthop cache which is populated (only used if -d / show_details
>> is specified) while decoding routes and kept up to date while monitoring.
>> The nexthop information is printed on its own line starting with the
>> "nh_info" attribute and its embedded inside it if printing JSON. To
>> cache the nexthop entries I parse them into structures, in order to
>> reuse most of the code the print helpers have been altered so they rely
>> on prepared structures. Nexthops are now always parsed into a structure,
>> even if they won't be cached, that structure is later used to print the
>> nexthop and destroyed if not going to be cached. New nexthops (not found
>> in the cache) are retrieved from the kernel using a private netlink
>> socket so they don't disrupt an ongoing dump, similar to how interfaces
>> are retrieved and cached.
>>
>> I have tested the set with the kernel forwarding selftests and also by
>> stressing it with nexthop create/update/delete in loops while monitoring.
>>
>> Comments are very welcome as usual. :)
> 
> overall it looks fine and not surprised a cache is needed.
> 
> Big comment is to re-order the patches - do all of the refactoring first
> to get the code where you need it and then add what is needed for the cache.
> 

Thanks for the comments, apart from pairing the add parse/use parse functions
in the first few patches only patch 08 seems out of place, although it's there
because it was first needed in patch 09, I don't mind pulling it back.
All other patches after 06 are adding the new cache and print functions and
using them in iproute/ipmonitor, there is no refactoring done in those, so I
plan to keep those as they are.

Cheers,
 Nik


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-09-30  7:17 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-29 15:28 [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 01/11] ip: print_rta_if takes ifindex as device argument instead of attribute Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 02/11] ip: export print_rta_gateway version which outputs prepared gateway string Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 03/11] ip: nexthop: add nh struct and a helper to parse nhmsg into it Nikolay Aleksandrov
2021-09-30  3:33   ` David Ahern
2021-09-29 15:28 ` [RFC iproute2-next 04/11] ip: nexthop: parse resilient nexthop group attribute into structure Nikolay Aleksandrov
2021-09-30  3:35   ` David Ahern
2021-09-29 15:28 ` [RFC iproute2-next 05/11] ip: nexthop: always parse attributes for printing Nikolay Aleksandrov
2021-09-30  3:37   ` David Ahern
2021-09-30  7:10     ` Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 06/11] ip: nexthop: pull ipnh_get_id rtnl talk into a helper Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 07/11] ip: nexthop: add cache helpers Nikolay Aleksandrov
2021-09-30  3:39   ` David Ahern
2021-09-30  6:53     ` Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 08/11] ip: nexthop: factor out entry printing Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 09/11] ip: nexthop: add a helper which retrieves and prints cached nh entry Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 10/11] ip: route: print and cache detailed nexthop information when requested Nikolay Aleksandrov
2021-09-29 15:28 ` [RFC iproute2-next 11/11] ip: nexthop: add print_cache_nexthop which prints and manages the nh cache Nikolay Aleksandrov
2021-09-30  3:42 ` [RFC iproute2-next 00/11] ip: nexthop: cache nexthops and print routes' nh info David Ahern
2021-09-30  7:17   ` Nikolay Aleksandrov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.