netdev.vger.kernel.org archive mirror
* [PATCH v4 0/3] ss: pretty-printing BPF socket-local storage
@ 2024-01-12 14:04 Quentin Deslandes
  2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Quentin Deslandes @ 2024-01-12 14:04 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes, kernel-team

BPF allows programs to store socket-specific data using
BPF_MAP_TYPE_SK_STORAGE maps. The data is attached to the socket itself,
and Martin added INET_DIAG_REQ_SK_BPF_STORAGES, so it can be fetched
using the INET_DIAG mechanism.

Currently, ss doesn't request the socket-local data; this series aims to
fix that.

The first patch requests the socket-local data for the requested map ID
(--bpf-map-id=) or all the maps (--bpf-maps). It then prints the map_id
in a dedicated column.

Patch #2 uses libbpf and BTF to pretty print the map's content, like
`bpftool map dump` would do.

Patch #3 updates ss' man page to explain new options.

While I think it makes sense for ss to provide the socket-local storage
content for the sockets, it's difficult to reconcile the column-based
output of ss with readable socket-local data. Hence, the
socket-local data is printed in a readable fashion over multiple lines
under its socket statistics, independently of the column-based approach.

Here is an example of ss' output with --bpf-maps:
[...]
ESTAB                  340116             0 [...]
    map_id: 114 [
        (struct my_sk_storage){
            .field_hh = (char)3,
            (union){
                .a = (int)17,
                .b = (int)17,
            },
        }
    ]
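
For reference, a value type that would produce a dump shaped like the one
above might look as follows. This is only a guess reconstructed from the
printed names (my_sk_storage, field_hh, a, b); the actual definition is not
part of this series. Because a and b share storage in the anonymous union,
both print the same value.

```c
/* Hypothetical value type for the map dumped above, reconstructed from
 * the printed field names; the real definition is not in this series. */
struct my_sk_storage {
	char field_hh;
	union {		/* anonymous union: .a and .b alias the same bytes */
		int a;
		int b;
	};
};
```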

Changes from v3:
* Minor refactoring to reduce the number of HAVE_LIBBPF usages.
* Update ss' man page.
* btf_dump structure created to print the socket-local data is cached
  in bpf_map_opts. Creation of the btf_dump structure is performed if
  needed, before printing the data.
* If a map can't be pretty-printed, print its ID and a message instead
  of skipping it.
* If show_all=true, send an empty message to the kernel to retrieve all
  the maps (as Martin suggested).
Changes from v2:
* bpf_map_opts_is_enabled is not inline anymore.
* Add more #ifdef HAVE_LIBBPF to prevent compilation error if
  libbpf support is disabled.
* Fix erroneous usage of args instead of _args in vout().
* Add missing btf__free() and close(fd).
Changes from v1:
* Remove the first patch from the series (fix) and submit it separately.
* Remove double allocation of struct rtattr.
* Close BPF map FDs on exit.
* If bpf_map_get_fd_by_id() fails with ENOENT, print an error message
  and continue to the next map ID.
* Fix typo in new command line option documentation.
* Only use bpf_map_info.btf_value_type_id and ignore
  bpf_map_info.btf_vmlinux_value_type_id (unused for socket-local storage).
* Use btf_dump__dump_type_data() instead of manually using BTF to
  pretty-print socket-local storage data. This change alone divides the size
  of the patch series by 2.

Quentin Deslandes (3):
  ss: add support for BPF socket-local storage
  ss: pretty-print BPF socket-local storage
  ss: update man page to document --bpf-maps and --bpf-map-id=

 man/man8/ss.8 |   6 +
 misc/ss.c     | 390 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 387 insertions(+), 9 deletions(-)

--
2.43.0



* [PATCH v4 1/3] ss: add support for BPF socket-local storage
  2024-01-12 14:04 [PATCH v4 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
@ 2024-01-12 14:04 ` Quentin Deslandes
  2024-01-12 22:50   ` Martin KaFai Lau
  2024-01-13  2:12   ` Martin KaFai Lau
  2024-01-12 14:04 ` [PATCH v4 2/3] ss: pretty-print " Quentin Deslandes
  2024-01-12 14:04 ` [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id= Quentin Deslandes
  2 siblings, 2 replies; 8+ messages in thread
From: Quentin Deslandes @ 2024-01-12 14:04 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes, kernel-team

While sock_diag is able to return BPF socket-local storage in response
to INET_DIAG_REQ_SK_BPF_STORAGES requests, ss doesn't request it.

This change introduces the --bpf-maps and --bpf-map-id= options to request
BPF socket-local storage for all SK_STORAGE maps, or only specific ones.

Most of this change validates the requested map IDs. A new column named
"Socket storage" lists the map IDs for which a given socket has data.
This column is disabled unless --bpf-maps or --bpf-map-id= is used.

When --bpf-maps is used, ss sends an empty INET_DIAG_REQ_SK_BPF_STORAGES
request; in return, the kernel sends all the BPF socket-local storage
entries for each socket.

When --bpf-map-id=ID is used, a file descriptor to each requested map is
opened to 1) ensure the map doesn't disappear before the data is printed,
and 2) ensure the map type is BPF_MAP_TYPE_SK_STORAGE.
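
As an illustration of the ID validation step, here is a minimal standalone
sketch of parsing a --bpf-map-id= argument. parse_map_id() is a hypothetical
helper, not the function added by this patch, and it only covers the textual
check; the fd and map-type checks described above need libbpf.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-zero 32-bit map ID from a string.
 * Returns 0 and stores the ID on success, -1 on any malformed input. */
static int parse_map_id(const char *arg, uint32_t *id)
{
	unsigned long v;
	char *end;

	/* strtoul() skips leading whitespace and accepts a sign, so insist
	 * on the argument starting with a digit. */
	if (!isdigit((unsigned char)arg[0]))
		return -1;

	errno = 0;
	v = strtoul(arg, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno || *end != '\0' || v == 0 || v > UINT32_MAX)
		return -1;

	*id = (uint32_t)v;
	return 0;
}
```

With this sketch, "114" and "0x72" both yield ID 114, while "0", "12x", "-1"
and the empty string are rejected.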

Signed-off-by: Quentin Deslandes <qde@naccy.de>
Co-authored-by: Martin KaFai Lau <martin.lau@kernel.org>
---
 misc/ss.c | 257 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 254 insertions(+), 3 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 900fefa4..f38e4744 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,6 +51,11 @@
 #include <linux/tls.h>
 #include <linux/mptcp.h>

+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+
 #if HAVE_RPC
 #include <rpc/rpc.h>
 #include <rpc/xdr.h>
@@ -101,6 +106,7 @@ enum col_id {
 	COL_RADDR,
 	COL_RSERV,
 	COL_PROC,
+	COL_SKSTOR,
 	COL_EXT,
 	COL_MAX
 };
@@ -130,6 +136,7 @@ static struct column columns[] = {
 	{ ALIGN_RIGHT,	"Peer Address:",	" ",	0, 0, 0 },
 	{ ALIGN_LEFT,	"Port",			"",	0, 0, 0 },
 	{ ALIGN_LEFT,	"Process",		"",	0, 0, 0 },
+	{ ALIGN_LEFT,	"Socket storage",	"",	1, 0, 0 },
 	{ ALIGN_LEFT,	"",			"",	0, 0, 0 },
 };

@@ -3378,6 +3385,194 @@ static void parse_diag_msg(struct nlmsghdr *nlh, struct sockstat *s)
 	memcpy(s->remote.data, r->id.idiag_dst, s->local.bytelen);
 }

+#ifdef HAVE_LIBBPF
+
+#define MAX_NR_BPF_MAP_ID_OPTS 32
+
+struct btf;
+
+static struct bpf_map_opts {
+	unsigned int nr_maps;
+	struct bpf_sk_storage_map_info {
+		unsigned int id;
+		int fd;
+	} maps[MAX_NR_BPF_MAP_ID_OPTS];
+	bool show_all;
+} bpf_map_opts;
+
+static void bpf_map_opts_mixed_error(void)
+{
+	fprintf(stderr,
+		"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
+}
+
+static int bpf_map_opts_load_info(unsigned int map_id)
+{
+	struct bpf_map_info info = {};
+	uint32_t len = sizeof(info);
+	int fd;
+	int r;
+
+	if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+		fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %u\n",
+			MAX_NR_BPF_MAP_ID_OPTS, map_id);
+		return 0;
+	}
+
+	fd = bpf_map_get_fd_by_id(map_id);
+	if (fd == -1) {
+		if (errno == -ENOENT)
+			return 0;
+
+		fprintf(stderr, "ss: cannot get fd for BPF map ID %u%s\n",
+			map_id, errno == EPERM ?
+			": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+		return -1;
+	}
+
+	r = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (r) {
+		fprintf(stderr, "ss: failed to get info for BPF map ID %u\n",
+			map_id);
+		close(fd);
+		return -1;
+	}
+
+	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
+			optarg, libbpf_bpf_map_type_str(info.type));
+		close(fd);
+		return -1;
+	}
+
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = map_id;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+
+	return 0;
+}
+
+static struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
+	unsigned int map_id)
+{
+	unsigned int i;
+	int r;
+
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
+		if (bpf_map_opts.maps[i].id == map_id)
+			return &bpf_map_opts.maps[i];
+	}
+
+	r = bpf_map_opts_load_info(map_id);
+	if (r)
+		return NULL;
+
+	return &bpf_map_opts.maps[bpf_map_opts.nr_maps - 1];
+}
+
+static int bpf_map_opts_add_id(const char *optarg)
+{
+	size_t optarg_len;
+	unsigned long id;
+	char *end;
+
+	if (bpf_map_opts.show_all) {
+		bpf_map_opts_mixed_error();
+		return -1;
+	}
+
+	optarg_len = strlen(optarg);
+	id = strtoul(optarg, &end, 0);
+	if (end != optarg + optarg_len || id == 0 || id >= UINT32_MAX) {
+		fprintf(stderr, "ss: invalid BPF map ID %s\n", optarg);
+		return -1;
+	}
+
+	// Force lazy loading of the map's data.
+	if (!bpf_map_opts_get_info(id))
+		return -ENOENT;
+
+	return 0;
+}
+
+static void bpf_map_opts_destroy(void)
+{
+	int i;
+
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i)
+		close(bpf_map_opts.maps[i].fd);
+}
+
+static struct rtattr *bpf_map_opts_alloc_rta(void)
+{
+	struct rtattr *stgs_rta, *fd_rta;
+	size_t total_size;
+	unsigned int i;
+	void *buf;
+
+	/* If bpf_map_opts.show_all == true, then bpf_map_opts.nr_maps == 0. We
+	 * will send an empty message to the kernel, which will return all the
+	 * socket-local data attached to a socket, no matter their map ID. */
+	total_size = RTA_LENGTH(RTA_LENGTH(sizeof(int)) * bpf_map_opts.nr_maps);
+	buf = malloc(total_size);
+	if (!buf)
+		return NULL;
+
+	stgs_rta = buf;
+	stgs_rta->rta_type = INET_DIAG_REQ_SK_BPF_STORAGES | NLA_F_NESTED;
+	stgs_rta->rta_len = total_size;
+
+	buf = RTA_DATA(stgs_rta);
+	for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+		int *fd;
+
+		fd_rta = buf;
+		fd_rta->rta_type = SK_DIAG_BPF_STORAGE_REQ_MAP_FD;
+		fd_rta->rta_len = RTA_LENGTH(sizeof(int));
+
+		fd = RTA_DATA(fd_rta);
+		*fd = bpf_map_opts.maps[i].fd;
+
+		buf += fd_rta->rta_len;
+	}
+
+	return stgs_rta;
+}
+
+static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
+{
+	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
+	unsigned int rem;
+
+	for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
+		RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
+
+		if ((bpf_stg->rta_type & NLA_TYPE_MASK) != SK_DIAG_BPF_STORAGE)
+			continue;
+
+		parse_rtattr_nested(tb, SK_DIAG_BPF_STORAGE_MAX,
+			(struct rtattr *)bpf_stg);
+
+		if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
+			out("map_id:%u ",
+				rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+		}
+	}
+}
+
+static bool bpf_map_opts_is_enabled(void)
+{
+	return bpf_map_opts.nr_maps || bpf_map_opts.show_all;
+}
+
+#else
+
+static bool bpf_map_opts_is_enabled(void)
+{
+	return false;
+}
+
+#endif
+
 static int inet_show_sock(struct nlmsghdr *nlh,
 			  struct sockstat *s)
 {
@@ -3385,8 +3580,9 @@ static int inet_show_sock(struct nlmsghdr *nlh,
 	struct inet_diag_msg *r = NLMSG_DATA(nlh);
 	unsigned char v6only = 0;

-	parse_rtattr(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
-		     nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+	parse_rtattr_flags(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
+			   nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)),
+			   NLA_F_NESTED);

 	if (tb[INET_DIAG_PROTOCOL])
 		s->type = rta_getattr_u8(tb[INET_DIAG_PROTOCOL]);
@@ -3483,6 +3679,13 @@ static int inet_show_sock(struct nlmsghdr *nlh,
 	}
 	sctp_ino = s->ino;

+#ifdef HAVE_LIBBPF
+	if (tb[INET_DIAG_SK_BPF_STORAGES]) {
+		field_set(COL_SKSTOR);
+		show_sk_bpf_storages(tb[INET_DIAG_SK_BPF_STORAGES]);
+	}
+#endif
+
 	return 0;
 }

@@ -3564,13 +3767,14 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 {
 	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
 	DIAG_REQUEST(req, struct inet_diag_req_v2 r);
+	struct rtattr *bpf_stgs_rta = NULL;
 	char    *bc = NULL;
 	int	bclen;
 	__u32	proto;
 	struct msghdr msg;
 	struct rtattr rta_bc;
 	struct rtattr rta_proto;
-	struct iovec iov[5];
+	struct iovec iov[6];
 	int iovlen = 1;

 	if (family == PF_UNSPEC)
@@ -3623,6 +3827,19 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 		iovlen += 2;
 	}

+#ifdef HAVE_LIBBPF
+	if (bpf_map_opts_is_enabled()) {
+		bpf_stgs_rta = bpf_map_opts_alloc_rta();
+		if (!bpf_stgs_rta) {
+			fprintf(stderr, "ss: cannot alloc request for --bpf-map\n");
+			return -1;
+		}
+
+		iov[iovlen++] = (struct iovec){ bpf_stgs_rta, bpf_stgs_rta->rta_len };
+		req.nlh.nlmsg_len += bpf_stgs_rta->rta_len;
+	}
+#endif
+
 	msg = (struct msghdr) {
 		.msg_name = (void *)&nladdr,
 		.msg_namelen = sizeof(nladdr),
@@ -3631,10 +3848,13 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 	};

 	if (sendmsg(fd, &msg, 0) < 0) {
+		free(bpf_stgs_rta);
 		close(fd);
 		return -1;
 	}

+	free(bpf_stgs_rta);
+
 	return 0;
 }

@@ -5355,6 +5575,10 @@ static void _usage(FILE *dest)
 "       --tos           show tos and priority information\n"
 "       --cgroup        show cgroup information\n"
 "   -b, --bpf           show bpf filter socket information\n"
+#ifdef HAVE_LIBBPF
+"       --bpf-maps      show all BPF socket-local storage maps\n"
+"       --bpf-map-id=MAP-ID    show a BPF socket-local storage map\n"
+#endif
 "   -E, --events        continually display sockets as they are destroyed\n"
 "   -Z, --context       display task SELinux security contexts\n"
 "   -z, --contexts      display task and socket SELinux security contexts\n"
@@ -5480,6 +5704,9 @@ wrong_state:

 #define OPT_INET_SOCKOPT 262

+#define OPT_BPF_MAPS 263
+#define OPT_BPF_MAP_ID 264
+
 static const struct option long_opts[] = {
 	{ "numeric", 0, 0, 'n' },
 	{ "resolve", 0, 0, 'r' },
@@ -5525,6 +5752,10 @@ static const struct option long_opts[] = {
 	{ "mptcp", 0, 0, 'M' },
 	{ "oneline", 0, 0, 'O' },
 	{ "inet-sockopt", 0, 0, OPT_INET_SOCKOPT },
+#ifdef HAVE_LIBBPF
+	{ "bpf-maps", 0, 0, OPT_BPF_MAPS},
+	{ "bpf-map-id", 1, 0, OPT_BPF_MAP_ID},
+#endif
 	{ 0 }

 };
@@ -5730,6 +5961,19 @@ int main(int argc, char *argv[])
 		case OPT_INET_SOCKOPT:
 			show_inet_sockopt = 1;
 			break;
+#ifdef HAVE_LIBBPF
+		case OPT_BPF_MAPS:
+			if (bpf_map_opts.nr_maps) {
+				bpf_map_opts_mixed_error();
+				return -1;
+			}
+			bpf_map_opts.show_all = true;
+			break;
+		case OPT_BPF_MAP_ID:
+			if (bpf_map_opts_add_id(optarg))
+				exit(1);
+			break;
+#endif
 		case 'h':
 			help();
 		case '?':
@@ -5828,6 +6072,9 @@ int main(int argc, char *argv[])
 	if (!(current_filter.states & (current_filter.states - 1)))
 		columns[COL_STATE].disabled = 1;

+	if (bpf_map_opts_is_enabled())
+		columns[COL_SKSTOR].disabled = 0;
+
 	if (show_header)
 		print_header();

@@ -5864,6 +6111,10 @@ int main(int argc, char *argv[])
 	if (show_processes || show_threads || show_proc_ctx || show_sock_ctx)
 		user_ent_destroy();

+#ifdef HAVE_LIBBPF
+	bpf_map_opts_destroy();
+#endif
+
 	render();

 	return 0;
--
2.43.0



* [PATCH v4 2/3] ss: pretty-print BPF socket-local storage
  2024-01-12 14:04 [PATCH v4 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
  2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
@ 2024-01-12 14:04 ` Quentin Deslandes
  2024-01-12 22:59   ` Martin KaFai Lau
  2024-01-12 14:04 ` [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id= Quentin Deslandes
  2 siblings, 1 reply; 8+ messages in thread
From: Quentin Deslandes @ 2024-01-12 14:04 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes, kernel-team

ss is able to print the map ID(s) for which a given socket has BPF
socket-local storage defined (using --bpf-maps or --bpf-map-id=). However,
the actual content of the map remains hidden.

This change aims to pretty-print the socket-local storage content following
the socket details, similar to what `bpftool map dump` would do. The exact
output format is inspired by drgn, while the BTF data processing is similar
to bpftool's.

ss will use libbpf's btf_dump__dump_type_data() to ease pretty-printing
of binary data. This requires out_bpf_sk_storage_print_fn() as a print
callback function used by btf_dump__dump_type_data(). vout() is also
introduced, which is similar to out() but accepts a va_list as
parameter.
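
The out()/vout() split follows the usual variadic-wrapper pattern, sketched
below on a simple growable buffer. struct strbuf, strbuf_vappend() and
strbuf_append() are hypothetical stand-ins for ss' chunked buffer, vout()
and out(). The point of interest is that vsnprintf() consumes its va_list,
so each attempt works on a va_copy() of the caller's list, and every
va_copy() is paired with a va_end().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable buffer standing in for ss' chunked output buffer. */
struct strbuf {
	char *data;
	size_t len;
	size_t cap;
};

/* va_list flavour: usable from callbacks that already hold a va_list. */
static int strbuf_vappend(struct strbuf *b, const char *fmt, va_list args)
{
	for (;;) {
		size_t avail = b->cap - b->len;
		va_list copy;
		char *p;
		int n;

		/* Work on a copy so `args` stays valid if we must retry. */
		va_copy(copy, args);
		n = vsnprintf(b->data ? b->data + b->len : NULL, avail,
			      fmt, copy);
		va_end(copy);

		if (n < 0)
			return -1;
		if ((size_t)n < avail) {
			b->len += (size_t)n;
			return 0;
		}

		/* Not enough room: grow to fit the text plus NUL, retry. */
		p = realloc(b->data, b->len + (size_t)n + 1);
		if (!p)
			return -1;
		b->data = p;
		b->cap = b->len + (size_t)n + 1;
	}
}

/* Variadic flavour: a thin wrapper around the va_list one. */
__attribute__((format(printf, 2, 3)))
static int strbuf_append(struct strbuf *b, const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = strbuf_vappend(b, fmt, args);
	va_end(args);
	return r;
}
```

A print callback handed a va_list can then forward to strbuf_vappend()
directly, while ordinary call sites keep using strbuf_append().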

COL_SKSTOR's header is replaced with an empty string, as it doesn't need to
be printed anymore; it's used as a "virtual" column to refer to the
socket-local storage dump, which will be printed under the socket information.
The column's width is fixed to 1, so it doesn't mess up ss' output.

ss' output remains unchanged unless --bpf-maps or --bpf-map-id= is used,
in which case each socket containing BPF local storage will be followed by
the content of the storage before the next socket's info is displayed.

Signed-off-by: Quentin Deslandes <qde@naccy.de>
---
 misc/ss.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 133 insertions(+), 12 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index f38e4744..38e2ba6c 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -53,7 +53,9 @@
 
 #ifdef HAVE_LIBBPF
 #include <bpf/bpf.h>
+#include <bpf/btf.h>
 #include <bpf/libbpf.h>
+#include <linux/btf.h>
 #endif
 
 #if HAVE_RPC
@@ -136,7 +138,7 @@ static struct column columns[] = {
 	{ ALIGN_RIGHT,	"Peer Address:",	" ",	0, 0, 0 },
 	{ ALIGN_LEFT,	"Port",			"",	0, 0, 0 },
 	{ ALIGN_LEFT,	"Process",		"",	0, 0, 0 },
-	{ ALIGN_LEFT,	"Socket storage",	"",	1, 0, 0 },
+	{ ALIGN_LEFT,	"",			"",	1, 0, 0 },
 	{ ALIGN_LEFT,	"",			"",	0, 0, 0 },
 };
 
@@ -1041,11 +1043,10 @@ static int buf_update(int len)
 }
 
 /* Append content to buffer as part of the current field */
-__attribute__((format(printf, 1, 2)))
-static void out(const char *fmt, ...)
+static void vout(const char *fmt, va_list args)
 {
 	struct column *f = current_field;
-	va_list args;
+	va_list _args;
 	char *pos;
 	int len;
 
@@ -1056,18 +1057,27 @@ static void out(const char *fmt, ...)
 		buffer.head = buf_chunk_new();
 
 again:	/* Append to buffer: if we have a new chunk, print again */
+	va_copy(_args, args);
 
 	pos = buffer.cur->data + buffer.cur->len;
-	va_start(args, fmt);
 
 	/* Limit to tail room. If we hit the limit, buf_update() will tell us */
-	len = vsnprintf(pos, buf_chunk_avail(buffer.tail), fmt, args);
-	va_end(args);
+	len = vsnprintf(pos, buf_chunk_avail(buffer.tail), fmt, _args);
 
 	if (buf_update(len))
 		goto again;
 }
 
+__attribute__((format(printf, 1, 2)))
+static void out(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	vout(fmt, args);
+	va_end(args);
+}
+
 static int print_left_spacing(struct column *f, int stored, int printed)
 {
 	int s;
@@ -1215,6 +1225,9 @@ static void render_calc_width(void)
 		 */
 		c->width = min(c->width, screen_width);
 
+		if (c == &columns[COL_SKSTOR])
+			c->width = 1;
+
 		if (c->width)
 			first = 0;
 	}
@@ -3396,6 +3409,9 @@ static struct bpf_map_opts {
 	struct bpf_sk_storage_map_info {
 		unsigned int id;
 		int fd;
+		struct bpf_map_info info;
+		struct btf *btf;
+		struct btf_dump *dump;
 	} maps[MAX_NR_BPF_MAP_ID_OPTS];
 	bool show_all;
 } bpf_map_opts;
@@ -3406,10 +3422,27 @@ static void bpf_map_opts_mixed_error(void)
 		"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
 }
 
+static int bpf_maps_opts_load_btf(struct bpf_map_info *info, struct btf **btf)
+{
+	if (info->btf_value_type_id) {
+		*btf = btf__load_from_kernel_by_id(info->btf_id);
+		if (!*btf) {
+			fprintf(stderr, "ss: failed to load BTF for map ID %u\n",
+				info->id);
+			return -1;
+		}
+	} else {
+		*btf = NULL;
+	}
+
+	return 0;
+}
+
 static int bpf_map_opts_load_info(unsigned int map_id)
 {
 	struct bpf_map_info info = {};
 	uint32_t len = sizeof(info);
+	struct btf *btf;
 	int fd;
 	int r;
 
@@ -3445,8 +3478,16 @@ static int bpf_map_opts_load_info(unsigned int map_id)
 		return -1;
 	}
 
+	r = bpf_maps_opts_load_btf(&info, &btf);
+	if (r) {
+		close(fd);
+		return -1;
+	}
+
 	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = map_id;
-	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
 
 	return 0;
 }
@@ -3469,6 +3510,29 @@ static struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
 	return &bpf_map_opts.maps[bpf_map_opts.nr_maps - 1];
 }
 
+static void out_bpf_sk_storage_print_fn(void *ctx, const char *fmt, va_list args)
+{
+	vout(fmt, args);
+}
+
+static struct btf_dump *bpf_map_opts_get_btf_dump(
+	struct bpf_sk_storage_map_info *map_info)
+{
+	struct btf_dump_opts dopts = {
+		.sz = sizeof(struct btf_dump_opts)
+	};
+
+	if (!map_info->dump) {
+		map_info->dump = btf_dump__new(map_info->btf,
+					       out_bpf_sk_storage_print_fn,
+					       NULL, &dopts);
+		if (!map_info->dump)
+			fprintf(stderr, "Failed to create btf_dump object\n");
+	}
+
+	return map_info->dump;
+}
+
 static int bpf_map_opts_add_id(const char *optarg)
 {
 	size_t optarg_len;
@@ -3498,8 +3562,11 @@ static void bpf_map_opts_destroy(void)
 {
 	int i;
 
-	for (i = 0; i < bpf_map_opts.nr_maps; ++i)
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
+		btf_dump__free(bpf_map_opts.maps[i].dump);
+		btf__free(bpf_map_opts.maps[i].btf);
 		close(bpf_map_opts.maps[i].fd);
+	}
 }
 
 static struct rtattr *bpf_map_opts_alloc_rta(void)
@@ -3538,10 +3605,54 @@ static struct rtattr *bpf_map_opts_alloc_rta(void)
 	return stgs_rta;
 }
 
+#define SK_STORAGE_INDENT_STR "    "
+
+static void out_bpf_sk_storage(int map_id, const void *data, size_t len)
+{
+	uint32_t type_id;
+	struct bpf_sk_storage_map_info *map_info;
+	struct btf_dump *dump;
+	struct btf_dump_type_data_opts opts = {
+		.sz = sizeof(struct btf_dump_type_data_opts),
+		.indent_str = SK_STORAGE_INDENT_STR,
+		.indent_level = 2,
+		.emit_zeroes = 1
+	};
+	int r;
+
+	map_info = bpf_map_opts_get_info(map_id);
+	if (!map_info) {
+		/* The kernel might return a map we can't get info for, skip
+		 * it but print the other ones. */
+		out(SK_STORAGE_INDENT_STR "map_id: %d failed to fetch info, skipping\n",
+		    map_id);
+		return;
+	}
+
+	if (map_info->info.value_size != len) {
+		fprintf(stderr, "map_id: %d: invalid value size, expecting %u, got %lu\n",
+			map_id, map_info->info.value_size, len);
+		return;
+	}
+
+	type_id = map_info->info.btf_value_type_id;
+
+	dump = bpf_map_opts_get_btf_dump(map_info);
+	if (!dump)
+		return;
+
+	out(SK_STORAGE_INDENT_STR "map_id: %d [\n", map_id);
+	r = btf_dump__dump_type_data(dump, type_id, data, len, &opts);
+	if (r < 0)
+		out(SK_STORAGE_INDENT_STR SK_STORAGE_INDENT_STR "failed to dump data: %d", r);
+	out("\n" SK_STORAGE_INDENT_STR "]");
+}
+
 static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
 {
 	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
-	unsigned int rem;
+	unsigned int rem, map_id;
+	struct rtattr *value;
 
 	for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
 		RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
@@ -3553,8 +3664,13 @@ static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
 			(struct rtattr *)bpf_stg);
 
 		if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
-			out("map_id:%u ",
-				rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+			out("\n");
+
+			map_id = rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]);
+			value = tb[SK_DIAG_BPF_STORAGE_MAP_VALUE];
+
+			out_bpf_sk_storage(map_id, RTA_DATA(value),
+				RTA_PAYLOAD(value));
 		}
 	}
 }
@@ -5982,6 +6098,11 @@ int main(int argc, char *argv[])
 		}
 	}
 
+	if (oneline && bpf_map_opts_is_enabled()) {
+		fprintf(stderr, "ss: --oneline, --bpf-maps, and --bpf-map-id are incompatible\n");
+		exit(-1);
+	}
+
 	if (show_processes || show_threads || show_proc_ctx || show_sock_ctx)
 		user_ent_hash_build();
 
-- 
2.43.0



* [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id=
  2024-01-12 14:04 [PATCH v4 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
  2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
  2024-01-12 14:04 ` [PATCH v4 2/3] ss: pretty-print " Quentin Deslandes
@ 2024-01-12 14:04 ` Quentin Deslandes
  2024-01-12 23:00   ` Martin KaFai Lau
  2 siblings, 1 reply; 8+ messages in thread
From: Quentin Deslandes @ 2024-01-12 14:04 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes, kernel-team

Document new --bpf-maps and --bpf-map-id= options.

Signed-off-by: Quentin Deslandes <qde@naccy.de>
---
 man/man8/ss.8 | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 4ece41fa..0ab212d0 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -423,6 +423,12 @@ to FILE after applying filters. If FILE is - stdout is used.
 Read filter information from FILE.  Each line of FILE is interpreted
 like single command line option. If FILE is - stdin is used.
 .TP
+.B \-\-bpf-maps
+Pretty-print all the BPF socket-local data entries for each socket.
+.TP
+.B \-\-bpf-map-id=MAP_ID
+Pretty-print the BPF socket-local data entries for the requested map ID. Can be used more than once.
+.TP
 .B FILTER := [ state STATE-FILTER ] [ EXPRESSION ]
 Please take a look at the official documentation for details regarding filters.
 
-- 
2.43.0



* Re: [PATCH v4 1/3] ss: add support for BPF socket-local storage
  2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
@ 2024-01-12 22:50   ` Martin KaFai Lau
  2024-01-13  2:12   ` Martin KaFai Lau
  1 sibling, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2024-01-12 22:50 UTC (permalink / raw)
  To: Quentin Deslandes; +Cc: David Ahern, Martin KaFai Lau, kernel-team, netdev

On 1/12/24 6:04 AM, Quentin Deslandes wrote:
> +static int bpf_map_opts_load_info(unsigned int map_id)
> +{
> +	struct bpf_map_info info = {};
> +	uint32_t len = sizeof(info);
> +	int fd;
> +	int r;
> +
> +	if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
> +		fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %u\n",
> +			MAX_NR_BPF_MAP_ID_OPTS, map_id);
> +		return 0;
> +	}
> +
> +	fd = bpf_map_get_fd_by_id(map_id);
> +	if (fd == -1) {

I also just noticed that libbpf returns -errno (from libbpf_err_errno()), so it
is better to check for < 0 here.

> +		if (errno == -ENOENT)
> +			return 0;
> +
> +		fprintf(stderr, "ss: cannot get fd for BPF map ID %u%s\n",
> +			map_id, errno == EPERM ?
> +			": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
> +		return -1;
> +	}
> +
> +	r = bpf_obj_get_info_by_fd(fd, &info, &len);
> +	if (r) {
> +		fprintf(stderr, "ss: failed to get info for BPF map ID %u\n",
> +			map_id);
> +		close(fd);
> +		return -1;
> +	}
> +
> +	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
> +		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
> +			optarg, libbpf_bpf_map_type_str(info.type));
> +		close(fd);
> +		return -1;
> +	}
> +
> +	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = map_id;
> +	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
> +
> +	return 0;
> +}
> +
> +static struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
> +	unsigned int map_id)
> +{
> +	unsigned int i;
> +	int r;
> +
> +	for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
> +		if (bpf_map_opts.maps[i].id == map_id)
> +			return &bpf_map_opts.maps[i];
> +	}
> +
> +	r = bpf_map_opts_load_info(map_id);
> +	if (r)
> +		return NULL;
> +
> +	return &bpf_map_opts.maps[bpf_map_opts.nr_maps - 1];
> +}
> +
> +static int bpf_map_opts_add_id(const char *optarg)
> +{
> +	size_t optarg_len;
> +	unsigned long id;
> +	char *end;
> +
> +	if (bpf_map_opts.show_all) {
> +		bpf_map_opts_mixed_error();
> +		return -1;
> +	}
> +
> +	optarg_len = strlen(optarg);
> +	id = strtoul(optarg, &end, 0);
> +	if (end != optarg + optarg_len || id == 0 || id >= UINT32_MAX) {
> +		fprintf(stderr, "ss: invalid BPF map ID %s\n", optarg);
> +		return -1;
> +	}
> +
> +	// Force lazy loading of the map's data.
> +	if (!bpf_map_opts_get_info(id))
> +		return -ENOENT;

Nit: maybe also "return -1;" here, to be consistent with the error returns above.

Other than the minor nits, lgtm, you can carry my ack in the next spin:

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
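
The return-value convention mentioned in this review can be sketched without
libbpf. classify_map_fd() and its enum are hypothetical names; the sketch
assumes, as the comment above states, that the call returns a negative errno
value on failure rather than -1.

```c
#include <errno.h>

/* Hypothetical outcome of trying to open a map by ID. */
enum map_fd_status {
	MAP_FD_OK,	/* valid file descriptor */
	MAP_FD_SKIP,	/* map no longer exists: skip it quietly */
	MAP_FD_ERROR,	/* anything else: report and fail */
};

/* Classify a return value that is either an fd (>= 0) or -errno (< 0).
 * Comparing against -1 would only match -EPERM, since EPERM is 1 on
 * Linux; -ENOENT and other errors would be mistaken for valid fds. */
static enum map_fd_status classify_map_fd(int fd)
{
	if (fd >= 0)
		return MAP_FD_OK;
	if (fd == -ENOENT)
		return MAP_FD_SKIP;
	return MAP_FD_ERROR;
}
```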




* Re: [PATCH v4 2/3] ss: pretty-print BPF socket-local storage
  2024-01-12 14:04 ` [PATCH v4 2/3] ss: pretty-print " Quentin Deslandes
@ 2024-01-12 22:59   ` Martin KaFai Lau
  0 siblings, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2024-01-12 22:59 UTC (permalink / raw)
  To: Quentin Deslandes; +Cc: David Ahern, Martin KaFai Lau, kernel-team, netdev

On 1/12/24 6:04 AM, Quentin Deslandes wrote:
> @@ -3445,8 +3478,16 @@ static int bpf_map_opts_load_info(unsigned int map_id)
>   		return -1;
>   	}
>   
> +	r = bpf_maps_opts_load_btf(&info, &btf);
> +	if (r) {
> +		close(fd);
> +		return -1;
> +	}
> +
>   	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = map_id;
> -	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
> +	bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
> +	bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
> +	bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
>   
>   	return 0;
>   }
> @@ -3469,6 +3510,29 @@ static struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
>   	return &bpf_map_opts.maps[bpf_map_opts.nr_maps - 1];
>   }
>   
> +static void out_bpf_sk_storage_print_fn(void *ctx, const char *fmt, va_list args)
> +{
> +	vout(fmt, args);
> +}
> +
> +static struct btf_dump *bpf_map_opts_get_btf_dump(
> +	struct bpf_sk_storage_map_info *map_info)
> +{
> +	struct btf_dump_opts dopts = {
> +		.sz = sizeof(struct btf_dump_opts)
> +	};
> +
> +	if (!map_info->dump) {
> +		map_info->dump = btf_dump__new(map_info->btf,
> +					       out_bpf_sk_storage_print_fn,
> +					       NULL, &dopts);

A nit/simplification for your consideration: maybe initialize map_info->dump
in bpf_map_opts_load_info() as well, together with the other map_info->*
initialization? It is likely that map_info->dump will be needed anyway.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>


> +		if (!map_info->dump)
> +			fprintf(stderr, "Failed to create btf_dump object\n");
> +	}
> +
> +	return map_info->dump;
> +}
> +

[ ... ]

> +static void out_bpf_sk_storage(int map_id, const void *data, size_t len)
> +{
> +	uint32_t type_id;
> +	struct bpf_sk_storage_map_info *map_info;
> +	struct btf_dump *dump;
> +	struct btf_dump_type_data_opts opts = {
> +		.sz = sizeof(struct btf_dump_type_data_opts),
> +		.indent_str = SK_STORAGE_INDENT_STR,
> +		.indent_level = 2,
> +		.emit_zeroes = 1
> +	};
> +	int r;
> +
> +	map_info = bpf_map_opts_get_info(map_id);
> +	if (!map_info) {
> +		/* The kernel might return a map we can't get info for, skip
> +		 * it but print the other ones. */
> +		out(SK_STORAGE_INDENT_STR "map_id: %d failed to fetch info, skipping\n",
> +		    map_id);
> +		return;
> +	}
> +
> +	if (map_info->info.value_size != len) {
> +		fprintf(stderr, "map_id: %d: invalid value size, expecting %u, got %lu\n",
> +			map_id, map_info->info.value_size, len);
> +		return;
> +	}
> +
> +	type_id = map_info->info.btf_value_type_id;
> +
> +	dump = bpf_map_opts_get_btf_dump(map_info);
> +	if (!dump)
> +		return;
> +
> +	out(SK_STORAGE_INDENT_STR "map_id: %d [\n", map_id);
> +	r = btf_dump__dump_type_data(dump, type_id, data, len, &opts);
> +	if (r < 0)
> +		out(SK_STORAGE_INDENT_STR SK_STORAGE_INDENT_STR "failed to dump data: %d", r);
> +	out("\n" SK_STORAGE_INDENT_STR "]");
> +}
> +



* Re: [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id=
  2024-01-12 14:04 ` [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id= Quentin Deslandes
@ 2024-01-12 23:00   ` Martin KaFai Lau
  0 siblings, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2024-01-12 23:00 UTC (permalink / raw)
  To: Quentin Deslandes; +Cc: David Ahern, Martin KaFai Lau, kernel-team, netdev

On 1/12/24 6:04 AM, Quentin Deslandes wrote:
> Document new --bpf-maps and --bpf-map-id= options.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>



* Re: [PATCH v4 1/3] ss: add support for BPF socket-local storage
  2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
  2024-01-12 22:50   ` Martin KaFai Lau
@ 2024-01-13  2:12   ` Martin KaFai Lau
  1 sibling, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2024-01-13  2:12 UTC (permalink / raw)
  To: Quentin Deslandes; +Cc: David Ahern, Martin KaFai Lau, kernel-team, netdev

On 1/12/24 6:04 AM, Quentin Deslandes wrote:
> +static struct rtattr *bpf_map_opts_alloc_rta(void)
> +{
> +	struct rtattr *stgs_rta, *fd_rta;
> +	size_t total_size;
> +	unsigned int i;
> +	void *buf;
> +
> +	/* If bpf_map_opts.show_all == true, then bpf_map_opts.nr_maps == 0. We
> +	 * will send an empty message to the kernel, which will return all the
> +	 * socket-local data attached to a socket, no matter their map ID. */
> +	total_size = RTA_LENGTH(RTA_LENGTH(sizeof(int)) * bpf_map_opts.nr_maps);

I have been trying the patch on some machines with heavier traffic because I am
overexcited :)

The "--bpf-maps" result is pretty flaky. It does not always print every
sk_storage_map.

This line has a bug when ss is run with the "--bpf-maps" command-line option.
nr_maps will become non-zero, so the request no longer asks for all maps and ss
ends up not printing every sk_storage_map. Take a look at inet_show_netlink():
there is a "goto again" case.

The size computation really has to test bpf_map_opts.show_all here.
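
A minimal sketch of the kind of check suggested above, not the actual fix:
struct map_opts_sketch and stgs_rta_size() are hypothetical stand-ins for
bpf_map_opts and the size computation in bpf_map_opts_alloc_rta(). It assumes
that show_all alone should force an empty nested attribute, whatever value
nr_maps has taken by the time the request is rebuilt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <linux/rtnetlink.h>

/* Hypothetical stand-in for the two bpf_map_opts fields involved. */
struct map_opts_sketch {
	unsigned int nr_maps;
	bool show_all;
};

/*
 * Size of the nested request attribute. When show_all is set, no map fd is
 * listed, so only the outer attribute header is sent, even if nr_maps is
 * already non-zero.
 */
static size_t stgs_rta_size(const struct map_opts_sketch *opts)
{
	unsigned int nr_fds;

	assert(opts != NULL);
	nr_fds = opts->show_all ? 0 : opts->nr_maps;

	return RTA_LENGTH(RTA_LENGTH(sizeof(int)) * nr_fds);
}
```

The loop that fills in the fd attributes would need to use the same count, so
that the attribute length and its contents stay consistent.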


> +	buf = malloc(total_size);
> +	if (!buf)
> +		return NULL;
> +
> +	stgs_rta = buf;
> +	stgs_rta->rta_type = INET_DIAG_REQ_SK_BPF_STORAGES | NLA_F_NESTED;
> +	stgs_rta->rta_len = total_size;
> +
> +	buf = RTA_DATA(stgs_rta);
> +	for (i = 0; i < bpf_map_opts.nr_maps; i++) {
> +		int *fd;
> +
> +		fd_rta = buf;
> +		fd_rta->rta_type = SK_DIAG_BPF_STORAGE_REQ_MAP_FD;
> +		fd_rta->rta_len = RTA_LENGTH(sizeof(int));
> +
> +		fd = RTA_DATA(fd_rta);
> +		*fd = bpf_map_opts.maps[i].fd;
> +
> +		buf += fd_rta->rta_len;
> +	}
> +
> +	return stgs_rta;
> +}
> +
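
For readers less familiar with the layout built above, here is a hedged,
standalone sketch of the same nesting: one outer attribute whose payload is a
sequence of inner attributes, each carrying one int. build_nested_fds() and
count_nested() are hypothetical helpers, not ss functions; the attribute types
are passed as parameters (the patch uses INET_DIAG_REQ_SK_BPF_STORAGES |
NLA_F_NESTED and SK_DIAG_BPF_STORAGE_REQ_MAP_FD). Only the standard RTA_*
macros from <linux/rtnetlink.h> are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <linux/rtnetlink.h>

/*
 * Build one outer attribute containing nr_fds inner attributes, each holding
 * a single int. The caller frees the result. rta_len is an unsigned short, so
 * this sketch assumes nr_fds is small enough for the total to fit.
 */
static struct rtattr *build_nested_fds(unsigned short outer_type,
				       unsigned short inner_type,
				       const int *fds, unsigned int nr_fds)
{
	size_t total = RTA_LENGTH(RTA_LENGTH(sizeof(int)) * nr_fds);
	struct rtattr *outer, *inner;
	unsigned int i;

	outer = calloc(1, total);
	if (!outer)
		return NULL;

	outer->rta_type = outer_type;
	outer->rta_len = total;

	inner = RTA_DATA(outer);
	for (i = 0; i < nr_fds; i++) {
		inner->rta_type = inner_type;
		inner->rta_len = RTA_LENGTH(sizeof(int));
		memcpy(RTA_DATA(inner), &fds[i], sizeof(int));
		/* Step to the next inner attribute, keeping alignment. */
		inner = (struct rtattr *)((char *)inner +
					  RTA_ALIGN(inner->rta_len));
	}

	return outer;
}

/* Walk the nested payload as a receiver would and count inner attributes. */
static unsigned int count_nested(const struct rtattr *outer)
{
	struct rtattr *inner = RTA_DATA(outer);
	int remaining = RTA_PAYLOAD(outer);
	unsigned int n = 0;

	for (; RTA_OK(inner, remaining); inner = RTA_NEXT(inner, remaining))
		n++;

	return n;
}
```

With nr_fds == 0 the outer attribute is just a header with an empty payload,
which is the "empty message" the comment in the patch describes for the
show_all case.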

[ ... ]

> @@ -3564,13 +3767,14 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
>   {
>   	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
>   	DIAG_REQUEST(req, struct inet_diag_req_v2 r);
> +	struct rtattr *bpf_stgs_rta = NULL;
>   	char    *bc = NULL;
>   	int	bclen;
>   	__u32	proto;
>   	struct msghdr msg;
>   	struct rtattr rta_bc;
>   	struct rtattr rta_proto;
> -	struct iovec iov[5];
> +	struct iovec iov[6];
>   	int iovlen = 1;
> 
>   	if (family == PF_UNSPEC)
> @@ -3623,6 +3827,19 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
>   		iovlen += 2;
>   	}
> 
> +#ifdef HAVE_LIBBPF
> +	if (bpf_map_opts_is_enabled()) {
> +		bpf_stgs_rta = bpf_map_opts_alloc_rta();
> +		if (!bpf_stgs_rta) {
> +			fprintf(stderr, "ss: cannot alloc request for --bpf-map\n");
> +			return -1;
> +		}
> +
> +		iov[iovlen++] = (struct iovec){ bpf_stgs_rta, bpf_stgs_rta->rta_len };
> +		req.nlh.nlmsg_len += bpf_stgs_rta->rta_len;
> +	}
> +#endif
> +
>   	msg = (struct msghdr) {
>   		.msg_name = (void *)&nladdr,
>   		.msg_namelen = sizeof(nladdr),
> @@ -3631,10 +3848,13 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
>   	};
> 
>   	if (sendmsg(fd, &msg, 0) < 0) {
> +		free(bpf_stgs_rta);
>   		close(fd);
>   		return -1;
>   	}
> 
> +	free(bpf_stgs_rta);
> +
>   	return 0;
>   }



end of thread, other threads:[~2024-01-13  2:12 UTC | newest]

Thread overview: 8+ messages
2024-01-12 14:04 [PATCH v4 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
2024-01-12 14:04 ` [PATCH v4 1/3] ss: add support for " Quentin Deslandes
2024-01-12 22:50   ` Martin KaFai Lau
2024-01-13  2:12   ` Martin KaFai Lau
2024-01-12 14:04 ` [PATCH v4 2/3] ss: pretty-print " Quentin Deslandes
2024-01-12 22:59   ` Martin KaFai Lau
2024-01-12 14:04 ` [PATCH v4 3/3] ss: update man page to document --bpf-maps and --bpf-map-id= Quentin Deslandes
2024-01-12 23:00   ` Martin KaFai Lau
