All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type
@ 2023-03-02 17:27 Florian Westphal
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Florian Westphal @ 2023-03-02 17:27 UTC (permalink / raw)
  To: bpf; +Cc: netfilter-devel, Florian Westphal

Add minimal support to hook bpf programs to netfilter hooks,
e.g. PREROUTING or FORWARD.

For this the most relevant parts for registering a netfilter
hook via the in-kernel api are exposed to userspace via bpf_link.

The new program type is 'tracing style' and assumes skb dynptrs are used
rather than 'direct packet access'.

With this its possible to build a small test program such as:

#include "vmlinux.h"

extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
                               struct bpf_dynptr *ptr__uninit) __ksym;
extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
                                   void *buffer, uint32_t buffer__sz) __ksym;

SEC("netfilter")
int nf_test(struct bpf_nf_ctx *ctx)
{
	struct nf_hook_state *state = ctx->state;
	struct sk_buff *skb = ctx->skb;
	const struct iphdr *iph, _iph;
	const struct tcphdr *th, _th;
	struct bpf_dynptr ptr;

	if (bpf_dynptr_from_skb(skb, 0, &ptr))
		return NF_DROP;

	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
	if (!iph)
		return NF_DROP;

	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));
	if (!th)
		return NF_DROP;

	bpf_printk("accept %x:%d->%x:%d, hook %d ifin %d\n", iph->saddr, bpf_ntohs(th->source), iph->daddr, bpf_ntohs(th->dest), state->hook, state->in->ifindex);
        return NF_ACCEPT;
}

(output can be observed via /sys/kernel/tracing/trace_pipe).

At this point I think its fairly complete.  Known problems are:
- no test cases, I will look into this.  Might take some time
  though because I might have to extend libbpf first.
- nfnetlink_hook needs minor work so that it can dump the bpf
  program id. As-is, userspace could see that a bpf program
  is attached to e.g. forward and output, but it cannot tell
  which program.  This is fairly simple and doesn't need changes
  on bpf side.

I will work on these address those two next unless anyone spots
a fundamental issue with this rfc set.

Florian Westphal (3):
  bpf: add bpf_link support for BPF_NETFILTER programs
  libbpf: sync header file, add nf prog section name
  bpf: minimal support for programs hooked into netfilter framework

 include/linux/bpf_types.h           |   4 +
 include/linux/netfilter.h           |   1 +
 include/net/netfilter/nf_hook_bpf.h |   8 ++
 include/uapi/linux/bpf.h            |  12 ++
 kernel/bpf/btf.c                    |   5 +
 kernel/bpf/syscall.c                |   6 +
 kernel/bpf/verifier.c               |   3 +
 net/netfilter/Kconfig               |   3 +
 net/netfilter/Makefile              |   1 +
 net/netfilter/nf_bpf_link.c         | 192 ++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h      |  12 ++
 tools/lib/bpf/libbpf.c              |   1 +
 12 files changed, 248 insertions(+)
 create mode 100644 include/net/netfilter/nf_hook_bpf.h
 create mode 100644 net/netfilter/nf_bpf_link.c
-- 
2.39.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
@ 2023-03-02 17:27 ` Florian Westphal
  2023-03-02 20:07   ` kernel test robot
                     ` (2 more replies)
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 2/3] libbpf: sync header file, add nf prog section name Florian Westphal
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 20+ messages in thread
From: Florian Westphal @ 2023-03-02 17:27 UTC (permalink / raw)
  To: bpf; +Cc: netfilter-devel, Florian Westphal

Add bpf_link support skeleton.  To keep this reviewable, no bpf program
can be invoked here, if a program would be attached only a c-stub is called.

This defaults to 'y' if both netfilter and bpf syscall are enabled in
kconfig.

Uapi example usage:
	union bpf_attr attr = { };

	attr.link_create.prog_fd = progfd;
	attr.link_create.attach_type = BPF_NETFILTER;
	attr.link_create.netfilter.pf = PF_INET;
	attr.link_create.netfilter.hooknum = NF_INET_LOCAL_IN;
	attr.link_create.netfilter.priority = -128;

	err = bpf(BPF_LINK_CREATE, &attr, sizeof(attr));

... this would attach progfd to ipv4:input hook.

Such hook gets removed automatically if the calling program exits.

BPF_NETFILTER program invocation is added in followup change.

NF_HOOK_OP_BPF enum will eventually be read from nfnetlink_hook, it
allows to tell userspace which program is attached at the given hook
when user runs 'nft hook list' command rather than just the priority
and not-very-helpful 'this hook runs a bpf prog but I can't tell which
one'.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/netfilter.h           |   1 +
 include/net/netfilter/nf_hook_bpf.h |   2 +
 include/uapi/linux/bpf.h            |  12 +++
 kernel/bpf/syscall.c                |   6 ++
 net/netfilter/Kconfig               |   3 +
 net/netfilter/Makefile              |   1 +
 net/netfilter/nf_bpf_link.c         | 116 ++++++++++++++++++++++++++++
 7 files changed, 141 insertions(+)
 create mode 100644 include/net/netfilter/nf_hook_bpf.h
 create mode 100644 net/netfilter/nf_bpf_link.c

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 6863e271a9de..beec40ccbd79 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -80,6 +80,7 @@ typedef unsigned int nf_hookfn(void *priv,
 enum nf_hook_ops_type {
 	NF_HOOK_OP_UNDEFINED,
 	NF_HOOK_OP_NF_TABLES,
+	NF_HOOK_OP_BPF,
 };
 
 struct nf_hook_ops {
diff --git a/include/net/netfilter/nf_hook_bpf.h b/include/net/netfilter/nf_hook_bpf.h
new file mode 100644
index 000000000000..9d1b338e89d7
--- /dev/null
+++ b/include/net/netfilter/nf_hook_bpf.h
@@ -0,0 +1,2 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c9699304aed2..b063c6985769 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -986,6 +986,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
 	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
+	BPF_PROG_TYPE_NETFILTER,
 };
 
 enum bpf_attach_type {
@@ -1049,6 +1050,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_PERF_EVENT = 7,
 	BPF_LINK_TYPE_KPROBE_MULTI = 8,
 	BPF_LINK_TYPE_STRUCT_OPS = 9,
+	BPF_LINK_TYPE_NETFILTER = 10,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1543,6 +1545,11 @@ union bpf_attr {
 				 */
 				__u64		cookie;
 			} tracing;
+			struct {
+				__u32		pf;
+				__u32		hooknum;
+				__s32		prio;
+			} netfilter;
 		};
 	} link_create;
 
@@ -6373,6 +6380,11 @@ struct bpf_link_info {
 		struct {
 			__u32 ifindex;
 		} xdp;
+		struct {
+			__u32 pf;
+			__u32 hooknum;
+			__s32 priority;
+		} netfilter;
 	};
 } __attribute__((aligned(8)));
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e3fcdc9836a6..8f5cd1fb83ee 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -35,6 +35,7 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
 #include <linux/trace_events.h>
+#include <net/netfilter/nf_hook_bpf.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -2448,6 +2449,7 @@ static bool is_net_admin_prog_type(enum bpf_prog_type prog_type)
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
 	case BPF_PROG_TYPE_EXT: /* extends any prog */
+	case BPF_PROG_TYPE_NETFILTER:
 		return true;
 	case BPF_PROG_TYPE_CGROUP_SKB:
 		/* always unpriv */
@@ -4562,6 +4564,7 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 
 	switch (prog->type) {
 	case BPF_PROG_TYPE_EXT:
+	case BPF_PROG_TYPE_NETFILTER:
 		break;
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_TRACEPOINT:
@@ -4628,6 +4631,9 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 	case BPF_PROG_TYPE_XDP:
 		ret = bpf_xdp_link_attach(attr, prog);
 		break;
+	case BPF_PROG_TYPE_NETFILTER:
+		ret = bpf_nf_link_attach(attr, prog);
+		break;
 #endif
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_TRACEPOINT:
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 4d6737160857..bea06f62a30e 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -30,6 +30,9 @@ config NETFILTER_FAMILY_BRIDGE
 config NETFILTER_FAMILY_ARP
 	bool
 
+config NETFILTER_BPF_LINK
+	def_bool BPF_SYSCALL
+
 config NETFILTER_NETLINK_HOOK
 	tristate "Netfilter base hook dump support"
 	depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 5ffef1cd6143..d4958e7e7631 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -22,6 +22,7 @@ nf_conntrack-$(CONFIG_DEBUG_INFO_BTF) += nf_conntrack_bpf.o
 endif
 
 obj-$(CONFIG_NETFILTER) = netfilter.o
+obj-$(CONFIG_NETFILTER_BPF_LINK) += nf_bpf_link.o
 
 obj-$(CONFIG_NETFILTER_NETLINK) += nfnetlink.o
 obj-$(CONFIG_NETFILTER_NETLINK_ACCT) += nfnetlink_acct.o
diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
new file mode 100644
index 000000000000..fa4fae5cc669
--- /dev/null
+++ b/net/netfilter/nf_bpf_link.c
@@ -0,0 +1,116 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <linux/netfilter.h>
+
+#include <net/netfilter/nf_hook_bpf.h>
+
+static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, const struct nf_hook_state *s)
+{
+	return NF_ACCEPT;
+}
+
+struct bpf_nf_link {
+	struct bpf_link link;
+	struct nf_hook_ops hook_ops;
+	struct net *net;
+};
+
+static void bpf_nf_link_release(struct bpf_link *link)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);
+}
+
+static void bpf_nf_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	kfree(nf_link);
+}
+
+static int bpf_nf_link_detach(struct bpf_link *link)
+{
+	bpf_nf_link_release(link);
+	return 0;
+}
+
+static void bpf_nf_link_show_info(const struct bpf_link *link,
+				  struct seq_file *seq)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	seq_printf(seq, "pf:\t%u\thooknum:\t%u\tprio:\t%d\n",
+		  nf_link->hook_ops.pf, nf_link->hook_ops.hooknum,
+		  nf_link->hook_ops.priority);
+}
+
+static int bpf_nf_link_fill_link_info(const struct bpf_link *link,
+				      struct bpf_link_info *info)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	info->netfilter.pf = nf_link->hook_ops.pf;
+	info->netfilter.hooknum = nf_link->hook_ops.hooknum;
+	info->netfilter.priority = nf_link->hook_ops.priority;
+
+	return 0;
+}
+
+static int bpf_nf_link_update(struct bpf_link *link, struct bpf_prog *new_prog,
+			      struct bpf_prog *old_prog)
+{
+	return -EOPNOTSUPP;
+}
+
+static const struct bpf_link_ops bpf_nf_link_lops = {
+	.release = bpf_nf_link_release,
+	.dealloc = bpf_nf_link_dealloc,
+	.detach = bpf_nf_link_detach,
+	.show_fdinfo = bpf_nf_link_show_info,
+	.fill_link_info = bpf_nf_link_fill_link_info,
+	.update_prog = bpf_nf_link_update,
+};
+
+int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	struct net *net = current->nsproxy->net_ns;
+	struct bpf_link_primer link_primer;
+	struct bpf_nf_link *link;
+	int err;
+
+	if (attr->link_create.flags)
+		return -EINVAL;
+
+	link = kzalloc(sizeof(*link), GFP_USER);
+	if (!link)
+		return -ENOMEM;
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_NETFILTER, &bpf_nf_link_lops, prog);
+
+	link->hook_ops.hook = nf_hook_run_bpf;
+	link->hook_ops.hook_ops_type = NF_HOOK_OP_BPF;
+	link->hook_ops.priv = prog;
+
+	link->hook_ops.pf = attr->link_create.netfilter.pf;
+	link->hook_ops.priority = attr->link_create.netfilter.prio;
+	link->hook_ops.hooknum = attr->link_create.netfilter.hooknum;
+
+	link->net = net;
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err)
+		goto out_free;
+
+	err = nf_register_net_hook(net, &link->hook_ops);
+	if (err) {
+		bpf_link_cleanup(&link_primer);
+		goto out_free;
+	}
+
+	return bpf_link_settle(&link_primer);
+
+out_free:
+	kfree(link);
+	return err;
+}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC v2 bpf-next 2/3] libbpf: sync header file, add nf prog section name
  2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
@ 2023-03-02 17:27 ` Florian Westphal
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework Florian Westphal
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Florian Westphal @ 2023-03-02 17:27 UTC (permalink / raw)
  To: bpf; +Cc: netfilter-devel, Florian Westphal

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 tools/include/uapi/linux/bpf.h | 12 ++++++++++++
 tools/lib/bpf/libbpf.c         |  1 +
 2 files changed, 13 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c9699304aed2..b063c6985769 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -986,6 +986,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
 	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
+	BPF_PROG_TYPE_NETFILTER,
 };
 
 enum bpf_attach_type {
@@ -1049,6 +1050,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_PERF_EVENT = 7,
 	BPF_LINK_TYPE_KPROBE_MULTI = 8,
 	BPF_LINK_TYPE_STRUCT_OPS = 9,
+	BPF_LINK_TYPE_NETFILTER = 10,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1543,6 +1545,11 @@ union bpf_attr {
 				 */
 				__u64		cookie;
 			} tracing;
+			struct {
+				__u32		pf;
+				__u32		hooknum;
+				__s32		prio;
+			} netfilter;
 		};
 	} link_create;
 
@@ -6373,6 +6380,11 @@ struct bpf_link_info {
 		struct {
 			__u32 ifindex;
 		} xdp;
+		struct {
+			__u32 pf;
+			__u32 hooknum;
+			__s32 priority;
+		} netfilter;
 	};
 } __attribute__((aligned(8)));
 
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 05c4db355f28..a2cbd6fe6a51 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8570,6 +8570,7 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("struct_ops+",		STRUCT_OPS, 0, SEC_NONE),
 	SEC_DEF("struct_ops.s+",	STRUCT_OPS, 0, SEC_SLEEPABLE),
 	SEC_DEF("sk_lookup",		SK_LOOKUP, BPF_SK_LOOKUP, SEC_ATTACHABLE),
+	SEC_DEF("netfilter",		NETFILTER, 0, SEC_NONE),
 };
 
 static size_t custom_sec_def_cnt;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework
  2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 2/3] libbpf: sync header file, add nf prog section name Florian Westphal
@ 2023-03-02 17:27 ` Florian Westphal
  2023-03-02 19:59   ` Toke Høiland-Jørgensen
  2023-03-02 19:59 ` [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Toke Høiland-Jørgensen
  2023-03-23  0:36 ` Daniel Xu
  4 siblings, 1 reply; 20+ messages in thread
From: Florian Westphal @ 2023-03-02 17:27 UTC (permalink / raw)
  To: bpf; +Cc: netfilter-devel, Florian Westphal

This adds minimal support for BPF_PROG_TYPE_NETFILTER bpf programs
that will be invoked via the NF_HOOK() points in the ip stack.

Invocation incurs an indirect call.  This is not a necessity: Its
possible to add 'DEFINE_BPF_DISPATCHER(nf_progs)' and handle the
program invocation with the same method already done for xdp progs.

This isn't done here to keep the size of this chunk down.

Verifier restricts verdicts to either DROP or ACCEPT.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/bpf_types.h           |  4 ++
 include/net/netfilter/nf_hook_bpf.h |  6 +++
 kernel/bpf/btf.c                    |  5 ++
 kernel/bpf/verifier.c               |  3 ++
 net/netfilter/nf_bpf_link.c         | 78 ++++++++++++++++++++++++++++-
 5 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index d4ee3ccd3753..39a999abb0ce 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -79,6 +79,10 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
 #endif
 BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall,
 	      void *, void *)
+#ifdef CONFIG_NETFILTER
+BPF_PROG_TYPE(BPF_PROG_TYPE_NETFILTER, netfilter,
+	      struct bpf_nf_ctx, struct bpf_nf_ctx)
+#endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/net/netfilter/nf_hook_bpf.h b/include/net/netfilter/nf_hook_bpf.h
index 9d1b338e89d7..863cbbcc66f9 100644
--- a/include/net/netfilter/nf_hook_bpf.h
+++ b/include/net/netfilter/nf_hook_bpf.h
@@ -1,2 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+
+struct bpf_nf_ctx {
+	const struct nf_hook_state *state;
+	struct sk_buff *skb;
+};
+
 int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index ef2d8969ed1f..ec6eb78b9aec 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -25,6 +25,9 @@
 #include <linux/bsearch.h>
 #include <linux/kobject.h>
 #include <linux/sysfs.h>
+
+#include <net/netfilter/nf_hook_bpf.h>
+
 #include <net/sock.h>
 #include "../tools/lib/bpf/relo_core.h"
 
@@ -7726,6 +7729,8 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
 	case BPF_PROG_TYPE_LWT_XMIT:
 	case BPF_PROG_TYPE_LWT_SEG6LOCAL:
 		return BTF_KFUNC_HOOK_LWT;
+	case BPF_PROG_TYPE_NETFILTER:
+		return BTF_KFUNC_HOOK_SOCKET_FILTER;
 	default:
 		return BTF_KFUNC_HOOK_MAX;
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8dcddcc00bd0..733af5a28421 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13142,6 +13142,9 @@ static int check_return_code(struct bpf_verifier_env *env)
 		}
 		break;
 
+	case BPF_PROG_TYPE_NETFILTER:
+		range = tnum_range(NF_DROP, NF_ACCEPT);
+		break;
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.
diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
index fa4fae5cc669..b28b7bbdc3a9 100644
--- a/net/netfilter/nf_bpf_link.c
+++ b/net/netfilter/nf_bpf_link.c
@@ -1,12 +1,19 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/bpf.h>
+#include <linux/filter.h>
 #include <linux/netfilter.h>
 
 #include <net/netfilter/nf_hook_bpf.h>
 
 static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, const struct nf_hook_state *s)
 {
-	return NF_ACCEPT;
+	const struct bpf_prog *prog = bpf_prog;
+	struct bpf_nf_ctx ctx = {
+		.state = s,
+		.skb = skb,
+	};
+
+	return bpf_prog_run(prog, &ctx);
 }
 
 struct bpf_nf_link {
@@ -114,3 +121,72 @@ int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 	kfree(link);
 	return err;
 }
+
+static int bpf_prog_test_run_nf(struct bpf_prog *prog,
+				const union bpf_attr *kattr,
+				union bpf_attr __user *uattr)
+{
+	return -EOPNOTSUPP;
+}
+
+const struct bpf_prog_ops netfilter_prog_ops = {
+	.test_run		= bpf_prog_test_run_nf,
+};
+
+static bool nf_ptr_to_btf_id(struct bpf_insn_access_aux *info, const char *name)
+{
+	struct btf *btf;
+	s32 type_id;
+
+	btf = bpf_get_btf_vmlinux();
+	if (IS_ERR_OR_NULL(btf))
+		return false;
+
+	type_id = btf_find_by_name_kind(btf, name, BTF_KIND_STRUCT);
+	if (WARN_ON_ONCE(type_id < 0))
+		return false;
+
+	info->btf = btf;
+	info->btf_id = type_id;
+	info->reg_type = PTR_TO_BTF_ID | PTR_TRUSTED;
+	return true;
+}
+
+static bool nf_is_valid_access(int off, int size, enum bpf_access_type type,
+			       const struct bpf_prog *prog,
+			       struct bpf_insn_access_aux *info)
+{
+	if (off < 0 || off >= sizeof(struct bpf_nf_ctx))
+		return false;
+
+	if (type == BPF_WRITE)
+		return false;
+
+	switch (off) {
+	case bpf_ctx_range(struct bpf_nf_ctx, skb):
+		if (size != sizeof_field(struct bpf_nf_ctx, skb))
+			return false;
+
+		return nf_ptr_to_btf_id(info, "sk_buff");
+	case bpf_ctx_range(struct bpf_nf_ctx, state):
+		if (size != sizeof_field(struct bpf_nf_ctx, state))
+			return false;
+
+		return nf_ptr_to_btf_id(info, "nf_hook_state");
+	default:
+		return false;
+	}
+
+	return false;
+}
+
+static const struct bpf_func_proto *
+bpf_nf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	return bpf_base_func_proto(func_id);
+}
+
+const struct bpf_verifier_ops netfilter_verifier_ops = {
+	.is_valid_access	= nf_is_valid_access,
+	.get_func_proto		= bpf_nf_func_proto,
+};
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework Florian Westphal
@ 2023-03-02 19:59   ` Toke Høiland-Jørgensen
  2023-03-02 23:53     ` Florian Westphal
  0 siblings, 1 reply; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-02 19:59 UTC (permalink / raw)
  To: Florian Westphal, bpf; +Cc: netfilter-devel, Florian Westphal

Florian Westphal <fw@strlen.de> writes:

> This adds minimal support for BPF_PROG_TYPE_NETFILTER bpf programs
> that will be invoked via the NF_HOOK() points in the ip stack.
>
> Invocation incurs an indirect call.  This is not a necessity: Its
> possible to add 'DEFINE_BPF_DISPATCHER(nf_progs)' and handle the
> program invocation with the same method already done for xdp progs.
>
> This isn't done here to keep the size of this chunk down.
>
> Verifier restricts verdicts to either DROP or ACCEPT.
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  include/linux/bpf_types.h           |  4 ++
>  include/net/netfilter/nf_hook_bpf.h |  6 +++
>  kernel/bpf/btf.c                    |  5 ++
>  kernel/bpf/verifier.c               |  3 ++
>  net/netfilter/nf_bpf_link.c         | 78 ++++++++++++++++++++++++++++-
>  5 files changed, 95 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
> index d4ee3ccd3753..39a999abb0ce 100644
> --- a/include/linux/bpf_types.h
> +++ b/include/linux/bpf_types.h
> @@ -79,6 +79,10 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
>  #endif
>  BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall,
>  	      void *, void *)
> +#ifdef CONFIG_NETFILTER
> +BPF_PROG_TYPE(BPF_PROG_TYPE_NETFILTER, netfilter,
> +	      struct bpf_nf_ctx, struct bpf_nf_ctx)
> +#endif
>  
>  BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
>  BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
> diff --git a/include/net/netfilter/nf_hook_bpf.h b/include/net/netfilter/nf_hook_bpf.h
> index 9d1b338e89d7..863cbbcc66f9 100644
> --- a/include/net/netfilter/nf_hook_bpf.h
> +++ b/include/net/netfilter/nf_hook_bpf.h
> @@ -1,2 +1,8 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
> +
> +struct bpf_nf_ctx {
> +	const struct nf_hook_state *state;
> +	struct sk_buff *skb;
> +};
> +
>  int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index ef2d8969ed1f..ec6eb78b9aec 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -25,6 +25,9 @@
>  #include <linux/bsearch.h>
>  #include <linux/kobject.h>
>  #include <linux/sysfs.h>
> +
> +#include <net/netfilter/nf_hook_bpf.h>
> +
>  #include <net/sock.h>
>  #include "../tools/lib/bpf/relo_core.h"
>  
> @@ -7726,6 +7729,8 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
>  	case BPF_PROG_TYPE_LWT_XMIT:
>  	case BPF_PROG_TYPE_LWT_SEG6LOCAL:
>  		return BTF_KFUNC_HOOK_LWT;
> +	case BPF_PROG_TYPE_NETFILTER:
> +		return BTF_KFUNC_HOOK_SOCKET_FILTER;

The dynptr patch reuses the actual set between the different IDs, so
this should probably define a new BTF_KFUNC_HOOK_NETFILTER, with an
associated register_btf_kfunc_id_set() call?

-Toke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type
  2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
                   ` (2 preceding siblings ...)
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework Florian Westphal
@ 2023-03-02 19:59 ` Toke Høiland-Jørgensen
  2023-03-23  0:36 ` Daniel Xu
  4 siblings, 0 replies; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-02 19:59 UTC (permalink / raw)
  To: Florian Westphal, bpf; +Cc: netfilter-devel, Florian Westphal

Florian Westphal <fw@strlen.de> writes:

> Add minimal support to hook bpf programs to netfilter hooks,
> e.g. PREROUTING or FORWARD.
>
> For this the most relevant parts for registering a netfilter
> hook via the in-kernel api are exposed to userspace via bpf_link.
>
> The new program type is 'tracing style' and assumes skb dynptrs are used
> rather than 'direct packet access'.
>
> With this its possible to build a small test program such as:
>
> #include "vmlinux.h"
>
> extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
>                                struct bpf_dynptr *ptr__uninit) __ksym;
> extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
>                                    void *buffer, uint32_t buffer__sz) __ksym;
>
> SEC("netfilter")
> int nf_test(struct bpf_nf_ctx *ctx)
> {
> 	struct nf_hook_state *state = ctx->state;
> 	struct sk_buff *skb = ctx->skb;
> 	const struct iphdr *iph, _iph;
> 	const struct tcphdr *th, _th;
> 	struct bpf_dynptr ptr;
>
> 	if (bpf_dynptr_from_skb(skb, 0, &ptr))
> 		return NF_DROP;
>
> 	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
> 	if (!iph)
> 		return NF_DROP;
>
> 	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));
> 	if (!th)
> 		return NF_DROP;
>
> 	bpf_printk("accept %x:%d->%x:%d, hook %d ifin %d\n", iph->saddr, bpf_ntohs(th->source), iph->daddr, bpf_ntohs(th->dest), state->hook, state->in->ifindex);
>         return NF_ACCEPT;
> }
>
> (output can be observed via /sys/kernel/tracing/trace_pipe).
>
> At this point I think its fairly complete.  Known problems are:
> - no test cases, I will look into this.  Might take some time
>   though because I might have to extend libbpf first.
> - nfnetlink_hook needs minor work so that it can dump the bpf
>   program id. As-is, userspace could see that a bpf program
>   is attached to e.g. forward and output, but it cannot tell
>   which program.  This is fairly simple and doesn't need changes
>   on bpf side.
>
> I will work on these address those two next unless anyone spots
> a fundamental issue with this rfc set.

I only spotted one small nit on the third patch, which I replied to
separately. Otherwise I think it looks pretty good, in fact I'm amazed
at how little code it takes to enable this; nice work! :)

-Toke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
@ 2023-03-02 20:07   ` kernel test robot
  2023-03-02 20:28   ` Stanislav Fomichev
  2023-03-02 21:10   ` kernel test robot
  2 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2023-03-02 20:07 UTC (permalink / raw)
  To: Florian Westphal; +Cc: oe-kbuild-all

Hi Florian,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Florian-Westphal/bpf-add-bpf_link-support-for-BPF_NETFILTER-programs/20230303-013022
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230302172757.9548-2-fw%40strlen.de
patch subject: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
config: i386-randconfig-a003 (https://download.01.org/0day-ci/archive/20230303/202303030452.W2gRyUnd-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/dafd9c7dce974be4bf60f89c8427a765acfcba77
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Florian-Westphal/bpf-add-bpf_link-support-for-BPF_NETFILTER-programs/20230303-013022
        git checkout dafd9c7dce974be4bf60f89c8427a765acfcba77
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303030452.W2gRyUnd-lkp@intel.com/

All errors (new ones prefixed by >>):

   ld: kernel/bpf/syscall.o: in function `link_create':
>> kernel/bpf/syscall.c:4641: undefined reference to `bpf_nf_link_attach'


vim +4641 kernel/bpf/syscall.c

  4551	
  4552	#define BPF_LINK_CREATE_LAST_FIELD link_create.kprobe_multi.cookies
  4553	static int link_create(union bpf_attr *attr, bpfptr_t uattr)
  4554	{
  4555		enum bpf_prog_type ptype;
  4556		struct bpf_prog *prog;
  4557		int ret;
  4558	
  4559		if (CHECK_ATTR(BPF_LINK_CREATE))
  4560			return -EINVAL;
  4561	
  4562		prog = bpf_prog_get(attr->link_create.prog_fd);
  4563		if (IS_ERR(prog))
  4564			return PTR_ERR(prog);
  4565	
  4566		ret = bpf_prog_attach_check_attach_type(prog,
  4567							attr->link_create.attach_type);
  4568		if (ret)
  4569			goto out;
  4570	
  4571		switch (prog->type) {
  4572		case BPF_PROG_TYPE_EXT:
  4573		case BPF_PROG_TYPE_NETFILTER:
  4574			break;
  4575		case BPF_PROG_TYPE_PERF_EVENT:
  4576		case BPF_PROG_TYPE_TRACEPOINT:
  4577			if (attr->link_create.attach_type != BPF_PERF_EVENT) {
  4578				ret = -EINVAL;
  4579				goto out;
  4580			}
  4581			break;
  4582		case BPF_PROG_TYPE_KPROBE:
  4583			if (attr->link_create.attach_type != BPF_PERF_EVENT &&
  4584			    attr->link_create.attach_type != BPF_TRACE_KPROBE_MULTI) {
  4585				ret = -EINVAL;
  4586				goto out;
  4587			}
  4588			break;
  4589		default:
  4590			ptype = attach_type_to_prog_type(attr->link_create.attach_type);
  4591			if (ptype == BPF_PROG_TYPE_UNSPEC || ptype != prog->type) {
  4592				ret = -EINVAL;
  4593				goto out;
  4594			}
  4595			break;
  4596		}
  4597	
  4598		switch (prog->type) {
  4599		case BPF_PROG_TYPE_CGROUP_SKB:
  4600		case BPF_PROG_TYPE_CGROUP_SOCK:
  4601		case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
  4602		case BPF_PROG_TYPE_SOCK_OPS:
  4603		case BPF_PROG_TYPE_CGROUP_DEVICE:
  4604		case BPF_PROG_TYPE_CGROUP_SYSCTL:
  4605		case BPF_PROG_TYPE_CGROUP_SOCKOPT:
  4606			ret = cgroup_bpf_link_attach(attr, prog);
  4607			break;
  4608		case BPF_PROG_TYPE_EXT:
  4609			ret = bpf_tracing_prog_attach(prog,
  4610						      attr->link_create.target_fd,
  4611						      attr->link_create.target_btf_id,
  4612						      attr->link_create.tracing.cookie);
  4613			break;
  4614		case BPF_PROG_TYPE_LSM:
  4615		case BPF_PROG_TYPE_TRACING:
  4616			if (attr->link_create.attach_type != prog->expected_attach_type) {
  4617				ret = -EINVAL;
  4618				goto out;
  4619			}
  4620			if (prog->expected_attach_type == BPF_TRACE_RAW_TP)
  4621				ret = bpf_raw_tp_link_attach(prog, NULL);
  4622			else if (prog->expected_attach_type == BPF_TRACE_ITER)
  4623				ret = bpf_iter_link_attach(attr, uattr, prog);
  4624			else if (prog->expected_attach_type == BPF_LSM_CGROUP)
  4625				ret = cgroup_bpf_link_attach(attr, prog);
  4626			else
  4627				ret = bpf_tracing_prog_attach(prog,
  4628							      attr->link_create.target_fd,
  4629							      attr->link_create.target_btf_id,
  4630							      attr->link_create.tracing.cookie);
  4631			break;
  4632		case BPF_PROG_TYPE_FLOW_DISSECTOR:
  4633		case BPF_PROG_TYPE_SK_LOOKUP:
  4634			ret = netns_bpf_link_create(attr, prog);
  4635			break;
  4636	#ifdef CONFIG_NET
  4637		case BPF_PROG_TYPE_XDP:
  4638			ret = bpf_xdp_link_attach(attr, prog);
  4639			break;
  4640		case BPF_PROG_TYPE_NETFILTER:
> 4641			ret = bpf_nf_link_attach(attr, prog);
  4642			break;
  4643	#endif
  4644		case BPF_PROG_TYPE_PERF_EVENT:
  4645		case BPF_PROG_TYPE_TRACEPOINT:
  4646			ret = bpf_perf_link_attach(attr, prog);
  4647			break;
  4648		case BPF_PROG_TYPE_KPROBE:
  4649			if (attr->link_create.attach_type == BPF_PERF_EVENT)
  4650				ret = bpf_perf_link_attach(attr, prog);
  4651			else
  4652				ret = bpf_kprobe_multi_link_attach(attr, prog);
  4653			break;
  4654		default:
  4655			ret = -EINVAL;
  4656		}
  4657	
  4658	out:
  4659		if (ret < 0)
  4660			bpf_prog_put(prog);
  4661		return ret;
  4662	}
  4663	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
  2023-03-02 20:07   ` kernel test robot
@ 2023-03-02 20:28   ` Stanislav Fomichev
  2023-03-03  0:27     ` Florian Westphal
  2023-03-02 21:10   ` kernel test robot
  2 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-02 20:28 UTC (permalink / raw)
  To: Florian Westphal; +Cc: bpf, netfilter-devel

On 03/02, Florian Westphal wrote:
> Add bpf_link support skeleton.  To keep this reviewable, no bpf program
> can be invoked here, if a program would be attached only a c-stub is  
> called.

> This defaults to 'y' if both netfilter and bpf syscall are enabled in
> kconfig.

> Uapi example usage:
> 	union bpf_attr attr = { };

> 	attr.link_create.prog_fd = progfd;
> 	attr.link_create.attach_type = BPF_NETFILTER;
> 	attr.link_create.netfilter.pf = PF_INET;
> 	attr.link_create.netfilter.hooknum = NF_INET_LOCAL_IN;
> 	attr.link_create.netfilter.priority = -128;

> 	err = bpf(BPF_LINK_CREATE, &attr, sizeof(attr));

> ... this would attach progfd to ipv4:input hook.

> Such hook gets removed automatically if the calling program exits.

> BPF_NETFILTER program invocation is added in followup change.

> NF_HOOK_OP_BPF enum will eventually be read from nfnetlink_hook, it
> allows to tell userspace which program is attached at the given hook
> when user runs 'nft hook list' command rather than just the priority
> and not-very-helpful 'this hook runs a bpf prog but I can't tell which
> one'.

> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>   include/linux/netfilter.h           |   1 +
>   include/net/netfilter/nf_hook_bpf.h |   2 +
>   include/uapi/linux/bpf.h            |  12 +++
>   kernel/bpf/syscall.c                |   6 ++
>   net/netfilter/Kconfig               |   3 +
>   net/netfilter/Makefile              |   1 +
>   net/netfilter/nf_bpf_link.c         | 116 ++++++++++++++++++++++++++++
>   7 files changed, 141 insertions(+)
>   create mode 100644 include/net/netfilter/nf_hook_bpf.h
>   create mode 100644 net/netfilter/nf_bpf_link.c

> diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
> index 6863e271a9de..beec40ccbd79 100644
> --- a/include/linux/netfilter.h
> +++ b/include/linux/netfilter.h
> @@ -80,6 +80,7 @@ typedef unsigned int nf_hookfn(void *priv,
>   enum nf_hook_ops_type {
>   	NF_HOOK_OP_UNDEFINED,
>   	NF_HOOK_OP_NF_TABLES,
> +	NF_HOOK_OP_BPF,
>   };

>   struct nf_hook_ops {
> diff --git a/include/net/netfilter/nf_hook_bpf.h  
> b/include/net/netfilter/nf_hook_bpf.h
> new file mode 100644
> index 000000000000..9d1b338e89d7
> --- /dev/null
> +++ b/include/net/netfilter/nf_hook_bpf.h
> @@ -0,0 +1,2 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog  
> *prog);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c9699304aed2..b063c6985769 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -986,6 +986,7 @@ enum bpf_prog_type {
>   	BPF_PROG_TYPE_LSM,
>   	BPF_PROG_TYPE_SK_LOOKUP,
>   	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
> +	BPF_PROG_TYPE_NETFILTER,
>   };

>   enum bpf_attach_type {
> @@ -1049,6 +1050,7 @@ enum bpf_link_type {
>   	BPF_LINK_TYPE_PERF_EVENT = 7,
>   	BPF_LINK_TYPE_KPROBE_MULTI = 8,
>   	BPF_LINK_TYPE_STRUCT_OPS = 9,
> +	BPF_LINK_TYPE_NETFILTER = 10,

>   	MAX_BPF_LINK_TYPE,
>   };
> @@ -1543,6 +1545,11 @@ union bpf_attr {
>   				 */
>   				__u64		cookie;
>   			} tracing;
> +			struct {
> +				__u32		pf;
> +				__u32		hooknum;
> +				__s32		prio;
> +			} netfilter;

For recent tc BPF program extensions, we've discussed that it might be  
better
to have an option to attach program before/after another one in the chain.
So the API essentially would receive a before/after flag + fd/id of the
program where we want to be.

Should we do something similar here? See [0] for the original
discussion.

0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/

>   		};
>   	} link_create;

> @@ -6373,6 +6380,11 @@ struct bpf_link_info {
>   		struct {
>   			__u32 ifindex;
>   		} xdp;
> +		struct {
> +			__u32 pf;
> +			__u32 hooknum;
> +			__s32 priority;
> +		} netfilter;
>   	};
>   } __attribute__((aligned(8)));

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index e3fcdc9836a6..8f5cd1fb83ee 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -35,6 +35,7 @@
>   #include <linux/rcupdate_trace.h>
>   #include <linux/memcontrol.h>
>   #include <linux/trace_events.h>
> +#include <net/netfilter/nf_hook_bpf.h>

>   #define IS_FD_ARRAY(map) ((map)->map_type ==  
> BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
>   			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> @@ -2448,6 +2449,7 @@ static bool is_net_admin_prog_type(enum  
> bpf_prog_type prog_type)
>   	case BPF_PROG_TYPE_CGROUP_SYSCTL:
>   	case BPF_PROG_TYPE_SOCK_OPS:
>   	case BPF_PROG_TYPE_EXT: /* extends any prog */
> +	case BPF_PROG_TYPE_NETFILTER:
>   		return true;
>   	case BPF_PROG_TYPE_CGROUP_SKB:
>   		/* always unpriv */
> @@ -4562,6 +4564,7 @@ static int link_create(union bpf_attr *attr,  
> bpfptr_t uattr)

>   	switch (prog->type) {
>   	case BPF_PROG_TYPE_EXT:
> +	case BPF_PROG_TYPE_NETFILTER:
>   		break;
>   	case BPF_PROG_TYPE_PERF_EVENT:
>   	case BPF_PROG_TYPE_TRACEPOINT:
> @@ -4628,6 +4631,9 @@ static int link_create(union bpf_attr *attr,  
> bpfptr_t uattr)
>   	case BPF_PROG_TYPE_XDP:
>   		ret = bpf_xdp_link_attach(attr, prog);
>   		break;
> +	case BPF_PROG_TYPE_NETFILTER:
> +		ret = bpf_nf_link_attach(attr, prog);
> +		break;
>   #endif
>   	case BPF_PROG_TYPE_PERF_EVENT:
>   	case BPF_PROG_TYPE_TRACEPOINT:
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index 4d6737160857..bea06f62a30e 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -30,6 +30,9 @@ config NETFILTER_FAMILY_BRIDGE
>   config NETFILTER_FAMILY_ARP
>   	bool

> +config NETFILTER_BPF_LINK
> +	def_bool BPF_SYSCALL
> +
>   config NETFILTER_NETLINK_HOOK
>   	tristate "Netfilter base hook dump support"
>   	depends on NETFILTER_ADVANCED
> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
> index 5ffef1cd6143..d4958e7e7631 100644
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -22,6 +22,7 @@ nf_conntrack-$(CONFIG_DEBUG_INFO_BTF) +=  
> nf_conntrack_bpf.o
>   endif

>   obj-$(CONFIG_NETFILTER) = netfilter.o
> +obj-$(CONFIG_NETFILTER_BPF_LINK) += nf_bpf_link.o

>   obj-$(CONFIG_NETFILTER_NETLINK) += nfnetlink.o
>   obj-$(CONFIG_NETFILTER_NETLINK_ACCT) += nfnetlink_acct.o
> diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
> new file mode 100644
> index 000000000000..fa4fae5cc669
> --- /dev/null
> +++ b/net/netfilter/nf_bpf_link.c
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <linux/netfilter.h>
> +
> +#include <net/netfilter/nf_hook_bpf.h>
> +
> +static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb,  
> const struct nf_hook_state *s)
> +{
> +	return NF_ACCEPT;
> +}
> +
> +struct bpf_nf_link {
> +	struct bpf_link link;
> +	struct nf_hook_ops hook_ops;
> +	struct net *net;
> +};
> +
> +static void bpf_nf_link_release(struct bpf_link *link)
> +{
> +	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link,  
> link);
> +
> +	nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);
> +}
> +
> +static void bpf_nf_link_dealloc(struct bpf_link *link)
> +{
> +	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link,  
> link);
> +
> +	kfree(nf_link);
> +}
> +
> +static int bpf_nf_link_detach(struct bpf_link *link)
> +{
> +	bpf_nf_link_release(link);
> +	return 0;
> +}
> +
> +static void bpf_nf_link_show_info(const struct bpf_link *link,
> +				  struct seq_file *seq)
> +{
> +	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link,  
> link);
> +
> +	seq_printf(seq, "pf:\t%u\thooknum:\t%u\tprio:\t%d\n",
> +		  nf_link->hook_ops.pf, nf_link->hook_ops.hooknum,
> +		  nf_link->hook_ops.priority);
> +}
> +
> +static int bpf_nf_link_fill_link_info(const struct bpf_link *link,
> +				      struct bpf_link_info *info)
> +{
> +	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link,  
> link);
> +
> +	info->netfilter.pf = nf_link->hook_ops.pf;
> +	info->netfilter.hooknum = nf_link->hook_ops.hooknum;
> +	info->netfilter.priority = nf_link->hook_ops.priority;
> +
> +	return 0;
> +}
> +
> +static int bpf_nf_link_update(struct bpf_link *link, struct bpf_prog  
> *new_prog,
> +			      struct bpf_prog *old_prog)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +static const struct bpf_link_ops bpf_nf_link_lops = {
> +	.release = bpf_nf_link_release,
> +	.dealloc = bpf_nf_link_dealloc,
> +	.detach = bpf_nf_link_detach,
> +	.show_fdinfo = bpf_nf_link_show_info,
> +	.fill_link_info = bpf_nf_link_fill_link_info,
> +	.update_prog = bpf_nf_link_update,
> +};
> +
> +int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> +{
> +	struct net *net = current->nsproxy->net_ns;
> +	struct bpf_link_primer link_primer;
> +	struct bpf_nf_link *link;
> +	int err;
> +
> +	if (attr->link_create.flags)
> +		return -EINVAL;
> +
> +	link = kzalloc(sizeof(*link), GFP_USER);
> +	if (!link)
> +		return -ENOMEM;
> +
> +	bpf_link_init(&link->link, BPF_LINK_TYPE_NETFILTER, &bpf_nf_link_lops,  
> prog);
> +
> +	link->hook_ops.hook = nf_hook_run_bpf;
> +	link->hook_ops.hook_ops_type = NF_HOOK_OP_BPF;
> +	link->hook_ops.priv = prog;
> +
> +	link->hook_ops.pf = attr->link_create.netfilter.pf;
> +	link->hook_ops.priority = attr->link_create.netfilter.prio;
> +	link->hook_ops.hooknum = attr->link_create.netfilter.hooknum;
> +
> +	link->net = net;
> +
> +	err = bpf_link_prime(&link->link, &link_primer);
> +	if (err)
> +		goto out_free;
> +
> +	err = nf_register_net_hook(net, &link->hook_ops);
> +	if (err) {
> +		bpf_link_cleanup(&link_primer);
> +		goto out_free;
> +	}
> +
> +	return bpf_link_settle(&link_primer);
> +
> +out_free:
> +	kfree(link);
> +	return err;
> +}
> --
> 2.39.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
  2023-03-02 20:07   ` kernel test robot
  2023-03-02 20:28   ` Stanislav Fomichev
@ 2023-03-02 21:10   ` kernel test robot
  2 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2023-03-02 21:10 UTC (permalink / raw)
  To: Florian Westphal; +Cc: llvm, oe-kbuild-all

Hi Florian,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Florian-Westphal/bpf-add-bpf_link-support-for-BPF_NETFILTER-programs/20230303-013022
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230302172757.9548-2-fw%40strlen.de
patch subject: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
config: i386-randconfig-a002 (https://download.01.org/0day-ci/archive/20230303/202303030501.plArM8Yu-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/dafd9c7dce974be4bf60f89c8427a765acfcba77
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Florian-Westphal/bpf-add-bpf_link-support-for-BPF_NETFILTER-programs/20230303-013022
        git checkout dafd9c7dce974be4bf60f89c8427a765acfcba77
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303030501.plArM8Yu-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: bpf_nf_link_attach
   >>> referenced by syscall.c:4641 (kernel/bpf/syscall.c:4641)
   >>>               kernel/bpf/syscall.o:(link_create) in archive vmlinux.a

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework
  2023-03-02 19:59   ` Toke Høiland-Jørgensen
@ 2023-03-02 23:53     ` Florian Westphal
  2023-03-03  0:06       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Westphal @ 2023-03-02 23:53 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Florian Westphal, bpf, netfilter-devel

Toke Høiland-Jørgensen <toke@kernel.org> wrote:
> Florian Westphal <fw@strlen.de> writes:
> > +	case BPF_PROG_TYPE_NETFILTER:
> > +		return BTF_KFUNC_HOOK_SOCKET_FILTER;
> 
> The dynptr patch reuses the actual set between the different IDs, so
> this should probably define a new BTF_KFUNC_HOOK_NETFILTER, with an
> associated register_btf_kfunc_id_set() call?

Good point, the above was a kludge that I forgot about.

I will wait for dynptr patchset to land and will add new
BTF_KFUNC_HOOK_NETFILTER for next revision.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework
  2023-03-02 23:53     ` Florian Westphal
@ 2023-03-03  0:06       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-03  0:06 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Florian Westphal, bpf, netfilter-devel

Florian Westphal <fw@strlen.de> writes:

> Toke Høiland-Jørgensen <toke@kernel.org> wrote:
>> Florian Westphal <fw@strlen.de> writes:
>> > +	case BPF_PROG_TYPE_NETFILTER:
>> > +		return BTF_KFUNC_HOOK_SOCKET_FILTER;
>> 
>> The dynptr patch reuses the actual set between the different IDs, so
>> this should probably define a new BTF_KFUNC_HOOK_NETFILTER, with an
>> associated register_btf_kfunc_id_set() call?
>
> Good point, the above was a kludge that I forgot about.
>
> I will wait for dynptr patchset to land and will add new
> BTF_KFUNC_HOOK_NETFILTER for next revision.

Sounds good! The dynptr series did land:
https://lore.kernel.org/r/167769421944.16358.12443693977215512909.git-patchwork-notify@kernel.org

-Toke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-02 20:28   ` Stanislav Fomichev
@ 2023-03-03  0:27     ` Florian Westphal
  2023-03-23  0:41       ` Daniel Xu
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Westphal @ 2023-03-03  0:27 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: Florian Westphal, bpf, netfilter-devel

Stanislav Fomichev <sdf@google.com> wrote:
> On 03/02, Florian Westphal wrote:
> > +			struct {
> > +				__u32		pf;
> > +				__u32		hooknum;
> > +				__s32		prio;
> > +			} netfilter;
> 
> For recent tc BPF program extensions, we've discussed that it might be
> better
> to have an option to attach program before/after another one in the chain.
> So the API essentially would receive a before/after flag + fd/id of the
>
> Should we do something similar here? See [0] for the original
> discussion.
> 
> 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/

Thanks for the pointer, I will have a look.

The above exposes the "prio" of netfilter hooks, so someone
that needs their hook to run early on, say, before netfilters
nat engine, could just use INT_MIN.

We could -- for nf bpf -- make the bpf_link fail if a hook
with the same priority already exists to avoid the "undefined
behaviour" here (same prio means register order decides what
hook function runs first ...).

This could be relevant if you have e.g. one bpf program collecting
statistics vs. one doing drops.

I'll dig though the thread and would try to mimic the tc link
mechanism as close as possible.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type
  2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
                   ` (3 preceding siblings ...)
  2023-03-02 19:59 ` [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Toke Høiland-Jørgensen
@ 2023-03-23  0:36 ` Daniel Xu
  2023-03-24 18:36   ` Florian Westphal
  4 siblings, 1 reply; 20+ messages in thread
From: Daniel Xu @ 2023-03-23  0:36 UTC (permalink / raw)
  To: Florian Westphal; +Cc: bpf, netfilter-devel

Hi Florian,

On Thu, Mar 02, 2023 at 06:27:54PM +0100, Florian Westphal wrote:
> Add minimal support to hook bpf programs to netfilter hooks,
> e.g. PREROUTING or FORWARD.
> 
> For this the most relevant parts for registering a netfilter
> hook via the in-kernel api are exposed to userspace via bpf_link.
> 
> The new program type is 'tracing style' and assumes skb dynptrs are used
> rather than 'direct packet access'.

[...]

Hope all is well. Do you have any updates on this series? I'm keen to
start building on top of this work.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-03  0:27     ` Florian Westphal
@ 2023-03-23  0:41       ` Daniel Xu
  2023-03-23 18:31         ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Xu @ 2023-03-23  0:41 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Stanislav Fomichev, bpf, netfilter-devel

Hi Florian, Stan,

On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote:
> Stanislav Fomichev <sdf@google.com> wrote:
> > On 03/02, Florian Westphal wrote:
> > > +			struct {
> > > +				__u32		pf;
> > > +				__u32		hooknum;
> > > +				__s32		prio;
> > > +			} netfilter;
> > 
> > For recent tc BPF program extensions, we've discussed that it might be
> > better
> > to have an option to attach program before/after another one in the chain.
> > So the API essentially would receive a before/after flag + fd/id of the
> >
> > Should we do something similar here? See [0] for the original
> > discussion.
> > 
> > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/
> 
> Thanks for the pointer, I will have a look.
> 
> The above exposes the "prio" of netfilter hooks, so someone
> that needs their hook to run early on, say, before netfilters
> nat engine, could just use INT_MIN.
> 
> We could -- for nf bpf -- make the bpf_link fail if a hook
> with the same priority already exists to avoid the "undefined
> behaviour" here (same prio means register order decides what
> hook function runs first ...).
> 
> This could be relevant if you have e.g. one bpf program collecting
> statistics vs. one doing drops.
> 
> I'll dig though the thread and would try to mimic the tc link
> mechanism as close as possible.

While I think the direction the TC link discussion took is totally fine,
TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning
it does not make sense for a user to mix priority values && before/after
semantics.

Netfilter is different in that there is by default modules active with
fixed priority values. So mixing in before/after semantics here could
get confusing.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-23  0:41       ` Daniel Xu
@ 2023-03-23 18:31         ` Stanislav Fomichev
  2023-03-24 17:33           ` Daniel Xu
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-23 18:31 UTC (permalink / raw)
  To: Daniel Xu; +Cc: Florian Westphal, bpf, netfilter-devel

On Wed, Mar 22, 2023 at 5:41 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> Hi Florian, Stan,
>
> On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote:
> > Stanislav Fomichev <sdf@google.com> wrote:
> > > On 03/02, Florian Westphal wrote:
> > > > +                 struct {
> > > > +                         __u32           pf;
> > > > +                         __u32           hooknum;
> > > > +                         __s32           prio;
> > > > +                 } netfilter;
> > >
> > > For recent tc BPF program extensions, we've discussed that it might be
> > > better
> > > to have an option to attach program before/after another one in the chain.
> > > So the API essentially would receive a before/after flag + fd/id of the
> > >
> > > Should we do something similar here? See [0] for the original
> > > discussion.
> > >
> > > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/
> >
> > Thanks for the pointer, I will have a look.
> >
> > The above exposes the "prio" of netfilter hooks, so someone
> > that needs their hook to run early on, say, before netfilters
> > nat engine, could just use INT_MIN.
> >
> > We could -- for nf bpf -- make the bpf_link fail if a hook
> > with the same priority already exists to avoid the "undefined
> > behaviour" here (same prio means register order decides what
> > hook function runs first ...).
> >
> > This could be relevant if you have e.g. one bpf program collecting
> > statistics vs. one doing drops.
> >
> > I'll dig though the thread and would try to mimic the tc link
> > mechanism as close as possible.
>
> While I think the direction the TC link discussion took is totally fine,
> TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning
> it does not make sense for a user to mix priority values && before/after
> semantics.
>
> Netfilter is different in that there is by default modules active with
> fixed priority values. So mixing in before/after semantics here could
> get confusing.

I don't remember the details, so pls correct me, but last time I
looked, this priority was basically an ordering within a hook?
And there were a bunch of kernel-hardcoded values. So either that
whole story has to become a UAPI (so the bpf program knows
before/after which kernel hook it has to run), or we need some other
ordering mechanism. (I'm not sure what's the story with bpf vs kernel
hooks interop, so maybe it's all moot?)
Am I missing something? Can you share more about why those fixed
priorities are fine?

> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-23 18:31         ` Stanislav Fomichev
@ 2023-03-24 17:33           ` Daniel Xu
  2023-03-24 17:58             ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Xu @ 2023-03-24 17:33 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: Florian Westphal, bpf, netfilter-devel

Hi Stan,

On Thu, Mar 23, 2023 at 11:31:14AM -0700, Stanislav Fomichev wrote:
> On Wed, Mar 22, 2023 at 5:41 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> >
> > Hi Florian, Stan,
> >
> > On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote:
> > > Stanislav Fomichev <sdf@google.com> wrote:
> > > > On 03/02, Florian Westphal wrote:
> > > > > +                 struct {
> > > > > +                         __u32           pf;
> > > > > +                         __u32           hooknum;
> > > > > +                         __s32           prio;
> > > > > +                 } netfilter;
> > > >
> > > > For recent tc BPF program extensions, we've discussed that it might be
> > > > better
> > > > to have an option to attach program before/after another one in the chain.
> > > > So the API essentially would receive a before/after flag + fd/id of the
> > > >
> > > > Should we do something similar here? See [0] for the original
> > > > discussion.
> > > >
> > > > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/
> > >
> > > Thanks for the pointer, I will have a look.
> > >
> > > The above exposes the "prio" of netfilter hooks, so someone
> > > that needs their hook to run early on, say, before netfilters
> > > nat engine, could just use INT_MIN.
> > >
> > > We could -- for nf bpf -- make the bpf_link fail if a hook
> > > with the same priority already exists to avoid the "undefined
> > > behaviour" here (same prio means register order decides what
> > > hook function runs first ...).
> > >
> > > This could be relevant if you have e.g. one bpf program collecting
> > > statistics vs. one doing drops.
> > >
> > > I'll dig though the thread and would try to mimic the tc link
> > > mechanism as close as possible.
> >
> > While I think the direction the TC link discussion took is totally fine,
> > TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning
> > it does not make sense for a user to mix priority values && before/after
> > semantics.
> >
> > Netfilter is different in that there is by default modules active with
> > fixed priority values. So mixing in before/after semantics here could
> > get confusing.
> 
> I don't remember the details, so pls correct me, but last time I
> looked, this priority was basically an ordering within a hook?

Yeah, that is my understanding as well.

> And there were a bunch of kernel-hardcoded values. So either that
> whole story has to become a UAPI (so the bpf program knows
> before/after which kernel hook it has to run), or we need some other
> ordering mechanism.

I'm not sure what you mean by "whole story" but netfilter kernel modules
register via a priority value as well. As well as the modules the kernel
ships. So there's that to consider.

(I'm not sure what's the story with bpf vs kernel
> hooks interop, so maybe it's all moot?)
> Am I missing something? Can you share more about why those fixed
> priorities are fine?

I guess I wouldn't say it's ideal (for all the reasons brought up in the
previous discussion), but trying to use before/after semantics here
would necessarily pull in a netfilter rework too, no? Or maybe there's
some clever way to merge the two worlds and get both subsystems what
they want.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-24 17:33           ` Daniel Xu
@ 2023-03-24 17:58             ` Stanislav Fomichev
  2023-03-24 18:22               ` Florian Westphal
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-24 17:58 UTC (permalink / raw)
  To: Daniel Xu; +Cc: Florian Westphal, bpf, netfilter-devel

On Fri, Mar 24, 2023 at 10:33 AM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> Hi Stan,
>
> On Thu, Mar 23, 2023 at 11:31:14AM -0700, Stanislav Fomichev wrote:
> > On Wed, Mar 22, 2023 at 5:41 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> > >
> > > Hi Florian, Stan,
> > >
> > > On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote:
> > > > Stanislav Fomichev <sdf@google.com> wrote:
> > > > > On 03/02, Florian Westphal wrote:
> > > > > > +                 struct {
> > > > > > +                         __u32           pf;
> > > > > > +                         __u32           hooknum;
> > > > > > +                         __s32           prio;
> > > > > > +                 } netfilter;
> > > > >
> > > > > For recent tc BPF program extensions, we've discussed that it might be
> > > > > better
> > > > > to have an option to attach program before/after another one in the chain.
> > > > > So the API essentially would receive a before/after flag + fd/id of the
> > > > >
> > > > > Should we do something similar here? See [0] for the original
> > > > > discussion.
> > > > >
> > > > > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@google.com/
> > > >
> > > > Thanks for the pointer, I will have a look.
> > > >
> > > > The above exposes the "prio" of netfilter hooks, so someone
> > > > that needs their hook to run early on, say, before netfilters
> > > > nat engine, could just use INT_MIN.
> > > >
> > > > We could -- for nf bpf -- make the bpf_link fail if a hook
> > > > with the same priority already exists to avoid the "undefined
> > > > behaviour" here (same prio means register order decides what
> > > > hook function runs first ...).
> > > >
> > > > This could be relevant if you have e.g. one bpf program collecting
> > > > statistics vs. one doing drops.
> > > >
> > > > I'll dig though the thread and would try to mimic the tc link
> > > > mechanism as close as possible.
> > >
> > > While I think the direction the TC link discussion took is totally fine,
> > > TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning
> > > it does not make sense for a user to mix priority values && before/after
> > > semantics.
> > >
> > > Netfilter is different in that there is by default modules active with
> > > fixed priority values. So mixing in before/after semantics here could
> > > get confusing.
> >
> > I don't remember the details, so pls correct me, but last time I
> > looked, this priority was basically an ordering within a hook?
>
> Yeah, that is my understanding as well.
>
> > And there were a bunch of kernel-hardcoded values. So either that
> > whole story has to become a UAPI (so the bpf program knows
> > before/after which kernel hook it has to run), or we need some other
> > ordering mechanism.
>
> I'm not sure what you mean by "whole story" but netfilter kernel modules
> register via a priority value as well. As well as the modules the kernel
> ships. So there's that to consider.

Sorry for not being clear. What I meant here is that we'd have to
export those existing priorities in the UAPI headers and keep those
numbers stable. Otherwise it seems impossible to have a proper interop
between those fixed existing priorities and new bpf modules?
(idk if that's a real problem or I'm overthinking)

Because the problem as I see it is as follows:
Let's say I want to trigger before/after kernel module X. I need to
know its priority and it has to be stable.
Alternatively, there should be some way to query the priority of
module X so I can do +1/-1 (which is essentially "before X/after X"
semantics).

> (I'm not sure what's the story with bpf vs kernel
> > hooks interop, so maybe it's all moot?)
> > Am I missing something? Can you share more about why those fixed
> > priorities are fine?
>
> I guess I wouldn't say it's ideal (for all the reasons brought up in the
> previous discussion), but trying to use before/after semantics here
> would necessarily pull in a netfilter rework too, no? Or maybe there's
> some clever way to merge the two worlds and get both subsystems what
> they want.

Right, I don't have an answer here, just trying to understand whether
(as a side effect of those patches) we're really making those existing
priorities a UAPI or not :-)

> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-24 17:58             ` Stanislav Fomichev
@ 2023-03-24 18:22               ` Florian Westphal
  2023-03-24 19:22                 ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Westphal @ 2023-03-24 18:22 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: Daniel Xu, Florian Westphal, bpf, netfilter-devel

Stanislav Fomichev <sdf@google.com> wrote:
> > I'm not sure what you mean by "whole story" but netfilter kernel modules
> > register via a priority value as well. As well as the modules the kernel
> > ships. So there's that to consider.
> 
> Sorry for not being clear. What I meant here is that we'd have to
> export those existing priorities in the UAPI headers and keep those
> numbers stable. Otherwise it seems impossible to have a proper interop
> between those fixed existing priorities and new bpf modules?
> (idk if that's a real problem or I'm overthinking)

They are already in uapi and exported.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type
  2023-03-23  0:36 ` Daniel Xu
@ 2023-03-24 18:36   ` Florian Westphal
  0 siblings, 0 replies; 20+ messages in thread
From: Florian Westphal @ 2023-03-24 18:36 UTC (permalink / raw)
  To: Daniel Xu; +Cc: Florian Westphal, bpf, netfilter-devel

Daniel Xu <dxu@dxuuu.xyz> wrote:
> On Thu, Mar 02, 2023 at 06:27:54PM +0100, Florian Westphal wrote:
> > Add minimal support to hook bpf programs to netfilter hooks,
> > e.g. PREROUTING or FORWARD.
> > 
> > For this the most relevant parts for registering a netfilter
> > hook via the in-kernel api are exposed to userspace via bpf_link.
> > 
> > The new program type is 'tracing style' and assumes skb dynptrs are used
> > rather than 'direct packet access'.
> 
> [...]
> 
> Hope all is well. Do you have any updates on this series? I'm keen to
> start building on top of this work.

Sorry, I was busy with other work so this got sidelined.

I've pushed what I hav atm to
https://git.breakpoint.cc/cgit/fw/bpf-next.git/log/?h=nf_bpf_hooks_07

I had no time so far to do the testing needed for a new official
submission (e.g. bpf_link_info).

Compared to last uapi this now has a "flags" member that could be
used to indicate "need defrag" and so on.

I hope I can submit this again early April.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs
  2023-03-24 18:22               ` Florian Westphal
@ 2023-03-24 19:22                 ` Stanislav Fomichev
  0 siblings, 0 replies; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-24 19:22 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Daniel Xu, bpf, netfilter-devel

On Fri, Mar 24, 2023 at 11:22 AM Florian Westphal <fw@strlen.de> wrote:
>
> Stanislav Fomichev <sdf@google.com> wrote:
> > > I'm not sure what you mean by "whole story" but netfilter kernel modules
> > > register via a priority value as well. As well as the modules the kernel
> > > ships. So there's that to consider.
> >
> > Sorry for not being clear. What I meant here is that we'd have to
> > export those existing priorities in the UAPI headers and keep those
> > numbers stable. Otherwise it seems impossible to have a proper interop
> > between those fixed existing priorities and new bpf modules?
> > (idk if that's a real problem or I'm overthinking)
>
> They are already in uapi and exported.

Oh, nice, then probably keeping those prios is the way to go. Up to
you on whether to explore the alternative (before/after) or not. Agree
with Daniel that it probably requires reworking netfilter internals
and it's not really justified here.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-03-24 19:23 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-02 17:27 [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Florian Westphal
2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
2023-03-02 20:07   ` kernel test robot
2023-03-02 20:28   ` Stanislav Fomichev
2023-03-03  0:27     ` Florian Westphal
2023-03-23  0:41       ` Daniel Xu
2023-03-23 18:31         ` Stanislav Fomichev
2023-03-24 17:33           ` Daniel Xu
2023-03-24 17:58             ` Stanislav Fomichev
2023-03-24 18:22               ` Florian Westphal
2023-03-24 19:22                 ` Stanislav Fomichev
2023-03-02 21:10   ` kernel test robot
2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 2/3] libbpf: sync header file, add nf prog section name Florian Westphal
2023-03-02 17:27 ` [PATCH RFC v2 bpf-next 3/3] bpf: minimal support for programs hooked into netfilter framework Florian Westphal
2023-03-02 19:59   ` Toke Høiland-Jørgensen
2023-03-02 23:53     ` Florian Westphal
2023-03-03  0:06       ` Toke Høiland-Jørgensen
2023-03-02 19:59 ` [PATCH RFC v2 bpf-next 0/3] bpf: add netfilter program type Toke Høiland-Jørgensen
2023-03-23  0:36 ` Daniel Xu
2023-03-24 18:36   ` Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.