All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next 00/11] bpfilter
@ 2021-05-17 22:52 Dmitrii Banshchikov
  2021-05-17 22:52 ` [PATCH bpf-next 01/11] bpfilter: Add types for usermode helper Dmitrii Banshchikov
                   ` (11 more replies)
  0 siblings, 12 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:52 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

The patchset is based on the patches from David S. Miller [1] and Daniel
Borkmann [2].

The main goal of the patchset is to prepare bpfilter for iptables'
configuration blob parsing and code generation.

The patchset introduces data structures and code for matches, targets, rules
and tables.

It seems inconvenient to continue to use the same blob internally in bpfilter
in parts other than the blob parsing. That is why a superstructure with native
types is introduced. It provides a more convenient way to iterate over the blob
and limit the crazy structs widespread in the bpfilter code.

In this patchset version existing blob's correctness checking that is done by
the iptables kernel part is not reproduced. It will be added in the next
iteration.

Also the current version misses handling of counters. Postpone its
implementation until the code generation phase as it's not clear yet how to
better handle them.

The rough plan for the code generation.

It seems reasonable to assume that the first rules should cover most of the
packet flow.  This is why they are critical from the performance point of view.
At the same time a number of user defined rules might be pretty large. Also
there is a limit on size and complexity of a BPF program introduced by the
verifier.

There are two approaches how to handle iptables' rules in generated BPF programs.

The first approach is to generate a BPF program that is an equivalent to a set
of rules on a rule by rule basis. This approach should give the best
performance. The drawback is the limitation from the verifier on size and
complexity of BPF program.

The second approach is to use an internal representation of rules stored in a
BPF map and use bpf_for_each_map_elem() helper to iterate over them. In this case
the helper's callback is a BPF function that is able to process any valid
rule.

Combination of the two approaches should give most of the benefits - a
heuristic should help to select a small subset of the rules for code generation
on a rule by rule basis. All other rules are cold and it should be possible to
store them in an internal form in a BPF map. The rules will be handled by
bpf_for_each_map_elem().  This should remove the limit on the number of
supported rules.

During development it was useful to use statically linked sanitizers in
bpfilter usermode helper. Also it is possible to use fuzzers but it's not clear
if it is worth adding them to the test infrastructure - because there are no
other fuzzers under tools/testing/selftests currently.

Patch 1 adds definitions of the used types.
Patch 2 adds logging facility to bpfilter.
Patch 3 adds IO functions.
Patch 4 adds bpfilter header to tools
Patch 5 adds an associative map.
Patches 6/7/8/9 adds code for matches, targets, rules and table.
Patch 10 handles hooked setsockopt(2) calls.
Patch 11 uses prepared code in main().

Here is the example:

# dmesg  | tail -n 2
[   23.636102] bpfilter: Loaded bpfilter_umh pid 181
[   23.658529] bpfilter: started
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
# /usr/sbin/iptables-legacy -A INPUT -p udp --dport 23 -j DROP
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:23

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
# /usr/sbin/iptables-legacy -F
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
#


1. https://lore.kernel.org/patchwork/patch/902785/
2. https://lore.kernel.org/patchwork/patch/902783/

Dmitrii Banshchikov (11):
  bpfilter: Add types for usermode helper
  bpfilter: Add logging facility
  bpfilter: Add IO functions
  tools: Add bpfilter usermode helper header
  bpfilter: Add map container
  bpfilter: Add struct match
  bpfilter: Add struct target
  bpfilter: Add struct rule
  bpfilter: Add struct table
  bpfilter: Add handling of setsockopt() calls
  bpfilter: Handle setsockopts

 include/uapi/linux/bpfilter.h                 | 155 ++++++++
 net/bpfilter/Makefile                         |   3 +-
 net/bpfilter/bflog.c                          |  29 ++
 net/bpfilter/bflog.h                          |  24 ++
 net/bpfilter/context.c                        | 176 +++++++++
 net/bpfilter/context.h                        |  27 ++
 net/bpfilter/io.c                             |  77 ++++
 net/bpfilter/io.h                             |  18 +
 net/bpfilter/main.c                           |  99 ++---
 net/bpfilter/map-common.c                     |  64 ++++
 net/bpfilter/map-common.h                     |  19 +
 net/bpfilter/match-ops-map.h                  |  48 +++
 net/bpfilter/match.c                          |  73 ++++
 net/bpfilter/match.h                          |  34 ++
 net/bpfilter/rule.c                           | 128 +++++++
 net/bpfilter/rule.h                           |  27 ++
 net/bpfilter/sockopt.c                        | 357 ++++++++++++++++++
 net/bpfilter/sockopt.h                        |  14 +
 net/bpfilter/table-map.h                      |  41 ++
 net/bpfilter/table.c                          | 167 ++++++++
 net/bpfilter/table.h                          |  33 ++
 net/bpfilter/target-ops-map.h                 |  49 +++
 net/bpfilter/target.c                         | 112 ++++++
 net/bpfilter/target.h                         |  34 ++
 tools/include/uapi/linux/bpfilter.h           | 179 +++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   6 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  31 ++
 .../selftests/bpf/bpfilter/bpfilter_util.h    |  39 ++
 .../testing/selftests/bpf/bpfilter/test_io.c  | 100 +++++
 .../testing/selftests/bpf/bpfilter/test_map.c |  63 ++++
 .../selftests/bpf/bpfilter/test_match.c       |  63 ++++
 .../selftests/bpf/bpfilter/test_rule.c        |  55 +++
 .../selftests/bpf/bpfilter/test_target.c      |  85 +++++
 33 files changed, 2382 insertions(+), 47 deletions(-)
 create mode 100644 net/bpfilter/bflog.c
 create mode 100644 net/bpfilter/bflog.h
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/context.h
 create mode 100644 net/bpfilter/io.c
 create mode 100644 net/bpfilter/io.h
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 net/bpfilter/match-ops-map.h
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h
 create mode 100644 net/bpfilter/table-map.h
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h
 create mode 100644 net/bpfilter/target-ops-map.h
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 tools/include/uapi/linux/bpfilter.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_io.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 01/11] bpfilter: Add types for usermode helper
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
@ 2021-05-17 22:52 ` Dmitrii Banshchikov
  2021-05-17 22:52 ` [PATCH bpf-next 02/11] bpfilter: Add logging facility Dmitrii Banshchikov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:52 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Add more definitions that mirror existing iptables' ABI.
These definitions will be used in bpfilter usermode helper.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 include/uapi/linux/bpfilter.h | 155 ++++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)

diff --git a/include/uapi/linux/bpfilter.h b/include/uapi/linux/bpfilter.h
index cbc1f5813f50..e97d95d0ba54 100644
--- a/include/uapi/linux/bpfilter.h
+++ b/include/uapi/linux/bpfilter.h
@@ -3,6 +3,13 @@
 #define _UAPI_LINUX_BPFILTER_H
 
 #include <linux/if.h>
+#include <linux/const.h>
+
+#define BPFILTER_FUNCTION_MAXNAMELEN    30
+#define BPFILTER_EXTENSION_MAXNAMELEN   29
+
+#define BPFILTER_STANDARD_TARGET        ""
+#define BPFILTER_ERROR_TARGET           "ERROR"
 
 enum {
 	BPFILTER_IPT_SO_SET_REPLACE = 64,
@@ -18,4 +25,152 @@ enum {
 	BPFILTER_IPT_GET_MAX,
 };
 
+enum {
+	BPFILTER_XT_TABLE_MAXNAMELEN = 32,
+};
+
+enum {
+	BPFILTER_NF_DROP = 0,
+	BPFILTER_NF_ACCEPT = 1,
+	BPFILTER_NF_STOLEN = 2,
+	BPFILTER_NF_QUEUE = 3,
+	BPFILTER_NF_REPEAT = 4,
+	BPFILTER_NF_STOP = 5,
+	BPFILTER_NF_MAX_VERDICT = BPFILTER_NF_STOP,
+	BPFILTER_RETURN = (-BPFILTER_NF_REPEAT - 1),
+};
+
+enum {
+	BPFILTER_INET_HOOK_PRE_ROUTING = 0,
+	BPFILTER_INET_HOOK_LOCAL_IN = 1,
+	BPFILTER_INET_HOOK_FORWARD = 2,
+	BPFILTER_INET_HOOK_LOCAL_OUT = 3,
+	BPFILTER_INET_HOOK_POST_ROUTING = 4,
+	BPFILTER_INET_HOOK_MAX,
+};
+
+enum {
+	BPFILTER_IPT_F_MASK = 0x03,
+	BPFILTER_IPT_INV_MASK = 0x7f
+};
+
+struct bpfilter_ipt_match {
+	union {
+		struct {
+			__u16 match_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 match_size;
+			void *match;
+		} kernel;
+		__u16 match_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_target {
+	union {
+		struct {
+			__u16 target_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 target_size;
+			void *target;
+		} kernel;
+		__u16 target_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_standard_target {
+	struct bpfilter_ipt_target target;
+	int verdict;
+};
+
+struct bpfilter_ipt_error_target {
+	struct bpfilter_ipt_target target;
+	char error_name[BPFILTER_FUNCTION_MAXNAMELEN];
+};
+
+struct bpfilter_ipt_get_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_entries;
+	__u32 size;
+};
+
+struct bpfilter_ipt_counters {
+	__u64 packet_cnt;
+	__u64 byte_cnt;
+};
+
+struct bpfilter_ipt_counters_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters counters[0];
+};
+
+struct bpfilter_ipt_get_revision {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	__u8 revision;
+};
+
+struct bpfilter_ipt_ip {
+	__u32 src;
+	__u32 dst;
+	__u32 src_mask;
+	__u32 dst_mask;
+	char in_iface[IFNAMSIZ];
+	char out_iface[IFNAMSIZ];
+	__u8 in_iface_mask[IFNAMSIZ];
+	__u8 out_iface_mask[IFNAMSIZ];
+	__u16 protocol;
+	__u8 flags;
+	__u8 invflags;
+};
+
+struct bpfilter_ipt_entry {
+	struct bpfilter_ipt_ip ip;
+	__u32 bfcache;
+	__u16 target_offset;
+	__u16 next_offset;
+	__u32 comefrom;
+	struct bpfilter_ipt_counters counters;
+	__u8 elems[0];
+};
+
+struct bpfilter_ipt_standard_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_standard_target target;
+};
+
+struct bpfilter_ipt_error_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_error_target target;
+};
+
+struct bpfilter_ipt_get_entries {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 size;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+struct bpfilter_ipt_replace {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 num_entries;
+	__u32 size;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters *cntrs;
+	struct bpfilter_ipt_entry entries[0];
+};
+
 #endif /* _UAPI_LINUX_BPFILTER_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 02/11] bpfilter: Add logging facility
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
  2021-05-17 22:52 ` [PATCH bpf-next 01/11] bpfilter: Add types for usermode helper Dmitrii Banshchikov
@ 2021-05-17 22:52 ` Dmitrii Banshchikov
  2021-05-19 17:32   ` Song Liu
  2021-05-17 22:53 ` [PATCH bpf-next 03/11] bpfilter: Add IO functions Dmitrii Banshchikov
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:52 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

There are three logging levels for messages: FATAL, NOTICE and DEBUG.
When a message is logged with FATAL level it results in bpfilter
usermode helper termination.

Introduce struct context to avoid use of global objects and store there
the logging parameters: log level and log sink.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile  |  2 +-
 net/bpfilter/bflog.c   | 29 +++++++++++++++++++++++++++++
 net/bpfilter/bflog.h   | 24 ++++++++++++++++++++++++
 net/bpfilter/context.h | 16 ++++++++++++++++
 4 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/bflog.c
 create mode 100644 net/bpfilter/bflog.h
 create mode 100644 net/bpfilter/context.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index cdac82b8c53a..874d5ef6237d 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o
+bpfilter_umh-objs := main.o bflog.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/bflog.c b/net/bpfilter/bflog.c
new file mode 100644
index 000000000000..2752e39060e4
--- /dev/null
+++ b/net/bpfilter/bflog.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "bflog.h"
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "context.h"
+
+void bflog(struct context *ctx, int level, const char *fmt, ...)
+{
+	if (ctx->log_file &&
+	    (level == BFLOG_LEVEL_FATAL || (level & ctx->log_level))) {
+		va_list va;
+
+		va_start(va, fmt);
+		vfprintf(ctx->log_file, fmt, va);
+		va_end(va);
+	}
+
+	if (level == BFLOG_LEVEL_FATAL)
+		exit(EXIT_FAILURE);
+}
diff --git a/net/bpfilter/bflog.h b/net/bpfilter/bflog.h
new file mode 100644
index 000000000000..4ed12791cfa1
--- /dev/null
+++ b/net/bpfilter/bflog.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_BFLOG_H
+#define NET_BPFILTER_BFLOG_H
+
+struct context;
+
+#define BFLOG_IMPL(ctx, level, fmt, ...) bflog(ctx, level, "bpfilter: " fmt, ##__VA_ARGS__)
+
+#define BFLOG_LEVEL_FATAL (0)
+#define BFLOG_LEVEL_NOTICE (1)
+#define BFLOG_LEVEL_DEBUG (2)
+
+#define BFLOG_FATAL(ctx, fmt, ...)                                                                 \
+	BFLOG_IMPL(ctx, BFLOG_LEVEL_FATAL, "fatal error: " fmt, ##__VA_ARGS__)
+#define BFLOG_NOTICE(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_NOTICE, fmt, ##__VA_ARGS__)
+#define BFLOG_DEBUG(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_DEBUG, fmt, ##__VA_ARGS__)
+
+void bflog(struct context *ctx, int level, const char *fmt, ...);
+
+#endif // NET_BPFILTER_BFLOG_H
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
new file mode 100644
index 000000000000..e85c97c3d010
--- /dev/null
+++ b/net/bpfilter/context.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_CONTEXT_H
+#define NET_BPFILTER_CONTEXT_H
+
+#include <stdio.h>
+
+struct context {
+	FILE *log_file;
+	int log_level;
+};
+
+#endif // NET_BPFILTER_CONTEXT_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 03/11] bpfilter: Add IO functions
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
  2021-05-17 22:52 ` [PATCH bpf-next 01/11] bpfilter: Add types for usermode helper Dmitrii Banshchikov
  2021-05-17 22:52 ` [PATCH bpf-next 02/11] bpfilter: Add logging facility Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-19 18:47   ` Song Liu
  2021-05-17 22:53 ` [PATCH bpf-next 04/11] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Introduce IO functions for:
1) reading and writing data from a descriptor: read_exact(), write_exact(),
2) reading and writing memory of other processes: pvm_read(), pvm_write().

read_exact() and write_exact() are wrappers over read(2)/write(2) with
correct handling of partial read/write. These functions are intended to
be used for communication over pipe with the kernel part of bpfilter.

pvm_read() and pvm_write() are wrappers over
process_vm_readv(2)/process_vm_writev(2) with an interface that uses a
single buffer instead of vectored form. These functions are intended to
be used for readining/writing memory buffers supplied to iptables ABI
setsockopt(2) from other processes.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/io.c                             |  77 ++++++++++++++
 net/bpfilter/io.h                             |  18 ++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   2 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  17 +++
 .../testing/selftests/bpf/bpfilter/test_io.c  | 100 ++++++++++++++++++
 6 files changed, 215 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/io.c
 create mode 100644 net/bpfilter/io.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_io.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 874d5ef6237d..69a6c139fc7a 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o
+bpfilter_umh-objs := main.o bflog.o io.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/io.c b/net/bpfilter/io.c
new file mode 100644
index 000000000000..e645ae9d7a50
--- /dev/null
+++ b/net/bpfilter/io.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "io.h"
+
+#include <errno.h>
+#include <sys/uio.h>
+#include <unistd.h>
+
+#define do_exact(fd, op, buffer, count)                                                            \
+	({                                                                                         \
+		size_t total = 0;                                                                  \
+		int err = 0;                                                                       \
+												   \
+		do {                                                                               \
+			const ssize_t part = op(fd, (buffer) + total, (count) - total);            \
+			if (part > 0) {                                                            \
+				total += part;                                                     \
+			} else if (part == 0 && (count) > 0) {                                     \
+				err = -EIO;                                                        \
+				break;                                                             \
+			} else if (part == -1) {                                                   \
+				if (errno == EINTR)                                                \
+					continue;                                                  \
+				err = -errno;                                                      \
+				break;                                                             \
+			}                                                                          \
+		} while (total < (count));                                                         \
+												   \
+		err;                                                                               \
+	})
+
+int read_exact(int fd, void *buffer, size_t count)
+{
+	return do_exact(fd, read, buffer, count);
+}
+
+int write_exact(int fd, const void *buffer, size_t count)
+{
+	return do_exact(fd, write, buffer, count);
+}
+
+int pvm_read(pid_t pid, void *to, const void *from, size_t count)
+{
+	const struct iovec r_iov = { .iov_base = (void *)from, .iov_len = count };
+	const struct iovec l_iov = { .iov_base = to, .iov_len = count };
+	size_t total_bytes;
+
+	total_bytes = process_vm_readv(pid, &l_iov, 1, &r_iov, 1, 0);
+	if (total_bytes == -1)
+		return -errno;
+
+	if (total_bytes != count)
+		return -EFAULT;
+
+	return 0;
+}
+
+int pvm_write(pid_t pid, void *to, const void *from, size_t count)
+{
+	const struct iovec l_iov = { .iov_base = (void *)from, .iov_len = count };
+	const struct iovec r_iov = { .iov_base = to, .iov_len = count };
+	size_t total_bytes;
+
+	total_bytes = process_vm_writev(pid, &l_iov, 1, &r_iov, 1, 0);
+	if (total_bytes == -1)
+		return -errno;
+
+	if (total_bytes != count)
+		return -EFAULT;
+
+	return 0;
+}
diff --git a/net/bpfilter/io.h b/net/bpfilter/io.h
new file mode 100644
index 000000000000..ab56c8bb8e61
--- /dev/null
+++ b/net/bpfilter/io.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_IO_H
+#define NET_BPFILTER_IO_H
+
+#include <stddef.h>
+#include <sys/types.h>
+
+int read_exact(int fd, void *buffer, size_t count);
+int write_exact(int fd, const void *buffer, size_t count);
+
+int pvm_read(pid_t pid, void *to, const void *from, size_t count);
+int pvm_write(pid_t pid, void *to, const void *from, size_t count);
+
+#endif // NET_BPFILTER_IO_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
new file mode 100644
index 000000000000..f5785e366013
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+test_io
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
new file mode 100644
index 000000000000..c02d72d89199
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+
+top_srcdir = ../../../../..
+TOOLSDIR := $(abspath ../../../../)
+TOOLSINCDIR := $(TOOLSDIR)/include
+APIDIR := $(TOOLSINCDIR)/uapi
+BPFILTERSRCDIR := $(top_srcdir)/net/bpfilter
+
+CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
+
+TEST_GEN_PROGS += test_io
+
+KSFT_KHDR_INSTALL := 1
+
+include ../../lib.mk
+
+$(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
diff --git a/tools/testing/selftests/bpf/bpfilter/test_io.c b/tools/testing/selftests/bpf/bpfilter/test_io.c
new file mode 100644
index 000000000000..e4294930c581
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_io.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "io.h"
+
+#include <signal.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "../../kselftest_harness.h"
+
+FIXTURE(test_pvm)
+{
+	int wstatus;
+	int fd[2];
+	pid_t pid;
+	pid_t ppid;
+	char expected[5];
+	char actual[5];
+};
+
+FIXTURE_SETUP(test_pvm)
+{
+	snprintf(self->expected, sizeof(self->expected), "ipfw");
+	memset(self->actual, 0, sizeof(self->actual));
+	self->ppid = getpid();
+	ASSERT_EQ(pipe(self->fd), 0);
+	self->pid = fork();
+	ASSERT_NE(self->pid, -1) TH_LOG("Cannot fork(): %m\n");
+	close(self->fd[!!self->pid]);
+};
+
+FIXTURE_TEARDOWN(test_pvm)
+{
+	int wstatus;
+
+	if (!self->pid)
+		exit(0);
+
+	kill(self->pid, SIGKILL);
+	waitpid(self->pid, &wstatus, -2);
+	close(self->fd[1]);
+}
+
+TEST_F(test_pvm, read)
+{
+	if (!self->pid) {
+		const uint8_t baton = 'x';
+
+		memcpy(self->actual, self->expected, sizeof(self->actual));
+		ASSERT_EQ(write(self->fd[1], &baton, sizeof(baton)), sizeof(baton));
+
+		pause();
+		exit(0);
+	} else {
+		int err;
+		uint8_t baton;
+
+		EXPECT_EQ(read(self->fd[0], &baton, sizeof(baton)), sizeof(baton));
+		EXPECT_EQ(baton, 'x');
+
+		err = pvm_read(self->pid, &self->actual, &self->actual, sizeof(self->actual));
+		EXPECT_EQ(err, 0)
+		TH_LOG("Cannot pvm_read(): %s\n", strerror(-err));
+
+		EXPECT_EQ(memcmp(&self->expected, &self->actual, sizeof(self->actual)), 0);
+	}
+}
+
+TEST_F(test_pvm, write)
+{
+	if (getuid())
+		SKIP(return, "pvm_write requires CAP_SYS_PTRACE");
+
+	if (!self->pid) {
+		const uint8_t baton = 'x';
+		int err;
+
+		err = pvm_write(self->ppid, &self->actual, &self->expected, sizeof(self->expected));
+		EXPECT_EQ(err, 0) TH_LOG("Cannot pvm_write: %s\n", strerror(-err));
+
+		ASSERT_EQ(write(self->fd[1], &baton, sizeof(baton)), sizeof(baton));
+
+		pause();
+		exit(0);
+
+	} else {
+		uint8_t baton;
+
+		EXPECT_EQ(read(self->fd[0], &baton, sizeof(baton)), sizeof(baton));
+		EXPECT_EQ(baton, 'x');
+
+		EXPECT_EQ(memcmp(&self->expected, &self->actual, sizeof(self->actual)), 0);
+	}
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 04/11] tools: Add bpfilter usermode helper header
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (2 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 03/11] bpfilter: Add IO functions Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-17 22:53 ` [PATCH bpf-next 05/11] bpfilter: Add map container Dmitrii Banshchikov
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

The header will be used in bpfilter usermode helper test infrastructure.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 tools/include/uapi/linux/bpfilter.h | 179 ++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)
 create mode 100644 tools/include/uapi/linux/bpfilter.h

diff --git a/tools/include/uapi/linux/bpfilter.h b/tools/include/uapi/linux/bpfilter.h
new file mode 100644
index 000000000000..8b49d81f81c8
--- /dev/null
+++ b/tools/include/uapi/linux/bpfilter.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_BPFILTER_H
+#define _UAPI_LINUX_BPFILTER_H
+
+#include <linux/if.h>
+#include <linux/const.h>
+
+#define BPFILTER_FUNCTION_MAXNAMELEN    30
+#define BPFILTER_EXTENSION_MAXNAMELEN   29
+
+#define BPFILTER_STANDARD_TARGET        ""
+#define BPFILTER_ERROR_TARGET           "ERROR"
+
+
+#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
+
+enum {
+	BPFILTER_IPT_SO_SET_REPLACE = 64,
+	BPFILTER_IPT_SO_SET_ADD_COUNTERS = 65,
+	BPFILTER_IPT_SET_MAX,
+};
+
+enum {
+	BPFILTER_IPT_SO_GET_INFO = 64,
+	BPFILTER_IPT_SO_GET_ENTRIES = 65,
+	BPFILTER_IPT_SO_GET_REVISION_MATCH = 66,
+	BPFILTER_IPT_SO_GET_REVISION_TARGET = 67,
+	BPFILTER_IPT_GET_MAX,
+};
+
+enum {
+	BPFILTER_XT_TABLE_MAXNAMELEN = 32,
+};
+
+enum {
+	BPFILTER_NF_DROP = 0,
+	BPFILTER_NF_ACCEPT = 1,
+	BPFILTER_NF_STOLEN = 2,
+	BPFILTER_NF_QUEUE = 3,
+	BPFILTER_NF_REPEAT = 4,
+	BPFILTER_NF_STOP = 5,
+	BPFILTER_NF_MAX_VERDICT = BPFILTER_NF_STOP,
+	BPFILTER_RETURN = (-BPFILTER_NF_REPEAT - 1),
+};
+
+enum {
+	BPFILTER_INET_HOOK_PRE_ROUTING = 0,
+	BPFILTER_INET_HOOK_LOCAL_IN = 1,
+	BPFILTER_INET_HOOK_FORWARD = 2,
+	BPFILTER_INET_HOOK_LOCAL_OUT = 3,
+	BPFILTER_INET_HOOK_POST_ROUTING = 4,
+	BPFILTER_INET_HOOK_MAX,
+};
+
+enum {
+	BPFILTER_IPT_F_MASK = 0x03,
+	BPFILTER_IPT_INV_MASK = 0x7f
+};
+
+struct bpfilter_ipt_match {
+	union {
+		struct {
+			__u16 match_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 match_size;
+			void *match;
+		} kernel;
+		__u16 match_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_target {
+	union {
+		struct {
+			__u16 target_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 target_size;
+			void *target;
+		} kernel;
+		__u16 target_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_standard_target {
+	struct bpfilter_ipt_target target;
+	int verdict;
+};
+
+struct bpfilter_ipt_error_target {
+	struct bpfilter_ipt_target target;
+	char error_name[BPFILTER_FUNCTION_MAXNAMELEN];
+};
+
+struct bpfilter_ipt_get_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_entries;
+	__u32 size;
+};
+
+struct bpfilter_ipt_counters {
+	__u64 packet_cnt;
+	__u64 byte_cnt;
+};
+
+struct bpfilter_ipt_counters_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters counters[0];
+};
+
+struct bpfilter_ipt_get_revision {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	__u8 revision;
+};
+
+struct bpfilter_ipt_ip {
+	__u32 src;
+	__u32 dst;
+	__u32 src_mask;
+	__u32 dst_mask;
+	char in_iface[IFNAMSIZ];
+	char out_iface[IFNAMSIZ];
+	__u8 in_iface_mask[IFNAMSIZ];
+	__u8 out_iface_mask[IFNAMSIZ];
+	__u16 protocol;
+	__u8 flags;
+	__u8 invflags;
+};
+
+struct bpfilter_ipt_entry {
+	struct bpfilter_ipt_ip ip;
+	__u32 bfcache;
+	__u16 target_offset;
+	__u16 next_offset;
+	__u32 comefrom;
+	struct bpfilter_ipt_counters counters;
+	__u8 elems[0];
+};
+
+struct bpfilter_ipt_standard_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_standard_target target;
+};
+
+struct bpfilter_ipt_error_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_error_target target;
+};
+
+struct bpfilter_ipt_get_entries {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 size;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+struct bpfilter_ipt_replace {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 num_entries;
+	__u32 size;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters *cntrs;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+#endif /* _UAPI_LINUX_BPFILTER_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 05/11] bpfilter: Add map container
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (3 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 04/11] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-17 22:53 ` [PATCH bpf-next 06/11] bpfilter: Add struct match Dmitrii Banshchikov
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Introduce common code for an associative container. This common code
will be used for maps of matches, targets and tables. Hash search tables
from libc are used as an index. The supported set of operations is:
insert, update and find.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |  2 +-
 net/bpfilter/map-common.c                     | 64 +++++++++++++++++++
 net/bpfilter/map-common.h                     | 19 ++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |  1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  2 +
 .../testing/selftests/bpf/bpfilter/test_map.c | 63 ++++++++++++++++++
 6 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 69a6c139fc7a..908fbad680ec 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/map-common.c b/net/bpfilter/map-common.c
new file mode 100644
index 000000000000..6a4ab0c5d3ec
--- /dev/null
+++ b/net/bpfilter/map-common.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <string.h>
+
+int create_map(struct hsearch_data *htab, size_t nelem)
+{
+	memset(htab, 0, sizeof(*htab));
+	if (!hcreate_r(nelem, htab))
+		return -errno;
+
+	return 0;
+}
+
+void *map_find(struct hsearch_data *htab, const char *name)
+{
+	const ENTRY needle = { .key = (char *)name };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, FIND, &found, htab))
+		return ERR_PTR(-ENOENT);
+
+	return found->data;
+}
+
+int map_update(struct hsearch_data *htab, const char *name, void *data)
+{
+	const ENTRY needle = { .key = (char *)name, .data = data };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, ENTER, &found, htab))
+		return -errno;
+
+	found->key = (char *)name;
+	found->data = data;
+
+	return 0;
+}
+
+int map_insert(struct hsearch_data *htab, const char *name, void *data)
+{
+	const ENTRY needle = { .key = (char *)name, .data = data };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, ENTER, &found, htab))
+		return -errno;
+
+	if (found->data != data)
+		return -EEXIST;
+
+	return 0;
+}
+
+void free_map(struct hsearch_data *htab)
+{
+	hdestroy_r(htab);
+}
diff --git a/net/bpfilter/map-common.h b/net/bpfilter/map-common.h
new file mode 100644
index 000000000000..b29829230eff
--- /dev/null
+++ b/net/bpfilter/map-common.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_MAP_COMMON_H
+#define NET_BPFILTER_MAP_COMMON_H
+
+#define _GNU_SOURCE
+
+#include <search.h>
+
+int create_map(struct hsearch_data *htab, size_t nelem);
+void *map_find(struct hsearch_data *htab, const char *name);
+int map_insert(struct hsearch_data *htab, const char *name, void *data);
+int map_update(struct hsearch_data *htab, const char *name, void *data);
+void free_map(struct hsearch_data *htab);
+
+#endif // NET_BPFILTER_MAP_COMMON_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index f5785e366013..be10b50ca289 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 test_io
+test_map
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index c02d72d89199..77afbbdf27c5 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -9,9 +9,11 @@ BPFILTERSRCDIR := $(top_srcdir)/net/bpfilter
 CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 
 TEST_GEN_PROGS += test_io
+TEST_GEN_PROGS += test_map
 
 KSFT_KHDR_INSTALL := 1
 
 include ../../lib.mk
 
 $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
+$(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
diff --git a/tools/testing/selftests/bpf/bpfilter/test_map.c b/tools/testing/selftests/bpf/bpfilter/test_map.c
new file mode 100644
index 000000000000..6ac61a634e41
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_map.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include "../../kselftest_harness.h"
+
+FIXTURE(test_map)
+{
+	struct hsearch_data map;
+	const char *key;
+	void *expected;
+	void *actual;
+};
+
+FIXTURE_SETUP(test_map)
+{
+	const int max_nelements = 100;
+
+	create_map(&self->map, max_nelements);
+	self->key = "key";
+	self->expected = "expected";
+	self->actual = "actual";
+}
+
+FIXTURE_TEARDOWN(test_map)
+{
+	free_map(&self->map);
+}
+
+TEST_F(test_map, insert_and_find)
+{
+	void *found;
+
+	found = map_find(&self->map, self->key);
+	ASSERT_TRUE(IS_ERR(found))
+	ASSERT_EQ(-ENOENT, PTR_ERR(found))
+
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->expected));
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->expected));
+	ASSERT_EQ(-EEXIST, map_insert(&self->map, self->key, self->actual));
+
+	found = map_find(&self->map, self->key);
+
+	ASSERT_FALSE(IS_ERR(found));
+	ASSERT_STREQ(self->expected, found);
+}
+
+TEST_F(test_map, update)
+{
+	void *found;
+
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->actual));
+	ASSERT_EQ(0, map_update(&self->map, self->key, self->expected));
+
+	found = map_find(&self->map, self->key);
+
+	ASSERT_FALSE(IS_ERR(found));
+	ASSERT_STREQ(self->expected, found);
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 06/11] bpfilter: Add struct match
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (4 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 05/11] bpfilter: Add map container Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-20  4:26   ` Song Liu
  2021-05-17 22:53 ` [PATCH bpf-next 07/11] bpfilter: Add struct target Dmitrii Banshchikov
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct match_ops defines polymorphic interface for matches. A match
consists of pointers to struct match_ops and struct xt_entry_match which
contains a payload for the match's type.

All match_ops are kept in map match_ops_map by their name.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |  2 +-
 net/bpfilter/context.c                        | 41 +++++++++++
 net/bpfilter/context.h                        |  6 ++
 net/bpfilter/match-ops-map.h                  | 48 ++++++++++++
 net/bpfilter/match.c                          | 73 +++++++++++++++++++
 net/bpfilter/match.h                          | 34 +++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |  1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  3 +
 .../selftests/bpf/bpfilter/test_match.c       | 63 ++++++++++++++++
 9 files changed, 270 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/match-ops-map.h
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 908fbad680ec..d1a36dd2c666 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o map-common.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o match.o context.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
new file mode 100644
index 000000000000..96735c7883bf
--- /dev/null
+++ b/net/bpfilter/context.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "context.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include "match.h"
+
+static int init_match_ops_map(struct context *ctx)
+{
+	const struct match_ops *match_ops[] = { &udp_match_ops };
+	int i, err;
+
+	err = create_match_ops_map(&ctx->match_ops_map, ARRAY_SIZE(match_ops));
+	if (err)
+		return err;
+
+	for (i = 0; i < ARRAY_SIZE(match_ops); ++i) {
+		err = match_ops_map_insert(&ctx->match_ops_map, match_ops[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+int create_context(struct context *ctx)
+{
+	return init_match_ops_map(ctx);
+}
+
+void free_context(struct context *ctx)
+{
+	free_match_ops_map(&ctx->match_ops_map);
+}
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index e85c97c3d010..a3e737b603f0 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -8,9 +8,15 @@
 
 #include <stdio.h>
 
+#include "match-ops-map.h"
+
 struct context {
 	FILE *log_file;
 	int log_level;
+	struct match_ops_map match_ops_map;
 };
 
+int create_context(struct context *ctx);
+void free_context(struct context *ctx);
+
 #endif // NET_BPFILTER_CONTEXT_H
diff --git a/net/bpfilter/match-ops-map.h b/net/bpfilter/match-ops-map.h
new file mode 100644
index 000000000000..0ff57f2d8da8
--- /dev/null
+++ b/net/bpfilter/match-ops-map.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_MATCH_OPS_MAP_H
+#define NET_BPFILTER_MATCH_OPS_MAP_H
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "match.h"
+
+struct match_ops_map {
+	struct hsearch_data index;
+};
+
+static inline int create_match_ops_map(struct match_ops_map *map, size_t nelem)
+{
+	return create_map(&map->index, nelem);
+}
+
+static inline const struct match_ops *match_ops_map_find(struct match_ops_map *map,
+							 const char *name)
+{
+	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
+
+	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
+		return map_find(&map->index, name);
+
+	return ERR_PTR(-EINVAL);
+}
+
+static inline int match_ops_map_insert(struct match_ops_map *map, const struct match_ops *match_ops)
+{
+	return map_insert(&map->index, match_ops->name, (void *)match_ops);
+}
+
+static inline void free_match_ops_map(struct match_ops_map *map)
+{
+	free_map(&map->index);
+}
+
+#endif // NET_BPFILTER_MATCT_OPS_MAP_H
diff --git a/net/bpfilter/match.c b/net/bpfilter/match.c
new file mode 100644
index 000000000000..aeca1b93cd2d
--- /dev/null
+++ b/net/bpfilter/match.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "match.h"
+
+#include <linux/err.h>
+#include <linux/netfilter/xt_tcpudp.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "bflog.h"
+#include "context.h"
+#include "match-ops-map.h"
+
+#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
+#define MATCH_SIZE(type) (sizeof(struct bpfilter_ipt_match) + BPFILTER_ALIGN(sizeof(type)))
+
+static int udp_match_check(struct context *ctx, const struct bpfilter_ipt_match *ipt_match)
+{
+	const struct xt_udp *udp;
+
+	udp = (const struct xt_udp *)&ipt_match->data;
+
+	if (udp->invflags & XT_UDP_INV_MASK) {
+		BFLOG_DEBUG(ctx, "cannot check match 'udp': invalid flags\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct match_ops udp_match_ops = { .name = "udp",
+					 .size = MATCH_SIZE(struct xt_udp),
+					 .revision = 0,
+					 .check = udp_match_check };
+
+int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match, struct match *match)
+{
+	const size_t maxlen = sizeof(ipt_match->u.user.name);
+	const struct match_ops *found;
+	int err;
+
+	if (strnlen(ipt_match->u.user.name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot init match: too long match name\n");
+		return -EINVAL;
+	}
+
+	found = match_ops_map_find(&ctx->match_ops_map, ipt_match->u.user.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find match by name: '%s'\n", ipt_match->u.user.name);
+		return PTR_ERR(found);
+	}
+
+	if (found->size != ipt_match->u.match_size ||
+	    found->revision != ipt_match->u.user.revision) {
+		BFLOG_DEBUG(ctx, "invalid match: '%s'\n", ipt_match->u.user.name);
+		return -EINVAL;
+	}
+
+	err = found->check(ctx, ipt_match);
+	if (err)
+		return err;
+
+	match->match_ops = found;
+	match->ipt_match = ipt_match;
+
+	return 0;
+}
diff --git a/net/bpfilter/match.h b/net/bpfilter/match.h
new file mode 100644
index 000000000000..79b7c87016d4
--- /dev/null
+++ b/net/bpfilter/match.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_MATCH_H
+#define NET_BPFILTER_MATCH_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <stdint.h>
+
+struct bpfilter_ipt_match;
+struct context;
+struct match_ops_map;
+
+struct match_ops {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	uint16_t size;
+	uint8_t revision;
+	int (*check)(struct context *ctx, const struct bpfilter_ipt_match *ipt_match);
+};
+
+extern const struct match_ops udp_match_ops;
+
+struct match {
+	const struct match_ops *match_ops;
+	const struct bpfilter_ipt_match *ipt_match;
+};
+
+int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match,
+	       struct match *match);
+
+#endif // NET_BPFILTER_MATCH_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index be10b50ca289..e5073231f811 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 test_io
 test_map
+test_match
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 77afbbdf27c5..362c9a28b88d 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -10,6 +10,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 
 TEST_GEN_PROGS += test_io
 TEST_GEN_PROGS += test_map
+TEST_GEN_PROGS += test_match
 
 KSFT_KHDR_INSTALL := 1
 
@@ -17,3 +18,5 @@ include ../../lib.mk
 
 $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
+$(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/map-common.c \
+	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c
diff --git a/tools/testing/selftests/bpf/bpfilter/test_match.c b/tools/testing/selftests/bpf/bpfilter/test_match.c
new file mode 100644
index 000000000000..3a56d79ed24c
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_match.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "context.h"
+#include "match.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_tcpudp.h>
+
+#include <stdio.h>
+
+#include "../../kselftest_harness.h"
+
+struct udp_match {
+	struct xt_entry_match ipt_match;
+	struct xt_udp udp;
+};
+
+FIXTURE(test_udp_match)
+{
+	struct context ctx;
+	struct udp_match udp_match;
+	struct match match;
+};
+
+FIXTURE_SETUP(test_udp_match)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->udp_match, 0, sizeof(self->udp_match));
+	snprintf(self->udp_match.ipt_match.u.user.name,
+		 sizeof(self->udp_match.ipt_match.u.user.name), "udp");
+	self->udp_match.ipt_match.u.user.match_size = sizeof(struct udp_match);
+	self->udp_match.ipt_match.u.user.revision = 0;
+};
+
+FIXTURE_TEARDOWN(test_udp_match)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_udp_match, init)
+{
+	self->udp_match.udp.spts[0] = 1;
+	self->udp_match.udp.spts[1] = 2;
+	self->udp_match.udp.dpts[0] = 3;
+	self->udp_match.udp.dpts[1] = 4;
+	self->udp_match.udp.invflags = 0;
+
+	ASSERT_EQ(init_match(&self->ctx,
+			     (const struct bpfilter_ipt_match *)&self->udp_match
+				     .ipt_match,
+			     &self->match),
+		  0);
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 07/11] bpfilter: Add struct target
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (5 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 06/11] bpfilter: Add struct match Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-20  4:36   ` Song Liu
  2021-05-17 22:53 ` [PATCH bpf-next 08/11] bpfilter: Add struct rule Dmitrii Banshchikov
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct target_ops defines polymorphic interface for targets. A target
consists of pointers to struct target_ops and struct xt_entry_target
which contains a payload for the target's type.

All target_ops are kept in map target_ops_map by their name.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/context.c                        |  34 +++++-
 net/bpfilter/context.h                        |   2 +
 net/bpfilter/target-ops-map.h                 |  49 ++++++++
 net/bpfilter/target.c                         | 112 ++++++++++++++++++
 net/bpfilter/target.h                         |  34 ++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |   5 +-
 .../selftests/bpf/bpfilter/bpfilter_util.h    |  31 +++++
 .../selftests/bpf/bpfilter/test_target.c      |  85 +++++++++++++
 10 files changed, 352 insertions(+), 3 deletions(-)
 create mode 100644 net/bpfilter/target-ops-map.h
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index d1a36dd2c666..f3de07bc8004 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o map-common.o match.o context.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
index 96735c7883bf..a77134008540 100644
--- a/net/bpfilter/context.c
+++ b/net/bpfilter/context.c
@@ -11,6 +11,7 @@
 #include <linux/list.h>
 
 #include "match.h"
+#include "target.h"
 
 static int init_match_ops_map(struct context *ctx)
 {
@@ -30,12 +31,43 @@ static int init_match_ops_map(struct context *ctx)
 	return 0;
 }
 
+static int init_target_ops_map(struct context *ctx)
+{
+	const struct target_ops *target_ops[] = { &standard_target_ops, &error_target_ops };
+	int i, err;
+
+	err = create_target_ops_map(&ctx->target_ops_map, ARRAY_SIZE(target_ops));
+	if (err)
+		return err;
+
+	for (i = 0; i < ARRAY_SIZE(target_ops); ++i) {
+		err = target_ops_map_insert(&ctx->target_ops_map, target_ops[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 int create_context(struct context *ctx)
 {
-	return init_match_ops_map(ctx);
+	int err;
+
+	err = init_match_ops_map(ctx);
+	if (err)
+		return err;
+
+	err = init_target_ops_map(ctx);
+	if (err) {
+		free_match_ops_map(&ctx->match_ops_map);
+		return err;
+	}
+
+	return 0;
 }
 
 void free_context(struct context *ctx)
 {
+	free_target_ops_map(&ctx->target_ops_map);
 	free_match_ops_map(&ctx->match_ops_map);
 }
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index a3e737b603f0..c62c1ba4781c 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -9,11 +9,13 @@
 #include <stdio.h>
 
 #include "match-ops-map.h"
+#include "target-ops-map.h"
 
 struct context {
 	FILE *log_file;
 	int log_level;
 	struct match_ops_map match_ops_map;
+	struct target_ops_map target_ops_map;
 };
 
 int create_context(struct context *ctx);
diff --git a/net/bpfilter/target-ops-map.h b/net/bpfilter/target-ops-map.h
new file mode 100644
index 000000000000..6b65241328da
--- /dev/null
+++ b/net/bpfilter/target-ops-map.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TARGET_OPS_MAP_H
+#define NET_BPFILTER_TARGET_OPS_MAP_H
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "target.h"
+
+struct target_ops_map {
+	struct hsearch_data index;
+};
+
+static inline int create_target_ops_map(struct target_ops_map *map, size_t nelem)
+{
+	return create_map(&map->index, nelem);
+}
+
+static inline const struct target_ops *target_ops_map_find(struct target_ops_map *map,
+							   const char *name)
+{
+	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
+
+	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
+		return map_find(&map->index, name);
+
+	return ERR_PTR(-EINVAL);
+}
+
+static inline int target_ops_map_insert(struct target_ops_map *map,
+					const struct target_ops *target_ops)
+{
+	return map_insert(&map->index, target_ops->name, (void *)target_ops);
+}
+
+static inline void free_target_ops_map(struct target_ops_map *map)
+{
+	free_map(&map->index);
+}
+
+#endif // NET_BPFILTER_TARGET_OPS_MAP_H
diff --git a/net/bpfilter/target.c b/net/bpfilter/target.c
new file mode 100644
index 000000000000..a18fe477f93c
--- /dev/null
+++ b/net/bpfilter/target.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "target.h"
+
+#include <linux/err.h>
+#include <linux/netfilter/x_tables.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "bflog.h"
+#include "context.h"
+#include "target-ops-map.h"
+
+static int convert_verdict(int verdict)
+{
+	return -verdict - 1;
+}
+
+static int standard_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
+{
+	const struct bpfilter_ipt_standard_target *standard_target;
+
+	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
+
+	if (standard_target->verdict > 0)
+		return 0;
+
+	if (standard_target->verdict < 0) {
+		if (standard_target->verdict == BPFILTER_RETURN)
+			return 0;
+
+		switch (convert_verdict(standard_target->verdict)) {
+		case BPFILTER_NF_ACCEPT:
+		case BPFILTER_NF_DROP:
+		case BPFILTER_NF_QUEUE:
+			return 0;
+		}
+	}
+
+	BFLOG_DEBUG(ctx, "invalid verdict: %d\n", standard_target->verdict);
+
+	return -EINVAL;
+}
+
+const struct target_ops standard_target_ops = {
+	.name = "",
+	.revision = 0,
+	.size = sizeof(struct xt_standard_target),
+	.check = standard_target_check,
+};
+
+static int error_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
+{
+	const struct bpfilter_ipt_error_target *error_target;
+	size_t maxlen;
+
+	error_target = (const struct bpfilter_ipt_error_target *)&ipt_target;
+	maxlen = sizeof(error_target->error_name);
+	if (strnlen(error_target->error_name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot check error target: too long errorname\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct target_ops error_target_ops = {
+	.name = "ERROR",
+	.revision = 0,
+	.size = sizeof(struct xt_error_target),
+	.check = error_target_check,
+};
+
+int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
+		struct target *target)
+{
+	const size_t maxlen = sizeof(ipt_target->u.user.name);
+	const struct target_ops *found;
+	int err;
+
+	if (strnlen(ipt_target->u.user.name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot init target: too long target name\n");
+		return -EINVAL;
+	}
+
+	found = target_ops_map_find(&ctx->target_ops_map, ipt_target->u.user.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find target by name: '%s'\n", ipt_target->u.user.name);
+		return PTR_ERR(found);
+	}
+
+	if (found->size != ipt_target->u.target_size ||
+	    found->revision != ipt_target->u.user.revision) {
+		BFLOG_DEBUG(ctx, "invalid target: '%s'\n", ipt_target->u.user.name);
+		return -EINVAL;
+	}
+
+	err = found->check(ctx, ipt_target);
+	if (err)
+		return err;
+
+	target->target_ops = found;
+	target->ipt_target = ipt_target;
+
+	return 0;
+}
diff --git a/net/bpfilter/target.h b/net/bpfilter/target.h
new file mode 100644
index 000000000000..5d9c4c459c05
--- /dev/null
+++ b/net/bpfilter/target.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TARGET_H
+#define NET_BPFILTER_TARGET_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <stdint.h>
+
+struct context;
+struct target_ops_map;
+
+struct target_ops {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	uint16_t size;
+	uint8_t revision;
+	int (*check)(struct context *ctx, const struct bpfilter_ipt_target *ipt_target);
+};
+
+struct target {
+	const struct target_ops *target_ops;
+	const struct bpfilter_ipt_target *ipt_target;
+};
+
+extern const struct target_ops standard_target_ops;
+extern const struct target_ops error_target_ops;
+
+int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
+		struct target *target);
+
+#endif // NET_BPFILTER_TARGET_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index e5073231f811..1856d0515f49 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -2,3 +2,4 @@
 test_io
 test_map
 test_match
+test_target
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 362c9a28b88d..78da74b9ee68 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 TEST_GEN_PROGS += test_io
 TEST_GEN_PROGS += test_map
 TEST_GEN_PROGS += test_match
+TEST_GEN_PROGS += test_target
 
 KSFT_KHDR_INSTALL := 1
 
@@ -19,4 +20,6 @@ include ../../lib.mk
 $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
 $(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/map-common.c \
-	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c
+	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c
+$(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c \
+	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c
diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
new file mode 100644
index 000000000000..d82ff86f280e
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BPFILTER_UTIL_H
+#define BPFILTER_UTIL_H
+
+#include <linux/bpfilter.h>
+#include <linux/netfilter/x_tables.h>
+
+#include <stdio.h>
+
+static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
+					int verdict)
+{
+	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
+		 BPFILTER_STANDARD_TARGET);
+	ipt_target->target.u.user.revision = revision;
+	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
+	ipt_target->verdict = verdict;
+}
+
+static inline void init_error_target(struct xt_error_target *ipt_target, int revision,
+				     const char *error_name)
+{
+	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
+		 BPFILTER_ERROR_TARGET);
+	ipt_target->target.u.user.revision = revision;
+	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
+	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
+}
+
+#endif // BPFILTER_UTIL_H
diff --git a/tools/testing/selftests/bpf/bpfilter/test_target.c b/tools/testing/selftests/bpf/bpfilter/test_target.c
new file mode 100644
index 000000000000..6765497b53c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_target.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "context.h"
+#include "target.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+
+#include "../../kselftest_harness.h"
+
+#include "bpfilter_util.h"
+
+FIXTURE(test_standard_target)
+{
+	struct context ctx;
+	struct xt_standard_target ipt_target;
+	struct target target;
+};
+
+FIXTURE_VARIANT(test_standard_target)
+{
+	int verdict;
+};
+
+FIXTURE_VARIANT_ADD(test_standard_target, accept) {
+	.verdict = -BPFILTER_NF_ACCEPT - 1,
+};
+
+FIXTURE_VARIANT_ADD(test_standard_target, drop) {
+	.verdict = -BPFILTER_NF_DROP - 1,
+};
+
+FIXTURE_SETUP(test_standard_target)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
+	init_standard_target(&self->ipt_target, 0, variant->verdict);
+}
+
+FIXTURE_TEARDOWN(test_standard_target)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_standard_target, init)
+{
+	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
+				 &self->target));
+}
+
+FIXTURE(test_error_target)
+{
+	struct context ctx;
+	struct xt_error_target ipt_target;
+	struct target target;
+};
+
+FIXTURE_SETUP(test_error_target)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
+	init_error_target(&self->ipt_target, 0, "x");
+}
+
+FIXTURE_TEARDOWN(test_error_target)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_error_target, init)
+{
+	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
+				 &self->target));
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 08/11] bpfilter: Add struct rule
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (6 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 07/11] bpfilter: Add struct target Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-17 22:53 ` [PATCH bpf-next 09/11] bpfilter: Add struct table Dmitrii Banshchikov
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct rule is an equivalent of struct ipt_entry. A rule consists of
zero or more matches and a target. A rule tracks its own offset in
iptables' entries blob. struct rule should simplify iteration over a
blob and avoid blob's guts in code generation.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/rule.c                           | 128 ++++++++++++++++++
 net/bpfilter/rule.h                           |  27 ++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |   4 +
 .../selftests/bpf/bpfilter/bpfilter_util.h    |   8 ++
 .../selftests/bpf/bpfilter/test_rule.c        |  55 ++++++++
 7 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index f3de07bc8004..1191770d41f7 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o rule.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/rule.c b/net/bpfilter/rule.c
new file mode 100644
index 000000000000..0cbee4656070
--- /dev/null
+++ b/net/bpfilter/rule.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "rule.h"
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <stdlib.h>
+
+#include "context.h"
+#include "bflog.h"
+
+static const struct bpfilter_ipt_target *
+ipt_entry_target(const struct bpfilter_ipt_entry *ipt_entry)
+{
+	return (const void *)ipt_entry + ipt_entry->target_offset;
+}
+
+static const struct bpfilter_ipt_match *ipt_entry_match(const struct bpfilter_ipt_entry *entry,
+							size_t offset)
+{
+	return (const void *)entry + offset;
+}
+
+static int ipt_entry_num_matches(const struct bpfilter_ipt_entry *ipt_entry)
+{
+	const struct bpfilter_ipt_match *ipt_match;
+	uint32_t offset = sizeof(*ipt_entry);
+	int num_matches = 0;
+
+	while (offset < ipt_entry->target_offset) {
+		ipt_match = ipt_entry_match(ipt_entry, offset);
+
+		if (ipt_entry->target_offset < offset + ipt_match->u.match_size)
+			return -EINVAL;
+
+		++num_matches;
+		offset += ipt_match->u.match_size;
+	}
+
+	if (offset != ipt_entry->target_offset)
+		return -EINVAL;
+
+	return num_matches;
+}
+
+static int check_ipt_entry_ip(struct context *ctx, const struct bpfilter_ipt_ip *ip)
+{
+	if (ip->flags & ~BPFILTER_IPT_F_MASK)
+		return -EINVAL;
+
+	if (ip->invflags & ~BPFILTER_IPT_INV_MASK)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int init_rule_matches(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry,
+			     struct rule *rule)
+{
+	const struct bpfilter_ipt_match *ipt_match;
+	uint32_t offset = sizeof(*ipt_entry);
+	struct match *match;
+	int err;
+
+	rule->matches = calloc(rule->num_matches, sizeof(rule->matches[0]));
+	if (!rule->matches)
+		return -ENOMEM;
+
+	match = rule->matches;
+	while (offset < ipt_entry->target_offset) {
+		ipt_match = ipt_entry_match(ipt_entry, offset);
+		err = init_match(ctx, ipt_match, match);
+		if (err)
+			return err;
+
+		++match;
+		offset += ipt_match->u.match_size;
+	}
+
+	return 0;
+}
+
+int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule)
+{
+	const struct bpfilter_ipt_target *ipt_target;
+	int err;
+
+	err = check_ipt_entry_ip(ctx, &ipt_entry->ip);
+
+	if (ipt_entry->next_offset < ipt_entry->target_offset)
+		return -EINVAL;
+
+	if (ipt_entry->target_offset < sizeof(*ipt_entry))
+		return -EINVAL;
+
+	ipt_target = ipt_entry_target(ipt_entry);
+	if (ipt_target->u.target_size != ipt_entry->next_offset - ipt_entry->target_offset)
+		return -EINVAL;
+
+	err = init_target(ctx, ipt_target, &rule->target);
+	if (err)
+		return err;
+
+	rule->num_matches = ipt_entry_num_matches(ipt_entry);
+	if (rule->num_matches < 0)
+		return rule->num_matches;
+
+	err = init_rule_matches(ctx, ipt_entry, rule);
+	if (err) {
+		free_rule(rule);
+		return err;
+	}
+
+	return 0;
+}
+
+void free_rule(struct rule *rule)
+{
+	free(rule->matches);
+}
diff --git a/net/bpfilter/rule.h b/net/bpfilter/rule.h
new file mode 100644
index 000000000000..b445962485a4
--- /dev/null
+++ b/net/bpfilter/rule.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_RULE_H
+#define NET_BPFILTER_RULE_H
+
+#include <stdint.h>
+
+#include "target.h"
+
+struct bpfilter_ipt_entry;
+struct context;
+struct match;
+
+struct rule {
+	uint32_t offset;
+	uint16_t num_matches;
+	struct match *matches;
+	struct target target;
+};
+
+int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule);
+void free_rule(struct rule *rule);
+
+#endif // NET_BPFILTER_RULE_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index 1856d0515f49..c5ccfe4db881 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -3,3 +3,4 @@ test_io
 test_map
 test_match
 test_target
+test_rule
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 78da74b9ee68..02d860e02c58 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -12,6 +12,7 @@ TEST_GEN_PROGS += test_io
 TEST_GEN_PROGS += test_map
 TEST_GEN_PROGS += test_match
 TEST_GEN_PROGS += test_target
+TEST_GEN_PROGS += test_rule
 
 KSFT_KHDR_INSTALL := 1
 
@@ -23,3 +24,6 @@ $(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/m
 	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c
 $(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c \
 	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c
+$(OUTPUT)/test_rule: test_rule.c $(BPFILTERSRCDIR)/rule.c $(BPFILTERSRCDIR)/bflog.c 		\
+	$(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/match.c	\
+	$(BPFILTERSRCDIR)/target.c
diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
index d82ff86f280e..55fb0e959fca 100644
--- a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
+++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
@@ -7,6 +7,7 @@
 #include <linux/netfilter/x_tables.h>
 
 #include <stdio.h>
+#include <string.h>
 
 static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
 					int verdict)
@@ -28,4 +29,11 @@ static inline void init_error_target(struct xt_error_target *ipt_target, int rev
 	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
 }
 
+static inline void init_standard_entry(struct ipt_entry *entry, __u16 matches_size)
+{
+	memset(entry, 0, sizeof(*entry));
+	entry->target_offset = sizeof(*entry) + matches_size;
+	entry->next_offset = sizeof(*entry) + matches_size + sizeof(struct xt_standard_target);
+}
+
 #endif // BPFILTER_UTIL_H
diff --git a/tools/testing/selftests/bpf/bpfilter/test_rule.c b/tools/testing/selftests/bpf/bpfilter/test_rule.c
new file mode 100644
index 000000000000..fe12adf32fe5
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_rule.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "rule.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter_ipv4/ip_tables.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "../../kselftest_harness.h"
+
+#include "context.h"
+#include "rule.h"
+
+#include "bpfilter_util.h"
+
+FIXTURE(test_standard_rule)
+{
+	struct context ctx;
+	struct {
+		struct ipt_entry entry;
+		struct xt_standard_target target;
+	} entry;
+	struct rule rule;
+};
+
+FIXTURE_SETUP(test_standard_rule)
+{
+	const int verdict = BPFILTER_NF_ACCEPT;
+
+	ASSERT_EQ(create_context(&self->ctx), 0);
+	self->ctx.log_file = stderr;
+
+	init_standard_entry(&self->entry.entry, 0);
+	init_standard_target(&self->entry.target, 0, -verdict - 1);
+}
+
+FIXTURE_TEARDOWN(test_standard_rule)
+{
+	free_rule(&self->rule);
+	free_context(&self->ctx);
+}
+
+TEST_F(test_standard_rule, init)
+{
+	ASSERT_EQ(0, init_rule(&self->ctx, (const struct bpfilter_ipt_entry *)&self->entry.entry,
+			       &self->rule));
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 09/11] bpfilter: Add struct table
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (7 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 08/11] bpfilter: Add struct rule Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-20 18:07   ` Song Liu
  2021-05-17 22:53 ` [PATCH bpf-next 10/11] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

A table keeps iptables' blob and an array of struct rule for this blob.
The array of rules provides more convenient way to interact with blob's
entries.

All tables are stored in table_ops_map map which is used for lookups.
Also all tables are linked into a list that is used for freeing them.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/context.c                        | 103 +++++++++++
 net/bpfilter/context.h                        |   3 +
 net/bpfilter/table-map.h                      |  41 +++++
 net/bpfilter/table.c                          | 167 ++++++++++++++++++
 net/bpfilter/table.h                          |  33 ++++
 tools/testing/selftests/bpf/bpfilter/Makefile |  10 +-
 7 files changed, 354 insertions(+), 5 deletions(-)
 create mode 100644 net/bpfilter/table-map.h
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 1191770d41f7..e0090563f2ca 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o rule.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o rule.o table.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
index a77134008540..5d9679212b25 100644
--- a/net/bpfilter/context.c
+++ b/net/bpfilter/context.c
@@ -10,7 +10,11 @@
 #include <linux/err.h>
 #include <linux/list.h>
 
+#include <stddef.h>
+
 #include "match.h"
+#include "rule.h"
+#include "table.h"
 #include "target.h"
 
 static int init_match_ops_map(struct context *ctx)
@@ -49,6 +53,86 @@ static int init_target_ops_map(struct context *ctx)
 	return 0;
 }
 
+static void init_standard_entry(struct bpfilter_ipt_standard_entry *ipt_entry)
+{
+	ipt_entry->entry.next_offset = sizeof(*ipt_entry);
+	ipt_entry->entry.target_offset = sizeof(ipt_entry->entry);
+	ipt_entry->target.target.u.user.revision = 0;
+	ipt_entry->target.target.u.user.target_size = sizeof(struct bpfilter_ipt_standard_target);
+	ipt_entry->target.verdict = -BPFILTER_NF_ACCEPT - 1;
+}
+
+static void init_error_entry(struct bpfilter_ipt_error_entry *ipt_entry)
+{
+	ipt_entry->entry.next_offset = sizeof(*ipt_entry);
+	ipt_entry->entry.target_offset = sizeof(ipt_entry->entry);
+	ipt_entry->target.target.u.target_size = sizeof(struct bpfilter_ipt_error_target);
+	ipt_entry->target.target.u.user.revision = 0;
+	snprintf(ipt_entry->target.target.u.user.name, sizeof(ipt_entry->target.target.u.user.name),
+		 "ERROR");
+}
+
+static struct table *create_filter_table(struct context *ctx)
+{
+	struct filter_table_entries {
+		struct bpfilter_ipt_standard_entry local_in;
+		struct bpfilter_ipt_standard_entry forward;
+		struct bpfilter_ipt_standard_entry local_out;
+		struct bpfilter_ipt_error_entry error;
+	};
+
+	struct filter_table {
+		struct bpfilter_ipt_replace replace;
+		struct filter_table_entries entries;
+	} filter_table;
+
+	memset(&filter_table, 0, sizeof(filter_table));
+
+	snprintf(filter_table.replace.name, sizeof(filter_table.replace.name), "filter");
+	filter_table.replace.valid_hooks = 1 << BPFILTER_INET_HOOK_LOCAL_IN |
+					   1 << BPFILTER_INET_HOOK_FORWARD |
+					   1 << BPFILTER_INET_HOOK_LOCAL_OUT;
+	filter_table.replace.num_entries = 4;
+	filter_table.replace.size = sizeof(struct filter_table_entries);
+
+	filter_table.replace.hook_entry[BPFILTER_INET_HOOK_FORWARD] =
+		offsetof(struct filter_table_entries, forward);
+	filter_table.replace.underflow[BPFILTER_INET_HOOK_FORWARD] =
+		offsetof(struct filter_table_entries, forward);
+
+	filter_table.replace.hook_entry[BPFILTER_INET_HOOK_LOCAL_OUT] =
+		offsetof(struct filter_table_entries, local_out);
+	filter_table.replace.underflow[BPFILTER_INET_HOOK_LOCAL_OUT] =
+		offsetof(struct filter_table_entries, local_out);
+
+	init_standard_entry(&filter_table.entries.local_in);
+	init_standard_entry(&filter_table.entries.forward);
+	init_standard_entry(&filter_table.entries.local_out);
+	init_error_entry(&filter_table.entries.error);
+
+	return create_table(ctx, &filter_table.replace);
+}
+
+static int init_table_map(struct context *ctx)
+{
+	struct table *table;
+	int err;
+
+	err = create_table_map(&ctx->table_map, 1);
+	if (err)
+		return err;
+
+	table = create_filter_table(ctx);
+	if (IS_ERR(table)) {
+		free_table_map(&ctx->table_map);
+		return PTR_ERR(table);
+	}
+
+	list_add_tail(&table->list, &ctx->table_list);
+
+	return table_map_insert(&ctx->table_map, table);
+}
+
 int create_context(struct context *ctx)
 {
 	int err;
@@ -63,11 +147,30 @@ int create_context(struct context *ctx)
 		return err;
 	}
 
+	INIT_LIST_HEAD(&ctx->table_list);
+
+	err = init_table_map(ctx);
+	if (err) {
+		free_match_ops_map(&ctx->match_ops_map);
+		free_target_ops_map(&ctx->target_ops_map);
+		return err;
+	}
+
 	return 0;
 }
 
 void free_context(struct context *ctx)
 {
+	struct list_head *t, *n;
+
+	list_for_each_safe(t, n, &ctx->table_list) {
+		struct table *table;
+
+		table = list_entry(t, struct table, list);
+		free_table(table);
+	}
+
+	free_table_map(&ctx->table_map);
 	free_target_ops_map(&ctx->target_ops_map);
 	free_match_ops_map(&ctx->match_ops_map);
 }
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index c62c1ba4781c..2d9e3fafb0f8 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -10,12 +10,15 @@
 
 #include "match-ops-map.h"
 #include "target-ops-map.h"
+#include "table-map.h"
 
 struct context {
 	FILE *log_file;
 	int log_level;
 	struct match_ops_map match_ops_map;
 	struct target_ops_map target_ops_map;
+	struct table_map table_map;
+	struct list_head table_list;
 };
 
 int create_context(struct context *ctx);
diff --git a/net/bpfilter/table-map.h b/net/bpfilter/table-map.h
new file mode 100644
index 000000000000..6c5340e88542
--- /dev/null
+++ b/net/bpfilter/table-map.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TABLE_MAP_H
+#define NET_BPFILTER_TABLE_MAP_H
+
+#include "map-common.h"
+#include "table.h"
+
+struct table_map {
+	struct hsearch_data index;
+};
+
+static inline int create_table_map(struct table_map *map, size_t nelem)
+{
+	return create_map(&map->index, nelem);
+}
+
+static inline struct table *table_map_find(struct table_map *map, const char *name)
+{
+	return map_find(&map->index, name);
+}
+
+static inline int table_map_update(struct table_map *map, const char *name, void *data)
+{
+	return map_update(&map->index, name, data);
+}
+
+static inline int table_map_insert(struct table_map *map, struct table *table)
+{
+	return map_insert(&map->index, table->name, table);
+}
+
+static inline void free_table_map(struct table_map *map)
+{
+	free_map(&map->index);
+}
+
+#endif // NET_BPFILTER_TABLE_MAP_H
diff --git a/net/bpfilter/table.c b/net/bpfilter/table.c
new file mode 100644
index 000000000000..e8be6369ef71
--- /dev/null
+++ b/net/bpfilter/table.c
@@ -0,0 +1,167 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "table.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "context.h"
+#include "rule.h"
+#include "table-map.h"
+
+static int rule_offset_comparator(const void *x, const void *y)
+{
+	const struct rule *rule = y;
+	const uint32_t *offset = x;
+
+	return *offset < rule->offset ? -1 : *offset - rule->offset;
+}
+
+static struct rule *table_get_rule_by_offset(struct table *table, uint32_t offset)
+{
+	return bsearch(&offset, table->rules, table->num_rules, sizeof(table->rules[0]),
+		       rule_offset_comparator);
+}
+
+static int table_init_rules(struct context *ctx, struct table *table,
+			    const struct bpfilter_ipt_replace *ipt_replace)
+{
+	uint32_t offset;
+	int i;
+
+	table->entries = malloc(table->size);
+	if (!table->entries)
+		return -ENOMEM;
+
+	memcpy(table->entries, ipt_replace->entries, table->size);
+
+	table->rules = calloc(table->num_rules, sizeof(table->rules[0]));
+	if (!table->rules)
+		return -ENOMEM;
+
+	offset = 0;
+	for (i = 0; i < table->num_rules; ++i) {
+		const struct bpfilter_ipt_entry *ipt_entry;
+		int err;
+
+		if (table->size < offset)
+			return -EINVAL;
+
+		if (table->size < offset + sizeof(*ipt_entry))
+			return -EINVAL;
+
+		ipt_entry = table->entries + offset;
+
+		if (table->size < offset + ipt_entry->next_offset)
+			return -EINVAL;
+
+		err = init_rule(ctx, ipt_entry, &table->rules[i]);
+		if (err)
+			return err;
+
+		table->rules[i].offset = offset;
+		offset += ipt_entry->next_offset;
+	}
+
+	if (offset != ipt_replace->size)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int table_init_hooks(struct table *table, const struct bpfilter_ipt_replace *ipt_replace)
+{
+	int i;
+
+	for (i = 0; i < BPFILTER_INET_HOOK_MAX; ++i) {
+		if (!(table->valid_hooks & (1 << i)))
+			continue;
+
+		table->hook_entry[i] = table_get_rule_by_offset(table, ipt_replace->hook_entry[i]);
+		table->underflow[i] = table_get_rule_by_offset(table, ipt_replace->underflow[i]);
+
+		if (!table->hook_entry[i] || !table->underflow[i])
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+struct table *create_table(struct context *ctx, const struct bpfilter_ipt_replace *ipt_replace)
+{
+	struct table *table;
+	int err;
+
+	table = calloc(1, sizeof(*table));
+	if (!table)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&table->list);
+	snprintf(table->name, sizeof(table->name), "%s", ipt_replace->name);
+	table->valid_hooks = ipt_replace->valid_hooks;
+	table->num_rules = ipt_replace->num_entries;
+	table->num_counters = ipt_replace->num_counters;
+	table->size = ipt_replace->size;
+
+	err = table_init_rules(ctx, table, ipt_replace);
+	if (err)
+		goto err_free;
+
+	err = table_init_hooks(table, ipt_replace);
+	if (err)
+		goto err_free;
+
+	return table;
+
+err_free:
+	free_table(table);
+
+	return ERR_PTR(err);
+}
+
+void table_get_info(const struct table *table, struct bpfilter_ipt_get_info *info)
+{
+	int i;
+
+	snprintf(info->name, sizeof(info->name), "%s", table->name);
+	info->valid_hooks = table->valid_hooks;
+	for (i = 0; i < BPFILTER_INET_HOOK_MAX; ++i) {
+		const struct rule *hook_entry = table->hook_entry[i];
+		const struct rule *underflow = table->underflow[i];
+
+		info->hook_entry[i] = hook_entry ? hook_entry->offset : 0;
+		info->underflow[i] = underflow ? underflow->offset : 0;
+	}
+	info->num_entries = table->num_rules;
+	info->size = table->size;
+}
+
+void free_table(struct table *table)
+{
+	int i;
+
+	if (!table)
+		return;
+
+	list_del(&table->list);
+
+	if (table->rules) {
+		for (i = 0; i < table->num_rules; ++i)
+			free_rule(&table->rules[i]);
+		free(table->rules);
+	}
+
+	free(table->entries);
+	free(table);
+}
diff --git a/net/bpfilter/table.h b/net/bpfilter/table.h
new file mode 100644
index 000000000000..101f6d813c8c
--- /dev/null
+++ b/net/bpfilter/table.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TABLE_H
+#define NET_BPFILTER_TABLE_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <stdint.h>
+
+struct context;
+struct rule;
+
+struct table {
+	struct list_head list;
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	uint32_t valid_hooks;
+	uint32_t num_rules;
+	uint32_t num_counters;
+	uint32_t size;
+	struct rule *hook_entry[BPFILTER_INET_HOOK_MAX];
+	struct rule *underflow[BPFILTER_INET_HOOK_MAX];
+	struct rule *rules;
+	void *entries;
+};
+
+struct table *create_table(struct context *ctx, const struct bpfilter_ipt_replace *ipt_replace);
+void table_get_info(const struct table *table, struct bpfilter_ipt_get_info *info);
+void free_table(struct table *table);
+
+#endif // NET_BPFILTER_TABLE_H
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 02d860e02c58..131174dd2bdf 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -21,9 +21,11 @@ include ../../lib.mk
 $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
 $(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/map-common.c \
-	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c
-$(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c \
-	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c
+	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c 		\
+	$(BPFILTERSRCDIR)/table.c $(BPFILTERSRCDIR)/rule.c
+$(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c 	\
+	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c				\
+	$(BPFILTERSRCDIR)/table.c $(BPFILTERSRCDIR)/rule.c
 $(OUTPUT)/test_rule: test_rule.c $(BPFILTERSRCDIR)/rule.c $(BPFILTERSRCDIR)/bflog.c 		\
 	$(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/match.c	\
-	$(BPFILTERSRCDIR)/target.c
+	$(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/table.c $(BPFILTERSRCDIR)/rule.c
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 10/11] bpfilter: Add handling of setsockopt() calls
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (8 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 09/11] bpfilter: Add struct table Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-17 22:53 ` [PATCH bpf-next 11/11] bpfilter: Handle setsockopts Dmitrii Banshchikov
  2021-05-20  4:54 ` [PATCH bpf-next 00/11] bpfilter Song Liu
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Add support of iptables' setsockopt(2).
The parameters of a setsockopt(2) call are passed by struct mbox_request
which contains a type of the setsockopt(2) call and its memory buffer
description.

The typical way to handle hooked setsockopt(2) call is to read the
supplied memory buffer via pvm_read(), process it and write an answer to
the process memory via pvm_write().

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile  |   3 +-
 net/bpfilter/sockopt.c | 357 +++++++++++++++++++++++++++++++++++++++++
 net/bpfilter/sockopt.h |  14 ++
 3 files changed, 373 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index e0090563f2ca..bae14fc9e396 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,8 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o rule.o table.o
+bpfilter_umh-objs := main.o bflog.o io.o map-common.o context.o match.o target.o rule.o table.o \
+	sockopt.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/sockopt.c b/net/bpfilter/sockopt.c
new file mode 100644
index 000000000000..16f0ade203c0
--- /dev/null
+++ b/net/bpfilter/sockopt.c
@@ -0,0 +1,357 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "sockopt.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include <errno.h>
+#include <limits.h>
+#include <stdlib.h>
+
+#include "bflog.h"
+#include "context.h"
+#include "io.h"
+#include "msgfmt.h"
+#include "table-map.h"
+
+static int read_ipt_get_info(const struct mbox_request *req, struct bpfilter_ipt_get_info *info)
+{
+	int err;
+
+	if (req->len != sizeof(*info))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, info, (const void *)req->addr, sizeof(*info));
+	if (err)
+		return err;
+
+	info->name[sizeof(info->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_info(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_info info;
+	struct table *table;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_INFO\n");
+
+	if (req->len != sizeof(info))
+		return -EINVAL;
+
+	err = read_ipt_get_info(req, &info);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_get_info: %s\n", strerror(-err));
+		return err;
+	}
+
+	table = table_map_find(&ctx->table_map, info.name);
+	if (IS_ERR(table)) {
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", info.name);
+		return -ENOENT;
+	}
+
+	table_get_info(table, &info);
+
+	return pvm_write(req->pid, (void *)req->addr, &info, sizeof(info));
+}
+
+static int read_ipt_get_entries(const struct mbox_request *req,
+				struct bpfilter_ipt_get_entries *entries)
+{
+	int err;
+
+	if (req->len < sizeof(*entries))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, entries, (const void *)req->addr, sizeof(*entries));
+	if (err)
+		return err;
+
+	entries->name[sizeof(entries->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_entries(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_entries get_entries, *entries;
+	struct table *table;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_ENTRIES\n");
+
+	err = read_ipt_get_entries(req, &get_entries);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_get_entries: %s\n", strerror(-err));
+		return err;
+	}
+
+	table = table_map_find(&ctx->table_map, get_entries.name);
+	if (IS_ERR(table)) {
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", get_entries.name);
+		return -ENOENT;
+	}
+
+	if (get_entries.size != table->size) {
+		BFLOG_DEBUG(ctx, "table '%s' get entries size mismatch\n", get_entries.name);
+		return -EINVAL;
+	}
+
+	entries = (struct bpfilter_ipt_get_entries *)req->addr;
+
+	err = pvm_write(req->pid, entries->name, table->name, sizeof(entries->name));
+	if (err)
+		return err;
+
+	err = pvm_write(req->pid, &entries->size, &table->size, sizeof(table->size));
+	if (err)
+		return err;
+
+	return pvm_write(req->pid, entries->entries, table->entries, table->size);
+}
+
+static int read_ipt_get_revision(const struct mbox_request *req,
+				 struct bpfilter_ipt_get_revision *revision)
+{
+	int err;
+
+	if (req->len != sizeof(*revision))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, revision, (const void *)req->addr, sizeof(*revision));
+	if (err)
+		return err;
+
+	revision->name[sizeof(revision->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_revision_match(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_revision get_revision;
+	const struct match_ops *found;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_REVISION_MATCH\n");
+
+	err = read_ipt_get_revision(req, &get_revision);
+	if (err)
+		return err;
+
+	found = match_ops_map_find(&ctx->match_ops_map, get_revision.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find match: '%s'\n", get_revision.name);
+		return PTR_ERR(found);
+	}
+
+	return found->revision;
+}
+
+static struct bpfilter_ipt_replace *read_ipt_replace(const struct mbox_request *req)
+{
+	struct bpfilter_ipt_replace *replace;
+	int err;
+
+	if (req->len < sizeof(*replace))
+		return ERR_PTR(-EINVAL);
+
+	replace = malloc(req->len);
+	if (!replace)
+		return ERR_PTR(-ENOMEM);
+
+	err = pvm_read(req->pid, replace, (const void *)req->addr, sizeof(*replace));
+	if (err)
+		goto err_free;
+
+	if (replace->num_counters == 0) {
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (replace->num_counters >= INT_MAX / sizeof(struct bpfilter_ipt_counters)) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	replace->name[sizeof(replace->name) - 1] = '\0';
+
+	// TODO: add more checks here
+
+	err = pvm_read(req->pid, replace->entries, (const void *)req->addr + sizeof(*replace),
+		       req->len - sizeof(*replace));
+	if (err)
+		goto err_free;
+
+	return replace;
+
+err_free:
+	free(replace);
+
+	return ERR_PTR(err);
+}
+
+static int sockopt_set_replace(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_replace *ipt_replace;
+	struct table *table, *new_table = NULL;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_SET_REPLACE\n");
+
+	ipt_replace = read_ipt_replace(req);
+	if (IS_ERR(ipt_replace)) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_replace: %s\n", strerror(-PTR_ERR(ipt_replace)));
+		return PTR_ERR(ipt_replace);
+	}
+
+	table = table_map_find(&ctx->table_map, ipt_replace->name);
+	if (IS_ERR(table)) {
+		err = PTR_ERR(table);
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", ipt_replace->name);
+		goto cleanup;
+	}
+
+	new_table = create_table(ctx, ipt_replace);
+	if (IS_ERR(new_table)) {
+		err = PTR_ERR(new_table);
+		BFLOG_DEBUG(ctx, "cannot read table: %s\n", strerror(-PTR_ERR(new_table)));
+		goto cleanup;
+	}
+
+	// Here be codegen
+	// ...
+	//
+
+	err = table_map_update(&ctx->table_map, new_table->name, new_table);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot update table map: %s\n", strerror(-err));
+		goto cleanup;
+	}
+
+	list_add_tail(&new_table->list, &ctx->table_list);
+	new_table = table;
+
+cleanup:
+	if (!IS_ERR(new_table))
+		free_table(new_table);
+
+	free(ipt_replace);
+
+	return err;
+}
+
+static struct bpfilter_ipt_counters_info *read_ipt_counters_info(const struct mbox_request *req)
+{
+	struct bpfilter_ipt_counters_info *info;
+	size_t size;
+	int err;
+
+	if (req->len < sizeof(*info))
+		return ERR_PTR(-EINVAL);
+
+	info = malloc(req->len);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	err = pvm_read(req->pid, info, (const void *)req->addr, sizeof(*info));
+	if (err)
+		goto err_free;
+
+	// TODO add more size checks here
+
+	size = info->num_counters * sizeof(info->counters[0]);
+	if (req->len != sizeof(*info) + size) {
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	info->name[sizeof(info->name) - 1] = '\0';
+
+	err = pvm_read(req->pid, info->counters, (const void *)req->addr + sizeof(*info), size);
+	if (err)
+		goto err_free;
+
+	return info;
+
+err_free:
+	free(info);
+
+	return ERR_PTR(err);
+}
+
+static int sockopt_set_add_counters(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_counters_info *info;
+	struct table *table;
+	int err = 0;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_SET_ADD_COUNTERS\n");
+
+	info = read_ipt_counters_info(req);
+	if (IS_ERR(info)) {
+		err = PTR_ERR(info);
+		BFLOG_DEBUG(ctx, "cannot read ipt_counters_info: %s\n", strerror(-err));
+		goto err_free;
+	}
+
+	table = table_map_find(&ctx->table_map, info->name);
+	if (IS_ERR(table)) {
+		err = PTR_ERR(table);
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", info->name);
+		goto err_free;
+	}
+
+	// TODO handle counters
+
+err_free:
+	free(info);
+
+	return err;
+}
+
+static int handle_get_request(struct context *ctx, const struct mbox_request *req)
+{
+	switch (req->cmd) {
+	case 0:
+		return 0;
+	case BPFILTER_IPT_SO_GET_INFO:
+		return sockopt_get_info(ctx, req);
+	case BPFILTER_IPT_SO_GET_ENTRIES:
+		return sockopt_get_entries(ctx, req);
+	case BPFILTER_IPT_SO_GET_REVISION_MATCH:
+		return sockopt_get_revision_match(ctx, req);
+	}
+
+	BFLOG_NOTICE(ctx, "Unexpected SO_GET request: %d\n", req->cmd);
+
+	return -ENOPROTOOPT;
+}
+
+static int handle_set_request(struct context *ctx, const struct mbox_request *req)
+{
+	switch (req->cmd) {
+	case BPFILTER_IPT_SO_SET_REPLACE:
+		return sockopt_set_replace(ctx, req);
+	case BPFILTER_IPT_SO_SET_ADD_COUNTERS:
+		return sockopt_set_add_counters(ctx, req);
+	}
+
+	BFLOG_NOTICE(ctx, "Unexpected SO_SET request: %d\n", req->cmd);
+
+	return -ENOPROTOOPT;
+}
+
+int handle_sockopt_request(struct context *ctx, const struct mbox_request *req)
+{
+	return req->is_set ? handle_set_request(ctx, req) : handle_get_request(ctx, req);
+}
diff --git a/net/bpfilter/sockopt.h b/net/bpfilter/sockopt.h
new file mode 100644
index 000000000000..711c2f295d89
--- /dev/null
+++ b/net/bpfilter/sockopt.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_SOCKOPT_H
+#define NET_BPFILTER_SOCKOPT_H
+
+struct context;
+struct mbox_request;
+
+int handle_sockopt_request(struct context *ctx, const struct mbox_request *req);
+
+#endif // NET_BPFILTER_SOCKOPT_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next 11/11] bpfilter: Handle setsockopts
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (9 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 10/11] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
@ 2021-05-17 22:53 ` Dmitrii Banshchikov
  2021-05-20  4:54 ` [PATCH bpf-next 00/11] bpfilter Song Liu
  11 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-17 22:53 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Use earlier introduced infrastructure for and handle setsockopt(2)
calls.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/main.c | 99 ++++++++++++++++++++++++---------------------
 1 file changed, 53 insertions(+), 46 deletions(-)

diff --git a/net/bpfilter/main.c b/net/bpfilter/main.c
index 05e1cfc1e5cd..19c8c2d7ef87 100644
--- a/net/bpfilter/main.c
+++ b/net/bpfilter/main.c
@@ -1,64 +1,71 @@
 // SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
 #define _GNU_SOURCE
-#include <sys/uio.h>
+
+#include <unistd.h>
 #include <errno.h>
+
 #include <stdio.h>
-#include <sys/socket.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include "../../include/uapi/linux/bpf.h"
-#include <asm/unistd.h>
-#include "msgfmt.h"
+#include <string.h>
 
-FILE *debug_f;
+#include "bflog.h"
+#include "context.h"
+#include "io.h"
+#include "msgfmt.h"
+#include "sockopt.h"
 
-static int handle_get_cmd(struct mbox_request *cmd)
+static int setup_context(struct context *ctx)
 {
-	switch (cmd->cmd) {
-	case 0:
-		return 0;
-	default:
-		break;
-	}
-	return -ENOPROTOOPT;
-}
+	ctx->log_file = fopen("/dev/kmsg", "w");
+	if (!ctx->log_file)
+		return -errno;
 
-static int handle_set_cmd(struct mbox_request *cmd)
-{
-	return -ENOPROTOOPT;
+	setvbuf(ctx->log_file, 0, _IOLBF, 0);
+	ctx->log_level = BFLOG_LEVEL_NOTICE;
+
+	return 0;
 }
 
-static void loop(void)
+static void loop(struct context *ctx)
 {
-	while (1) {
-		struct mbox_request req;
-		struct mbox_reply reply;
-		int n;
-
-		n = read(0, &req, sizeof(req));
-		if (n != sizeof(req)) {
-			fprintf(debug_f, "invalid request %d\n", n);
-			return;
-		}
-
-		reply.status = req.is_set ?
-			handle_set_cmd(&req) :
-			handle_get_cmd(&req);
-
-		n = write(1, &reply, sizeof(reply));
-		if (n != sizeof(reply)) {
-			fprintf(debug_f, "reply failed %d\n", n);
-			return;
-		}
+	struct mbox_request req;
+	struct mbox_reply reply;
+	int err;
+
+	for (;;) {
+		err = read_exact(STDIN_FILENO, &req, sizeof(req));
+		if (err)
+			BFLOG_FATAL(ctx, "cannot read request: %s\n", strerror(-err));
+
+		reply.status = handle_sockopt_request(ctx, &req);
+
+		err = write_exact(STDOUT_FILENO, &reply, sizeof(reply));
+		if (err)
+			BFLOG_FATAL(ctx, "cannot write reply: %s\n", strerror(-err));
 	}
 }
 
 int main(void)
 {
-	debug_f = fopen("/dev/kmsg", "w");
-	setvbuf(debug_f, 0, _IOLBF, 0);
-	fprintf(debug_f, "Started bpfilter\n");
-	loop();
-	fclose(debug_f);
+	struct context ctx;
+	int err;
+
+	err = create_context(&ctx);
+	if (err)
+		return err;
+
+	err = setup_context(&ctx);
+	if (err) {
+		free_context(&ctx);
+		return err;
+	}
+
+	BFLOG_NOTICE(&ctx, "started\n");
+
+	loop(&ctx);
+
 	return 0;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 02/11] bpfilter: Add logging facility
  2021-05-17 22:52 ` [PATCH bpf-next 02/11] bpfilter: Add logging facility Dmitrii Banshchikov
@ 2021-05-19 17:32   ` Song Liu
  2021-05-20  7:08     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-19 17:32 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: bpf, Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Networking, Andrey Ignatov

On Tue, May 18, 2021 at 11:05 PM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>
> There are three logging levels for messages: FATAL, NOTICE and DEBUG.
> When a message is logged with FATAL level it results in bpfilter
> usermode helper termination.

Could you please explain why we choose to have 3 levels? Will we need
more levels,
like WARNING, ERROR, etc.?

>
> Introduce struct context to avoid use of global objects and store there
> the logging parameters: log level and log sink.
>
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> ---
>  net/bpfilter/Makefile  |  2 +-
>  net/bpfilter/bflog.c   | 29 +++++++++++++++++++++++++++++
>  net/bpfilter/bflog.h   | 24 ++++++++++++++++++++++++
>  net/bpfilter/context.h | 16 ++++++++++++++++

Maybe combine bflog.h and context.h into one file? And bflog() can
probably fit in
that file too.

Thanks,
Song

>  4 files changed, 70 insertions(+), 1 deletion(-)
>  create mode 100644 net/bpfilter/bflog.c
>  create mode 100644 net/bpfilter/bflog.h
>  create mode 100644 net/bpfilter/context.h
>
> diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
> index cdac82b8c53a..874d5ef6237d 100644
> --- a/net/bpfilter/Makefile
> +++ b/net/bpfilter/Makefile
> @@ -4,7 +4,7 @@
>  #
>
>  userprogs := bpfilter_umh
> -bpfilter_umh-objs := main.o
> +bpfilter_umh-objs := main.o bflog.o
>  userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
>
>  ifeq ($(CONFIG_BPFILTER_UMH), y)
> diff --git a/net/bpfilter/bflog.c b/net/bpfilter/bflog.c
> new file mode 100644
> index 000000000000..2752e39060e4
> --- /dev/null
> +++ b/net/bpfilter/bflog.c
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include "bflog.h"
> +
> +#include <stdarg.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +
> +#include "context.h"
> +
> +void bflog(struct context *ctx, int level, const char *fmt, ...)
> +{
> +       if (ctx->log_file &&
> +           (level == BFLOG_LEVEL_FATAL || (level & ctx->log_level))) {
> +               va_list va;
> +
> +               va_start(va, fmt);
> +               vfprintf(ctx->log_file, fmt, va);
> +               va_end(va);
> +       }
> +
> +       if (level == BFLOG_LEVEL_FATAL)
> +               exit(EXIT_FAILURE);
> +}
> diff --git a/net/bpfilter/bflog.h b/net/bpfilter/bflog.h
> new file mode 100644
> index 000000000000..4ed12791cfa1
> --- /dev/null
> +++ b/net/bpfilter/bflog.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_BFLOG_H
> +#define NET_BPFILTER_BFLOG_H
> +
> +struct context;
> +
> +#define BFLOG_IMPL(ctx, level, fmt, ...) bflog(ctx, level, "bpfilter: " fmt, ##__VA_ARGS__)
> +
> +#define BFLOG_LEVEL_FATAL (0)
> +#define BFLOG_LEVEL_NOTICE (1)
> +#define BFLOG_LEVEL_DEBUG (2)
> +
> +#define BFLOG_FATAL(ctx, fmt, ...)                                                                 \
> +       BFLOG_IMPL(ctx, BFLOG_LEVEL_FATAL, "fatal error: " fmt, ##__VA_ARGS__)
> +#define BFLOG_NOTICE(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_NOTICE, fmt, ##__VA_ARGS__)
> +#define BFLOG_DEBUG(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_DEBUG, fmt, ##__VA_ARGS__)
> +
> +void bflog(struct context *ctx, int level, const char *fmt, ...);
> +
> +#endif // NET_BPFILTER_BFLOG_H
> diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
> new file mode 100644
> index 000000000000..e85c97c3d010
> --- /dev/null
> +++ b/net/bpfilter/context.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_CONTEXT_H
> +#define NET_BPFILTER_CONTEXT_H
> +
> +#include <stdio.h>
> +
> +struct context {
> +       FILE *log_file;
> +       int log_level;
> +};
> +
> +#endif // NET_BPFILTER_CONTEXT_H
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 03/11] bpfilter: Add IO functions
  2021-05-17 22:53 ` [PATCH bpf-next 03/11] bpfilter: Add IO functions Dmitrii Banshchikov
@ 2021-05-19 18:47   ` Song Liu
  0 siblings, 0 replies; 28+ messages in thread
From: Song Liu @ 2021-05-19 18:47 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: bpf, Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Networking, Andrey Ignatov

On Tue, May 18, 2021 at 11:06 PM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>
> Introduce IO functions for:
> 1) reading and writing data from a descriptor: read_exact(), write_exact(),
> 2) reading and writing memory of other processes: pvm_read(), pvm_write().
>
> read_exact() and write_exact() are wrappers over read(2)/write(2) with
> correct handling of partial read/write. These functions are intended to
> be used for communication over pipe with the kernel part of bpfilter.
>
> pvm_read() and pvm_write() are wrappers over
> process_vm_readv(2)/process_vm_writev(2) with an interface that uses a
> single buffer instead of vectored form. These functions are intended to
> be used for readining/writing memory buffers supplied to iptables ABI
> setsockopt(2) from other processes.
>
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>

The code looks correct, so

Acked-by: Song Liu <songliubraving@fb.com>

However, I am not sure whether we really want these wrapper functions.

[...]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 06/11] bpfilter: Add struct match
  2021-05-17 22:53 ` [PATCH bpf-next 06/11] bpfilter: Add struct match Dmitrii Banshchikov
@ 2021-05-20  4:26   ` Song Liu
  2021-05-20  7:31     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-20  4:26 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin Lau, Yonghong Song, John Fastabend,
	KP Singh, open list:BPF (Safe dynamic programs and tools),
	Andrey Ignatov



> On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> struct match_ops defines polymorphic interface for matches. A match
> consists of pointers to struct match_ops and struct xt_entry_match which
> contains a payload for the match's type.
> 
> All match_ops are kept in map match_ops_map by their name.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> 
[...]

> diff --git a/net/bpfilter/match-ops-map.h b/net/bpfilter/match-ops-map.h
> new file mode 100644
> index 000000000000..0ff57f2d8da8
> --- /dev/null
> +++ b/net/bpfilter/match-ops-map.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_MATCH_OPS_MAP_H
> +#define NET_BPFILTER_MATCH_OPS_MAP_H
> +
> +#include "map-common.h"
> +
> +#include <linux/err.h>
> +
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "match.h"
> +
> +struct match_ops_map {
> +	struct hsearch_data index;
> +};

Do we plan to extend match_ops_map? Otherwise, we can just use 
hsearch_data in struct context. 

> +
> +static inline int create_match_ops_map(struct match_ops_map *map, size_t nelem)
> +{
> +	return create_map(&map->index, nelem);
> +}
> +
> +static inline const struct match_ops *match_ops_map_find(struct match_ops_map *map,
> +							 const char *name)
> +{
> +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
> +
> +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
> +		return map_find(&map->index, name);
> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static inline int match_ops_map_insert(struct match_ops_map *map, const struct match_ops *match_ops)
> +{
> +	return map_insert(&map->index, match_ops->name, (void *)match_ops);
> +}
> +
> +static inline void free_match_ops_map(struct match_ops_map *map)
> +{
> +	free_map(&map->index);
> +}
> +
> +#endif // NET_BPFILTER_MATCT_OPS_MAP_H
> diff --git a/net/bpfilter/match.c b/net/bpfilter/match.c
> new file mode 100644
> index 000000000000..aeca1b93cd2d
> --- /dev/null
> +++ b/net/bpfilter/match.c
> @@ -0,0 +1,73 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include "match.h"
> +
> +#include <linux/err.h>
> +#include <linux/netfilter/xt_tcpudp.h>

Besides xt_ filters, do we plan to support others? If so, we probably 
want separate files for each of them. 

> +
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "bflog.h"
> +#include "context.h"
> +#include "match-ops-map.h"
> +
> +#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
> +#define MATCH_SIZE(type) (sizeof(struct bpfilter_ipt_match) + BPFILTER_ALIGN(sizeof(type)))
> +
> +static int udp_match_check(struct context *ctx, const struct bpfilter_ipt_match *ipt_match)
> +{
> +	const struct xt_udp *udp;
> +
> +	udp = (const struct xt_udp *)&ipt_match->data;
> +
> +	if (udp->invflags & XT_UDP_INV_MASK) {
> +		BFLOG_DEBUG(ctx, "cannot check match 'udp': invalid flags\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +const struct match_ops udp_match_ops = { .name = "udp",

And maybe we should name this one "xt_udp"? 

> +					 .size = MATCH_SIZE(struct xt_udp),
> +					 .revision = 0,
> +					 .check = udp_match_check };
> +
> +int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match, struct match *match)
> +{
> +	const size_t maxlen = sizeof(ipt_match->u.user.name);
> +	const struct match_ops *found;
> +	int err;
> +
> +	if (strnlen(ipt_match->u.user.name, maxlen) == maxlen) {
> +		BFLOG_DEBUG(ctx, "cannot init match: too long match name\n");
> +		return -EINVAL;
> +	}
> +
> +	found = match_ops_map_find(&ctx->match_ops_map, ipt_match->u.user.name);
> +	if (IS_ERR(found)) {
> +		BFLOG_DEBUG(ctx, "cannot find match by name: '%s'\n", ipt_match->u.user.name);
> +		return PTR_ERR(found);
> +	}
> +
> +	if (found->size != ipt_match->u.match_size ||
> +	    found->revision != ipt_match->u.user.revision) {
> +		BFLOG_DEBUG(ctx, "invalid match: '%s'\n", ipt_match->u.user.name);
> +		return -EINVAL;
> +	}
> +
> +	err = found->check(ctx, ipt_match);
> +	if (err)
> +		return err;
> +
> +	match->match_ops = found;
> +	match->ipt_match = ipt_match;
> +
> +	return 0;
> +}
> diff --git a/net/bpfilter/match.h b/net/bpfilter/match.h
> new file mode 100644
> index 000000000000..79b7c87016d4
> --- /dev/null
> +++ b/net/bpfilter/match.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_MATCH_H
> +#define NET_BPFILTER_MATCH_H
> +
> +#include "../../include/uapi/linux/bpfilter.h"
> +
> +#include <stdint.h>
> +
> +struct bpfilter_ipt_match;
> +struct context;
> +struct match_ops_map;
> +
> +struct match_ops {
> +	char name[BPFILTER_EXTENSION_MAXNAMELEN];

BPFILTER_EXTENSION_MAXNAMELEN is 29, so "size" below is mis-aligned. I guess
we can swap size and revision. 

> +	uint16_t size;
> +	uint8_t revision;
> +	int (*check)(struct context *ctx, const struct bpfilter_ipt_match *ipt_match);
> +};
> +
> +extern const struct match_ops udp_match_ops;
> +
> +struct match {
> +	const struct match_ops *match_ops;
> +	const struct bpfilter_ipt_match *ipt_match;
> +};

[...]


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 07/11] bpfilter: Add struct target
  2021-05-17 22:53 ` [PATCH bpf-next 07/11] bpfilter: Add struct target Dmitrii Banshchikov
@ 2021-05-20  4:36   ` Song Liu
  2021-05-20  7:44     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-20  4:36 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: open list:BPF (Safe dynamic programs and tools),
	ast, davem, daniel, andrii, Martin Lau, Yonghong Song,
	john.fastabend, kpsingh, netdev, Andrey Ignatov



> On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> struct target_ops defines polymorphic interface for targets. A target
> consists of pointers to struct target_ops and struct xt_entry_target
> which contains a payload for the target's type.
> 
> All target_ops are kept in map target_ops_map by their name.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> 

[...]

> index 000000000000..6b65241328da
> --- /dev/null
> +++ b/net/bpfilter/target-ops-map.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_TARGET_OPS_MAP_H
> +#define NET_BPFILTER_TARGET_OPS_MAP_H
> +
> +#include "map-common.h"
> +
> +#include <linux/err.h>
> +
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "target.h"
> +
> +struct target_ops_map {
> +	struct hsearch_data index;
> +};

Similar to 06/11, target_ops_map seems unnecessary. Also, do we need to 
support non-xt targets? 

> +
> +static inline int create_target_ops_map(struct target_ops_map *map, size_t nelem)
> +{
> +	return create_map(&map->index, nelem);
> +}
> +
> +static inline const struct target_ops *target_ops_map_find(struct target_ops_map *map,
> +							   const char *name)
> +{
> +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
> +
> +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
> +		return map_find(&map->index, name);
> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static inline int target_ops_map_insert(struct target_ops_map *map,
> +					const struct target_ops *target_ops)
> +{
> +	return map_insert(&map->index, target_ops->name, (void *)target_ops);
> +}
> +
> +static inline void free_target_ops_map(struct target_ops_map *map)
> +{
> +	free_map(&map->index);
> +}
> +
> +#endif // NET_BPFILTER_TARGET_OPS_MAP_H
> diff --git a/net/bpfilter/target.c b/net/bpfilter/target.c
> new file mode 100644
> index 000000000000..a18fe477f93c
> --- /dev/null
> +++ b/net/bpfilter/target.c
> @@ -0,0 +1,112 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include "target.h"
> +
> +#include <linux/err.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "bflog.h"
> +#include "context.h"
> +#include "target-ops-map.h"
> +

Please add some comments about convert_verdict. 


> +static int convert_verdict(int verdict)
> +{
> +	return -verdict - 1;
> +}
> +
> +static int standard_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> +{
> +	const struct bpfilter_ipt_standard_target *standard_target;
> +
> +	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
> +
> +	if (standard_target->verdict > 0)
> +		return 0;
> +
> +	if (standard_target->verdict < 0) {
> +		if (standard_target->verdict == BPFILTER_RETURN)
> +			return 0;
> +
> +		switch (convert_verdict(standard_target->verdict)) {
> +		case BPFILTER_NF_ACCEPT:
> +		case BPFILTER_NF_DROP:
> +		case BPFILTER_NF_QUEUE:
> +			return 0;
> +		}
> +	}
> +
> +	BFLOG_DEBUG(ctx, "invalid verdict: %d\n", standard_target->verdict);
> +
> +	return -EINVAL;
> +}
> +
> +const struct target_ops standard_target_ops = {
> +	.name = "",
> +	.revision = 0,
> +	.size = sizeof(struct xt_standard_target),
> +	.check = standard_target_check,
> +};
> +
> +static int error_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> +{
> +	const struct bpfilter_ipt_error_target *error_target;
> +	size_t maxlen;
> +
> +	error_target = (const struct bpfilter_ipt_error_target *)&ipt_target;
> +	maxlen = sizeof(error_target->error_name);
> +	if (strnlen(error_target->error_name, maxlen) == maxlen) {
> +		BFLOG_DEBUG(ctx, "cannot check error target: too long errorname\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +const struct target_ops error_target_ops = {
> +	.name = "ERROR",
> +	.revision = 0,
> +	.size = sizeof(struct xt_error_target),
> +	.check = error_target_check,
> +};
> +
> +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> +		struct target *target)
> +{
> +	const size_t maxlen = sizeof(ipt_target->u.user.name);
> +	const struct target_ops *found;
> +	int err;
> +
> +	if (strnlen(ipt_target->u.user.name, maxlen) == maxlen) {
> +		BFLOG_DEBUG(ctx, "cannot init target: too long target name\n");
> +		return -EINVAL;
> +	}
> +
> +	found = target_ops_map_find(&ctx->target_ops_map, ipt_target->u.user.name);
> +	if (IS_ERR(found)) {
> +		BFLOG_DEBUG(ctx, "cannot find target by name: '%s'\n", ipt_target->u.user.name);
> +		return PTR_ERR(found);
> +	}
> +
> +	if (found->size != ipt_target->u.target_size ||
> +	    found->revision != ipt_target->u.user.revision) {
> +		BFLOG_DEBUG(ctx, "invalid target: '%s'\n", ipt_target->u.user.name);
> +		return -EINVAL;
> +	}
> +
> +	err = found->check(ctx, ipt_target);
> +	if (err)
> +		return err;
> +
> +	target->target_ops = found;
> +	target->ipt_target = ipt_target;
> +
> +	return 0;
> +}
> diff --git a/net/bpfilter/target.h b/net/bpfilter/target.h
> new file mode 100644
> index 000000000000..5d9c4c459c05
> --- /dev/null
> +++ b/net/bpfilter/target.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_TARGET_H
> +#define NET_BPFILTER_TARGET_H
> +
> +#include "../../include/uapi/linux/bpfilter.h"
> +
> +#include <stdint.h>
> +
> +struct context;
> +struct target_ops_map;
> +
> +struct target_ops {
> +	char name[BPFILTER_EXTENSION_MAXNAMELEN];
> +	uint16_t size;

Mis-aligned "size". 

> +	uint8_t revision;
> +	int (*check)(struct context *ctx, const struct bpfilter_ipt_target *ipt_target);
> +};
> +
> +struct target {
> +	const struct target_ops *target_ops;
> +	const struct bpfilter_ipt_target *ipt_target;
> +};
> +
> +extern const struct target_ops standard_target_ops;
> +extern const struct target_ops error_target_ops;
> +
> +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> +		struct target *target);
> +
> +#endif // NET_BPFILTER_TARGET_H
> diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
> index e5073231f811..1856d0515f49 100644
> --- a/tools/testing/selftests/bpf/bpfilter/.gitignore
> +++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
> @@ -2,3 +2,4 @@
> test_io
> test_map
> test_match
> +test_target
> diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
> index 362c9a28b88d..78da74b9ee68 100644
> --- a/tools/testing/selftests/bpf/bpfilter/Makefile
> +++ b/tools/testing/selftests/bpf/bpfilter/Makefile
> @@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
> TEST_GEN_PROGS += test_io
> TEST_GEN_PROGS += test_map
> TEST_GEN_PROGS += test_match
> +TEST_GEN_PROGS += test_target
> 
> KSFT_KHDR_INSTALL := 1
> 
> @@ -19,4 +20,6 @@ include ../../lib.mk
> $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
> $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
> $(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/map-common.c \
> -	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c
> +	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c
> +$(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c \
> +	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c
> diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> new file mode 100644
> index 000000000000..d82ff86f280e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef BPFILTER_UTIL_H
> +#define BPFILTER_UTIL_H
> +
> +#include <linux/bpfilter.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +#include <stdio.h>
> +
> +static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
> +					int verdict)
> +{
> +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> +		 BPFILTER_STANDARD_TARGET);
> +	ipt_target->target.u.user.revision = revision;
> +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> +	ipt_target->verdict = verdict;
> +}
> +
> +static inline void init_error_target(struct xt_error_target *ipt_target, int revision,
> +				     const char *error_name)
> +{
> +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> +		 BPFILTER_ERROR_TARGET);
> +	ipt_target->target.u.user.revision = revision;
> +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> +	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
> +}
> +
> +#endif // BPFILTER_UTIL_H
> 
[...]
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 00/11] bpfilter
  2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
                   ` (10 preceding siblings ...)
  2021-05-17 22:53 ` [PATCH bpf-next 11/11] bpfilter: Handle setsockopts Dmitrii Banshchikov
@ 2021-05-20  4:54 ` Song Liu
  2021-05-20  7:53   ` Dmitrii Banshchikov
  11 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-20  4:54 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, davem, daniel, andrii, Martin Lau,
	Yonghong Song, john.fastabend, kpsingh, netdev, Andrey Ignatov



> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> The patchset is based on the patches from David S. Miller [1] and Daniel
> Borkmann [2].
> 
> The main goal of the patchset is to prepare bpfilter for iptables'
> configuration blob parsing and code generation.
> 
> The patchset introduces data structures and code for matches, targets, rules
> and tables.
> 
> It seems inconvenient to continue to use the same blob internally in bpfilter
> in parts other than the blob parsing. That is why a superstructure with native
> types is introduced. It provides a more convenient way to iterate over the blob
> and limit the crazy structs widespread in the bpfilter code.
> 

[...]

> 
> 
> 1. https://lore.kernel.org/patchwork/patch/902785/

[1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
I think we should do the same in this version. (Or were there discussions on
removing the prefix?). 

Thanks,
Song

[...]


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 02/11] bpfilter: Add logging facility
  2021-05-19 17:32   ` Song Liu
@ 2021-05-20  7:08     ` Dmitrii Banshchikov
  2021-05-20 16:35       ` Song Liu
  0 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-20  7:08 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Networking, Andrey Ignatov

On Wed, May 19, 2021 at 10:32:25AM -0700, Song Liu wrote:
> On Tue, May 18, 2021 at 11:05 PM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >
> > There are three logging levels for messages: FATAL, NOTICE and DEBUG.
> > When a message is logged with FATAL level it results in bpfilter
> > usermode helper termination.
> 
> Could you please explain why we choose to have 3 levels? Will we need
> more levels,
> like WARNING, ERROR, etc.?


I found that I need one level for development - to trace what
goes rignt and wrong. At the same time as those messages go to
dmesg this level is too verbose to be used under normal
circumstances. That is why another level is introduced. And the
last one exists to verify invariants or error condintions from
which there is no right way to recover and they result in
bpfilter termination.

Probably we may have just two levels - DEBUG and NOTICE and some
analogue of BUG_ON/WARN_ON/runtime assert that results in a
message on NOTICE level and program termination if the checked
condition is false.

I don't think that we will need more levels - until we decide to
utilize syslog facility. Even in that case I don't know how to
differntiate between e.g. NOTICE and INFO messages.

> 
> >
> > Introduce struct context to avoid use of global objects and store there
> > the logging parameters: log level and log sink.
> >
> > Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> > ---
> >  net/bpfilter/Makefile  |  2 +-
> >  net/bpfilter/bflog.c   | 29 +++++++++++++++++++++++++++++
> >  net/bpfilter/bflog.h   | 24 ++++++++++++++++++++++++
> >  net/bpfilter/context.h | 16 ++++++++++++++++
> 
> Maybe combine bflog.h and context.h into one file? And bflog() can
> probably fit in
> that file too.


Sure.

> 
> Thanks,
> Song
> 
> >  4 files changed, 70 insertions(+), 1 deletion(-)
> >  create mode 100644 net/bpfilter/bflog.c
> >  create mode 100644 net/bpfilter/bflog.h
> >  create mode 100644 net/bpfilter/context.h
> >
> > diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
> > index cdac82b8c53a..874d5ef6237d 100644
> > --- a/net/bpfilter/Makefile
> > +++ b/net/bpfilter/Makefile
> > @@ -4,7 +4,7 @@
> >  #
> >
> >  userprogs := bpfilter_umh
> > -bpfilter_umh-objs := main.o
> > +bpfilter_umh-objs := main.o bflog.o
> >  userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
> >
> >  ifeq ($(CONFIG_BPFILTER_UMH), y)
> > diff --git a/net/bpfilter/bflog.c b/net/bpfilter/bflog.c
> > new file mode 100644
> > index 000000000000..2752e39060e4
> > --- /dev/null
> > +++ b/net/bpfilter/bflog.c
> > @@ -0,0 +1,29 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include "bflog.h"
> > +
> > +#include <stdarg.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +
> > +#include "context.h"
> > +
> > +void bflog(struct context *ctx, int level, const char *fmt, ...)
> > +{
> > +       if (ctx->log_file &&
> > +           (level == BFLOG_LEVEL_FATAL || (level & ctx->log_level))) {
> > +               va_list va;
> > +
> > +               va_start(va, fmt);
> > +               vfprintf(ctx->log_file, fmt, va);
> > +               va_end(va);
> > +       }
> > +
> > +       if (level == BFLOG_LEVEL_FATAL)
> > +               exit(EXIT_FAILURE);
> > +}
> > diff --git a/net/bpfilter/bflog.h b/net/bpfilter/bflog.h
> > new file mode 100644
> > index 000000000000..4ed12791cfa1
> > --- /dev/null
> > +++ b/net/bpfilter/bflog.h
> > @@ -0,0 +1,24 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_BFLOG_H
> > +#define NET_BPFILTER_BFLOG_H
> > +
> > +struct context;
> > +
> > +#define BFLOG_IMPL(ctx, level, fmt, ...) bflog(ctx, level, "bpfilter: " fmt, ##__VA_ARGS__)
> > +
> > +#define BFLOG_LEVEL_FATAL (0)
> > +#define BFLOG_LEVEL_NOTICE (1)
> > +#define BFLOG_LEVEL_DEBUG (2)
> > +
> > +#define BFLOG_FATAL(ctx, fmt, ...)                                                                 \
> > +       BFLOG_IMPL(ctx, BFLOG_LEVEL_FATAL, "fatal error: " fmt, ##__VA_ARGS__)
> > +#define BFLOG_NOTICE(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_NOTICE, fmt, ##__VA_ARGS__)
> > +#define BFLOG_DEBUG(ctx, fmt, ...) BFLOG_IMPL(ctx, BFLOG_LEVEL_DEBUG, fmt, ##__VA_ARGS__)
> > +
> > +void bflog(struct context *ctx, int level, const char *fmt, ...);
> > +
> > +#endif // NET_BPFILTER_BFLOG_H
> > diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
> > new file mode 100644
> > index 000000000000..e85c97c3d010
> > --- /dev/null
> > +++ b/net/bpfilter/context.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_CONTEXT_H
> > +#define NET_BPFILTER_CONTEXT_H
> > +
> > +#include <stdio.h>
> > +
> > +struct context {
> > +       FILE *log_file;
> > +       int log_level;
> > +};
> > +
> > +#endif // NET_BPFILTER_CONTEXT_H
> > --
> > 2.25.1
> >

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 06/11] bpfilter: Add struct match
  2021-05-20  4:26   ` Song Liu
@ 2021-05-20  7:31     ` Dmitrii Banshchikov
  2021-05-20 17:44       ` Song Liu
  0 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-20  7:31 UTC (permalink / raw)
  To: Song Liu
  Cc: open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin Lau, Yonghong Song, John Fastabend,
	KP Singh, open list:BPF (Safe dynamic programs and tools),
	Andrey Ignatov

On Thu, May 20, 2021 at 04:26:28AM +0000, Song Liu wrote:
> 
> 
> > On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > 
> > struct match_ops defines polymorphic interface for matches. A match
> > consists of pointers to struct match_ops and struct xt_entry_match which
> > contains a payload for the match's type.
> > 
> > All match_ops are kept in map match_ops_map by their name.
> > 
> > Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> > 
> [...]
> 
> > diff --git a/net/bpfilter/match-ops-map.h b/net/bpfilter/match-ops-map.h
> > new file mode 100644
> > index 000000000000..0ff57f2d8da8
> > --- /dev/null
> > +++ b/net/bpfilter/match-ops-map.h
> > @@ -0,0 +1,48 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_MATCH_OPS_MAP_H
> > +#define NET_BPFILTER_MATCH_OPS_MAP_H
> > +
> > +#include "map-common.h"
> > +
> > +#include <linux/err.h>
> > +
> > +#include <errno.h>
> > +#include <string.h>
> > +
> > +#include "match.h"
> > +
> > +struct match_ops_map {
> > +	struct hsearch_data index;
> > +};
> 
> Do we plan to extend match_ops_map? Otherwise, we can just use 
> hsearch_data in struct context. 

Agreed.

> 
> > +
> > +static inline int create_match_ops_map(struct match_ops_map *map, size_t nelem)
> > +{
> > +	return create_map(&map->index, nelem);
> > +}
> > +
> > +static inline const struct match_ops *match_ops_map_find(struct match_ops_map *map,
> > +							 const char *name)
> > +{
> > +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
> > +
> > +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
> > +		return map_find(&map->index, name);
> > +
> > +	return ERR_PTR(-EINVAL);
> > +}
> > +
> > +static inline int match_ops_map_insert(struct match_ops_map *map, const struct match_ops *match_ops)
> > +{
> > +	return map_insert(&map->index, match_ops->name, (void *)match_ops);
> > +}
> > +
> > +static inline void free_match_ops_map(struct match_ops_map *map)
> > +{
> > +	free_map(&map->index);
> > +}
> > +
> > +#endif // NET_BPFILTER_MATCT_OPS_MAP_H
> > diff --git a/net/bpfilter/match.c b/net/bpfilter/match.c
> > new file mode 100644
> > index 000000000000..aeca1b93cd2d
> > --- /dev/null
> > +++ b/net/bpfilter/match.c
> > @@ -0,0 +1,73 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include "match.h"
> > +
> > +#include <linux/err.h>
> > +#include <linux/netfilter/xt_tcpudp.h>
> 
> Besides xt_ filters, do we plan to support others? If so, we probably 
> want separate files for each of them. 

Do you mean nft filters?
They use nfilter API and currently we cannot hook into it - so
probably eventually.


> 
> > +
> > +#include <errno.h>
> > +#include <string.h>
> > +
> > +#include "bflog.h"
> > +#include "context.h"
> > +#include "match-ops-map.h"
> > +
> > +#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
> > +#define MATCH_SIZE(type) (sizeof(struct bpfilter_ipt_match) + BPFILTER_ALIGN(sizeof(type)))
> > +
> > +static int udp_match_check(struct context *ctx, const struct bpfilter_ipt_match *ipt_match)
> > +{
> > +	const struct xt_udp *udp;
> > +
> > +	udp = (const struct xt_udp *)&ipt_match->data;
> > +
> > +	if (udp->invflags & XT_UDP_INV_MASK) {
> > +		BFLOG_DEBUG(ctx, "cannot check match 'udp': invalid flags\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +const struct match_ops udp_match_ops = { .name = "udp",
> 
> And maybe we should name this one "xt_udp"? 

Agreed.


> 
> > +					 .size = MATCH_SIZE(struct xt_udp),
> > +					 .revision = 0,
> > +					 .check = udp_match_check };
> > +
> > +int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match, struct match *match)
> > +{
> > +	const size_t maxlen = sizeof(ipt_match->u.user.name);
> > +	const struct match_ops *found;
> > +	int err;
> > +
> > +	if (strnlen(ipt_match->u.user.name, maxlen) == maxlen) {
> > +		BFLOG_DEBUG(ctx, "cannot init match: too long match name\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	found = match_ops_map_find(&ctx->match_ops_map, ipt_match->u.user.name);
> > +	if (IS_ERR(found)) {
> > +		BFLOG_DEBUG(ctx, "cannot find match by name: '%s'\n", ipt_match->u.user.name);
> > +		return PTR_ERR(found);
> > +	}
> > +
> > +	if (found->size != ipt_match->u.match_size ||
> > +	    found->revision != ipt_match->u.user.revision) {
> > +		BFLOG_DEBUG(ctx, "invalid match: '%s'\n", ipt_match->u.user.name);
> > +		return -EINVAL;
> > +	}
> > +
> > +	err = found->check(ctx, ipt_match);
> > +	if (err)
> > +		return err;
> > +
> > +	match->match_ops = found;
> > +	match->ipt_match = ipt_match;
> > +
> > +	return 0;
> > +}
> > diff --git a/net/bpfilter/match.h b/net/bpfilter/match.h
> > new file mode 100644
> > index 000000000000..79b7c87016d4
> > --- /dev/null
> > +++ b/net/bpfilter/match.h
> > @@ -0,0 +1,34 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_MATCH_H
> > +#define NET_BPFILTER_MATCH_H
> > +
> > +#include "../../include/uapi/linux/bpfilter.h"
> > +
> > +#include <stdint.h>
> > +
> > +struct bpfilter_ipt_match;
> > +struct context;
> > +struct match_ops_map;
> > +
> > +struct match_ops {
> > +	char name[BPFILTER_EXTENSION_MAXNAMELEN];
> 
> BPFILTER_EXTENSION_MAXNAMELEN is 29, so "size" below is mis-aligned. I guess
> we can swap size and revision. 

Agreed.

> 
> > +	uint16_t size;
> > +	uint8_t revision;
> > +	int (*check)(struct context *ctx, const struct bpfilter_ipt_match *ipt_match);
> > +};
> > +
> > +extern const struct match_ops udp_match_ops;
> > +
> > +struct match {
> > +	const struct match_ops *match_ops;
> > +	const struct bpfilter_ipt_match *ipt_match;
> > +};
> 
> [...]
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 07/11] bpfilter: Add struct target
  2021-05-20  4:36   ` Song Liu
@ 2021-05-20  7:44     ` Dmitrii Banshchikov
  0 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-20  7:44 UTC (permalink / raw)
  To: Song Liu
  Cc: open list:BPF (Safe dynamic programs and tools),
	ast, davem, daniel, andrii, Martin Lau, Yonghong Song,
	john.fastabend, kpsingh, netdev, Andrey Ignatov

On Thu, May 20, 2021 at 04:36:46AM +0000, Song Liu wrote:
> 
> 
> > On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > 
> > struct target_ops defines polymorphic interface for targets. A target
> > consists of pointers to struct target_ops and struct xt_entry_target
> > which contains a payload for the target's type.
> > 
> > All target_ops are kept in map target_ops_map by their name.
> > 
> > Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> > 
> 
> [...]
> 
> > index 000000000000..6b65241328da
> > --- /dev/null
> > +++ b/net/bpfilter/target-ops-map.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_TARGET_OPS_MAP_H
> > +#define NET_BPFILTER_TARGET_OPS_MAP_H
> > +
> > +#include "map-common.h"
> > +
> > +#include <linux/err.h>
> > +
> > +#include <errno.h>
> > +#include <string.h>
> > +
> > +#include "target.h"
> > +
> > +struct target_ops_map {
> > +	struct hsearch_data index;
> > +};
> 
> Similar to 06/11, target_ops_map seems unnecessary. Also, do we need to 
> support non-xt targets? 

As with nft matches - probably eventually but not now.

> 
> > +
> > +static inline int create_target_ops_map(struct target_ops_map *map, size_t nelem)
> > +{
> > +	return create_map(&map->index, nelem);
> > +}
> > +
> > +static inline const struct target_ops *target_ops_map_find(struct target_ops_map *map,
> > +							   const char *name)
> > +{
> > +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
> > +
> > +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
> > +		return map_find(&map->index, name);
> > +
> > +	return ERR_PTR(-EINVAL);
> > +}
> > +
> > +static inline int target_ops_map_insert(struct target_ops_map *map,
> > +					const struct target_ops *target_ops)
> > +{
> > +	return map_insert(&map->index, target_ops->name, (void *)target_ops);
> > +}
> > +
> > +static inline void free_target_ops_map(struct target_ops_map *map)
> > +{
> > +	free_map(&map->index);
> > +}
> > +
> > +#endif // NET_BPFILTER_TARGET_OPS_MAP_H
> > diff --git a/net/bpfilter/target.c b/net/bpfilter/target.c
> > new file mode 100644
> > index 000000000000..a18fe477f93c
> > --- /dev/null
> > +++ b/net/bpfilter/target.c
> > @@ -0,0 +1,112 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include "target.h"
> > +
> > +#include <linux/err.h>
> > +#include <linux/netfilter/x_tables.h>
> > +
> > +#include <errno.h>
> > +#include <string.h>
> > +
> > +#include "bflog.h"
> > +#include "context.h"
> > +#include "target-ops-map.h"
> > +
> 
> Please add some comments about convert_verdict. 
> 
> 
> > +static int convert_verdict(int verdict)
> > +{
> > +	return -verdict - 1;
> > +}
> > +
> > +static int standard_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> > +{
> > +	const struct bpfilter_ipt_standard_target *standard_target;
> > +
> > +	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
> > +
> > +	if (standard_target->verdict > 0)
> > +		return 0;
> > +
> > +	if (standard_target->verdict < 0) {
> > +		if (standard_target->verdict == BPFILTER_RETURN)
> > +			return 0;
> > +
> > +		switch (convert_verdict(standard_target->verdict)) {
> > +		case BPFILTER_NF_ACCEPT:
> > +		case BPFILTER_NF_DROP:
> > +		case BPFILTER_NF_QUEUE:
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	BFLOG_DEBUG(ctx, "invalid verdict: %d\n", standard_target->verdict);
> > +
> > +	return -EINVAL;
> > +}
> > +
> > +const struct target_ops standard_target_ops = {
> > +	.name = "",
> > +	.revision = 0,
> > +	.size = sizeof(struct xt_standard_target),
> > +	.check = standard_target_check,
> > +};
> > +
> > +static int error_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> > +{
> > +	const struct bpfilter_ipt_error_target *error_target;
> > +	size_t maxlen;
> > +
> > +	error_target = (const struct bpfilter_ipt_error_target *)&ipt_target;
> > +	maxlen = sizeof(error_target->error_name);
> > +	if (strnlen(error_target->error_name, maxlen) == maxlen) {
> > +		BFLOG_DEBUG(ctx, "cannot check error target: too long errorname\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +const struct target_ops error_target_ops = {
> > +	.name = "ERROR",
> > +	.revision = 0,
> > +	.size = sizeof(struct xt_error_target),
> > +	.check = error_target_check,
> > +};
> > +
> > +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> > +		struct target *target)
> > +{
> > +	const size_t maxlen = sizeof(ipt_target->u.user.name);
> > +	const struct target_ops *found;
> > +	int err;
> > +
> > +	if (strnlen(ipt_target->u.user.name, maxlen) == maxlen) {
> > +		BFLOG_DEBUG(ctx, "cannot init target: too long target name\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	found = target_ops_map_find(&ctx->target_ops_map, ipt_target->u.user.name);
> > +	if (IS_ERR(found)) {
> > +		BFLOG_DEBUG(ctx, "cannot find target by name: '%s'\n", ipt_target->u.user.name);
> > +		return PTR_ERR(found);
> > +	}
> > +
> > +	if (found->size != ipt_target->u.target_size ||
> > +	    found->revision != ipt_target->u.user.revision) {
> > +		BFLOG_DEBUG(ctx, "invalid target: '%s'\n", ipt_target->u.user.name);
> > +		return -EINVAL;
> > +	}
> > +
> > +	err = found->check(ctx, ipt_target);
> > +	if (err)
> > +		return err;
> > +
> > +	target->target_ops = found;
> > +	target->ipt_target = ipt_target;
> > +
> > +	return 0;
> > +}
> > diff --git a/net/bpfilter/target.h b/net/bpfilter/target.h
> > new file mode 100644
> > index 000000000000..5d9c4c459c05
> > --- /dev/null
> > +++ b/net/bpfilter/target.h
> > @@ -0,0 +1,34 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Telegram FZ-LLC
> > + */
> > +
> > +#ifndef NET_BPFILTER_TARGET_H
> > +#define NET_BPFILTER_TARGET_H
> > +
> > +#include "../../include/uapi/linux/bpfilter.h"
> > +
> > +#include <stdint.h>
> > +
> > +struct context;
> > +struct target_ops_map;
> > +
> > +struct target_ops {
> > +	char name[BPFILTER_EXTENSION_MAXNAMELEN];
> > +	uint16_t size;
> 
> Mis-aligned "size". 
> 
> > +	uint8_t revision;
> > +	int (*check)(struct context *ctx, const struct bpfilter_ipt_target *ipt_target);
> > +};
> > +
> > +struct target {
> > +	const struct target_ops *target_ops;
> > +	const struct bpfilter_ipt_target *ipt_target;
> > +};
> > +
> > +extern const struct target_ops standard_target_ops;
> > +extern const struct target_ops error_target_ops;
> > +
> > +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> > +		struct target *target);
> > +
> > +#endif // NET_BPFILTER_TARGET_H
> > diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
> > index e5073231f811..1856d0515f49 100644
> > --- a/tools/testing/selftests/bpf/bpfilter/.gitignore
> > +++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
> > @@ -2,3 +2,4 @@
> > test_io
> > test_map
> > test_match
> > +test_target
> > diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
> > index 362c9a28b88d..78da74b9ee68 100644
> > --- a/tools/testing/selftests/bpf/bpfilter/Makefile
> > +++ b/tools/testing/selftests/bpf/bpfilter/Makefile
> > @@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
> > TEST_GEN_PROGS += test_io
> > TEST_GEN_PROGS += test_map
> > TEST_GEN_PROGS += test_match
> > +TEST_GEN_PROGS += test_target
> > 
> > KSFT_KHDR_INSTALL := 1
> > 
> > @@ -19,4 +20,6 @@ include ../../lib.mk
> > $(OUTPUT)/test_io: test_io.c $(BPFILTERSRCDIR)/io.c
> > $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
> > $(OUTPUT)/test_match: test_match.c $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/map-common.c \
> > -	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c
> > +	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/target.c
> > +$(OUTPUT)/test_target: test_target.c $(BPFILTERSRCDIR)/target.c $(BPFILTERSRCDIR)/map-common.c \
> > +	$(BPFILTERSRCDIR)/context.c $(BPFILTERSRCDIR)/bflog.c $(BPFILTERSRCDIR)/match.c
> > diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> > new file mode 100644
> > index 000000000000..d82ff86f280e
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef BPFILTER_UTIL_H
> > +#define BPFILTER_UTIL_H
> > +
> > +#include <linux/bpfilter.h>
> > +#include <linux/netfilter/x_tables.h>
> > +
> > +#include <stdio.h>
> > +
> > +static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
> > +					int verdict)
> > +{
> > +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> > +		 BPFILTER_STANDARD_TARGET);
> > +	ipt_target->target.u.user.revision = revision;
> > +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> > +	ipt_target->verdict = verdict;
> > +}
> > +
> > +static inline void init_error_target(struct xt_error_target *ipt_target, int revision,
> > +				     const char *error_name)
> > +{
> > +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> > +		 BPFILTER_ERROR_TARGET);
> > +	ipt_target->target.u.user.revision = revision;
> > +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> > +	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
> > +}
> > +
> > +#endif // BPFILTER_UTIL_H
> > 
> [...]
> > 
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 00/11] bpfilter
  2021-05-20  4:54 ` [PATCH bpf-next 00/11] bpfilter Song Liu
@ 2021-05-20  7:53   ` Dmitrii Banshchikov
  2021-05-20 16:55     ` Alexei Starovoitov
  0 siblings, 1 reply; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-20  7:53 UTC (permalink / raw)
  To: Song Liu
  Cc: open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, davem, daniel, andrii, Martin Lau,
	Yonghong Song, john.fastabend, kpsingh, netdev, Andrey Ignatov

On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> 
> 
> > On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > 
> > The patchset is based on the patches from David S. Miller [1] and Daniel
> > Borkmann [2].
> > 
> > The main goal of the patchset is to prepare bpfilter for iptables'
> > configuration blob parsing and code generation.
> > 
> > The patchset introduces data structures and code for matches, targets, rules
> > and tables.
> > 
> > It seems inconvenient to continue to use the same blob internally in bpfilter
> > in parts other than the blob parsing. That is why a superstructure with native
> > types is introduced. It provides a more convenient way to iterate over the blob
> > and limit the crazy structs widespread in the bpfilter code.
> > 
> 
> [...]
> 
> > 
> > 
> > 1. https://lore.kernel.org/patchwork/patch/902785/
> 
> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> I think we should do the same in this version. (Or were there discussions on
> removing the prefix?). 

There were no discussions about it.
As those structs are private to bpfilter I assumed that it is
safe to save some characters.
I will add the prefix to all internal structs in the next
iteration.


> 
> Thanks,
> Song
> 
> [...]
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 02/11] bpfilter: Add logging facility
  2021-05-20  7:08     ` Dmitrii Banshchikov
@ 2021-05-20 16:35       ` Song Liu
  2021-05-21  6:46         ` Dmitrii Banshchikov
  0 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-20 16:35 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: Song Liu, bpf, Alexei Starovoitov, David S . Miller,
	Daniel Borkmann, Andrii Nakryiko, Martin Lau, Yonghong Song,
	John Fastabend, KP Singh, Networking, Andrey Ignatov



> On May 20, 2021, at 12:08 AM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> On Wed, May 19, 2021 at 10:32:25AM -0700, Song Liu wrote:
>> On Tue, May 18, 2021 at 11:05 PM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>>> 
>>> There are three logging levels for messages: FATAL, NOTICE and DEBUG.
>>> When a message is logged with FATAL level it results in bpfilter
>>> usermode helper termination.
>> 
>> Could you please explain why we choose to have 3 levels? Will we need
>> more levels,
>> like WARNING, ERROR, etc.?
> 
> 
> I found that I need one level for development - to trace what
> goes rignt and wrong. At the same time as those messages go to
> dmesg this level is too verbose to be used under normal
> circumstances. That is why another level is introduced. And the
> last one exists to verify invariants or error condintions from
> which there is no right way to recover and they result in
> bpfilter termination.

/dev/kmsg supports specifying priority of the message. Like:

   echo '<4> This message have priority of 4' > /dev/kmsg

Therefore, with proper priority settings, we can have more levels safely.
Does this make sense?

Thanks,
Song

[...]


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 00/11] bpfilter
  2021-05-20  7:53   ` Dmitrii Banshchikov
@ 2021-05-20 16:55     ` Alexei Starovoitov
  2021-05-20 17:56       ` Song Liu
  0 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-05-20 16:55 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: Song Liu, open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, davem, daniel, andrii, Martin Lau,
	Yonghong Song, john.fastabend, kpsingh, netdev, Andrey Ignatov

On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>
> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> >
> >
> > > On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > >
> > > The patchset is based on the patches from David S. Miller [1] and Daniel
> > > Borkmann [2].
> > >
> > > The main goal of the patchset is to prepare bpfilter for iptables'
> > > configuration blob parsing and code generation.
> > >
> > > The patchset introduces data structures and code for matches, targets, rules
> > > and tables.
> > >
> > > It seems inconvenient to continue to use the same blob internally in bpfilter
> > > in parts other than the blob parsing. That is why a superstructure with native
> > > types is introduced. It provides a more convenient way to iterate over the blob
> > > and limit the crazy structs widespread in the bpfilter code.
> > >
> >
> > [...]
> >
> > >
> > >
> > > 1. https://lore.kernel.org/patchwork/patch/902785/
> >
> > [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> > I think we should do the same in this version. (Or were there discussions on
> > removing the prefix?).
>
> There were no discussions about it.
> As those structs are private to bpfilter I assumed that it is
> safe to save some characters.
> I will add the prefix to all internal structs in the next
> iteration.

For internal types it's ok to skip the prefix otherwise it's too verbose.
In libbpf we skip 'bpf_' prefix in such cases.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 06/11] bpfilter: Add struct match
  2021-05-20  7:31     ` Dmitrii Banshchikov
@ 2021-05-20 17:44       ` Song Liu
  0 siblings, 0 replies; 28+ messages in thread
From: Song Liu @ 2021-05-20 17:44 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Andrii Nakryiko, Martin Lau, Yonghong Song, John Fastabend,
	KP Singh, open list:BPF (Safe dynamic programs and tools),
	Andrey Ignatov



> On May 20, 2021, at 12:31 AM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> On Thu, May 20, 2021 at 04:26:28AM +0000, Song Liu wrote:
>> 
>> 
>>> On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>>> 
>>> struct match_ops defines polymorphic interface for matches. A match
>>> consists of pointers to struct match_ops and struct xt_entry_match which
>>> contains a payload for the match's type.
>>> 
>>> All match_ops are kept in map match_ops_map by their name.
>>> 
>>> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
>>> 
>> [...]
>> 
>>> diff --git a/net/bpfilter/match-ops-map.h b/net/bpfilter/match-ops-map.h
>>> new file mode 100644
>>> index 000000000000..0ff57f2d8da8
>>> --- /dev/null
>>> +++ b/net/bpfilter/match-ops-map.h
>>> @@ -0,0 +1,48 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +/*
>>> + * Copyright (c) 2021 Telegram FZ-LLC
>>> + */
>>> +
>>> +#ifndef NET_BPFILTER_MATCH_OPS_MAP_H
>>> +#define NET_BPFILTER_MATCH_OPS_MAP_H
>>> +
>>> +#include "map-common.h"
>>> +
>>> +#include <linux/err.h>
>>> +
>>> +#include <errno.h>
>>> +#include <string.h>
>>> +
>>> +#include "match.h"
>>> +
>>> +struct match_ops_map {
>>> +	struct hsearch_data index;
>>> +};
>> 
>> Do we plan to extend match_ops_map? Otherwise, we can just use 
>> hsearch_data in struct context. 
> 
> Agreed.
> 
>> 
>>> +
>>> +static inline int create_match_ops_map(struct match_ops_map *map, size_t nelem)
>>> +{
>>> +	return create_map(&map->index, nelem);
>>> +}
>>> +
>>> +static inline const struct match_ops *match_ops_map_find(struct match_ops_map *map,
>>> +							 const char *name)
>>> +{
>>> +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
>>> +
>>> +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
>>> +		return map_find(&map->index, name);
>>> +
>>> +	return ERR_PTR(-EINVAL);
>>> +}
>>> +
>>> +static inline int match_ops_map_insert(struct match_ops_map *map, const struct match_ops *match_ops)
>>> +{
>>> +	return map_insert(&map->index, match_ops->name, (void *)match_ops);
>>> +}
>>> +
>>> +static inline void free_match_ops_map(struct match_ops_map *map)
>>> +{
>>> +	free_map(&map->index);
>>> +}
>>> +
>>> +#endif // NET_BPFILTER_MATCT_OPS_MAP_H
>>> diff --git a/net/bpfilter/match.c b/net/bpfilter/match.c
>>> new file mode 100644
>>> index 000000000000..aeca1b93cd2d
>>> --- /dev/null
>>> +++ b/net/bpfilter/match.c
>>> @@ -0,0 +1,73 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Copyright (c) 2021 Telegram FZ-LLC
>>> + */
>>> +
>>> +#define _GNU_SOURCE
>>> +
>>> +#include "match.h"
>>> +
>>> +#include <linux/err.h>
>>> +#include <linux/netfilter/xt_tcpudp.h>
>> 
>> Besides xt_ filters, do we plan to support others? If so, we probably 
>> want separate files for each of them. 
> 
> Do you mean nft filters?
> They use nfilter API and currently we cannot hook into it - so
> probably eventually.
> 

The comment was mostly about how we name variables/marcos. If we plan to 
support more than xt_ matches, we should prefix variables properly. 

Song



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 00/11] bpfilter
  2021-05-20 16:55     ` Alexei Starovoitov
@ 2021-05-20 17:56       ` Song Liu
  2021-05-21  6:00         ` Dmitrii Banshchikov
  0 siblings, 1 reply; 28+ messages in thread
From: Song Liu @ 2021-05-20 17:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Dmitrii Banshchikov,
	open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, davem, daniel, andrii, Martin Lau,
	Yonghong Song, john.fastabend, kpsingh, netdev, Andrey Ignatov



> On May 20, 2021, at 9:55 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>> 
>> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
>>> 
>>> 
>>>> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>>>> 
>>>> The patchset is based on the patches from David S. Miller [1] and Daniel
>>>> Borkmann [2].
>>>> 
>>>> The main goal of the patchset is to prepare bpfilter for iptables'
>>>> configuration blob parsing and code generation.
>>>> 
>>>> The patchset introduces data structures and code for matches, targets, rules
>>>> and tables.
>>>> 
>>>> It seems inconvenient to continue to use the same blob internally in bpfilter
>>>> in parts other than the blob parsing. That is why a superstructure with native
>>>> types is introduced. It provides a more convenient way to iterate over the blob
>>>> and limit the crazy structs widespread in the bpfilter code.
>>>> 
>>> 
>>> [...]
>>> 
>>>> 
>>>> 
>>>> 1. https://lore.kernel.org/patchwork/patch/902785/
>>> 
>>> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
>>> I think we should do the same in this version. (Or were there discussions on
>>> removing the prefix?).
>> 
>> There were no discussions about it.
>> As those structs are private to bpfilter I assumed that it is
>> safe to save some characters.
>> I will add the prefix to all internal structs in the next
>> iteration.
> 
> For internal types it's ok to skip the prefix otherwise it's too verbose.
> In libbpf we skip 'bpf_' prefix in such cases.

Do we have plan to put some of this logic in a library? If that is the case, the 
effort now may save some pain in the future. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 09/11] bpfilter: Add struct table
  2021-05-17 22:53 ` [PATCH bpf-next 09/11] bpfilter: Add struct table Dmitrii Banshchikov
@ 2021-05-20 18:07   ` Song Liu
  0 siblings, 0 replies; 28+ messages in thread
From: Song Liu @ 2021-05-20 18:07 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: open list:BPF (Safe dynamic programs and tools),
	ast, davem, daniel, andrii, Martin Lau, Yonghong Song,
	john.fastabend, kpsingh, netdev, Andrey Ignatov



> On May 17, 2021, at 3:53 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> A table keeps iptables' blob and an array of struct rule for this blob.
> The array of rules provides more convenient way to interact with blob's
> entries.
> 
> All tables are stored in table_ops_map map which is used for lookups.
> Also all tables are linked into a list that is used for freeing them.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>

[...]


> diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
> index c62c1ba4781c..2d9e3fafb0f8 100644
> --- a/net/bpfilter/context.h
> +++ b/net/bpfilter/context.h
> @@ -10,12 +10,15 @@
> 
> #include "match-ops-map.h"
> #include "target-ops-map.h"
> +#include "table-map.h"
> 
> struct context {
> 	FILE *log_file;
> 	int log_level;
> 	struct match_ops_map match_ops_map;
> 	struct target_ops_map target_ops_map;
> +	struct table_map table_map;
> +	struct list_head table_list;

How about we add table_list to struct table_map (and maybe rename it)?
I suspect that will make the code a little cleaner. 

Thanks,
Song

[...]



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 00/11] bpfilter
  2021-05-20 17:56       ` Song Liu
@ 2021-05-21  6:00         ` Dmitrii Banshchikov
  0 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-21  6:00 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov,
	open list:BPF (Safe dynamic programs and tools),
	Alexei Starovoitov, davem, daniel, andrii, Martin Lau,
	Yonghong Song, john.fastabend, kpsingh, netdev, Andrey Ignatov

On Thu, May 20, 2021 at 05:56:30PM +0000, Song Liu wrote:
> 
> 
> > On May 20, 2021, at 9:55 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > 
> > On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >> 
> >> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> >>> 
> >>> 
> >>>> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >>>> 
> >>>> The patchset is based on the patches from David S. Miller [1] and Daniel
> >>>> Borkmann [2].
> >>>> 
> >>>> The main goal of the patchset is to prepare bpfilter for iptables'
> >>>> configuration blob parsing and code generation.
> >>>> 
> >>>> The patchset introduces data structures and code for matches, targets, rules
> >>>> and tables.
> >>>> 
> >>>> It seems inconvenient to continue to use the same blob internally in bpfilter
> >>>> in parts other than the blob parsing. That is why a superstructure with native
> >>>> types is introduced. It provides a more convenient way to iterate over the blob
> >>>> and limit the crazy structs widespread in the bpfilter code.
> >>>> 
> >>> 
> >>> [...]
> >>> 
> >>>> 
> >>>> 
> >>>> 1. https://lore.kernel.org/patchwork/patch/902785/
> >>> 
> >>> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> >>> I think we should do the same in this version. (Or were there discussions on
> >>> removing the prefix?).
> >> 
> >> There were no discussions about it.
> >> As those structs are private to bpfilter I assumed that it is
> >> safe to save some characters.
> >> I will add the prefix to all internal structs in the next
> >> iteration.
> > 
> > For internal types it's ok to skip the prefix otherwise it's too verbose.
> > In libbpf we skip 'bpf_' prefix in such cases.
> 
> Do we have plan to put some of this logic in a library? If that is the case, the 
> effort now may save some pain in the future. 

I cannot imagine a case when we need this logic in a library.
Even if we eventually need it as these definitions are private to
bpfilter - amount of pain should be minimal.

> 
> Thanks,
> Song
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next 02/11] bpfilter: Add logging facility
  2021-05-20 16:35       ` Song Liu
@ 2021-05-21  6:46         ` Dmitrii Banshchikov
  0 siblings, 0 replies; 28+ messages in thread
From: Dmitrii Banshchikov @ 2021-05-21  6:46 UTC (permalink / raw)
  To: Song Liu
  Cc: Song Liu, bpf, Alexei Starovoitov, David S . Miller,
	Daniel Borkmann, Andrii Nakryiko, Martin Lau, Yonghong Song,
	John Fastabend, KP Singh, Networking, Andrey Ignatov

On Thu, May 20, 2021 at 04:35:45PM +0000, Song Liu wrote:
> 
> 
> > On May 20, 2021, at 12:08 AM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > 
> > On Wed, May 19, 2021 at 10:32:25AM -0700, Song Liu wrote:
> >> On Tue, May 18, 2021 at 11:05 PM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >>> 
> >>> There are three logging levels for messages: FATAL, NOTICE and DEBUG.
> >>> When a message is logged with FATAL level it results in bpfilter
> >>> usermode helper termination.
> >> 
> >> Could you please explain why we choose to have 3 levels? Will we need
> >> more levels,
> >> like WARNING, ERROR, etc.?
> > 
> > 
> > I found that I need one level for development - to trace what
> > goes rignt and wrong. At the same time as those messages go to
> > dmesg this level is too verbose to be used under normal
> > circumstances. That is why another level is introduced. And the
> > last one exists to verify invariants or error condintions from
> > which there is no right way to recover and they result in
> > bpfilter termination.
> 
> /dev/kmsg supports specifying priority of the message. Like:
> 
>    echo '<4> This message have priority of 4' > /dev/kmsg
> 
> Therefore, with proper priority settings, we can have more levels safely.
> Does this make sense?

Yes, it makes.
BPFILTER_FATAL should be renamed to BPFILTER_EMERG to match
printk() counterpart. All bpfilter log levels should match
printk() levels. All bpfilter log messages should include log
level. And BPFILTER_DEBUG should be easily turned on/off during
compilation to enable tracing/debug.


> 
> Thanks,
> Song
> 
> [...]
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-05-21  6:47 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 22:52 [PATCH bpf-next 00/11] bpfilter Dmitrii Banshchikov
2021-05-17 22:52 ` [PATCH bpf-next 01/11] bpfilter: Add types for usermode helper Dmitrii Banshchikov
2021-05-17 22:52 ` [PATCH bpf-next 02/11] bpfilter: Add logging facility Dmitrii Banshchikov
2021-05-19 17:32   ` Song Liu
2021-05-20  7:08     ` Dmitrii Banshchikov
2021-05-20 16:35       ` Song Liu
2021-05-21  6:46         ` Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 03/11] bpfilter: Add IO functions Dmitrii Banshchikov
2021-05-19 18:47   ` Song Liu
2021-05-17 22:53 ` [PATCH bpf-next 04/11] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 05/11] bpfilter: Add map container Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 06/11] bpfilter: Add struct match Dmitrii Banshchikov
2021-05-20  4:26   ` Song Liu
2021-05-20  7:31     ` Dmitrii Banshchikov
2021-05-20 17:44       ` Song Liu
2021-05-17 22:53 ` [PATCH bpf-next 07/11] bpfilter: Add struct target Dmitrii Banshchikov
2021-05-20  4:36   ` Song Liu
2021-05-20  7:44     ` Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 08/11] bpfilter: Add struct rule Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 09/11] bpfilter: Add struct table Dmitrii Banshchikov
2021-05-20 18:07   ` Song Liu
2021-05-17 22:53 ` [PATCH bpf-next 10/11] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
2021-05-17 22:53 ` [PATCH bpf-next 11/11] bpfilter: Handle setsockopts Dmitrii Banshchikov
2021-05-20  4:54 ` [PATCH bpf-next 00/11] bpfilter Song Liu
2021-05-20  7:53   ` Dmitrii Banshchikov
2021-05-20 16:55     ` Alexei Starovoitov
2021-05-20 17:56       ` Song Liu
2021-05-21  6:00         ` Dmitrii Banshchikov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.