All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v1 00/10] bpfilter
@ 2021-06-03 10:14 Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 01/10] bpfilter: Add types for usermode helper Dmitrii Banshchikov
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

The patchset is based on the patches from David S. Miller [1] and
Daniel Borkmann [2].

The main goal of the patchset is to prepare bpfilter for
iptables' configuration blob parsing and code generation.

The patchset introduces data structures and code for matches,
targets, rules and tables.

The current version misses handling of counters. Postpone its
implementation until the code generation phase as it's not clear
yet how to better handle them.

Beside that there is no support of net namespaces at all.

In the next iteration basic code generation shall be introduced.

The rough plan for the code generation.

It seems reasonable to assume that the first rules should cover
most of the packet flow.  This is why they are critical from the
performance point of view.  At the same time number of user
defined rules might be pretty large. Also there is a limit on
size and complexity of a BPF program introduced by the verifier.

There are two approaches how to handle iptables' rules in
generated BPF programs.

The first approach is to generate a BPF program that is an
equivalent to a set of rules on a rule by rule basis. This
approach should give the best performance. The drawback is the
limitation from the verifier on size and complexity of BPF
program.

The second approach is to use an internal representation of rules
stored in a BPF map and use bpf_for_each_map_elem() helper to
iterate over them. In this case the helper's callback is a BPF
function that is able to process any valid rule.

Combination of the two approaches should give most of the
benefits - a heuristic should help to select a small subset of
the rules for code generation on a rule by rule basis. All other
rules are cold and it should be possible to store them in an
internal form in a BPF map. The rules will be handled by
bpf_for_each_map_elem().  This should remove the limit on the
number of supported rules.

During development it was useful to use statically linked
sanitizers in bpfilter usermode helper. Also it is possible to
use fuzzers but it's not clear if it is worth adding them to the
test infrastructure - because there are no other fuzzers under
tools/testing/selftests currently.

Patch 1 adds definitions of the used types.
Patch 2 adds logging to bpfilter.
Patch 3 adds bpfilter header to tools
Patch 4 adds an associative map.
Patches 5/6/7/8 add code for matches, targets, rules and table.
Patch 9 handles hooked setsockopt(2) calls.
Patch 10 uses prepared code in main().

Here is an example:
% dmesg  | tail -n 2
[   23.636102] bpfilter: Loaded bpfilter_umh pid 181
[   23.658529] bpfilter: started
% /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
% /usr/sbin/iptables-legacy -A INPUT -p udp --dport 23 -j DROP
% /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:23

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
% /usr/sbin/iptables-legacy -F
% /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
%

v0 -> v1
IO:
  * Use ssize_t in pvm_read, pvm_write for total_bytes
  * Move IO functions into sockopt.c and main.c
Logging:
  * Use LOGLEVEL_EMERG, LOGLEVEL_NOTICE, LOGLEVE_DEBUG
    while logging to /dev/kmsg
  * Prepend log message with <n> where n is log level
  * Conditionally enable BFLOG_DEBUG messages
  * Merge bflog.{h,c} into context.h
Matches:
  * Reorder fields in struct match_ops for tight packing
  * Get rid of struct match_ops_map
  * Rename udp_match_ops to xt_udp
  * Use XT_ALIGN macro
  * Store payload size in match size
  * Move udp match routines into a separate file
Targets:
  * Reorder fields in struct target_ops for tight packing
  * Get rid of struct target_ops_map
  * Add comments for convert_verdict function
Rules:
  * Add validation
Tables:
  * Combine table_map and table_list into table_index
  * Add validation
Sockopts:
  * Handle IPT_SO_GET_REVISION_TARGET

1. https://lore.kernel.org/patchwork/patch/902785/
2. https://lore.kernel.org/patchwork/patch/902783/

Dmitrii Banshchikov (10):
  bpfilter: Add types for usermode helper
  bpfilter: Add logging facility
  tools: Add bpfilter usermode helper header
  bpfilter: Add map container
  bpfilter: Add struct match
  bpfilter: Add struct target
  bpfilter: Add struct rule
  bpfilter: Add struct table
  bpfilter: Add handling of setsockopt() calls
  bpfilter: Handle setsockopts

 .clang-format                                 |   2 +-
 include/uapi/linux/bpfilter.h                 | 155 +++++++
 net/bpfilter/Makefile                         |   3 +-
 net/bpfilter/context.c                        | 181 ++++++++
 net/bpfilter/context.h                        |  46 ++
 net/bpfilter/main.c                           | 123 ++++--
 net/bpfilter/map-common.c                     |  64 +++
 net/bpfilter/map-common.h                     |  19 +
 net/bpfilter/match.c                          |  49 +++
 net/bpfilter/match.h                          |  33 ++
 net/bpfilter/rule.c                           | 163 +++++++
 net/bpfilter/rule.h                           |  32 ++
 net/bpfilter/sockopt.c                        | 409 ++++++++++++++++++
 net/bpfilter/sockopt.h                        |  14 +
 net/bpfilter/table.c                          | 339 +++++++++++++++
 net/bpfilter/table.h                          |  39 ++
 net/bpfilter/target.c                         | 118 +++++
 net/bpfilter/target.h                         |  49 +++
 net/bpfilter/xt_udp.c                         |  33 ++
 tools/include/uapi/linux/bpfilter.h           | 179 ++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   5 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  30 ++
 .../selftests/bpf/bpfilter/bpfilter_util.h    |  39 ++
 .../testing/selftests/bpf/bpfilter/test_map.c |  63 +++
 .../selftests/bpf/bpfilter/test_match.c       |  63 +++
 .../selftests/bpf/bpfilter/test_rule.c        |  55 +++
 .../selftests/bpf/bpfilter/test_target.c      |  85 ++++
 27 files changed, 2346 insertions(+), 44 deletions(-)
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/context.h
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 net/bpfilter/xt_udp.c
 create mode 100644 tools/include/uapi/linux/bpfilter.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 01/10] bpfilter: Add types for usermode helper
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 02/10] bpfilter: Add logging facility Dmitrii Banshchikov
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Add more definitions that mirror existing iptables' ABI.
These definitions will be used in bpfilter usermode helper.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 include/uapi/linux/bpfilter.h | 155 ++++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)

diff --git a/include/uapi/linux/bpfilter.h b/include/uapi/linux/bpfilter.h
index cbc1f5813f50..e97d95d0ba54 100644
--- a/include/uapi/linux/bpfilter.h
+++ b/include/uapi/linux/bpfilter.h
@@ -3,6 +3,13 @@
 #define _UAPI_LINUX_BPFILTER_H
 
 #include <linux/if.h>
+#include <linux/const.h>
+
+#define BPFILTER_FUNCTION_MAXNAMELEN    30
+#define BPFILTER_EXTENSION_MAXNAMELEN   29
+
+#define BPFILTER_STANDARD_TARGET        ""
+#define BPFILTER_ERROR_TARGET           "ERROR"
 
 enum {
 	BPFILTER_IPT_SO_SET_REPLACE = 64,
@@ -18,4 +25,152 @@ enum {
 	BPFILTER_IPT_GET_MAX,
 };
 
+enum {
+	BPFILTER_XT_TABLE_MAXNAMELEN = 32,
+};
+
+enum {
+	BPFILTER_NF_DROP = 0,
+	BPFILTER_NF_ACCEPT = 1,
+	BPFILTER_NF_STOLEN = 2,
+	BPFILTER_NF_QUEUE = 3,
+	BPFILTER_NF_REPEAT = 4,
+	BPFILTER_NF_STOP = 5,
+	BPFILTER_NF_MAX_VERDICT = BPFILTER_NF_STOP,
+	BPFILTER_RETURN = (-BPFILTER_NF_REPEAT - 1),
+};
+
+enum {
+	BPFILTER_INET_HOOK_PRE_ROUTING = 0,
+	BPFILTER_INET_HOOK_LOCAL_IN = 1,
+	BPFILTER_INET_HOOK_FORWARD = 2,
+	BPFILTER_INET_HOOK_LOCAL_OUT = 3,
+	BPFILTER_INET_HOOK_POST_ROUTING = 4,
+	BPFILTER_INET_HOOK_MAX,
+};
+
+enum {
+	BPFILTER_IPT_F_MASK = 0x03,
+	BPFILTER_IPT_INV_MASK = 0x7f
+};
+
+struct bpfilter_ipt_match {
+	union {
+		struct {
+			__u16 match_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 match_size;
+			void *match;
+		} kernel;
+		__u16 match_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_target {
+	union {
+		struct {
+			__u16 target_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 target_size;
+			void *target;
+		} kernel;
+		__u16 target_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_standard_target {
+	struct bpfilter_ipt_target target;
+	int verdict;
+};
+
+struct bpfilter_ipt_error_target {
+	struct bpfilter_ipt_target target;
+	char error_name[BPFILTER_FUNCTION_MAXNAMELEN];
+};
+
+struct bpfilter_ipt_get_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_entries;
+	__u32 size;
+};
+
+struct bpfilter_ipt_counters {
+	__u64 packet_cnt;
+	__u64 byte_cnt;
+};
+
+struct bpfilter_ipt_counters_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters counters[0];
+};
+
+struct bpfilter_ipt_get_revision {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	__u8 revision;
+};
+
+struct bpfilter_ipt_ip {
+	__u32 src;
+	__u32 dst;
+	__u32 src_mask;
+	__u32 dst_mask;
+	char in_iface[IFNAMSIZ];
+	char out_iface[IFNAMSIZ];
+	__u8 in_iface_mask[IFNAMSIZ];
+	__u8 out_iface_mask[IFNAMSIZ];
+	__u16 protocol;
+	__u8 flags;
+	__u8 invflags;
+};
+
+struct bpfilter_ipt_entry {
+	struct bpfilter_ipt_ip ip;
+	__u32 bfcache;
+	__u16 target_offset;
+	__u16 next_offset;
+	__u32 comefrom;
+	struct bpfilter_ipt_counters counters;
+	__u8 elems[0];
+};
+
+struct bpfilter_ipt_standard_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_standard_target target;
+};
+
+struct bpfilter_ipt_error_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_error_target target;
+};
+
+struct bpfilter_ipt_get_entries {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 size;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+struct bpfilter_ipt_replace {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 num_entries;
+	__u32 size;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters *cntrs;
+	struct bpfilter_ipt_entry entries[0];
+};
+
 #endif /* _UAPI_LINUX_BPFILTER_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 02/10] bpfilter: Add logging facility
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 01/10] bpfilter: Add types for usermode helper Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

There are three logging levels for messages: FATAL, NOTICE and DEBUG.
When a message is logged with FATAL level it results in bpfilter
usermode helper termination.

Introduce struct context to avoid use of global objects and store there
the logging parameters: log level and log sink.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/context.h | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 net/bpfilter/context.h

diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
new file mode 100644
index 000000000000..e7bc27ee1ace
--- /dev/null
+++ b/net/bpfilter/context.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_CONTEXT_H
+#define NET_BPFILTER_CONTEXT_H
+
+#include <sys/syslog.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+
+struct context {
+	FILE *log_file;
+};
+
+#define BFLOG_IMPL(ctx, level, fmt, ...)                                                           \
+	do {                                                                                       \
+		if ((ctx)->log_file)								   \
+			fprintf((ctx)->log_file, "<%d>bpfilter: " fmt, (level), ##__VA_ARGS__);    \
+		if ((level) == LOG_EMERG)                                                          \
+			exit(EXIT_FAILURE);                                                        \
+	} while (0)
+
+#define BFLOG_EMERG(ctx, fmt, ...)                                                                 \
+	BFLOG_IMPL(ctx, LOG_KERN | LOG_EMERG, "fatal error: " fmt, ##__VA_ARGS__)
+
+#define BFLOG_NOTICE(ctx, fmt, ...) BFLOG_IMPL(ctx, LOG_KERN | LOG_NOTICE, fmt, ##__VA_ARGS__)
+
+#if 0
+#define BFLOG_DEBUG(ctx, fmt, ...) BFLOG_IMPL(ctx, LOG_KERN | LOG_DEBUG, fmt, ##__VA_ARGS__)
+#else
+#define BFLOG_DEBUG(ctx, fmt, ...)
+#endif
+
+#endif // NET_BPFILTER_CONTEXT_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 01/10] bpfilter: Add types for usermode helper Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 02/10] bpfilter: Add logging facility Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-08 16:20   ` Yonghong Song
  2021-06-03 10:14 ` [PATCH bpf-next v1 04/10] bpfilter: Add map container Dmitrii Banshchikov
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

The header will be used in bpfilter usermode helper test infrastructure.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 tools/include/uapi/linux/bpfilter.h | 179 ++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)
 create mode 100644 tools/include/uapi/linux/bpfilter.h

diff --git a/tools/include/uapi/linux/bpfilter.h b/tools/include/uapi/linux/bpfilter.h
new file mode 100644
index 000000000000..8b49d81f81c8
--- /dev/null
+++ b/tools/include/uapi/linux/bpfilter.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_BPFILTER_H
+#define _UAPI_LINUX_BPFILTER_H
+
+#include <linux/if.h>
+#include <linux/const.h>
+
+#define BPFILTER_FUNCTION_MAXNAMELEN    30
+#define BPFILTER_EXTENSION_MAXNAMELEN   29
+
+#define BPFILTER_STANDARD_TARGET        ""
+#define BPFILTER_ERROR_TARGET           "ERROR"
+
+
+#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
+
+enum {
+	BPFILTER_IPT_SO_SET_REPLACE = 64,
+	BPFILTER_IPT_SO_SET_ADD_COUNTERS = 65,
+	BPFILTER_IPT_SET_MAX,
+};
+
+enum {
+	BPFILTER_IPT_SO_GET_INFO = 64,
+	BPFILTER_IPT_SO_GET_ENTRIES = 65,
+	BPFILTER_IPT_SO_GET_REVISION_MATCH = 66,
+	BPFILTER_IPT_SO_GET_REVISION_TARGET = 67,
+	BPFILTER_IPT_GET_MAX,
+};
+
+enum {
+	BPFILTER_XT_TABLE_MAXNAMELEN = 32,
+};
+
+enum {
+	BPFILTER_NF_DROP = 0,
+	BPFILTER_NF_ACCEPT = 1,
+	BPFILTER_NF_STOLEN = 2,
+	BPFILTER_NF_QUEUE = 3,
+	BPFILTER_NF_REPEAT = 4,
+	BPFILTER_NF_STOP = 5,
+	BPFILTER_NF_MAX_VERDICT = BPFILTER_NF_STOP,
+	BPFILTER_RETURN = (-BPFILTER_NF_REPEAT - 1),
+};
+
+enum {
+	BPFILTER_INET_HOOK_PRE_ROUTING = 0,
+	BPFILTER_INET_HOOK_LOCAL_IN = 1,
+	BPFILTER_INET_HOOK_FORWARD = 2,
+	BPFILTER_INET_HOOK_LOCAL_OUT = 3,
+	BPFILTER_INET_HOOK_POST_ROUTING = 4,
+	BPFILTER_INET_HOOK_MAX,
+};
+
+enum {
+	BPFILTER_IPT_F_MASK = 0x03,
+	BPFILTER_IPT_INV_MASK = 0x7f
+};
+
+struct bpfilter_ipt_match {
+	union {
+		struct {
+			__u16 match_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 match_size;
+			void *match;
+		} kernel;
+		__u16 match_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_target {
+	union {
+		struct {
+			__u16 target_size;
+			char name[BPFILTER_EXTENSION_MAXNAMELEN];
+			__u8 revision;
+		} user;
+		struct {
+			__u16 target_size;
+			void *target;
+		} kernel;
+		__u16 target_size;
+	} u;
+	unsigned char data[0];
+};
+
+struct bpfilter_ipt_standard_target {
+	struct bpfilter_ipt_target target;
+	int verdict;
+};
+
+struct bpfilter_ipt_error_target {
+	struct bpfilter_ipt_target target;
+	char error_name[BPFILTER_FUNCTION_MAXNAMELEN];
+};
+
+struct bpfilter_ipt_get_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_entries;
+	__u32 size;
+};
+
+struct bpfilter_ipt_counters {
+	__u64 packet_cnt;
+	__u64 byte_cnt;
+};
+
+struct bpfilter_ipt_counters_info {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters counters[0];
+};
+
+struct bpfilter_ipt_get_revision {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	__u8 revision;
+};
+
+struct bpfilter_ipt_ip {
+	__u32 src;
+	__u32 dst;
+	__u32 src_mask;
+	__u32 dst_mask;
+	char in_iface[IFNAMSIZ];
+	char out_iface[IFNAMSIZ];
+	__u8 in_iface_mask[IFNAMSIZ];
+	__u8 out_iface_mask[IFNAMSIZ];
+	__u16 protocol;
+	__u8 flags;
+	__u8 invflags;
+};
+
+struct bpfilter_ipt_entry {
+	struct bpfilter_ipt_ip ip;
+	__u32 bfcache;
+	__u16 target_offset;
+	__u16 next_offset;
+	__u32 comefrom;
+	struct bpfilter_ipt_counters counters;
+	__u8 elems[0];
+};
+
+struct bpfilter_ipt_standard_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_standard_target target;
+};
+
+struct bpfilter_ipt_error_entry {
+	struct bpfilter_ipt_entry entry;
+	struct bpfilter_ipt_error_target target;
+};
+
+struct bpfilter_ipt_get_entries {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 size;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+struct bpfilter_ipt_replace {
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	__u32 valid_hooks;
+	__u32 num_entries;
+	__u32 size;
+	__u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+	__u32 underflow[BPFILTER_INET_HOOK_MAX];
+	__u32 num_counters;
+	struct bpfilter_ipt_counters *cntrs;
+	struct bpfilter_ipt_entry entries[0];
+};
+
+#endif /* _UAPI_LINUX_BPFILTER_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 04/10] bpfilter: Add map container
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (2 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 05/10] bpfilter: Add struct match Dmitrii Banshchikov
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Introduce common code for an associative container. This common code
will be used for maps of matches, targets and tables. Hash search tables
from libc are used as an index. The supported set of operations is:
insert, update and find.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |  2 +-
 net/bpfilter/map-common.c                     | 64 +++++++++++++++++++
 net/bpfilter/map-common.h                     | 19 ++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |  2 +
 tools/testing/selftests/bpf/bpfilter/Makefile | 17 +++++
 .../testing/selftests/bpf/bpfilter/test_map.c | 63 ++++++++++++++++++
 6 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index cdac82b8c53a..1809759d08c4 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o
+bpfilter_umh-objs := main.o map-common.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/map-common.c b/net/bpfilter/map-common.c
new file mode 100644
index 000000000000..6a4ab0c5d3ec
--- /dev/null
+++ b/net/bpfilter/map-common.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <string.h>
+
+int create_map(struct hsearch_data *htab, size_t nelem)
+{
+	memset(htab, 0, sizeof(*htab));
+	if (!hcreate_r(nelem, htab))
+		return -errno;
+
+	return 0;
+}
+
+void *map_find(struct hsearch_data *htab, const char *name)
+{
+	const ENTRY needle = { .key = (char *)name };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, FIND, &found, htab))
+		return ERR_PTR(-ENOENT);
+
+	return found->data;
+}
+
+int map_update(struct hsearch_data *htab, const char *name, void *data)
+{
+	const ENTRY needle = { .key = (char *)name, .data = data };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, ENTER, &found, htab))
+		return -errno;
+
+	found->key = (char *)name;
+	found->data = data;
+
+	return 0;
+}
+
+int map_insert(struct hsearch_data *htab, const char *name, void *data)
+{
+	const ENTRY needle = { .key = (char *)name, .data = data };
+	ENTRY *found;
+
+	if (!hsearch_r(needle, ENTER, &found, htab))
+		return -errno;
+
+	if (found->data != data)
+		return -EEXIST;
+
+	return 0;
+}
+
+void free_map(struct hsearch_data *htab)
+{
+	hdestroy_r(htab);
+}
diff --git a/net/bpfilter/map-common.h b/net/bpfilter/map-common.h
new file mode 100644
index 000000000000..b29829230eff
--- /dev/null
+++ b/net/bpfilter/map-common.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_MAP_COMMON_H
+#define NET_BPFILTER_MAP_COMMON_H
+
+#define _GNU_SOURCE
+
+#include <search.h>
+
+int create_map(struct hsearch_data *htab, size_t nelem);
+void *map_find(struct hsearch_data *htab, const char *name);
+int map_insert(struct hsearch_data *htab, const char *name, void *data);
+int map_update(struct hsearch_data *htab, const char *name, void *data);
+void free_map(struct hsearch_data *htab);
+
+#endif // NET_BPFILTER_MAP_COMMON_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
new file mode 100644
index 000000000000..983fd06cbefa
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+test_map
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
new file mode 100644
index 000000000000..647229a0596c
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+
+top_srcdir = ../../../../..
+TOOLSDIR := $(abspath ../../../../)
+TOOLSINCDIR := $(TOOLSDIR)/include
+APIDIR := $(TOOLSINCDIR)/uapi
+BPFILTERSRCDIR := $(top_srcdir)/net/bpfilter
+
+CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
+
+TEST_GEN_PROGS += test_map
+
+KSFT_KHDR_INSTALL := 1
+
+include ../../lib.mk
+
+$(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
diff --git a/tools/testing/selftests/bpf/bpfilter/test_map.c b/tools/testing/selftests/bpf/bpfilter/test_map.c
new file mode 100644
index 000000000000..6ac61a634e41
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_map.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "map-common.h"
+
+#include <linux/err.h>
+
+#include "../../kselftest_harness.h"
+
+FIXTURE(test_map)
+{
+	struct hsearch_data map;
+	const char *key;
+	void *expected;
+	void *actual;
+};
+
+FIXTURE_SETUP(test_map)
+{
+	const int max_nelements = 100;
+
+	create_map(&self->map, max_nelements);
+	self->key = "key";
+	self->expected = "expected";
+	self->actual = "actual";
+}
+
+FIXTURE_TEARDOWN(test_map)
+{
+	free_map(&self->map);
+}
+
+TEST_F(test_map, insert_and_find)
+{
+	void *found;
+
+	found = map_find(&self->map, self->key);
+	ASSERT_TRUE(IS_ERR(found))
+	ASSERT_EQ(-ENOENT, PTR_ERR(found))
+
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->expected));
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->expected));
+	ASSERT_EQ(-EEXIST, map_insert(&self->map, self->key, self->actual));
+
+	found = map_find(&self->map, self->key);
+
+	ASSERT_FALSE(IS_ERR(found));
+	ASSERT_STREQ(self->expected, found);
+}
+
+TEST_F(test_map, update)
+{
+	void *found;
+
+	ASSERT_EQ(0, map_insert(&self->map, self->key, self->actual));
+	ASSERT_EQ(0, map_update(&self->map, self->key, self->expected));
+
+	found = map_find(&self->map, self->key);
+
+	ASSERT_FALSE(IS_ERR(found));
+	ASSERT_STREQ(self->expected, found);
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 05/10] bpfilter: Add struct match
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (3 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 04/10] bpfilter: Add map container Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 06/10] bpfilter: Add struct target Dmitrii Banshchikov
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct match_ops defines polymorphic interface for matches. A match
consists of pointers to struct match_ops and struct xt_entry_match which
contains a payload for the match's type.

All match_ops are kept in a map by their name.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |  3 +-
 net/bpfilter/context.c                        | 44 +++++++++++++
 net/bpfilter/context.h                        |  5 ++
 net/bpfilter/match.c                          | 49 +++++++++++++++
 net/bpfilter/match.h                          | 33 ++++++++++
 net/bpfilter/xt_udp.c                         | 33 ++++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |  1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  7 +++
 .../selftests/bpf/bpfilter/test_match.c       | 63 +++++++++++++++++++
 9 files changed, 237 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 net/bpfilter/xt_udp.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 1809759d08c4..59f2d35c1627 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,8 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o map-common.o
+bpfilter_umh-objs := main.o map-common.o match.o context.o
+bpfilter_umh-objs += xt_udp.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
new file mode 100644
index 000000000000..6b6203dd22a7
--- /dev/null
+++ b/net/bpfilter/context.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "context.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include "map-common.h"
+#include "match.h"
+
+static int init_match_ops_map(struct context *ctx)
+{
+	const struct match_ops *match_ops[] = { &xt_udp };
+	int i, err;
+
+	err = create_map(&ctx->match_ops_map, ARRAY_SIZE(match_ops));
+	if (err)
+		return err;
+
+	for (i = 0; i < ARRAY_SIZE(match_ops); ++i) {
+		const struct match_ops *m = match_ops[i];
+
+		err = map_insert(&ctx->match_ops_map, m->name, (void *)m);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+int create_context(struct context *ctx)
+{
+	return init_match_ops_map(ctx);
+}
+
+void free_context(struct context *ctx)
+{
+	free_map(&ctx->match_ops_map);
+}
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index e7bc27ee1ace..60bb525843b0 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -10,9 +10,11 @@
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <search.h>
 
 struct context {
 	FILE *log_file;
+	struct hsearch_data match_ops_map;
 };
 
 #define BFLOG_IMPL(ctx, level, fmt, ...)                                                           \
@@ -34,4 +36,7 @@ struct context {
 #define BFLOG_DEBUG(ctx, fmt, ...)
 #endif
 
+int create_context(struct context *ctx);
+void free_context(struct context *ctx);
+
 #endif // NET_BPFILTER_CONTEXT_H
diff --git a/net/bpfilter/match.c b/net/bpfilter/match.c
new file mode 100644
index 000000000000..3b49196efabf
--- /dev/null
+++ b/net/bpfilter/match.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "match.h"
+
+#include <linux/err.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "context.h"
+#include "map-common.h"
+
+int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match, struct match *match)
+{
+	const size_t maxlen = sizeof(ipt_match->u.user.name);
+	const struct match_ops *found;
+	int err;
+
+	if (strnlen(ipt_match->u.user.name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot init match: too long match name\n");
+		return -EINVAL;
+	}
+
+	found = map_find(&ctx->match_ops_map, ipt_match->u.user.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find match by name: '%s'\n", ipt_match->u.user.name);
+		return PTR_ERR(found);
+	}
+
+	if (found->size + sizeof(*ipt_match) != ipt_match->u.match_size ||
+	    found->revision != ipt_match->u.user.revision) {
+		BFLOG_DEBUG(ctx, "invalid match: '%s'\n", ipt_match->u.user.name);
+		return -EINVAL;
+	}
+
+	err = found->check(ctx, ipt_match);
+	if (err)
+		return err;
+
+	match->match_ops = found;
+	match->ipt_match = ipt_match;
+
+	return 0;
+}
diff --git a/net/bpfilter/match.h b/net/bpfilter/match.h
new file mode 100644
index 000000000000..9879a3670711
--- /dev/null
+++ b/net/bpfilter/match.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_MATCH_H
+#define NET_BPFILTER_MATCH_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <stdint.h>
+
+struct bpfilter_ipt_match;
+struct context;
+
+struct match_ops {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	uint8_t revision;
+	uint16_t size;
+	int (*check)(struct context *ctx, const struct bpfilter_ipt_match *ipt_match);
+};
+
+struct match {
+	const struct match_ops *match_ops;
+	const struct bpfilter_ipt_match *ipt_match;
+};
+
+extern const struct match_ops xt_udp;
+
+int init_match(struct context *ctx, const struct bpfilter_ipt_match *ipt_match,
+	       struct match *match);
+
+#endif // NET_BPFILTER_MATCH_H
diff --git a/net/bpfilter/xt_udp.c b/net/bpfilter/xt_udp.c
new file mode 100644
index 000000000000..a7fbe77a53cc
--- /dev/null
+++ b/net/bpfilter/xt_udp.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_tcpudp.h>
+
+#include <errno.h>
+
+#include "context.h"
+#include "match.h"
+
+static int xt_udp_check(struct context *ctx, const struct bpfilter_ipt_match *ipt_match)
+{
+	const struct xt_udp *udp;
+
+	udp = (const struct xt_udp *)&ipt_match->data;
+
+	if (udp->invflags & XT_UDP_INV_MASK) {
+		BFLOG_DEBUG(ctx, "cannot check match 'udp': invalid flags\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct match_ops xt_udp = { .name = "udp",
+				  .size = XT_ALIGN(sizeof(struct xt_udp)),
+				  .revision = 0,
+				  .check = xt_udp_check };
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index 983fd06cbefa..9250411fa7aa 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 test_map
+test_match
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 647229a0596c..0ef156cdb198 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -9,9 +9,16 @@ BPFILTERSRCDIR := $(top_srcdir)/net/bpfilter
 CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 
 TEST_GEN_PROGS += test_map
+TEST_GEN_PROGS += test_match
 
 KSFT_KHDR_INSTALL := 1
 
 include ../../lib.mk
 
+BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
+
+BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
+BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS)
+
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
+$(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS) $(BPFILTER_MATCH_SRCS)
diff --git a/tools/testing/selftests/bpf/bpfilter/test_match.c b/tools/testing/selftests/bpf/bpfilter/test_match.c
new file mode 100644
index 000000000000..3a56d79ed24c
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_match.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "context.h"
+#include "match.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_tcpudp.h>
+
+#include <stdio.h>
+
+#include "../../kselftest_harness.h"
+
+struct udp_match {
+	struct xt_entry_match ipt_match;
+	struct xt_udp udp;
+};
+
+FIXTURE(test_udp_match)
+{
+	struct context ctx;
+	struct udp_match udp_match;
+	struct match match;
+};
+
+FIXTURE_SETUP(test_udp_match)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->udp_match, 0, sizeof(self->udp_match));
+	snprintf(self->udp_match.ipt_match.u.user.name,
+		 sizeof(self->udp_match.ipt_match.u.user.name), "udp");
+	self->udp_match.ipt_match.u.user.match_size = sizeof(struct udp_match);
+	self->udp_match.ipt_match.u.user.revision = 0;
+};
+
+FIXTURE_TEARDOWN(test_udp_match)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_udp_match, init)
+{
+	self->udp_match.udp.spts[0] = 1;
+	self->udp_match.udp.spts[1] = 2;
+	self->udp_match.udp.dpts[0] = 3;
+	self->udp_match.udp.dpts[1] = 4;
+	self->udp_match.udp.invflags = 0;
+
+	ASSERT_EQ(init_match(&self->ctx,
+			     (const struct bpfilter_ipt_match *)&self->udp_match
+				     .ipt_match,
+			     &self->match),
+		  0);
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 06/10] bpfilter: Add struct target
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (4 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 05/10] bpfilter: Add struct match Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 16:47   ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 07/10] bpfilter: Add struct rule Dmitrii Banshchikov
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct target_ops defines polymorphic interface for targets. A target
consists of pointers to struct target_ops and struct xt_entry_target
which contains a payload for the target's type.

All target_ops are kept in a map by their name.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 .clang-format                                 |   2 +-
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/context.c                        |  36 +++++-
 net/bpfilter/context.h                        |   1 +
 net/bpfilter/target.c                         | 118 ++++++++++++++++++
 net/bpfilter/target.h                         |  49 ++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |   7 +-
 .../selftests/bpf/bpfilter/bpfilter_util.h    |  31 +++++
 .../selftests/bpf/bpfilter/test_target.c      |  85 +++++++++++++
 10 files changed, 327 insertions(+), 5 deletions(-)
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c

diff --git a/.clang-format b/.clang-format
index c24b147cac01..3212542df113 100644
--- a/.clang-format
+++ b/.clang-format
@@ -52,7 +52,7 @@ BreakConstructorInitializersBeforeComma: false
 #BreakConstructorInitializers: BeforeComma # Unknown to clang-format-4.0
 BreakAfterJavaFieldAnnotations: false
 BreakStringLiterals: false
-ColumnLimit: 80
+ColumnLimit: 100
 CommentPragmas: '^ IWYU pragma:'
 #CompactNamespaces: false # Unknown to clang-format-4.0
 ConstructorInitializerAllOnOneLineOrOnePerLine: false
diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 59f2d35c1627..031c9dd40d2d 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o map-common.o match.o context.o
+bpfilter_umh-objs := main.o map-common.o context.o match.o target.o
 bpfilter_umh-objs += xt_udp.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
index 6b6203dd22a7..6e186399609e 100644
--- a/net/bpfilter/context.c
+++ b/net/bpfilter/context.c
@@ -12,6 +12,7 @@
 
 #include "map-common.h"
 #include "match.h"
+#include "target.h"
 
 static int init_match_ops_map(struct context *ctx)
 {
@@ -33,12 +34,45 @@ static int init_match_ops_map(struct context *ctx)
 	return 0;
 }
 
+static int init_target_ops_map(struct context *ctx)
+{
+	const struct target_ops *target_ops[] = { &standard_target_ops, &error_target_ops };
+	int i, err;
+
+	err = create_map(&ctx->target_ops_map, ARRAY_SIZE(target_ops));
+	if (err)
+		return err;
+
+	for (i = 0; i < ARRAY_SIZE(target_ops); ++i) {
+		const struct target_ops *t = target_ops[i];
+
+		err = map_insert(&ctx->target_ops_map, t->name, (void *)t);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 int create_context(struct context *ctx)
 {
-	return init_match_ops_map(ctx);
+	int err;
+
+	err = init_match_ops_map(ctx);
+	if (err)
+		return err;
+
+	err = init_target_ops_map(ctx);
+	if (err) {
+		free_map(&ctx->match_ops_map);
+		return err;
+	}
+
+	return 0;
 }
 
 void free_context(struct context *ctx)
 {
 	free_map(&ctx->match_ops_map);
+	free_map(&ctx->target_ops_map);
 }
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index 60bb525843b0..ed268259adcc 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -15,6 +15,7 @@
 struct context {
 	FILE *log_file;
 	struct hsearch_data match_ops_map;
+	struct hsearch_data target_ops_map;
 };
 
 #define BFLOG_IMPL(ctx, level, fmt, ...)                                                           \
diff --git a/net/bpfilter/target.c b/net/bpfilter/target.c
new file mode 100644
index 000000000000..f87ef719ea4d
--- /dev/null
+++ b/net/bpfilter/target.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "target.h"
+
+#include <linux/err.h>
+#include <linux/netfilter/x_tables.h>
+
+#include <errno.h>
+#include <string.h>
+
+#include "context.h"
+#include "map-common.h"
+
+static const struct target_ops *target_ops_map_find(struct hsearch_data *map, const char *name)
+{
+	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
+
+	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
+		return map_find(map, name);
+
+	return ERR_PTR(-EINVAL);
+}
+
+static int standard_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
+{
+	const struct bpfilter_ipt_standard_target *standard_target;
+
+	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
+
+	// Positive values of verdict denote a jump offset into a blob.
+	if (standard_target->verdict > 0)
+		return 0;
+
+	// Special values like ACCEPT, DROP, RETURN are encoded as negative values.
+	if (standard_target->verdict < 0) {
+		if (standard_target->verdict == BPFILTER_RETURN)
+			return 0;
+
+		switch (convert_verdict(standard_target->verdict)) {
+		case BPFILTER_NF_ACCEPT:
+		case BPFILTER_NF_DROP:
+		case BPFILTER_NF_QUEUE:
+			return 0;
+		}
+	}
+
+	BFLOG_DEBUG(ctx, "invalid verdict: %d\n", standard_target->verdict);
+
+	return -EINVAL;
+}
+
+const struct target_ops standard_target_ops = {
+	.name = "",
+	.revision = 0,
+	.size = sizeof(struct xt_standard_target),
+	.check = standard_target_check,
+};
+
+static int error_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
+{
+	const struct bpfilter_ipt_error_target *error_target;
+	size_t maxlen;
+
+	error_target = (const struct bpfilter_ipt_error_target *)&ipt_target;
+	maxlen = sizeof(error_target->error_name);
+	if (strnlen(error_target->error_name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot check error target: too long errorname\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct target_ops error_target_ops = {
+	.name = "ERROR",
+	.revision = 0,
+	.size = sizeof(struct xt_error_target),
+	.check = error_target_check,
+};
+
+int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
+		struct target *target)
+{
+	const size_t maxlen = sizeof(ipt_target->u.user.name);
+	const struct target_ops *found;
+	int err;
+
+	if (strnlen(ipt_target->u.user.name, maxlen) == maxlen) {
+		BFLOG_DEBUG(ctx, "cannot init target: too long target name\n");
+		return -EINVAL;
+	}
+
+	found = target_ops_map_find(&ctx->target_ops_map, ipt_target->u.user.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find target by name: '%s'\n", ipt_target->u.user.name);
+		return PTR_ERR(found);
+	}
+
+	if (found->size != ipt_target->u.target_size ||
+	    found->revision != ipt_target->u.user.revision) {
+		BFLOG_DEBUG(ctx, "invalid target: '%s'\n", ipt_target->u.user.name);
+		return -EINVAL;
+	}
+
+	err = found->check(ctx, ipt_target);
+	if (err)
+		return err;
+
+	target->target_ops = found;
+	target->ipt_target = ipt_target;
+
+	return 0;
+}
diff --git a/net/bpfilter/target.h b/net/bpfilter/target.h
new file mode 100644
index 000000000000..e27816d73cc2
--- /dev/null
+++ b/net/bpfilter/target.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TARGET_H
+#define NET_BPFILTER_TARGET_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <stdint.h>
+
+struct context;
+struct target_ops_map;
+
+struct target_ops {
+	char name[BPFILTER_EXTENSION_MAXNAMELEN];
+	uint8_t revision;
+	uint16_t size;
+	int (*check)(struct context *ctx, const struct bpfilter_ipt_target *ipt_target);
+};
+
+struct target {
+	const struct target_ops *target_ops;
+	const struct bpfilter_ipt_target *ipt_target;
+};
+
+extern const struct target_ops standard_target_ops;
+extern const struct target_ops error_target_ops;
+
+/* Restore verdict's special value(ACCEPT, DROP, etc.) from its negative representation. */
+static inline int convert_verdict(int verdict)
+{
+	return -verdict - 1;
+}
+
+static inline int standard_target_verdict(const struct bpfilter_ipt_target *ipt_target)
+{
+	const struct bpfilter_ipt_standard_target *standard_target;
+
+	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
+
+	return standard_target->verdict;
+}
+
+int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
+		struct target *target);
+
+#endif // NET_BPFILTER_TARGET_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index 9250411fa7aa..7e077f506af1 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 test_map
 test_match
+test_target
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 0ef156cdb198..a11775e8b5af 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -10,15 +10,18 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 
 TEST_GEN_PROGS += test_map
 TEST_GEN_PROGS += test_match
+TEST_GEN_PROGS += test_target
 
 KSFT_KHDR_INSTALL := 1
 
 include ../../lib.mk
 
 BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
+BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
 
 BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
-BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS)
+BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
 
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
-$(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS) $(BPFILTER_MATCH_SRCS)
+$(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS)
+$(OUTPUT)/test_target: test_target.c $(BPFILTER_COMMON_SRCS)
diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
new file mode 100644
index 000000000000..d82ff86f280e
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BPFILTER_UTIL_H
+#define BPFILTER_UTIL_H
+
+#include <linux/bpfilter.h>
+#include <linux/netfilter/x_tables.h>
+
+#include <stdio.h>
+
+static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
+					int verdict)
+{
+	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
+		 BPFILTER_STANDARD_TARGET);
+	ipt_target->target.u.user.revision = revision;
+	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
+	ipt_target->verdict = verdict;
+}
+
+static inline void init_error_target(struct xt_error_target *ipt_target, int revision,
+				     const char *error_name)
+{
+	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
+		 BPFILTER_ERROR_TARGET);
+	ipt_target->target.u.user.revision = revision;
+	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
+	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
+}
+
+#endif // BPFILTER_UTIL_H
diff --git a/tools/testing/selftests/bpf/bpfilter/test_target.c b/tools/testing/selftests/bpf/bpfilter/test_target.c
new file mode 100644
index 000000000000..6765497b53c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_target.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "context.h"
+#include "target.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+
+#include "../../kselftest_harness.h"
+
+#include "bpfilter_util.h"
+
+FIXTURE(test_standard_target)
+{
+	struct context ctx;
+	struct xt_standard_target ipt_target;
+	struct target target;
+};
+
+FIXTURE_VARIANT(test_standard_target)
+{
+	int verdict;
+};
+
+FIXTURE_VARIANT_ADD(test_standard_target, accept) {
+	.verdict = -BPFILTER_NF_ACCEPT - 1,
+};
+
+FIXTURE_VARIANT_ADD(test_standard_target, drop) {
+	.verdict = -BPFILTER_NF_DROP - 1,
+};
+
+FIXTURE_SETUP(test_standard_target)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
+	init_standard_target(&self->ipt_target, 0, variant->verdict);
+}
+
+FIXTURE_TEARDOWN(test_standard_target)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_standard_target, init)
+{
+	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
+				 &self->target));
+}
+
+FIXTURE(test_error_target)
+{
+	struct context ctx;
+	struct xt_error_target ipt_target;
+	struct target target;
+};
+
+FIXTURE_SETUP(test_error_target)
+{
+	ASSERT_EQ(0, create_context(&self->ctx));
+	self->ctx.log_file = stderr;
+
+	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
+	init_error_target(&self->ipt_target, 0, "x");
+}
+
+FIXTURE_TEARDOWN(test_error_target)
+{
+	free_context(&self->ctx);
+}
+
+TEST_F(test_error_target, init)
+{
+	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
+				 &self->target));
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 07/10] bpfilter: Add struct rule
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (5 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 06/10] bpfilter: Add struct target Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-10  0:30   ` Yonghong Song
  2021-06-03 10:14 ` [PATCH bpf-next v1 08/10] bpfilter: Add struct table Dmitrii Banshchikov
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

struct rule is an equivalent of struct ipt_entry. A rule consists of
zero or more matches and a target. A rule has a pointer to its ipt_entry
in entries blob.  struct rule should simplify iteration over a blob and
avoid blob's guts in code generation.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/rule.c                           | 163 ++++++++++++++++++
 net/bpfilter/rule.h                           |  32 ++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
 tools/testing/selftests/bpf/bpfilter/Makefile |   5 +-
 .../selftests/bpf/bpfilter/bpfilter_util.h    |   8 +
 .../selftests/bpf/bpfilter/test_rule.c        |  55 ++++++
 7 files changed, 264 insertions(+), 2 deletions(-)
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 031c9dd40d2d..7ce961162283 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o map-common.o context.o match.o target.o
+bpfilter_umh-objs := main.o map-common.o context.o match.o target.o rule.o
 bpfilter_umh-objs += xt_udp.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
diff --git a/net/bpfilter/rule.c b/net/bpfilter/rule.c
new file mode 100644
index 000000000000..6018b4b7c0cc
--- /dev/null
+++ b/net/bpfilter/rule.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "rule.h"
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <linux/err.h>
+#include <linux/netfilter/x_tables.h>
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "context.h"
+#include "match.h"
+
+static const struct bpfilter_ipt_target *
+ipt_entry_target(const struct bpfilter_ipt_entry *ipt_entry)
+{
+	return (const void *)ipt_entry + ipt_entry->target_offset;
+}
+
+static const struct bpfilter_ipt_match *ipt_entry_match(const struct bpfilter_ipt_entry *entry,
+							size_t offset)
+{
+	return (const void *)entry + offset;
+}
+
+static int ipt_entry_num_matches(const struct bpfilter_ipt_entry *ipt_entry)
+{
+	const struct bpfilter_ipt_match *ipt_match;
+	uint32_t offset = sizeof(*ipt_entry);
+	int num_matches = 0;
+
+	while (offset < ipt_entry->target_offset) {
+		ipt_match = ipt_entry_match(ipt_entry, offset);
+
+		if ((uintptr_t)ipt_match % __alignof__(struct bpfilter_ipt_match))
+			return -EINVAL;
+
+		if (ipt_entry->target_offset < offset + sizeof(*ipt_match))
+			return -EINVAL;
+
+		if (ipt_match->u.match_size < sizeof(*ipt_match))
+			return -EINVAL;
+
+		if (ipt_entry->target_offset < offset + ipt_match->u.match_size)
+			return -EINVAL;
+
+		++num_matches;
+		offset += ipt_match->u.match_size;
+	}
+
+	if (offset != ipt_entry->target_offset)
+		return -EINVAL;
+
+	return num_matches;
+}
+
+static int init_rule_matches(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry,
+			     struct rule *rule)
+{
+	const struct bpfilter_ipt_match *ipt_match;
+	uint32_t offset = sizeof(*ipt_entry);
+	struct match *match;
+	int err;
+
+	rule->matches = calloc(rule->num_matches, sizeof(rule->matches[0]));
+	if (!rule->matches)
+		return -ENOMEM;
+
+	match = rule->matches;
+	while (offset < ipt_entry->target_offset) {
+		ipt_match = ipt_entry_match(ipt_entry, offset);
+		err = init_match(ctx, ipt_match, match);
+		if (err) {
+			free(rule->matches);
+			rule->matches = NULL;
+			return err;
+		}
+
+		++match;
+		offset += ipt_match->u.match_size;
+	}
+
+	return 0;
+}
+
+static int check_ipt_entry_ip(const struct bpfilter_ipt_ip *ip)
+{
+	if (ip->flags & ~BPFILTER_IPT_F_MASK)
+		return -EINVAL;
+
+	if (ip->invflags & ~BPFILTER_IPT_INV_MASK)
+		return -EINVAL;
+
+	return 0;
+}
+
+bool rule_has_standard_target(const struct rule *rule)
+{
+	return rule->target.target_ops == &standard_target_ops;
+}
+
+bool is_rule_unconditional(const struct rule *rule)
+{
+	static const struct bpfilter_ipt_ip unconditional;
+
+	if (rule->num_matches)
+		return false;
+
+	return !memcmp(&rule->ipt_entry->ip, &unconditional, sizeof(unconditional));
+}
+
+int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule)
+{
+	const struct bpfilter_ipt_target *ipt_target;
+	int err;
+
+	err = check_ipt_entry_ip(&ipt_entry->ip);
+	if (err)
+		return err;
+
+	if (ipt_entry->target_offset < sizeof(*ipt_entry))
+		return -EINVAL;
+
+	if (ipt_entry->next_offset < ipt_entry->target_offset + sizeof(*ipt_target))
+		return -EINVAL;
+
+	ipt_target = ipt_entry_target(ipt_entry);
+
+	if (ipt_target->u.target_size < sizeof(*ipt_target))
+		return -EINVAL;
+
+	if (ipt_entry->next_offset < ipt_entry->target_offset + ipt_target->u.target_size)
+		return -EINVAL;
+
+	err = init_target(ctx, ipt_target, &rule->target);
+	if (err)
+		return err;
+
+	if (rule_has_standard_target(rule)) {
+		if (XT_ALIGN(ipt_entry->target_offset +
+			     sizeof(struct bpfilter_ipt_standard_target)) != ipt_entry->next_offset)
+			return -EINVAL;
+	}
+
+	rule->num_matches = ipt_entry_num_matches(ipt_entry);
+	if (rule->num_matches < 0)
+		return rule->num_matches;
+
+	return init_rule_matches(ctx, ipt_entry, rule);
+}
+
+void free_rule(struct rule *rule)
+{
+	free(rule->matches);
+}
diff --git a/net/bpfilter/rule.h b/net/bpfilter/rule.h
new file mode 100644
index 000000000000..cf879a19c670
--- /dev/null
+++ b/net/bpfilter/rule.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_RULE_H
+#define NET_BPFILTER_RULE_H
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#include "target.h"
+
+struct bpfilter_ipt_entry;
+struct context;
+struct match;
+
+struct rule {
+	const struct bpfilter_ipt_entry *ipt_entry;
+	uint32_t came_from;
+	uint32_t hook_mask;
+	uint16_t num_matches;
+	struct match *matches;
+	struct target target;
+};
+
+bool rule_has_standard_target(const struct rule *rule);
+bool is_rule_unconditional(const struct rule *rule);
+int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule);
+void free_rule(struct rule *rule);
+
+#endif // NET_BPFILTER_RULE_H
diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
index 7e077f506af1..4d7c5083d980 100644
--- a/tools/testing/selftests/bpf/bpfilter/.gitignore
+++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
@@ -2,3 +2,4 @@
 test_map
 test_match
 test_target
+test_rule
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index a11775e8b5af..27a1ddcb6dc9 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
 TEST_GEN_PROGS += test_map
 TEST_GEN_PROGS += test_match
 TEST_GEN_PROGS += test_target
+TEST_GEN_PROGS += test_rule
 
 KSFT_KHDR_INSTALL := 1
 
@@ -19,9 +20,11 @@ include ../../lib.mk
 BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
 BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
 
-BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
+BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c \
+	$(BPFILTERSRCDIR)/rule.c
 BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
 
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
 $(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS)
 $(OUTPUT)/test_target: test_target.c $(BPFILTER_COMMON_SRCS)
+$(OUTPUT)/test_rule: test_rule.c $(BPFILTER_COMMON_SRCS)
diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
index d82ff86f280e..55fb0e959fca 100644
--- a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
+++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
@@ -7,6 +7,7 @@
 #include <linux/netfilter/x_tables.h>
 
 #include <stdio.h>
+#include <string.h>
 
 static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
 					int verdict)
@@ -28,4 +29,11 @@ static inline void init_error_target(struct xt_error_target *ipt_target, int rev
 	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
 }
 
+static inline void init_standard_entry(struct ipt_entry *entry, __u16 matches_size)
+{
+	memset(entry, 0, sizeof(*entry));
+	entry->target_offset = sizeof(*entry) + matches_size;
+	entry->next_offset = sizeof(*entry) + matches_size + sizeof(struct xt_standard_target);
+}
+
 #endif // BPFILTER_UTIL_H
diff --git a/tools/testing/selftests/bpf/bpfilter/test_rule.c b/tools/testing/selftests/bpf/bpfilter/test_rule.c
new file mode 100644
index 000000000000..fe12adf32fe5
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpfilter/test_rule.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+
+#include "rule.h"
+
+#include <linux/bpfilter.h>
+#include <linux/err.h>
+
+#include <linux/netfilter_ipv4/ip_tables.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "../../kselftest_harness.h"
+
+#include "context.h"
+#include "rule.h"
+
+#include "bpfilter_util.h"
+
+FIXTURE(test_standard_rule)
+{
+	struct context ctx;
+	struct {
+		struct ipt_entry entry;
+		struct xt_standard_target target;
+	} entry;
+	struct rule rule;
+};
+
+FIXTURE_SETUP(test_standard_rule)
+{
+	const int verdict = BPFILTER_NF_ACCEPT;
+
+	ASSERT_EQ(create_context(&self->ctx), 0);
+	self->ctx.log_file = stderr;
+
+	init_standard_entry(&self->entry.entry, 0);
+	init_standard_target(&self->entry.target, 0, -verdict - 1);
+}
+
+FIXTURE_TEARDOWN(test_standard_rule)
+{
+	free_rule(&self->rule);
+	free_context(&self->ctx);
+}
+
+TEST_F(test_standard_rule, init)
+{
+	ASSERT_EQ(0, init_rule(&self->ctx, (const struct bpfilter_ipt_entry *)&self->entry.entry,
+			       &self->rule));
+}
+
+TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 08/10] bpfilter: Add struct table
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (6 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 07/10] bpfilter: Add struct rule Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 09/10] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

A table keeps iptables' blob and an array of struct rule for this blob.
The array of rules provides more convenient way to interact with blob's
entries.

All tables are stored in a map which is used for lookups.
Also all tables are linked into a list that is used for freeing them.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile                         |   2 +-
 net/bpfilter/context.c                        | 103 ++++++
 net/bpfilter/context.h                        |   5 +-
 net/bpfilter/table.c                          | 339 ++++++++++++++++++
 net/bpfilter/table.h                          |  39 ++
 tools/testing/selftests/bpf/bpfilter/Makefile |   2 +-
 6 files changed, 487 insertions(+), 3 deletions(-)
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 7ce961162283..e16fee837ca0 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o map-common.o context.o match.o target.o rule.o
+bpfilter_umh-objs := main.o map-common.o context.o match.o target.o rule.o table.o
 bpfilter_umh-objs += xt_udp.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
index 6e186399609e..beb4454d1218 100644
--- a/net/bpfilter/context.c
+++ b/net/bpfilter/context.c
@@ -10,8 +10,12 @@
 #include <linux/err.h>
 #include <linux/list.h>
 
+#include <string.h>
+
 #include "map-common.h"
 #include "match.h"
+#include "rule.h"
+#include "table.h"
 #include "target.h"
 
 static int init_match_ops_map(struct context *ctx)
@@ -54,6 +58,88 @@ static int init_target_ops_map(struct context *ctx)
 	return 0;
 }
 
+static void init_standard_entry(struct bpfilter_ipt_standard_entry *ipt_entry)
+{
+	ipt_entry->entry.next_offset = sizeof(*ipt_entry);
+	ipt_entry->entry.target_offset = sizeof(ipt_entry->entry);
+	ipt_entry->target.target.u.user.revision = 0;
+	ipt_entry->target.target.u.user.target_size = sizeof(struct bpfilter_ipt_standard_target);
+	ipt_entry->target.verdict = -BPFILTER_NF_ACCEPT - 1;
+}
+
+static void init_error_entry(struct bpfilter_ipt_error_entry *ipt_entry)
+{
+	ipt_entry->entry.next_offset = sizeof(*ipt_entry);
+	ipt_entry->entry.target_offset = sizeof(ipt_entry->entry);
+	ipt_entry->target.target.u.target_size = sizeof(struct bpfilter_ipt_error_target);
+	ipt_entry->target.target.u.user.revision = 0;
+	snprintf(ipt_entry->target.target.u.user.name, sizeof(ipt_entry->target.target.u.user.name),
+		 "ERROR");
+}
+
+static struct table *create_filter_table(struct context *ctx)
+{
+	struct filter_table_entries {
+		struct bpfilter_ipt_standard_entry local_in;
+		struct bpfilter_ipt_standard_entry forward;
+		struct bpfilter_ipt_standard_entry local_out;
+		struct bpfilter_ipt_error_entry error;
+	};
+
+	struct filter_table {
+		struct bpfilter_ipt_replace replace;
+		struct filter_table_entries entries;
+	} filter_table;
+
+	memset(&filter_table, 0, sizeof(filter_table));
+
+	snprintf(filter_table.replace.name, sizeof(filter_table.replace.name), "filter");
+	filter_table.replace.valid_hooks = 1 << BPFILTER_INET_HOOK_LOCAL_IN |
+					   1 << BPFILTER_INET_HOOK_FORWARD |
+					   1 << BPFILTER_INET_HOOK_LOCAL_OUT;
+	filter_table.replace.num_entries = 4;
+	filter_table.replace.size = sizeof(struct filter_table_entries);
+
+	filter_table.replace.hook_entry[BPFILTER_INET_HOOK_FORWARD] =
+		offsetof(struct filter_table_entries, forward);
+	filter_table.replace.underflow[BPFILTER_INET_HOOK_FORWARD] =
+		offsetof(struct filter_table_entries, forward);
+
+	filter_table.replace.hook_entry[BPFILTER_INET_HOOK_LOCAL_OUT] =
+		offsetof(struct filter_table_entries, local_out);
+	filter_table.replace.underflow[BPFILTER_INET_HOOK_LOCAL_OUT] =
+		offsetof(struct filter_table_entries, local_out);
+
+	init_standard_entry(&filter_table.entries.local_in);
+	init_standard_entry(&filter_table.entries.forward);
+	init_standard_entry(&filter_table.entries.local_out);
+	init_error_entry(&filter_table.entries.error);
+
+	return create_table(ctx, &filter_table.replace);
+}
+
+static int init_table_index(struct context *ctx)
+{
+	struct table *table;
+	int err;
+
+	INIT_LIST_HEAD(&ctx->table_index.list);
+
+	err = create_map(&ctx->table_index.map, 1);
+	if (err)
+		return err;
+
+	table = create_filter_table(ctx);
+	if (IS_ERR(table)) {
+		free_map(&ctx->table_index.map);
+		return PTR_ERR(table);
+	}
+
+	list_add_tail(&table->list, &ctx->table_index.list);
+
+	return map_insert(&ctx->table_index.map, table->name, table);
+}
+
 int create_context(struct context *ctx)
 {
 	int err;
@@ -68,11 +154,28 @@ int create_context(struct context *ctx)
 		return err;
 	}
 
+	err = init_table_index(ctx);
+	if (err) {
+		free_map(&ctx->match_ops_map);
+		free_map(&ctx->target_ops_map);
+		return err;
+	}
+
 	return 0;
 }
 
 void free_context(struct context *ctx)
 {
+	struct list_head *t, *n;
+
+	list_for_each_safe(t, n, &ctx->table_index.list) {
+		struct table *table;
+
+		table = list_entry(t, struct table, list);
+		free_table(table);
+	}
+
+	free_map(&ctx->table_index.map);
 	free_map(&ctx->match_ops_map);
 	free_map(&ctx->target_ops_map);
 }
diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
index ed268259adcc..740d30af9087 100644
--- a/net/bpfilter/context.h
+++ b/net/bpfilter/context.h
@@ -8,14 +8,17 @@
 
 #include <sys/syslog.h>
 
+#include <search.h>
 #include <stdio.h>
 #include <stdlib.h>
-#include <search.h>
+
+#include "table.h"
 
 struct context {
 	FILE *log_file;
 	struct hsearch_data match_ops_map;
 	struct hsearch_data target_ops_map;
+	struct table_index table_index;
 };
 
 #define BFLOG_IMPL(ctx, level, fmt, ...)                                                           \
diff --git a/net/bpfilter/table.c b/net/bpfilter/table.c
new file mode 100644
index 000000000000..506ad3c9402f
--- /dev/null
+++ b/net/bpfilter/table.c
@@ -0,0 +1,339 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "table.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "context.h"
+#include "rule.h"
+
+static int rule_offset_comparator(const void *x, const void *y)
+{
+	const struct rule *rule = y;
+
+	return x - (const void *)rule->ipt_entry;
+}
+
+static struct rule *table_find_rule_by_offset(struct table *table, uint32_t offset)
+{
+	const struct bpfilter_ipt_entry *key;
+
+	key = table->entries + offset;
+
+	return bsearch(key, table->rules, table->num_rules, sizeof(table->rules[0]),
+		       rule_offset_comparator);
+}
+
+static int table_init_rules(struct context *ctx, struct table *table,
+			    const struct bpfilter_ipt_replace *ipt_replace)
+{
+	uint32_t offset;
+	int i;
+
+	table->entries = malloc(table->size);
+	if (!table->entries)
+		return -ENOMEM;
+
+	memcpy(table->entries, ipt_replace->entries, table->size);
+
+	table->rules = calloc(table->num_rules, sizeof(table->rules[0]));
+	if (!table->rules)
+		return -ENOMEM;
+
+	offset = 0;
+	for (i = 0; i < table->num_rules; ++i) {
+		const struct bpfilter_ipt_entry *ipt_entry;
+		int err;
+
+		if (table->size < offset)
+			return -EINVAL;
+
+		if (table->size < offset + sizeof(*ipt_entry))
+			return -EINVAL;
+
+		ipt_entry = table->entries + offset;
+
+		if ((uintptr_t)ipt_entry % __alignof__(struct bpfilter_ipt_entry))
+			return -EINVAL;
+
+		if (table->size < offset + ipt_entry->next_offset)
+			return -EINVAL;
+
+		err = init_rule(ctx, ipt_entry, &table->rules[i]);
+		if (err)
+			return err;
+
+		table->rules[i].ipt_entry = ipt_entry;
+		offset += ipt_entry->next_offset;
+	}
+
+	if (offset != ipt_replace->size)
+		return -EINVAL;
+
+	if (table->num_rules != ipt_replace->num_entries)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int table_check_hooks(const struct table *table)
+{
+	uint32_t max_rule_front, max_rule_last;
+	bool check = false;
+	int i;
+
+	for (i = 0; i < BPFILTER_INET_HOOK_MAX; ++i) {
+		if (!(table->valid_hooks & (1 << i)))
+			continue;
+
+		if (check) {
+			if (table->hook_entry[i] <= max_rule_front)
+				return -EINVAL;
+
+			if (table->underflow[i] <= max_rule_last)
+				return -EINVAL;
+		}
+
+		max_rule_front = table->hook_entry[i];
+		max_rule_last = table->underflow[i];
+		check = true;
+	}
+
+	return 0;
+}
+
+static int table_init_hooks(struct table *table, const struct bpfilter_ipt_replace *ipt_replace)
+{
+	int i;
+
+	for (i = 0; i < BPFILTER_INET_HOOK_MAX; ++i) {
+		struct rule *rule_front, *rule_last;
+		int verdict;
+
+		if (!(table->valid_hooks & (1 << i)))
+			continue;
+
+		rule_front = table_find_rule_by_offset(table, ipt_replace->hook_entry[i]);
+		rule_last = table_find_rule_by_offset(table, ipt_replace->underflow[i]);
+
+		if (!rule_front || !rule_last)
+			return -EINVAL;
+
+		if (!is_rule_unconditional(rule_last))
+			return -EINVAL;
+
+		if (!rule_has_standard_target(rule_last))
+			return -EINVAL;
+
+		verdict = standard_target_verdict(rule_last->target.ipt_target);
+		if (verdict >= 0)
+			return -EINVAL;
+
+		verdict = convert_verdict(verdict);
+
+		if (verdict != BPFILTER_NF_DROP && verdict != BPFILTER_NF_ACCEPT)
+			return -EINVAL;
+
+		table->hook_entry[i] = rule_front - table->rules;
+		table->underflow[i] = rule_last - table->rules;
+	}
+
+	return table_check_hooks(table);
+}
+
+static struct rule *next_rule(const struct table *table, struct rule *rule)
+{
+	const uint32_t i = rule - table->rules;
+
+	if (table->num_rules <= i + 1)
+		return ERR_PTR(-EINVAL);
+
+	++rule;
+	rule->came_from = i;
+
+	return rule;
+}
+
+static struct rule *backtrack_rule(const struct table *table, struct rule *rule)
+{
+	uint32_t i = rule - table->rules;
+	int prev_i;
+
+	do {
+		rule->hook_mask ^= (1 << BPFILTER_INET_HOOK_MAX);
+		prev_i = i;
+		i = rule->came_from;
+		rule->came_from = 0;
+
+		if (i == prev_i)
+			return NULL;
+
+		rule = &table->rules[i];
+	} while (prev_i == i + 1);
+
+	return next_rule(table, rule);
+}
+
+static int table_check_chain(struct table *table, uint32_t hook, struct rule *rule)
+{
+	uint32_t i = rule - table->rules;
+
+	rule->came_from = i;
+
+	for (;;) {
+		bool visited;
+		int verdict;
+
+		if (!rule)
+			return 0;
+
+		if (IS_ERR(rule))
+			return PTR_ERR(rule);
+
+		i = rule - table->rules;
+
+		if (table->num_rules <= i)
+			return -EINVAL;
+
+		if (rule->hook_mask & (1 << BPFILTER_INET_HOOK_MAX))
+			return -EINVAL;
+
+		// already visited
+		visited = rule->hook_mask & (1 << hook);
+		rule->hook_mask |= (1 << hook) | (1 << BPFILTER_INET_HOOK_MAX);
+
+		if (visited) {
+			rule = backtrack_rule(table, rule);
+			continue;
+		}
+
+		if (!rule_has_standard_target(rule)) {
+			rule = next_rule(table, rule);
+			continue;
+		}
+
+		verdict = standard_target_verdict(rule->target.ipt_target);
+		if (verdict > 0) {
+			rule = table_find_rule_by_offset(table, verdict);
+			if (!rule)
+				return -EINVAL;
+
+			rule->came_from = i;
+			continue;
+		}
+
+		if (!is_rule_unconditional(rule)) {
+			rule = next_rule(table, rule);
+			continue;
+		}
+
+		rule = backtrack_rule(table, rule);
+	}
+
+	return 0;
+}
+
+static int table_check_chains(struct table *table)
+{
+	int i, err;
+
+	for (i = 0, err = 0; !err && i < BPFILTER_INET_HOOK_MAX; ++i) {
+		if (table->valid_hooks & (1 << i))
+			err = table_check_chain(table, i, &table->rules[table->hook_entry[i]]);
+	}
+
+	return err;
+}
+
+struct table *create_table(struct context *ctx, const struct bpfilter_ipt_replace *ipt_replace)
+{
+	struct table *table;
+	int err;
+
+	table = calloc(1, sizeof(*table));
+	if (!table)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&table->list);
+	snprintf(table->name, sizeof(table->name), "%s", ipt_replace->name);
+	table->valid_hooks = ipt_replace->valid_hooks;
+	table->num_rules = ipt_replace->num_entries;
+	table->num_counters = ipt_replace->num_counters;
+	table->size = ipt_replace->size;
+
+	err = table_init_rules(ctx, table, ipt_replace);
+	if (err)
+		goto err_free;
+
+	err = table_init_hooks(table, ipt_replace);
+	if (err)
+		goto err_free;
+
+	err = table_check_chains(table);
+	if (err)
+		goto err_free;
+
+	return table;
+
+err_free:
+	free_table(table);
+
+	return ERR_PTR(err);
+}
+
+void table_get_info(const struct table *table, struct bpfilter_ipt_get_info *info)
+{
+	int i;
+
+	snprintf(info->name, sizeof(info->name), "%s", table->name);
+	info->valid_hooks = table->valid_hooks;
+
+	for (i = 0; i < BPFILTER_INET_HOOK_MAX; ++i) {
+		const struct rule *rule_front, *rule_last;
+
+		if (!(table->valid_hooks & (1 << i))) {
+			info->hook_entry[i] = 0;
+			info->underflow[i] = 0;
+			continue;
+		}
+
+		rule_front = &table->rules[table->hook_entry[i]];
+		rule_last = &table->rules[table->underflow[i]];
+		info->hook_entry[i] = (const void *)rule_front->ipt_entry - table->entries;
+		info->underflow[i] = (const void *)rule_last->ipt_entry - table->entries;
+	}
+
+	info->num_entries = table->num_rules;
+	info->size = table->size;
+}
+
+void free_table(struct table *table)
+{
+	int i;
+
+	if (!table)
+		return;
+
+	list_del(&table->list);
+
+	if (table->rules) {
+		for (i = 0; i < table->num_rules; ++i)
+			free_rule(&table->rules[i]);
+		free(table->rules);
+	}
+
+	free(table->entries);
+	free(table);
+}
diff --git a/net/bpfilter/table.h b/net/bpfilter/table.h
new file mode 100644
index 000000000000..0f5d653c4460
--- /dev/null
+++ b/net/bpfilter/table.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_TABLE_H
+#define NET_BPFILTER_TABLE_H
+
+#include "../../include/uapi/linux/bpfilter.h"
+
+#include <search.h>
+#include <stdint.h>
+
+struct context;
+struct rule;
+
+struct table {
+	struct list_head list;
+	char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+	uint32_t valid_hooks;
+	uint32_t num_rules;
+	uint32_t num_counters;
+	uint32_t size;
+	uint32_t hook_entry[BPFILTER_INET_HOOK_MAX];
+	uint32_t underflow[BPFILTER_INET_HOOK_MAX];
+	struct rule *rules;
+	void *entries;
+};
+
+struct table_index {
+	struct hsearch_data map;
+	struct list_head list;
+};
+
+struct table *create_table(struct context *ctx, const struct bpfilter_ipt_replace *ipt_replace);
+void table_get_info(const struct table *table, struct bpfilter_ipt_get_info *info);
+void free_table(struct table *table);
+
+#endif // NET_BPFILTER_TABLE_H
diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
index 27a1ddcb6dc9..267a96a4c8a6 100644
--- a/tools/testing/selftests/bpf/bpfilter/Makefile
+++ b/tools/testing/selftests/bpf/bpfilter/Makefile
@@ -21,7 +21,7 @@ BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
 BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
 
 BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c \
-	$(BPFILTERSRCDIR)/rule.c
+	$(BPFILTERSRCDIR)/rule.c $(BPFILTERSRCDIR)/table.c
 BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
 
 $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 09/10] bpfilter: Add handling of setsockopt() calls
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (7 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 08/10] bpfilter: Add struct table Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-03 10:14 ` [PATCH bpf-next v1 10/10] bpfilter: Handle setsockopts Dmitrii Banshchikov
  2021-06-10  0:50 ` [PATCH bpf-next v1 00/10] bpfilter Yonghong Song
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Add support of iptables' setsockopt(2).
The parameters of a setsockopt(2) call are passed by struct mbox_request
which contains a type of the setsockopt(2) call and its memory buffer
description. The supplied memory buffer is read-written by
process_vm_readv(2)/process_vm_writev(2).

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/Makefile  |   2 +-
 net/bpfilter/sockopt.c | 409 +++++++++++++++++++++++++++++++++++++++++
 net/bpfilter/sockopt.h |  14 ++
 3 files changed, 424 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index e16fee837ca0..e5979159ce72 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
 #
 
 userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o map-common.o context.o match.o target.o rule.o table.o
+bpfilter_umh-objs := main.o map-common.o context.o match.o target.o rule.o table.o sockopt.o
 bpfilter_umh-objs += xt_udp.o
 userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
 
diff --git a/net/bpfilter/sockopt.c b/net/bpfilter/sockopt.c
new file mode 100644
index 000000000000..75f36cea4d11
--- /dev/null
+++ b/net/bpfilter/sockopt.c
@@ -0,0 +1,409 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#define _GNU_SOURCE
+
+#include "sockopt.h"
+
+#include <linux/err.h>
+#include <linux/list.h>
+
+#include <errno.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/uio.h>
+
+#include "context.h"
+#include "map-common.h"
+#include "match.h"
+#include "msgfmt.h"
+
+static int pvm_read(pid_t pid, void *to, const void *from, size_t count)
+{
+	const struct iovec l_iov = { .iov_base = to, .iov_len = count };
+	const struct iovec r_iov = { .iov_base = (void *)from, .iov_len = count };
+	ssize_t total_bytes;
+
+	total_bytes = process_vm_readv(pid, &l_iov, 1, &r_iov, 1, 0);
+	if (total_bytes == -1)
+		return -errno;
+
+	if (total_bytes != count)
+		return -EFAULT;
+
+	return 0;
+}
+
+static int pvm_read_from_offset(pid_t pid, void *to, const void *from, size_t offset, size_t count)
+{
+	return pvm_read(pid, to + offset, from + offset, count);
+}
+
+static int pvm_write(pid_t pid, void *to, const void *from, size_t count)
+{
+	const struct iovec l_iov = { .iov_base = (void *)from, .iov_len = count };
+	const struct iovec r_iov = { .iov_base = to, .iov_len = count };
+	ssize_t total_bytes;
+
+	total_bytes = process_vm_writev(pid, &l_iov, 1, &r_iov, 1, 0);
+	if (total_bytes == -1)
+		return -errno;
+
+	if (total_bytes != count)
+		return -EFAULT;
+
+	return 0;
+}
+
+static int read_ipt_get_info(const struct mbox_request *req, struct bpfilter_ipt_get_info *info)
+{
+	int err;
+
+	if (req->len != sizeof(*info))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, info, (const void *)req->addr, sizeof(*info));
+	if (err)
+		return err;
+
+	info->name[sizeof(info->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_info(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_info info;
+	struct table *table;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_INFO\n");
+
+	if (req->len != sizeof(info))
+		return -EINVAL;
+
+	err = read_ipt_get_info(req, &info);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_get_info: %s\n", strerror(-err));
+		return err;
+	}
+
+	table = map_find(&ctx->table_index.map, info.name);
+	if (IS_ERR(table)) {
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", info.name);
+		return -ENOENT;
+	}
+
+	table_get_info(table, &info);
+
+	return pvm_write(req->pid, (void *)req->addr, &info, sizeof(info));
+}
+
+static int read_ipt_get_entries(const struct mbox_request *req,
+				struct bpfilter_ipt_get_entries *entries)
+{
+	int err;
+
+	if (req->len < sizeof(*entries))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, entries, (const void *)req->addr, sizeof(*entries));
+	if (err)
+		return err;
+
+	entries->name[sizeof(entries->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_entries(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_entries get_entries, *entries;
+	struct table *table;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_ENTRIES\n");
+
+	err = read_ipt_get_entries(req, &get_entries);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_get_entries: %s\n", strerror(-err));
+		return err;
+	}
+
+	table = map_find(&ctx->table_index.map, get_entries.name);
+	if (IS_ERR(table)) {
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", get_entries.name);
+		return -ENOENT;
+	}
+
+	if (get_entries.size != table->size) {
+		BFLOG_DEBUG(ctx, "table '%s' get entries size mismatch\n", get_entries.name);
+		return -EINVAL;
+	}
+
+	entries = (struct bpfilter_ipt_get_entries *)req->addr;
+
+	err = pvm_write(req->pid, entries->name, table->name, sizeof(entries->name));
+	if (err)
+		return err;
+
+	err = pvm_write(req->pid, &entries->size, &table->size, sizeof(table->size));
+	if (err)
+		return err;
+
+	return pvm_write(req->pid, entries->entries, table->entries, table->size);
+}
+
+static int read_ipt_get_revision(const struct mbox_request *req,
+				 struct bpfilter_ipt_get_revision *revision)
+{
+	int err;
+
+	if (req->len != sizeof(*revision))
+		return -EINVAL;
+
+	err = pvm_read(req->pid, revision, (const void *)req->addr, sizeof(*revision));
+	if (err)
+		return err;
+
+	revision->name[sizeof(revision->name) - 1] = '\0';
+
+	return 0;
+}
+
+static int sockopt_get_revision_match(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_revision get_revision;
+	const struct match_ops *found;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_REVISION_MATCH\n");
+
+	err = read_ipt_get_revision(req, &get_revision);
+	if (err)
+		return err;
+
+	found = map_find(&ctx->match_ops_map, get_revision.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find match: '%s'\n", get_revision.name);
+		return PTR_ERR(found);
+	}
+
+	return found->revision;
+}
+
+static int sockopt_get_revision_target(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_get_revision get_revision;
+	const struct match_ops *found;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_GET_REVISION_TARGET\n");
+
+	err = read_ipt_get_revision(req, &get_revision);
+	if (err)
+		return err;
+
+	found = map_find(&ctx->target_ops_map, get_revision.name);
+	if (IS_ERR(found)) {
+		BFLOG_DEBUG(ctx, "cannot find target: '%s'\n", get_revision.name);
+		return PTR_ERR(found);
+	}
+
+	return found->revision;
+}
+
+static struct bpfilter_ipt_replace *read_ipt_replace(const struct mbox_request *req)
+{
+	struct bpfilter_ipt_replace ipt_header, *ipt_replace;
+	int err;
+
+	if (req->len < sizeof(ipt_header))
+		return ERR_PTR(-EINVAL);
+
+	err = pvm_read(req->pid, &ipt_header, (const void *)req->addr, sizeof(ipt_header));
+	if (err)
+		return ERR_PTR(err);
+
+	if (ipt_header.num_counters == 0)
+		return ERR_PTR(-EINVAL);
+
+	if (ipt_header.num_counters >= INT_MAX / sizeof(struct bpfilter_ipt_counters))
+		return ERR_PTR(-ENOMEM);
+
+	ipt_header.name[sizeof(ipt_header.name) - 1] = '\0';
+
+	ipt_replace = malloc(sizeof(ipt_header) + ipt_header.size);
+	if (!ipt_replace)
+		return ERR_PTR(-ENOMEM);
+
+	memcpy(ipt_replace, &ipt_header, sizeof(ipt_header));
+
+	err = pvm_read_from_offset(req->pid, ipt_replace, (const void *)req->addr,
+				   sizeof(ipt_header), ipt_header.size);
+	if (err) {
+		free(ipt_replace);
+		return ERR_PTR(err);
+	}
+
+	return ipt_replace;
+}
+
+static int sockopt_set_replace(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_replace *ipt_replace;
+	struct table *table, *new_table = NULL;
+	int err;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_SET_REPLACE\n");
+
+	ipt_replace = read_ipt_replace(req);
+	if (IS_ERR(ipt_replace)) {
+		BFLOG_DEBUG(ctx, "cannot read ipt_replace: %s\n", strerror(-PTR_ERR(ipt_replace)));
+		return PTR_ERR(ipt_replace);
+	}
+
+	table = map_find(&ctx->table_index.map, ipt_replace->name);
+	if (IS_ERR(table)) {
+		err = PTR_ERR(table);
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", ipt_replace->name);
+		goto cleanup;
+	}
+
+	new_table = create_table(ctx, ipt_replace);
+	if (IS_ERR(new_table)) {
+		err = PTR_ERR(new_table);
+		BFLOG_DEBUG(ctx, "cannot read table: %s\n", strerror(-PTR_ERR(new_table)));
+		goto cleanup;
+	}
+
+	// Here be codegen
+	// ...
+	//
+
+	err = map_update(&ctx->table_index.map, new_table->name, new_table);
+	if (err) {
+		BFLOG_DEBUG(ctx, "cannot update table map: %s\n", strerror(-err));
+		goto cleanup;
+	}
+
+	list_add_tail(&new_table->list, &ctx->table_index.list);
+	new_table = table;
+
+cleanup:
+	if (!IS_ERR(new_table))
+		free_table(new_table);
+
+	free(ipt_replace);
+
+	return err;
+}
+
+static struct bpfilter_ipt_counters_info *read_ipt_counters_info(const struct mbox_request *req)
+{
+	struct bpfilter_ipt_counters_info *info;
+	size_t size;
+	int err;
+
+	if (req->len < sizeof(*info))
+		return ERR_PTR(-EINVAL);
+
+	info = malloc(req->len);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	err = pvm_read(req->pid, info, (const void *)req->addr, sizeof(*info));
+	if (err)
+		goto err_free;
+
+	size = info->num_counters * sizeof(info->counters[0]);
+	if (req->len != sizeof(*info) + size) {
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	info->name[sizeof(info->name) - 1] = '\0';
+
+	err = pvm_read_from_offset(req->pid, info, (const void *)req->addr, sizeof(*info), size);
+	if (err)
+		goto err_free;
+
+	return info;
+
+err_free:
+	free(info);
+
+	return ERR_PTR(err);
+}
+
+static int sockopt_set_add_counters(struct context *ctx, const struct mbox_request *req)
+{
+	struct bpfilter_ipt_counters_info *info;
+	struct table *table;
+	int err = 0;
+
+	BFLOG_DEBUG(ctx, "handling IPT_SO_SET_ADD_COUNTERS\n");
+
+	info = read_ipt_counters_info(req);
+	if (IS_ERR(info)) {
+		err = PTR_ERR(info);
+		BFLOG_DEBUG(ctx, "cannot read ipt_counters_info: %s\n", strerror(-err));
+		goto err_free;
+	}
+
+	table = map_find(&ctx->table_index.map, info->name);
+	if (IS_ERR(table)) {
+		err = PTR_ERR(table);
+		BFLOG_DEBUG(ctx, "cannot find table: '%s'\n", info->name);
+		goto err_free;
+	}
+
+	// TODO handle counters
+
+err_free:
+	free(info);
+
+	return err;
+}
+
+static int handle_get_request(struct context *ctx, const struct mbox_request *req)
+{
+	switch (req->cmd) {
+	case 0:
+		return 0;
+	case BPFILTER_IPT_SO_GET_INFO:
+		return sockopt_get_info(ctx, req);
+	case BPFILTER_IPT_SO_GET_ENTRIES:
+		return sockopt_get_entries(ctx, req);
+	case BPFILTER_IPT_SO_GET_REVISION_MATCH:
+		return sockopt_get_revision_match(ctx, req);
+	case BPFILTER_IPT_SO_GET_REVISION_TARGET:
+		return sockopt_get_revision_target(ctx, req);
+	}
+
+	BFLOG_NOTICE(ctx, "Unexpected SO_GET request: %d\n", req->cmd);
+
+	return -ENOPROTOOPT;
+}
+
+static int handle_set_request(struct context *ctx, const struct mbox_request *req)
+{
+	switch (req->cmd) {
+	case BPFILTER_IPT_SO_SET_REPLACE:
+		return sockopt_set_replace(ctx, req);
+	case BPFILTER_IPT_SO_SET_ADD_COUNTERS:
+		return sockopt_set_add_counters(ctx, req);
+	}
+
+	BFLOG_NOTICE(ctx, "Unexpected SO_SET request: %d\n", req->cmd);
+
+	return -ENOPROTOOPT;
+}
+
+int handle_sockopt_request(struct context *ctx, const struct mbox_request *req)
+{
+	return req->is_set ? handle_set_request(ctx, req) : handle_get_request(ctx, req);
+}
diff --git a/net/bpfilter/sockopt.h b/net/bpfilter/sockopt.h
new file mode 100644
index 000000000000..711c2f295d89
--- /dev/null
+++ b/net/bpfilter/sockopt.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
+#ifndef NET_BPFILTER_SOCKOPT_H
+#define NET_BPFILTER_SOCKOPT_H
+
+struct context;
+struct mbox_request;
+
+int handle_sockopt_request(struct context *ctx, const struct mbox_request *req);
+
+#endif // NET_BPFILTER_SOCKOPT_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v1 10/10] bpfilter: Handle setsockopts
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (8 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 09/10] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
@ 2021-06-03 10:14 ` Dmitrii Banshchikov
  2021-06-10  0:50 ` [PATCH bpf-next v1 00/10] bpfilter Yonghong Song
  10 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 10:14 UTC (permalink / raw)
  To: bpf
  Cc: Dmitrii Banshchikov, ast, davem, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, netdev, rdna

Use earlier introduced infrastructure for and handle setsockopt(2)
calls.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
---
 net/bpfilter/main.c | 123 +++++++++++++++++++++++++++++---------------
 1 file changed, 81 insertions(+), 42 deletions(-)

diff --git a/net/bpfilter/main.c b/net/bpfilter/main.c
index 05e1cfc1e5cd..4c04898530cf 100644
--- a/net/bpfilter/main.c
+++ b/net/bpfilter/main.c
@@ -1,64 +1,103 @@
 // SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Telegram FZ-LLC
+ */
+
 #define _GNU_SOURCE
-#include <sys/uio.h>
+
+#include <unistd.h>
+#include <errno.h>
+
 #include <errno.h>
 #include <stdio.h>
-#include <sys/socket.h>
-#include <fcntl.h>
+#include <string.h>
 #include <unistd.h>
-#include "../../include/uapi/linux/bpf.h"
-#include <asm/unistd.h>
+
+#include "context.h"
 #include "msgfmt.h"
+#include "sockopt.h"
 
-FILE *debug_f;
+#define do_exact(fd, op, buffer, count)                                                            \
+	({                                                                                         \
+		size_t total = 0;                                                                  \
+		int err = 0;                                                                       \
+												   \
+		do {                                                                               \
+			const ssize_t part = op(fd, (buffer) + total, (count) - total);            \
+			if (part > 0) {                                                            \
+				total += part;                                                     \
+			} else if (part == 0 && (count) > 0) {                                     \
+				err = -EIO;                                                        \
+				break;                                                             \
+			} else if (part == -1) {                                                   \
+				if (errno == EINTR)                                                \
+					continue;                                                  \
+				err = -errno;                                                      \
+				break;                                                             \
+			}                                                                          \
+		} while (total < (count));                                                         \
+												   \
+		err;                                                                               \
+	})
 
-static int handle_get_cmd(struct mbox_request *cmd)
+static int read_exact(int fd, void *buffer, size_t count)
 {
-	switch (cmd->cmd) {
-	case 0:
-		return 0;
-	default:
-		break;
-	}
-	return -ENOPROTOOPT;
+	return do_exact(fd, read, buffer, count);
 }
 
-static int handle_set_cmd(struct mbox_request *cmd)
+static int write_exact(int fd, const void *buffer, size_t count)
 {
-	return -ENOPROTOOPT;
+	return do_exact(fd, write, buffer, count);
+}
+
+static int setup_context(struct context *ctx)
+{
+	ctx->log_file = fopen("/dev/kmsg", "w");
+	if (!ctx->log_file)
+		return -errno;
+
+	setvbuf(ctx->log_file, 0, _IOLBF, 0);
+
+	return 0;
 }
 
-static void loop(void)
+static void loop(struct context *ctx)
 {
-	while (1) {
-		struct mbox_request req;
-		struct mbox_reply reply;
-		int n;
-
-		n = read(0, &req, sizeof(req));
-		if (n != sizeof(req)) {
-			fprintf(debug_f, "invalid request %d\n", n);
-			return;
-		}
-
-		reply.status = req.is_set ?
-			handle_set_cmd(&req) :
-			handle_get_cmd(&req);
-
-		n = write(1, &reply, sizeof(reply));
-		if (n != sizeof(reply)) {
-			fprintf(debug_f, "reply failed %d\n", n);
-			return;
-		}
+	struct mbox_request req;
+	struct mbox_reply reply;
+	int err;
+
+	for (;;) {
+		err = read_exact(STDIN_FILENO, &req, sizeof(req));
+		if (err)
+			BFLOG_EMERG(ctx, "cannot read request: %s\n", strerror(-err));
+
+		reply.status = handle_sockopt_request(ctx, &req);
+
+		err = write_exact(STDOUT_FILENO, &reply, sizeof(reply));
+		if (err)
+			BFLOG_EMERG(ctx, "cannot write reply: %s\n", strerror(-err));
 	}
 }
 
 int main(void)
 {
-	debug_f = fopen("/dev/kmsg", "w");
-	setvbuf(debug_f, 0, _IOLBF, 0);
-	fprintf(debug_f, "Started bpfilter\n");
-	loop();
-	fclose(debug_f);
+	struct context ctx;
+	int err;
+
+	err = create_context(&ctx);
+	if (err)
+		return err;
+
+	err = setup_context(&ctx);
+	if (err) {
+		free_context(&ctx);
+		return err;
+	}
+
+	BFLOG_NOTICE(&ctx, "started\n");
+
+	loop(&ctx);
+
 	return 0;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 06/10] bpfilter: Add struct target
  2021-06-03 10:14 ` [PATCH bpf-next v1 06/10] bpfilter: Add struct target Dmitrii Banshchikov
@ 2021-06-03 16:47   ` Dmitrii Banshchikov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-03 16:47 UTC (permalink / raw)
  To: bpf
  Cc: ast, davem, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, netdev, rdna

On Thu, Jun 03, 2021 at 02:14:21PM +0400, Dmitrii Banshchikov wrote:
> struct target_ops defines polymorphic interface for targets. A target
> consists of pointers to struct target_ops and struct xt_entry_target
> which contains a payload for the target's type.
> 
> All target_ops are kept in a map by their name.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> ---
>  .clang-format                                 |   2 +-
>  net/bpfilter/Makefile                         |   2 +-
>  net/bpfilter/context.c                        |  36 +++++-
>  net/bpfilter/context.h                        |   1 +
>  net/bpfilter/target.c                         | 118 ++++++++++++++++++
>  net/bpfilter/target.h                         |  49 ++++++++
>  .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
>  tools/testing/selftests/bpf/bpfilter/Makefile |   7 +-
>  .../selftests/bpf/bpfilter/bpfilter_util.h    |  31 +++++
>  .../selftests/bpf/bpfilter/test_target.c      |  85 +++++++++++++
>  10 files changed, 327 insertions(+), 5 deletions(-)
>  create mode 100644 net/bpfilter/target.c
>  create mode 100644 net/bpfilter/target.h
>  create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
>  create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c
> 
> diff --git a/.clang-format b/.clang-format
> index c24b147cac01..3212542df113 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -52,7 +52,7 @@ BreakConstructorInitializersBeforeComma: false
>  #BreakConstructorInitializers: BeforeComma # Unknown to clang-format-4.0
>  BreakAfterJavaFieldAnnotations: false
>  BreakStringLiterals: false
> -ColumnLimit: 80
> +ColumnLimit: 100

Obviously this is not intended to be here.


>  CommentPragmas: '^ IWYU pragma:'
>  #CompactNamespaces: false # Unknown to clang-format-4.0
>  ConstructorInitializerAllOnOneLineOrOnePerLine: false
> diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
> index 59f2d35c1627..031c9dd40d2d 100644
> --- a/net/bpfilter/Makefile
> +++ b/net/bpfilter/Makefile
> @@ -4,7 +4,7 @@
>  #
>  
>  userprogs := bpfilter_umh
> -bpfilter_umh-objs := main.o map-common.o match.o context.o
> +bpfilter_umh-objs := main.o map-common.o context.o match.o target.o
>  bpfilter_umh-objs += xt_udp.o
>  userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi
>  
> diff --git a/net/bpfilter/context.c b/net/bpfilter/context.c
> index 6b6203dd22a7..6e186399609e 100644
> --- a/net/bpfilter/context.c
> +++ b/net/bpfilter/context.c
> @@ -12,6 +12,7 @@
>  
>  #include "map-common.h"
>  #include "match.h"
> +#include "target.h"
>  
>  static int init_match_ops_map(struct context *ctx)
>  {
> @@ -33,12 +34,45 @@ static int init_match_ops_map(struct context *ctx)
>  	return 0;
>  }
>  
> +static int init_target_ops_map(struct context *ctx)
> +{
> +	const struct target_ops *target_ops[] = { &standard_target_ops, &error_target_ops };
> +	int i, err;
> +
> +	err = create_map(&ctx->target_ops_map, ARRAY_SIZE(target_ops));
> +	if (err)
> +		return err;
> +
> +	for (i = 0; i < ARRAY_SIZE(target_ops); ++i) {
> +		const struct target_ops *t = target_ops[i];
> +
> +		err = map_insert(&ctx->target_ops_map, t->name, (void *)t);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}
> +
>  int create_context(struct context *ctx)
>  {
> -	return init_match_ops_map(ctx);
> +	int err;
> +
> +	err = init_match_ops_map(ctx);
> +	if (err)
> +		return err;
> +
> +	err = init_target_ops_map(ctx);
> +	if (err) {
> +		free_map(&ctx->match_ops_map);
> +		return err;
> +	}
> +
> +	return 0;
>  }
>  
>  void free_context(struct context *ctx)
>  {
>  	free_map(&ctx->match_ops_map);
> +	free_map(&ctx->target_ops_map);
>  }
> diff --git a/net/bpfilter/context.h b/net/bpfilter/context.h
> index 60bb525843b0..ed268259adcc 100644
> --- a/net/bpfilter/context.h
> +++ b/net/bpfilter/context.h
> @@ -15,6 +15,7 @@
>  struct context {
>  	FILE *log_file;
>  	struct hsearch_data match_ops_map;
> +	struct hsearch_data target_ops_map;
>  };
>  
>  #define BFLOG_IMPL(ctx, level, fmt, ...)                                                           \
> diff --git a/net/bpfilter/target.c b/net/bpfilter/target.c
> new file mode 100644
> index 000000000000..f87ef719ea4d
> --- /dev/null
> +++ b/net/bpfilter/target.c
> @@ -0,0 +1,118 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include "target.h"
> +
> +#include <linux/err.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "context.h"
> +#include "map-common.h"
> +
> +static const struct target_ops *target_ops_map_find(struct hsearch_data *map, const char *name)
> +{
> +	const size_t namelen = strnlen(name, BPFILTER_EXTENSION_MAXNAMELEN);
> +
> +	if (namelen < BPFILTER_EXTENSION_MAXNAMELEN)
> +		return map_find(map, name);
> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static int standard_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> +{
> +	const struct bpfilter_ipt_standard_target *standard_target;
> +
> +	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
> +
> +	// Positive values of verdict denote a jump offset into a blob.
> +	if (standard_target->verdict > 0)
> +		return 0;
> +
> +	// Special values like ACCEPT, DROP, RETURN are encoded as negative values.
> +	if (standard_target->verdict < 0) {
> +		if (standard_target->verdict == BPFILTER_RETURN)
> +			return 0;
> +
> +		switch (convert_verdict(standard_target->verdict)) {
> +		case BPFILTER_NF_ACCEPT:
> +		case BPFILTER_NF_DROP:
> +		case BPFILTER_NF_QUEUE:
> +			return 0;
> +		}
> +	}
> +
> +	BFLOG_DEBUG(ctx, "invalid verdict: %d\n", standard_target->verdict);
> +
> +	return -EINVAL;
> +}
> +
> +const struct target_ops standard_target_ops = {
> +	.name = "",
> +	.revision = 0,
> +	.size = sizeof(struct xt_standard_target),
> +	.check = standard_target_check,
> +};
> +
> +static int error_target_check(struct context *ctx, const struct bpfilter_ipt_target *ipt_target)
> +{
> +	const struct bpfilter_ipt_error_target *error_target;
> +	size_t maxlen;
> +
> +	error_target = (const struct bpfilter_ipt_error_target *)&ipt_target;
> +	maxlen = sizeof(error_target->error_name);
> +	if (strnlen(error_target->error_name, maxlen) == maxlen) {
> +		BFLOG_DEBUG(ctx, "cannot check error target: too long errorname\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +const struct target_ops error_target_ops = {
> +	.name = "ERROR",
> +	.revision = 0,
> +	.size = sizeof(struct xt_error_target),
> +	.check = error_target_check,
> +};
> +
> +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> +		struct target *target)
> +{
> +	const size_t maxlen = sizeof(ipt_target->u.user.name);
> +	const struct target_ops *found;
> +	int err;
> +
> +	if (strnlen(ipt_target->u.user.name, maxlen) == maxlen) {
> +		BFLOG_DEBUG(ctx, "cannot init target: too long target name\n");
> +		return -EINVAL;
> +	}
> +
> +	found = target_ops_map_find(&ctx->target_ops_map, ipt_target->u.user.name);
> +	if (IS_ERR(found)) {
> +		BFLOG_DEBUG(ctx, "cannot find target by name: '%s'\n", ipt_target->u.user.name);
> +		return PTR_ERR(found);
> +	}
> +
> +	if (found->size != ipt_target->u.target_size ||
> +	    found->revision != ipt_target->u.user.revision) {
> +		BFLOG_DEBUG(ctx, "invalid target: '%s'\n", ipt_target->u.user.name);
> +		return -EINVAL;
> +	}
> +
> +	err = found->check(ctx, ipt_target);
> +	if (err)
> +		return err;
> +
> +	target->target_ops = found;
> +	target->ipt_target = ipt_target;
> +
> +	return 0;
> +}
> diff --git a/net/bpfilter/target.h b/net/bpfilter/target.h
> new file mode 100644
> index 000000000000..e27816d73cc2
> --- /dev/null
> +++ b/net/bpfilter/target.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 Telegram FZ-LLC
> + */
> +
> +#ifndef NET_BPFILTER_TARGET_H
> +#define NET_BPFILTER_TARGET_H
> +
> +#include "../../include/uapi/linux/bpfilter.h"
> +
> +#include <stdint.h>
> +
> +struct context;
> +struct target_ops_map;
> +
> +struct target_ops {
> +	char name[BPFILTER_EXTENSION_MAXNAMELEN];
> +	uint8_t revision;
> +	uint16_t size;
> +	int (*check)(struct context *ctx, const struct bpfilter_ipt_target *ipt_target);
> +};
> +
> +struct target {
> +	const struct target_ops *target_ops;
> +	const struct bpfilter_ipt_target *ipt_target;
> +};
> +
> +extern const struct target_ops standard_target_ops;
> +extern const struct target_ops error_target_ops;
> +
> +/* Restore verdict's special value(ACCEPT, DROP, etc.) from its negative representation. */
> +static inline int convert_verdict(int verdict)
> +{
> +	return -verdict - 1;
> +}
> +
> +static inline int standard_target_verdict(const struct bpfilter_ipt_target *ipt_target)
> +{
> +	const struct bpfilter_ipt_standard_target *standard_target;
> +
> +	standard_target = (const struct bpfilter_ipt_standard_target *)ipt_target;
> +
> +	return standard_target->verdict;
> +}
> +
> +int init_target(struct context *ctx, const struct bpfilter_ipt_target *ipt_target,
> +		struct target *target);
> +
> +#endif // NET_BPFILTER_TARGET_H
> diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
> index 9250411fa7aa..7e077f506af1 100644
> --- a/tools/testing/selftests/bpf/bpfilter/.gitignore
> +++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
> @@ -1,3 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  test_map
>  test_match
> +test_target
> diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
> index 0ef156cdb198..a11775e8b5af 100644
> --- a/tools/testing/selftests/bpf/bpfilter/Makefile
> +++ b/tools/testing/selftests/bpf/bpfilter/Makefile
> @@ -10,15 +10,18 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
>  
>  TEST_GEN_PROGS += test_map
>  TEST_GEN_PROGS += test_match
> +TEST_GEN_PROGS += test_target
>  
>  KSFT_KHDR_INSTALL := 1
>  
>  include ../../lib.mk
>  
>  BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
> +BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
>  
>  BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
> -BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS)
> +BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
>  
>  $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
> -$(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS) $(BPFILTER_MATCH_SRCS)
> +$(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS)
> +$(OUTPUT)/test_target: test_target.c $(BPFILTER_COMMON_SRCS)
> diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> new file mode 100644
> index 000000000000..d82ff86f280e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef BPFILTER_UTIL_H
> +#define BPFILTER_UTIL_H
> +
> +#include <linux/bpfilter.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +#include <stdio.h>
> +
> +static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
> +					int verdict)
> +{
> +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> +		 BPFILTER_STANDARD_TARGET);
> +	ipt_target->target.u.user.revision = revision;
> +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> +	ipt_target->verdict = verdict;
> +}
> +
> +static inline void init_error_target(struct xt_error_target *ipt_target, int revision,
> +				     const char *error_name)
> +{
> +	snprintf(ipt_target->target.u.user.name, sizeof(ipt_target->target.u.user.name), "%s",
> +		 BPFILTER_ERROR_TARGET);
> +	ipt_target->target.u.user.revision = revision;
> +	ipt_target->target.u.user.target_size = sizeof(*ipt_target);
> +	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
> +}
> +
> +#endif // BPFILTER_UTIL_H
> diff --git a/tools/testing/selftests/bpf/bpfilter/test_target.c b/tools/testing/selftests/bpf/bpfilter/test_target.c
> new file mode 100644
> index 000000000000..6765497b53c4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/bpfilter/test_target.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#define _GNU_SOURCE
> +
> +#include "context.h"
> +#include "target.h"
> +
> +#include <linux/bpfilter.h>
> +#include <linux/err.h>
> +
> +#include <linux/netfilter/x_tables.h>
> +#include <linux/netfilter_ipv4/ip_tables.h>
> +
> +#include "../../kselftest_harness.h"
> +
> +#include "bpfilter_util.h"
> +
> +FIXTURE(test_standard_target)
> +{
> +	struct context ctx;
> +	struct xt_standard_target ipt_target;
> +	struct target target;
> +};
> +
> +FIXTURE_VARIANT(test_standard_target)
> +{
> +	int verdict;
> +};
> +
> +FIXTURE_VARIANT_ADD(test_standard_target, accept) {
> +	.verdict = -BPFILTER_NF_ACCEPT - 1,
> +};
> +
> +FIXTURE_VARIANT_ADD(test_standard_target, drop) {
> +	.verdict = -BPFILTER_NF_DROP - 1,
> +};
> +
> +FIXTURE_SETUP(test_standard_target)
> +{
> +	ASSERT_EQ(0, create_context(&self->ctx));
> +	self->ctx.log_file = stderr;
> +
> +	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
> +	init_standard_target(&self->ipt_target, 0, variant->verdict);
> +}
> +
> +FIXTURE_TEARDOWN(test_standard_target)
> +{
> +	free_context(&self->ctx);
> +}
> +
> +TEST_F(test_standard_target, init)
> +{
> +	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
> +				 &self->target));
> +}
> +
> +FIXTURE(test_error_target)
> +{
> +	struct context ctx;
> +	struct xt_error_target ipt_target;
> +	struct target target;
> +};
> +
> +FIXTURE_SETUP(test_error_target)
> +{
> +	ASSERT_EQ(0, create_context(&self->ctx));
> +	self->ctx.log_file = stderr;
> +
> +	memset(&self->ipt_target, 0, sizeof(self->ipt_target));
> +	init_error_target(&self->ipt_target, 0, "x");
> +}
> +
> +FIXTURE_TEARDOWN(test_error_target)
> +{
> +	free_context(&self->ctx);
> +}
> +
> +TEST_F(test_error_target, init)
> +{
> +	ASSERT_EQ(0, init_target(&self->ctx, (const struct bpfilter_ipt_target *)&self->ipt_target,
> +				 &self->target));
> +}
> +
> +TEST_HARNESS_MAIN
> -- 
> 2.25.1
> 

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header
  2021-06-03 10:14 ` [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
@ 2021-06-08 16:20   ` Yonghong Song
  2021-06-09 10:07     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2021-06-08 16:20 UTC (permalink / raw)
  To: Dmitrii Banshchikov, bpf
  Cc: ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna



On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> The header will be used in bpfilter usermode helper test infrastructure.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> ---
>   tools/include/uapi/linux/bpfilter.h | 179 ++++++++++++++++++++++++++++
>   1 file changed, 179 insertions(+)
>   create mode 100644 tools/include/uapi/linux/bpfilter.h
> 
> diff --git a/tools/include/uapi/linux/bpfilter.h b/tools/include/uapi/linux/bpfilter.h
> new file mode 100644
> index 000000000000..8b49d81f81c8
> --- /dev/null
> +++ b/tools/include/uapi/linux/bpfilter.h
> @@ -0,0 +1,179 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +#ifndef _UAPI_LINUX_BPFILTER_H
> +#define _UAPI_LINUX_BPFILTER_H
> +
> +#include <linux/if.h>
> +#include <linux/const.h>
> +
> +#define BPFILTER_FUNCTION_MAXNAMELEN    30
> +#define BPFILTER_EXTENSION_MAXNAMELEN   29
> +
> +#define BPFILTER_STANDARD_TARGET        ""
> +#define BPFILTER_ERROR_TARGET           "ERROR"
> +
> +
> +#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))

The difference between include/uapi/linux/bpfilter.h and
tools/include/uapi/linux/bpfilter.h is the above "define".
Can we put the above define in include/uapi/linux/bpfilter.h as well
so in the commit message we can say tools/include/uapi/linux/bpfilter.h
is a copy of include/uapi/linux/bpfilter.h?

> +
> +enum {
> +	BPFILTER_IPT_SO_SET_REPLACE = 64,
> +	BPFILTER_IPT_SO_SET_ADD_COUNTERS = 65,
> +	BPFILTER_IPT_SET_MAX,
> +};
> +
[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header
  2021-06-08 16:20   ` Yonghong Song
@ 2021-06-09 10:07     ` Dmitrii Banshchikov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-09 10:07 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna

On Tue, Jun 08, 2021 at 09:20:12AM -0700, Yonghong Song wrote:
> 
> 
> On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> > The header will be used in bpfilter usermode helper test infrastructure.
> > 
> > Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> > ---
> >   tools/include/uapi/linux/bpfilter.h | 179 ++++++++++++++++++++++++++++
> >   1 file changed, 179 insertions(+)
> >   create mode 100644 tools/include/uapi/linux/bpfilter.h
> > 
> > diff --git a/tools/include/uapi/linux/bpfilter.h b/tools/include/uapi/linux/bpfilter.h
> > new file mode 100644
> > index 000000000000..8b49d81f81c8
> > --- /dev/null
> > +++ b/tools/include/uapi/linux/bpfilter.h
> > @@ -0,0 +1,179 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +#ifndef _UAPI_LINUX_BPFILTER_H
> > +#define _UAPI_LINUX_BPFILTER_H
> > +
> > +#include <linux/if.h>
> > +#include <linux/const.h>
> > +
> > +#define BPFILTER_FUNCTION_MAXNAMELEN    30
> > +#define BPFILTER_EXTENSION_MAXNAMELEN   29
> > +
> > +#define BPFILTER_STANDARD_TARGET        ""
> > +#define BPFILTER_ERROR_TARGET           "ERROR"
> > +
> > +
> > +#define BPFILTER_ALIGN(__X) __ALIGN_KERNEL(__X, __alignof__(__u64))
> 
> The difference between include/uapi/linux/bpfilter.h and
> tools/include/uapi/linux/bpfilter.h is the above "define".
> Can we put the above define in include/uapi/linux/bpfilter.h as well
> so in the commit message we can say tools/include/uapi/linux/bpfilter.h
> is a copy of include/uapi/linux/bpfilter.h?

Actually it seems that it is possible to drop this define as
XT_ALIGN is used now instead of BPFILTER_ALIGN.
I will remove the define and reword the message.
Thank you.

> 
> > +
> > +enum {
> > +	BPFILTER_IPT_SO_SET_REPLACE = 64,
> > +	BPFILTER_IPT_SO_SET_ADD_COUNTERS = 65,
> > +	BPFILTER_IPT_SET_MAX,
> > +};
> > +
> [...]

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 07/10] bpfilter: Add struct rule
  2021-06-03 10:14 ` [PATCH bpf-next v1 07/10] bpfilter: Add struct rule Dmitrii Banshchikov
@ 2021-06-10  0:30   ` Yonghong Song
  2021-06-10 13:16     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2021-06-10  0:30 UTC (permalink / raw)
  To: Dmitrii Banshchikov, bpf
  Cc: ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna



On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> struct rule is an equivalent of struct ipt_entry. A rule consists of
> zero or more matches and a target. A rule has a pointer to its ipt_entry
> in entries blob.  struct rule should simplify iteration over a blob and
> avoid blob's guts in code generation.
> 
> Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> ---
>   net/bpfilter/Makefile                         |   2 +-
>   net/bpfilter/rule.c                           | 163 ++++++++++++++++++
>   net/bpfilter/rule.h                           |  32 ++++
>   .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
>   tools/testing/selftests/bpf/bpfilter/Makefile |   5 +-
>   .../selftests/bpf/bpfilter/bpfilter_util.h    |   8 +
>   .../selftests/bpf/bpfilter/test_rule.c        |  55 ++++++
>   7 files changed, 264 insertions(+), 2 deletions(-)
>   create mode 100644 net/bpfilter/rule.c
>   create mode 100644 net/bpfilter/rule.h
>   create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
> 
[...]
> +
> +bool rule_has_standard_target(const struct rule *rule);
> +bool is_rule_unconditional(const struct rule *rule);
> +int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule);
> +void free_rule(struct rule *rule);
> +
> +#endif // NET_BPFILTER_RULE_H
> diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
> index 7e077f506af1..4d7c5083d980 100644
> --- a/tools/testing/selftests/bpf/bpfilter/.gitignore
> +++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
> @@ -2,3 +2,4 @@
>   test_map
>   test_match
>   test_target
> +test_rule
> diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
> index a11775e8b5af..27a1ddcb6dc9 100644
> --- a/tools/testing/selftests/bpf/bpfilter/Makefile
> +++ b/tools/testing/selftests/bpf/bpfilter/Makefile
> @@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
>   TEST_GEN_PROGS += test_map
>   TEST_GEN_PROGS += test_match
>   TEST_GEN_PROGS += test_target
> +TEST_GEN_PROGS += test_rule
>   
>   KSFT_KHDR_INSTALL := 1
>   
> @@ -19,9 +20,11 @@ include ../../lib.mk
>   BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
>   BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
>   
> -BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
> +BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c \
> +	$(BPFILTERSRCDIR)/rule.c
>   BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
>   
>   $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
>   $(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS)
>   $(OUTPUT)/test_target: test_target.c $(BPFILTER_COMMON_SRCS)
> +$(OUTPUT)/test_rule: test_rule.c $(BPFILTER_COMMON_SRCS)
> diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> index d82ff86f280e..55fb0e959fca 100644
> --- a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> +++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> @@ -7,6 +7,7 @@
>   #include <linux/netfilter/x_tables.h>
>   
>   #include <stdio.h>
> +#include <string.h>
>   
>   static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
>   					int verdict)
> @@ -28,4 +29,11 @@ static inline void init_error_target(struct xt_error_target *ipt_target, int rev
>   	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
>   }
>   
> +static inline void init_standard_entry(struct ipt_entry *entry, __u16 matches_size)
> +{
> +	memset(entry, 0, sizeof(*entry));
> +	entry->target_offset = sizeof(*entry) + matches_size;
> +	entry->next_offset = sizeof(*entry) + matches_size + sizeof(struct xt_standard_target);
> +}
> +
>   #endif // BPFILTER_UTIL_H
> diff --git a/tools/testing/selftests/bpf/bpfilter/test_rule.c b/tools/testing/selftests/bpf/bpfilter/test_rule.c
> new file mode 100644
> index 000000000000..fe12adf32fe5
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/bpfilter/test_rule.c
> @@ -0,0 +1,55 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#define _GNU_SOURCE
> +
> +#include "rule.h"
> +
> +#include <linux/bpfilter.h>
> +#include <linux/err.h>
> +
> +#include <linux/netfilter_ipv4/ip_tables.h>
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +
> +#include "../../kselftest_harness.h"
> +
> +#include "context.h"
> +#include "rule.h"
> +
> +#include "bpfilter_util.h"
> +
> +FIXTURE(test_standard_rule)
> +{
> +	struct context ctx;
> +	struct {
> +		struct ipt_entry entry;
> +		struct xt_standard_target target;
> +	} entry;
> +	struct rule rule;
> +};
> +
> +FIXTURE_SETUP(test_standard_rule)
> +{
> +	const int verdict = BPFILTER_NF_ACCEPT;
> +
> +	ASSERT_EQ(create_context(&self->ctx), 0);
> +	self->ctx.log_file = stderr;
> +
> +	init_standard_entry(&self->entry.entry, 0);
> +	init_standard_target(&self->entry.target, 0, -verdict - 1);
> +}
> +
> +FIXTURE_TEARDOWN(test_standard_rule)
> +{
> +	free_rule(&self->rule);
> +	free_context(&self->ctx);
> +}
> +
> +TEST_F(test_standard_rule, init)
> +{
> +	ASSERT_EQ(0, init_rule(&self->ctx, (const struct bpfilter_ipt_entry *)&self->entry.entry,
> +			       &self->rule));
> +}
> +
> +TEST_HARNESS_MAIN

When compiling selftests/bpf/bpfilter, I got the following compilation 
warning:

gcc -Wall -g -pthread -I/home/yhs/work/bpf-next/tools/include 
-I/home/yhs/work/bpf-next/tools/include/uapi 
-I../../../../../net/bpfilter    test_rule.c 
../../../../../net/bpfilter/map-common.c 
../../../../../net/bpfilter/context.c ../../../../../net/bpfilter/rule.c 
../../../../../net/bpfilter/table.c ../../../../../net/bpfilter/match.c 
../../../../../net/bpfilter/xt_udp.c 
../../../../../net/bpfilter/target.c  -o 
/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpfilter/test_rule
In file included from test_rule.c:15:
../../kselftest_harness.h:674: warning: "ARRAY_SIZE" redefined
  #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))

In file included from /usr/include/linux/netfilter/x_tables.h:4,
                  from /usr/include/linux/netfilter_ipv4/ip_tables.h:24,
                  from test_rule.c:10:
/home/yhs/work/bpf-next/tools/include/linux/kernel.h:105: note: this is 
the location of the previous definition
  #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
__must_be_array(arr))


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 00/10] bpfilter
  2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
                   ` (9 preceding siblings ...)
  2021-06-03 10:14 ` [PATCH bpf-next v1 10/10] bpfilter: Handle setsockopts Dmitrii Banshchikov
@ 2021-06-10  0:50 ` Yonghong Song
  2021-06-10 13:36   ` Dmitrii Banshchikov
  10 siblings, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2021-06-10  0:50 UTC (permalink / raw)
  To: Dmitrii Banshchikov, bpf
  Cc: ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna



On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> The patchset is based on the patches from David S. Miller [1] and
> Daniel Borkmann [2].
> 
> The main goal of the patchset is to prepare bpfilter for
> iptables' configuration blob parsing and code generation.
> 
> The patchset introduces data structures and code for matches,
> targets, rules and tables.
> 
> The current version misses handling of counters. Postpone its
> implementation until the code generation phase as it's not clear
> yet how to better handle them.
> 
> Beside that there is no support of net namespaces at all.
> 
> In the next iteration basic code generation shall be introduced.
> 
> The rough plan for the code generation.
> 
> It seems reasonable to assume that the first rules should cover
> most of the packet flow.  This is why they are critical from the
> performance point of view.  At the same time number of user
> defined rules might be pretty large. Also there is a limit on
> size and complexity of a BPF program introduced by the verifier.
> 
> There are two approaches how to handle iptables' rules in
> generated BPF programs.
> 
> The first approach is to generate a BPF program that is an
> equivalent to a set of rules on a rule by rule basis. This
> approach should give the best performance. The drawback is the
> limitation from the verifier on size and complexity of BPF
> program.
> 
> The second approach is to use an internal representation of rules
> stored in a BPF map and use bpf_for_each_map_elem() helper to
> iterate over them. In this case the helper's callback is a BPF
> function that is able to process any valid rule.
> 
> Combination of the two approaches should give most of the
> benefits - a heuristic should help to select a small subset of
> the rules for code generation on a rule by rule basis. All other
> rules are cold and it should be possible to store them in an
> internal form in a BPF map. The rules will be handled by
> bpf_for_each_map_elem().  This should remove the limit on the
> number of supported rules.

Agree. A bpf program inlines some hot rule handling and put
the rest in for_each_map_elem() sounds reasonable to me.

> 
> During development it was useful to use statically linked
> sanitizers in bpfilter usermode helper. Also it is possible to
> use fuzzers but it's not clear if it is worth adding them to the
> test infrastructure - because there are no other fuzzers under
> tools/testing/selftests currently.
> 
> Patch 1 adds definitions of the used types.
> Patch 2 adds logging to bpfilter.
> Patch 3 adds bpfilter header to tools
> Patch 4 adds an associative map.
> Patches 5/6/7/8 add code for matches, targets, rules and table.
> Patch 9 handles hooked setsockopt(2) calls.
> Patch 10 uses prepared code in main().
> 
> Here is an example:
> % dmesg  | tail -n 2
> [   23.636102] bpfilter: Loaded bpfilter_umh pid 181
> [   23.658529] bpfilter: started
> % /usr/sbin/iptables-legacy -L -n

So this /usr/sbin/iptables-legacy is your iptables variant to
translate iptable command lines to BPFILTER_IPT_SO_*,
right? It could be good to provide a pointer to the source
or binary so people can give a try.

I am not an expert in iptables. Reading codes, I kind of
can grasp the high-level ideas of the patch, but probably
Alexei or Daniel can review some details whether the
design is sufficient to be an iptable replacement.


> Chain INPUT (policy ACCEPT)
> target     prot opt source               destination
> 
> Chain FORWARD (policy ACCEPT)
> target     prot opt source               destination
> 
[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 07/10] bpfilter: Add struct rule
  2021-06-10  0:30   ` Yonghong Song
@ 2021-06-10 13:16     ` Dmitrii Banshchikov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-10 13:16 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna

On Wed, Jun 09, 2021 at 05:30:56PM -0700, Yonghong Song wrote:
> 
> 
> On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> > struct rule is an equivalent of struct ipt_entry. A rule consists of
> > zero or more matches and a target. A rule has a pointer to its ipt_entry
> > in entries blob.  struct rule should simplify iteration over a blob and
> > avoid blob's guts in code generation.
> > 
> > Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
> > ---
> >   net/bpfilter/Makefile                         |   2 +-
> >   net/bpfilter/rule.c                           | 163 ++++++++++++++++++
> >   net/bpfilter/rule.h                           |  32 ++++
> >   .../testing/selftests/bpf/bpfilter/.gitignore |   1 +
> >   tools/testing/selftests/bpf/bpfilter/Makefile |   5 +-
> >   .../selftests/bpf/bpfilter/bpfilter_util.h    |   8 +
> >   .../selftests/bpf/bpfilter/test_rule.c        |  55 ++++++
> >   7 files changed, 264 insertions(+), 2 deletions(-)
> >   create mode 100644 net/bpfilter/rule.c
> >   create mode 100644 net/bpfilter/rule.h
> >   create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
> > 
> [...]
> > +
> > +bool rule_has_standard_target(const struct rule *rule);
> > +bool is_rule_unconditional(const struct rule *rule);
> > +int init_rule(struct context *ctx, const struct bpfilter_ipt_entry *ipt_entry, struct rule *rule);
> > +void free_rule(struct rule *rule);
> > +
> > +#endif // NET_BPFILTER_RULE_H
> > diff --git a/tools/testing/selftests/bpf/bpfilter/.gitignore b/tools/testing/selftests/bpf/bpfilter/.gitignore
> > index 7e077f506af1..4d7c5083d980 100644
> > --- a/tools/testing/selftests/bpf/bpfilter/.gitignore
> > +++ b/tools/testing/selftests/bpf/bpfilter/.gitignore
> > @@ -2,3 +2,4 @@
> >   test_map
> >   test_match
> >   test_target
> > +test_rule
> > diff --git a/tools/testing/selftests/bpf/bpfilter/Makefile b/tools/testing/selftests/bpf/bpfilter/Makefile
> > index a11775e8b5af..27a1ddcb6dc9 100644
> > --- a/tools/testing/selftests/bpf/bpfilter/Makefile
> > +++ b/tools/testing/selftests/bpf/bpfilter/Makefile
> > @@ -11,6 +11,7 @@ CFLAGS += -Wall -g -pthread -I$(TOOLSINCDIR) -I$(APIDIR) -I$(BPFILTERSRCDIR)
> >   TEST_GEN_PROGS += test_map
> >   TEST_GEN_PROGS += test_match
> >   TEST_GEN_PROGS += test_target
> > +TEST_GEN_PROGS += test_rule
> >   KSFT_KHDR_INSTALL := 1
> > @@ -19,9 +20,11 @@ include ../../lib.mk
> >   BPFILTER_MATCH_SRCS := $(BPFILTERSRCDIR)/match.c $(BPFILTERSRCDIR)/xt_udp.c
> >   BPFILTER_TARGET_SRCS := $(BPFILTERSRCDIR)/target.c
> > -BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c
> > +BPFILTER_COMMON_SRCS := $(BPFILTERSRCDIR)/map-common.c $(BPFILTERSRCDIR)/context.c \
> > +	$(BPFILTERSRCDIR)/rule.c
> >   BPFILTER_COMMON_SRCS += $(BPFILTER_MATCH_SRCS) $(BPFILTER_TARGET_SRCS)
> >   $(OUTPUT)/test_map: test_map.c $(BPFILTERSRCDIR)/map-common.c
> >   $(OUTPUT)/test_match: test_match.c $(BPFILTER_COMMON_SRCS)
> >   $(OUTPUT)/test_target: test_target.c $(BPFILTER_COMMON_SRCS)
> > +$(OUTPUT)/test_rule: test_rule.c $(BPFILTER_COMMON_SRCS)
> > diff --git a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> > index d82ff86f280e..55fb0e959fca 100644
> > --- a/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> > +++ b/tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
> > @@ -7,6 +7,7 @@
> >   #include <linux/netfilter/x_tables.h>
> >   #include <stdio.h>
> > +#include <string.h>
> >   static inline void init_standard_target(struct xt_standard_target *ipt_target, int revision,
> >   					int verdict)
> > @@ -28,4 +29,11 @@ static inline void init_error_target(struct xt_error_target *ipt_target, int rev
> >   	snprintf(ipt_target->errorname, sizeof(ipt_target->errorname), "%s", error_name);
> >   }
> > +static inline void init_standard_entry(struct ipt_entry *entry, __u16 matches_size)
> > +{
> > +	memset(entry, 0, sizeof(*entry));
> > +	entry->target_offset = sizeof(*entry) + matches_size;
> > +	entry->next_offset = sizeof(*entry) + matches_size + sizeof(struct xt_standard_target);
> > +}
> > +
> >   #endif // BPFILTER_UTIL_H
> > diff --git a/tools/testing/selftests/bpf/bpfilter/test_rule.c b/tools/testing/selftests/bpf/bpfilter/test_rule.c
> > new file mode 100644
> > index 000000000000..fe12adf32fe5
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/bpfilter/test_rule.c
> > @@ -0,0 +1,55 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include "rule.h"
> > +
> > +#include <linux/bpfilter.h>
> > +#include <linux/err.h>
> > +
> > +#include <linux/netfilter_ipv4/ip_tables.h>
> > +
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +
> > +#include "../../kselftest_harness.h"
> > +
> > +#include "context.h"
> > +#include "rule.h"
> > +
> > +#include "bpfilter_util.h"
> > +
> > +FIXTURE(test_standard_rule)
> > +{
> > +	struct context ctx;
> > +	struct {
> > +		struct ipt_entry entry;
> > +		struct xt_standard_target target;
> > +	} entry;
> > +	struct rule rule;
> > +};
> > +
> > +FIXTURE_SETUP(test_standard_rule)
> > +{
> > +	const int verdict = BPFILTER_NF_ACCEPT;
> > +
> > +	ASSERT_EQ(create_context(&self->ctx), 0);
> > +	self->ctx.log_file = stderr;
> > +
> > +	init_standard_entry(&self->entry.entry, 0);
> > +	init_standard_target(&self->entry.target, 0, -verdict - 1);
> > +}
> > +
> > +FIXTURE_TEARDOWN(test_standard_rule)
> > +{
> > +	free_rule(&self->rule);
> > +	free_context(&self->ctx);
> > +}
> > +
> > +TEST_F(test_standard_rule, init)
> > +{
> > +	ASSERT_EQ(0, init_rule(&self->ctx, (const struct bpfilter_ipt_entry *)&self->entry.entry,
> > +			       &self->rule));
> > +}
> > +
> > +TEST_HARNESS_MAIN
> 
> When compiling selftests/bpf/bpfilter, I got the following compilation
> warning:
> 
> gcc -Wall -g -pthread -I/home/yhs/work/bpf-next/tools/include
> -I/home/yhs/work/bpf-next/tools/include/uapi -I../../../../../net/bpfilter
> test_rule.c ../../../../../net/bpfilter/map-common.c
> ../../../../../net/bpfilter/context.c ../../../../../net/bpfilter/rule.c
> ../../../../../net/bpfilter/table.c ../../../../../net/bpfilter/match.c
> ../../../../../net/bpfilter/xt_udp.c ../../../../../net/bpfilter/target.c
> -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpfilter/test_rule
> In file included from test_rule.c:15:
> ../../kselftest_harness.h:674: warning: "ARRAY_SIZE" redefined
>  #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
> 
> In file included from /usr/include/linux/netfilter/x_tables.h:4,
>                  from /usr/include/linux/netfilter_ipv4/ip_tables.h:24,
>                  from test_rule.c:10:
> /home/yhs/work/bpf-next/tools/include/linux/kernel.h:105: note: this is the
> location of the previous definition
>  #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
> __must_be_array(arr))
> 

Hmm. I cannot reproduce it locally now though I saw this error and
fixed it by removing some of the includes from header files.
I will double check it.


-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 00/10] bpfilter
  2021-06-10  0:50 ` [PATCH bpf-next v1 00/10] bpfilter Yonghong Song
@ 2021-06-10 13:36   ` Dmitrii Banshchikov
  2021-06-10 13:58     ` Daniel Borkmann
  2021-06-10 14:56     ` Yonghong Song
  0 siblings, 2 replies; 20+ messages in thread
From: Dmitrii Banshchikov @ 2021-06-10 13:36 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna

On Wed, Jun 09, 2021 at 05:50:13PM -0700, Yonghong Song wrote:
> 
> 
> On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
> > The patchset is based on the patches from David S. Miller [1] and
> > Daniel Borkmann [2].
> > 
> > The main goal of the patchset is to prepare bpfilter for
> > iptables' configuration blob parsing and code generation.
> > 
> > The patchset introduces data structures and code for matches,
> > targets, rules and tables.
> > 
> > The current version misses handling of counters. Postpone its
> > implementation until the code generation phase as it's not clear
> > yet how to better handle them.
> > 
> > Beside that there is no support of net namespaces at all.
> > 
> > In the next iteration basic code generation shall be introduced.
> > 
> > The rough plan for the code generation.
> > 
> > It seems reasonable to assume that the first rules should cover
> > most of the packet flow.  This is why they are critical from the
> > performance point of view.  At the same time number of user
> > defined rules might be pretty large. Also there is a limit on
> > size and complexity of a BPF program introduced by the verifier.
> > 
> > There are two approaches how to handle iptables' rules in
> > generated BPF programs.
> > 
> > The first approach is to generate a BPF program that is an
> > equivalent to a set of rules on a rule by rule basis. This
> > approach should give the best performance. The drawback is the
> > limitation from the verifier on size and complexity of BPF
> > program.
> > 
> > The second approach is to use an internal representation of rules
> > stored in a BPF map and use bpf_for_each_map_elem() helper to
> > iterate over them. In this case the helper's callback is a BPF
> > function that is able to process any valid rule.
> > 
> > Combination of the two approaches should give most of the
> > benefits - a heuristic should help to select a small subset of
> > the rules for code generation on a rule by rule basis. All other
> > rules are cold and it should be possible to store them in an
> > internal form in a BPF map. The rules will be handled by
> > bpf_for_each_map_elem().  This should remove the limit on the
> > number of supported rules.
> 
> Agree. A bpf program inlines some hot rule handling and put
> the rest in for_each_map_elem() sounds reasonable to me.
> 
> > 
> > During development it was useful to use statically linked
> > sanitizers in bpfilter usermode helper. Also it is possible to
> > use fuzzers but it's not clear if it is worth adding them to the
> > test infrastructure - because there are no other fuzzers under
> > tools/testing/selftests currently.
> > 
> > Patch 1 adds definitions of the used types.
> > Patch 2 adds logging to bpfilter.
> > Patch 3 adds bpfilter header to tools
> > Patch 4 adds an associative map.
> > Patches 5/6/7/8 add code for matches, targets, rules and table.
> > Patch 9 handles hooked setsockopt(2) calls.
> > Patch 10 uses prepared code in main().
> > 
> > Here is an example:
> > % dmesg  | tail -n 2
> > [   23.636102] bpfilter: Loaded bpfilter_umh pid 181
> > [   23.658529] bpfilter: started
> > % /usr/sbin/iptables-legacy -L -n
> 
> So this /usr/sbin/iptables-legacy is your iptables variant to
> translate iptable command lines to BPFILTER_IPT_SO_*,
> right? It could be good to provide a pointer to the source
> or binary so people can give a try.
> 
> I am not an expert in iptables. Reading codes, I kind of
> can grasp the high-level ideas of the patch, but probably
> Alexei or Daniel can review some details whether the
> design is sufficient to be an iptable replacement.
> 

The goal of a complete iptables replacement is too ambigious for
the moment - because existings hooks and helpers don't cover all
required functionality.

A more achievable goal is to have something simple that could
replace a significant part of use cases for filter table.

Having something simple that would work as a stateless firewall
and provide some performance benefits is a good start. For more
complex scenarios there is a safe fallback to the existing
implementation.


> 
> > Chain INPUT (policy ACCEPT)
> > target     prot opt source               destination
> > 
> > Chain FORWARD (policy ACCEPT)
> > target     prot opt source               destination
> > 
> [...]

-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 00/10] bpfilter
  2021-06-10 13:36   ` Dmitrii Banshchikov
@ 2021-06-10 13:58     ` Daniel Borkmann
  2021-06-10 14:56     ` Yonghong Song
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Borkmann @ 2021-06-10 13:58 UTC (permalink / raw)
  To: Dmitrii Banshchikov, Yonghong Song
  Cc: bpf, ast, davem, andrii, kafai, songliubraving, john.fastabend,
	kpsingh, netdev, rdna

On 6/10/21 3:36 PM, Dmitrii Banshchikov wrote:
> On Wed, Jun 09, 2021 at 05:50:13PM -0700, Yonghong Song wrote:
>> On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
>>> The patchset is based on the patches from David S. Miller [1] and
>>> Daniel Borkmann [2].
>>>
>>> The main goal of the patchset is to prepare bpfilter for
>>> iptables' configuration blob parsing and code generation.
>>>
>>> The patchset introduces data structures and code for matches,
>>> targets, rules and tables.
>>>
>>> The current version misses handling of counters. Postpone its
>>> implementation until the code generation phase as it's not clear
>>> yet how to better handle them.
>>>
>>> Beside that there is no support of net namespaces at all.
>>>
>>> In the next iteration basic code generation shall be introduced.
>>>
>>> The rough plan for the code generation.
>>>
>>> It seems reasonable to assume that the first rules should cover
>>> most of the packet flow.  This is why they are critical from the
>>> performance point of view.  At the same time number of user
>>> defined rules might be pretty large. Also there is a limit on
>>> size and complexity of a BPF program introduced by the verifier.
>>>
>>> There are two approaches how to handle iptables' rules in
>>> generated BPF programs.
>>>
>>> The first approach is to generate a BPF program that is an
>>> equivalent to a set of rules on a rule by rule basis. This
>>> approach should give the best performance. The drawback is the
>>> limitation from the verifier on size and complexity of BPF
>>> program.
>>>
>>> The second approach is to use an internal representation of rules
>>> stored in a BPF map and use bpf_for_each_map_elem() helper to
>>> iterate over them. In this case the helper's callback is a BPF
>>> function that is able to process any valid rule.
>>>
>>> Combination of the two approaches should give most of the
>>> benefits - a heuristic should help to select a small subset of
>>> the rules for code generation on a rule by rule basis. All other
>>> rules are cold and it should be possible to store them in an
>>> internal form in a BPF map. The rules will be handled by
>>> bpf_for_each_map_elem().  This should remove the limit on the
>>> number of supported rules.
>>
>> Agree. A bpf program inlines some hot rule handling and put
>> the rest in for_each_map_elem() sounds reasonable to me.

Sounds reasonable. You mentioned in the next iteration that you are
planning to include basic code generation. Would be good to have that
as part of an initial merge included, maybe along with some form of
documentation for users on what is expected to already work with the
current state of the code (& potentially stating goals/non-goals) of
this work. Thanks Dmitrii!

>>> During development it was useful to use statically linked
>>> sanitizers in bpfilter usermode helper. Also it is possible to
>>> use fuzzers but it's not clear if it is worth adding them to the
>>> test infrastructure - because there are no other fuzzers under
>>> tools/testing/selftests currently.
>>>
>>> Patch 1 adds definitions of the used types.
>>> Patch 2 adds logging to bpfilter.
>>> Patch 3 adds bpfilter header to tools
>>> Patch 4 adds an associative map.
>>> Patches 5/6/7/8 add code for matches, targets, rules and table.
>>> Patch 9 handles hooked setsockopt(2) calls.
>>> Patch 10 uses prepared code in main().
>>>
>>> Here is an example:
>>> % dmesg  | tail -n 2
>>> [   23.636102] bpfilter: Loaded bpfilter_umh pid 181
>>> [   23.658529] bpfilter: started
>>> % /usr/sbin/iptables-legacy -L -n
>>
>> So this /usr/sbin/iptables-legacy is your iptables variant to
>> translate iptable command lines to BPFILTER_IPT_SO_*,
>> right? It could be good to provide a pointer to the source
>> or binary so people can give a try.
>>
>> I am not an expert in iptables. Reading codes, I kind of
>> can grasp the high-level ideas of the patch, but probably
>> Alexei or Daniel can review some details whether the
>> design is sufficient to be an iptable replacement.
> 
> The goal of a complete iptables replacement is too ambigious for
> the moment - because existings hooks and helpers don't cover all
> required functionality.
> 
> A more achievable goal is to have something simple that could
> replace a significant part of use cases for filter table.
> 
> Having something simple that would work as a stateless firewall
> and provide some performance benefits is a good start. For more
> complex scenarios there is a safe fallback to the existing
> implementation.
> 
>>> Chain INPUT (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>> [...]
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v1 00/10] bpfilter
  2021-06-10 13:36   ` Dmitrii Banshchikov
  2021-06-10 13:58     ` Daniel Borkmann
@ 2021-06-10 14:56     ` Yonghong Song
  1 sibling, 0 replies; 20+ messages in thread
From: Yonghong Song @ 2021-06-10 14:56 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: bpf, ast, davem, daniel, andrii, kafai, songliubraving,
	john.fastabend, kpsingh, netdev, rdna



On 6/10/21 6:36 AM, Dmitrii Banshchikov wrote:
> On Wed, Jun 09, 2021 at 05:50:13PM -0700, Yonghong Song wrote:
>>
>>
>> On 6/3/21 3:14 AM, Dmitrii Banshchikov wrote:
>>> The patchset is based on the patches from David S. Miller [1] and
>>> Daniel Borkmann [2].
>>>
>>> The main goal of the patchset is to prepare bpfilter for
>>> iptables' configuration blob parsing and code generation.
>>>
>>> The patchset introduces data structures and code for matches,
>>> targets, rules and tables.
>>>
>>> The current version misses handling of counters. Postpone its
>>> implementation until the code generation phase as it's not clear
>>> yet how to better handle them.
>>>
>>> Beside that there is no support of net namespaces at all.
>>>
>>> In the next iteration basic code generation shall be introduced.
>>>
>>> The rough plan for the code generation.
>>>
>>> It seems reasonable to assume that the first rules should cover
>>> most of the packet flow.  This is why they are critical from the
>>> performance point of view.  At the same time number of user
>>> defined rules might be pretty large. Also there is a limit on
>>> size and complexity of a BPF program introduced by the verifier.
>>>
>>> There are two approaches how to handle iptables' rules in
>>> generated BPF programs.
>>>
>>> The first approach is to generate a BPF program that is an
>>> equivalent to a set of rules on a rule by rule basis. This
>>> approach should give the best performance. The drawback is the
>>> limitation from the verifier on size and complexity of BPF
>>> program.
>>>
>>> The second approach is to use an internal representation of rules
>>> stored in a BPF map and use bpf_for_each_map_elem() helper to
>>> iterate over them. In this case the helper's callback is a BPF
>>> function that is able to process any valid rule.
>>>
>>> Combination of the two approaches should give most of the
>>> benefits - a heuristic should help to select a small subset of
>>> the rules for code generation on a rule by rule basis. All other
>>> rules are cold and it should be possible to store them in an
>>> internal form in a BPF map. The rules will be handled by
>>> bpf_for_each_map_elem().  This should remove the limit on the
>>> number of supported rules.
>>
>> Agree. A bpf program inlines some hot rule handling and put
>> the rest in for_each_map_elem() sounds reasonable to me.
>>
>>>
>>> During development it was useful to use statically linked
>>> sanitizers in bpfilter usermode helper. Also it is possible to
>>> use fuzzers but it's not clear if it is worth adding them to the
>>> test infrastructure - because there are no other fuzzers under
>>> tools/testing/selftests currently.
>>>
>>> Patch 1 adds definitions of the used types.
>>> Patch 2 adds logging to bpfilter.
>>> Patch 3 adds bpfilter header to tools
>>> Patch 4 adds an associative map.
>>> Patches 5/6/7/8 add code for matches, targets, rules and table.
>>> Patch 9 handles hooked setsockopt(2) calls.
>>> Patch 10 uses prepared code in main().
>>>
>>> Here is an example:
>>> % dmesg  | tail -n 2
>>> [   23.636102] bpfilter: Loaded bpfilter_umh pid 181
>>> [   23.658529] bpfilter: started
>>> % /usr/sbin/iptables-legacy -L -n
>>
>> So this /usr/sbin/iptables-legacy is your iptables variant to
>> translate iptable command lines to BPFILTER_IPT_SO_*,
>> right? It could be good to provide a pointer to the source
>> or binary so people can give a try.
>>
>> I am not an expert in iptables. Reading codes, I kind of
>> can grasp the high-level ideas of the patch, but probably
>> Alexei or Daniel can review some details whether the
>> design is sufficient to be an iptable replacement.
>>
> 
> The goal of a complete iptables replacement is too ambigious for
> the moment - because existings hooks and helpers don't cover all
> required functionality.
> 
> A more achievable goal is to have something simple that could
> replace a significant part of use cases for filter table.
> 
> Having something simple that would work as a stateless firewall
> and provide some performance benefits is a good start. For more
> complex scenarios there is a safe fallback to the existing
> implementation.

Thanks for explanation. It would be good to put the above
into cover letter so reviewers/users can get a realistic
expectation.

> 
> 
>>
>>> Chain INPUT (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>> [...]
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-06-10 14:57 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-03 10:14 [PATCH bpf-next v1 00/10] bpfilter Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 01/10] bpfilter: Add types for usermode helper Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 02/10] bpfilter: Add logging facility Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 03/10] tools: Add bpfilter usermode helper header Dmitrii Banshchikov
2021-06-08 16:20   ` Yonghong Song
2021-06-09 10:07     ` Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 04/10] bpfilter: Add map container Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 05/10] bpfilter: Add struct match Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 06/10] bpfilter: Add struct target Dmitrii Banshchikov
2021-06-03 16:47   ` Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 07/10] bpfilter: Add struct rule Dmitrii Banshchikov
2021-06-10  0:30   ` Yonghong Song
2021-06-10 13:16     ` Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 08/10] bpfilter: Add struct table Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 09/10] bpfilter: Add handling of setsockopt() calls Dmitrii Banshchikov
2021-06-03 10:14 ` [PATCH bpf-next v1 10/10] bpfilter: Handle setsockopts Dmitrii Banshchikov
2021-06-10  0:50 ` [PATCH bpf-next v1 00/10] bpfilter Yonghong Song
2021-06-10 13:36   ` Dmitrii Banshchikov
2021-06-10 13:58     ` Daniel Borkmann
2021-06-10 14:56     ` Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.