bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT
@ 2021-02-13 21:44 Cong Wang
  2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev; +Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang

From: Cong Wang <cong.wang@bytedance.com>

This patchset is the first series of patches separated out from
the original large patchset, to make reviews easier. This patchset
does not add any new feature or change any functionality but merely
cleans up the existing sockmap and skmsg code and refactors it, to
prepare for the patches followed up. This passed all BPF selftests.

The original whole patchset is available on github:
https://github.com/congwang/linux/tree/sockmap

and this patchset is also available on github:
https://github.com/congwang/linux/tree/sockmap1

---
v3: fix a few Kconfig compile errors
    remove an unused variable
    add a comment for bpf_convert_data_end_access()

v2: split the original patchset
    compute data_end with bpf_convert_data_end_access()
    get rid of psock->bpf_running
    reduce the scope of CONFIG_BPF_STREAM_PARSER
    do not add CONFIG_BPF_SOCK_MAP

Cong Wang (5):
  bpf: clean up sockmap related Kconfigs
  skmsg: get rid of struct sk_psock_parser
  bpf: compute data_end dynamically with JIT code
  skmsg: use skb ext instead of TCP_SKB_CB
  sock_map: rename skb_parser and skb_verdict

 include/linux/bpf.h                           |  20 +-
 include/linux/bpf_types.h                     |   2 -
 include/linux/skbuff.h                        |   3 +
 include/linux/skmsg.h                         |  85 ++++++--
 include/net/tcp.h                             |  38 +---
 include/net/udp.h                             |   4 +-
 init/Kconfig                                  |   1 +
 net/Kconfig                                   |   7 +-
 net/core/Makefile                             |   2 +-
 net/core/filter.c                             |  48 +++--
 net/core/skbuff.c                             |   7 +
 net/core/skmsg.c                              | 194 +++++++++---------
 net/core/sock_map.c                           |  74 +++----
 net/ipv4/Makefile                             |   2 +-
 net/ipv4/tcp_bpf.c                            |   2 -
 .../selftests/bpf/prog_tests/sockmap_listen.c |   8 +-
 .../selftests/bpf/progs/test_sockmap_listen.c |   4 +-
 17 files changed, 269 insertions(+), 232 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs
  2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
@ 2021-02-13 21:44 ` Cong Wang
  2021-02-15 10:34   ` Lorenz Bauer
  2021-02-15 18:34   ` John Fastabend
  2021-02-13 21:44 ` [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser Cong Wang
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

From: Cong Wang <cong.wang@bytedance.com>

As suggested by John, clean up sockmap related Kconfigs:

Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream
parser, to reflect its name.

Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL.
And leave CONFIG_NET_SOCK_MSG untouched, as it is used by
non-sockmap cases.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/linux/bpf.h       |  20 +++---
 include/linux/bpf_types.h |   2 -
 include/linux/skmsg.h     |  22 ++++++
 include/net/tcp.h         |  13 ++--
 include/net/udp.h         |   4 +-
 init/Kconfig              |   1 +
 net/Kconfig               |   6 +-
 net/core/Makefile         |   2 +-
 net/core/skmsg.c          | 139 ++++++++++++++++++++------------------
 net/core/sock_map.c       |   2 +
 net/ipv4/Makefile         |   2 +-
 net/ipv4/tcp_bpf.c        |   2 -
 12 files changed, 117 insertions(+), 98 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cccaef1088ea..e29600f01585 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1741,6 +1741,14 @@ static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
 
 struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr);
 void bpf_map_offload_map_free(struct bpf_map *map);
+
+int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
+			 struct bpf_prog *old, u32 which);
+int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog);
+int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype);
+int sock_map_update_elem_sys(struct bpf_map *map, void *key, void *value, u64 flags);
+void sock_map_unhash(struct sock *sk);
+void sock_map_close(struct sock *sk, long timeout);
 #else
 static inline int bpf_prog_offload_init(struct bpf_prog *prog,
 					union bpf_attr *attr)
@@ -1766,17 +1774,7 @@ static inline struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
 static inline void bpf_map_offload_map_free(struct bpf_map *map)
 {
 }
-#endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
 
-#if defined(CONFIG_BPF_STREAM_PARSER)
-int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
-			 struct bpf_prog *old, u32 which);
-int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog);
-int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype);
-int sock_map_update_elem_sys(struct bpf_map *map, void *key, void *value, u64 flags);
-void sock_map_unhash(struct sock *sk);
-void sock_map_close(struct sock *sk, long timeout);
-#else
 static inline int sock_map_prog_update(struct bpf_map *map,
 				       struct bpf_prog *prog,
 				       struct bpf_prog *old, u32 which)
@@ -1801,7 +1799,7 @@ static inline int sock_map_update_elem_sys(struct bpf_map *map, void *key, void
 {
 	return -EOPNOTSUPP;
 }
-#endif /* CONFIG_BPF_STREAM_PARSER */
+#endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
 
 #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL)
 void bpf_sk_reuseport_detach(struct sock *sk);
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 99f7fd657d87..d6b4a657c885 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -103,10 +103,8 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_HASH_OF_MAPS, htab_of_maps_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP, dev_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP_HASH, dev_map_hash_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_SK_STORAGE, sk_storage_map_ops)
-#if defined(CONFIG_BPF_STREAM_PARSER)
 BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops)
-#endif
 #ifdef CONFIG_BPF_LSM
 BPF_MAP_TYPE(BPF_MAP_TYPE_INODE_STORAGE, inode_storage_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_TASK_STORAGE, task_storage_map_ops)
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 8edbbf5f2f93..041faef00937 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -71,7 +71,9 @@ struct sk_psock_link {
 };
 
 struct sk_psock_parser {
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 	struct strparser		strp;
+#endif
 	bool				enabled;
 	void (*saved_data_ready)(struct sock *sk);
 };
@@ -305,9 +307,29 @@ static inline void sk_psock_report_error(struct sk_psock *psock, int err)
 
 struct sk_psock *sk_psock_init(struct sock *sk, int node);
 
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock);
 void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock);
 void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock);
+void sk_psock_done_strp(struct sk_psock *psock);
+#else
+static inline int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
+{
+}
+
+static inline void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
+{
+}
+
+static inline void sk_psock_done_strp(struct sk_psock *psock)
+{
+}
+#endif
 void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock);
 void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4bb42fb19711..99fdbf03aeee 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2204,23 +2204,20 @@ void tcp_update_ulp(struct sock *sk, struct proto *p,
 	__MODULE_INFO(alias, alias_userspace, name);		\
 	__MODULE_INFO(alias, alias_tcp_ulp, "tcp-ulp-" name)
 
+#ifdef CONFIG_NET_SOCK_MSG
 struct sk_msg;
 struct sk_psock;
-
-#ifdef CONFIG_BPF_STREAM_PARSER
 struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
 void tcp_bpf_clone(const struct sock *sk, struct sock *newsk);
-#else
-static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk)
-{
-}
-#endif /* CONFIG_BPF_STREAM_PARSER */
 
-#ifdef CONFIG_NET_SOCK_MSG
 int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes,
 			  int flags);
 int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 		      struct msghdr *msg, int len, int flags);
+#else
+static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk)
+{
+}
 #endif /* CONFIG_NET_SOCK_MSG */
 
 #ifdef CONFIG_CGROUP_BPF
diff --git a/include/net/udp.h b/include/net/udp.h
index 877832bed471..fcc73cc50007 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -511,9 +511,9 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
 	return segs;
 }
 
-#ifdef CONFIG_BPF_STREAM_PARSER
+#ifdef CONFIG_NET_SOCK_MSG
 struct sk_psock;
 struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
-#endif /* BPF_STREAM_PARSER */
+#endif /* CONFIG_NET_SOCK_MSG */
 
 #endif	/* _UDP_H */
diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..163d76019f59 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1703,6 +1703,7 @@ config BPF_SYSCALL
 	select BPF
 	select IRQ_WORK
 	select TASKS_TRACE_RCU
+	select NET_SOCK_MSG if NET
 	default n
 	help
 	  Enable the bpf() system call that allows to manipulate eBPF
diff --git a/net/Kconfig b/net/Kconfig
index f4c32d982af6..a4f60d0c630f 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -313,13 +313,9 @@ config BPF_STREAM_PARSER
 	select STREAM_PARSER
 	select NET_SOCK_MSG
 	help
-	  Enabling this allows a stream parser to be used with
+	  Enabling this allows a TCP stream parser to be used with
 	  BPF_MAP_TYPE_SOCKMAP.
 
-	  BPF_MAP_TYPE_SOCKMAP provides a map type to use with network sockets.
-	  It can be used to enforce socket policy, implement socket redirects,
-	  etc.
-
 config NET_FLOW_LIMIT
 	bool
 	depends on RPS
diff --git a/net/core/Makefile b/net/core/Makefile
index 3e2c378e5f31..05733689d050 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -28,10 +28,10 @@ obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
 obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
 obj-$(CONFIG_LWTUNNEL) += lwtunnel.o
 obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o
-obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o
 obj-$(CONFIG_DST_CACHE) += dst_cache.o
 obj-$(CONFIG_HWBM) += hwbm.o
 obj-$(CONFIG_NET_DEVLINK) += devlink.o
 obj-$(CONFIG_GRO_CELLS) += gro_cells.o
 obj-$(CONFIG_FAILOVER) += failover.o
+obj-$(CONFIG_BPF_SYSCALL) += sock_map.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 1261512d6807..6cb5ff6f8f9c 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -651,9 +651,7 @@ static void sk_psock_destroy_deferred(struct work_struct *gc)
 
 	/* No sk_callback_lock since already detached. */
 
-	/* Parser has been stopped */
-	if (psock->progs.skb_parser)
-		strp_done(&psock->parser.strp);
+	sk_psock_done_strp(psock);
 
 	cancel_work_sync(&psock->work);
 
@@ -750,14 +748,6 @@ static int sk_psock_bpf_run(struct sk_psock *psock, struct bpf_prog *prog,
 	return bpf_prog_run_pin_on_cpu(prog, skb);
 }
 
-static struct sk_psock *sk_psock_from_strp(struct strparser *strp)
-{
-	struct sk_psock_parser *parser;
-
-	parser = container_of(strp, struct sk_psock_parser, strp);
-	return container_of(parser, struct sk_psock, parser);
-}
-
 static void sk_psock_skb_redirect(struct sk_buff *skb)
 {
 	struct sk_psock *psock_other;
@@ -866,6 +856,24 @@ static void sk_psock_verdict_apply(struct sk_psock *psock,
 	}
 }
 
+static void sk_psock_write_space(struct sock *sk)
+{
+	struct sk_psock *psock;
+	void (*write_space)(struct sock *sk) = NULL;
+
+	rcu_read_lock();
+	psock = sk_psock(sk);
+	if (likely(psock)) {
+		if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
+			schedule_work(&psock->work);
+		write_space = psock->saved_write_space;
+	}
+	rcu_read_unlock();
+	if (write_space)
+		write_space(sk);
+}
+
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
 {
 	struct sk_psock *psock;
@@ -897,6 +905,14 @@ static int sk_psock_strp_read_done(struct strparser *strp, int err)
 	return err;
 }
 
+static struct sk_psock *sk_psock_from_strp(struct strparser *strp)
+{
+	struct sk_psock_parser *parser;
+
+	parser = container_of(strp, struct sk_psock_parser, strp);
+	return container_of(parser, struct sk_psock, parser);
+}
+
 static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
 {
 	struct sk_psock *psock = sk_psock_from_strp(strp);
@@ -933,6 +949,52 @@ static void sk_psock_strp_data_ready(struct sock *sk)
 	rcu_read_unlock();
 }
 
+int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock)
+{
+	static const struct strp_callbacks cb = {
+		.rcv_msg	= sk_psock_strp_read,
+		.read_sock_done	= sk_psock_strp_read_done,
+		.parse_msg	= sk_psock_strp_parse,
+	};
+
+	psock->parser.enabled = false;
+	return strp_init(&psock->parser.strp, sk, &cb);
+}
+
+void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
+{
+	struct sk_psock_parser *parser = &psock->parser;
+
+	if (parser->enabled)
+		return;
+
+	parser->saved_data_ready = sk->sk_data_ready;
+	sk->sk_data_ready = sk_psock_strp_data_ready;
+	sk->sk_write_space = sk_psock_write_space;
+	parser->enabled = true;
+}
+
+void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
+{
+	struct sk_psock_parser *parser = &psock->parser;
+
+	if (!parser->enabled)
+		return;
+
+	sk->sk_data_ready = parser->saved_data_ready;
+	parser->saved_data_ready = NULL;
+	strp_stop(&parser->strp);
+	parser->enabled = false;
+}
+
+void sk_psock_done_strp(struct sk_psock *psock)
+{
+	/* Parser has been stopped */
+	if (psock->progs.skb_parser)
+		strp_done(&psock->parser.strp);
+}
+#endif
+
 static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
 				 unsigned int offset, size_t orig_len)
 {
@@ -984,35 +1046,6 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
 	sock->ops->read_sock(sk, &desc, sk_psock_verdict_recv);
 }
 
-static void sk_psock_write_space(struct sock *sk)
-{
-	struct sk_psock *psock;
-	void (*write_space)(struct sock *sk) = NULL;
-
-	rcu_read_lock();
-	psock = sk_psock(sk);
-	if (likely(psock)) {
-		if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
-			schedule_work(&psock->work);
-		write_space = psock->saved_write_space;
-	}
-	rcu_read_unlock();
-	if (write_space)
-		write_space(sk);
-}
-
-int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock)
-{
-	static const struct strp_callbacks cb = {
-		.rcv_msg	= sk_psock_strp_read,
-		.read_sock_done	= sk_psock_strp_read_done,
-		.parse_msg	= sk_psock_strp_parse,
-	};
-
-	psock->parser.enabled = false;
-	return strp_init(&psock->parser.strp, sk, &cb);
-}
-
 void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
 {
 	struct sk_psock_parser *parser = &psock->parser;
@@ -1026,32 +1059,6 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
 	parser->enabled = true;
 }
 
-void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
-{
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (parser->enabled)
-		return;
-
-	parser->saved_data_ready = sk->sk_data_ready;
-	sk->sk_data_ready = sk_psock_strp_data_ready;
-	sk->sk_write_space = sk_psock_write_space;
-	parser->enabled = true;
-}
-
-void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
-{
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (!parser->enabled)
-		return;
-
-	sk->sk_data_ready = parser->saved_data_ready;
-	parser->saved_data_ready = NULL;
-	strp_stop(&parser->strp);
-	parser->enabled = false;
-}
-
 void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
 {
 	struct sk_psock_parser *parser = &psock->parser;
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index d758fb83c884..ee3334dd3a38 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1461,9 +1461,11 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
 	case BPF_SK_MSG_VERDICT:
 		pprog = &progs->msg_parser;
 		break;
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 	case BPF_SK_SKB_STREAM_PARSER:
 		pprog = &progs->skb_parser;
 		break;
+#endif
 	case BPF_SK_SKB_STREAM_VERDICT:
 		pprog = &progs->skb_verdict;
 		break;
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index 5b77a46885b9..49debfefcc62 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -62,7 +62,7 @@ obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o
 obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
 obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o
-obj-$(CONFIG_BPF_STREAM_PARSER) += udp_bpf.o
+obj-$(CONFIG_NET_SOCK_MSG) += udp_bpf.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index bc7d2a586e18..1de5b62d8d09 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -229,7 +229,6 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg,
 }
 EXPORT_SYMBOL_GPL(tcp_bpf_sendmsg_redir);
 
-#ifdef CONFIG_BPF_STREAM_PARSER
 static bool tcp_bpf_stream_read(const struct sock *sk)
 {
 	struct sk_psock *psock;
@@ -629,4 +628,3 @@ void tcp_bpf_clone(const struct sock *sk, struct sock *newsk)
 	if (prot == &tcp_bpf_prots[family][TCP_BPF_BASE])
 		newsk->sk_prot = sk->sk_prot_creator;
 }
-#endif /* CONFIG_BPF_STREAM_PARSER */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser
  2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
  2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
@ 2021-02-13 21:44 ` Cong Wang
  2021-02-15 18:56   ` John Fastabend
  2021-02-15 19:03   ` Jakub Sitnicki
  2021-02-13 21:44 ` [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code Cong Wang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

From: Cong Wang <cong.wang@bytedance.com>

struct sk_psock_parser is embedded in sk_psock, it is
unnecessary as skb verdict also uses ->saved_data_ready.
We can simply fold these fields into sk_psock, and get rid
of ->enabled.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/linux/skmsg.h | 19 ++++++----------
 net/core/skmsg.c      | 53 +++++++++++++------------------------------
 net/core/sock_map.c   |  8 +++----
 3 files changed, 27 insertions(+), 53 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 041faef00937..e3bb712af257 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -70,14 +70,6 @@ struct sk_psock_link {
 	void				*link_raw;
 };
 
-struct sk_psock_parser {
-#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
-	struct strparser		strp;
-#endif
-	bool				enabled;
-	void (*saved_data_ready)(struct sock *sk);
-};
-
 struct sk_psock_work_state {
 	struct sk_buff			*skb;
 	u32				len;
@@ -92,7 +84,9 @@ struct sk_psock {
 	u32				eval;
 	struct sk_msg			*cork;
 	struct sk_psock_progs		progs;
-	struct sk_psock_parser		parser;
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
+	struct strparser		strp;
+#endif
 	struct sk_buff_head		ingress_skb;
 	struct list_head		ingress_msg;
 	unsigned long			state;
@@ -102,6 +96,7 @@ struct sk_psock {
 	void (*saved_unhash)(struct sock *sk);
 	void (*saved_close)(struct sock *sk, long timeout);
 	void (*saved_write_space)(struct sock *sk);
+	void (*saved_data_ready)(struct sock *sk);
 	struct proto			*sk_proto;
 	struct sk_psock_work_state	work_state;
 	struct work_struct		work;
@@ -422,8 +417,8 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
 
 static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
 {
-	if (psock->parser.enabled)
-		psock->parser.saved_data_ready(sk);
+	if (psock->saved_data_ready)
+		psock->saved_data_ready(sk);
 	else
 		sk->sk_data_ready(sk);
 }
@@ -462,6 +457,6 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock)
 {
 	if (!psock)
 		return false;
-	return psock->parser.enabled;
+	return !!psock->saved_data_ready;
 }
 #endif /* _LINUX_SKMSG_H */
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6cb5ff6f8f9c..7f400d044cda 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -905,17 +905,9 @@ static int sk_psock_strp_read_done(struct strparser *strp, int err)
 	return err;
 }
 
-static struct sk_psock *sk_psock_from_strp(struct strparser *strp)
-{
-	struct sk_psock_parser *parser;
-
-	parser = container_of(strp, struct sk_psock_parser, strp);
-	return container_of(parser, struct sk_psock, parser);
-}
-
 static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
 {
-	struct sk_psock *psock = sk_psock_from_strp(strp);
+	struct sk_psock *psock = container_of(strp, struct sk_psock, strp);
 	struct bpf_prog *prog;
 	int ret = skb->len;
 
@@ -939,10 +931,10 @@ static void sk_psock_strp_data_ready(struct sock *sk)
 	psock = sk_psock(sk);
 	if (likely(psock)) {
 		if (tls_sw_has_ctx_rx(sk)) {
-			psock->parser.saved_data_ready(sk);
+			psock->saved_data_ready(sk);
 		} else {
 			write_lock_bh(&sk->sk_callback_lock);
-			strp_data_ready(&psock->parser.strp);
+			strp_data_ready(&psock->strp);
 			write_unlock_bh(&sk->sk_callback_lock);
 		}
 	}
@@ -957,41 +949,34 @@ int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock)
 		.parse_msg	= sk_psock_strp_parse,
 	};
 
-	psock->parser.enabled = false;
-	return strp_init(&psock->parser.strp, sk, &cb);
+	return strp_init(&psock->strp, sk, &cb);
 }
 
 void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
 {
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (parser->enabled)
+	if (psock->saved_data_ready)
 		return;
 
-	parser->saved_data_ready = sk->sk_data_ready;
+	psock->saved_data_ready = sk->sk_data_ready;
 	sk->sk_data_ready = sk_psock_strp_data_ready;
 	sk->sk_write_space = sk_psock_write_space;
-	parser->enabled = true;
 }
 
 void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
 {
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (!parser->enabled)
+	if (!psock->saved_data_ready)
 		return;
 
-	sk->sk_data_ready = parser->saved_data_ready;
-	parser->saved_data_ready = NULL;
-	strp_stop(&parser->strp);
-	parser->enabled = false;
+	sk->sk_data_ready = psock->saved_data_ready;
+	psock->saved_data_ready = NULL;
+	strp_stop(&psock->strp);
 }
 
 void sk_psock_done_strp(struct sk_psock *psock)
 {
 	/* Parser has been stopped */
 	if (psock->progs.skb_parser)
-		strp_done(&psock->parser.strp);
+		strp_done(&psock->strp);
 }
 #endif
 
@@ -1048,25 +1033,19 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
 
 void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
 {
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (parser->enabled)
+	if (psock->saved_data_ready)
 		return;
 
-	parser->saved_data_ready = sk->sk_data_ready;
+	psock->saved_data_ready = sk->sk_data_ready;
 	sk->sk_data_ready = sk_psock_verdict_data_ready;
 	sk->sk_write_space = sk_psock_write_space;
-	parser->enabled = true;
 }
 
 void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
 {
-	struct sk_psock_parser *parser = &psock->parser;
-
-	if (!parser->enabled)
+	if (!psock->saved_data_ready)
 		return;
 
-	sk->sk_data_ready = parser->saved_data_ready;
-	parser->saved_data_ready = NULL;
-	parser->enabled = false;
+	sk->sk_data_ready = psock->saved_data_ready;
+	psock->saved_data_ready = NULL;
 }
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index ee3334dd3a38..1a28a5c2c61e 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -148,9 +148,9 @@ static void sock_map_del_link(struct sock *sk,
 			struct bpf_map *map = link->map;
 			struct bpf_stab *stab = container_of(map, struct bpf_stab,
 							     map);
-			if (psock->parser.enabled && stab->progs.skb_parser)
+			if (psock->saved_data_ready && stab->progs.skb_parser)
 				strp_stop = true;
-			if (psock->parser.enabled && stab->progs.skb_verdict)
+			if (psock->saved_data_ready && stab->progs.skb_verdict)
 				verdict_stop = true;
 			list_del(&link->list);
 			sk_psock_free_link(link);
@@ -283,14 +283,14 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 		goto out_drop;
 
 	write_lock_bh(&sk->sk_callback_lock);
-	if (skb_parser && skb_verdict && !psock->parser.enabled) {
+	if (skb_parser && skb_verdict && !psock->saved_data_ready) {
 		ret = sk_psock_init_strp(sk, psock);
 		if (ret)
 			goto out_unlock_drop;
 		psock_set_prog(&psock->progs.skb_verdict, skb_verdict);
 		psock_set_prog(&psock->progs.skb_parser, skb_parser);
 		sk_psock_start_strp(sk, psock);
-	} else if (!skb_parser && skb_verdict && !psock->parser.enabled) {
+	} else if (!skb_parser && skb_verdict && !psock->saved_data_ready) {
 		psock_set_prog(&psock->progs.skb_verdict, skb_verdict);
 		sk_psock_start_verdict(sk,psock);
 	}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code
  2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
  2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
  2021-02-13 21:44 ` [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser Cong Wang
@ 2021-02-13 21:44 ` Cong Wang
  2021-02-15 19:03   ` John Fastabend
  2021-02-15 19:06   ` Jakub Sitnicki
  2021-02-13 21:44 ` [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB Cong Wang
  2021-02-13 21:44 ` [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict Cong Wang
  4 siblings, 2 replies; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	Jakub Sitnicki, John Fastabend, Daniel Borkmann, Lorenz Bauer

From: Cong Wang <cong.wang@bytedance.com>

Currently, we compute ->data_end with a compile-time constant
offset of skb. But as Jakub pointed out, we can actually compute
it in eBPF JIT code at run-time, so that we can competely get
rid of ->data_end. This is similar to skb_shinfo(skb) computation
in bpf_convert_shinfo_access().

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/net/tcp.h |  6 ------
 net/core/filter.c | 48 +++++++++++++++++++++++++++--------------------
 net/core/skmsg.c  |  1 -
 3 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 99fdbf03aeee..697712178eff 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -885,18 +885,12 @@ struct tcp_skb_cb {
 		struct {
 			__u32 flags;
 			struct sock *sk_redir;
-			void *data_end;
 		} bpf;
 	};
 };
 
 #define TCP_SKB_CB(__skb)	((struct tcp_skb_cb *)&((__skb)->cb[0]))
 
-static inline void bpf_compute_data_end_sk_skb(struct sk_buff *skb)
-{
-	TCP_SKB_CB(skb)->bpf.data_end = skb->data + skb_headlen(skb);
-}
-
 static inline bool tcp_skb_bpf_ingress(const struct sk_buff *skb)
 {
 	return TCP_SKB_CB(skb)->bpf.flags & BPF_F_INGRESS;
diff --git a/net/core/filter.c b/net/core/filter.c
index 7059cf604d94..38c4996e48bf 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1863,10 +1863,7 @@ static const struct bpf_func_proto bpf_sk_fullsock_proto = {
 static inline int sk_skb_try_make_writable(struct sk_buff *skb,
 					   unsigned int write_len)
 {
-	int err = __bpf_try_make_writable(skb, write_len);
-
-	bpf_compute_data_end_sk_skb(skb);
-	return err;
+	return __bpf_try_make_writable(skb, write_len);
 }
 
 BPF_CALL_2(sk_skb_pull_data, struct sk_buff *, skb, u32, len)
@@ -3577,7 +3574,6 @@ BPF_CALL_4(sk_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 			return -ENOMEM;
 		__skb_pull(skb, len_diff_abs);
 	}
-	bpf_compute_data_end_sk_skb(skb);
 	if (tls_sw_has_ctx_rx(skb->sk)) {
 		struct strp_msg *rxm = strp_msg(skb);
 
@@ -3742,10 +3738,7 @@ static const struct bpf_func_proto bpf_skb_change_tail_proto = {
 BPF_CALL_3(sk_skb_change_tail, struct sk_buff *, skb, u32, new_len,
 	   u64, flags)
 {
-	int ret = __bpf_skb_change_tail(skb, new_len, flags);
-
-	bpf_compute_data_end_sk_skb(skb);
-	return ret;
+	return __bpf_skb_change_tail(skb, new_len, flags);
 }
 
 static const struct bpf_func_proto sk_skb_change_tail_proto = {
@@ -3808,10 +3801,7 @@ static const struct bpf_func_proto bpf_skb_change_head_proto = {
 BPF_CALL_3(sk_skb_change_head, struct sk_buff *, skb, u32, head_room,
 	   u64, flags)
 {
-	int ret = __bpf_skb_change_head(skb, head_room, flags);
-
-	bpf_compute_data_end_sk_skb(skb);
-	return ret;
+	return __bpf_skb_change_head(skb, head_room, flags);
 }
 
 static const struct bpf_func_proto sk_skb_change_head_proto = {
@@ -9657,22 +9647,40 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 	return insn - insn_buf;
 }
 
+/* data_end = skb->data + skb_headlen() */
+static struct bpf_insn *bpf_convert_data_end_access(const struct bpf_insn *si,
+						    struct bpf_insn *insn)
+{
+	/* si->dst_reg = skb->data */
+	*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, data),
+			      si->dst_reg, si->src_reg,
+			      offsetof(struct sk_buff, data));
+	/* AX = skb->len */
+	*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, len),
+			      BPF_REG_AX, si->src_reg,
+			      offsetof(struct sk_buff, len));
+	/* si->dst_reg = skb->data + skb->len */
+	*insn++ = BPF_ALU64_REG(BPF_ADD, si->dst_reg, BPF_REG_AX);
+	/* AX = skb->data_len */
+	*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, data_len),
+			      BPF_REG_AX, si->src_reg,
+			      offsetof(struct sk_buff, data_len));
+	/* si->dst_reg = skb->data + skb->len - skb->data_len */
+	*insn++ = BPF_ALU64_REG(BPF_SUB, si->dst_reg, BPF_REG_AX);
+
+	return insn;
+}
+
 static u32 sk_skb_convert_ctx_access(enum bpf_access_type type,
 				     const struct bpf_insn *si,
 				     struct bpf_insn *insn_buf,
 				     struct bpf_prog *prog, u32 *target_size)
 {
 	struct bpf_insn *insn = insn_buf;
-	int off;
 
 	switch (si->off) {
 	case offsetof(struct __sk_buff, data_end):
-		off  = si->off;
-		off -= offsetof(struct __sk_buff, data_end);
-		off += offsetof(struct sk_buff, cb);
-		off += offsetof(struct tcp_skb_cb, bpf.data_end);
-		*insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg,
-				      si->src_reg, off);
+		insn = bpf_convert_data_end_access(si, insn);
 		break;
 	default:
 		return bpf_convert_ctx_access(type, si, insn_buf, prog,
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 7f400d044cda..2d8bbb3fd87c 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -744,7 +744,6 @@ EXPORT_SYMBOL_GPL(sk_psock_msg_verdict);
 static int sk_psock_bpf_run(struct sk_psock *psock, struct bpf_prog *prog,
 			    struct sk_buff *skb)
 {
-	bpf_compute_data_end_sk_skb(skb);
 	return bpf_prog_run_pin_on_cpu(prog, skb);
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
                   ` (2 preceding siblings ...)
  2021-02-13 21:44 ` [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code Cong Wang
@ 2021-02-13 21:44 ` Cong Wang
  2021-02-15 19:20   ` John Fastabend
  2021-02-13 21:44 ` [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict Cong Wang
  4 siblings, 1 reply; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

From: Cong Wang <cong.wang@bytedance.com>

Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
does not work for any other non-TCP protocols. We can move them to
skb ext instead of playing with skb cb, which is harder to make
correct.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/linux/skbuff.h |  3 +++
 include/linux/skmsg.h  | 40 ++++++++++++++++++++++++++++++++++++++++
 include/net/tcp.h      | 19 -------------------
 net/Kconfig            |  1 +
 net/core/skbuff.c      |  7 +++++++
 net/core/skmsg.c       | 35 +++++++++++++++++++++++------------
 net/core/sock_map.c    | 12 ++++++------
 7 files changed, 80 insertions(+), 37 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 46f901adf1a8..2d4ffe77ef47 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4166,6 +4166,9 @@ enum skb_ext_id {
 #endif
 #if IS_ENABLED(CONFIG_MPTCP)
 	SKB_EXT_MPTCP,
+#endif
+#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
+	SKB_EXT_BPF,
 #endif
 	SKB_EXT_NUM, /* must be last */
 };
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index e3bb712af257..d5c711ef6d4b 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -459,4 +459,44 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock)
 		return false;
 	return !!psock->saved_data_ready;
 }
+
+struct skb_bpf_ext {
+	__u32 flags;
+	struct sock *sk_redir;
+};
+
+#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
+static inline
+bool skb_bpf_ext_ingress(const struct sk_buff *skb)
+{
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
+
+	return ext->flags & BPF_F_INGRESS;
+}
+
+static inline
+void skb_bpf_ext_set_ingress(const struct sk_buff *skb)
+{
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
+
+	ext->flags |= BPF_F_INGRESS;
+}
+
+static inline
+struct sock *skb_bpf_ext_redirect_fetch(struct sk_buff *skb)
+{
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
+
+	return ext->sk_redir;
+}
+
+static inline
+void skb_bpf_ext_redirect_clear(struct sk_buff *skb)
+{
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
+
+	ext->flags = 0;
+	ext->sk_redir = NULL;
+}
+#endif /* CONFIG_NET_SOCK_MSG */
 #endif /* _LINUX_SKMSG_H */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 697712178eff..e35881f837b2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -882,30 +882,11 @@ struct tcp_skb_cb {
 			struct inet6_skb_parm	h6;
 #endif
 		} header;	/* For incoming skbs */
-		struct {
-			__u32 flags;
-			struct sock *sk_redir;
-		} bpf;
 	};
 };
 
 #define TCP_SKB_CB(__skb)	((struct tcp_skb_cb *)&((__skb)->cb[0]))
 
-static inline bool tcp_skb_bpf_ingress(const struct sk_buff *skb)
-{
-	return TCP_SKB_CB(skb)->bpf.flags & BPF_F_INGRESS;
-}
-
-static inline struct sock *tcp_skb_bpf_redirect_fetch(struct sk_buff *skb)
-{
-	return TCP_SKB_CB(skb)->bpf.sk_redir;
-}
-
-static inline void tcp_skb_bpf_redirect_clear(struct sk_buff *skb)
-{
-	TCP_SKB_CB(skb)->bpf.sk_redir = NULL;
-}
-
 extern const struct inet_connection_sock_af_ops ipv4_specific;
 
 #if IS_ENABLED(CONFIG_IPV6)
diff --git a/net/Kconfig b/net/Kconfig
index a4f60d0c630f..9b4dd1ad2188 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -419,6 +419,7 @@ config SOCK_VALIDATE_XMIT
 
 config NET_SOCK_MSG
 	bool
+	select SKB_EXTENSIONS
 	default n
 	help
 	  The NET_SOCK_MSG provides a framework for plain sockets (e.g. TCP) or
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 145503d3f06b..7695a2b65832 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include <linux/prefetch.h>
 #include <linux/if_vlan.h>
 #include <linux/mpls.h>
+#include <linux/skmsg.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -4259,6 +4260,9 @@ static const u8 skb_ext_type_len[] = {
 #if IS_ENABLED(CONFIG_MPTCP)
 	[SKB_EXT_MPTCP] = SKB_EXT_CHUNKSIZEOF(struct mptcp_ext),
 #endif
+#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
+	[SKB_EXT_BPF] = SKB_EXT_CHUNKSIZEOF(struct skb_bpf_ext),
+#endif
 };
 
 static __always_inline unsigned int skb_ext_total_length(void)
@@ -4275,6 +4279,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
 #endif
 #if IS_ENABLED(CONFIG_MPTCP)
 		skb_ext_type_len[SKB_EXT_MPTCP] +
+#endif
+#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
+		skb_ext_type_len[SKB_EXT_BPF] +
 #endif
 		0;
 }
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 2d8bbb3fd87c..9404dbf5d57b 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -525,7 +525,8 @@ static void sk_psock_backlog(struct work_struct *work)
 		len = skb->len;
 		off = 0;
 start:
-		ingress = tcp_skb_bpf_ingress(skb);
+		ingress = skb_bpf_ext_ingress(skb);
+		skb_ext_del(skb, SKB_EXT_BPF);
 		do {
 			ret = -EIO;
 			if (likely(psock->sk->sk_socket))
@@ -752,7 +753,7 @@ static void sk_psock_skb_redirect(struct sk_buff *skb)
 	struct sk_psock *psock_other;
 	struct sock *sk_other;
 
-	sk_other = tcp_skb_bpf_redirect_fetch(skb);
+	sk_other = skb_bpf_ext_redirect_fetch(skb);
 	/* This error is a buggy BPF program, it returned a redirect
 	 * return code, but then didn't set a redirect interface.
 	 */
@@ -794,6 +795,9 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb)
 	struct bpf_prog *prog;
 	int ret = __SK_PASS;
 
+	if (!skb_ext_add(skb, SKB_EXT_BPF))
+		return __SK_DROP;
+
 	rcu_read_lock();
 	prog = READ_ONCE(psock->progs.skb_verdict);
 	if (likely(prog)) {
@@ -802,9 +806,9 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb)
 		 * TLS context.
 		 */
 		skb->sk = psock->sk;
-		tcp_skb_bpf_redirect_clear(skb);
+		skb_bpf_ext_redirect_clear(skb);
 		ret = sk_psock_bpf_run(psock, prog, skb);
-		ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+		ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
 		skb->sk = NULL;
 	}
 	sk_psock_tls_verdict_apply(skb, psock->sk, ret);
@@ -816,7 +820,6 @@ EXPORT_SYMBOL_GPL(sk_psock_tls_strp_read);
 static void sk_psock_verdict_apply(struct sk_psock *psock,
 				   struct sk_buff *skb, int verdict)
 {
-	struct tcp_skb_cb *tcp;
 	struct sock *sk_other;
 	int err = -EIO;
 
@@ -828,9 +831,7 @@ static void sk_psock_verdict_apply(struct sk_psock *psock,
 			goto out_free;
 		}
 
-		tcp = TCP_SKB_CB(skb);
-		tcp->bpf.flags |= BPF_F_INGRESS;
-
+		skb_bpf_ext_set_ingress(skb);
 		/* If the queue is empty then we can submit directly
 		 * into the msg queue. If its not empty we have to
 		 * queue work otherwise we may get OOO data. Otherwise,
@@ -888,11 +889,15 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
 		goto out;
 	}
 	skb_set_owner_r(skb, sk);
+	if (!skb_ext_add(skb, SKB_EXT_BPF)) {
+		kfree_skb(skb);
+		goto out;
+	}
 	prog = READ_ONCE(psock->progs.skb_verdict);
 	if (likely(prog)) {
-		tcp_skb_bpf_redirect_clear(skb);
+		skb_bpf_ext_redirect_clear(skb);
 		ret = sk_psock_bpf_run(psock, prog, skb);
-		ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+		ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
 	}
 	sk_psock_verdict_apply(psock, skb, ret);
 out:
@@ -1003,11 +1008,17 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
 		goto out;
 	}
 	skb_set_owner_r(skb, sk);
+	if (!skb_ext_add(skb, SKB_EXT_BPF)) {
+		len = 0;
+		kfree_skb(skb);
+		goto out;
+	}
+
 	prog = READ_ONCE(psock->progs.skb_verdict);
 	if (likely(prog)) {
-		tcp_skb_bpf_redirect_clear(skb);
+		skb_bpf_ext_redirect_clear(skb);
 		ret = sk_psock_bpf_run(psock, prog, skb);
-		ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+		ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
 	}
 	sk_psock_verdict_apply(psock, skb, ret);
 out:
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 1a28a5c2c61e..e9f2a17fb665 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -657,7 +657,7 @@ const struct bpf_func_proto bpf_sock_map_update_proto = {
 BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb,
 	   struct bpf_map *, map, u32, key, u64, flags)
 {
-	struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
 	struct sock *sk;
 
 	if (unlikely(flags & ~(BPF_F_INGRESS)))
@@ -667,8 +667,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb,
 	if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
 		return SK_DROP;
 
-	tcb->bpf.flags = flags;
-	tcb->bpf.sk_redir = sk;
+	ext->flags = flags;
+	ext->sk_redir = sk;
 	return SK_PASS;
 }
 
@@ -1250,7 +1250,7 @@ const struct bpf_func_proto bpf_sock_hash_update_proto = {
 BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb,
 	   struct bpf_map *, map, void *, key, u64, flags)
 {
-	struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
+	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
 	struct sock *sk;
 
 	if (unlikely(flags & ~(BPF_F_INGRESS)))
@@ -1260,8 +1260,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb,
 	if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
 		return SK_DROP;
 
-	tcb->bpf.flags = flags;
-	tcb->bpf.sk_redir = sk;
+	ext->flags = flags;
+	ext->sk_redir = sk;
 	return SK_PASS;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict
  2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
                   ` (3 preceding siblings ...)
  2021-02-13 21:44 ` [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB Cong Wang
@ 2021-02-13 21:44 ` Cong Wang
  2021-02-15 19:09   ` John Fastabend
  4 siblings, 1 reply; 23+ messages in thread
From: Cong Wang @ 2021-02-13 21:44 UTC (permalink / raw)
  To: netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

From: Cong Wang <cong.wang@bytedance.com>

These two eBPF programs are tied to BPF_SK_SKB_STREAM_PARSER
and BPF_SK_SKB_STREAM_VERDICT, rename them to reflect the fact
they are only used for TCP. And save the name 'skb_verdict' for
general use later.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/linux/skmsg.h                         |  8 +--
 net/core/skmsg.c                              | 14 ++---
 net/core/sock_map.c                           | 60 +++++++++----------
 .../selftests/bpf/prog_tests/sockmap_listen.c |  8 +--
 .../selftests/bpf/progs/test_sockmap_listen.c |  4 +-
 5 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index d5c711ef6d4b..a0c764b3e624 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -56,8 +56,8 @@ struct sk_msg {
 
 struct sk_psock_progs {
 	struct bpf_prog			*msg_parser;
-	struct bpf_prog			*skb_parser;
-	struct bpf_prog			*skb_verdict;
+	struct bpf_prog			*stream_parser;
+	struct bpf_prog			*stream_verdict;
 };
 
 enum sk_psock_state_bits {
@@ -447,8 +447,8 @@ static inline int psock_replace_prog(struct bpf_prog **pprog,
 static inline void psock_progs_drop(struct sk_psock_progs *progs)
 {
 	psock_set_prog(&progs->msg_parser, NULL);
-	psock_set_prog(&progs->skb_parser, NULL);
-	psock_set_prog(&progs->skb_verdict, NULL);
+	psock_set_prog(&progs->stream_parser, NULL);
+	psock_set_prog(&progs->stream_verdict, NULL);
 }
 
 int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb);
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 9404dbf5d57b..5e5e6d95cb33 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -684,9 +684,9 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 	write_lock_bh(&sk->sk_callback_lock);
 	sk_psock_restore_proto(sk, psock);
 	rcu_assign_sk_user_data(sk, NULL);
-	if (psock->progs.skb_parser)
+	if (psock->progs.stream_parser)
 		sk_psock_stop_strp(sk, psock);
-	else if (psock->progs.skb_verdict)
+	else if (psock->progs.stream_verdict)
 		sk_psock_stop_verdict(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
 	sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
@@ -799,7 +799,7 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb)
 		return __SK_DROP;
 
 	rcu_read_lock();
-	prog = READ_ONCE(psock->progs.skb_verdict);
+	prog = READ_ONCE(psock->progs.stream_verdict);
 	if (likely(prog)) {
 		/* We skip full set_owner_r here because if we do a SK_PASS
 		 * or SK_DROP we can skip skb memory accounting and use the
@@ -893,7 +893,7 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
 		kfree_skb(skb);
 		goto out;
 	}
-	prog = READ_ONCE(psock->progs.skb_verdict);
+	prog = READ_ONCE(psock->progs.stream_verdict);
 	if (likely(prog)) {
 		skb_bpf_ext_redirect_clear(skb);
 		ret = sk_psock_bpf_run(psock, prog, skb);
@@ -916,7 +916,7 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
 	int ret = skb->len;
 
 	rcu_read_lock();
-	prog = READ_ONCE(psock->progs.skb_parser);
+	prog = READ_ONCE(psock->progs.stream_parser);
 	if (likely(prog)) {
 		skb->sk = psock->sk;
 		ret = sk_psock_bpf_run(psock, prog, skb);
@@ -979,7 +979,7 @@ void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
 void sk_psock_done_strp(struct sk_psock *psock)
 {
 	/* Parser has been stopped */
-	if (psock->progs.skb_parser)
+	if (psock->progs.stream_parser)
 		strp_done(&psock->strp);
 }
 #endif
@@ -1014,7 +1014,7 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
 		goto out;
 	}
 
-	prog = READ_ONCE(psock->progs.skb_verdict);
+	prog = READ_ONCE(psock->progs.stream_verdict);
 	if (likely(prog)) {
 		skb_bpf_ext_redirect_clear(skb);
 		ret = sk_psock_bpf_run(psock, prog, skb);
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index e9f2a17fb665..0a4437f60041 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -148,9 +148,9 @@ static void sock_map_del_link(struct sock *sk,
 			struct bpf_map *map = link->map;
 			struct bpf_stab *stab = container_of(map, struct bpf_stab,
 							     map);
-			if (psock->saved_data_ready && stab->progs.skb_parser)
+			if (psock->saved_data_ready && stab->progs.stream_parser)
 				strp_stop = true;
-			if (psock->saved_data_ready && stab->progs.skb_verdict)
+			if (psock->saved_data_ready && stab->progs.stream_verdict)
 				verdict_stop = true;
 			list_del(&link->list);
 			sk_psock_free_link(link);
@@ -224,23 +224,23 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
 static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 			 struct sock *sk)
 {
-	struct bpf_prog *msg_parser, *skb_parser, *skb_verdict;
+	struct bpf_prog *msg_parser, *stream_parser, *stream_verdict;
 	struct sk_psock *psock;
 	int ret;
 
-	skb_verdict = READ_ONCE(progs->skb_verdict);
-	if (skb_verdict) {
-		skb_verdict = bpf_prog_inc_not_zero(skb_verdict);
-		if (IS_ERR(skb_verdict))
-			return PTR_ERR(skb_verdict);
+	stream_verdict = READ_ONCE(progs->stream_verdict);
+	if (stream_verdict) {
+		stream_verdict = bpf_prog_inc_not_zero(stream_verdict);
+		if (IS_ERR(stream_verdict))
+			return PTR_ERR(stream_verdict);
 	}
 
-	skb_parser = READ_ONCE(progs->skb_parser);
-	if (skb_parser) {
-		skb_parser = bpf_prog_inc_not_zero(skb_parser);
-		if (IS_ERR(skb_parser)) {
-			ret = PTR_ERR(skb_parser);
-			goto out_put_skb_verdict;
+	stream_parser = READ_ONCE(progs->stream_parser);
+	if (stream_parser) {
+		stream_parser = bpf_prog_inc_not_zero(stream_parser);
+		if (IS_ERR(stream_parser)) {
+			ret = PTR_ERR(stream_parser);
+			goto out_put_stream_verdict;
 		}
 	}
 
@@ -249,7 +249,7 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 		msg_parser = bpf_prog_inc_not_zero(msg_parser);
 		if (IS_ERR(msg_parser)) {
 			ret = PTR_ERR(msg_parser);
-			goto out_put_skb_parser;
+			goto out_put_stream_parser;
 		}
 	}
 
@@ -261,8 +261,8 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 
 	if (psock) {
 		if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) ||
-		    (skb_parser  && READ_ONCE(psock->progs.skb_parser)) ||
-		    (skb_verdict && READ_ONCE(psock->progs.skb_verdict))) {
+		    (stream_parser  && READ_ONCE(psock->progs.stream_parser)) ||
+		    (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) {
 			sk_psock_put(sk, psock);
 			ret = -EBUSY;
 			goto out_progs;
@@ -283,15 +283,15 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 		goto out_drop;
 
 	write_lock_bh(&sk->sk_callback_lock);
-	if (skb_parser && skb_verdict && !psock->saved_data_ready) {
+	if (stream_parser && stream_verdict && !psock->saved_data_ready) {
 		ret = sk_psock_init_strp(sk, psock);
 		if (ret)
 			goto out_unlock_drop;
-		psock_set_prog(&psock->progs.skb_verdict, skb_verdict);
-		psock_set_prog(&psock->progs.skb_parser, skb_parser);
+		psock_set_prog(&psock->progs.stream_verdict, stream_verdict);
+		psock_set_prog(&psock->progs.stream_parser, stream_parser);
 		sk_psock_start_strp(sk, psock);
-	} else if (!skb_parser && skb_verdict && !psock->saved_data_ready) {
-		psock_set_prog(&psock->progs.skb_verdict, skb_verdict);
+	} else if (!stream_parser && stream_verdict && !psock->saved_data_ready) {
+		psock_set_prog(&psock->progs.stream_verdict, stream_verdict);
 		sk_psock_start_verdict(sk,psock);
 	}
 	write_unlock_bh(&sk->sk_callback_lock);
@@ -303,12 +303,12 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
 out_progs:
 	if (msg_parser)
 		bpf_prog_put(msg_parser);
-out_put_skb_parser:
-	if (skb_parser)
-		bpf_prog_put(skb_parser);
-out_put_skb_verdict:
-	if (skb_verdict)
-		bpf_prog_put(skb_verdict);
+out_put_stream_parser:
+	if (stream_parser)
+		bpf_prog_put(stream_parser);
+out_put_stream_verdict:
+	if (stream_verdict)
+		bpf_prog_put(stream_verdict);
 	return ret;
 }
 
@@ -1463,11 +1463,11 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
 		break;
 #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 	case BPF_SK_SKB_STREAM_PARSER:
-		pprog = &progs->skb_parser;
+		pprog = &progs->stream_parser;
 		break;
 #endif
 	case BPF_SK_SKB_STREAM_VERDICT:
-		pprog = &progs->skb_verdict;
+		pprog = &progs->stream_verdict;
 		break;
 	default:
 		return -EOPNOTSUPP;
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index d7d65a700799..c26e6bf05e49 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1014,8 +1014,8 @@ static void test_skb_redir_to_connected(struct test_sockmap_listen *skel,
 					struct bpf_map *inner_map, int family,
 					int sotype)
 {
-	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
-	int parser = bpf_program__fd(skel->progs.prog_skb_parser);
+	int verdict = bpf_program__fd(skel->progs.prog_stream_verdict);
+	int parser = bpf_program__fd(skel->progs.prog_stream_parser);
 	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
 	int sock_map = bpf_map__fd(inner_map);
 	int err;
@@ -1125,8 +1125,8 @@ static void test_skb_redir_to_listening(struct test_sockmap_listen *skel,
 					struct bpf_map *inner_map, int family,
 					int sotype)
 {
-	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
-	int parser = bpf_program__fd(skel->progs.prog_skb_parser);
+	int verdict = bpf_program__fd(skel->progs.prog_stream_verdict);
+	int parser = bpf_program__fd(skel->progs.prog_stream_parser);
 	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
 	int sock_map = bpf_map__fd(inner_map);
 	int err;
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
index a3a366c57ce1..fa221141e9c1 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
@@ -31,13 +31,13 @@ struct {
 static volatile bool test_sockmap; /* toggled by user-space */
 
 SEC("sk_skb/stream_parser")
-int prog_skb_parser(struct __sk_buff *skb)
+int prog_stream_parser(struct __sk_buff *skb)
 {
 	return skb->len;
 }
 
 SEC("sk_skb/stream_verdict")
-int prog_skb_verdict(struct __sk_buff *skb)
+int prog_stream_verdict(struct __sk_buff *skb)
 {
 	unsigned int *count;
 	__u32 zero = 0;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs
  2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
@ 2021-02-15 10:34   ` Lorenz Bauer
  2021-02-15 18:34   ` John Fastabend
  1 sibling, 0 replies; 23+ messages in thread
From: Lorenz Bauer @ 2021-02-15 10:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: Networking, bpf, duanxiongchun, wangdongdong.6, jiang.wang,
	Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki

On Sat, 13 Feb 2021 at 21:44, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> From: Cong Wang <cong.wang@bytedance.com>
>
> As suggested by John, clean up sockmap related Kconfigs:
>
> Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream
> parser, to reflect its name.
>
> Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL.
> And leave CONFIG_NET_SOCK_MSG untouched, as it is used by
> non-sockmap cases.

For the series:

Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>

Jakub, John: can you please take another look at the assembly in patch 3?

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs
  2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
  2021-02-15 10:34   ` Lorenz Bauer
@ 2021-02-15 18:34   ` John Fastabend
  1 sibling, 0 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-15 18:34 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> As suggested by John, clean up sockmap related Kconfigs:
> 
> Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream
> parser, to reflect its name.
> 
> Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL.
> And leave CONFIG_NET_SOCK_MSG untouched, as it is used by
> non-sockmap cases.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Thanks for doing this.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser
  2021-02-13 21:44 ` [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser Cong Wang
@ 2021-02-15 18:56   ` John Fastabend
  2021-02-15 19:03   ` Jakub Sitnicki
  1 sibling, 0 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-15 18:56 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> struct sk_psock_parser is embedded in sk_psock, it is
> unnecessary as skb verdict also uses ->saved_data_ready.
> We can simply fold these fields into sk_psock, and get rid
> of ->enabled.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---
>  include/linux/skmsg.h | 19 ++++++----------
>  net/core/skmsg.c      | 53 +++++++++++++------------------------------
>  net/core/sock_map.c   |  8 +++----
>  3 files changed, 27 insertions(+), 53 deletions(-)
> 

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code
  2021-02-13 21:44 ` [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code Cong Wang
@ 2021-02-15 19:03   ` John Fastabend
  2021-02-15 19:06   ` Jakub Sitnicki
  1 sibling, 0 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-15 19:03 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	Jakub Sitnicki, John Fastabend, Daniel Borkmann, Lorenz Bauer

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> Currently, we compute ->data_end with a compile-time constant
> offset of skb. But as Jakub pointed out, we can actually compute
> it in eBPF JIT code at run-time, so that we can competely get
> rid of ->data_end. This is similar to skb_shinfo(skb) computation
> in bpf_convert_shinfo_access().
> 
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser
  2021-02-13 21:44 ` [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser Cong Wang
  2021-02-15 18:56   ` John Fastabend
@ 2021-02-15 19:03   ` Jakub Sitnicki
  1 sibling, 0 replies; 23+ messages in thread
From: Jakub Sitnicki @ 2021-02-15 19:03 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, bpf, duanxiongchun, wangdongdong.6, jiang.wang,
	Cong Wang, John Fastabend, Daniel Borkmann, Lorenz Bauer

On Sat, Feb 13, 2021 at 10:44 PM CET, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
>
> struct sk_psock_parser is embedded in sk_psock, it is
> unnecessary as skb verdict also uses ->saved_data_ready.
> We can simply fold these fields into sk_psock, and get rid
> of ->enabled.
>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Acked-by: Jakub Sitnicki <jakub@cloudflare.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code
  2021-02-13 21:44 ` [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code Cong Wang
  2021-02-15 19:03   ` John Fastabend
@ 2021-02-15 19:06   ` Jakub Sitnicki
  1 sibling, 0 replies; 23+ messages in thread
From: Jakub Sitnicki @ 2021-02-15 19:06 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, bpf, duanxiongchun, wangdongdong.6, jiang.wang,
	Cong Wang, John Fastabend, Daniel Borkmann, Lorenz Bauer

On Sat, Feb 13, 2021 at 10:44 PM CET, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
>
> Currently, we compute ->data_end with a compile-time constant
> offset of skb. But as Jakub pointed out, we can actually compute
> it in eBPF JIT code at run-time, so that we can competely get
> rid of ->data_end. This is similar to skb_shinfo(skb) computation
> in bpf_convert_shinfo_access().
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Acked-by: Jakub Sitnicki <jakub@cloudflare.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict
  2021-02-13 21:44 ` [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict Cong Wang
@ 2021-02-15 19:09   ` John Fastabend
  0 siblings, 0 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-15 19:09 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> These two eBPF programs are tied to BPF_SK_SKB_STREAM_PARSER
> and BPF_SK_SKB_STREAM_VERDICT, rename them to reflect the fact
> they are only used for TCP. And save the name 'skb_verdict' for
> general use later.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-13 21:44 ` [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB Cong Wang
@ 2021-02-15 19:20   ` John Fastabend
  2021-02-15 22:24     ` Cong Wang
  2021-02-16  8:56     ` Lorenz Bauer
  0 siblings, 2 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-15 19:20 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: bpf, duanxiongchun, wangdongdong.6, jiang.wang, Cong Wang,
	John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
> does not work for any other non-TCP protocols. We can move them to
> skb ext instead of playing with skb cb, which is harder to make
> correct.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

I'm not seeing the advantage of doing this at the moment. We can
continue to use cb[] here, which is simpler IMO and use the ext
if needed for the other use cases. This is adding a per packet
alloc cost that we don't have at the moment as I understand it.

[...]

> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index e3bb712af257..d5c711ef6d4b 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -459,4 +459,44 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock)
>  		return false;
>  	return !!psock->saved_data_ready;
>  }
> +
> +struct skb_bpf_ext {
> +	__u32 flags;
> +	struct sock *sk_redir;
> +};
> +
> +#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
> +static inline
> +bool skb_bpf_ext_ingress(const struct sk_buff *skb)
> +{
> +	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> +
> +	return ext->flags & BPF_F_INGRESS;
> +}
> +
> +static inline
> +void skb_bpf_ext_set_ingress(const struct sk_buff *skb)
> +{
> +	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> +
> +	ext->flags |= BPF_F_INGRESS;
> +}
> +
> +static inline
> +struct sock *skb_bpf_ext_redirect_fetch(struct sk_buff *skb)
> +{
> +	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> +
> +	return ext->sk_redir;
> +}
> +
> +static inline
> +void skb_bpf_ext_redirect_clear(struct sk_buff *skb)
> +{
> +	struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> +
> +	ext->flags = 0;
> +	ext->sk_redir = NULL;
> +}
> +#endif /* CONFIG_NET_SOCK_MSG */

So we will have some slight duplication for cb[] variant and ext
variant above. I'm OK with that to avoid an allocation.

[...]

> @@ -1003,11 +1008,17 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
>  		goto out;
>  	}
>  	skb_set_owner_r(skb, sk);
> +	if (!skb_ext_add(skb, SKB_EXT_BPF)) {
> +		len = 0;
> +		kfree_skb(skb);
> +		goto out;
> +	}
> +

per packet cost here. Perhaps you can argue small alloc will usually not be 
noticable in such a large stack, but once we convert over it will be very
hard to go back. And I'm looking at optimizing this path now.

>  	prog = READ_ONCE(psock->progs.skb_verdict);
>  	if (likely(prog)) {
> -		tcp_skb_bpf_redirect_clear(skb);
> +		skb_bpf_ext_redirect_clear(skb);
>  		ret = sk_psock_bpf_run(psock, prog, skb);
> -		ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
> +		ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
>  	}
>  	sk_psock_verdict_apply(psock, skb, ret);

Thanks for the series Cong. Drop this patch and resubmit carry ACKs forward
and then lets revisit this later.

Thanks,
John

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-15 19:20   ` John Fastabend
@ 2021-02-15 22:24     ` Cong Wang
  2021-02-15 23:57       ` John Fastabend
  2021-02-16  8:56     ` Lorenz Bauer
  1 sibling, 1 reply; 23+ messages in thread
From: Cong Wang @ 2021-02-15 22:24 UTC (permalink / raw)
  To: John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

On Mon, Feb 15, 2021 at 11:20 AM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> >
> > Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
> > does not work for any other non-TCP protocols. We can move them to
> > skb ext instead of playing with skb cb, which is harder to make
> > correct.
> >
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > ---
>
> I'm not seeing the advantage of doing this at the moment. We can
> continue to use cb[] here, which is simpler IMO and use the ext
> if needed for the other use cases. This is adding a per packet
> alloc cost that we don't have at the moment as I understand it.

Hmm? How can we continue using TCP_SKB_CB() for UDP or
AF_UNIX?

I am not sure I get your "at the moment" correctly, do you mean
I should move this patch to a later patchset, maybe the UDP
patchset? At least this patch is needed, no matter by which patchset,
so it should not be dropped.


>
> [...]
>
> > diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> > index e3bb712af257..d5c711ef6d4b 100644
> > --- a/include/linux/skmsg.h
> > +++ b/include/linux/skmsg.h
> > @@ -459,4 +459,44 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock)
> >               return false;
> >       return !!psock->saved_data_ready;
> >  }
> > +
> > +struct skb_bpf_ext {
> > +     __u32 flags;
> > +     struct sock *sk_redir;
> > +};
> > +
> > +#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
> > +static inline
> > +bool skb_bpf_ext_ingress(const struct sk_buff *skb)
> > +{
> > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > +
> > +     return ext->flags & BPF_F_INGRESS;
> > +}
> > +
> > +static inline
> > +void skb_bpf_ext_set_ingress(const struct sk_buff *skb)
> > +{
> > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > +
> > +     ext->flags |= BPF_F_INGRESS;
> > +}
> > +
> > +static inline
> > +struct sock *skb_bpf_ext_redirect_fetch(struct sk_buff *skb)
> > +{
> > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > +
> > +     return ext->sk_redir;
> > +}
> > +
> > +static inline
> > +void skb_bpf_ext_redirect_clear(struct sk_buff *skb)
> > +{
> > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > +
> > +     ext->flags = 0;
> > +     ext->sk_redir = NULL;
> > +}
> > +#endif /* CONFIG_NET_SOCK_MSG */
>
> So we will have some slight duplication for cb[] variant and ext
> variant above. I'm OK with that to avoid an allocation.

Not sure what you mean by "duplication", these are removed from
TCP_SKB_CB(), so there is clearly no duplication.

>
> [...]
>
> > @@ -1003,11 +1008,17 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
> >               goto out;
> >       }
> >       skb_set_owner_r(skb, sk);
> > +     if (!skb_ext_add(skb, SKB_EXT_BPF)) {
> > +             len = 0;
> > +             kfree_skb(skb);
> > +             goto out;
> > +     }
> > +
>
> per packet cost here. Perhaps you can argue small alloc will usually not be
> noticable in such a large stack, but once we convert over it will be very
> hard to go back. And I'm looking at optimizing this path now.

This is a price we need to pay to avoid CB, and skb_ext_add() has been
used on other fast paths too, for example, tcf_classify_ingress() and
mptcp_incoming_options(). So, it is definitely acceptable.

>
> >       prog = READ_ONCE(psock->progs.skb_verdict);
> >       if (likely(prog)) {
> > -             tcp_skb_bpf_redirect_clear(skb);
> > +             skb_bpf_ext_redirect_clear(skb);
> >               ret = sk_psock_bpf_run(psock, prog, skb);
> > -             ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
> > +             ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
> >       }
> >       sk_psock_verdict_apply(psock, skb, ret);
>
> Thanks for the series Cong. Drop this patch and resubmit carry ACKs forward
> and then lets revisit this later.

I still believe it is best to stay in this patchset, as it does not change
any functionality and is a preparation too. And the next patchset will be
UDP/AF_UNIX changes as you suggested, it is very awkward to put this
patch into either UDP or AF_UNIX changes.

So, let's keep it in this patchset, and I am happy to address any concerns
and open to other ideas than using skb ext.

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-15 22:24     ` Cong Wang
@ 2021-02-15 23:57       ` John Fastabend
  2021-02-16  0:28         ` Cong Wang
  0 siblings, 1 reply; 23+ messages in thread
From: John Fastabend @ 2021-02-15 23:57 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> On Mon, Feb 15, 2021 at 11:20 AM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > >
> > > Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
> > > does not work for any other non-TCP protocols. We can move them to
> > > skb ext instead of playing with skb cb, which is harder to make
> > > correct.
> > >
> > > Cc: John Fastabend <john.fastabend@gmail.com>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > > Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> >
> > I'm not seeing the advantage of doing this at the moment. We can
> > continue to use cb[] here, which is simpler IMO and use the ext
> > if needed for the other use cases. This is adding a per packet
> > alloc cost that we don't have at the moment as I understand it.
> 
> Hmm? How can we continue using TCP_SKB_CB() for UDP or
> AF_UNIX?
> 
> I am not sure I get your "at the moment" correctly, do you mean
> I should move this patch to a later patchset, maybe the UDP
> patchset? At least this patch is needed, no matter by which patchset,
> so it should not be dropped.

Agree, the skb_bpf_ext{} pieces are needed for UDP and AF_UNIX. Its
not required for TCP side though. What I'm suggesting is leave the
TCP side as-is, using the cb[] fields. Then use the skb_bpf_ext fields
from UDP and AF_UNIX.

> 
> 
> >
> > [...]
> >
> > > diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> > > index e3bb712af257..d5c711ef6d4b 100644
> > > --- a/include/linux/skmsg.h
> > > +++ b/include/linux/skmsg.h
> > > @@ -459,4 +459,44 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock)
> > >               return false;
> > >       return !!psock->saved_data_ready;
> > >  }
> > > +
> > > +struct skb_bpf_ext {
> > > +     __u32 flags;
> > > +     struct sock *sk_redir;
> > > +};
> > > +
> > > +#if IS_ENABLED(CONFIG_NET_SOCK_MSG)
> > > +static inline
> > > +bool skb_bpf_ext_ingress(const struct sk_buff *skb)
> > > +{
> > > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > > +
> > > +     return ext->flags & BPF_F_INGRESS;
> > > +}
> > > +
> > > +static inline
> > > +void skb_bpf_ext_set_ingress(const struct sk_buff *skb)
> > > +{
> > > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > > +
> > > +     ext->flags |= BPF_F_INGRESS;
> > > +}
> > > +
> > > +static inline
> > > +struct sock *skb_bpf_ext_redirect_fetch(struct sk_buff *skb)
> > > +{
> > > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > > +
> > > +     return ext->sk_redir;
> > > +}
> > > +
> > > +static inline
> > > +void skb_bpf_ext_redirect_clear(struct sk_buff *skb)
> > > +{
> > > +     struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF);
> > > +
 > > +     ext->flags = 0;
> > > +     ext->sk_redir = NULL;
> > > +}
> > > +#endif /* CONFIG_NET_SOCK_MSG */
> >
> > So we will have some slight duplication for cb[] variant and ext
> > variant above. I'm OK with that to avoid an allocation.
> 
> Not sure what you mean by "duplication", these are removed from
> TCP_SKB_CB(), so there is clearly no duplication.

In this patch yes, no duplication.  But, I want to leave TCP alone
and have it continue to use cb[] to avoid alloc per packet.

> 
> >
> > [...]
> >
> > > @@ -1003,11 +1008,17 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
> > >               goto out;
> > >       }
> > >       skb_set_owner_r(skb, sk);
> > > +     if (!skb_ext_add(skb, SKB_EXT_BPF)) {
> > > +             len = 0;
> > > +             kfree_skb(skb);
> > > +             goto out;
> > > +     }
> > > +
> >
> > per packet cost here. Perhaps you can argue small alloc will usually not be
> > noticable in such a large stack, but once we convert over it will be very
> > hard to go back. And I'm looking at optimizing this path now.
> 
> This is a price we need to pay to avoid CB, and skb_ext_add() has been
> used on other fast paths too, for example, tcf_classify_ingress() and
> mptcp_incoming_options(). So, it is definitely acceptable.

For TCP case we can continue to use CB and not pay the price. For UDP
and AF_UNIX we can do the extra alloc.

The use in tcf_classify_ingress is a miss case so not the common path. If
it is/was in the common path I would suggest we rip it out.

> 
> >
> > >       prog = READ_ONCE(psock->progs.skb_verdict);
> > >       if (likely(prog)) {
> > > -             tcp_skb_bpf_redirect_clear(skb);
> > > +             skb_bpf_ext_redirect_clear(skb);
> > >               ret = sk_psock_bpf_run(psock, prog, skb);
> > > -             ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
> > > +             ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb));
> > >       }
> > >       sk_psock_verdict_apply(psock, skb, ret);
> >
> > Thanks for the series Cong. Drop this patch and resubmit carry ACKs forward
> > and then lets revisit this later.
> 
> I still believe it is best to stay in this patchset, as it does not change
> any functionality and is a preparation too. And the next patchset will be
> UDP/AF_UNIX changes as you suggested, it is very awkward to put this
> patch into either UDP or AF_UNIX changes.

Disagree. It adds extra code to the TCP side that I think is not needed. Any
reason the TCP implementation can't continue to use cb[]?

> 
> So, let's keep it in this patchset, and I am happy to address any concerns
> and open to other ideas than using skb ext.

The idea here is to just use cb[] in TCP case per above. I only scanned your
other patches, but presumably this can be patch 1 with just the functions

 skb_bpf_ext_ingress()
 skb_bpf_ext_set_ingress()
 skb_bpf_ext_redirect_fetch()
 skb_bpf_ext_redirect_clear()

And none of the removals from TCP side.

.John

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-15 23:57       ` John Fastabend
@ 2021-02-16  0:28         ` Cong Wang
  2021-02-16  0:54           ` John Fastabend
  0 siblings, 1 reply; 23+ messages in thread
From: Cong Wang @ 2021-02-16  0:28 UTC (permalink / raw)
  To: John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> For TCP case we can continue to use CB and not pay the price. For UDP
> and AF_UNIX we can do the extra alloc.

I see your point, but specializing TCP case does not give much benefit
here, the skmsg code would have to check skb->protocol etc. to decide
whether to use TCP_SKB_CB() or skb_ext:

if (skb->protocol == ...)
  TCP_SKB_CB(skb) = ...;
else
  ext = skb_ext_find(skb);

which looks ugly to me. And I doubt skb->protocol alone is sufficient to
distinguish TCP, so we may end up having more checks above.

So do you really want to trade code readability with an extra alloc?

>
> The use in tcf_classify_ingress is a miss case so not the common path. If
> it is/was in the common path I would suggest we rip it out.
>

Excellent point, what about nf_bridge_unshare()? It is a common path
for bridge netfilter, which is also probably why skb ext was introduced
(IIRC). secpath_set() seems on a common path for XFRM too.

Are you suggesting to remove them all? ;)

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-16  0:28         ` Cong Wang
@ 2021-02-16  0:54           ` John Fastabend
  2021-02-16  1:04             ` Cong Wang
  0 siblings, 1 reply; 23+ messages in thread
From: John Fastabend @ 2021-02-16  0:54 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > For TCP case we can continue to use CB and not pay the price. For UDP
> > and AF_UNIX we can do the extra alloc.
> 
> I see your point, but specializing TCP case does not give much benefit
> here, the skmsg code would have to check skb->protocol etc. to decide
> whether to use TCP_SKB_CB() or skb_ext:
> 
> if (skb->protocol == ...)
>   TCP_SKB_CB(skb) = ...;
> else
>   ext = skb_ext_find(skb);
> 
> which looks ugly to me. And I doubt skb->protocol alone is sufficient to
> distinguish TCP, so we may end up having more checks above.
> 
> So do you really want to trade code readability with an extra alloc?

Above is ugly. So I look at where the patch replaces things,

sk_psock_tls_strp_read(), this is TLS specific read hook so can't really
work in generic case anyways.

sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these
even work outside TCP cases.

For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(),
sk_psock_backlog(), can't we just do some refactoring around their
hook points so we know the context. For example sk_psock_tls_verdict_apply
is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect()
and a sk_psock_udp_redirect(). There are likely some optimizations we can
deploy this way. We've already don this for tls and sk_msg types for example.

Then the helpers will know their types by program type, just use the right
variants.

So not suggestiong if/else the checks so much as having per type hooks.

> 
> >
> > The use in tcf_classify_ingress is a miss case so not the common path. If
> > it is/was in the common path I would suggest we rip it out.
> >
> 
> Excellent point, what about nf_bridge_unshare()? It is a common path
> for bridge netfilter, which is also probably why skb ext was introduced
> (IIRC). secpath_set() seems on a common path for XFRM too.

Yeah not nice, but we don't use nf_bridge so doesn't bother me.

> 
> Are you suggesting to remove them all? ;)

From the hotpath where I care about perfromance yes. 

> 
> Thanks.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-16  0:54           ` John Fastabend
@ 2021-02-16  1:04             ` Cong Wang
  2021-02-16  1:50               ` John Fastabend
  0 siblings, 1 reply; 23+ messages in thread
From: Cong Wang @ 2021-02-16  1:04 UTC (permalink / raw)
  To: John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

On Mon, Feb 15, 2021 at 4:54 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@gmail.com> wrote:
> > >
> > > For TCP case we can continue to use CB and not pay the price. For UDP
> > > and AF_UNIX we can do the extra alloc.
> >
> > I see your point, but specializing TCP case does not give much benefit
> > here, the skmsg code would have to check skb->protocol etc. to decide
> > whether to use TCP_SKB_CB() or skb_ext:
> >
> > if (skb->protocol == ...)
> >   TCP_SKB_CB(skb) = ...;
> > else
> >   ext = skb_ext_find(skb);
> >
> > which looks ugly to me. And I doubt skb->protocol alone is sufficient to
> > distinguish TCP, so we may end up having more checks above.
> >
> > So do you really want to trade code readability with an extra alloc?
>
> Above is ugly. So I look at where the patch replaces things,
>
> sk_psock_tls_strp_read(), this is TLS specific read hook so can't really
> work in generic case anyways.
>
> sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these
> even work outside TCP cases.
>
> For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(),
> sk_psock_backlog(), can't we just do some refactoring around their
> hook points so we know the context. For example sk_psock_tls_verdict_apply
> is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect()
> and a sk_psock_udp_redirect(). There are likely some optimizations we can
> deploy this way. We've already don this for tls and sk_msg types for example.
>
> Then the helpers will know their types by program type, just use the right
> variants.
>
> So not suggestiong if/else the checks so much as having per type hooks.
>

Hmm, but sk_psock_backlog() is still the only one that handles all three
above cases, right? It uses TCP_SKB_CB() too and more importantly it
is also why we can't use a per-cpu struct here (see bpf_redirect_info).

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-16  1:04             ` Cong Wang
@ 2021-02-16  1:50               ` John Fastabend
  2021-02-16  4:06                 ` Cong Wang
  0 siblings, 1 reply; 23+ messages in thread
From: John Fastabend @ 2021-02-16  1:50 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

Cong Wang wrote:
> On Mon, Feb 15, 2021 at 4:54 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@gmail.com> wrote:
> > > >
> > > > For TCP case we can continue to use CB and not pay the price. For UDP
> > > > and AF_UNIX we can do the extra alloc.
> > >
> > > I see your point, but specializing TCP case does not give much benefit
> > > here, the skmsg code would have to check skb->protocol etc. to decide
> > > whether to use TCP_SKB_CB() or skb_ext:
> > >
> > > if (skb->protocol == ...)
> > >   TCP_SKB_CB(skb) = ...;
> > > else
> > >   ext = skb_ext_find(skb);
> > >
> > > which looks ugly to me. And I doubt skb->protocol alone is sufficient to
> > > distinguish TCP, so we may end up having more checks above.
> > >
> > > So do you really want to trade code readability with an extra alloc?
> >
> > Above is ugly. So I look at where the patch replaces things,
> >
> > sk_psock_tls_strp_read(), this is TLS specific read hook so can't really
> > work in generic case anyways.
> >
> > sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these
> > even work outside TCP cases.
> >
> > For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(),
> > sk_psock_backlog(), can't we just do some refactoring around their
> > hook points so we know the context. For example sk_psock_tls_verdict_apply
> > is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect()
> > and a sk_psock_udp_redirect(). There are likely some optimizations we can
> > deploy this way. We've already don this for tls and sk_msg types for example.
> >
> > Then the helpers will know their types by program type, just use the right
> > variants.
> >
> > So not suggestiong if/else the checks so much as having per type hooks.
> >
> 
> Hmm, but sk_psock_backlog() is still the only one that handles all three
> above cases, right? It uses TCP_SKB_CB() too and more importantly it
> is also why we can't use a per-cpu struct here (see bpf_redirect_info).

Right, but the workqueue is created at init time where we will know the 
socket type based on the program/map types so can build the redirect
backlog queue there based on the type needed. I also have a patch in
mind that would do more specific TCP things in that code anyways. I
can flush it out this week if anyone cares. The idea is we are wasting
lots of cycles using skb_send_sock_locked when we can just inject
the packet directlyy into the tcp stack.

Also on the original patch here, we can't just kfree_skb() on alloc
errors because that will look like a data drop. Errors need to be
handled gracefully without dropping data. At least in the TCP case,
but probably also in UDP and AF_UNIX cases as well. Original code
was pretty loose in this regard, but it caused users to write bug
reports and then I've been fixing most of them. If you see more
cases let me know.

> 
> Thanks.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-16  1:50               ` John Fastabend
@ 2021-02-16  4:06                 ` Cong Wang
  0 siblings, 0 replies; 23+ messages in thread
From: Cong Wang @ 2021-02-16  4:06 UTC (permalink / raw)
  To: John Fastabend
  Cc: Linux Kernel Network Developers, bpf, duanxiongchun,
	Dongdong Wang, jiang.wang, Cong Wang, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer

On Mon, Feb 15, 2021 at 5:50 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > On Mon, Feb 15, 2021 at 4:54 PM John Fastabend <john.fastabend@gmail.com> wrote:
> > >
> > > Cong Wang wrote:
> > > > On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@gmail.com> wrote:
> > > > >
> > > > > For TCP case we can continue to use CB and not pay the price. For UDP
> > > > > and AF_UNIX we can do the extra alloc.
> > > >
> > > > I see your point, but specializing TCP case does not give much benefit
> > > > here, the skmsg code would have to check skb->protocol etc. to decide
> > > > whether to use TCP_SKB_CB() or skb_ext:
> > > >
> > > > if (skb->protocol == ...)
> > > >   TCP_SKB_CB(skb) = ...;
> > > > else
> > > >   ext = skb_ext_find(skb);
> > > >
> > > > which looks ugly to me. And I doubt skb->protocol alone is sufficient to
> > > > distinguish TCP, so we may end up having more checks above.
> > > >
> > > > So do you really want to trade code readability with an extra alloc?
> > >
> > > Above is ugly. So I look at where the patch replaces things,
> > >
> > > sk_psock_tls_strp_read(), this is TLS specific read hook so can't really
> > > work in generic case anyways.
> > >
> > > sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these
> > > even work outside TCP cases.
> > >
> > > For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(),
> > > sk_psock_backlog(), can't we just do some refactoring around their
> > > hook points so we know the context. For example sk_psock_tls_verdict_apply
> > > is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect()
> > > and a sk_psock_udp_redirect(). There are likely some optimizations we can
> > > deploy this way. We've already don this for tls and sk_msg types for example.
> > >
> > > Then the helpers will know their types by program type, just use the right
> > > variants.
> > >
> > > So not suggestiong if/else the checks so much as having per type hooks.
> > >
> >
> > Hmm, but sk_psock_backlog() is still the only one that handles all three
> > above cases, right? It uses TCP_SKB_CB() too and more importantly it
> > is also why we can't use a per-cpu struct here (see bpf_redirect_info).
>
> Right, but the workqueue is created at init time where we will know the
> socket type based on the program/map types so can build the redirect
> backlog queue there based on the type needed. I also have a patch in

Hmm? How could a socket type match the skb type when we redirect
across-protocol?

In my use case, I want to redirect an AF_UNIX skb to a UDP socket,
clearly checking the UDP socket workqueue can't find out it is an
AF_UNIX skb. It has to be a per-skb check.

> mind that would do more specific TCP things in that code anyways. I
> can flush it out this week if anyone cares. The idea is we are wasting
> lots of cycles using skb_send_sock_locked when we can just inject
> the packet directlyy into the tcp stack.

Actually I did try the same, it clearly doesn't work for cross-protocol.

Anyway, please let me know what your suggestion for skb ext here?

It looks like we either have some ugly packet type checks, or just
use the skb ext. I don't see any other way yet, I also explored the
struct sk_buff again and still can not find anything we can reuse.
(_skb_refdst can only be reused very briefly with
tcp_skb_tsorted_save().)

Therefore, I believe using skb ext is still the best solution here.

>
> Also on the original patch here, we can't just kfree_skb() on alloc
> errors because that will look like a data drop. Errors need to be
> handled gracefully without dropping data. At least in the TCP case,
> but probably also in UDP and AF_UNIX cases as well. Original code
> was pretty loose in this regard, but it caused users to write bug
> reports and then I've been fixing most of them. If you see more
> cases let me know.

What's your suggestion here? Return -EAGAIN? But it requires
the caller put it in a loop to be graceful, but we can't do it in, for
example, sk_psock_tls_strp_read().

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-15 19:20   ` John Fastabend
  2021-02-15 22:24     ` Cong Wang
@ 2021-02-16  8:56     ` Lorenz Bauer
  2021-02-17 19:19       ` John Fastabend
  1 sibling, 1 reply; 23+ messages in thread
From: Lorenz Bauer @ 2021-02-16  8:56 UTC (permalink / raw)
  To: John Fastabend
  Cc: Cong Wang, Networking, bpf, duanxiongchun, wangdongdong.6,
	jiang.wang, Cong Wang, Daniel Borkmann, Jakub Sitnicki

On Mon, 15 Feb 2021 at 19:20, John Fastabend <john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> >
> > Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
> > does not work for any other non-TCP protocols. We can move them to
> > skb ext instead of playing with skb cb, which is harder to make
> > correct.
> >
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > ---
>
> I'm not seeing the advantage of doing this at the moment. We can
> continue to use cb[] here, which is simpler IMO and use the ext
> if needed for the other use cases. This is adding a per packet
> alloc cost that we don't have at the moment as I understand it.

John, do you have a benchmark we can look at? Right now we're arguing
in the abstract.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB
  2021-02-16  8:56     ` Lorenz Bauer
@ 2021-02-17 19:19       ` John Fastabend
  0 siblings, 0 replies; 23+ messages in thread
From: John Fastabend @ 2021-02-17 19:19 UTC (permalink / raw)
  To: Lorenz Bauer, John Fastabend
  Cc: Cong Wang, Networking, bpf, duanxiongchun, wangdongdong.6,
	jiang.wang, Cong Wang, Daniel Borkmann, Jakub Sitnicki

Lorenz Bauer wrote:
> On Mon, 15 Feb 2021 at 19:20, John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > >
> > > Currently TCP_SKB_CB() is hard-coded in skmsg code, it certainly
> > > does not work for any other non-TCP protocols. We can move them to
> > > skb ext instead of playing with skb cb, which is harder to make
> > > correct.
> > >
> > > Cc: John Fastabend <john.fastabend@gmail.com>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > > Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> >
> > I'm not seeing the advantage of doing this at the moment. We can
> > continue to use cb[] here, which is simpler IMO and use the ext
> > if needed for the other use cases. This is adding a per packet
> > alloc cost that we don't have at the moment as I understand it.
> 
> John, do you have a benchmark we can look at? Right now we're arguing
> in the abstract.

Sure, but looks like Cong found some spare fields in sk_buff so
that looks much nicer.

I'll mess aound a bit with our benchmarks and see where we can
publish them. It would be good to have some repeatable tests
here folks can use.

Thanks,
John

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-02-17 19:20 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-13 21:44 [Patch bpf-next v3 0/5] sock_map: clean up and refactor code for BPF_SK_SKB_VERDICT Cong Wang
2021-02-13 21:44 ` [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs Cong Wang
2021-02-15 10:34   ` Lorenz Bauer
2021-02-15 18:34   ` John Fastabend
2021-02-13 21:44 ` [Patch bpf-next v3 2/5] skmsg: get rid of struct sk_psock_parser Cong Wang
2021-02-15 18:56   ` John Fastabend
2021-02-15 19:03   ` Jakub Sitnicki
2021-02-13 21:44 ` [Patch bpf-next v3 3/5] bpf: compute data_end dynamically with JIT code Cong Wang
2021-02-15 19:03   ` John Fastabend
2021-02-15 19:06   ` Jakub Sitnicki
2021-02-13 21:44 ` [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB Cong Wang
2021-02-15 19:20   ` John Fastabend
2021-02-15 22:24     ` Cong Wang
2021-02-15 23:57       ` John Fastabend
2021-02-16  0:28         ` Cong Wang
2021-02-16  0:54           ` John Fastabend
2021-02-16  1:04             ` Cong Wang
2021-02-16  1:50               ` John Fastabend
2021-02-16  4:06                 ` Cong Wang
2021-02-16  8:56     ` Lorenz Bauer
2021-02-17 19:19       ` John Fastabend
2021-02-13 21:44 ` [Patch bpf-next v3 5/5] sock_map: rename skb_parser and skb_verdict Cong Wang
2021-02-15 19:09   ` John Fastabend

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).