* [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
@ 2018-12-17 10:24 Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

Hi!

This patch set adds support for a new XDP socket bind option,
XDP_ATTACH.

XDP_ATTACH associates an XDP socket with a specific netdev Rx queue.
To redirect a packet to an attached socket from XDP, the
bpf_xsk_redirect helper, also introduced in this series, is used.
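
From user space, attaching is simply a matter of passing the new
flag to bind(). A minimal sketch with error handling elided (the
sockaddr_xdp fields are from the existing if_xdp.h uapi):

  struct sockaddr_xdp sxdp = {};

  sxdp.sxdp_family = AF_XDP;
  sxdp.sxdp_ifindex = ifindex;    /* netdev to attach to */
  sxdp.sxdp_queue_id = queue_id;  /* Rx queue to attach to */
  sxdp.sxdp_flags = XDP_ATTACH;

  bind(xsk_fd, (struct sockaddr *)&sxdp, sizeof(sxdp));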

Many XDP socket users just need a simple way of creating/binding a
socket and receiving frames right away without a complicated XDP
program. "Attached" XDP sockets remove the need for the XSKMAP, and
allow for a trivial XDP program, e.g.:

  SEC("xdp")
  int xdp_prog(struct xdp_md *ctx)
  {
        return bpf_xsk_redirect(ctx);
  }

An attached XDP socket also performs better than XSKMAP-based
sockets (performance numbers below).

The first three patches of the series add support for the XDP_ATTACH
flag and the BPF helper.

Since the trivial XDP program above will be a very common way of using
AF_XDP sockets, it makes sense to bundle that XDP program with libbpf.

We call a BPF program that is bundled with libbpf a "builtin
program". To access the bpf_object/bpf_program of a builtin program,
the following libbpf APIs are introduced:

  LIBBPF_API struct bpf_object *bpf_object__open_builtin(
                  enum bpf_prog_type prog_type);

  LIBBPF_API struct bpf_program *
  bpf_object__find_xdp_builtin_program(struct bpf_object *obj,
                                       enum libbpf_builtin_xdp_prog prog);

The first function is used to get a handle to the bpf_object
containing all builtin programs for a certain program type. The
latter is used to access a certain builtin program from the
bpf_object. Note that currently only XDP is supported. When other
program types are supported, additional
bpf_object__find_PROG_TYPE_builtin_program functions will be
required.
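
For reference, this is how an application is expected to use the new
API. A sketch with error handling elided, matching what the xdpsock
sample does in patch 7:

  struct bpf_program *prog;
  struct bpf_object *obj;
  int prog_fd;

  obj = bpf_object__open_builtin(BPF_PROG_TYPE_XDP);
  prog = bpf_object__find_xdp_builtin_program(
                  obj, LIBBPF_BUILTIN_XDP__XSK_REDIRECT);
  bpf_program__load(prog, "GPL", 0);
  prog_fd = bpf_program__fd(prog);
  bpf_set_link_xdp_fd(ifindex, prog_fd, xdp_flags);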
    
Patches 4 and 5 introduce the "builtin" program to libbpf.

The last two patches add XDP_ATTACH support and the "builtin" libbpf
support to the sample application.

Some questions:

* If the only builtin programs in libbpf end up being XDP programs,
  the libbpf API in this series might be a bit over-engineered.
  Thoughts?

* Is the idea that users of libbpf should use the IS_ERR*
  functionality for checking pointers (e.g. __open), or is that just
  a library-internal convenience?

Finally, the performance numbers for rxdrop (Intel(R) Xeon(R) Gold
6154 CPU @ 3.00GHz):

XDP_SKB:
XSKMAP:     2.8 Mpps
XDP_ATTACH: 2.9 Mpps

XDP_DRV - copy:
XSKMAP:     8.5 Mpps
XDP_ATTACH: 9.3 Mpps

XDP_DRV - zero-copy:
XSKMAP:     15.1 Mpps
XDP_ATTACH: 17.3 Mpps

Thanks!
Björn

v1->v2: Reworked the "builtin program" concept. In v1, the builtin
        program was part of the kernel, which simply added too much
        complexity. In v2, the "builtin" program was moved to libbpf
        instead. Alexei suggested renaming XDP_ATTACH to
        XDP_BUILTIN_ATTACH, but given that the "builtin" functionality
        was moved out of the kernel, the old name stuck.


Björn Töpel (7):
  xsk: simplify AF_XDP socket teardown
  xsk: add XDP_ATTACH bind() flag
  bpf: add bpf_xsk_redirect function
  tools/bpf: sync kernel uapi bpf.h to tools directory
  libbpf: initial support for builtin BPF programs
  samples: bpf: simplify/cleanup xdpsock
  samples: bpf: add support for XDP_ATTACH to xdpsock

 include/linux/filter.h         |   4 +
 include/linux/netdevice.h      |   1 +
 include/net/xdp_sock.h         |   3 +
 include/trace/events/xdp.h     |  61 ++++++++++++++
 include/uapi/linux/bpf.h       |  14 +++-
 include/uapi/linux/if_xdp.h    |   1 +
 net/core/filter.c              | 100 +++++++++++++++++++++++
 net/xdp/xsk.c                  |  67 +++++++++------
 samples/bpf/xdpsock_user.c     | 145 +++++++++++++++++++++++----------
 tools/include/uapi/linux/bpf.h |  14 +++-
 tools/lib/bpf/libbpf.c         |  85 +++++++++++++++++++
 tools/lib/bpf/libbpf.h         |  14 ++++
 tools/lib/bpf/libbpf.map       |   2 +
 13 files changed, 441 insertions(+), 70 deletions(-)

-- 
2.19.1


* [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
@ 2018-12-17 10:24 ` Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

Prior to this commit, when the struct socket object was being
released, the UMEM did not have its reference count decreased.
Instead, this was done in the struct sock sk_destruct function.

There is no reason to keep the UMEM reference around when the socket
is being orphaned, so in this patch xdp_put_umem is called in the
xsk_release function. As a result, the xsk_destruct function can be
removed!

Note that a struct xdp_sock reference might still linger in the
XSKMAP after the UMEM is released, e.g. if a user does not clear the
XSKMAP prior to closing the process. This sock will be in a
"released" zombie-like state until the XSKMAP is removed.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 07156f43d295..a03268454a27 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -366,6 +366,7 @@ static int xsk_release(struct socket *sock)
 
 	xskq_destroy(xs->rx);
 	xskq_destroy(xs->tx);
+	xdp_put_umem(xs->umem);
 
 	sock_orphan(sk);
 	sock->sk = NULL;
@@ -713,18 +714,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendpage	= sock_no_sendpage,
 };
 
-static void xsk_destruct(struct sock *sk)
-{
-	struct xdp_sock *xs = xdp_sk(sk);
-
-	if (!sock_flag(sk, SOCK_DEAD))
-		return;
-
-	xdp_put_umem(xs->umem);
-
-	sk_refcnt_debug_dec(sk);
-}
-
 static int xsk_create(struct net *net, struct socket *sock, int protocol,
 		      int kern)
 {
@@ -751,9 +740,6 @@ static int xsk_create(struct net *net, struct socket *sock, int protocol,
 
 	sk->sk_family = PF_XDP;
 
-	sk->sk_destruct = xsk_destruct;
-	sk_refcnt_debug_inc(sk);
-
 	sock_set_flag(sk, SOCK_RCU_FREE);
 
 	xs = xdp_sk(sk);
-- 
2.19.1


* [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
@ 2018-12-17 10:24 ` Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

In this commit the XDP_ATTACH bind() flag is introduced. When an XDP
socket is bound with this flag set, the socket will be associated
with a certain netdev Rx queue. The idea is that XDP socket users
should not have to deal with the XSKMAP. Instead, XDP_ATTACH will
"attach" an XDP socket to a queue, and XDP programs simply use
bpf_xsk_redirect to redirect XDP frames to an attached socket.

An XDP socket bound with this option performs better, since the BPF
program is smaller and the kernel code path executes fewer
instructions.

This commit only introduces the first part of XDP_ATTACH, namely
associating the XDP socket with a netdev Rx queue. The
bpf_xsk_redirect function will be introduced in the next commit.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/netdevice.h   |  1 +
 include/net/xdp_sock.h      |  3 +++
 include/uapi/linux/if_xdp.h |  1 +
 net/xdp/xsk.c               | 51 +++++++++++++++++++++++++++++--------
 4 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index fc6ba71513be..a2d19af6b8dd 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -743,6 +743,7 @@ struct netdev_rx_queue {
 	struct xdp_rxq_info		xdp_rxq;
 #ifdef CONFIG_XDP_SOCKETS
 	struct xdp_umem                 *umem;
+	struct xdp_sock			*xsk;
 #endif
 } ____cacheline_aligned_in_smp;
 
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 13acb9803a6d..13975723430c 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -61,6 +61,7 @@ struct xdp_sock {
 	struct xsk_queue *tx ____cacheline_aligned_in_smp;
 	struct list_head list;
 	bool zc;
+	bool attached;
 	/* Protects multiple processes in the control path */
 	struct mutex mutex;
 	/* Mutual exclusion of NAPI TX thread and sendmsg error paths
@@ -72,7 +73,9 @@ struct xdp_sock {
 
 struct xdp_buff;
 #ifdef CONFIG_XDP_SOCKETS
+int xsk_generic_attached_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
 int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
+int xsk_attached_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
 int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
 void xsk_flush(struct xdp_sock *xs);
 bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs);
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index caed8b1614ff..bd76235c2749 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -16,6 +16,7 @@
 #define XDP_SHARED_UMEM	(1 << 0)
 #define XDP_COPY	(1 << 1) /* Force copy-mode */
 #define XDP_ZEROCOPY	(1 << 2) /* Force zero-copy mode */
+#define XDP_ATTACH	(1 << 3)
 
 struct sockaddr_xdp {
 	__u16 sxdp_family;
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index a03268454a27..08d66a22185d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -100,17 +100,20 @@ static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 	return err;
 }
 
-int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
+int xsk_attached_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
-	u32 len;
+	u32 len = xdp->data_end - xdp->data;
+
+	return (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) ?
+		__xsk_rcv_zc(xs, xdp, len) : __xsk_rcv(xs, xdp, len);
+}
 
+int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
+{
 	if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
 		return -EINVAL;
 
-	len = xdp->data_end - xdp->data;
-
-	return (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) ?
-		__xsk_rcv_zc(xs, xdp, len) : __xsk_rcv(xs, xdp, len);
+	return xsk_attached_rcv(xs, xdp);
 }
 
 void xsk_flush(struct xdp_sock *xs)
@@ -119,7 +122,7 @@ void xsk_flush(struct xdp_sock *xs)
 	xs->sk.sk_data_ready(&xs->sk);
 }
 
-int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
+int xsk_generic_attached_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
 	u32 metalen = xdp->data - xdp->data_meta;
 	u32 len = xdp->data_end - xdp->data;
@@ -127,9 +130,6 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	u64 addr;
 	int err;
 
-	if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
-		return -EINVAL;
-
 	if (!xskq_peek_addr(xs->umem->fq, &addr) ||
 	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
 		xs->rx_dropped++;
@@ -152,6 +152,14 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	return err;
 }
 
+int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
+{
+	if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
+		return -EINVAL;
+
+	return xsk_generic_attached_rcv(xs, xdp);
+}
+
 void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries)
 {
 	xskq_produce_flush_addr_n(umem->cq, nb_entries);
@@ -339,6 +347,19 @@ static int xsk_init_queue(u32 entries, struct xsk_queue **queue,
 	return 0;
 }
 
+static void xsk_detach(struct xdp_sock *xs)
+{
+	if (xs->attached)
+		WRITE_ONCE(xs->dev->_rx[xs->queue_id].xsk, NULL);
+}
+
+static int xsk_attach(struct xdp_sock *xs, struct net_device *dev, u16 qid)
+{
+	xs->attached = true;
+	WRITE_ONCE(dev->_rx[qid].xsk, xs);
+	return 0;
+}
+
 static int xsk_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
@@ -359,6 +380,7 @@ static int xsk_release(struct socket *sock)
 
 		/* Wait for driver to stop using the xdp socket. */
 		xdp_del_sk_umem(xs->umem, xs);
+		xsk_detach(xs);
 		xs->dev = NULL;
 		synchronize_net();
 		dev_put(dev);
@@ -432,7 +454,8 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 		struct xdp_sock *umem_xs;
 		struct socket *sock;
 
-		if ((flags & XDP_COPY) || (flags & XDP_ZEROCOPY)) {
+		if ((flags & XDP_COPY) || (flags & XDP_ZEROCOPY) ||
+		    (flags & XDP_ATTACH)) {
 			/* Cannot specify flags for shared sockets. */
 			err = -EINVAL;
 			goto out_unlock;
@@ -478,6 +501,12 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 		err = xdp_umem_assign_dev(xs->umem, dev, qid, flags);
 		if (err)
 			goto out_unlock;
+
+		if (flags & XDP_ATTACH) {
+			err = xsk_attach(xs, dev, qid);
+			if (err)
+				goto out_unlock;
+		}
 	}
 
 	xs->dev = dev;
-- 
2.19.1


* [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
@ 2018-12-17 10:24 ` Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

The bpf_xsk_redirect function is a new redirect BPF helper, in
addition to bpf_redirect/bpf_redirect_map. If an XDP socket has been
attached to a netdev Rx queue via the XDP_ATTACH bind() option and
bpf_xsk_redirect is called, the packet will be redirected to the
attached socket.

The bpf_xsk_redirect function returns XDP_REDIRECT if there is a
socket attached to the originating queue, otherwise XDP_PASS.
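
A program that wants a different fallback policy can act on the
return value, e.g. a sketch that drops rather than passes packets
when no socket is attached:

  SEC("xdp")
  int xdp_prog(struct xdp_md *ctx)
  {
          int act = bpf_xsk_redirect(ctx);

          if (act == XDP_PASS) /* no socket attached to this queue */
                  return XDP_DROP;
          return act;
  }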

This commit also adds the corresponding tracepoints for the redirect
call.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/filter.h     |   4 ++
 include/trace/events/xdp.h |  61 ++++++++++++++++++++++
 include/uapi/linux/bpf.h   |  14 +++++-
 net/core/filter.c          | 100 +++++++++++++++++++++++++++++++++++++
 4 files changed, 178 insertions(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 537e9e7c6e6f..ee3efca4477e 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -532,6 +532,10 @@ struct bpf_redirect_info {
 	u32 flags;
 	struct bpf_map *map;
 	struct bpf_map *map_to_flush;
+#ifdef CONFIG_XDP_SOCKETS
+	struct xdp_sock *xsk;
+	struct xdp_sock *xsk_to_flush;
+#endif
 	u32 kern_flags;
 };
 
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index e95cb86b65cf..30f399bd462b 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -158,6 +158,67 @@ struct _bpf_dtab_netdev {
 	 trace_xdp_redirect_map_err(dev, xdp, devmap_ifindex(fwd, map),	\
 				    err, map, idx)
 
+DECLARE_EVENT_CLASS(xsk_redirect_template,
+
+	TP_PROTO(const struct net_device *dev,
+		 const struct bpf_prog *xdp,
+		 int err,
+		 struct xdp_buff *xbuff),
+
+	TP_ARGS(dev, xdp, err, xbuff),
+
+	TP_STRUCT__entry(
+		__field(int, prog_id)
+		__field(u32, act)
+		__field(int, ifindex)
+		__field(int, err)
+		__field(u32, queue_index)
+		__field(enum xdp_mem_type, mem_type)
+	),
+
+	TP_fast_assign(
+		__entry->prog_id	= xdp->aux->id;
+		__entry->act		= XDP_REDIRECT;
+		__entry->ifindex	= dev->ifindex;
+		__entry->err		= err;
+		__entry->queue_index	= xbuff->rxq->queue_index;
+		__entry->mem_type	= xbuff->rxq->mem.type;
+	),
+
+	TP_printk("prog_id=%d action=%s ifindex=%d err=%d queue_index=%d"
+		  " mem_type=%d",
+		  __entry->prog_id,
+		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
+		  __entry->ifindex,
+		  __entry->err,
+		  __entry->queue_index,
+		  __entry->mem_type)
+);
+
+DEFINE_EVENT(xsk_redirect_template, xsk_redirect,
+	TP_PROTO(const struct net_device *dev,
+		 const struct bpf_prog *xdp,
+		 int err,
+		 struct xdp_buff *xbuff),
+
+	TP_ARGS(dev, xdp, err, xbuff)
+);
+
+DEFINE_EVENT(xsk_redirect_template, xsk_redirect_err,
+	TP_PROTO(const struct net_device *dev,
+		 const struct bpf_prog *xdp,
+		 int err,
+		 struct xdp_buff *xbuff),
+
+	TP_ARGS(dev, xdp, err, xbuff)
+);
+
+#define _trace_xsk_redirect(dev, xdp, xbuff)		\
+	 trace_xsk_redirect(dev, xdp, 0, xbuff)
+
+#define _trace_xsk_redirect_err(dev, xdp, xbuff, err)	\
+	 trace_xsk_redirect_err(dev, xdp, err, xbuff)
+
 TRACE_EVENT(xdp_cpumap_kthread,
 
 	TP_PROTO(int map_id, unsigned int processed,  unsigned int drops,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e7d57e89f25f..2568039b6265 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2319,6 +2319,17 @@ union bpf_attr {
  *		"**y**".
  *	Return
  *		0
+ *
+ * int bpf_xsk_redirect(struct xdp_buff *xdp_md)
+ *	 Description
+ *		Redirect the packet to the attached XDP socket, if any.
+ *		An XDP socket can be attached to a network interface Rx
+ *		queue by passing the XDP_ATTACH option when binding
+ *		the socket.
+ *
+ *	Return
+ *		**XDP_REDIRECT** if there is an XDP socket attached to the Rx
+ *		queue receiving the frame, otherwise **XDP_PASS**.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2413,7 +2424,8 @@ union bpf_attr {
 	FN(map_peek_elem),		\
 	FN(msg_push_data),		\
 	FN(msg_pop_data),		\
-	FN(rc_pointer_rel),
+	FN(rc_pointer_rel),		\
+	FN(xsk_redirect),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/net/core/filter.c b/net/core/filter.c
index f9348806e843..93ed82cd741f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3415,6 +3415,17 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
 	return 0;
 }
 
+static void xdp_do_flush_xsk(struct bpf_redirect_info *ri)
+{
+#ifdef CONFIG_XDP_SOCKETS
+	struct xdp_sock *xsk = ri->xsk_to_flush;
+
+	ri->xsk_to_flush = NULL;
+	if (xsk)
+		xsk_flush(xsk);
+#endif
+}
+
 void xdp_do_flush_map(void)
 {
 	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
@@ -3436,6 +3447,8 @@ void xdp_do_flush_map(void)
 			break;
 		}
 	}
+
+	xdp_do_flush_xsk(ri);
 }
 EXPORT_SYMBOL_GPL(xdp_do_flush_map);
 
@@ -3501,6 +3514,30 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 	return err;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+static int xdp_do_xsk_redirect(struct net_device *dev, struct xdp_buff *xdp,
+			       struct bpf_prog *xdp_prog,
+			       struct bpf_redirect_info *ri)
+{
+	struct xdp_sock *xsk = ri->xsk;
+	int err;
+
+	ri->xsk = NULL;
+	ri->xsk_to_flush = xsk;
+
+	err = xsk_attached_rcv(xsk, xdp);
+	if (unlikely(err))
+		goto err;
+
+	_trace_xsk_redirect(dev, xdp_prog, xdp);
+	return 0;
+
+err:
+	_trace_xsk_redirect_err(dev, xdp_prog, xdp, err);
+	return err;
+}
+#endif /* CONFIG_XDP_SOCKETS */
+
 int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 		    struct bpf_prog *xdp_prog)
 {
@@ -3510,6 +3547,10 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 	if (likely(map))
 		return xdp_do_redirect_map(dev, xdp, xdp_prog, map, ri);
 
+#ifdef CONFIG_XDP_SOCKETS
+	if (ri->xsk)
+		return xdp_do_xsk_redirect(dev, xdp, xdp_prog, ri);
+#endif
 	return xdp_do_redirect_slow(dev, xdp, xdp_prog, ri);
 }
 EXPORT_SYMBOL_GPL(xdp_do_redirect);
@@ -3560,6 +3601,33 @@ static int xdp_do_generic_redirect_map(struct net_device *dev,
 	return err;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+static int xdp_do_generic_xsk_redirect(struct net_device *dev,
+				       struct xdp_buff *xdp,
+				       struct bpf_prog *xdp_prog,
+				       struct sk_buff *skb)
+{
+	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+	struct xdp_sock *xsk = ri->xsk;
+	int err;
+
+	ri->xsk = NULL;
+	ri->xsk_to_flush = NULL;
+
+	err = xsk_generic_attached_rcv(xsk, xdp);
+	if (err)
+		goto err;
+
+	consume_skb(skb);
+	_trace_xsk_redirect(dev, xdp_prog, xdp);
+	return 0;
+
+err:
+	_trace_xsk_redirect_err(dev, xdp_prog, xdp, err);
+	return err;
+}
+#endif /* CONFIG_XDP_SOCKETS */
+
 int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 			    struct xdp_buff *xdp, struct bpf_prog *xdp_prog)
 {
@@ -3572,6 +3640,11 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 	if (map)
 		return xdp_do_generic_redirect_map(dev, skb, xdp, xdp_prog,
 						   map);
+#ifdef CONFIG_XDP_SOCKETS
+	if (ri->xsk)
+		return xdp_do_generic_xsk_redirect(dev, xdp, xdp_prog, skb);
+#endif
+
 	ri->ifindex = 0;
 	fwd = dev_get_by_index_rcu(dev_net(dev), index);
 	if (unlikely(!fwd)) {
@@ -3639,6 +3712,29 @@ static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {
 	.arg3_type      = ARG_ANYTHING,
 };
 
+#ifdef CONFIG_XDP_SOCKETS
+BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
+{
+	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+	struct xdp_sock *xsk;
+
+	xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
+	if (xsk) {
+		ri->xsk = xsk;
+		return XDP_REDIRECT;
+	}
+
+	return XDP_PASS;
+}
+
+static const struct bpf_func_proto bpf_xdp_xsk_redirect_proto = {
+	.func           = bpf_xdp_xsk_redirect,
+	.gpl_only       = false,
+	.ret_type       = RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+#endif /* CONFIG_XDP_SOCKETS */
+
 static unsigned long bpf_skb_copy(void *dst_buff, const void *skb,
 				  unsigned long off, unsigned long len)
 {
@@ -5511,6 +5607,10 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_sk_lookup_tcp_proto;
 	case BPF_FUNC_sk_release:
 		return &bpf_sk_release_proto;
+#endif
+#ifdef CONFIG_XDP_SOCKETS
+	case BPF_FUNC_xsk_redirect:
+		return &bpf_xdp_xsk_redirect_proto;
 #endif
 	default:
 		return bpf_base_func_proto(func_id);
-- 
2.19.1


* [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
                   ` (2 preceding siblings ...)
  2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
@ 2018-12-17 10:24 ` Björn Töpel
  2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

Sync kernel uapi bpf.h "BPF_FUNC_xsk_redirect" change to the tools
directory.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 tools/include/uapi/linux/bpf.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index e7d57e89f25f..2568039b6265 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2319,6 +2319,17 @@ union bpf_attr {
  *		"**y**".
  *	Return
  *		0
+ *
+ * int bpf_xsk_redirect(struct xdp_buff *xdp_md)
+ *	 Description
+ *		Redirect the packet to the attached XDP socket, if any.
+ *		An XDP socket can be attached to a network interface Rx
+ *		queue by passing the XDP_ATTACH option when binding
+ *		the socket.
+ *
+ *	Return
+ *		**XDP_REDIRECT** if there is an XDP socket attached to the Rx
+ *		queue receiving the frame, otherwise **XDP_PASS**.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2413,7 +2424,8 @@ union bpf_attr {
 	FN(map_peek_elem),		\
 	FN(msg_push_data),		\
 	FN(msg_pop_data),		\
-	FN(rc_pointer_rel),
+	FN(rc_pointer_rel),		\
+	FN(xsk_redirect),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.19.1


* [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
                   ` (3 preceding siblings ...)
  2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
@ 2018-12-17 10:24 ` Björn Töpel
  2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:24 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

This commit introduces builtin BPF programs to libbpf. A builtin
program is simply a BPF program that is bundled with libbpf.

The first builtin program is an XDP program, "xdp_xsk_redirect", which
is a trivial program that calls bpf_xsk_redirect for each received
packet, and redirects it to the corresponding XDP socket.
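
In C, the builtin program corresponds to the trivial program from the
cover letter:

  SEC("xdp")
  int xdp_xsk_redirect(struct xdp_md *ctx)
  {
          return bpf_xsk_redirect(ctx);
  }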

Two new functions are added to the libbpf API:
  LIBBPF_API struct bpf_object *bpf_object__open_builtin(
                  enum bpf_prog_type prog_type);

  LIBBPF_API struct bpf_program *
  bpf_object__find_xdp_builtin_program(struct bpf_object *obj,
                                       enum libbpf_builtin_xdp_prog prog);

The first function is used to get a handle to the bpf_object
containing all builtin programs for a certain program type. The
latter is used to access a certain builtin program from the
bpf_object. Note that currently only XDP is supported. When other
program types are supported, additional
bpf_object__find_PROG_TYPE_builtin_program functions will be
required.

When/if packet cloning is introduced to XDP, another builtin program
candidate would be a program that clones all packets to an XDP socket.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 tools/lib/bpf/libbpf.c   | 85 ++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h   | 14 +++++++
 tools/lib/bpf/libbpf.map |  2 +
 3 files changed, 101 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e2bc75ee1614..d8551193862b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -242,6 +242,31 @@ struct bpf_object {
 };
 #define obj_elf_valid(o)	((o)->efile.elf)
 
+struct libbpf_builtin_prog {
+	const char *name;
+	const struct bpf_insn *insns;
+	size_t size;
+};
+
+/*
+ * Builtin XDP program: LIBBPF_BUILTIN_XDP__XSK_REDIRECT
+ *
+ * Trivial XDP program that calls bpf_xsk_redirect() on every received
+ * frame.
+ */
+static const struct bpf_insn builtin_xdp_xsk_redirect_insn[] = {
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_xsk_redirect),
+	BPF_EXIT_INSN(),
+};
+
+static const struct libbpf_builtin_prog libbpf_builtin_xdp_prog[] = {
+	[LIBBPF_BUILTIN_XDP__XSK_REDIRECT] = {
+		"xdp_xsk_redirect",
+		&builtin_xdp_xsk_redirect_insn[0],
+		sizeof(builtin_xdp_xsk_redirect_insn)
+	},
+};
+
 void bpf_program__unload(struct bpf_program *prog)
 {
 	int i;
@@ -2990,3 +3015,63 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
 	ring_buffer_write_tail(header, data_tail);
 	return ret;
 }
+
+struct bpf_object *bpf_object__open_builtin(enum bpf_prog_type prog_type)
+{
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	int err, i;
+
+	/* Right now, only XDP is supported. */
+	if (prog_type != BPF_PROG_TYPE_XDP)
+		return ERR_PTR(-EINVAL);
+
+	obj = bpf_object__new("", NULL, 0);
+	if (IS_ERR(obj))
+		return NULL;
+
+	CHECK_ERR(bpf_object__init_license(obj, (void *)"GPL", sizeof("GPL")),
+		  err, out);
+
+	for (i = 0; i < __LIBBPF_BUILTIN_XDP__END; i++) {
+		err = bpf_object__add_program(
+			obj,
+			(void *)libbpf_builtin_xdp_prog[i].insns,
+			libbpf_builtin_xdp_prog[i].size,
+			(char *)libbpf_builtin_xdp_prog[i].name, i);
+		if (err) {
+			pr_warning("failed to add builtin program %s\n",
+				   libbpf_builtin_xdp_prog[i].name);
+			goto out;
+		}
+	}
+
+	bpf_object__for_each_program(prog, obj) {
+		bpf_program__set_type(prog, BPF_PROG_TYPE_XDP);
+
+		prog->name = strdup(libbpf_builtin_xdp_prog[prog->idx].name);
+		if (!prog->name) {
+			pr_warning("failed to allocate memory for name %s\n",
+				   libbpf_builtin_xdp_prog[prog->idx].name);
+			goto out;
+		}
+	}
+
+	return obj;
+out:
+	bpf_object__close(obj);
+	return NULL;
+
+}
+
+struct bpf_program *bpf_object__find_xdp_builtin_program(
+	struct bpf_object *obj, enum libbpf_builtin_xdp_prog prog)
+{
+	if (!obj)
+		return NULL;
+
+	if (prog < 0 || prog >= __LIBBPF_BUILTIN_XDP__END)
+		return NULL;
+
+	return bpf_object__find_prog_by_idx(obj, prog);
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5f68d7b75215..0de0cc2da240 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -45,6 +45,15 @@ enum libbpf_errno {
 	__LIBBPF_ERRNO__END,
 };
 
+enum libbpf_builtin_xdp_prog {
+	/*
+	 * Trivial XDP program that calls bpf_xsk_redirect
+	 * unconditionally for every received packet.
+	 */
+	LIBBPF_BUILTIN_XDP__XSK_REDIRECT,
+	__LIBBPF_BUILTIN_XDP__END,
+};
+
 LIBBPF_API int libbpf_strerror(int err, char *buf, size_t size);
 
 /*
@@ -75,6 +84,8 @@ struct bpf_object *__bpf_object__open_xattr(struct bpf_object_open_attr *attr,
 LIBBPF_API struct bpf_object *bpf_object__open_buffer(void *obj_buf,
 						      size_t obj_buf_sz,
 						      const char *name);
+LIBBPF_API struct bpf_object *bpf_object__open_builtin(
+	enum bpf_prog_type prog_type);
 LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
 LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj,
 				      const char *path);
@@ -94,6 +105,9 @@ LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
 
 LIBBPF_API struct bpf_program *
 bpf_object__find_program_by_title(struct bpf_object *obj, const char *title);
+LIBBPF_API struct bpf_program *
+bpf_object__find_xdp_builtin_program(struct bpf_object *obj,
+				     enum libbpf_builtin_xdp_prog prog);
 
 LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
 #define bpf_object__for_each_safe(pos, tmp)			\
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index cd02cd4e2cc3..58acd84d5ff3 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -121,6 +121,8 @@ LIBBPF_0.0.1 {
 		libbpf_prog_type_by_name;
 		libbpf_set_print;
 		libbpf_strerror;
+		bpf_object__open_builtin;
+		bpf_object__find_xdp_builtin_program;
 	local:
 		*;
 };
-- 
2.19.1


* [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
                   ` (4 preceding siblings ...)
  2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
@ 2018-12-17 10:25 ` Björn Töpel
  2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
  2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:25 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

Break out all map and XDP program setup into a function.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 samples/bpf/xdpsock_user.c | 91 +++++++++++++++++++++-----------------
 1 file changed, 50 insertions(+), 41 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 57ecadc58403..c72face0da05 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -119,7 +119,7 @@ struct xdpsock {
 };
 
 static int num_socks;
-struct xdpsock *xsks[MAX_SOCKS];
+static struct xdpsock *xsks[MAX_SOCKS];
 
 static unsigned long get_nsecs(void)
 {
@@ -897,37 +897,21 @@ static void l2fwd(struct xdpsock *xsk)
 	}
 }
 
-int main(int argc, char **argv)
+static void setup_xdp_xskmap(const char *pathprefix)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
-	struct bpf_prog_load_attr prog_load_attr = {
-		.prog_type	= BPF_PROG_TYPE_XDP,
-	};
-	int prog_fd, qidconf_map, xsks_map;
+	int prog_fd, qidconf_map, xsks_map, i, key = 0, err;
+	struct bpf_prog_load_attr prog_load_attr = {};
+	char bpf_objs_filename[256];
+	struct bpf_program *prog;
 	struct bpf_object *obj;
-	char xdp_filename[256];
 	struct bpf_map *map;
-	int i, ret, key = 0;
-	pthread_t pt;
 
-	parse_command_line(argc, argv);
-
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		fprintf(stderr, "ERROR: setrlimit(RLIMIT_MEMLOCK) \"%s\"\n",
-			strerror(errno));
-		exit(EXIT_FAILURE);
-	}
-
-	snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
-	prog_load_attr.file = xdp_filename;
+	snprintf(bpf_objs_filename, sizeof(bpf_objs_filename), "%s_kern.o",
+		 pathprefix);
+	prog_load_attr.file = bpf_objs_filename;
 
 	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
 		exit(EXIT_FAILURE);
-	if (prog_fd < 0) {
-		fprintf(stderr, "ERROR: no program found: %s\n",
-			strerror(prog_fd));
-		exit(EXIT_FAILURE);
-	}
 
 	map = bpf_object__find_map_by_name(obj, "qidconf_map");
 	qidconf_map = bpf_map__fd(map);
@@ -945,14 +929,51 @@ int main(int argc, char **argv)
 		exit(EXIT_FAILURE);
 	}
 
+	err = bpf_map_update_elem(qidconf_map, &key, &opt_queue, 0);
+	if (err) {
+		fprintf(stderr, "ERROR: bpf_map_update_elem qidconf\n");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Insert sockets into the XSKMAP. */
+	for (i = 0; i < num_socks; i++) {
+		key = i;
+		err = bpf_map_update_elem(xsks_map, &key, &xsks[i]->sfd, 0);
+		if (err) {
+			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	prog = bpf_object__find_program_by_title(obj, "xdp_sock");
+	prog_fd = bpf_program__fd(prog);
+	if (prog_fd < 0) {
+		fprintf(stderr, "ERROR: no xdp_sock program found: %s\n",
+			strerror(prog_fd));
+		exit(EXIT_FAILURE);
+	}
+
 	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
 		fprintf(stderr, "ERROR: link set xdp fd failed\n");
 		exit(EXIT_FAILURE);
 	}
+}
 
-	ret = bpf_map_update_elem(qidconf_map, &key, &opt_queue, 0);
-	if (ret) {
-		fprintf(stderr, "ERROR: bpf_map_update_elem qidconf\n");
+int main(int argc, char **argv)
+{
+	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
+	pthread_t pt;
+	int ret;
+
+	signal(SIGINT, int_exit);
+	signal(SIGTERM, int_exit);
+	signal(SIGABRT, int_exit);
+
+	parse_command_line(argc, argv);
+
+	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
+		fprintf(stderr, "ERROR: setrlimit(RLIMIT_MEMLOCK) \"%s\"\n",
+			strerror(errno));
 		exit(EXIT_FAILURE);
 	}
 
@@ -964,19 +985,7 @@ int main(int argc, char **argv)
 		xsks[num_socks++] = xsk_configure(xsks[0]->umem);
 #endif
 
-	/* ...and insert them into the map. */
-	for (i = 0; i < num_socks; i++) {
-		key = i;
-		ret = bpf_map_update_elem(xsks_map, &key, &xsks[i]->sfd, 0);
-		if (ret) {
-			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
-			exit(EXIT_FAILURE);
-		}
-	}
-
-	signal(SIGINT, int_exit);
-	signal(SIGTERM, int_exit);
-	signal(SIGABRT, int_exit);
+	setup_xdp_xskmap(argv[0]);
 
 	setlocale(LC_ALL, "");
 
-- 
2.19.1


* [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
                   ` (5 preceding siblings ...)
  2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
@ 2018-12-17 10:25 ` Björn Töpel
  2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
  7 siblings, 0 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 10:25 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel, netdev
  Cc: Björn Töpel, brouer, u9012063, qi.z.zhang,
	jakub.kicinski, andrew

From: Björn Töpel <bjorn.topel@intel.com>

This commit adds support for the XDP_ATTACH bind() flag. If "-a" is
passed to the program, XDP_ATTACH is passed to bind() and the builtin
libbpf XDP program is loaded.
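
Example invocation, doing rxdrop on queue 1 of eth0 with an attached
socket (-r selects the sample's rxdrop mode):

  $ ./xdpsock -i eth0 -q 1 -r -a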

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 samples/bpf/xdpsock_user.c | 56 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index c72face0da05..53f9063bbdd9 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -650,6 +650,7 @@ static struct option long_options[] = {
 	{"interval", required_argument, 0, 'n'},
 	{"zero-copy", no_argument, 0, 'z'},
 	{"copy", no_argument, 0, 'c'},
+	{"attach", no_argument, 0, 'a'},
 	{0, 0, 0, 0}
 };
 
@@ -670,6 +671,7 @@ static void usage(const char *prog)
 		"  -n, --interval=n	Specify statistics update interval (default 1 sec).\n"
 		"  -z, --zero-copy      Force zero-copy mode.\n"
 		"  -c, --copy           Force copy mode.\n"
+		"  -a, --attach		XDP_ATTACH mode.\n"
 		"\n";
 	fprintf(stderr, str, prog);
 	exit(EXIT_FAILURE);
@@ -682,7 +684,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "rtli:q:psSNn:cz", long_options,
+		c = getopt_long(argc, argv, "rtli:q:psSNn:cza", long_options,
 				&option_index);
 		if (c == -1)
 			break;
@@ -725,6 +727,9 @@ static void parse_command_line(int argc, char **argv)
 		case 'c':
 			opt_xdp_bind_flags |= XDP_COPY;
 			break;
+		case 'a':
+			opt_xdp_bind_flags |= XDP_ATTACH;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -959,6 +964,45 @@ static void setup_xdp_xskmap(const char *pathprefix)
 	}
 }
 
+static void setup_xdp_attach(void)
+{
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	int prog_fd, err;
+
+	obj = bpf_object__open_builtin(BPF_PROG_TYPE_XDP);
+	if (!obj) {
+		fprintf(stderr, "ERROR: opening builtin XDP\n");
+		exit(EXIT_FAILURE);
+	}
+
+	prog = bpf_object__find_xdp_builtin_program(
+		obj,
+		LIBBPF_BUILTIN_XDP__XSK_REDIRECT);
+	if (!prog) {
+		fprintf(stderr, "ERROR: finding builtin XDP program\n");
+		exit(EXIT_FAILURE);
+	}
+
+	err = bpf_program__load(prog, "GPL", 0);
+	if (err) {
+		fprintf(stderr, "ERROR: error loading XDP program\n");
+		exit(EXIT_FAILURE);
+	}
+
+	prog_fd = bpf_program__fd(prog);
+	if (prog_fd < 0) {
+		fprintf(stderr, "ERROR: no xdp_sock program found: %s\n",
+			strerror(prog_fd));
+		exit(EXIT_FAILURE);
+	}
+
+	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
+		fprintf(stderr, "ERROR: link set xdp fd failed\n");
+		exit(EXIT_FAILURE);
+	}
+}
+
 int main(int argc, char **argv)
 {
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
@@ -981,11 +1025,19 @@ int main(int argc, char **argv)
 	xsks[num_socks++] = xsk_configure(NULL);
 
 #if RR_LB
+	if (opt_xdp_bind_flags & XDP_ATTACH) {
+		fprintf(stderr, "ERROR: Not supported by application!\n");
+		exit(EXIT_FAILURE);
+	}
+
 	for (i = 0; i < MAX_SOCKS - 1; i++)
 		xsks[num_socks++] = xsk_configure(xsks[0]->umem);
 #endif
 
-	setup_xdp_xskmap(argv[0]);
+	if (opt_xdp_bind_flags & XDP_ATTACH)
+		setup_xdp_attach();
+	else
+		setup_xdp_xskmap(argv[0]);
 
 	setlocale(LC_ALL, "");
 
-- 
2.19.1


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
                   ` (6 preceding siblings ...)
  2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
@ 2018-12-17 12:50 ` Jesper Dangaard Brouer
  2018-12-17 13:39   ` Björn Töpel
  7 siblings, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2018-12-17 12:50 UTC (permalink / raw)
  To: Björn Töpel
  Cc: magnus.karlsson, magnus.karlsson, ast, daniel, netdev,
	Björn Töpel, u9012063, qi.z.zhang, jakub.kicinski,
	andrew, brouer

On Mon, 17 Dec 2018 11:24:54 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
> helper is used. The bpf_xsk_redirect helper is also introduced in this
> series.
> 
> Many XDP socket users just need a simple way of creating/binding a
> socket and receiving frames right away without a complicated XDP
> program. "Attached" XDP sockets removes the need for the XSKMAP, and
> allows for a trivial XDP program, e.g.:
> 
>   SEC("xdp")
>   int xdp_prog(struct xdp_md *ctx)
>   {
>         return bpf_xsk_redirect(ctx);
>   }
> 
> An attached XDP socket also has better performance than the XSKMAP
> based sockets (performance numbers below).

I still have a general problem with this approach.

The AF_XDP socket is built around (and gets its performance from)
being tied to a specific RX-queue.  That design begs for an XDP
program per RX-queue.

Patchset-v1 moved towards this goal.  But in this patchset-v2 you
steer away from this again, and work around the issue with the
current limitation of one XDP program per netdev.  (Which results in:
if a single AF_XDP socket is used in the system, which by design can
ONLY serve a single RX-queue, then ALL other XDP_PASS traffic also
has to take the overhead of an indirect BPF call.)

IMHO with this use-case, now is the time to introduce XDP programs
per RX-queue.  Yes, it will be more work, but I will offer to help
out.  This should be generalized, as XDP programs per RX-queue can be
used by other use-cases too:
  In general terms: we can set up a NIC hardware filter to deliver
frames matching some criteria, and then avoid rechecking those
criteria on the (RX) CPU when/if we can attach an XDP prog to this
specific RX-queue directly.  This *IS* exactly what AF_XDP does, but
it is in general useful for others, like CPUMAP redirect.
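
For example, steering a flow to a dedicated queue with a standard
ethtool ntuple filter:

  # ethtool -N eth0 flow-type udp4 dst-port 4242 action 16

and then attaching a specialized XDP program to queue 16 only.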

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
@ 2018-12-17 13:39   ` Björn Töpel
  2018-12-17 14:10     ` Jesper Dangaard Brouer
  2018-12-18  2:36     ` Jakub Kicinski
  0 siblings, 2 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 13:39 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Björn Töpel
  Cc: magnus.karlsson, magnus.karlsson, ast, daniel, netdev, u9012063,
	qi.z.zhang, jakub.kicinski, andrew



On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
> On Mon, 17 Dec 2018 11:24:54 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
> 
>> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
>> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
>> helper is used. The bpf_xsk_redirect helper is also introduced in this
>> series.
>>
>> Many XDP socket users just need a simple way of creating/binding a
>> socket and receiving frames right away without a complicated XDP
>> program. "Attached" XDP sockets removes the need for the XSKMAP, and
>> allows for a trivial XDP program, e.g.:
>>
>>    SEC("xdp")
>>    int xdp_prog(struct xdp_md *ctx)
>>    {
>>          return bpf_xsk_redirect(ctx);
>>    }
>>
>> An attached XDP socket also has better performance than the XSKMAP
>> based sockets (performance numbers below).
> 
> I still have a general problem with this approach.
>

And I appreciate that you have comments on the design/code! :-) Thank you!


> The AF_XDP socket is build around (and gets its performance) from
> being tied to a specific RX-queue.  That design begs to have an XDP
> program per RX-queue.
>
> Patchset-v1 moved towards this goal.  But in this patchset-v2 you
> steer away from this again, and work-around the issue with the current
> limitations of 1-XDP program per netdev.  (Which result in; if a
> single AF_XDP socket is used in the system, which can ONLY be for a
> single RX-queue by design, then ALL other XDP_PASS traffic also have
> to take the overhead of indirect BPF call).
>

I agree that a per-queue program would be a good fit, but I think it
still makes sense to add XDP_ATTACH and the helper, to make it easier
for XDP program authors to use AF_XDP. Would you prefer that this
functionality was held back until per-queue programs are introduced?

OTOH the implementation would probably look different if there was a
per-queue program, because this would open up for more optimizations.
One could even imagine the socket fd being part of the program (part
of the relocation process) when loading the program. That could get
rid of yet another if-statement that checks for socket existence. :-)

> IMHO with this use-case, now is the time to introduce XDP programs per
> RX-queue.  Yes, it will be more work, but I will offer to helpout.
> This should be generalized as XDP programs per RX-queue can be used by
> other use-cases too:
>    In general terms: We can setup a NIC hardware filter to deliver
> frame matching some criteria, then we can avoid rechecking these
> criterias in on the (RX) CPU when/if we can attach an XDP prog to this
> specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
> is in general useful for others, like CPUMAP redirect.
>

Fair enough, and thank you for offering to help out. And the fact
that users *other than AF_XDP* can benefit from it is good. Before we
dive down this hole, what are the opinions of the BPF maintainers?
Maybe it's better to hack up an RFC, and then take the discussion?


Björn


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 13:39   ` Björn Töpel
@ 2018-12-17 14:10     ` Jesper Dangaard Brouer
  2018-12-17 14:55       ` Magnus Karlsson
  2018-12-18  2:36     ` Jakub Kicinski
  1 sibling, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2018-12-17 14:10 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, ast,
	daniel, netdev, u9012063, qi.z.zhang, jakub.kicinski, andrew,
	brouer

On Mon, 17 Dec 2018 14:39:57 +0100
Björn Töpel <bjorn.topel@intel.com> wrote:

> On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
> > On Mon, 17 Dec 2018 11:24:54 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >   
> >> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
> >> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
> >> helper is used. The bpf_xsk_redirect helper is also introduced in this
> >> series.
> >>
> >> Many XDP socket users just need a simple way of creating/binding a
> >> socket and receiving frames right away without a complicated XDP
> >> program. "Attached" XDP sockets removes the need for the XSKMAP, and
> >> allows for a trivial XDP program, e.g.:
> >>
> >>    SEC("xdp")
> >>    int xdp_prog(struct xdp_md *ctx)
> >>    {
> >>          return bpf_xsk_redirect(ctx);
> >>    }
> >>
> >> An attached XDP socket also has better performance than the XSKMAP
> >> based sockets (performance numbers below).  
> > 
> > I still have a general problem with this approach.
> >  
> 
> And I appreciate that you have comments on the design/code! :-) Thank you!
> 
> 
> > The AF_XDP socket is build around (and gets its performance) from
> > being tied to a specific RX-queue.  That design begs to have an XDP
> > program per RX-queue.
> >
> > Patchset-v1 moved towards this goal.  But in this patchset-v2 you
> > steer away from this again, and work-around the issue with the current
> > limitations of 1-XDP program per netdev.  (Which result in; if a
> > single AF_XDP socket is used in the system, which can ONLY be for a
> > single RX-queue by design, then ALL other XDP_PASS traffic also have
> > to take the overhead of indirect BPF call).
> >  
> 
> I agree that a per-queue-program would be a good fit, but I think it
> still makes sense to add XDP_ATTACH and the helper, to make it easier
> for the XDP program authors to use AF_XDP. Would you prefer that this
> functionality was help back, until per-queue programs are introduced?

Yes, for the reasons you yourself listed in the next section:

> OTOH the implementation would probably look different if there was a
> per-q program, because this would open up for more optimizations. One
> could even imagine that the socket fd was part of the program (part of
> relocation process) when loading the program. That could get rid of yet
> another if-statement that check for socket existence. :-)

Yes, exactly.  The implementation would probably look different when
we look at it from a more generic angle, with per-q programs.

> > IMHO with this use-case, now is the time to introduce XDP programs per
> > RX-queue.  Yes, it will be more work, but I will offer to helpout.
> > This should be generalized as XDP programs per RX-queue can be used by
> > other use-cases too:
> >    In general terms: We can setup a NIC hardware filter to deliver
> > frame matching some criteria, then we can avoid rechecking these
> > criterias in on the (RX) CPU when/if we can attach an XDP prog to this
> > specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
> > is in general useful for others, like CPUMAP redirect.
> >  
> 
> Fair enough, and thank you for offering to help out. And the fact that
> *other than AF_XDP* can benefit from that is good. Before we dive down
> this hole, what are the opinions of the BPF maintainers? Maybe it's
> better to hack an RFC, and then take the discussion?

As XDP grows, and more use-cases are added, I fear that the single
XDP program per netdev is going to be a performance bottleneck.  The
single XDP program will have to perform a lot of common checks before
it knows which use-case a packet matches.  With an XDP program per
RX-queue, we can instead leverage the hardware to pre-filter/sort
packets, and thus simplify the XDP programs.
  And this patchset already shows a performance advantage from
simplifying the XDP prog and allowing info to be stored per RX-queue
(the xsk-sock), which lets you take a more direct action (avoiding
executing some of the redirect-core code).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 14:10     ` Jesper Dangaard Brouer
@ 2018-12-17 14:55       ` Magnus Karlsson
  2018-12-17 15:30         ` Björn Töpel
  0 siblings, 1 reply; 20+ messages in thread
From: Magnus Karlsson @ 2018-12-17 14:55 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Björn Töpel, Björn Töpel, Karlsson, Magnus,
	ast, Daniel Borkmann, Network Development, William Tu, Zhang,
	Qi Z, Jakub Kicinski, andrew

On Mon, Dec 17, 2018 at 3:10 PM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 17 Dec 2018 14:39:57 +0100
> Björn Töpel <bjorn.topel@intel.com> wrote:
>
> > On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
> > > On Mon, 17 Dec 2018 11:24:54 +0100
> > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > >> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
> > >> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
> > >> helper is used. The bpf_xsk_redirect helper is also introduced in this
> > >> series.
> > >>
> > >> Many XDP socket users just need a simple way of creating/binding a
> > >> socket and receiving frames right away without a complicated XDP
> > >> program. "Attached" XDP sockets removes the need for the XSKMAP, and
> > >> allows for a trivial XDP program, e.g.:
> > >>
> > >>    SEC("xdp")
> > >>    int xdp_prog(struct xdp_md *ctx)
> > >>    {
> > >>          return bpf_xsk_redirect(ctx);
> > >>    }
> > >>
> > >> An attached XDP socket also has better performance than the XSKMAP
> > >> based sockets (performance numbers below).
> > >
> > > I still have a general problem with this approach.
> > >
> >
> > And I appreciate that you have comments on the design/code! :-) Thank you!
> >
> >
> > > The AF_XDP socket is build around (and gets its performance) from
> > > being tied to a specific RX-queue.  That design begs to have an XDP
> > > program per RX-queue.
> > >
> > > Patchset-v1 moved towards this goal.  But in this patchset-v2 you
> > > steer away from this again, and work-around the issue with the current
> > > limitations of 1-XDP program per netdev.  (Which result in; if a
> > > single AF_XDP socket is used in the system, which can ONLY be for a
> > > single RX-queue by design, then ALL other XDP_PASS traffic also have
> > > to take the overhead of indirect BPF call).
> > >
> >
> > I agree that a per-queue-program would be a good fit, but I think it
> > still makes sense to add XDP_ATTACH and the helper, to make it easier
> > for the XDP program authors to use AF_XDP. Would you prefer that this
> > functionality was help back, until per-queue programs are introduced?
>
> Yes, for the reasons you yourself listed in next section:
>
> > OTOH the implementation would probably look different if there was a
> > per-q program, because this would open up for more optimizations. One
> > could even imagine that the socket fd was part of the program (part of
> > relocation process) when loading the program. That could get rid of yet
> > another if-statement that check for socket existence. :-)
>
> Yes, exactly.  The implementation would probably look different, when
> we look at it from a more generic angle, with per-q programs.
>
> > > IMHO with this use-case, now is the time to introduce XDP programs per
> > > RX-queue.  Yes, it will be more work, but I will offer to helpout.
> > > This should be generalized as XDP programs per RX-queue can be used by
> > > other use-cases too:
> > >    In general terms: We can setup a NIC hardware filter to deliver
> > > frame matching some criteria, then we can avoid rechecking these
> > > criterias in on the (RX) CPU when/if we can attach an XDP prog to this
> > > specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
> > > is in general useful for others, like CPUMAP redirect.
> > >
> >
> > Fair enough, and thank you for offering to help out. And the fact that
> > *other than AF_XDP* can benefit from that is good. Before we dive down
> > this hole, what are the opinions of the BPF maintainers? Maybe it's
> > better to hack an RFC, and then take the discussion?
>
> As XDP grows, and more use-cases are added, then I fear that the single
> XDP program per netdev is going to be a performance bottleneck.  As the
> single XDP program, will have to perform a lot of common checks before
> it knows what use-case this packet match.  With an XDP program per
> RX-queue, we can instead leverage the hardware to pre-filter/sort
> packets, and thus simplify the XDP programs.
>   And this patchset already do shows a performance advantage of
> simplifying the XDP prog and allowing to store info per RX-queue (the
> xsk-sock) that allows you do take a more direct action (avoiding exec
> some of the redirect-core code).

Instead of introducing the XDP_ATTACH option to the bind call, can we
just make this association between rx queue id and xsk every single
time we bind? Then it is up to the user via the XDP program if he/she
wants to use this by calling xsk_redirect. No new option needed.
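
A minimal sketch of the user-space side under this proposal (ifindex,
queue_id and the socket fd set up as usual; note there is no new bind
flag involved):

  struct sockaddr_xdp sxdp = {};

  sxdp.sxdp_family = AF_XDP;
  sxdp.sxdp_ifindex = ifindex;    /* e.g. from if_nametoindex() */
  sxdp.sxdp_queue_id = queue_id;  /* rx queue id implicitly associated */
  /* sxdp.sxdp_flags stays 0 -- no XDP_ATTACH needed */

  if (bind(xsk_fd, (struct sockaddr *)&sxdp, sizeof(sxdp)))
          /* handle errno */;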

We could also make the setup of AF_XDP easier by hiding the
loading and creation of the XSKMAP and the XDP program behind
xsk_socket__create in the patch set I am working on. It could
take a parameter in the configuration struct stating whether libbpf
should load a predefined program (using xsk_redirect or,
alternatively, an XSKMAP that directs all traffic on a queue to a
specific AF_XDP socket), or whether the user wants to set this up
manually. The built-in program would be supplied inside libbpf and
would be very small. We could start with the current XSKMAP and just
set up everything the same way the sample program does. If people
think xsk_redirect is a good idea, then we could move over to
that, as it has higher performance.
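
A sketch only -- the patch set is still in flight, so these names
(xsk_socket_config, libbpf_flags, xsk_socket__create) are assumptions
about how the final API could look:

  struct xsk_socket_config cfg = {
          .rx_size = 2048,
          .tx_size = 2048,
          .libbpf_flags = 0, /* 0: let libbpf load the built-in program */
  };
  struct xsk_socket *xsk;
  int err;

  /* would load and attach the built-in XDP program under the hood */
  err = xsk_socket__create(&xsk, "eth0", queue_id, umem, &rx, &tx, &cfg);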

Please let me know what you think.

BTW, I like the per RX queue XDP program idea. Could see a number of
performance optimizations that could be done given that that existed.

Thanks: Magnus


> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 14:55       ` Magnus Karlsson
@ 2018-12-17 15:30         ` Björn Töpel
  2018-12-18 23:04           ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Björn Töpel @ 2018-12-17 15:30 UTC (permalink / raw)
  To: Magnus Karlsson, Jesper Dangaard Brouer
  Cc: Björn Töpel, Karlsson, Magnus, ast, Daniel Borkmann,
	Network Development, William Tu, Zhang, Qi Z, Jakub Kicinski,
	andrew



On 2018-12-17 15:55, Magnus Karlsson wrote:
> On Mon, Dec 17, 2018 at 3:10 PM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>>
>> On Mon, 17 Dec 2018 14:39:57 +0100
>> Björn Töpel <bjorn.topel@intel.com> wrote:
>>
>>> On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
>>>> On Mon, 17 Dec 2018 11:24:54 +0100
>>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>>
>>>>> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
>>>>> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
>>>>> helper is used. The bpf_xsk_redirect helper is also introduced in this
>>>>> series.
>>>>>
>>>>> Many XDP socket users just need a simple way of creating/binding a
>>>>> socket and receiving frames right away without a complicated XDP
>>>>> program. "Attached" XDP sockets removes the need for the XSKMAP, and
>>>>> allows for a trivial XDP program, e.g.:
>>>>>
>>>>>     SEC("xdp")
>>>>>     int xdp_prog(struct xdp_md *ctx)
>>>>>     {
>>>>>           return bpf_xsk_redirect(ctx);
>>>>>     }
>>>>>
>>>>> An attached XDP socket also has better performance than the XSKMAP
>>>>> based sockets (performance numbers below).
>>>>
>>>> I still have a general problem with this approach.
>>>>
>>>
>>> And I appreciate that you have comments on the design/code! :-) Thank you!
>>>
>>>
>>>> The AF_XDP socket is built around (and gets its performance) from
>>>> being tied to a specific RX-queue.  That design begs to have an XDP
>>>> program per RX-queue.
>>>>
>>>> Patchset-v1 moved towards this goal.  But in this patchset-v2 you
>>>> steer away from this again, and work around the issue with the current
>>>> limitations of 1-XDP program per netdev.  (Which results in: if a
>>>> single AF_XDP socket is used in the system, which can ONLY be for a
>>>> single RX-queue by design, then ALL other XDP_PASS traffic also has
>>>> to take the overhead of an indirect BPF call.)
>>>>
>>>
>>> I agree that a per-queue-program would be a good fit, but I think it
>>> still makes sense to add XDP_ATTACH and the helper, to make it easier
>>> for the XDP program authors to use AF_XDP. Would you prefer that this
>>> functionality was held back until per-queue programs are introduced?
>>
>> Yes, for the reasons you yourself listed in next section:
>>
>>> OTOH the implementation would probably look different if there was a
>>> per-q program, because this would open up more optimizations. One
>>> could even imagine that the socket fd was part of the program (part of
>>> the relocation process) when loading the program. That could get rid of
>>> yet another if-statement that checks for socket existence. :-)
>>
>> Yes, exactly.  The implementation would probably look different, when
>> we look at it from a more generic angle, with per-q programs.
>>
>>>> IMHO with this use-case, now is the time to introduce XDP programs per
>>>> RX-queue.  Yes, it will be more work, but I will offer to help out.
>>>> This should be generalized as XDP programs per RX-queue can be used by
>>>> other use-cases too:
>>>>     In general terms: We can set up a NIC hardware filter to deliver
>>>> frames matching some criteria, and then avoid rechecking those
>>>> criteria on the (RX) CPU when/if we can attach an XDP prog to this
>>>> specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
>>>> is in general useful for others, like CPUMAP redirect.
>>>>
>>>
>>> Fair enough, and thank you for offering to help out. And the fact that
>>> users *other than AF_XDP* can benefit from it is good. Before we dive
>>> down this hole, what are the opinions of the BPF maintainers? Maybe it's
>>> better to hack up an RFC, and then take the discussion from there?
>>
>> As XDP grows, and more use-cases are added, I fear that the single
>> XDP program per netdev is going to be a performance bottleneck.  The
>> single XDP program will have to perform a lot of common checks before
>> it knows which use-case this packet matches.  With an XDP program per
>> RX-queue, we can instead leverage the hardware to pre-filter/sort
>> packets, and thus simplify the XDP programs.
>>    And this patchset already shows a performance advantage from
>> simplifying the XDP prog and allowing info to be stored per RX-queue (the
>> xsk-sock), which allows you to take a more direct action (avoiding
>> executing some of the redirect-core code).
> 
> Instead of introducing the XDP_ATTACH option to the bind call, can we
> just make this association between rx queue id and xsk every single
> time we bind? Then it is up to the user via the XDP program if he/she
> wants to use this by calling xsk_redirect. No new option needed.
>

Nice! Then it would simply be a matter of adding the helper. Much better
than extending the uapi. Thank you for pointing this out!


> We could also make the setup of AF_XDP easier by hiding the
> loading and creation of the XSKMAP and the XDP program behind
> xsk_socket__create in the patch set I am working on. It could
> take a parameter in the configuration struct stating whether libbpf
> should load a predefined program (using xsk_redirect or,
> alternatively, an XSKMAP that directs all traffic on a queue to a
> specific AF_XDP socket), or whether the user wants to set this up
> manually. The built-in program would be supplied inside libbpf and
> would be very small. We could start with the current XSKMAP and just
> set up everything the same way the sample program does. If people
> think xsk_redirect is a good idea, then we could move over to
> that, as it has higher performance.
> 
> Please let me know what you think.
>

Hmm, let's start off with this, making sure AF_XDP is simple to use from
a libbpf perspective, and then (maybe) move towards
bpf_xsk_redirect, depending on where the per-queue work goes.

Let's drop this series for now, and focus on libbpf and the per-queue 
XDP programs.


> BTW, I like the per RX queue XDP program idea. Could see a number of
> performance optimizations that could be done given that that existed.
>

Yes! It would be really cool to use the socket file descriptor in the
BPF program, so that bpf_xsk_redirect could use the socket directly
(analogous to how the map is used in bpf_redirect_map).

Another idea with per-queue programs is that the socket could be
implicitly bound to the queue/dev when installing the program on an Rx
queue.
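
Purely illustrative -- none of this exists today -- but such a
per-queue program could reference the socket the way maps are
referenced now, with the fd patched in during relocation:

  /* hypothetical socket reference, resolved at load time like a map */
  struct xdp_sock_ref xsk0 SEC(".xsks");

  SEC("xdp")
  int rxq0_prog(struct xdp_md *ctx)
  {
          /* no existence check needed; install would fail otherwise */
          return bpf_xsk_redirect(ctx, &xsk0); /* hypothetical variant */
  }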


Björn

> Thanks: Magnus
> 
> 
>> --
>> Best regards,
>>    Jesper Dangaard Brouer
>>    MSc.CS, Principal Kernel Engineer at Red Hat
>>    LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 13:39   ` Björn Töpel
  2018-12-17 14:10     ` Jesper Dangaard Brouer
@ 2018-12-18  2:36     ` Jakub Kicinski
  2018-12-18  8:59       ` Björn Töpel
  1 sibling, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2018-12-18  2:36 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Jesper Dangaard Brouer, Björn Töpel, magnus.karlsson,
	magnus.karlsson, ast, daniel, netdev, u9012063, qi.z.zhang,
	andrew

On Mon, 17 Dec 2018 14:39:57 +0100, Björn Töpel wrote:
> OTOH the implementation would probably look different if there was a
> per-q program, because this would open up more optimizations. One
> could even imagine that the socket fd was part of the program (part of
> the relocation process) when loading the program. That could get rid of
> yet another if-statement that checks for socket existence. :-)

Interesting thought, then we would have to teach the BPF subsystem to
express program dependencies (this program can only be used on queue
where given XSK is attached), and have the networking subsystem
(drivers?) check those constraints when attaching. Did I get this right?


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-18  2:36     ` Jakub Kicinski
@ 2018-12-18  8:59       ` Björn Töpel
  2018-12-18 18:18         ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: Björn Töpel @ 2018-12-18  8:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jesper Dangaard Brouer, Björn Töpel, magnus.karlsson,
	magnus.karlsson, ast, daniel, netdev, u9012063, qi.z.zhang,
	andrew

On 2018-12-18 03:36, Jakub Kicinski wrote:
> On Mon, 17 Dec 2018 14:39:57 +0100, Björn Töpel wrote:
>> OTOH the implementation would probably look different if there was a
>> per-q program, because this would open up more optimizations. One
>> could even imagine that the socket fd was part of the program (part of
>> the relocation process) when loading the program. That could get rid of
>> yet another if-statement that checks for socket existence. :-)
> 
> Interesting thought, then we would have to teach the BPF subsystem to
> express program dependencies (this program can only be used on queue
> where given XSK is attached), and have the networking subsystem
> (drivers?) check those constraints when attaching. Did I get this right?
> 

Yes. Well, yet another dependency, similar to how a program refers to maps.

The per-queue loading could be similar to passing the prog_ifindex
in the prog-loading attribute, as is done for HW offloading today. I.e.,
in addition to prog_ifindex, there would be, say, prog_ifqueue.
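
Purely as an illustration, the load attribute could then look
something like this (prog_ifqueue is hypothetical and does not exist):

  union bpf_attr attr = {};

  attr.prog_type    = BPF_PROG_TYPE_XDP;
  attr.prog_ifindex = ifindex;   /* existing: target device (HW offload) */
  attr.prog_ifqueue = queue_id;  /* hypothetical: target Rx queue */
  /* insns, license, etc. as usual */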

For the XDP socket constraint checking I see two models. In the first,
the BPF program refers to one or more unbound sockets, and by
installing the program on a certain Rx queue, the sockets are
implicitly bound. In the second, the sockets are bound to a certain
device/queue before install time, and the networking subsystem will
scream if the program is installed on an ifindex/queue != prog
ifindex/queue.

Is the XDP HW offloading program verified at load or install time, or both?

Again, this is just me thinking out loud... Might be a lousy idea. :-)

Side-note: When we started AF_XDP, we played around with the idea of
associating the XSKMAP with a certain ifindex/queue, to be able to reject
sockets bound to another device/queue and elide a couple of checks in
the fast-path. My ramblings above are somewhat in the same vein. We'd
like to be able to set program constraints for a netdev/queue target, so
that we can remove checks in the fast-path.


Björn


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-18  8:59       ` Björn Töpel
@ 2018-12-18 18:18         ` Jakub Kicinski
  0 siblings, 0 replies; 20+ messages in thread
From: Jakub Kicinski @ 2018-12-18 18:18 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Jesper Dangaard Brouer, Björn Töpel, magnus.karlsson,
	magnus.karlsson, ast, daniel, netdev, u9012063, qi.z.zhang,
	andrew

On Tue, 18 Dec 2018 09:59:26 +0100, Björn Töpel wrote:
> Is the XDP HW offloading program verified at load or install time, or both?

Load interacts with the ifindex of choice; install checks that the
load-time device matches.


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-17 15:30         ` Björn Töpel
@ 2018-12-18 23:04           ` Alexei Starovoitov
  2018-12-19  9:34             ` Björn Töpel
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2018-12-18 23:04 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Magnus Karlsson, Jesper Dangaard Brouer, Björn Töpel,
	Karlsson, Magnus, ast, Daniel Borkmann, Network Development,
	William Tu, Zhang, Qi Z, Jakub Kicinski, andrew

On Mon, Dec 17, 2018 at 04:30:05PM +0100, Björn Töpel wrote:
> > 
> > Instead of introducing the XDP_ATTACH option to the bind call, can we
> > just make this association between rx queue id and xsk every single
> > time we bind? Then it is up to the user via the XDP program if he/she
> > wants to use this by calling xsk_redirect. No new option needed.
> > 
> 
> Nice! Then it would simply be a matter of adding the helper. Much better
> than extending the uapi. Thank you for pointing this out!

Magnus's by-default association approach sounds good to me.
I'm missing, though, why an extra bpf helper is needed.
This soft-binding of the rx queue to the xsk at the time
of a normal bind() should work as-is, with no program necessary.
No prog -> default assoc. Yet the prog has the ability to redirect
to a specific xsk via the existing bpf_redirect_map,
or via the faster (future) xsk_redirect.
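
For reference, the existing bpf_redirect_map path looks roughly like
what the xdpsock sample does today:

  struct bpf_map_def SEC("maps") xsks_map = {
          .type        = BPF_MAP_TYPE_XSKMAP,
          .key_size    = sizeof(int),
          .value_size  = sizeof(int),
          .max_entries = 4,
  };

  SEC("xdp")
  int xdp_sock_prog(struct xdp_md *ctx)
  {
          /* redirect to the socket stored at this rx queue's index */
          return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, 0);
  }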

Also I agree with Jesper that per-queue programs would be good to have
sooner rather than later, but I disagree that it's mandatory for this set.
It seems the patches will give us a performance boost and a simpler
uapi while the per-queue prog api is being discussed and implemented.

I like the other idea too (of having libbpf provide canned bpf progs
for different use cases). That's better than loading them
from the kernel.


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-18 23:04           ` Alexei Starovoitov
@ 2018-12-19  9:34             ` Björn Töpel
  2018-12-19 16:23               ` Jesper Dangaard Brouer
  2018-12-20  0:48               ` Alexei Starovoitov
  0 siblings, 2 replies; 20+ messages in thread
From: Björn Töpel @ 2018-12-19  9:34 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Magnus Karlsson, Jesper Dangaard Brouer, Björn Töpel,
	Karlsson, Magnus, ast, Daniel Borkmann, Network Development,
	William Tu, Zhang, Qi Z, Jakub Kicinski, andrew


On 2018-12-19 00:04, Alexei Starovoitov wrote:
> On Mon, Dec 17, 2018 at 04:30:05PM +0100, Björn Töpel wrote:
>>>
>>> Instead of introducing the XDP_ATTACH option to the bind call, can we
>>> just make this association between rx queue id and xsk every single
>>> time we bind? Then it is up to the user via the XDP program if he/she
>>> wants to use this by calling xsk_redirect. No new option needed.
>>>
>>
>> Nice! Then it would simply be a matter of adding the helper. Much better
>> than extending the uapi. Thank you for pointing this out!
> 
> Magnus's by-default association approach sounds good to me.
> I'm missing, though, why an extra bpf helper is needed.
> This soft-binding of the rx queue to the xsk at the time
> of a normal bind() should work as-is, with no program necessary.

Hmm, an XDP program has always been required. It's just whether the
program needs an XSKMAP or not. The XDP_ATTACH can be seen as "use an
internal map" -- and the mechanism to redirect using that map is via the
bpf_xsk_redirect helper. What Magnus proposed was populating the builtin
map unconditionally, and therefore removing the need for the XDP_ATTACH
option.

If I am interpreting what you're saying correctly, you thought
XDP_ATTACH (in v1) was bypassing the XDP path. This is not something
that we want. AF_XDP is a proper XDP construct, and requires an XDP
program to receive frames. What I wanted to improve with this series was
to lower the complexity of the minimal XDP program needed to receive a
frame.

> No prog -> default assoc. Yet the prog has the ability to redirect
> to a specific xsk via the existing bpf_redirect_map,
> or via the faster (future) xsk_redirect.
> 
> Also I agree with Jesper that per-queue programs would be good to have
> sooner rather than later, but I disagree that it's mandatory for this set.
> It seems the patches will give us a performance boost and a simpler
> uapi while the per-queue prog api is being discussed and implemented.
> 
> I like the other idea too (of having libbpf provide canned bpf progs
> for different use cases). That's better than loading them
> from the kernel.
>

To summarize: I'll hold off on this series for now (except the first
cleanup patch which I'll send out today), and spend some time on
per-queue programs, iproute2 support and libbpf.


Björn


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-19  9:34             ` Björn Töpel
@ 2018-12-19 16:23               ` Jesper Dangaard Brouer
  2018-12-20  0:48               ` Alexei Starovoitov
  1 sibling, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2018-12-19 16:23 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Alexei Starovoitov, Magnus Karlsson, Björn Töpel,
	Karlsson, Magnus, ast, Daniel Borkmann, Network Development,
	William Tu, Zhang, Qi Z, Jakub Kicinski, andrew, brouer,
	Ilias Apalodimas, Quentin Monnet

On Wed, 19 Dec 2018 10:34:28 +0100
Björn Töpel <bjorn.topel@intel.com> wrote:

> To summarize: I'll hold off on this series for now (except the first
> cleanup patch which I'll send out today), and spend some time on
> per-queue programs, iproute2 support and libbpf.

As promised, I'll be helping out with per-queue programs, and am
currently writing up a design document here [1]; tomorrow we will
have a meeting (with Björn and Magnus) about this design and next steps.

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp_per_rxq01.org
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
  2018-12-19  9:34             ` Björn Töpel
  2018-12-19 16:23               ` Jesper Dangaard Brouer
@ 2018-12-20  0:48               ` Alexei Starovoitov
  1 sibling, 0 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2018-12-20  0:48 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Magnus Karlsson, Jesper Dangaard Brouer, Björn Töpel,
	Karlsson, Magnus, ast, Daniel Borkmann, Network Development,
	William Tu, Zhang, Qi Z, Jakub Kicinski, andrew

On Wed, Dec 19, 2018 at 10:34:28AM +0100, Björn Töpel wrote:
> 
> On 2018-12-19 00:04, Alexei Starovoitov wrote:
> > On Mon, Dec 17, 2018 at 04:30:05PM +0100, Björn Töpel wrote:
> > > > 
> > > > Instead of introducing the XDP_ATTACH option to the bind call, can we
> > > > just make this association between rx queue id and xsk every single
> > > > time we bind? Then it is up to the user via the XDP program if he/she
> > > > wants to use this by calling xsk_redirect. No new option needed.
> > > > 
> > > 
> > > Nice! Then it would simply be a matter of adding the helper. Much better
> > > than extending the uapi. Thank you for pointing this out!
> > 
> > Magnus's by-default association approach sounds good to me.
> > I'm missing, though, why an extra bpf helper is needed.
> > This soft-binding of the rx queue to the xsk at the time
> > of a normal bind() should work as-is, with no program necessary.
> 
> Hmm, an XDP program has always been required. It's just whether the
> program needs an XSKMAP or not. The XDP_ATTACH can be seen as "use an
> internal map" -- and the mechanism to redirect using that map is via the
> bpf_xsk_redirect helper. What Magnus proposed was populating the builtin
> map unconditionally, and therefore removing the need for the XDP_ATTACH
> option.

Well, I'm actually proposing to make the af_xdp program optional,
if a hard-coded queue->xsk mapping gives extra performance.


end of thread

Thread overview: 20+ messages
2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
2018-12-17 13:39   ` Björn Töpel
2018-12-17 14:10     ` Jesper Dangaard Brouer
2018-12-17 14:55       ` Magnus Karlsson
2018-12-17 15:30         ` Björn Töpel
2018-12-18 23:04           ` Alexei Starovoitov
2018-12-19  9:34             ` Björn Töpel
2018-12-19 16:23               ` Jesper Dangaard Brouer
2018-12-20  0:48               ` Alexei Starovoitov
2018-12-18  2:36     ` Jakub Kicinski
2018-12-18  8:59       ` Björn Töpel
2018-12-18 18:18         ` Jakub Kicinski
