* [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path
@ 2020-04-27 22:46 David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 01/15] net: Refactor convert_to_xdp_frame David Ahern
                   ` (14 more replies)
  0 siblings, 15 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

This series adds support for XDP in the egress path by introducing
a new XDP attachment type, BPF_XDP_EGRESS, and adding a UAPI to
if_link.h for attaching the program to a netdevice and reporting
the program. bpf programs can be run on all packets in the Tx path -
skbs or redirected xdp frames. The intent is to emulate the current
RX path for XDP as much as possible to maintain consistency and
symmetry in the 2 paths with their APIs.

This is a missing primitive for XDP. It allows solutions to build small,
targeted programs properly distributed in the networking path, for
example an egress firewall/ACL/traffic verification, or packet
manipulation and encapsulation of an entire Ethernet frame, whether the
traffic is locally generated, forwarded via the slow path (i.e., full
stack processing) or made up of XDP-redirected frames.

Nothing about running a program in the Tx path requires driver specific
resources like the Rx path has. Thus, programs can be run in core
code and attached to the net_device struct similar to skb mode. The
egress attach is done using the new XDP_FLAGS_EGRESS_MODE flag, and
is reported by the kernel using the XDP_ATTACHED_EGRESS_CORE attach
flag with IFLA_XDP_EGRESS_PROG_ID making the api similar to existing
APIs for XDP.

The locations chosen to run the egress program - __netdev_start_xmit
before the call to ndo_start_xmit and bq_xmit_all before invoking
ndo_xdp_xmit - allow follow on patch sets to handle tx queueing and
setting the queue index if multi-queue with consistency in handling
both packet formats.

A few of the patches trace back to work done on offloading programs
from a VM by Jason Wang and Prashant Bhole.

v4:
- added space in bpftool help in patch 12 - Toke
- updated to top of bpf-next

v3:
- removed IFLA_XDP_EGRESS and dropped back to XDP_FLAGS_EGRESS_MODE
  as the uapi to specify the attach. This caused the ordering of the
  patches to change with the uapi now introduced in the second patch
  and 2 refactoring patches are dropped. Samples and test programs
  updated to use the new API.

v2:
- changed rx checks in xdp_is_valid_access to any expected_attach_type
- add xdp_egress argument to bpftool prog rst document
- do not allow IFLA_XDP and IFLA_XDP_EGRESS in the same config. There
  is no way to rollback IFLA_XDP if IFLA_XDP_EGRESS fails.
- comments from Andrii on libbpf

v1:
- add selftests
- flip the order of xdp generic patches as requested by Toke
- fixed the count arg to do_xdp_egress_frame - Toke
- remove meta data invalidate in __xdp_egress_frame - Toke
- fixed data_hard_start in __xdp_egress_frame - Jesper
- refactored convert_to_xdp_frame to reuse buf to frame code - Jesper
- added missed refactoring patch when generating patch set

RFC v5:
- updated cover letter
- moved running of ebpf program from ndo_{start,xdp}_xmit to core
  code. Dropped all tun and vhost related changes.
- added egress support to bpftool

RFC v4:
- updated cover letter
- patches related to code movement between tuntap, headers and vhost
  are dropped; previous RFC ran the XDP program in vhost context vs
  this set which runs them before queueing to vhost. As a part of this
  moved invocation of egress program to tun_net_xmit and tun_xdp_xmit.
- renamed do_xdp_generic to do_xdp_generic_rx to emphasize it is called
  in the Rx path; added rx argument to do_xdp_generic_core since it
  is used for both directions and needs to know which queue values to
  set in xdp_buff

RFC v3:
- reworked the patches - splitting patch 1 from RFC v2 into 3, combining
  patch 2 from RFC v2 into the first 3, combining patches 6 and 7 from
  RFC v2 into 1 since both did a trivial rename and export. Reordered
  the patches such that kernel changes are first followed by libbpf and
  an enhancement to a sample.

- moved small xdp related helper functions from tun.c to tun.h to make
  tun_ptr_free usable from the tap code. This is needed to handle the
  case of tap builtin and tun built as a module.

- pkt_ptrs added to `struct tun_file` and passed to tun_consume_packets
  rather than declaring pkts as an array on the stack.

RFC v2:
- New XDP attachment type: Jesper, Toke and Alexei discussed whether
  to introduce a new program type. Since this set adds a way to attach
  regular XDP program to the tx path, as per Alexei's suggestion, a
  new attachment type BPF_XDP_EGRESS is introduced.

- libbpf API changes:
  Alexei had suggested _opts() style of API extension. Considering it
  two new libbpf APIs are introduced which are equivalent to existing
  APIs. New ones can be extended easily. Please see individual patches
  for details. xdp1 sample program is modified to use new APIs.

- tun: Some patches from previous set are removed as they are
  irrelevant in this series. They will be introduced later.

David Ahern (15):
  net: Refactor convert_to_xdp_frame
  net: uapi for XDP programs in the egress path
  net: Add XDP setup and query commands for Tx programs
  net: Add BPF_XDP_EGRESS as a bpf_attach_type
  xdp: Add xdp_txq_info to xdp_buff
  net: Rename do_xdp_generic to do_xdp_generic_rx
  net: rename netif_receive_generic_xdp to do_generic_xdp_core
  net: set XDP egress program on netdevice
  net: Support xdp in the Tx path for packets as an skb
  net: Support xdp in the Tx path for xdp_frames
  libbpf: Add egress XDP support
  bpftool: Add support for XDP egress
  selftest: Add test for xdp_egress
  selftest: Add xdp_egress attach tests
  samples/bpf: add XDP egress support to xdp1

 drivers/net/tun.c                             |   4 +-
 include/linux/netdevice.h                     |  21 +-
 include/net/xdp.h                             |  35 ++-
 include/uapi/linux/bpf.h                      |   3 +
 include/uapi/linux/if_link.h                  |   6 +-
 kernel/bpf/devmap.c                           |  19 +-
 net/core/dev.c                                | 241 ++++++++++++++----
 net/core/filter.c                             |  26 ++
 net/core/rtnetlink.c                          |  23 +-
 samples/bpf/xdp1_user.c                       |  11 +-
 .../bpf/bpftool/Documentation/bpftool-net.rst |   4 +-
 .../bpftool/Documentation/bpftool-prog.rst    |   2 +-
 tools/bpf/bpftool/bash-completion/bpftool     |   4 +-
 tools/bpf/bpftool/net.c                       |   6 +-
 tools/bpf/bpftool/netlink_dumper.c            |   5 +
 tools/bpf/bpftool/prog.c                      |   2 +-
 tools/include/uapi/linux/bpf.h                |   3 +
 tools/include/uapi/linux/if_link.h            |   6 +-
 tools/lib/bpf/libbpf.c                        |   2 +
 tools/lib/bpf/libbpf.h                        |   1 +
 tools/lib/bpf/netlink.c                       |   6 +
 tools/testing/selftests/bpf/Makefile          |   1 +
 .../bpf/prog_tests/xdp_egress_attach.c        |  56 ++++
 .../selftests/bpf/progs/test_xdp_egress.c     |  12 +
 .../bpf/progs/test_xdp_egress_fail.c          |  16 ++
 tools/testing/selftests/bpf/progs/xdp_drop.c  |  25 ++
 .../testing/selftests/bpf/test_xdp_egress.sh  | 160 ++++++++++++
 27 files changed, 623 insertions(+), 77 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_egress_attach.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_egress.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_egress_fail.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_drop.c
 create mode 100755 tools/testing/selftests/bpf/test_xdp_egress.sh

-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 01/15] net: Refactor convert_to_xdp_frame
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 02/15] net: uapi for XDP programs in the egress path David Ahern
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Move the guts of convert_to_xdp_frame to a new helper, update_xdp_frame
so it can be reused in a later patch.
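
As a rough userspace illustration of the split described above, the sketch
below mirrors the headroom check and field updates of update_xdp_frame.
The mock_* structs and the function name are stand-ins invented for this
example, not kernel definitions; they keep only the fields the helper
touches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct xdp_buff (assumption: only these fields matter here). */
struct mock_xdp_buff {
	unsigned char *data;
	unsigned char *data_end;
	unsigned char *data_meta;
	unsigned char *data_hard_start;
};

/* Simplified stand-in for struct xdp_frame. */
struct mock_xdp_frame {
	void *data;
	unsigned short len;
	unsigned short headroom;
	unsigned short metasize;
};

/*
 * Fill in the frame from the buff, as update_xdp_frame does in the patch.
 * Returns false when the headroom left after metadata cannot hold the
 * frame struct itself.
 */
static bool mock_update_xdp_frame(const struct mock_xdp_buff *xdp,
				  struct mock_xdp_frame *frame)
{
	long headroom = xdp->data - xdp->data_hard_start;
	long metasize = xdp->data - xdp->data_meta;

	metasize = metasize > 0 ? metasize : 0;
	if (headroom - metasize < (long)sizeof(*frame))
		return false;

	frame->data = xdp->data;
	frame->len = (unsigned short)(xdp->data_end - xdp->data);
	frame->headroom = (unsigned short)(headroom - (long)sizeof(*frame));
	frame->metasize = (unsigned short)metasize;
	return true;
}
```

Because the caller supplies the frame pointer, the same helper can serve
both the case where the frame lives at data_hard_start and a case where
an existing frame is refreshed after a program has run.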

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/net/xdp.h | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 3cc6d5d84aa4..3264fa882de3 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -93,32 +93,42 @@ static inline void xdp_scrub_frame(struct xdp_frame *frame)
 
 struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp);
 
-/* Convert xdp_buff to xdp_frame */
 static inline
-struct xdp_frame *convert_to_xdp_frame(struct xdp_buff *xdp)
+bool update_xdp_frame(struct xdp_buff *xdp, struct xdp_frame *xdp_frame)
 {
-	struct xdp_frame *xdp_frame;
 	int metasize;
 	int headroom;
 
-	if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY)
-		return xdp_convert_zc_to_xdp_frame(xdp);
-
 	/* Assure headroom is available for storing info */
 	headroom = xdp->data - xdp->data_hard_start;
 	metasize = xdp->data - xdp->data_meta;
 	metasize = metasize > 0 ? metasize : 0;
 	if (unlikely((headroom - metasize) < sizeof(*xdp_frame)))
-		return NULL;
-
-	/* Store info in top of packet */
-	xdp_frame = xdp->data_hard_start;
+		return false;
 
 	xdp_frame->data = xdp->data;
 	xdp_frame->len  = xdp->data_end - xdp->data;
 	xdp_frame->headroom = headroom - sizeof(*xdp_frame);
 	xdp_frame->metasize = metasize;
 
+	return true;
+}
+
+/* Convert xdp_buff to xdp_frame */
+static inline
+struct xdp_frame *convert_to_xdp_frame(struct xdp_buff *xdp)
+{
+	struct xdp_frame *xdp_frame;
+
+	if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY)
+		return xdp_convert_zc_to_xdp_frame(xdp);
+
+	/* Store info in top of packet */
+	xdp_frame = xdp->data_hard_start;
+
+	if (unlikely(!update_xdp_frame(xdp, xdp_frame)))
+		return NULL;
+
 	/* rxq only valid until napi_schedule ends, convert to xdp_mem_info */
 	xdp_frame->mem = xdp->rxq->mem;
 
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 02/15] net: uapi for XDP programs in the egress path
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 01/15] net: Refactor convert_to_xdp_frame David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 03/15] net: Add XDP setup and query commands for Tx programs David Ahern
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Running programs in the egress path, on skbs or xdp_frames, does not
require driver specific resources like Rx path. Accordingly, the
programs can be run in core code, so add xdp_egress_prog to net_device
to hold a reference to an attached program.

For UAPI, add XDP_FLAGS_EGRESS_MODE to specify attach is at egress, add
a new attach flag, XDP_ATTACHED_EGRESS_CORE, for reporting the attach
point at the core, egress level and add IFLA_XDP_EGRESS_PROG_ID
for reporting the program id. Add rtnl_xdp_prog_egress to fill in
link message with egress data.
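
A minimal sketch of how the new flag fits into the existing flag masks.
The flag values are copied from if_link.h as changed by this patch. The
xdp_flags_valid helper is hypothetical, and the "at most one mode bit"
rule is an assumption based on how the rtnetlink code treats
XDP_FLAGS_MODES.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as in include/uapi/linux/if_link.h with this patch applied. */
#define XDP_FLAGS_UPDATE_IF_NOEXIST	(1U << 0)
#define XDP_FLAGS_SKB_MODE		(1U << 1)
#define XDP_FLAGS_DRV_MODE		(1U << 2)
#define XDP_FLAGS_HW_MODE		(1U << 3)
#define XDP_FLAGS_REPLACE		(1U << 4)
#define XDP_FLAGS_EGRESS_MODE		(1U << 5)
#define XDP_FLAGS_MODES			(XDP_FLAGS_SKB_MODE | \
					 XDP_FLAGS_DRV_MODE | \
					 XDP_FLAGS_HW_MODE | \
					 XDP_FLAGS_EGRESS_MODE)
#define XDP_FLAGS_MASK			(XDP_FLAGS_UPDATE_IF_NOEXIST | \
					 XDP_FLAGS_MODES | XDP_FLAGS_REPLACE)

/*
 * Hypothetical helper: accept only known flag bits and at most one mode
 * bit. Since EGRESS_MODE is part of XDP_FLAGS_MODES, it cannot be
 * combined with SKB, DRV or HW mode under this rule.
 */
static bool xdp_flags_valid(uint32_t flags)
{
	uint32_t modes = flags & XDP_FLAGS_MODES;

	if (flags & ~XDP_FLAGS_MASK)
		return false;
	/* modes & (modes - 1) is non-zero when more than one bit is set */
	return (modes & (modes - 1)) == 0;
}
```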

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/linux/netdevice.h          |  1 +
 include/uapi/linux/if_link.h       |  6 +++++-
 net/core/rtnetlink.c               | 23 +++++++++++++++++++++--
 tools/include/uapi/linux/if_link.h |  6 +++++-
 4 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5a8d40f1ffe2..594c13d4cd00 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1995,6 +1995,7 @@ struct net_device {
 	unsigned int		real_num_rx_queues;
 
 	struct bpf_prog __rcu	*xdp_prog;
+	struct bpf_prog __rcu	*xdp_egress_prog;
 	unsigned long		gro_flush_timeout;
 	int			napi_defer_hard_irqs;
 	rx_handler_func_t __rcu	*rx_handler;
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 127c704eeba9..4a33ff26ef62 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -975,9 +975,11 @@ enum {
 #define XDP_FLAGS_DRV_MODE		(1U << 2)
 #define XDP_FLAGS_HW_MODE		(1U << 3)
 #define XDP_FLAGS_REPLACE		(1U << 4)
+#define XDP_FLAGS_EGRESS_MODE		(1U << 5)
 #define XDP_FLAGS_MODES			(XDP_FLAGS_SKB_MODE | \
 					 XDP_FLAGS_DRV_MODE | \
-					 XDP_FLAGS_HW_MODE)
+					 XDP_FLAGS_HW_MODE | \
+					 XDP_FLAGS_EGRESS_MODE)
 #define XDP_FLAGS_MASK			(XDP_FLAGS_UPDATE_IF_NOEXIST | \
 					 XDP_FLAGS_MODES | XDP_FLAGS_REPLACE)
 
@@ -988,6 +990,7 @@ enum {
 	XDP_ATTACHED_SKB,
 	XDP_ATTACHED_HW,
 	XDP_ATTACHED_MULTI,
+	XDP_ATTACHED_EGRESS_CORE,
 };
 
 enum {
@@ -1000,6 +1003,7 @@ enum {
 	IFLA_XDP_SKB_PROG_ID,
 	IFLA_XDP_HW_PROG_ID,
 	IFLA_XDP_EXPECTED_FD,
+	IFLA_XDP_EGRESS_PROG_ID,
 	__IFLA_XDP_MAX,
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d6f4f4a9e8ba..c8f6cc67a070 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -982,7 +982,7 @@ static size_t rtnl_xdp_size(void)
 	size_t xdp_size = nla_total_size(0) +	/* nest IFLA_XDP */
 			  nla_total_size(1) +	/* XDP_ATTACHED */
 			  nla_total_size(4) +	/* XDP_PROG_ID (or 1st mode) */
-			  nla_total_size(4);	/* XDP_<mode>_PROG_ID */
+			  nla_total_size(4) * 2; /* XDP_<mode>_PROG_ID */
 
 	return xdp_size;
 }
@@ -1402,6 +1402,18 @@ static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
 	return 0;
 }
 
+static u32 rtnl_xdp_prog_egress(struct net_device *dev)
+{
+	const struct bpf_prog *prog;
+
+	ASSERT_RTNL();
+
+	prog = rtnl_dereference(dev->xdp_egress_prog);
+	if (!prog)
+		return 0;
+	return prog->aux->id;
+}
+
 static u32 rtnl_xdp_prog_skb(struct net_device *dev)
 {
 	const struct bpf_prog *generic_xdp_prog;
@@ -1474,6 +1486,12 @@ static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
 				  IFLA_XDP_HW_PROG_ID, rtnl_xdp_prog_hw);
 	if (err)
 		goto err_cancel;
+	err = rtnl_xdp_report_one(skb, dev, &prog_id, &mode,
+				  XDP_ATTACHED_EGRESS_CORE,
+				  IFLA_XDP_EGRESS_PROG_ID,
+				  rtnl_xdp_prog_egress);
+	if (err)
+		goto err_cancel;
 
 	err = nla_put_u8(skb, IFLA_XDP_ATTACHED, mode);
 	if (err)
@@ -2790,7 +2808,8 @@ static int do_setlink(const struct sk_buff *skb,
 		if (err < 0)
 			goto errout;
 
-		if (xdp[IFLA_XDP_ATTACHED] || xdp[IFLA_XDP_PROG_ID]) {
+		if (xdp[IFLA_XDP_ATTACHED] || xdp[IFLA_XDP_PROG_ID] ||
+		    xdp[IFLA_XDP_EGRESS_PROG_ID]) {
 			err = -EINVAL;
 			goto errout;
 		}
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index ca6665ea758a..41b1c5ff4875 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -963,9 +963,11 @@ enum {
 #define XDP_FLAGS_DRV_MODE		(1U << 2)
 #define XDP_FLAGS_HW_MODE		(1U << 3)
 #define XDP_FLAGS_REPLACE		(1U << 4)
+#define XDP_FLAGS_EGRESS_MODE		(1U << 5)
 #define XDP_FLAGS_MODES			(XDP_FLAGS_SKB_MODE | \
 					 XDP_FLAGS_DRV_MODE | \
-					 XDP_FLAGS_HW_MODE)
+					 XDP_FLAGS_HW_MODE | \
+					 XDP_FLAGS_EGRESS_MODE)
 #define XDP_FLAGS_MASK			(XDP_FLAGS_UPDATE_IF_NOEXIST | \
 					 XDP_FLAGS_MODES | XDP_FLAGS_REPLACE)
 
@@ -976,6 +978,7 @@ enum {
 	XDP_ATTACHED_SKB,
 	XDP_ATTACHED_HW,
 	XDP_ATTACHED_MULTI,
+	XDP_ATTACHED_EGRESS_CORE,
 };
 
 enum {
@@ -988,6 +991,7 @@ enum {
 	IFLA_XDP_SKB_PROG_ID,
 	IFLA_XDP_HW_PROG_ID,
 	IFLA_XDP_EXPECTED_FD,
+	IFLA_XDP_EGRESS_PROG_ID,
 	__IFLA_XDP_MAX,
 };
 
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 03/15] net: Add XDP setup and query commands for Tx programs
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 01/15] net: Refactor convert_to_xdp_frame David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 02/15] net: uapi for XDP programs in the egress path David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 04/15] net: Add BPF_XDP_EGRESS as a bpf_attach_type David Ahern
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add new netdev commands, XDP_SETUP_PROG_EGRESS and
XDP_QUERY_PROG_EGRESS, to set up and query egress programs.

Update dev_change_xdp_fd and dev_xdp_install to check for egress mode
via XDP_FLAGS_EGRESS_MODE in the flags. If egress bool is set, then use
XDP_SETUP_PROG_EGRESS in dev_xdp_install and XDP_QUERY_PROG_EGRESS in
dev_change_xdp_fd.
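
The command selection described above can be sketched in isolation as
follows. The mock_* enum and the two pick_* helpers are illustrative
names only; the branch order follows the hunks in this patch.

```c
#include <stdint.h>

#define XDP_FLAGS_HW_MODE	(1U << 3)
#define XDP_FLAGS_EGRESS_MODE	(1U << 5)

/* Stand-in for the relevant part of enum bpf_netdev_command. */
enum mock_netdev_command {
	MOCK_XDP_SETUP_PROG,
	MOCK_XDP_SETUP_PROG_HW,
	MOCK_XDP_SETUP_PROG_EGRESS,
	MOCK_XDP_QUERY_PROG,
	MOCK_XDP_QUERY_PROG_HW,
	MOCK_XDP_QUERY_PROG_EGRESS,
};

/* Setup command as chosen in dev_xdp_install: HW mode is checked first. */
static enum mock_netdev_command pick_setup_cmd(uint32_t flags)
{
	if (flags & XDP_FLAGS_HW_MODE)
		return MOCK_XDP_SETUP_PROG_HW;
	return (flags & XDP_FLAGS_EGRESS_MODE) ? MOCK_XDP_SETUP_PROG_EGRESS
					       : MOCK_XDP_SETUP_PROG;
}

/* Query command as chosen in dev_change_xdp_fd: egress is checked first. */
static enum mock_netdev_command pick_query_cmd(uint32_t flags)
{
	if (flags & XDP_FLAGS_EGRESS_MODE)
		return MOCK_XDP_QUERY_PROG_EGRESS;
	return (flags & XDP_FLAGS_HW_MODE) ? MOCK_XDP_QUERY_PROG_HW
					   : MOCK_XDP_QUERY_PROG;
}
```

The two helpers test the mode bits in a different order, which only
matters if both HW and egress bits are set; that combination is assumed
to be rejected earlier because both bits belong to XDP_FLAGS_MODES.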

Signed-off-by: David Ahern <dahern@digitalocean.com>
Co-developed-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
---
 include/linux/netdevice.h |  2 ++
 net/core/dev.c            | 20 +++++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 594c13d4cd00..ee0cb73ca18a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -873,8 +873,10 @@ enum bpf_netdev_command {
 	 */
 	XDP_SETUP_PROG,
 	XDP_SETUP_PROG_HW,
+	XDP_SETUP_PROG_EGRESS,
 	XDP_QUERY_PROG,
 	XDP_QUERY_PROG_HW,
+	XDP_QUERY_PROG_EGRESS,
 	/* BPF program for offload callbacks, invoked at program load time. */
 	BPF_OFFLOAD_MAP_ALLOC,
 	BPF_OFFLOAD_MAP_FREE,
diff --git a/net/core/dev.c b/net/core/dev.c
index afff16849c26..c0455e764f97 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8600,13 +8600,16 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
 			   struct bpf_prog *prog)
 {
 	bool non_hw = !(flags & XDP_FLAGS_HW_MODE);
+	bool egress = flags & XDP_FLAGS_EGRESS_MODE;
 	struct bpf_prog *prev_prog = NULL;
 	struct netdev_bpf xdp;
 	int err;
 
 	if (non_hw) {
-		prev_prog = bpf_prog_by_id(__dev_xdp_query(dev, bpf_op,
-							   XDP_QUERY_PROG));
+		enum bpf_netdev_command cmd;
+
+		cmd = egress ? XDP_QUERY_PROG_EGRESS : XDP_QUERY_PROG;
+		prev_prog = bpf_prog_by_id(__dev_xdp_query(dev, bpf_op, cmd));
 		if (IS_ERR(prev_prog))
 			prev_prog = NULL;
 	}
@@ -8615,7 +8618,7 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
 	if (flags & XDP_FLAGS_HW_MODE)
 		xdp.command = XDP_SETUP_PROG_HW;
 	else
-		xdp.command = XDP_SETUP_PROG;
+		xdp.command = egress ? XDP_SETUP_PROG_EGRESS : XDP_SETUP_PROG;
 	xdp.extack = extack;
 	xdp.flags = flags;
 	xdp.prog = prog;
@@ -8677,12 +8680,18 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 	bpf_op_t bpf_op, bpf_chk;
 	struct bpf_prog *prog;
 	bool offload;
+	bool egress;
 	int err;
 
 	ASSERT_RTNL();
 
 	offload = flags & XDP_FLAGS_HW_MODE;
-	query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG;
+	egress = flags & XDP_FLAGS_EGRESS_MODE;
+	if (egress)
+		query = XDP_QUERY_PROG_EGRESS;
+	else
+		query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG;
+
 
 	bpf_op = bpf_chk = ops->ndo_bpf;
 	if (!bpf_op && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
@@ -8712,7 +8721,8 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		}
 	}
 	if (fd >= 0) {
-		if (!offload && __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) {
+		if (!offload && !egress &&
+		    __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) {
 			NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
 			return -EEXIST;
 		}
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 04/15] net: Add BPF_XDP_EGRESS as a bpf_attach_type
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (2 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 03/15] net: Add XDP setup and query commands for Tx programs David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 05/15] xdp: Add xdp_txq_info to xdp_buff David Ahern
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
at the XDP layer, but in the egress path.

Since the egress path does not have ingress_ifindex and rx_queue_index
set, update xdp_is_valid_access to block access to these entries in
the xdp context when a program is attached with expected_attach_type
set.

Update dev_change_xdp_fd to verify expected_attach_type for a program
is BPF_XDP_EGRESS if egress argument is set.

The next patch adds support for the egress ifindex.
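
The attach-type check added to dev_change_xdp_fd amounts to the
following rule, shown here as a standalone sketch. The mock enum values
and the helper name are made up for illustration; only the two
rejection conditions are taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in attach types; the numeric values are arbitrary for this sketch. */
enum mock_attach_type {
	MOCK_ATTACH_UNSPEC = 0,
	MOCK_BPF_XDP_EGRESS = 1,
};

/*
 * Return 0 when the program's expected attach type matches the direction
 * requested by the flags, -EINVAL otherwise.
 */
static int check_xdp_attach_type(bool egress, enum mock_attach_type expected)
{
	/* Tx path requires a program loaded for BPF_XDP_EGRESS */
	if (egress && expected != MOCK_BPF_XDP_EGRESS)
		return -EINVAL;
	/* Rx path must not be given an egress program */
	if (!egress && expected == MOCK_BPF_XDP_EGRESS)
		return -EINVAL;
	return 0;
}
```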

Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Co-developed-by: David Ahern <dahern@digitalocean.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/uapi/linux/bpf.h       |  1 +
 net/core/dev.c                 | 11 +++++++++++
 net/core/filter.c              | 11 +++++++++++
 tools/include/uapi/linux/bpf.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4a6c47f3febe..c89da2c2b1f2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -215,6 +215,7 @@ enum bpf_attach_type {
 	BPF_TRACE_FEXIT,
 	BPF_MODIFY_RETURN,
 	BPF_LSM_MAC,
+	BPF_XDP_EGRESS,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/net/core/dev.c b/net/core/dev.c
index c0455e764f97..88672ea4fc80 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8737,6 +8737,17 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		if (IS_ERR(prog))
 			return PTR_ERR(prog);
 
+		if (egress && prog->expected_attach_type != BPF_XDP_EGRESS) {
+			NL_SET_ERR_MSG(extack, "XDP program in Tx path must use BPF_XDP_EGRESS attach type");
+			bpf_prog_put(prog);
+			return -EINVAL;
+		}
+		if (!egress && prog->expected_attach_type == BPF_XDP_EGRESS) {
+			NL_SET_ERR_MSG(extack, "XDP program in Rx path can not use BPF_XDP_EGRESS attach type");
+			bpf_prog_put(prog);
+			return -EINVAL;
+		}
+
 		if (!offload && bpf_prog_is_dev_bound(prog->aux)) {
 			NL_SET_ERR_MSG(extack, "using device-bound program without HW_MODE flag is not supported");
 			bpf_prog_put(prog);
diff --git a/net/core/filter.c b/net/core/filter.c
index da3b7a72c37c..00e1941137ab 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6861,6 +6861,17 @@ static bool xdp_is_valid_access(int off, int size,
 				const struct bpf_prog *prog,
 				struct bpf_insn_access_aux *info)
 {
+	/* Rx data is only accessible from original XDP where
+	 * expected_attach_type is not set
+	 */
+	if (prog->expected_attach_type) {
+		switch (off) {
+		case offsetof(struct xdp_md, ingress_ifindex):
+		case offsetof(struct xdp_md, rx_queue_index):
+			return false;
+		}
+	}
+
 	if (type == BPF_WRITE) {
 		if (bpf_prog_is_dev_bound(prog->aux)) {
 			switch (off) {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4a6c47f3febe..c89da2c2b1f2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -215,6 +215,7 @@ enum bpf_attach_type {
 	BPF_TRACE_FEXIT,
 	BPF_MODIFY_RETURN,
 	BPF_LSM_MAC,
+	BPF_XDP_EGRESS,
 	__MAX_BPF_ATTACH_TYPE
 };
 
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 05/15] xdp: Add xdp_txq_info to xdp_buff
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (3 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 04/15] net: Add BPF_XDP_EGRESS as a bpf_attach_type David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 06/15] net: Rename do_xdp_generic to do_xdp_generic_rx David Ahern
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add xdp_txq_info as the Tx counterpart to xdp_rxq_info. At the
moment only the device is added. Other fields (queue_index)
can be added as use cases arise.

From a UAPI perspective, add egress_ifindex to xdp context.

Update the verifier to reject accesses to egress_ifindex by
rx programs.
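
Taken together with the previous patch, the context-field rules can be
sketched as below. struct mock_xdp_md copies the field order of struct
xdp_md with egress_ifindex appended. The helper is a simplification:
it takes a plain egress/ingress bool, whereas the patch keys the Rx
field check on any expected_attach_type being set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct xdp_md after this patch. */
struct mock_xdp_md {
	uint32_t data;
	uint32_t data_end;
	uint32_t data_meta;
	uint32_t ingress_ifindex;	/* rxq->dev->ifindex */
	uint32_t rx_queue_index;	/* rxq->queue_index */
	uint32_t egress_ifindex;	/* txq->dev->ifindex */
};

/*
 * Simplified access rule: egress programs may not read the Rx fields,
 * and Rx programs may not read egress_ifindex.
 */
static bool mock_xdp_field_allowed(int off, bool egress_prog)
{
	if (egress_prog) {
		switch (off) {
		case offsetof(struct mock_xdp_md, ingress_ifindex):
		case offsetof(struct mock_xdp_md, rx_queue_index):
			return false;
		}
	} else {
		switch (off) {
		case offsetof(struct mock_xdp_md, egress_ifindex):
			return false;
		}
	}
	return true;
}
```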

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/net/xdp.h              |  5 +++++
 include/uapi/linux/bpf.h       |  2 ++
 net/core/filter.c              | 15 +++++++++++++++
 tools/include/uapi/linux/bpf.h |  2 ++
 4 files changed, 24 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 3264fa882de3..2b85b8649201 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -63,6 +63,10 @@ struct xdp_rxq_info {
 	struct xdp_mem_info mem;
 } ____cacheline_aligned; /* perf critical, avoid false-sharing */
 
+struct xdp_txq_info {
+	struct net_device *dev;
+};
+
 struct xdp_buff {
 	void *data;
 	void *data_end;
@@ -70,6 +74,7 @@ struct xdp_buff {
 	void *data_hard_start;
 	unsigned long handle;
 	struct xdp_rxq_info *rxq;
+	struct xdp_txq_info *txq;
 };
 
 struct xdp_frame {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c89da2c2b1f2..3166f7f2696e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3498,6 +3498,8 @@ struct xdp_md {
 	/* Below access go through struct xdp_rxq_info */
 	__u32 ingress_ifindex; /* rxq->dev->ifindex */
 	__u32 rx_queue_index;  /* rxq->queue_index  */
+
+	__u32 egress_ifindex;  /* txq->dev->ifindex */
 };
 
 enum sk_action {
diff --git a/net/core/filter.c b/net/core/filter.c
index 00e1941137ab..b6ab7b4dcc1f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6870,6 +6870,11 @@ static bool xdp_is_valid_access(int off, int size,
 		case offsetof(struct xdp_md, rx_queue_index):
 			return false;
 		}
+	} else if (prog->expected_attach_type != BPF_XDP_EGRESS) {
+		switch (off) {
+		case offsetof(struct xdp_md, egress_ifindex):
+			return false;
+		}
 	}
 
 	if (type == BPF_WRITE) {
@@ -7819,6 +7824,16 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
 				      offsetof(struct xdp_rxq_info,
 					       queue_index));
 		break;
+	case offsetof(struct xdp_md, egress_ifindex):
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, txq),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct xdp_buff, txq));
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_txq_info, dev),
+				      si->dst_reg, si->dst_reg,
+				      offsetof(struct xdp_txq_info, dev));
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
+				      offsetof(struct net_device, ifindex));
+		break;
 	}
 
 	return insn - insn_buf;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c89da2c2b1f2..3166f7f2696e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3498,6 +3498,8 @@ struct xdp_md {
 	/* Below access go through struct xdp_rxq_info */
 	__u32 ingress_ifindex; /* rxq->dev->ifindex */
 	__u32 rx_queue_index;  /* rxq->queue_index  */
+
+	__u32 egress_ifindex;  /* txq->dev->ifindex */
 };
 
 enum sk_action {
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 06/15] net: Rename do_xdp_generic to do_xdp_generic_rx
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (4 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 05/15] xdp: Add xdp_txq_info to xdp_buff David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 07/15] net: rename netif_receive_generic_xdp to do_generic_xdp_core David Ahern
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Rename do_xdp_generic to do_xdp_generic_rx to emphasize its use in the
Rx path.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 drivers/net/tun.c         | 4 ++--
 include/linux/netdevice.h | 2 +-
 net/core/dev.c            | 7 ++++---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 44889eba1dbc..efe655d27661 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1898,7 +1898,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		rcu_read_lock();
 		xdp_prog = rcu_dereference(tun->xdp_prog);
 		if (xdp_prog) {
-			ret = do_xdp_generic(xdp_prog, skb);
+			ret = do_xdp_generic_rx(xdp_prog, skb);
 			if (ret != XDP_PASS) {
 				rcu_read_unlock();
 				local_bh_enable();
@@ -2463,7 +2463,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 	skb_record_rx_queue(skb, tfile->queue_index);
 
 	if (skb_xdp) {
-		err = do_xdp_generic(xdp_prog, skb);
+		err = do_xdp_generic_rx(xdp_prog, skb);
 		if (err != XDP_PASS)
 			goto out;
 	}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ee0cb73ca18a..f4d24d9ea4f9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3714,7 +3714,7 @@ static inline void dev_consume_skb_any(struct sk_buff *skb)
 }
 
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
-int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb);
+int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb);
 int netif_rx(struct sk_buff *skb);
 int netif_rx_ni(struct sk_buff *skb);
 int netif_receive_skb(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 88672ea4fc80..9a352ff32b72 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4642,7 +4642,7 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
 
 static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
 
-int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
+int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 {
 	if (xdp_prog) {
 		struct xdp_buff xdp;
@@ -4670,7 +4670,7 @@ int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 	kfree_skb(skb);
 	return XDP_DROP;
 }
-EXPORT_SYMBOL_GPL(do_xdp_generic);
+EXPORT_SYMBOL_GPL(do_xdp_generic_rx);
 
 static int netif_rx_internal(struct sk_buff *skb)
 {
@@ -5020,7 +5020,8 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc,
 		int ret2;
 
 		preempt_disable();
-		ret2 = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), skb);
+		ret2 = do_xdp_generic_rx(rcu_dereference(skb->dev->xdp_prog),
+					 skb);
 		preempt_enable();
 
 		if (ret2 != XDP_PASS)
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 07/15] net: rename netif_receive_generic_xdp to do_generic_xdp_core
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (5 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 06/15] net: Rename do_xdp_generic to do_xdp_generic_rx David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 08/15] net: set XDP egress program on netdevice David Ahern
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

In the skb generic path, we need a way to run an XDP program on an skb
while keeping customized handling of the XDP actions.
netif_receive_generic_xdp is better suited to such cases than
do_xdp_generic.

This patch prepares netif_receive_generic_xdp() to be used as a general
purpose function for running xdp programs on skbs by renaming it to
do_xdp_generic_core and moving the skb_is_redirected check, the rxq
setup, and the XDP return code checks to the callers.

This allows this core function to be used from both Rx and Tx paths
with rxq and txq set based on context.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 net/core/dev.c | 52 ++++++++++++++++++++++++--------------------------
 1 file changed, 25 insertions(+), 27 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9a352ff32b72..5bbdbc0c0a92 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4501,25 +4501,17 @@ static struct netdev_rx_queue *netif_get_rxqueue(struct sk_buff *skb)
 	return rxqueue;
 }
 
-static u32 netif_receive_generic_xdp(struct sk_buff *skb,
-				     struct xdp_buff *xdp,
-				     struct bpf_prog *xdp_prog)
+static u32 do_xdp_generic_core(struct sk_buff *skb, struct xdp_buff *xdp,
+			       struct bpf_prog *xdp_prog)
 {
-	struct netdev_rx_queue *rxqueue;
 	void *orig_data, *orig_data_end;
-	u32 metalen, act = XDP_DROP;
 	__be16 orig_eth_type;
 	struct ethhdr *eth;
+	u32 metalen, act;
 	bool orig_bcast;
 	int hlen, off;
 	u32 mac_len;
 
-	/* Reinjected packets coming from act_mirred or similar should
-	 * not get XDP generic processing.
-	 */
-	if (skb_is_redirected(skb))
-		return XDP_PASS;
-
 	/* XDP packets must be linear and must have sufficient headroom
 	 * of XDP_PACKET_HEADROOM bytes. This is the guarantee that also
 	 * native XDP provides, thus we need to do it here as well.
@@ -4535,9 +4527,9 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 		if (pskb_expand_head(skb,
 				     hroom > 0 ? ALIGN(hroom, NET_SKB_PAD) : 0,
 				     troom > 0 ? troom + 128 : 0, GFP_ATOMIC))
-			goto do_drop;
+			return XDP_DROP;
 		if (skb_linearize(skb))
-			goto do_drop;
+			return XDP_DROP;
 	}
 
 	/* The XDP program wants to see the packet starting at the MAC
@@ -4555,9 +4547,6 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 	orig_bcast = is_multicast_ether_addr_64bits(eth->h_dest);
 	orig_eth_type = eth->h_proto;
 
-	rxqueue = netif_get_rxqueue(skb);
-	xdp->rxq = &rxqueue->xdp_rxq;
-
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 
 	/* check if bpf_xdp_adjust_head was used */
@@ -4600,16 +4589,6 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 		if (metalen)
 			skb_metadata_set(skb, metalen);
 		break;
-	default:
-		bpf_warn_invalid_xdp_action(act);
-		/* fall through */
-	case XDP_ABORTED:
-		trace_xdp_exception(skb->dev, xdp_prog, act);
-		/* fall through */
-	case XDP_DROP:
-	do_drop:
-		kfree_skb(skb);
-		break;
 	}
 
 	return act;
@@ -4644,12 +4623,22 @@ static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
 
 int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 {
+	/* Reinjected packets coming from act_mirred or similar should
+	 * not get XDP generic processing.
+	 */
+	if (skb_is_redirected(skb))
+		return XDP_PASS;
+
 	if (xdp_prog) {
+		struct netdev_rx_queue *rxqueue;
 		struct xdp_buff xdp;
 		u32 act;
 		int err;
 
-		act = netif_receive_generic_xdp(skb, &xdp, xdp_prog);
+		rxqueue = netif_get_rxqueue(skb);
+		xdp.rxq = &rxqueue->xdp_rxq;
+
+		act = do_xdp_generic_core(skb, &xdp, xdp_prog);
 		if (act != XDP_PASS) {
 			switch (act) {
 			case XDP_REDIRECT:
@@ -4661,6 +4650,15 @@ int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 			case XDP_TX:
 				generic_xdp_tx(skb, xdp_prog);
 				break;
+			default:
+				bpf_warn_invalid_xdp_action(act);
+				/* fall through */
+			case XDP_ABORTED:
+				trace_xdp_exception(skb->dev, xdp_prog, act);
+				/* fall through */
+			case XDP_DROP:
+				kfree_skb(skb);
+				break;
 			}
 			return XDP_DROP;
 		}
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 08/15] net: set XDP egress program on netdevice
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (6 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 07/15] net: rename netif_receive_generic_xdp to do_generic_xdp_core David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb David Ahern
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

This patch handles the plumbing for installing an XDP egress
program on a net_device by handling XDP_SETUP_PROG_EGRESS and
XDP_QUERY_PROG_EGRESS in the generic_xdp_install handler. A new
static key is added to signal when an egress program has been
installed.

Update dev_xdp_uninstall to remove egress programs.
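
As a rough illustration of the static key accounting (not the kernel
code itself), the sketch below models the key as a plain counter. The
names egress_users and swap_egress_prog are hypothetical; the patch
uses static_branch_inc/static_branch_dec on xdp_egress_needed_key.

```c
#include <stddef.h>

/* Hypothetical stand-in for the static key's enable count. */
static int egress_users;

/* Swap the program in *slot for new_prog and adjust the count the way
 * the XDP_SETUP_PROG_EGRESS case does: up on first attach, down on
 * detach, unchanged when one program replaces another.
 */
static void swap_egress_prog(const void **slot, const void *new_prog)
{
	const void *old = *slot;

	*slot = new_prog;
	if (old && !new_prog)
		egress_users--;
	else if (new_prog && !old)
		egress_users++;
}
```

The count only moves on a transition between "no program" and "some
program", so replacing an attached program leaves the key untouched.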

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/linux/netdevice.h |  2 ++
 net/core/dev.c            | 48 +++++++++++++++++++++++++++------------
 2 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f4d24d9ea4f9..2b552c29e188 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -752,6 +752,8 @@ struct netdev_rx_queue {
 #endif
 } ____cacheline_aligned_in_smp;
 
+extern struct static_key_false xdp_egress_needed_key;
+
 /*
  * RX queue sysfs structures and functions.
  */
diff --git a/net/core/dev.c b/net/core/dev.c
index 5bbdbc0c0a92..14ce8e25e3d3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,6 +4620,7 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
 }
 
 static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
+DEFINE_STATIC_KEY_FALSE(xdp_egress_needed_key);
 
 int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 {
@@ -5335,12 +5336,12 @@ static void __netif_receive_skb_list(struct list_head *head)
 
 static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 {
-	struct bpf_prog *old = rtnl_dereference(dev->xdp_prog);
-	struct bpf_prog *new = xdp->prog;
+	struct bpf_prog *old, *new = xdp->prog;
 	int ret = 0;
 
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
+		old = rtnl_dereference(dev->xdp_prog);
 		rcu_assign_pointer(dev->xdp_prog, new);
 		if (old)
 			bpf_prog_put(old);
@@ -5353,11 +5354,25 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 			dev_disable_gro_hw(dev);
 		}
 		break;
+	case XDP_SETUP_PROG_EGRESS:
+		old = rtnl_dereference(dev->xdp_egress_prog);
+		rcu_assign_pointer(dev->xdp_egress_prog, new);
+		if (old)
+			bpf_prog_put(old);
 
+		if (old && !new)
+			static_branch_dec(&xdp_egress_needed_key);
+		else if (new && !old)
+			static_branch_inc(&xdp_egress_needed_key);
+		break;
 	case XDP_QUERY_PROG:
+		old = rtnl_dereference(dev->xdp_prog);
+		xdp->prog_id = old ? old->aux->id : 0;
+		break;
+	case XDP_QUERY_PROG_EGRESS:
+		old = rtnl_dereference(dev->xdp_egress_prog);
 		xdp->prog_id = old ? old->aux->id : 0;
 		break;
-
 	default:
 		ret = -EINVAL;
 		break;
@@ -8640,6 +8655,10 @@ static void dev_xdp_uninstall(struct net_device *dev)
 	/* Remove generic XDP */
 	WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL, 0, NULL));
 
+	/* Remove XDP egress */
+	WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL,
+				XDP_FLAGS_EGRESS_MODE, NULL));
+
 	/* Remove from the driver */
 	ndo_bpf = dev->netdev_ops->ndo_bpf;
 	if (!ndo_bpf)
@@ -8686,21 +8705,22 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 
 	offload = flags & XDP_FLAGS_HW_MODE;
 	egress = flags & XDP_FLAGS_EGRESS_MODE;
-	if (egress)
+	if (egress) {
 		query = XDP_QUERY_PROG_EGRESS;
-	else
+		bpf_op = bpf_chk = generic_xdp_install;
+	} else {
 		query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG;
 
-
-	bpf_op = bpf_chk = ops->ndo_bpf;
-	if (!bpf_op && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
-		NL_SET_ERR_MSG(extack, "underlying driver does not support XDP in native mode");
-		return -EOPNOTSUPP;
+		bpf_op = bpf_chk = ops->ndo_bpf;
+		if (!bpf_op && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
+			NL_SET_ERR_MSG(extack, "underlying driver does not support XDP in native mode");
+			return -EOPNOTSUPP;
+		}
+		if (!bpf_op || (flags & XDP_FLAGS_SKB_MODE))
+			bpf_op = generic_xdp_install;
+		if (bpf_op == bpf_chk)
+			bpf_chk = generic_xdp_install;
 	}
-	if (!bpf_op || (flags & XDP_FLAGS_SKB_MODE))
-		bpf_op = generic_xdp_install;
-	if (bpf_op == bpf_chk)
-		bpf_chk = generic_xdp_install;
 
 	prog_id = __dev_xdp_query(dev, bpf_op, query);
 	if (flags & XDP_FLAGS_REPLACE) {
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (7 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 08/15] net: set XDP egress program on netdevice David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-28 15:05   ` Daniel Borkmann
  2020-04-27 22:46 ` [PATCH v4 bpf-next 10/15] net: Support xdp in the Tx path for xdp_frames David Ahern
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add support for running a Tx path program on packets about to hit the
ndo_start_xmit function for a device. Only XDP_DROP and XDP_PASS
are supported now. Conceptually, XDP_REDIRECT for this path can
work the same as it does for the Rx path, but that support is left
for a follow-on series.
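
A minimal sketch of the verdict handling described above, assuming
only the numeric values of enum xdp_action from the UAPI header. The
SK_ prefixed names and egress_verdict are hypothetical and exist only
for this illustration.

```c
/* Values mirror enum xdp_action in the UAPI header; the SK_ prefix
 * marks them as local to this sketch.
 */
enum sk_xdp_action {
	SK_XDP_ABORTED = 0,
	SK_XDP_DROP,
	SK_XDP_PASS,
	SK_XDP_TX,
	SK_XDP_REDIRECT,
};

/* Reduce a program's return code to the two verdicts the egress hook
 * honours. Anything other than PASS or DROP is treated as DROP; the
 * patch additionally warns on unknown or unsupported codes and traces
 * the exception.
 */
static unsigned int egress_verdict(unsigned int act)
{
	switch (act) {
	case SK_XDP_PASS:
	case SK_XDP_DROP:
		return act;
	default:
		return SK_XDP_DROP;
	}
}
```

Mapping every unsupported code to a drop means a program written for
the Rx path that returns XDP_TX or XDP_REDIRECT fails closed on egress.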

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/linux/netdevice.h | 11 +++++++++
 net/core/dev.c            | 52 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2b552c29e188..33a09396444f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3717,6 +3717,7 @@ static inline void dev_consume_skb_any(struct sk_buff *skb)
 
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
 int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb);
+u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb);
 int netif_rx(struct sk_buff *skb);
 int netif_rx_ni(struct sk_buff *skb);
 int netif_receive_skb(struct sk_buff *skb);
@@ -4577,6 +4578,16 @@ static inline netdev_tx_t __netdev_start_xmit(const struct net_device_ops *ops,
 					      struct sk_buff *skb, struct net_device *dev,
 					      bool more)
 {
+	if (static_branch_unlikely(&xdp_egress_needed_key)) {
+		u32 act;
+
+		rcu_read_lock();
+		act = do_xdp_egress_skb(dev, skb);
+		rcu_read_unlock();
+		if (act == XDP_DROP)
+			return NET_XMIT_DROP;
+	}
+
 	__this_cpu_write(softnet_data.xmit.more, more);
 	return ops->ndo_start_xmit(skb, dev);
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 14ce8e25e3d3..4d98189548c7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,7 +4620,6 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
 }
 
 static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
-DEFINE_STATIC_KEY_FALSE(xdp_egress_needed_key);
 
 int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 {
@@ -4671,6 +4670,57 @@ int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(do_xdp_generic_rx);
 
+DEFINE_STATIC_KEY_FALSE(xdp_egress_needed_key);
+EXPORT_SYMBOL_GPL(xdp_egress_needed_key);
+
+static u32 handle_xdp_egress_act(u32 act, struct net_device *dev,
+				 struct bpf_prog *xdp_prog)
+{
+	switch (act) {
+	case XDP_DROP:
+		/* fall through */
+	case XDP_PASS:
+		break;
+	case XDP_TX:
+		/* fall through */
+	case XDP_REDIRECT:
+		/* fall through */
+	default:
+		bpf_warn_invalid_xdp_action(act);
+		/* fall through */
+	case XDP_ABORTED:
+		trace_xdp_exception(dev, xdp_prog, act);
+		act = XDP_DROP;
+		break;
+	}
+
+	return act;
+}
+
+u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb)
+{
+	struct bpf_prog *xdp_prog;
+	u32 act = XDP_PASS;
+
+	xdp_prog = rcu_dereference(dev->xdp_egress_prog);
+	if (xdp_prog) {
+		struct xdp_txq_info txq = { .dev = dev };
+		struct xdp_buff xdp;
+
+		xdp.txq = &txq;
+		act = do_xdp_generic_core(skb, &xdp, xdp_prog);
+		act = handle_xdp_egress_act(act, dev, xdp_prog);
+		if (act == XDP_DROP) {
+			atomic_long_inc(&dev->tx_dropped);
+			skb_tx_error(skb);
+			kfree_skb(skb);
+		}
+	}
+
+	return act;
+}
+EXPORT_SYMBOL_GPL(do_xdp_egress_skb);
+
 static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 10/15] net: Support xdp in the Tx path for xdp_frames
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (8 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 11/15] libbpf: Add egress XDP support David Ahern
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add support for running a Tx path program on xdp_frames by adding a
hook to bq_xmit_all before xdp_frames are passed to ndo_xdp_xmit for
the device.

If an xdp_frame is dropped by the program, it is removed from the
xdp_frames array with subsequent entries moved up.
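
The in-place compaction can be sketched as follows. This is an
illustration under simplifying assumptions: struct frame and its drop
field are hypothetical stand-ins for struct xdp_frame and the
program's verdict, and the sketch does not return dropped frames to
their allocator as the patch does.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct xdp_frame; drop models the verdict. */
struct frame {
	int drop;
};

/* Keep the frames that are not dropped, preserving their order, and
 * return how many remain at the front of the array.
 */
static unsigned int compact_frames(struct frame **frames, unsigned int count)
{
	unsigned int i, j;

	for (i = 0, j = 0; i < count; i++) {
		if (frames[i]->drop)
			continue;	/* the patch frees the frame here */
		frames[j++] = frames[i];
	}

	return j;
}
```

Because the write index never passes the read index, the array can be
compacted without a second buffer, and the caller only needs the new
count to hand the surviving frames to the driver.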

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/linux/netdevice.h |  3 ++
 kernel/bpf/devmap.c       | 19 +++++++++----
 net/core/dev.c            | 59 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 33a09396444f..8c707ce9ab65 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3718,6 +3718,9 @@ static inline void dev_consume_skb_any(struct sk_buff *skb)
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
 int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb);
 u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb);
+unsigned int do_xdp_egress_frame(struct net_device *dev,
+				 struct xdp_frame **frames,
+				 unsigned int count);
 int netif_rx(struct sk_buff *skb);
 int netif_rx_ni(struct sk_buff *skb);
 int netif_receive_skb(struct sk_buff *skb);
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index a51d9fb7a359..3add36d697a8 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -321,24 +321,33 @@ static int bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags)
 {
 	struct net_device *dev = bq->dev;
 	int sent = 0, drops = 0, err = 0;
+	unsigned int count = bq->count;
 	int i;
 
-	if (unlikely(!bq->count))
+	if (unlikely(!count))
 		return 0;
 
-	for (i = 0; i < bq->count; i++) {
+	for (i = 0; i < count; i++) {
 		struct xdp_frame *xdpf = bq->q[i];
 
 		prefetch(xdpf);
 	}
 
-	sent = dev->netdev_ops->ndo_xdp_xmit(dev, bq->count, bq->q, flags);
+	if (static_branch_unlikely(&xdp_egress_needed_key)) {
+		count = do_xdp_egress_frame(dev, bq->q, count);
+		drops += bq->count - count;
+		/* all frames consumed by the xdp program? */
+		if (!count)
+			goto out;
+	}
+
+	sent = dev->netdev_ops->ndo_xdp_xmit(dev, count, bq->q, flags);
 	if (sent < 0) {
 		err = sent;
 		sent = 0;
 		goto error;
 	}
-	drops = bq->count - sent;
+	drops += count - sent;
 out:
 	bq->count = 0;
 
@@ -350,7 +359,7 @@ static int bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags)
 	/* If ndo_xdp_xmit fails with an errno, no frames have been
 	 * xmit'ed and it's our responsibility to them free all.
 	 */
-	for (i = 0; i < bq->count; i++) {
+	for (i = 0; i < count; i++) {
 		struct xdp_frame *xdpf = bq->q[i];
 
 		xdp_return_frame_rx_napi(xdpf);
diff --git a/net/core/dev.c b/net/core/dev.c
index 4d98189548c7..62ef6bf80998 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4721,6 +4721,65 @@ u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(do_xdp_egress_skb);
 
+static u32 __xdp_egress_frame(struct net_device *dev,
+			      struct bpf_prog *xdp_prog,
+			      struct xdp_frame *xdp_frame,
+			      struct xdp_txq_info *txq)
+{
+	struct xdp_buff xdp;
+	u32 act;
+
+	xdp.data_hard_start = xdp_frame->data - xdp_frame->headroom
+			      - sizeof(*xdp_frame);
+	xdp.data = xdp_frame->data;
+	xdp.data_end = xdp.data + xdp_frame->len;
+	xdp.data_meta = xdp.data - xdp_frame->metasize;
+	xdp.txq = txq;
+
+	act = bpf_prog_run_xdp(xdp_prog, &xdp);
+	act = handle_xdp_egress_act(act, dev, xdp_prog);
+
+	/* if not dropping frame, readjust pointers in case
+	 * program made changes to the buffer
+	 */
+	if (act != XDP_DROP) {
+		if (unlikely(!update_xdp_frame(&xdp, xdp_frame)))
+			return XDP_DROP;
+	}
+
+	return act;
+}
+
+unsigned int do_xdp_egress_frame(struct net_device *dev,
+				 struct xdp_frame **frames,
+				 unsigned int count)
+{
+	struct bpf_prog *xdp_prog;
+
+	xdp_prog = rcu_dereference(dev->xdp_egress_prog);
+	if (xdp_prog) {
+		struct xdp_txq_info txq = { .dev = dev };
+		unsigned int i, j;
+		u32 act;
+
+		for (i = 0, j = 0; i < count; i++) {
+			struct xdp_frame *frame = frames[i];
+
+			act = __xdp_egress_frame(dev, xdp_prog, frame, &txq);
+			if (act == XDP_DROP) {
+				xdp_return_frame_rx_napi(frame);
+				continue;
+			}
+
+			frames[j] = frame;
+			j++;
+		}
+		count = j;
+	}
+
+	return count;
+}
+
 static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 11/15] libbpf: Add egress XDP support
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (9 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 10/15] net: Support xdp in the Tx path for xdp_frames David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress David Ahern
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

A new section name hint, xdp_egress, is added to set the expected
attach type at program load. Programs can use xdp_egress as the prefix
in the SEC statement to load the program with the BPF_XDP_EGRESS
attach type set.

egress_prog_id is added to xdp_link_info to report the program
id.

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 tools/lib/bpf/libbpf.c  | 2 ++
 tools/lib/bpf/libbpf.h  | 1 +
 tools/lib/bpf/netlink.c | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 8e1dc6980fac..60c9c503e1c6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6366,6 +6366,8 @@ static const struct bpf_sec_def section_defs[] = {
 		.is_attach_btf = true,
 		.expected_attach_type = BPF_LSM_MAC,
 		.attach_fn = attach_lsm),
+	BPF_EAPROG_SEC("xdp_egress",		BPF_PROG_TYPE_XDP,
+						BPF_XDP_EGRESS),
 	BPF_PROG_SEC("xdp",			BPF_PROG_TYPE_XDP),
 	BPF_PROG_SEC("perf_event",		BPF_PROG_TYPE_PERF_EVENT),
 	BPF_PROG_SEC("lwt_in",			BPF_PROG_TYPE_LWT_IN),
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index f1dacecb1619..445c3789faa4 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -454,6 +454,7 @@ struct xdp_link_info {
 	__u32 hw_prog_id;
 	__u32 skb_prog_id;
 	__u8 attach_mode;
+	__u32 egress_prog_id;
 };
 
 struct bpf_xdp_set_link_opts {
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 312f887570b2..da0b383dbd5d 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -280,6 +280,10 @@ static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb)
 		xdp_id->info.hw_prog_id = libbpf_nla_getattr_u32(
 			xdp_tb[IFLA_XDP_HW_PROG_ID]);
 
+	if (xdp_tb[IFLA_XDP_EGRESS_PROG_ID])
+		xdp_id->info.egress_prog_id = libbpf_nla_getattr_u32(
+			xdp_tb[IFLA_XDP_EGRESS_PROG_ID]);
+
 	return 0;
 }
 
@@ -331,6 +335,8 @@ static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags)
 		return info->hw_prog_id;
 	if (flags & XDP_FLAGS_SKB_MODE)
 		return info->skb_prog_id;
+	if (flags & XDP_FLAGS_EGRESS_MODE)
+		return info->egress_prog_id;
 
 	return 0;
 }
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (10 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 11/15] libbpf: Add egress XDP support David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-28  8:08   ` Quentin Monnet
  2020-04-27 22:46 ` [PATCH v4 bpf-next 13/15] selftest: Add test for xdp_egress David Ahern
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add xdp_egress as a program type since it requires a new attach
type. This follows suit with other program type + attach type
combinations and leverages the SEC name in libbpf.

Add NET_ATTACH_TYPE_XDP_EGRESS and update attach_type_strings to
allow a user to specify 'xdp_egress' as the attach or detach point.

Update do_attach_detach_xdp to set XDP_FLAGS_EGRESS_MODE if egress
is selected.

Update do_xdp_dump_one to show egress program ids.

Update the documentation and help output.

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 tools/bpf/bpftool/Documentation/bpftool-net.rst  | 4 +++-
 tools/bpf/bpftool/Documentation/bpftool-prog.rst | 2 +-
 tools/bpf/bpftool/bash-completion/bpftool        | 4 ++--
 tools/bpf/bpftool/net.c                          | 6 +++++-
 tools/bpf/bpftool/netlink_dumper.c               | 5 +++++
 tools/bpf/bpftool/prog.c                         | 2 +-
 6 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-net.rst b/tools/bpf/bpftool/Documentation/bpftool-net.rst
index 8651b00b81ea..d7398fb00ec4 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-net.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-net.rst
@@ -26,7 +26,8 @@ NET COMMANDS
 |	**bpftool** **net help**
 |
 |	*PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* }
-|	*ATTACH_TYPE* := { **xdp** | **xdpgeneric** | **xdpdrv** | **xdpoffload** }
+|	*ATTACH_TYPE* :=
+|       { **xdp** | **xdpgeneric** | **xdpdrv** | **xdpoffload** | **xdp_egress** }
 
 DESCRIPTION
 ===========
@@ -63,6 +64,7 @@ DESCRIPTION
                   **xdpgeneric** - Generic XDP. runs at generic XDP hook when packet already enters receive path as skb;
                   **xdpdrv** - Native XDP. runs earliest point in driver's receive path;
                   **xdpoffload** - Offload XDP. runs directly on NIC on each packet reception;
+                  **xdp_egress** - XDP in egress path. runs at core networking level;
 
 	**bpftool** **net detach** *ATTACH_TYPE* **dev** *NAME*
                   Detach bpf program attached to network interface *NAME* with
diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
index 9f19404f470e..ab0a8846a8e3 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
@@ -44,7 +44,7 @@ PROG COMMANDS
 |		**cgroup/connect4** | **cgroup/connect6** | **cgroup/sendmsg4** | **cgroup/sendmsg6** |
 |		**cgroup/recvmsg4** | **cgroup/recvmsg6** | **cgroup/sysctl** |
 |		**cgroup/getsockopt** | **cgroup/setsockopt** |
-|		**struct_ops** | **fentry** | **fexit** | **freplace**
+|		**struct_ops** | **fentry** | **fexit** | **freplace** | **xdp_egress**
 |	}
 |       *ATTACH_TYPE* := {
 |		**msg_verdict** | **stream_verdict** | **stream_parser** | **flow_dissector**
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 45ee99b159e2..ab20696c20c6 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -471,7 +471,7 @@ _bpftool()
                                 cgroup/post_bind4 cgroup/post_bind6 \
                                 cgroup/sysctl cgroup/getsockopt \
                                 cgroup/setsockopt struct_ops \
-                                fentry fexit freplace" -- \
+                                fentry fexit freplace xdp_egress" -- \
                                                    "$cur" ) )
                             return 0
                             ;;
@@ -1003,7 +1003,7 @@ _bpftool()
             ;;
         net)
             local PROG_TYPE='id pinned tag name'
-            local ATTACH_TYPES='xdp xdpgeneric xdpdrv xdpoffload'
+            local ATTACH_TYPES='xdp xdpgeneric xdpdrv xdpoffload xdp_egress'
             case $command in
                 show|list)
                     [[ $prev != "$command" ]] && return 0
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index c5e3895b7c8b..714fa075521b 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -61,6 +61,7 @@ enum net_attach_type {
 	NET_ATTACH_TYPE_XDP_GENERIC,
 	NET_ATTACH_TYPE_XDP_DRIVER,
 	NET_ATTACH_TYPE_XDP_OFFLOAD,
+	NET_ATTACH_TYPE_XDP_EGRESS,
 };
 
 static const char * const attach_type_strings[] = {
@@ -68,6 +69,7 @@ static const char * const attach_type_strings[] = {
 	[NET_ATTACH_TYPE_XDP_GENERIC]	= "xdpgeneric",
 	[NET_ATTACH_TYPE_XDP_DRIVER]	= "xdpdrv",
 	[NET_ATTACH_TYPE_XDP_OFFLOAD]	= "xdpoffload",
+	[NET_ATTACH_TYPE_XDP_EGRESS]	= "xdp_egress",
 };
 
 const size_t net_attach_type_size = ARRAY_SIZE(attach_type_strings);
@@ -286,6 +288,8 @@ static int do_attach_detach_xdp(int progfd, enum net_attach_type attach_type,
 		flags |= XDP_FLAGS_DRV_MODE;
 	if (attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
 		flags |= XDP_FLAGS_HW_MODE;
+	if (attach_type == NET_ATTACH_TYPE_XDP_EGRESS)
+		flags |= XDP_FLAGS_EGRESS_MODE;
 
 	return bpf_set_link_xdp_fd(ifindex, progfd, flags);
 }
@@ -464,7 +468,7 @@ static int do_help(int argc, char **argv)
 		"       %s %s help\n"
 		"\n"
 		"       " HELP_SPEC_PROGRAM "\n"
-		"       ATTACH_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload }\n"
+		"       ATTACH_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload | xdp_egress }\n"
 		"\n"
 		"Note: Only xdp and tc attachments are supported now.\n"
 		"      For progs attached to cgroups, use \"bpftool cgroup\"\n"
diff --git a/tools/bpf/bpftool/netlink_dumper.c b/tools/bpf/bpftool/netlink_dumper.c
index 5f65140b003b..68e4909b6073 100644
--- a/tools/bpf/bpftool/netlink_dumper.c
+++ b/tools/bpf/bpftool/netlink_dumper.c
@@ -55,6 +55,8 @@ static int do_xdp_dump_one(struct nlattr *attr, unsigned int ifindex,
 		xdp_dump_prog_id(tb, IFLA_XDP_SKB_PROG_ID, "generic", true);
 		xdp_dump_prog_id(tb, IFLA_XDP_DRV_PROG_ID, "driver", true);
 		xdp_dump_prog_id(tb, IFLA_XDP_HW_PROG_ID, "offload", true);
+		xdp_dump_prog_id(tb, IFLA_XDP_EGRESS_PROG_ID,
+				 "egress", true);
 		if (json_output)
 			jsonw_end_array(json_wtr);
 	} else if (mode == XDP_ATTACHED_DRV) {
@@ -63,6 +65,9 @@ static int do_xdp_dump_one(struct nlattr *attr, unsigned int ifindex,
 		xdp_dump_prog_id(tb, IFLA_XDP_PROG_ID, "generic", false);
 	} else if (mode == XDP_ATTACHED_HW) {
 		xdp_dump_prog_id(tb, IFLA_XDP_PROG_ID, "offload", false);
+	} else if (mode == XDP_ATTACHED_EGRESS_CORE) {
+		xdp_dump_prog_id(tb, IFLA_XDP_EGRESS_PROG_ID,
+				 "egress", false);
 	}
 
 	NET_END_OBJECT_FINAL;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index f6a5974a7b0a..64695ccdcf9d 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -2014,7 +2014,7 @@ static int do_help(int argc, char **argv)
 		"                 cgroup/post_bind6 | cgroup/connect4 | cgroup/connect6 |\n"
 		"                 cgroup/sendmsg4 | cgroup/sendmsg6 | cgroup/recvmsg4 |\n"
 		"                 cgroup/recvmsg6 | cgroup/getsockopt | cgroup/setsockopt |\n"
-		"                 struct_ops | fentry | fexit | freplace }\n"
+		"                 struct_ops | fentry | fexit | freplace | xdp_egress }\n"
 		"       ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n"
 		"                        flow_dissector }\n"
 		"       METRIC := { cycles | instructions | l1d_loads | llc_misses }\n"
-- 
2.21.1 (Apple Git-122.3)



* [PATCH v4 bpf-next 13/15] selftest: Add test for xdp_egress
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (11 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 14/15] selftest: Add xdp_egress attach tests David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 15/15] samples/bpf: add XDP egress support to xdp1 David Ahern
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add a selftest for xdp_egress. Attach the xdp_drop program to the veth
device connecting a namespace so that it drops IPv4 packets and breaks
IPv4 connectivity.

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/progs/xdp_drop.c  |  25 +++
 .../testing/selftests/bpf/test_xdp_egress.sh  | 160 ++++++++++++++++++
 3 files changed, 186 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_drop.c
 create mode 100755 tools/testing/selftests/bpf/test_xdp_egress.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 7729892e0b04..5dae18ebac13 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -50,6 +50,7 @@ TEST_PROGS := test_kmod.sh \
 	test_xdp_redirect.sh \
 	test_xdp_meta.sh \
 	test_xdp_veth.sh \
+	test_xdp_egress.sh \
 	test_offload.py \
 	test_sock_addr.sh \
 	test_tunnel.sh \
diff --git a/tools/testing/selftests/bpf/progs/xdp_drop.c b/tools/testing/selftests/bpf/progs/xdp_drop.c
new file mode 100644
index 000000000000..cffabc53a5e1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/xdp_drop.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <bpf/bpf_helpers.h>
+
+SEC("drop")
+int xdp_drop(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	struct ethhdr *eth = data;
+	void *nh;
+
+	nh = data + sizeof(*eth);
+	if (nh > data_end)
+		return XDP_DROP;
+
+	if (eth->h_proto == 0x0008)
+		return XDP_DROP;
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_xdp_egress.sh b/tools/testing/selftests/bpf/test_xdp_egress.sh
new file mode 100755
index 000000000000..7efa59fdf823
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_xdp_egress.sh
@@ -0,0 +1,160 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# XDP egress tests.
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+TESTNAME=xdp_egress
+BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts)
+
+ret=0
+
+################################################################################
+#
+log_test()
+{
+	local rc=$1
+	local expected=$2
+	local msg="$3"
+
+	if [ ${rc} -eq ${expected} ]; then
+		printf "TEST: %-60s  [ OK ]\n" "${msg}"
+	else
+		ret=1
+		printf "TEST: %-60s  [FAIL]\n" "${msg}"
+	fi
+}
+
+################################################################################
+# create namespaces and connect them
+
+create_ns()
+{
+	local ns=$1
+	local addr=$2
+	local addr6=$3
+
+	ip netns add ${ns}
+
+	ip -netns ${ns} link set lo up
+	ip -netns ${ns} addr add dev lo ${addr}
+	ip -netns ${ns} -6 addr add dev lo ${addr6}
+
+	ip -netns ${ns} ro add unreachable default metric 8192
+	ip -netns ${ns} -6 ro add unreachable default metric 8192
+
+	ip netns exec ${ns} sysctl -qw net.ipv4.ip_forward=1
+	ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.keep_addr_on_down=1
+	ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.forwarding=1
+	ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.forwarding=1
+	ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.accept_dad=0
+}
+
+connect_ns()
+{
+	local ns1=$1
+	local ns1_dev=$2
+	local ns1_addr=$3
+	local ns1_addr6=$4
+	local ns2=$5
+	local ns2_dev=$6
+	local ns2_addr=$7
+	local ns2_addr6=$8
+	local ns1arg
+	local ns2arg
+
+	if [ -n "${ns1}" ]; then
+		ns1arg="-netns ${ns1}"
+	fi
+	if [ -n "${ns2}" ]; then
+		ns2arg="-netns ${ns2}"
+	fi
+
+	ip ${ns1arg} li add ${ns1_dev} type veth peer name tmp
+	ip ${ns1arg} li set ${ns1_dev} up
+	ip ${ns1arg} li set tmp netns ${ns2} name ${ns2_dev}
+	ip ${ns2arg} li set ${ns2_dev} up
+
+	ip ${ns1arg} addr add dev ${ns1_dev} ${ns1_addr}
+	ip ${ns2arg} addr add dev ${ns2_dev} ${ns2_addr}
+
+	ip ${ns1arg} addr add dev ${ns1_dev} ${ns1_addr6} nodad
+	ip ${ns2arg} addr add dev ${ns2_dev} ${ns2_addr6} nodad
+}
+
+################################################################################
+#
+
+setup()
+{
+	create_ns host 172.16.101.1/32 2001:db8:101::1/128
+	connect_ns "" veth-host 172.16.1.1/24 2001:db8:1::1/64 host eth0 172.16.1.2/24 2001:db8:1::2/64
+	ip ro add 172.16.101.1 via 172.16.1.2
+	ip -6 ro add 2001:db8:101::1 via 2001:db8:1::2
+	ping -c1 -w1 172.16.101.1 >/dev/null 2>&1
+	ping -c1 -w1 2001:db8:101::1 >/dev/null 2>&1
+}
+
+cleanup()
+{
+	ip li del veth-host 2>/dev/null
+	ip netns del host 2>/dev/null
+	rm -f $BPF_FS/test_$TESTNAME
+}
+
+################################################################################
+# main
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if ! ip link set dev lo xdp off > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without the ip xdp support"
+	exit $ksft_skip
+fi
+
+if [ -z "$BPF_FS" ]; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpffs mounted"
+	exit $ksft_skip
+fi
+
+if ! bpftool version > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpftool"
+	exit $ksft_skip
+fi
+
+cleanup
+trap cleanup EXIT
+
+set -e
+setup
+set +e
+
+bpftool prog load xdp_drop.o $BPF_FS/test_$TESTNAME type xdp_egress || exit 1
+ID=$(bpftool prog show name xdp_drop | awk '$4 == "xdp_drop" {print $1}')
+
+# attach egress program
+bpftool net attach xdp_egress id ${ID/:/} dev veth-host
+ping -c1 -w1 172.16.101.1 >/dev/null 2>&1
+log_test $? 1 "IPv4 connectivity disabled by xdp_egress"
+ping -c1 -w1 2001:db8:101::1 >/dev/null 2>&1
+log_test $? 0 "IPv6 connectivity not disabled by egress drop program"
+
+# detach program should restore connectivity
+bpftool net detach xdp_egress dev veth-host
+ping -c1 -w1 172.16.101.1 >/dev/null 2>&1
+log_test $? 0 "IPv4 connectivity restored"
+
+# cleanup on delete
+ip netns exec host bpftool net attach xdp_egress id ${ID/:/} dev eth0
+bpftool net attach xdp_egress id ${ID/:/} dev veth-host
+ip li del veth-host
+rm -f $BPF_FS/test_$TESTNAME
+sleep 5  # rcu grace pass; verify program is dropped
+bpftool prog show name xdp_drop
+
+exit $ret
-- 
2.21.1 (Apple Git-122.3)
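
A note on the EtherType test in xdp_drop.c above: h_proto is stored in
network byte order, so comparing it with the literal 0x0008 matches IPv4
only on a little-endian host. In a BPF program the portable form would
compare against bpf_htons(ETH_P_IP) from <bpf/bpf_endian.h>. The same
idea as a minimal userspace sketch, where every sketch_/SKETCH_ name is
made up for illustration and is not part of the patch:

```c
#include <arpa/inet.h>   /* htons() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_P_IP 0x0800   /* IPv4 EtherType, host byte order */
#define SKETCH_ETH_HLEN 14       /* dst MAC + src MAC + EtherType */

/*
 * Hypothetical helper: returns 1 if the frame in [data, data_end)
 * carries the IPv4 EtherType, 0 if it carries another one, and -1 if
 * the buffer is too short to hold an Ethernet header.
 */
static int sketch_frame_is_ipv4(const uint8_t *data, const uint8_t *data_end)
{
	uint16_t proto;

	/* Same bounds check as the BPF program: header must fit. */
	if (data + SKETCH_ETH_HLEN > data_end)
		return -1;

	/* EtherType sits at offset 12 in network byte order; memcpy
	 * avoids an unaligned 16-bit load. */
	memcpy(&proto, data + 12, sizeof(proto));

	/* Convert the constant, not the packet field, so the comparison
	 * is correct on both little- and big-endian hosts. */
	return proto == htons(SKETCH_ETH_P_IP);
}
```

Unlike xdp_drop.c, the sketch reports a truncated frame separately
instead of dropping it, purely so the cases are easy to tell apart.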



* [PATCH v4 bpf-next 14/15] selftest: Add xdp_egress attach tests
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (12 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 13/15] selftest: Add test for xdp_egress David Ahern
@ 2020-04-27 22:46 ` David Ahern
  2020-04-27 22:46 ` [PATCH v4 bpf-next 15/15] samples/bpf: add XDP egress support to xdp1 David Ahern
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

Add xdp_egress attach tests:
1. verify egress programs cannot access ingress entries in xdp context
2. verify ability to load, attach, and detach xdp egress to a device.

Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 .../bpf/prog_tests/xdp_egress_attach.c        | 56 +++++++++++++++++++
 .../selftests/bpf/progs/test_xdp_egress.c     | 12 ++++
 .../bpf/progs/test_xdp_egress_fail.c          | 16 ++++++
 3 files changed, 84 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_egress_attach.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_egress.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_egress_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_egress_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_egress_attach.c
new file mode 100644
index 000000000000..5253754b27de
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_egress_attach.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/if_link.h>
+#include <test_progs.h>
+
+#define IFINDEX_LO 1
+
+void test_xdp_egress_attach(void)
+{
+	struct bpf_prog_load_attr attr = {
+		.prog_type = BPF_PROG_TYPE_XDP,
+		.expected_attach_type = BPF_XDP_EGRESS,
+	};
+	struct bpf_prog_info info = {};
+	__u32 id, len = sizeof(info);
+	struct bpf_object *obj;
+	__u32 duration = 0;
+	int err, fd = -1;
+
+	/* should fail - accesses rx queue info */
+	attr.file = "./test_xdp_egress_fail.o";
+	err = bpf_prog_load_xattr(&attr, &obj, &fd);
+	if (CHECK(err == 0 && fd >= 0, "xdp_egress with rx failed to load",
+		 "load of xdp_egress with rx succeeded instead of failed"))
+		return;
+
+	attr.file = "./test_xdp_egress.o";
+	err = bpf_prog_load_xattr(&attr, &obj, &fd);
+	if (CHECK_FAIL(err))
+		return;
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (CHECK_FAIL(err))
+		goto out_close;
+
+	err = bpf_set_link_xdp_fd(IFINDEX_LO, fd, XDP_FLAGS_EGRESS_MODE);
+	if (CHECK(err, "xdp attach", "xdp attach failed"))
+		goto out_close;
+
+	err = bpf_get_link_xdp_id(IFINDEX_LO, &id, XDP_FLAGS_EGRESS_MODE);
+	if (CHECK(err || id != info.id, "id_check",
+		  "loaded prog id %u != id %u, err %d", info.id, id, err))
+		goto out;
+
+out:
+	err = bpf_set_link_xdp_fd(IFINDEX_LO, -1, XDP_FLAGS_EGRESS_MODE);
+	if (CHECK(err, "xdp detach", "xdp detach failed"))
+		goto out_close;
+
+	err = bpf_get_link_xdp_id(IFINDEX_LO, &id, XDP_FLAGS_EGRESS_MODE);
+	if (CHECK(err || id, "id_check",
+		  "failed to detach program %u", id))
+		goto out_close;
+
+out_close:
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_egress.c b/tools/testing/selftests/bpf/progs/test_xdp_egress.c
new file mode 100644
index 000000000000..0477e8537b7f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_egress.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+SEC("xdp_egress")
+int xdp_egress_good(struct xdp_md *ctx)
+{
+	__u32 idx = ctx->egress_ifindex;
+
+	return idx == 1 ? XDP_DROP : XDP_PASS;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_egress_fail.c b/tools/testing/selftests/bpf/progs/test_xdp_egress_fail.c
new file mode 100644
index 000000000000..76b47b1d3bc3
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_egress_fail.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+SEC("xdp_egress")
+int xdp_egress_fail(struct xdp_md *ctx)
+{
+	__u32 rxq = ctx->rx_queue_index;
+	__u32 idx = ctx->ingress_ifindex;
+
+	if (idx == 1)
+		return XDP_DROP;
+
+	return rxq ? XDP_DROP : XDP_PASS;
+}
-- 
2.21.1 (Apple Git-122.3)
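
The rule that test 1 exercises (an egress program is rejected at load
time if it reads Rx-only context fields) can be pictured with a small
userspace model. This is only a sketch under stated assumptions: the
struct is a mock whose field names follow the test programs above, not
the UAPI definition, and the reverse restriction (ingress programs
cannot read egress_ifindex) is assumed for symmetry rather than taken
from this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the XDP context layout; not the UAPI struct xdp_md. */
struct sketch_xdp_md {
	uint32_t data;
	uint32_t data_end;
	uint32_t data_meta;
	uint32_t ingress_ifindex;  /* Rx only */
	uint32_t rx_queue_index;   /* Rx only */
	uint32_t egress_ifindex;   /* Tx only */
};

enum sketch_attach { SKETCH_XDP_INGRESS, SKETCH_XDP_EGRESS };

/*
 * Hypothetical load-time rule: may a program with this attach type read
 * the 4-byte context field at byte offset off?
 */
static bool sketch_ctx_read_ok(enum sketch_attach type, size_t off)
{
	/* Only aligned, in-bounds field reads are considered. */
	if (off % sizeof(uint32_t) || off >= sizeof(struct sketch_xdp_md))
		return false;

	/* Egress programs must not see Rx-only fields. */
	if (type == SKETCH_XDP_EGRESS)
		return off != offsetof(struct sketch_xdp_md, ingress_ifindex) &&
		       off != offsetof(struct sketch_xdp_md, rx_queue_index);

	/* Assumed for symmetry: ingress programs cannot read the Tx field. */
	return off != offsetof(struct sketch_xdp_md, egress_ifindex);
}
```

In these terms, test_xdp_egress_fail.c reads two offsets for which the
model returns false, while test_xdp_egress.c reads only egress_ifindex.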



* [PATCH v4 bpf-next 15/15] samples/bpf: add XDP egress support to xdp1
  2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
                   ` (13 preceding siblings ...)
  2020-04-27 22:46 ` [PATCH v4 bpf-next 14/15] selftest: Add xdp_egress attach tests David Ahern
@ 2020-04-27 22:46 ` David Ahern
  14 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-04-27 22:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

From: David Ahern <dahern@digitalocean.com>

xdp1 and xdp2 now accept the -E flag to attach the XDP program in the
egress path.

Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 samples/bpf/xdp1_user.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
index c447ad9e3a1d..bb104f4d8c5e 100644
--- a/samples/bpf/xdp1_user.c
+++ b/samples/bpf/xdp1_user.c
@@ -73,7 +73,8 @@ static void usage(const char *prog)
 		"OPTS:\n"
 		"    -S    use skb-mode\n"
 		"    -N    enforce native mode\n"
-		"    -F    force loading prog\n",
+		"    -F    force loading prog\n"
+		"    -E    egress path program\n",
 		prog);
 }
 
@@ -85,7 +86,7 @@ int main(int argc, char **argv)
 	};
 	struct bpf_prog_info info = {};
 	__u32 info_len = sizeof(info);
-	const char *optstr = "FSN";
+	const char *optstr = "FSNE";
 	int prog_fd, map_fd, opt;
 	struct bpf_object *obj;
 	struct bpf_map *map;
@@ -103,13 +104,17 @@ int main(int argc, char **argv)
 		case 'F':
 			xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
 			break;
+		case 'E':
+			xdp_flags |= XDP_FLAGS_EGRESS_MODE;
+			prog_load_attr.expected_attach_type = BPF_XDP_EGRESS;
+			break;
 		default:
 			usage(basename(argv[0]));
 			return 1;
 		}
 	}
 
-	if (!(xdp_flags & XDP_FLAGS_SKB_MODE))
+	if (!(xdp_flags & (XDP_FLAGS_SKB_MODE | XDP_FLAGS_EGRESS_MODE)))
 		xdp_flags |= XDP_FLAGS_DRV_MODE;
 
 	if (optind == argc) {
-- 
2.21.1 (Apple Git-122.3)



* Re: [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress
  2020-04-27 22:46 ` [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress David Ahern
@ 2020-04-28  8:08   ` Quentin Monnet
  0 siblings, 0 replies; 20+ messages in thread
From: Quentin Monnet @ 2020-04-28  8:08 UTC (permalink / raw)
  To: David Ahern, netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, daniel, john.fastabend, ast, kafai,
	songliubraving, yhs, andriin, dsahern, David Ahern

2020-04-27 16:46 UTC-0600 ~ David Ahern <dsahern@kernel.org>
> From: David Ahern <dahern@digitalocean.com>
> 
> Add xdp_egress as a program type since it requires a new attach
> type. This follows suit with other program type + attach type
> combinations and leverages the SEC name in libbpf.
> 
> Add NET_ATTACH_TYPE_XDP_EGRESS and update attach_type_strings to
> allow a user to specify 'xdp_egress' as the attach or detach point.
> 
> Update do_attach_detach_xdp to set XDP_FLAGS_EGRESS_MODE if egress
> is selected.
> 
> Update do_xdp_dump_one to show egress program ids.
> 
> Update the documentation and help output.
> 
> Signed-off-by: David Ahern <dahern@digitalocean.com>

Reviewed-by: Quentin Monnet <quentin@isovalent.com>

Thanks!


* Re: [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb
  2020-04-27 22:46 ` [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb David Ahern
@ 2020-04-28 15:05   ` Daniel Borkmann
  2020-04-30  0:17     ` David Ahern
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2020-04-28 15:05 UTC (permalink / raw)
  To: David Ahern, netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, john.fastabend, ast, kafai, songliubraving,
	yhs, andriin, dsahern, David Ahern

On 4/28/20 12:46 AM, David Ahern wrote:
> From: David Ahern <dahern@digitalocean.com>
> 
> Add support to run Tx path program on packets about to hit the
> ndo_start_xmit function for a device. Only XDP_DROP and XDP_PASS
> are supported now. Conceptually, XDP_REDIRECT for this path can
> work the same as it does for the Rx path, but that support is left
> for a follow on series.
> 
> Signed-off-by: David Ahern <dahern@digitalocean.com>
> ---
>   include/linux/netdevice.h | 11 +++++++++
>   net/core/dev.c            | 52 ++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 62 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 2b552c29e188..33a09396444f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3717,6 +3717,7 @@ static inline void dev_consume_skb_any(struct sk_buff *skb)
>   
>   void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
>   int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb);
> +u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb);
>   int netif_rx(struct sk_buff *skb);
>   int netif_rx_ni(struct sk_buff *skb);
>   int netif_receive_skb(struct sk_buff *skb);
> @@ -4577,6 +4578,16 @@ static inline netdev_tx_t __netdev_start_xmit(const struct net_device_ops *ops,
>   					      struct sk_buff *skb, struct net_device *dev,
>   					      bool more)
>   {
> +	if (static_branch_unlikely(&xdp_egress_needed_key)) {
> +		u32 act;
> +
> +		rcu_read_lock();
> +		act = do_xdp_egress_skb(dev, skb);
> +		rcu_read_unlock();
> +		if (act == XDP_DROP)
> +			return NET_XMIT_DROP;
> +	}
> +
>   	__this_cpu_write(softnet_data.xmit.more, more);
>   	return ops->ndo_start_xmit(skb, dev);

I didn't see anything in the patch series on this (unless I missed it), but
don't we need to force turning off GSO/TSO for the device where XDP egress is
attached? Otherwise how is this safe? E.g. generic XDP uses netif_elide_gro()
to bypass GRO once enabled. In this case on egress, if helpers like bpf_xdp_adjust_head()
or bpf_xdp_adjust_tail() adapt GSO skbs then drivers will operate on wrong GSO
info. Not sure if this goes into undefined behavior here?

Overall, for the regular stack, I expect the performance of XDP egress to be
worse than e.g. tc egress, for example, when TSO is disabled but not GSO then
you parse the same packet multiple times given post-GSO whereas with tc egress
it would operate just fine on a GSO skb. Plus all the limitations generic XDP
has with skb_cloned(skb), skb_is_nonlinear(skb), etc, where we need to linearize
so calling it 'XDP egress' might lead to false assumptions. Did you do a comparison
on that as well?

Also, I presume the XDP egress is intentionally not called when programs return
XDP_TX but only XDP_REDIRECT? Why such design decision?

>   }
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 14ce8e25e3d3..4d98189548c7 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4620,7 +4620,6 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>   }
>   
>   static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
> -DEFINE_STATIC_KEY_FALSE(xdp_egress_needed_key);
>   
>   int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
>   {
> @@ -4671,6 +4670,57 @@ int do_xdp_generic_rx(struct bpf_prog *xdp_prog, struct sk_buff *skb)
>   }
>   EXPORT_SYMBOL_GPL(do_xdp_generic_rx);
>   
> +DEFINE_STATIC_KEY_FALSE(xdp_egress_needed_key);
> +EXPORT_SYMBOL_GPL(xdp_egress_needed_key);
> +
> +static u32 handle_xdp_egress_act(u32 act, struct net_device *dev,
> +				 struct bpf_prog *xdp_prog)
> +{
> +	switch (act) {
> +	case XDP_DROP:
> +		/* fall through */
> +	case XDP_PASS:
> +		break;
> +	case XDP_TX:
> +		/* fall through */
> +	case XDP_REDIRECT:
> +		/* fall through */
> +	default:
> +		bpf_warn_invalid_xdp_action(act);
> +		/* fall through */
> +	case XDP_ABORTED:
> +		trace_xdp_exception(dev, xdp_prog, act);
> +		act = XDP_DROP;
> +		break;
> +	}
> +
> +	return act;
> +}
> +
> +u32 do_xdp_egress_skb(struct net_device *dev, struct sk_buff *skb)
> +{
> +	struct bpf_prog *xdp_prog;
> +	u32 act = XDP_PASS;
> +
> +	xdp_prog = rcu_dereference(dev->xdp_egress_prog);
> +	if (xdp_prog) {
> +		struct xdp_txq_info txq = { .dev = dev };
> +		struct xdp_buff xdp;
> +
> +		xdp.txq = &txq;
> +		act = do_xdp_generic_core(skb, &xdp, xdp_prog);
> +		act = handle_xdp_egress_act(act, dev, xdp_prog);
> +		if (act == XDP_DROP) {
> +			atomic_long_inc(&dev->tx_dropped);
> +			skb_tx_error(skb);
> +			kfree_skb(skb);
> +		}
> +	}
> +
> +	return act;
> +}
> +EXPORT_SYMBOL_GPL(do_xdp_egress_skb);
> +
>   static int netif_rx_internal(struct sk_buff *skb)
>   {
>   	int ret;
> 



* Re: [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb
  2020-04-28 15:05   ` Daniel Borkmann
@ 2020-04-30  0:17     ` David Ahern
  2020-05-12 20:11       ` David Ahern
  0 siblings, 1 reply; 20+ messages in thread
From: David Ahern @ 2020-04-30  0:17 UTC (permalink / raw)
  To: Daniel Borkmann, David Ahern, netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, john.fastabend, ast, kafai, songliubraving,
	yhs, andriin, David Ahern

On 4/28/20 9:05 AM, Daniel Borkmann wrote:
> I didn't see anything in the patch series on this (unless I missed it), but
> don't we need to force turning off GSO/TSO for the device where XDP
> egress is
> attached? Otherwise how is this safe? E.g. generic XDP uses
> netif_elide_gro()
> to bypass GRO once enabled. In this case on egress, if helpers like
> bpf_xdp_adjust_head()
> or bpf_xdp_adjust_tail() adapt GSO skbs then drivers will operate on
> wrong GSO
> info. Not sure if this goes into undefined behavior here?

yep, I need to disable gso / tso.

> 
> Overall, for the regular stack, I expect the performance of XDP egress
> to be
> worse than e.g. tc egress, for example, when TSO is disabled but not GSO
> then
> you parse the same packet multiple times given post-GSO whereas with tc
> egress
> it would operate just fine on a GSO skb. Plus all the limitations
> generic XDP
> has with skb_cloned(skb), skb_is_nonlinear(skb), etc, where we need to
> linearize
> so calling it 'XDP egress' might lead to false assumptions. Did you do a
> comparison
> on that as well?

Are you suggesting that the skb path and the xdp_frame path have different
attach options to make it clearer that attaching to the skb path requires
offloads to be disabled? Or do you think the name 'XDP egress' is wrong?

This option is a building block to let people choose how they want to
deploy a solution. If the overwhelming majority of traffic takes the XDP
redirect path and only a small set (e.g., broadcast, multicast, first
packet of a flow) takes the slow path, the overall solution is solid and
better performing than sending all of the traffic up the stack.

> 
> Also, I presume the XDP egress is intentionally not called when programs
> return
> XDP_TX but only XDP_REDIRECT? Why such design decision?

XDP_REDIRECT sends a frame through the networking stack to go from device
1 to device 2, which can be a completely different context (e.g., a
redirect from a host NIC to a VM tap device).

XDP_TX is within a driver. The driver just ran a program on the packet
and nothing about the context has changed. That said, I did label the
egress attach as "core" to mean it is run in core networking code.
Moving forward, if a feature comes up that warrants a change (e.g., Tx
queue selection), there are options, ranging from drivers adding support
to run the attached egress program to using the driver attach mode to
explicitly state that it is run in the driver and covers XDP_TX.
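
For context on the XDP_TX question: in the skb path of the patch quoted
above, handle_xdp_egress_act() honors only XDP_DROP and XDP_PASS and
turns every other return code into a drop. A minimal userspace sketch of
that mapping follows. The numeric values match enum xdp_action in the
BPF UAPI, while the sketch_ names are illustrative only and the warning
and tracepoint calls of the kernel version are left out.

```c
#include <stdint.h>

/* Same numeric values as enum xdp_action in the BPF UAPI. */
enum sketch_xdp_action {
	SKETCH_XDP_ABORTED = 0,
	SKETCH_XDP_DROP,
	SKETCH_XDP_PASS,
	SKETCH_XDP_TX,
	SKETCH_XDP_REDIRECT,
};

/*
 * Map the return code of an egress program to the action taken in the
 * skb Tx path: only DROP and PASS are honored as returned.
 */
static uint32_t sketch_egress_verdict(uint32_t act)
{
	switch (act) {
	case SKETCH_XDP_DROP:
	case SKETCH_XDP_PASS:
		return act;
	default:
		/* ABORTED, TX, REDIRECT and unknown codes all end in a
		 * drop; the kernel code additionally warns and/or fires
		 * the xdp_exception tracepoint here. */
		return SKETCH_XDP_DROP;
	}
}
```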


* Re: [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb
  2020-04-30  0:17     ` David Ahern
@ 2020-05-12 20:11       ` David Ahern
  0 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-05-12 20:11 UTC (permalink / raw)
  To: Daniel Borkmann, David Ahern, netdev
  Cc: davem, kuba, prashantbhole.linux, jasowang, brouer, toke,
	toshiaki.makita1, john.fastabend, ast, kafai, songliubraving,
	yhs, andriin, David Ahern

On 4/29/20 6:17 PM, David Ahern wrote:
>> Overall, for the regular stack, I expect the performance of XDP egress
>> to be
>> worse than e.g. tc egress, for example, when TSO is disabled but not GSO
>> then
>> you parse the same packet multiple times given post-GSO whereas with tc
>> egress
>> it would operate just fine on a GSO skb. Plus all the limitations
>> generic XDP
>> has with skb_cloned(skb), skb_is_nonlinear(skb), etc, where we need to
>> linearize
>> so calling it 'XDP egress' might lead to false assumptions. Did you do a
>> comparison
>> on that as well?
> 

After another round of staring at the code and running various tests, I
will concede the skb path for a few reasons:

1. all appropriate hooks for running an XDP egress program on skbs are
very close to the same point where the tc hook is,

2. the changes needed to handle xdp programs on skbs, combined with the
performance impacts those changes bring (e.g., cloned skbs, nonlinear
skbs, disabling GSO, etc.), and

3. xdp programs and cls-bpf programs can share data (i.e., maps), and the
programs can be similar enough that the overhead of 2 programs with
separate attach points is reasonable (e.g., I was able to adapt a
firewall so that it works for both paths and the only difference is
setting data and data_end based on context).

What that means is that an xdp_egress program would only apply to
xdp_frames redirected from another interface.
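
The structure described in point 3 (one policy, two thin entry points
that differ only in how data and data_end are derived) might look
roughly like the following. This is a userspace mock-up, not the
firewall mentioned above: the context structs and all sketch_ names are
hypothetical stand-ins for the XDP and cls-bpf program contexts, and the
rule (drop runt frames and IPv4) is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_verdict { SKETCH_PASS, SKETCH_DROP };

/*
 * Shared policy: sees only a [data, data_end) window, so it does not
 * care which hook it was called from.
 */
static enum sketch_verdict sketch_fw_core(const uint8_t *data,
					  const uint8_t *data_end)
{
	/* Need a full 14-byte Ethernet header. */
	if (data + 14 > data_end)
		return SKETCH_DROP;

	/* EtherType 0x0800 (IPv4), read bytewise in network order. */
	if (data[12] == 0x08 && data[13] == 0x00)
		return SKETCH_DROP;

	return SKETCH_PASS;
}

/* Mock contexts standing in for the two real program contexts. */
struct sketch_xdp_ctx { const uint8_t *data, *data_end; };
struct sketch_tc_ctx  { const uint8_t *pkt; size_t len; };

/* Entry point for the xdp_frame (redirect) path. */
static enum sketch_verdict sketch_xdp_entry(const struct sketch_xdp_ctx *ctx)
{
	return sketch_fw_core(ctx->data, ctx->data_end);
}

/* Entry point for the skb path, where the window comes from the skb. */
static enum sketch_verdict sketch_tc_entry(const struct sketch_tc_ctx *ctx)
{
	return sketch_fw_core(ctx->pkt, ctx->pkt + ctx->len);
}
```

Keeping the policy behind a pointer-pair interface is what lets both
attach points reuse it; state shared between the two programs would live
in maps, as noted in point 3.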



Thread overview: 20+ messages
2020-04-27 22:46 [PATCH v4 bpf-next 00/15] net: Add support for XDP in egress path David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 01/15] net: Refactor convert_to_xdp_frame David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 02/15] net: uapi for XDP programs in the egress path David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 03/15] net: Add XDP setup and query commands for Tx programs David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 04/15] net: Add BPF_XDP_EGRESS as a bpf_attach_type David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 05/15] xdp: Add xdp_txq_info to xdp_buff David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 06/15] net: Rename do_xdp_generic to do_xdp_generic_rx David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 07/15] net: rename netif_receive_generic_xdp to do_generic_xdp_core David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 08/15] net: set XDP egress program on netdevice David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 09/15] net: Support xdp in the Tx path for packets as an skb David Ahern
2020-04-28 15:05   ` Daniel Borkmann
2020-04-30  0:17     ` David Ahern
2020-05-12 20:11       ` David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 10/15] net: Support xdp in the Tx path for xdp_frames David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 11/15] libbpf: Add egress XDP support David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 12/15] bpftool: Add support for XDP egress David Ahern
2020-04-28  8:08   ` Quentin Monnet
2020-04-27 22:46 ` [PATCH v4 bpf-next 13/15] selftest: Add test for xdp_egress David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 14/15] selftest: Add xdp_egress attach tests David Ahern
2020-04-27 22:46 ` [PATCH v4 bpf-next 15/15] samples/bpf: add XDP egress support to xdp1 David Ahern