* [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept
@ 2017-12-13 11:19 Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 01/14] xdp: base API for " Jesper Dangaard Brouer
                   ` (13 more replies)
  0 siblings, 14 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, gospo, bjorn.topel,
	michael.chan

NOTICE: I send this as an RFC to give driver developers a heads-up,
        while I complete testing of the drivers (those I have HW for).
        I don't expect many changes before I post it officially.

There is a need for XDP to know more about the RX-queue a given XDP
frame has arrived on, for both the XDP bpf-prog and the kernel side.

Instead of extending struct xdp_buff each time new info is needed,
this patchset takes a different approach.  Struct xdp_buff is only
extended with a pointer to a struct xdp_rxq_info (which makes later
extensions easier).  This xdp_rxq_info contains information related
to how the driver has set up the individual RX-queues.  This is
read-mostly information, and all xdp_buff frames (in the driver's
napi_poll) point to the same xdp_rxq_info (per RX-queue).

We stress that this data/cache-line is for read-mostly info.  It is
NOT for dynamic per-packet info; use data_meta for such use-cases.

This patchset starts out small, and only exposes the ingress ifindex
and the RX queue index to the XDP/BPF program. Access to tangible info
like the ingress ifindex and RX queue index is fairly easy to
comprehend. A future use-case could allow XDP frames to be recycled
back to the originating device driver, by providing info on the RX
device and queue number.

As XDP doesn't have driver feature flags, and eBPF code (due to
bpf tail-calls) cannot determine which XDP driver invoked it, this
patchset has to update every driver that supports XDP.

For driver developers (review individual driver patches!):

The xdp_rxq_info is tied to the driver's RX-ring(s). Whenever an
RX-ring modification requires (temporarily) stopping RX frames, the
xdp_rxq_info should (likely) also be unregistered and re-registered,
especially if the pages in the ring are reallocated. Make sure ethtool
set_channels does the right thing. When replacing the XDP prog,
re-register the xdp_rxq_info if, and only if, the RX-ring needs to be
changed.
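
As a rough userspace sketch of the life-cycle rules above, here is a
mock re-implementation of the small register state machine (all names
suffixed _sketch are mine; the real API is introduced as net/core/xdp.c
in patch 01):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REG_STATE_NEW          0x0
#define REG_STATE_REGISTERED   0x1
#define REG_STATE_UNREGISTERED 0x2

struct xdp_rxq_info_sketch {
	uint32_t queue_index;
	uint32_t reg_state;
};

static void rxq_init_sketch(struct xdp_rxq_info_sketch *q)
{
	memset(q, 0, sizeof(*q));
	q->queue_index = UINT32_MAX;
	q->reg_state = REG_STATE_NEW;
}

static void rxq_reg_sketch(struct xdp_rxq_info_sketch *q)
{
	assert(q->reg_state == REG_STATE_NEW); /* API rule: init before reg */
	q->reg_state = REG_STATE_REGISTERED;
}

static void rxq_unreg_sketch(struct xdp_rxq_info_sketch *q)
{
	q->reg_state = REG_STATE_UNREGISTERED;
}

/* A ring reconfiguration (e.g. ethtool set_channels) must bracket the
 * ring rebuild with unreg + init + reg: */
static void reconfigure_ring_sketch(struct xdp_rxq_info_sketch *q,
				    uint32_t qidx)
{
	rxq_unreg_sketch(q);
	/* ... stop RX, purge and reallocate ring pages here ... */
	rxq_init_sketch(q);
	q->queue_index = qidx;
	rxq_reg_sketch(q);
}
```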

I'm Cc'ing the individual driver patches to the registered maintainers.

Testing:

I've only tested the NIC drivers I have hardware for.  The general
test procedure (DUT = Device Under Test) is to:
 (1) run pktgen script pktgen_sample04_many_flows.sh (against DUT),
 (2) run samples/bpf program xdp_rxq_info --dev $DEV (on DUT)
 (3) runtime modify NIC queue via ethtool -L (on DUT)

RFC notes/questions:

(A) Didn't update driver netdevsim, as it doesn't reference xdp_buff.

(B) Must I write a test for tools/testing/selftests/bpf/ ?

(C) The bnxt_en driver should re-register when changing ring layout

(D) Should xdp_rxq_info be SLAB allocated instead of embedding into drv structs?

(E) Should the struct be named xdp_info instead for future (e.g. TX) use-cases?

Patch based on git tree bpf-next (at commit 553a8f2f42):
 https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/

---

Jesper Dangaard Brouer (14):
      xdp: base API for new XDP rx-queue info concept
      xdp/mlx5: setup xdp_rxq_info and extend with qtype
      i40e: setup xdp_rxq_info
      ixgbe: setup xdp_rxq_info
      xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg
      mlx4: setup xdp_rxq_info
      bnxt_en: setup xdp_rxq_info
      nfp: setup xdp_rxq_info
      thunderx: setup xdp_rxq_info
      tun: setup xdp_rxq_info
      virtio_net: setup xdp_rxq_info
      xdp: generic XDP handling of xdp_rxq_info
      bpf: finally expose xdp_rxq_info to XDP bpf-programs
      samples/bpf: program demonstrating access to xdp_rxq_info


 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   10 
 drivers/net/ethernet/broadcom/bnxt/bnxt.h          |    2 
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c      |    1 
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   |   11 
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |    7 
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |    2 
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        |   18 +
 drivers/net/ethernet/intel/i40e/i40e_txrx.h        |    3 
 drivers/net/ethernet/intel/ixgbe/ixgbe.h           |    2 
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      |   11 
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c     |    3 
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |   13 
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h       |    4 
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |    4 
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   14 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |    1 
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |    5 
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |   10 
 drivers/net/ethernet/qlogic/qede/qede.h            |    2 
 drivers/net/ethernet/qlogic/qede/qede_fp.c         |    1 
 drivers/net/ethernet/qlogic/qede/qede_main.c       |   12 
 drivers/net/tun.c                                  |   25 +
 drivers/net/virtio_net.c                           |   12 
 include/linux/filter.h                             |    2 
 include/linux/netdevice.h                          |    2 
 include/net/xdp.h                                  |   69 +++
 include/uapi/linux/bpf.h                           |    3 
 net/core/Makefile                                  |    2 
 net/core/dev.c                                     |   60 ++
 net/core/filter.c                                  |   19 +
 net/core/xdp.c                                     |   57 ++
 samples/bpf/Makefile                               |    4 
 samples/bpf/xdp_rxq_info_kern.c                    |   96 ++++
 samples/bpf/xdp_rxq_info_user.c                    |  533 ++++++++++++++++++++
 34 files changed, 996 insertions(+), 24 deletions(-)
 create mode 100644 include/net/xdp.h
 create mode 100644 net/core/xdp.c
 create mode 100644 samples/bpf/xdp_rxq_info_kern.c
 create mode 100644 samples/bpf/xdp_rxq_info_user.c

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
@ 2017-12-13 11:19 ` Jesper Dangaard Brouer
  2017-12-14  2:34   ` David Ahern
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, gospo, bjorn.topel,
	michael.chan

This patch only introduces the core data structures and API functions.
All XDP-enabled drivers must use the API before this info can be used.

There is a need for XDP to know more about the RX-queue a given XDP
frame has arrived on, for both the XDP bpf-prog and the kernel side.

Instead of extending xdp_buff each time new info is needed, the patch
creates a separate read-mostly struct xdp_rxq_info that contains this
info.  We stress that this data/cache-line is for read-mostly info.  It
is NOT for dynamic per-packet info; use data_meta for such use-cases.

The performance advantage is that this info can be set up at RX-ring
init time, instead of updating N members in xdp_buff (per packet).  A
possible (driver-level) micro-optimization is that the xdp_buff->rxq
assignment can be done once per XDP/NAPI loop.  The extra pointer
deref only happens for programs needing access to this info (thus, no
slowdown for existing use-cases).
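
A userspace C sketch (mock types, not driver code) of that per-NAPI-loop
micro-optimization: the rxq pointer is assigned once before the poll
loop, so per packet only the data pointers change.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for struct xdp_rxq_info and struct xdp_buff. */
struct rxq_info_sketch { unsigned int queue_index; };
struct xdp_buff_sketch {
	void *data;
	void *data_end;
	struct rxq_info_sketch *rxq;
};

static unsigned int napi_poll_sketch(struct rxq_info_sketch *ring_rxq,
				     unsigned int budget)
{
	struct xdp_buff_sketch xdp;
	unsigned int processed = 0;

	xdp.rxq = ring_rxq;	/* once per NAPI poll, not per packet */

	while (processed < budget) {
		/* per packet: only the data pointers are updated */
		xdp.data = NULL;	/* would point at the frame */
		xdp.data_end = NULL;
		processed++;
	}
	return processed;
}
```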

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/filter.h |    2 ++
 include/net/xdp.h      |   45 +++++++++++++++++++++++++++++++++++++++++++++
 net/core/Makefile      |    2 +-
 net/core/xdp.c         |   40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 include/net/xdp.h
 create mode 100644 net/core/xdp.c

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 5feb441d3dd9..111107fcace6 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -19,6 +19,7 @@
 #include <linux/cryptohash.h>
 #include <linux/set_memory.h>
 
+#include <net/xdp.h>
 #include <net/sch_generic.h>
 
 #include <uapi/linux/filter.h>
@@ -496,6 +497,7 @@ struct xdp_buff {
 	void *data_end;
 	void *data_meta;
 	void *data_hard_start;
+	struct xdp_rxq_info *rxq;
 };
 
 /* Compute the linear packet data range [data, data_end) which
diff --git a/include/net/xdp.h b/include/net/xdp.h
new file mode 100644
index 000000000000..e4acd198fd60
--- /dev/null
+++ b/include/net/xdp.h
@@ -0,0 +1,45 @@
+/* include/net/xdp.h
+ *
+ * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
+ * Released under terms in GPL version 2.  See COPYING.
+ */
+#ifndef __LINUX_NET_XDP_H__
+#define __LINUX_NET_XDP_H__
+
+/**
+ * DOC: XDP RX-queue information
+ *
+ * The XDP RX-queue info (xdp_rxq_info) is associated with the driver
+ * level RX-ring queues.  It is information that is specific to how
+ * the driver has configured a given RX-ring queue.
+ *
+ * Each xdp_buff frame received in the driver carries a (pointer)
+ * reference to this xdp_rxq_info structure.  This provides the XDP
+ * data-path read-access to RX-info for both kernel and bpf-side
+ * (limited subset).
+ *
+ * For now, direct access is only safe while running in NAPI/softirq
+ * context.
+ *
+ * The driver usage API is an init, register and unregister API.
+ *
+ * The struct is not directly tied to the XDP prog.  A new XDP prog
+ * can be attached as long as it doesn't change the underlying
+ * RX-ring.  If the RX-ring does change significantly, the NIC driver
+ * naturally needs to stop the RX-ring before purging and reallocating
+ * memory.  In that process the driver MUST call unregister (which
+ * also applies to driver shutdown and unload).  The init and register
+ * API is also mandatory during RX-ring setup.
+ */
+
+struct xdp_rxq_info {
+	struct net_device *dev;
+	u32 queue_index;
+	u32 reg_state;
+} ____cacheline_aligned; /* perf critical, avoid false-sharing */
+
+void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
+void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
+void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
+
+#endif /* __LINUX_NET_XDP_H__ */
diff --git a/net/core/Makefile b/net/core/Makefile
index 1fd0a9c88b1b..6dbbba8c57ae 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -11,7 +11,7 @@ obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
 obj-y		     += dev.o ethtool.o dev_addr_lists.o dst.o netevent.o \
 			neighbour.o rtnetlink.o utils.o link_watch.o filter.o \
 			sock_diag.o dev_ioctl.o tso.o sock_reuseport.o \
-			fib_notifier.o
+			fib_notifier.o xdp.o
 
 obj-y += net-sysfs.o
 obj-$(CONFIG_PROC_FS) += net-procfs.o
diff --git a/net/core/xdp.c b/net/core/xdp.c
new file mode 100644
index 000000000000..a9d2dd7b1ede
--- /dev/null
+++ b/net/core/xdp.c
@@ -0,0 +1,40 @@
+/* net/core/xdp.c
+ *
+ * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
+ * Released under terms in GPL version 2.  See COPYING.
+ */
+#include <linux/types.h>
+#include <linux/mm.h>
+
+#include <net/xdp.h>
+
+#define REG_STATE_NEW		0x0
+#define REG_STATE_REGISTRED	0x1
+#define REG_STATE_UNREGISTRED	0x2
+
+void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
+{
+	xdp_rxq->reg_state = REG_STATE_UNREGISTRED;
+}
+EXPORT_SYMBOL_GPL(xdp_rxq_info_unreg);
+
+void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
+{
+	if (xdp_rxq->reg_state == REG_STATE_REGISTRED) {
+		WARN(1, "Missing unregister, handled but fix driver\n");
+		xdp_rxq_info_unreg(xdp_rxq);
+	}
+	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
+	xdp_rxq->queue_index = U32_MAX;
+	xdp_rxq->reg_state = REG_STATE_NEW;
+}
+EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
+
+void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
+{
+	WARN(!xdp_rxq->dev, "Missing net_device from driver");
+	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
+	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
+	xdp_rxq->reg_state = REG_STATE_REGISTRED;
+}
+EXPORT_SYMBOL_GPL(xdp_rxq_info_reg);


* [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 01/14] xdp: base API for " Jesper Dangaard Brouer
@ 2017-12-13 11:19 ` Jesper Dangaard Brouer
  2017-12-13 12:27   ` Tariq Toukan
  2017-12-13 11:19   ` [Intel-wired-lan] " Jesper Dangaard Brouer
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, Matan Barak, Saeed Mahameed,
	Jesper Dangaard Brouer, gospo, bjorn.topel, michael.chan,
	Tariq Toukan

The mlx5 driver has a special drop-RQ queue (one per interface) that
simply drops all incoming traffic. It helps the driver keep other HW
objects (flow steering) alive upon down/up operations.  Flow steering
objects temporarily point to it during interface setup, and when the
interface is down. It lacks many fields that are set in a regular RQ
(for example, its state is never switched to MLX5_RQC_STATE_RDY).
(Thanks to Tariq Toukan for the explanation.)

The XDP RX-queue info API is extended with a queue type, and mlx5 uses
the drop/sink type (RXQ_TYPE_SINK) for this sink queue.

Driver hook points for xdp_rxq_info:
 * init+reg: mlx5e_alloc_rq()
 * init+reg: mlx5e_alloc_drop_rq()
 * unreg   : mlx5e_free_rq()
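
A userspace C sketch of the sink-type special case in the register step
(mock types and names of my own; the corresponding kernel change is the
net/core/xdp.c hunk in this patch): a RXQ_TYPE_SINK queue may be
registered without a net_device or queue index.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RXQ_TYPE_DEFAULT	0
#define SKETCH_RXQ_TYPE_SINK	1

struct rxq_sketch {
	void	*dev;		/* stands in for struct net_device * */
	uint32_t queue_index;
	uint32_t qtype;
	int	 registered;
};

static void sink_reg_sketch(struct rxq_sketch *q)
{
	/* Sink queues never reach XDP RX code, so skip content checks */
	if (q->qtype != SKETCH_RXQ_TYPE_SINK)
		assert(q->dev && q->queue_index != UINT32_MAX);
	q->registered = 1;
}
```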

Tested on actual hardware with the samples/bpf program.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Matan Barak <matanb@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    4 ++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   14 +++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    1 +
 include/net/xdp.h                                 |   23 +++++++++++++++++++++
 net/core/xdp.c                                    |    6 +++++
 5 files changed, 48 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index c0872b3284cb..fe10a042783b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -46,6 +46,7 @@
 #include <linux/mlx5/transobj.h>
 #include <linux/rhashtable.h>
 #include <net/switchdev.h>
+#include <net/xdp.h>
 #include "wq.h"
 #include "mlx5_core.h"
 #include "en_stats.h"
@@ -568,6 +569,9 @@ struct mlx5e_rq {
 	u32                    rqn;
 	struct mlx5_core_dev  *mdev;
 	struct mlx5_core_mkey  umr_mkey;
+
+	/* XDP read-mostly */
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_aligned_in_smp;
 
 struct mlx5e_channel {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0f5c012de52e..ea44b5f25e11 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	rq->ix      = c->ix;
 	rq->mdev    = mdev;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rq->xdp_rxq);
+	rq->xdp_rxq.dev		= rq->netdev;
+	rq->xdp_rxq.queue_index = rq->ix;
+	xdp_rxq_info_reg(&rq->xdp_rxq);
+
 	rq->xdp_prog = params->xdp_prog ? bpf_prog_inc(params->xdp_prog) : NULL;
 	if (IS_ERR(rq->xdp_prog)) {
 		err = PTR_ERR(rq->xdp_prog);
@@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 err_rq_wq_destroy:
 	if (rq->xdp_prog)
 		bpf_prog_put(rq->xdp_prog);
+	xdp_rxq_info_unreg(&rq->xdp_rxq);
 	mlx5_wq_destroy(&rq->wq_ctrl);
 
 	return err;
@@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
 	if (rq->xdp_prog)
 		bpf_prog_put(rq->xdp_prog);
 
+	xdp_rxq_info_unreg(&rq->xdp_rxq);
+
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		mlx5e_rq_free_mpwqe_info(rq);
@@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct mlx5_core_dev *mdev,
 	if (err)
 		return err;
 
+	/* XDP RX-queue info for "Drop-RQ", packets never reach XDP */
+	xdp_rxq_info_init(&rq->xdp_rxq);
+	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
+	xdp_rxq_info_reg(&rq->xdp_rxq);
+
 	rq->mdev = mdev;
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 5b499c7a698f..7b38480811d4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -812,6 +812,7 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + *len;
 	xdp.data_hard_start = va;
+	xdp.rxq = &rq->xdp_rxq;
 
 	act = bpf_prog_run_xdp(prog, &xdp);
 	switch (act) {
diff --git a/include/net/xdp.h b/include/net/xdp.h
index e4acd198fd60..5be560d943e1 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -36,10 +36,33 @@ struct xdp_rxq_info {
 	struct net_device *dev;
 	u32 queue_index;
 	u32 reg_state;
+	u32 qtype;
 } ____cacheline_aligned; /* perf critical, avoid false-sharing */
 
 void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
 void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
 void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
 
+/**
+ * DOC: XDP RX-queue type
+ *
+ * The XDP RX-queue info can have a type associated with it.
+ *
+ * @RXQ_TYPE_DEFAULT: the default; no specific queue type is specified
+ *
+ * @RXQ_TYPE_SINK: indicates a fake queue that never reaches the XDP
+ *	RX code.  Some drivers need to maintain a lower-layer RX-queue
+ *	as a sink queue, while reconfiguring other RX-queues.
+ */
+#define RXQ_TYPE_DEFAULT	0
+#define RXQ_TYPE_SINK		1
+#define RXQ_TYPE_MAX		RXQ_TYPE_SINK
+
+static inline
+void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
+{
+	BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
+	xdp_rxq->qtype = qtype;
+}
+
 #endif /* __LINUX_NET_XDP_H__ */
diff --git a/net/core/xdp.c b/net/core/xdp.c
index a9d2dd7b1ede..2a111f5987f6 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
 
 void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
 {
+	if (xdp_rxq->qtype == RXQ_TYPE_SINK)
+		goto skip_content_check;
+
+	/* Check information setup by driver code */
 	WARN(!xdp_rxq->dev, "Missing net_device from driver");
 	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
+
+skip_content_check:
 	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
 	xdp_rxq->reg_state = REG_STATE_REGISTRED;
 }


* [bpf-next V1-RFC PATCH 03/14] i40e: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
@ 2017-12-13 11:19   ` Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, intel-wired-lan, Jeff Kirsher,
	dsahern, gospo, bjorn.topel, michael.chan

The i40e driver has a special "FDIR" RX-ring (I40E_VSI_FDIR) which is
a sideband channel for configuring/updating the flow director tables.
This (i40e_vsi_)type does not invoke XDP/eBPF code. Thus, mark this
type with the XDP RX-queue type RXQ_TYPE_SINK.

Driver hook points for xdp_rxq_info:
 * init+reg: i40e_setup_rx_descriptors (via i40e_vsi_setup_rx_resources)
 * unreg   : i40e_free_rx_resources    (via i40e_vsi_free_rx_resources)

Tested on actual hardware with samples/bpf program.

Cc: intel-wired-lan@lists.osuosl.org
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |   18 +++++++++++++++++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |    3 +++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 4566d66ffc7c..d1bfc4232ca9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -27,6 +27,7 @@
 #include <linux/prefetch.h>
 #include <net/busy_poll.h>
 #include <linux/bpf_trace.h>
+#include <net/xdp.h>
 #include "i40e.h"
 #include "i40e_trace.h"
 #include "i40e_prototype.h"
@@ -1236,6 +1237,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 void i40e_free_rx_resources(struct i40e_ring *rx_ring)
 {
 	i40e_clean_rx_ring(rx_ring);
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 	rx_ring->xdp_prog = NULL;
 	kfree(rx_ring->rx_bi);
 	rx_ring->rx_bi = NULL;
@@ -1283,6 +1285,18 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rx_ring->xdp_rxq);
+
+	/* Flow director side channel does not invoke XDP/bpf */
+	if (rx_ring->vsi->type == I40E_VSI_FDIR)
+		xdp_rxq_info_type(&rx_ring->xdp_rxq, RXQ_TYPE_SINK);
+	else
+		rx_ring->xdp_rxq.dev = rx_ring->netdev;
+
+	rx_ring->xdp_rxq.queue_index = rx_ring->queue_index;
+	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
+
 	rx_ring->xdp_prog = rx_ring->vsi->xdp_prog;
 
 	return 0;
@@ -2068,11 +2082,13 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 	struct sk_buff *skb = rx_ring->skb;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
 	bool failure = false, xdp_xmit = false;
+	struct xdp_buff xdp;
+
+	xdp.rxq = &rx_ring->xdp_rxq;
 
 	while (likely(total_rx_packets < (unsigned int)budget)) {
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
-		struct xdp_buff xdp;
 		unsigned int size;
 		u16 vlan_tag;
 		u8 rx_ptype;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index fbae1182e2ea..2d08760fc4ce 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -27,6 +27,8 @@
 #ifndef _I40E_TXRX_H_
 #define _I40E_TXRX_H_
 
+#include <net/xdp.h>
+
 /* Interrupt Throttling and Rate Limiting Goodies */
 
 #define I40E_MAX_ITR               0x0FF0  /* reg uses 2 usec resolution */
@@ -428,6 +430,7 @@ struct i40e_ring {
 					 */
 
 	struct i40e_channel *ch;
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_internodealigned_in_smp;
 
 static inline bool ring_uses_build_skb(struct i40e_ring *ring)



* [bpf-next V1-RFC PATCH 04/14] ixgbe: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
@ 2017-12-13 11:19   ` Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, Alexander Duyck, intel-wired-lan, Jeff Kirsher,
	Jesper Dangaard Brouer, gospo, bjorn.topel, michael.chan

Driver hook points for xdp_rxq_info:
 * init+reg: ixgbe_setup_rx_resources()
 * unreg   : ixgbe_free_rx_resources()

Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   11 ++++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 468c3555a629..8611763d6129 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -53,6 +53,7 @@
 #include <linux/dca.h>
 #endif
 
+#include <net/xdp.h>
 #include <net/busy_poll.h>
 
 /* common prefix used by pr_<> macros */
@@ -371,6 +372,7 @@ struct ixgbe_ring {
 		struct ixgbe_tx_queue_stats tx_stats;
 		struct ixgbe_rx_queue_stats rx_stats;
 	};
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_internodealigned_in_smp;
 
 enum ixgbe_ring_f_enum {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 7737a05c717c..2a8196427776 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2318,12 +2318,14 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 #endif /* IXGBE_FCOE */
 	u16 cleaned_count = ixgbe_desc_unused(rx_ring);
 	bool xdp_xmit = false;
+	struct xdp_buff xdp;
+
+	xdp.rxq = &rx_ring->xdp_rxq;
 
 	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
 		struct ixgbe_rx_buffer *rx_buffer;
 		struct sk_buff *skb;
-		struct xdp_buff xdp;
 		unsigned int size;
 
 		/* return some buffers to hardware, one at a time is too slow */
@@ -6444,6 +6446,12 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rx_ring->xdp_rxq);
+	rx_ring->xdp_rxq.dev         = adapter->netdev;
+	rx_ring->xdp_rxq.queue_index = rx_ring->queue_index;
+	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
+
 	rx_ring->xdp_prog = adapter->xdp_prog;
 
 	return 0;
@@ -6541,6 +6549,7 @@ void ixgbe_free_rx_resources(struct ixgbe_ring *rx_ring)
 	ixgbe_clean_rx_ring(rx_ring);
 
 	rx_ring->xdp_prog = NULL;
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 	vfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
 


* [Intel-wired-lan] [bpf-next V1-RFC PATCH 04/14] ixgbe: setup xdp_rxq_info
@ 2017-12-13 11:19   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: intel-wired-lan

Driver hook points for xdp_rxq_info:
 * init+reg: ixgbe_setup_rx_resources()
 * unreg   : ixgbe_free_rx_resources()

Cc: intel-wired-lan at lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   11 ++++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 468c3555a629..8611763d6129 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -53,6 +53,7 @@
 #include <linux/dca.h>
 #endif
 
+#include <net/xdp.h>
 #include <net/busy_poll.h>
 
 /* common prefix used by pr_<> macros */
@@ -371,6 +372,7 @@ struct ixgbe_ring {
 		struct ixgbe_tx_queue_stats tx_stats;
 		struct ixgbe_rx_queue_stats rx_stats;
 	};
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_internodealigned_in_smp;
 
 enum ixgbe_ring_f_enum {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 7737a05c717c..2a8196427776 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2318,12 +2318,14 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 #endif /* IXGBE_FCOE */
 	u16 cleaned_count = ixgbe_desc_unused(rx_ring);
 	bool xdp_xmit = false;
+	struct xdp_buff xdp;
+
+	xdp.rxq = &rx_ring->xdp_rxq;
 
 	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
 		struct ixgbe_rx_buffer *rx_buffer;
 		struct sk_buff *skb;
-		struct xdp_buff xdp;
 		unsigned int size;
 
 		/* return some buffers to hardware, one at a time is too slow */
@@ -6444,6 +6446,12 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rx_ring->xdp_rxq);
+	rx_ring->xdp_rxq.dev         = adapter->netdev;
+	rx_ring->xdp_rxq.queue_index = rx_ring->queue_index;
+	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
+
 	rx_ring->xdp_prog = adapter->xdp_prog;
 
 	return 0;
@@ -6541,6 +6549,7 @@ void ixgbe_free_rx_resources(struct ixgbe_ring *rx_ring)
 	ixgbe_clean_rx_ring(rx_ring);
 
 	rx_ring->xdp_prog = NULL;
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 	vfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
 


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 05/14] xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2017-12-13 11:19   ` [Intel-wired-lan] " Jesper Dangaard Brouer
@ 2017-12-13 11:19 ` Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info Jesper Dangaard Brouer
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: bjorn.topel, netdev, Jesper Dangaard Brouer, Ariel Elior,
	dsahern, gospo, everest-linux-l2, michael.chan

The driver code qede_free_fp_array() depends on kfree() accepting a
NULL pointer. This stems from the qede_alloc_fp_array() function,
which (kz)allocs memory for either fp->txq or fp->rxq.
This also simplifies error handling code in case of memory allocation
failures, but xdp_rxq_info_unreg needs to know the difference.

Introduce xdp_rxq_info_is_reg() to handle the case where a memory
allocation fails: the failure path is detected by observing that the
xdp_rxq_info was not yet registered, which only happens after a
successful allocation in qede_init_fp().

Driver hook points for xdp_rxq_info:
 * init+reg: qede_init_fp
 * unreg   : qede_free_fp_array

Tested on actual hardware with samples/bpf program.

Cc: everest-linux-l2@cavium.com
Cc: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/qlogic/qede/qede.h      |    2 ++
 drivers/net/ethernet/qlogic/qede/qede_fp.c   |    1 +
 drivers/net/ethernet/qlogic/qede/qede_main.c |   12 ++++++++++++
 include/net/xdp.h                            |    1 +
 net/core/xdp.c                               |   11 +++++++++++
 5 files changed, 27 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index a3a70ade411f..62f47736511b 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -40,6 +40,7 @@
 #include <linux/kernel.h>
 #include <linux/mutex.h>
 #include <linux/bpf.h>
+#include <net/xdp.h>
 #include <linux/qed/qede_rdma.h>
 #include <linux/io.h>
 #ifdef CONFIG_RFS_ACCEL
@@ -345,6 +346,7 @@ struct qede_rx_queue {
 	u64 xdp_no_pass;
 
 	void *handle;
+	struct xdp_rxq_info xdp_rxq;
 };
 
 union db_prod {
diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index 48ec4c56cddf..dafc079ab6b9 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -1006,6 +1006,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 	xdp.data = xdp.data_hard_start + *data_offset;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + *len;
+	xdp.rxq = &rxq->xdp_rxq;
 
 	/* Queues always have a full reset currently, so for the time
 	 * being until there's atomic program replace just mark read
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 57332b3e5e64..a7264fdf4112 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -762,6 +762,12 @@ static void qede_free_fp_array(struct qede_dev *edev)
 			fp = &edev->fp_array[i];
 
 			kfree(fp->sb_info);
+			/* Handle mem alloc failure case where qede_init_fp
+			 * didn't register xdp_rxq_info yet.
+			 * Implicit only (fp->type & QEDE_FASTPATH_RX)
+			 */
+			if (fp->rxq && xdp_rxq_info_is_reg(&fp->rxq->xdp_rxq))
+				xdp_rxq_info_unreg(&fp->rxq->xdp_rxq);
 			kfree(fp->rxq);
 			kfree(fp->xdp_tx);
 			kfree(fp->txq);
@@ -1498,6 +1504,12 @@ static void qede_init_fp(struct qede_dev *edev)
 			else
 				fp->rxq->data_direction = DMA_FROM_DEVICE;
 			fp->rxq->dev = &edev->pdev->dev;
+
+			/* XDP RX-queue info */
+			xdp_rxq_info_init(&fp->rxq->xdp_rxq);
+			fp->rxq->xdp_rxq.dev = edev->ndev;
+			fp->rxq->xdp_rxq.queue_index = fp->rxq->rxq_id;
+			xdp_rxq_info_reg(&fp->rxq->xdp_rxq);
 		}
 
 		if (fp->type & QEDE_FASTPATH_TX) {
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5be560d943e1..106ec8ed4db5 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -42,6 +42,7 @@ struct xdp_rxq_info {
 void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
 void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
 void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
+bool xdp_rxq_info_is_reg(struct xdp_rxq_info *xdp_rxq);
 
 /**
  * DOC: XDP RX-queue type
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 2a111f5987f6..f0203b8fc04c 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -44,3 +44,14 @@ void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
 	xdp_rxq->reg_state = REG_STATE_REGISTRED;
 }
 EXPORT_SYMBOL_GPL(xdp_rxq_info_reg);
+
+bool xdp_rxq_info_is_reg(struct xdp_rxq_info *xdp_rxq)
+{
+	if (!xdp_rxq) {
+		WARN(1, "API violation, xdp_rxq is NULL");
+		return false;
+	}
+
+	return (xdp_rxq->reg_state == REG_STATE_REGISTRED);
+}
+EXPORT_SYMBOL_GPL(xdp_rxq_info_is_reg);

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (4 preceding siblings ...)
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 05/14] xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg Jesper Dangaard Brouer
@ 2017-12-13 11:19 ` Jesper Dangaard Brouer
  2017-12-13 12:42   ` Tariq Toukan
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 07/14] bnxt_en: " Jesper Dangaard Brouer
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, Jesper Dangaard Brouer, gospo, bjorn.topel,
	michael.chan, Tariq Toukan

Driver hook points for xdp_rxq_info:
 * init+reg: mlx4_en_create_rx_ring
 * unreg   : mlx4_en_destroy_rx_ring

Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |    3 ++-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     |   13 +++++++++++--
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    4 +++-
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 99051a294fa6..0cfcf3089ae4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2172,8 +2172,9 @@ static int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
 
 		if (mlx4_en_create_rx_ring(priv, &priv->rx_ring[i],
 					   prof->rx_ring_size, priv->stride,
-					   node))
+					   node, i))
 			goto err;
+
 	}
 
 #ifdef CONFIG_RFS_ACCEL
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 85e28efcda33..2091c9734e6a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -262,7 +262,7 @@ void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev)
 
 int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
 			   struct mlx4_en_rx_ring **pring,
-			   u32 size, u16 stride, int node)
+			   u32 size, u16 stride, int node, int queue_index)
 {
 	struct mlx4_en_dev *mdev = priv->mdev;
 	struct mlx4_en_rx_ring *ring;
@@ -286,6 +286,12 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
 	ring->log_stride = ffs(ring->stride) - 1;
 	ring->buf_size = ring->size * ring->stride + TXBB_SIZE;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&ring->xdp_rxq);
+	ring->xdp_rxq.dev = priv->dev;
+	ring->xdp_rxq.queue_index = queue_index;
+	xdp_rxq_info_reg(&ring->xdp_rxq);
+
 	tmp = size * roundup_pow_of_two(MLX4_EN_MAX_RX_FRAGS *
 					sizeof(struct mlx4_en_rx_alloc));
 	ring->rx_info = vzalloc_node(tmp, node);
@@ -318,6 +324,7 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
 	vfree(ring->rx_info);
 	ring->rx_info = NULL;
 err_ring:
+	xdp_rxq_info_unreg(&ring->xdp_rxq);
 	kfree(ring);
 	*pring = NULL;
 
@@ -440,6 +447,7 @@ void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
 					lockdep_is_held(&mdev->state_lock));
 	if (old_prog)
 		bpf_prog_put(old_prog);
+	xdp_rxq_info_unreg(&ring->xdp_rxq);
 	mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
 	vfree(ring->rx_info);
 	ring->rx_info = NULL;
@@ -650,6 +658,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	int cq_ring = cq->ring;
 	bool doorbell_pending;
 	struct mlx4_cqe *cqe;
+	struct xdp_buff xdp;
 	int polled = 0;
 	int index;
 
@@ -664,6 +673,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	/* Protect accesses to: ring->xdp_prog, priv->mac_hash list */
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(ring->xdp_prog);
+	xdp.rxq = &ring->xdp_rxq;
 	doorbell_pending = 0;
 
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
@@ -748,7 +758,6 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		 * read bytes but not past the end of the frag.
 		 */
 		if (xdp_prog) {
-			struct xdp_buff xdp;
 			dma_addr_t dma;
 			void *orig_data;
 			u32 act;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 1856e279a7e0..bdfb4362b35a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -46,6 +46,7 @@
 #endif
 #include <linux/cpu_rmap.h>
 #include <linux/ptp_clock_kernel.h>
+#include <net/xdp.h>
 
 #include <linux/mlx4/device.h>
 #include <linux/mlx4/qp.h>
@@ -353,6 +354,7 @@ struct mlx4_en_rx_ring {
 	unsigned long dropped;
 	int hwtstamp_rx_filter;
 	cpumask_var_t affinity_mask;
+	struct xdp_rxq_info xdp_rxq;
 };
 
 struct mlx4_en_cq {
@@ -716,7 +718,7 @@ void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev);
 void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv);
 int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
 			   struct mlx4_en_rx_ring **pring,
-			   u32 size, u16 stride, int node);
+			   u32 size, u16 stride, int node, int queue_index);
 void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
 			     struct mlx4_en_rx_ring **pring,
 			     u32 size, u16 stride);

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 07/14] bnxt_en: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (5 preceding siblings ...)
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info Jesper Dangaard Brouer
@ 2017-12-13 11:19 ` Jesper Dangaard Brouer
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 08/14] nfp: " Jesper Dangaard Brouer
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, michael.chan,
	bjorn.topel, gospo, Andy Gospodarek

Driver hook points for xdp_rxq_info:
 * init+reg: bnxt_alloc_rx_rings
 * unreg   : bnxt_free_rx_rings

This driver should be updated to re-register when changing
allocation mode of RX rings.

Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   10 ++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |    2 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |    1 +
 3 files changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index af6c83f355ae..1eb65a6d6d9a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2246,6 +2246,7 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
 
 		if (rxr->xdp_prog)
 			bpf_prog_put(rxr->xdp_prog);
+		xdp_rxq_info_unreg(&rxr->xdp_rxq);
 
 		kfree(rxr->rx_tpa);
 		rxr->rx_tpa = NULL;
@@ -2280,6 +2281,12 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
 
 		ring = &rxr->rx_ring_struct;
 
+		/* XDP RX-queue info */
+		xdp_rxq_info_init(&rxr->xdp_rxq);
+		rxr->xdp_rxq.dev = bp->dev;
+		rxr->xdp_rxq.queue_index = i;
+		xdp_rxq_info_reg(&rxr->xdp_rxq);
+
 		rc = bnxt_alloc_ring(bp, ring);
 		if (rc)
 			return rc;
@@ -2834,6 +2841,9 @@ void bnxt_set_ring_params(struct bnxt *bp)
 	bp->cp_ring_mask = bp->cp_bit - 1;
 }
 
+/* Changing allocation mode of RX rings.
+ * TODO: Update when extending xdp_rxq_info to support allocation modes.
+ */
 int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
 {
 	if (page_mode) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 5359a1f0045f..2d268fc26f5e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -23,6 +23,7 @@
 #include <net/devlink.h>
 #include <net/dst_metadata.h>
 #include <net/switchdev.h>
+#include <net/xdp.h>
 
 struct tx_bd {
 	__le32 tx_bd_len_flags_type;
@@ -664,6 +665,7 @@ struct bnxt_rx_ring_info {
 
 	struct bnxt_ring_struct	rx_ring_struct;
 	struct bnxt_ring_struct	rx_agg_ring_struct;
+	struct xdp_rxq_info	xdp_rxq;
 };
 
 struct bnxt_cp_ring_info {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index 261e5847557a..1389ab5e05df 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -96,6 +96,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 	xdp.data = *data_ptr;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = *data_ptr + *len;
+	xdp.rxq = &rxr->xdp_rxq;
 	orig_data = xdp.data;
 	mapping = rx_buf->mapping - bp->rx_dma_offset;
 

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 08/14] nfp: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (6 preceding siblings ...)
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 07/14] bnxt_en: " Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  2017-12-14  2:34   ` Jakub Kicinski
  2017-12-13 11:20   ` Jesper Dangaard Brouer
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: dsahern, Jakub Kicinski, oss-drivers, Simon Horman,
	Jesper Dangaard Brouer, netdev, gospo, bjorn.topel, michael.chan

Driver hook points for xdp_rxq_info:
 * init+reg: nfp_net_rx_ring_alloc
 * unreg   : nfp_net_rx_ring_free

In struct nfp_net_rx_ring, member @size is moved into a hole on 64-bit.
Thus, the struct size remains the same after adding member @xdp_rxq.

Cc: oss-drivers@netronome.com
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |    5 ++++-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |   10 +++++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 3801c52098d5..0e564cfabe7e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -47,6 +47,7 @@
 #include <linux/netdevice.h>
 #include <linux/pci.h>
 #include <linux/io-64-nonatomic-hi-lo.h>
+#include <net/xdp.h>
 
 #include "nfp_net_ctrl.h"
 
@@ -350,6 +351,7 @@ struct nfp_net_rx_buf {
  * @rxds:       Virtual address of FL/RX ring in host memory
  * @dma:        DMA address of the FL/RX ring
  * @size:       Size, in bytes, of the FL/RX ring (needed to free)
+ * @xdp_rxq:    RX-ring info avail for XDP
  */
 struct nfp_net_rx_ring {
 	struct nfp_net_r_vector *r_vec;
@@ -361,13 +363,14 @@ struct nfp_net_rx_ring {
 	u32 idx;
 
 	int fl_qcidx;
+	unsigned int size;
 	u8 __iomem *qcp_fl;
 
 	struct nfp_net_rx_buf *rxbufs;
 	struct nfp_net_rx_desc *rxds;
 
 	dma_addr_t dma;
-	unsigned int size;
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_aligned;
 
 /**
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index ad3e9f6a61e5..6474aecd0451 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1608,11 +1608,13 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 	unsigned int true_bufsz;
 	struct sk_buff *skb;
 	int pkts_polled = 0;
+	struct xdp_buff xdp;
 	int idx;
 
 	rcu_read_lock();
 	xdp_prog = READ_ONCE(dp->xdp_prog);
 	true_bufsz = xdp_prog ? PAGE_SIZE : dp->fl_bufsz;
+	xdp.rxq = &rx_ring->xdp_rxq;
 	tx_ring = r_vec->xdp_ring;
 
 	while (pkts_polled < budget) {
@@ -1703,7 +1705,6 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 				  dp->bpf_offload_xdp) && !meta.portid) {
 			void *orig_data = rxbuf->frag + pkt_off;
 			unsigned int dma_off;
-			struct xdp_buff xdp;
 			int act;
 
 			xdp.data_hard_start = rxbuf->frag + NFP_NET_RX_BUF_HEADROOM;
@@ -2252,6 +2253,7 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
 	struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
 	struct nfp_net_dp *dp = &r_vec->nfp_net->dp;
 
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 	kfree(rx_ring->rxbufs);
 
 	if (rx_ring->rxds)
@@ -2277,6 +2279,12 @@ nfp_net_rx_ring_alloc(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring)
 {
 	int sz;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rx_ring->xdp_rxq);
+	rx_ring->xdp_rxq.dev = dp->netdev;
+	rx_ring->xdp_rxq.queue_index = rx_ring->idx;
+	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
+
 	rx_ring->cnt = dp->rxd_cnt;
 	rx_ring->size = sizeof(*rx_ring->rxds) * rx_ring->cnt;
 	rx_ring->rxds = dma_zalloc_coherent(dp->dev, rx_ring->size,

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 09/14] thunderx: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
@ 2017-12-13 11:20   ` Jesper Dangaard Brouer
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: bjorn.topel, Robert Richter, netdev, dsahern,
	Jesper Dangaard Brouer, gospo, Sunil Goutham, michael.chan,
	linux-arm-kernel

This driver uses a bool scheme for "enable"/"disable" when setting up
different resources.  Thus, the hook points for xdp_rxq_info are placed
in the same function call, nicvf_rcv_queue_config().  This is activated
through enable/disable via nicvf_config_data_transfer(), which is tied
into nicvf_stop()/nicvf_open().

Extend the driver packet-handler call-path nicvf_rcv_pkt_handler() with
a pointer to the given struct rcv_queue, in order to access the
xdp_rxq_info data area (in nicvf_xdp_rx()).

Cc: linux-arm-kernel@lists.infradead.org
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Robert Richter <rric@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   |   11 +++++++----
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |    7 +++++++
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |    2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 52b3a6044f85..21618d0d694f 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -521,7 +521,7 @@ static void nicvf_unmap_page(struct nicvf *nic, struct page *page, u64 dma_addr)
 
 static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 				struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
-				struct sk_buff **skb)
+				struct rcv_queue *rq, struct sk_buff **skb)
 {
 	struct xdp_buff xdp;
 	struct page *page;
@@ -545,6 +545,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	xdp.data = (void *)cpu_addr;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + len;
+	xdp.rxq = &rq->xdp_rxq;
 	orig_data = xdp.data;
 
 	rcu_read_lock();
@@ -698,7 +699,8 @@ static inline void nicvf_set_rxhash(struct net_device *netdev,
 
 static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 				  struct napi_struct *napi,
-				  struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
+				  struct cqe_rx_t *cqe_rx,
+				  struct snd_queue *sq, struct rcv_queue *rq)
 {
 	struct sk_buff *skb = NULL;
 	struct nicvf *nic = netdev_priv(netdev);
@@ -724,7 +726,7 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 	/* For XDP, ignore pkts spanning multiple pages */
 	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1)) {
 		/* Packet consumed by XDP */
-		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq, &skb))
+		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq, rq, &skb))
 			return;
 	} else {
 		skb = nicvf_get_rcv_skb(snic, cqe_rx,
@@ -781,6 +783,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 	struct cqe_rx_t *cq_desc;
 	struct netdev_queue *txq;
 	struct snd_queue *sq = &qs->sq[cq_idx];
+	struct rcv_queue *rq = &qs->rq[cq_idx];
 	unsigned int tx_pkts = 0, tx_bytes = 0, txq_idx;
 
 	spin_lock_bh(&cq->lock);
@@ -811,7 +814,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 
 		switch (cq_desc->cqe_type) {
 		case CQE_TYPE_RX:
-			nicvf_rcv_pkt_handler(netdev, napi, cq_desc, sq);
+			nicvf_rcv_pkt_handler(netdev, napi, cq_desc, sq, rq);
 			work_done++;
 		break;
 		case CQE_TYPE_SEND:
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 095c18aeb8d5..2f2a736ea102 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -760,6 +760,7 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs,
 
 	if (!rq->enable) {
 		nicvf_reclaim_rcv_queue(nic, qs, qidx);
+		xdp_rxq_info_unreg(&rq->xdp_rxq);
 		return;
 	}
 
@@ -772,6 +773,12 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs,
 	/* all writes of RBDR data to be loaded into L2 Cache as well*/
 	rq->caching = 1;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rq->xdp_rxq);
+	rq->xdp_rxq.dev = nic->netdev;
+	rq->xdp_rxq.queue_index = qidx;
+	xdp_rxq_info_reg(&rq->xdp_rxq);
+
 	/* Send a mailbox msg to PF to config RQ */
 	mbx.rq.msg = NIC_MBOX_MSG_RQ_CFG;
 	mbx.rq.qs_num = qs->vnic_id;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 178ab6e8e3c5..7d1e4e2aaad0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -12,6 +12,7 @@
 #include <linux/netdevice.h>
 #include <linux/iommu.h>
 #include <linux/bpf.h>
+#include <net/xdp.h>
 #include "q_struct.h"
 
 #define MAX_QUEUE_SET			128
@@ -255,6 +256,7 @@ struct rcv_queue {
 	u8		start_qs_rbdr_idx; /* RBDR idx in the above QS */
 	u8		caching;
 	struct		rx_tx_queue_stats stats;
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_aligned_in_smp;
 
 struct cmp_queue {

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 09/14] thunderx: setup xdp_rxq_info
@ 2017-12-13 11:20   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: linux-arm-kernel

This driver uses a bool scheme for "enable"/"disable" when setting up
different resources.  Thus, the hook points for xdp_rxq_info are placed
in the same function call, nicvf_rcv_queue_config().  This is activated
through enable/disable via nicvf_config_data_transfer(), which is tied
into nicvf_stop()/nicvf_open().

Extend the driver packet-handler call-path nicvf_rcv_pkt_handler() with
a pointer to the given struct rcv_queue, in order to access the
xdp_rxq_info data area (in nicvf_xdp_rx()).

Cc: linux-arm-kernel@lists.infradead.org
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Robert Richter <rric@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   |   11 +++++++----
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |    7 +++++++
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |    2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 52b3a6044f85..21618d0d694f 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -521,7 +521,7 @@ static void nicvf_unmap_page(struct nicvf *nic, struct page *page, u64 dma_addr)
 
 static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 				struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
-				struct sk_buff **skb)
+				struct rcv_queue *rq, struct sk_buff **skb)
 {
 	struct xdp_buff xdp;
 	struct page *page;
@@ -545,6 +545,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	xdp.data = (void *)cpu_addr;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + len;
+	xdp.rxq = &rq->xdp_rxq;
 	orig_data = xdp.data;
 
 	rcu_read_lock();
@@ -698,7 +699,8 @@ static inline void nicvf_set_rxhash(struct net_device *netdev,
 
 static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 				  struct napi_struct *napi,
-				  struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
+				  struct cqe_rx_t *cqe_rx,
+				  struct snd_queue *sq, struct rcv_queue *rq)
 {
 	struct sk_buff *skb = NULL;
 	struct nicvf *nic = netdev_priv(netdev);
@@ -724,7 +726,7 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 	/* For XDP, ignore pkts spanning multiple pages */
 	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1)) {
 		/* Packet consumed by XDP */
-		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq, &skb))
+		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq, rq, &skb))
 			return;
 	} else {
 		skb = nicvf_get_rcv_skb(snic, cqe_rx,
@@ -781,6 +783,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 	struct cqe_rx_t *cq_desc;
 	struct netdev_queue *txq;
 	struct snd_queue *sq = &qs->sq[cq_idx];
+	struct rcv_queue *rq = &qs->rq[cq_idx];
 	unsigned int tx_pkts = 0, tx_bytes = 0, txq_idx;
 
 	spin_lock_bh(&cq->lock);
@@ -811,7 +814,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 
 		switch (cq_desc->cqe_type) {
 		case CQE_TYPE_RX:
-			nicvf_rcv_pkt_handler(netdev, napi, cq_desc, sq);
+			nicvf_rcv_pkt_handler(netdev, napi, cq_desc, sq, rq);
 			work_done++;
 		break;
 		case CQE_TYPE_SEND:
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 095c18aeb8d5..2f2a736ea102 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -760,6 +760,7 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs,
 
 	if (!rq->enable) {
 		nicvf_reclaim_rcv_queue(nic, qs, qidx);
+		xdp_rxq_info_unreg(&rq->xdp_rxq);
 		return;
 	}
 
@@ -772,6 +773,12 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs,
 	/* all writes of RBDR data to be loaded into L2 Cache as well*/
 	rq->caching = 1;
 
+	/* XDP RX-queue info */
+	xdp_rxq_info_init(&rq->xdp_rxq);
+	rq->xdp_rxq.dev = nic->netdev;
+	rq->xdp_rxq.queue_index = qidx;
+	xdp_rxq_info_reg(&rq->xdp_rxq);
+
 	/* Send a mailbox msg to PF to config RQ */
 	mbx.rq.msg = NIC_MBOX_MSG_RQ_CFG;
 	mbx.rq.qs_num = qs->vnic_id;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 178ab6e8e3c5..7d1e4e2aaad0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -12,6 +12,7 @@
 #include <linux/netdevice.h>
 #include <linux/iommu.h>
 #include <linux/bpf.h>
+#include <net/xdp.h>
 #include "q_struct.h"
 
 #define MAX_QUEUE_SET			128
@@ -255,6 +256,7 @@ struct rcv_queue {
 	u8		start_qs_rbdr_idx; /* RBDR idx in the above QS */
 	u8		caching;
 	struct		rx_tx_queue_stats stats;
+	struct xdp_rxq_info xdp_rxq;
 } ____cacheline_aligned_in_smp;
 
 struct cmp_queue {

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [bpf-next V1-RFC PATCH 10/14] tun: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (8 preceding siblings ...)
  2017-12-13 11:20   ` Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  2017-12-20  7:48   ` Jason Wang
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 11/14] virtio_net: " Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: Willem de Bruijn, Michael S. Tsirkin, netdev, Jason Wang,
	dsahern, Jesper Dangaard Brouer, gospo, bjorn.topel,
	michael.chan

Driver hook points for xdp_rxq_info:
 * init+reg: tun_attach
 * unreg   : __tun_detach

I've done some manual testing of this tun driver, but I would
appreciate a good review and someone else running their use-case tests,
as I'm not 100% sure I understand the tfile->detached semantics.

Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/tun.c |   25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index e367d6310353..f1df08c2c541 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -180,6 +180,7 @@ struct tun_file {
 	struct list_head next;
 	struct tun_struct *detached;
 	struct skb_array tx_array;
+	struct xdp_rxq_info xdp_rxq;
 };
 
 struct tun_flow_entry {
@@ -687,8 +688,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
 			    tun->dev->reg_state == NETREG_REGISTERED)
 				unregister_netdevice(tun->dev);
 		}
-		if (tun)
+		if (tun) {
 			skb_array_cleanup(&tfile->tx_array);
+			xdp_rxq_info_unreg(&tfile->xdp_rxq);
+		}
 		sock_put(&tfile->sk);
 	}
 }
@@ -728,11 +731,15 @@ static void tun_detach_all(struct net_device *dev)
 		tun_napi_del(tun, tfile);
 		/* Drop read queue */
 		tun_queue_purge(tfile);
+		skb_array_cleanup(&tfile->tx_array);
+		xdp_rxq_info_unreg(&tfile->xdp_rxq);
 		sock_put(&tfile->sk);
 	}
 	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
 		tun_enable_queue(tfile);
 		tun_queue_purge(tfile);
+		skb_array_cleanup(&tfile->tx_array);
+		xdp_rxq_info_unreg(&tfile->xdp_rxq);
 		sock_put(&tfile->sk);
 	}
 	BUG_ON(tun->numdisabled != 0);
@@ -784,6 +791,21 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
 
 	tfile->queue_index = tun->numqueues;
 	tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
+
+	if (tfile->detached) {
+		/* Re-attach detached tfile, updating XDP queue_index */
+		WARN_ON(!xdp_rxq_info_is_reg(&tfile->xdp_rxq));
+
+		if (tfile->xdp_rxq.queue_index    != tfile->queue_index)
+			tfile->xdp_rxq.queue_index = tfile->queue_index;
+	} else {
+		/* Setup XDP RX-queue info, for new tfile getting attached */
+		xdp_rxq_info_init(&tfile->xdp_rxq);
+		tfile->xdp_rxq.dev = tun->dev;
+		tfile->xdp_rxq.queue_index = tfile->queue_index;
+		xdp_rxq_info_reg(&tfile->xdp_rxq);
+	}
+
 	rcu_assign_pointer(tfile->tun, tun);
 	rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
 	tun->numqueues++;
@@ -1508,6 +1530,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 		xdp.data = buf + pad;
 		xdp_set_data_meta_invalid(&xdp);
 		xdp.data_end = xdp.data + len;
+		xdp.rxq = &tfile->xdp_rxq;
 		orig_data = xdp.data;
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 


* [bpf-next V1-RFC PATCH 11/14] virtio_net: setup xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (9 preceding siblings ...)
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 10/14] tun: " Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: Michael S. Tsirkin, netdev, Jesper Dangaard Brouer,
	virtualization, dsahern, gospo, bjorn.topel, michael.chan

The virtio_net driver doesn't dynamically change the RX-ring queue
layout and backing pages, but instead rejects XDP setup if all the
conditions for XDP are not met.  Thus, the xdp_rxq_info also remains
fairly static.  This allows us to simply add the init+reg/unreg calls
to the net_device open/close functions.

Driver hook points for xdp_rxq_info:
 * init+reg: virtnet_open
 * unreg   : virtnet_close

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 19a985ef9104..9792183befbf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -31,6 +31,7 @@
 #include <linux/average.h>
 #include <linux/filter.h>
 #include <net/route.h>
+#include <net/xdp.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -115,6 +116,8 @@ struct receive_queue {
 
 	/* Name of this receive queue: input.$index */
 	char name[40];
+
+	struct xdp_rxq_info xdp_rxq;
 };
 
 struct virtnet_info {
@@ -556,6 +559,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
 		xdp.data = xdp.data_hard_start + xdp_headroom;
 		xdp_set_data_meta_invalid(&xdp);
 		xdp.data_end = xdp.data + len;
+		xdp.rxq = &rq->xdp_rxq;
 		orig_data = xdp.data;
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 
@@ -1229,6 +1233,13 @@ static int virtnet_open(struct net_device *dev)
 			/* Make sure we have some buffers: if oom use wq. */
 			if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
 				schedule_delayed_work(&vi->refill, 0);
+
+		/* XDP RX queue info */
+		xdp_rxq_info_init(&vi->rq[i].xdp_rxq);
+		vi->rq[i].xdp_rxq.dev = dev;
+		vi->rq[i].xdp_rxq.queue_index = i;
+		xdp_rxq_info_reg(&vi->rq[i].xdp_rxq);
+
 		virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
 		virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
 	}
@@ -1557,6 +1568,7 @@ static int virtnet_close(struct net_device *dev)
 	cancel_delayed_work_sync(&vi->refill);
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {
+		xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
 		napi_disable(&vi->rq[i].napi);
 		virtnet_napi_tx_disable(&vi->sq[i].napi);
 	}


* [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (10 preceding siblings ...)
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 11/14] virtio_net: " Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  2017-12-13 22:50   ` Saeed Mahameed
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 13/14] bpf: finally expose xdp_rxq_info to XDP bpf-programs Jesper Dangaard Brouer
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 14/14] samples/bpf: program demonstrating access to xdp_rxq_info Jesper Dangaard Brouer
  13 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, gospo, bjorn.topel,
	michael.chan

Hook points for xdp_rxq_info:
 * init+reg: netif_alloc_rx_queues
 * unreg   : netif_free_rx_queues

The net_device has some members (num_rx_queues + real_num_rx_queues)
and a data-area (dev->_rx with struct netdev_rx_queue's) that were
primarily used for exporting information about RPS (CONFIG_RPS) queues
to sysfs (CONFIG_SYSFS).

For generic XDP, extend struct netdev_rx_queue with the xdp_rxq_info,
and remove some of the CONFIG_SYSFS ifdefs.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/netdevice.h |    2 ++
 net/core/dev.c            |   60 ++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cc4ce7456e38..43595b037872 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -44,6 +44,7 @@
 #include <net/dcbnl.h>
 #endif
 #include <net/netprio_cgroup.h>
+#include <net/xdp.h>
 
 #include <linux/netdev_features.h>
 #include <linux/neighbour.h>
@@ -686,6 +687,7 @@ struct netdev_rx_queue {
 #endif
 	struct kobject			kobj;
 	struct net_device		*dev;
+	struct xdp_rxq_info		xdp_rxq;
 } ____cacheline_aligned_in_smp;
 
 /*
diff --git a/net/core/dev.c b/net/core/dev.c
index 6bea8931bb62..44932d6194a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3873,9 +3873,33 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 	return NET_RX_DROP;
 }
 
+static struct netdev_rx_queue *netif_get_rxqueue(struct sk_buff *skb)
+{
+	struct net_device *dev = skb->dev;
+	struct netdev_rx_queue *rxqueue;
+
+	rxqueue = dev->_rx;
+
+	if (skb_rx_queue_recorded(skb)) {
+		u16 index = skb_get_rx_queue(skb);
+
+		if (unlikely(index >= dev->real_num_rx_queues)) {
+			WARN_ONCE(dev->real_num_rx_queues > 1,
+				  "%s received packet on queue %u, but number "
+				  "of RX queues is %u\n",
+				  dev->name, index, dev->real_num_rx_queues);
+
+			return rxqueue; /* Return first rxqueue */
+		}
+		rxqueue += index;
+	}
+	return rxqueue;
+}
+
 static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 				     struct bpf_prog *xdp_prog)
 {
+	struct netdev_rx_queue *rxqueue;
 	u32 metalen, act = XDP_DROP;
 	struct xdp_buff xdp;
 	void *orig_data;
@@ -3919,6 +3943,9 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 	xdp.data_hard_start = skb->data - skb_headroom(skb);
 	orig_data = xdp.data;
 
+	rxqueue = netif_get_rxqueue(skb);
+	xdp.rxq = &rxqueue->xdp_rxq;
+
 	act = bpf_prog_run_xdp(xdp_prog, &xdp);
 
 	off = xdp.data - orig_data;
@@ -7538,7 +7565,6 @@ void netif_stacked_transfer_operstate(const struct net_device *rootdev,
 }
 EXPORT_SYMBOL(netif_stacked_transfer_operstate);
 
-#ifdef CONFIG_SYSFS
 static int netif_alloc_rx_queues(struct net_device *dev)
 {
 	unsigned int i, count = dev->num_rx_queues;
@@ -7553,11 +7579,31 @@ static int netif_alloc_rx_queues(struct net_device *dev)
 
 	dev->_rx = rx;
 
-	for (i = 0; i < count; i++)
+	for (i = 0; i < count; i++) {
 		rx[i].dev = dev;
+
+		/* XDP RX-queue setup */
+		xdp_rxq_info_init(&rx[i].xdp_rxq);
+		rx[i].xdp_rxq.dev = dev;
+		rx[i].xdp_rxq.queue_index = i;
+		xdp_rxq_info_reg(&rx[i].xdp_rxq);
+	}
 	return 0;
 }
-#endif
+
+static void netif_free_rx_queues(struct net_device *dev)
+{
+	unsigned int i, count = dev->num_rx_queues;
+	struct netdev_rx_queue *rx;
+
+	if (!dev->_rx) /* netif_alloc_rx_queues alloc failed */
+		return;
+
+	rx = dev->_rx;
+
+	for (i = 0; i < count; i++)
+		xdp_rxq_info_unreg(&rx[i].xdp_rxq);
+}
 
 static void netdev_init_one_queue(struct net_device *dev,
 				  struct netdev_queue *queue, void *_unused)
@@ -8118,12 +8164,10 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 		return NULL;
 	}
 
-#ifdef CONFIG_SYSFS
 	if (rxqs < 1) {
 		pr_err("alloc_netdev: Unable to allocate device with zero RX queues\n");
 		return NULL;
 	}
-#endif
 
 	alloc_size = sizeof(struct net_device);
 	if (sizeof_priv) {
@@ -8180,12 +8224,10 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	if (netif_alloc_netdev_queues(dev))
 		goto free_all;
 
-#ifdef CONFIG_SYSFS
 	dev->num_rx_queues = rxqs;
 	dev->real_num_rx_queues = rxqs;
 	if (netif_alloc_rx_queues(dev))
 		goto free_all;
-#endif
 
 	strcpy(dev->name, name);
 	dev->name_assign_type = name_assign_type;
@@ -8224,9 +8266,7 @@ void free_netdev(struct net_device *dev)
 
 	might_sleep();
 	netif_free_tx_queues(dev);
-#ifdef CONFIG_SYSFS
-	kvfree(dev->_rx);
-#endif
+	netif_free_rx_queues(dev);
 
 	kfree(rcu_dereference_protected(dev->ingress_queue, 1));
 


* [bpf-next V1-RFC PATCH 13/14] bpf: finally expose xdp_rxq_info to XDP bpf-programs
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (11 preceding siblings ...)
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 14/14] samples/bpf: program demonstrating access to xdp_rxq_info Jesper Dangaard Brouer
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, gospo, bjorn.topel,
	michael.chan

Now all XDP drivers have been updated to set up xdp_rxq_info and assign
it to xdp_buff->rxq.  Thus, it is now safe to enable access to some
of the xdp_rxq_info struct members.

This patch extends xdp_md and exposes a UAPI to userspace for
ingress_ifindex and rx_queue_index.  Access happens via a bpf
instruction rewrite that loads data directly from struct xdp_rxq_info.

* ingress_ifindex maps to xdp_rxq_info->dev->ifindex
* rx_queue_index  maps to xdp_rxq_info->queue_index

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/uapi/linux/bpf.h |    3 +++
 net/core/filter.c        |   19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 595bda120cfb..512d456dc469 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -893,6 +893,9 @@ struct xdp_md {
 	__u32 data;
 	__u32 data_end;
 	__u32 data_meta;
+	/* Below accesses go through struct xdp_rxq_info */
+	__u32 ingress_ifindex; /* rxq->dev->ifindex */
+	__u32 rx_queue_index;  /* rxq->queue_index  */
 };
 
 enum sk_action {
diff --git a/net/core/filter.c b/net/core/filter.c
index 754abe1041b7..676ee9860019 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4303,6 +4303,25 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
 				      si->dst_reg, si->src_reg,
 				      offsetof(struct xdp_buff, data_end));
 		break;
+	case offsetof(struct xdp_md, ingress_ifindex):
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, rxq),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct xdp_buff, rxq));
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_rxq_info, dev),
+				      si->dst_reg, si->dst_reg,
+				      offsetof(struct xdp_rxq_info, dev));
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
+				      bpf_target_off(struct net_device,
+						     ifindex, 4, target_size));
+		break;
+	case offsetof(struct xdp_md, rx_queue_index):
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, rxq),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct xdp_buff, rxq));
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
+				      bpf_target_off(struct xdp_rxq_info,
+						queue_index, 4, target_size));
+		break;
 	}
 
 	return insn - insn_buf;


* [bpf-next V1-RFC PATCH 14/14] samples/bpf: program demonstrating access to xdp_rxq_info
  2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
                   ` (12 preceding siblings ...)
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 13/14] bpf: finally expose xdp_rxq_info to XDP bpf-programs Jesper Dangaard Brouer
@ 2017-12-13 11:20 ` Jesper Dangaard Brouer
  13 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, Jesper Dangaard Brouer, dsahern, gospo, bjorn.topel,
	michael.chan

This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.

The default XDP action is XDP_PASS, in order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions via the cmdline option --action.

The output below shows an imbalanced RSS case where most RXQs deliver to
CPU-0 while CPU-2 only gets packets from a single RXQ.  Looking at
things from a CPU level, the two CPUs are processing approx. the same
amount, BUT looking at the rx_queue_index level it is clear that
RXQ-2 receives much better service than the other RXQs, which all share CPU-0.

Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats       CPU     pps         issue-pps
XDP-RX CPU      0       900,473     0
XDP-RX CPU      2       906,921     0
XDP-RX CPU      total   1,807,395

RXQ stats       RXQ:CPU pps         issue-pps
rx_queue_index    0:0   180,098     0
rx_queue_index    0:sum 180,098
rx_queue_index    1:0   180,098     0
rx_queue_index    1:sum 180,098
rx_queue_index    2:2   906,921     0
rx_queue_index    2:sum 906,921
rx_queue_index    3:0   180,098     0
rx_queue_index    3:sum 180,098
rx_queue_index    4:0   180,082     0
rx_queue_index    4:sum 180,082
rx_queue_index    5:0   180,093     0
rx_queue_index    5:sum 180,093

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/Makefile            |    4 
 samples/bpf/xdp_rxq_info_kern.c |   96 +++++++
 samples/bpf/xdp_rxq_info_user.c |  533 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 633 insertions(+)
 create mode 100644 samples/bpf/xdp_rxq_info_kern.c
 create mode 100644 samples/bpf/xdp_rxq_info_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4fb944a7ecf8..3ff7a05bea9a 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -41,6 +41,7 @@ hostprogs-y += xdp_redirect
 hostprogs-y += xdp_redirect_map
 hostprogs-y += xdp_redirect_cpu
 hostprogs-y += xdp_monitor
+hostprogs-y += xdp_rxq_info
 hostprogs-y += syscall_tp
 
 # Libbpf dependencies
@@ -90,6 +91,7 @@ xdp_redirect-objs := bpf_load.o $(LIBBPF) xdp_redirect_user.o
 xdp_redirect_map-objs := bpf_load.o $(LIBBPF) xdp_redirect_map_user.o
 xdp_redirect_cpu-objs := bpf_load.o $(LIBBPF) xdp_redirect_cpu_user.o
 xdp_monitor-objs := bpf_load.o $(LIBBPF) xdp_monitor_user.o
+xdp_rxq_info-objs := bpf_load.o $(LIBBPF) xdp_rxq_info_user.o
 syscall_tp-objs := bpf_load.o $(LIBBPF) syscall_tp_user.o
 
 # Tell kbuild to always build the programs
@@ -139,6 +141,7 @@ always += xdp_redirect_kern.o
 always += xdp_redirect_map_kern.o
 always += xdp_redirect_cpu_kern.o
 always += xdp_monitor_kern.o
+always += xdp_rxq_info_kern.o
 always += syscall_tp_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
@@ -182,6 +185,7 @@ HOSTLOADLIBES_xdp_redirect += -lelf
 HOSTLOADLIBES_xdp_redirect_map += -lelf
 HOSTLOADLIBES_xdp_redirect_cpu += -lelf
 HOSTLOADLIBES_xdp_monitor += -lelf
+HOSTLOADLIBES_xdp_rxq_info += -lelf
 HOSTLOADLIBES_syscall_tp += -lelf
 
 # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
diff --git a/samples/bpf/xdp_rxq_info_kern.c b/samples/bpf/xdp_rxq_info_kern.c
new file mode 100644
index 000000000000..19cbd662202c
--- /dev/null
+++ b/samples/bpf/xdp_rxq_info_kern.c
@@ -0,0 +1,96 @@
+/* Example howto extract XDP RX-queue info
+ *
+ * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
+ * Released under terms in GPL version 2.  See COPYING.
+ */
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+/* Config setup from userspace
+ *
+ * Userspace sets up ifindex in config_map, to verify that
+ * ctx->ingress_ifindex is correct (against the configured ifindex)
+ */
+struct config {
+	__u32 action;
+	int ifindex;
+};
+struct bpf_map_def SEC("maps") config_map = {
+	.type		= BPF_MAP_TYPE_ARRAY,
+	.key_size	= sizeof(int),
+	.value_size	= sizeof(struct config),
+	.max_entries	= 1,
+};
+
+/* Common stats data record (shared with userspace) */
+struct datarec {
+	__u64 processed;
+	__u64 issue;
+};
+
+struct bpf_map_def SEC("maps") stats_global_map = {
+	.type		= BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size	= sizeof(u32),
+	.value_size	= sizeof(struct datarec),
+	.max_entries	= 1,
+};
+
+#define MAX_RXQs 64
+
+/* Stats per rx_queue_index (per CPU) */
+struct bpf_map_def SEC("maps") rx_queue_index_map = {
+	.type		= BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size	= sizeof(u32),
+	.value_size	= sizeof(struct datarec),
+	.max_entries	= MAX_RXQs + 1,
+};
+
+SEC("xdp_prog0")
+int  xdp_prognum0(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data     = (void *)(long)ctx->data;
+	struct datarec *rec, *rxq_rec;
+	int ingress_ifindex;
+	struct config *config;
+	u32 key = 0;
+
+	/* Global stats record */
+	rec = bpf_map_lookup_elem(&stats_global_map, &key);
+	if (!rec)
+		return XDP_ABORTED;
+	rec->processed++;
+
+	/* Accessing ctx->ingress_ifindex causes BPF to rewrite the BPF
+	 * instructions inside the kernel to access xdp_rxq->dev->ifindex
+	 */
+	ingress_ifindex = ctx->ingress_ifindex;
+
+	config = bpf_map_lookup_elem(&config_map, &key);
+	if (!config)
+		return XDP_ABORTED;
+
+	/* Simple test: check ctx provided ifindex is as expected */
+	if (ingress_ifindex != config->ifindex) {
+		/* count this error case */
+		rec->issue++;
+		return XDP_ABORTED;
+	}
+
+	/* Update stats per rx_queue_index. Handle if rx_queue_index
+	 * is larger than stats map can contain info for.
+	 */
+	key = ctx->rx_queue_index;
+	if (key >= MAX_RXQs)
+		key = MAX_RXQs;
+	rxq_rec = bpf_map_lookup_elem(&rx_queue_index_map, &key);
+	if (!rxq_rec)
+		return XDP_ABORTED;
+	rxq_rec->processed++;
+	if (key == MAX_RXQs)
+		rxq_rec->issue++;
+
+	return config->action;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/xdp_rxq_info_user.c b/samples/bpf/xdp_rxq_info_user.c
new file mode 100644
index 000000000000..8e540c33631c
--- /dev/null
+++ b/samples/bpf/xdp_rxq_info_user.c
@@ -0,0 +1,533 @@
+/*
+ * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
+ * Released under terms in GPL version 2.  See COPYING.
+ */
+static const char *__doc__ = " XDP RX-queue info extract example\n\n"
+	"Monitor how many packets per sec (pps) are received\n"
+	"per NIC RX queue index and which CPU processed the packet\n"
+	;
+
+#include <errno.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+#include <locale.h>
+#include <sys/resource.h>
+#include <getopt.h>
+#include <net/if.h>
+#include <time.h>
+
+#include <arpa/inet.h>
+#include <linux/if_link.h>
+
+#include "libbpf.h"
+#include "bpf_load.h"
+#include "bpf_util.h"
+
+static int ifindex = -1;
+static char ifname_buf[IF_NAMESIZE];
+static char *ifname;
+
+static __u32 xdp_flags;
+
+/* Exit return codes */
+#define EXIT_OK		0
+#define EXIT_FAIL		1
+#define EXIT_FAIL_OPTION	2
+#define EXIT_FAIL_XDP		3
+#define EXIT_FAIL_BPF		4
+#define EXIT_FAIL_MEM		5
+
+static const struct option long_options[] = {
+	{"help",	no_argument,		NULL, 'h' },
+	{"dev",		required_argument,	NULL, 'd' },
+	{"skb-mode",	no_argument,		NULL, 'S' },
+	{"sec",		required_argument,	NULL, 's' },
+	{"no-separators", no_argument,		NULL, 'z' },
+	{"action", 	required_argument,	NULL, 'a' },
+	{0, 0, NULL,  0 }
+};
+
+static void int_exit(int sig)
+{
+	fprintf(stderr,
+		"Interrupted: Removing XDP program on ifindex:%d device:%s\n",
+		ifindex, ifname);
+	if (ifindex > -1)
+		set_link_xdp_fd(ifindex, -1, xdp_flags);
+	exit(EXIT_OK);
+}
+
+struct config {
+	__u32 action;
+	int ifindex;
+};
+#define XDP_ACTION_MAX (XDP_TX + 1)
+#define XDP_ACTION_MAX_STRLEN 11
+static const char *xdp_action_names[XDP_ACTION_MAX] = {
+	[XDP_ABORTED]	= "XDP_ABORTED",
+	[XDP_DROP]	= "XDP_DROP",
+	[XDP_PASS]	= "XDP_PASS",
+	[XDP_TX]	= "XDP_TX",
+};
+
+static const char *action2str(int action)
+{
+	if (action < XDP_ACTION_MAX)
+		return xdp_action_names[action];
+	return NULL;
+}
+
+static int parse_xdp_action(char *action_str)
+{
+	size_t maxlen;
+	int action = -1;
+	int i;
+
+	for (i = 0; i < XDP_ACTION_MAX; i++) {
+		maxlen = XDP_ACTION_MAX_STRLEN;
+		if (strncmp(xdp_action_names[i], action_str, maxlen) == 0) {
+			action = i;
+			break;
+		}
+	}
+	return action;
+}
+
+static void list_xdp_actions(void)
+{
+	int i;
+
+	printf("Available XDP --action <options>\n");
+	for (i = 0; i < XDP_ACTION_MAX; i++) {
+		printf("\t%s\n", xdp_action_names[i]);
+	}
+	printf("\n");
+}
+
+static void usage(char *argv[])
+{
+	int i;
+
+	printf("\nDOCUMENTATION:\n%s\n", __doc__);
+	printf(" Usage: %s (options-see-below)\n", argv[0]);
+	printf(" Listing options:\n");
+	for (i = 0; long_options[i].name != 0; i++) {
+		printf(" --%-12s", long_options[i].name);
+		if (long_options[i].flag != NULL)
+			printf(" flag (internal value:%d)",
+				*long_options[i].flag);
+		else
+			printf(" short-option: -%c",
+				long_options[i].val);
+		printf("\n");
+	}
+	printf("\n");
+	list_xdp_actions();
+}
+
+#define NANOSEC_PER_SEC 1000000000 /* 10^9 */
+static __u64 gettime(void)
+{
+	struct timespec t;
+	int res;
+
+	res = clock_gettime(CLOCK_MONOTONIC, &t);
+	if (res < 0) {
+		fprintf(stderr, "Error with clock_gettime! (%i)\n", res);
+		exit(EXIT_FAIL);
+	}
+	return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec;
+}
+
+/* Common stats data record shared with _kern.c */
+struct datarec {
+	__u64 processed;
+	__u64 issue;
+};
+struct record {
+	__u64 timestamp;
+	struct datarec total;
+	struct datarec *cpu;
+};
+struct stats_record {
+	struct record stats;
+	struct record *rxq;
+};
+
+static struct datarec *alloc_record_per_cpu(void)
+{
+	unsigned int nr_cpus = bpf_num_possible_cpus();
+	struct datarec *array;
+	size_t size;
+
+	size = sizeof(struct datarec) * nr_cpus;
+	array = malloc(size);
+	if (!array) {
+		fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus);
+		exit(EXIT_FAIL_MEM);
+	}
+	memset(array, 0, size);
+	return array;
+}
+
+static struct record *alloc_record_per_rxq(void)
+{
+	unsigned int nr_rxqs = map_data[2].def.max_entries;
+	struct record *array;
+	size_t size;
+
+	size = sizeof(struct record) * nr_rxqs;
+	array = malloc(size);
+	if (!array) {
+		fprintf(stderr, "Mem alloc error (nr_rxqs:%u)\n", nr_rxqs);
+		exit(EXIT_FAIL_MEM);
+	}
+	memset(array, 0, size);
+	return array;
+}
+
+static struct stats_record *alloc_stats_record(void)
+{
+	unsigned int nr_rxqs = map_data[2].def.max_entries;
+	struct stats_record *rec;
+	int i;
+
+	rec = malloc(sizeof(*rec));
+	if (!rec) {
+		fprintf(stderr, "Mem alloc error\n");
+		exit(EXIT_FAIL_MEM);
+	}
+	memset(rec, 0, sizeof(*rec));
+	rec->rxq = alloc_record_per_rxq();
+	for (i = 0; i < nr_rxqs; i++) {
+		rec->rxq[i].cpu = alloc_record_per_cpu();
+	}
+	rec->stats.cpu = alloc_record_per_cpu();
+	return rec;
+}
+
+static void free_stats_record(struct stats_record *r)
+{
+	unsigned int nr_rxqs = map_data[2].def.max_entries;
+	int i;
+
+	for (i = 0; i < nr_rxqs; i++) {
+		free(r->rxq[i].cpu);
+	}
+	free(r->rxq);
+	free(r->stats.cpu);
+	free(r);
+}
+
+static bool map_collect_percpu(int fd, __u32 key, struct record *rec)
+{
+	/* For percpu maps, userspace gets a value per possible CPU */
+	unsigned int nr_cpus = bpf_num_possible_cpus();
+	struct datarec values[nr_cpus];
+	__u64 sum_processed = 0;
+	__u64 sum_issue = 0;
+	int i;
+
+	if ((bpf_map_lookup_elem(fd, &key, values)) != 0) {
+		fprintf(stderr,
+			"ERR: bpf_map_lookup_elem failed key:0x%X\n", key);
+		return false;
+	}
+	/* Get time as close as possible to reading map contents */
+	rec->timestamp = gettime();
+
+	/* Record and sum values from each CPU */
+	for (i = 0; i < nr_cpus; i++) {
+		rec->cpu[i].processed = values[i].processed;
+		sum_processed        += values[i].processed;
+		rec->cpu[i].issue = values[i].issue;
+		sum_issue        += values[i].issue;
+	}
+	rec->total.processed = sum_processed;
+	rec->total.issue     = sum_issue;
+	return true;
+}
+
+static void stats_collect(struct stats_record *rec)
+{
+	int fd, i, max_rxqs;
+
+	fd = map_data[1].fd; /* map: stats_global_map */
+	map_collect_percpu(fd, 0, &rec->stats);
+
+	fd = map_data[2].fd; /* map: rx_queue_index_map */
+	max_rxqs = map_data[2].def.max_entries;
+	for (i = 0; i < max_rxqs; i++)
+		map_collect_percpu(fd, i, &rec->rxq[i]);
+}
+
+static double calc_period(struct record *r, struct record *p)
+{
+	double period_ = 0;
+	__u64 period = 0;
+
+	period = r->timestamp - p->timestamp;
+	if (period > 0)
+		period_ = ((double) period / NANOSEC_PER_SEC);
+
+	return period_;
+}
+
+static __u64 calc_pps(struct datarec *r, struct datarec *p, double period_)
+{
+	__u64 packets = 0;
+	__u64 pps = 0;
+
+	if (period_ > 0) {
+		packets = r->processed - p->processed;
+		pps = packets / period_;
+	}
+	return pps;
+}
+
+static __u64 calc_errs_pps(struct datarec *r,
+			    struct datarec *p, double period_)
+{
+	__u64 packets = 0;
+	__u64 pps = 0;
+
+	if (period_ > 0) {
+		packets = r->issue - p->issue;
+		pps = packets / period_;
+	}
+	return pps;
+}
+
+static void stats_print(struct stats_record *stats_rec,
+			struct stats_record *stats_prev,
+			int action)
+{
+	unsigned int nr_cpus = bpf_num_possible_cpus();
+	unsigned int nr_rxqs = map_data[2].def.max_entries;
+	double pps = 0, err = 0;
+	struct record *rec, *prev;
+	double t;
+	int rxq;
+	int i;
+
+	/* Header */
+	printf("\nRunning XDP on dev:%s (ifindex:%d) action:%s\n",
+	       ifname, ifindex, action2str(action));
+
+	/* stats_global_map */
+	{
+		char *fmt_rx = "%-15s %-7d %'-11.0f %'-10.0f %s\n";
+		char *fm2_rx = "%-15s %-7s %'-11.0f\n";
+		char *errstr = "";
+
+		printf("%-15s %-7s %-11s %-11s\n",
+		       "XDP stats", "CPU", "pps", "issue-pps");
+
+		rec  =  &stats_rec->stats;
+		prev = &stats_prev->stats;
+		t = calc_period(rec, prev);
+		for (i = 0; i < nr_cpus; i++) {
+			struct datarec *r = &rec->cpu[i];
+			struct datarec *p = &prev->cpu[i];
+
+			pps = calc_pps     (r, p, t);
+			err = calc_errs_pps(r, p, t);
+			if (err > 0)
+				errstr = "invalid-ifindex";
+			if (pps > 0)
+				printf(fmt_rx, "XDP-RX CPU",
+					i, pps, err, errstr);
+		}
+		pps  = calc_pps     (&rec->total, &prev->total, t);
+		err  = calc_errs_pps(&rec->total, &prev->total, t);
+		printf(fm2_rx, "XDP-RX CPU", "total", pps, err);
+	}
+
+	/* rx_queue_index_map */
+	printf("\n%-15s %-7s %-11s %-11s\n",
+	       "RXQ stats", "RXQ:CPU", "pps", "issue-pps");
+
+	for (rxq = 0; rxq < nr_rxqs; rxq++) {
+		char *fmt_rx = "%-15s %3d:%-3d %'-11.0f %'-10.0f %s\n";
+		char *fm2_rx = "%-15s %3d:%-3s %'-11.0f\n";
+		char *errstr = "";
+		int rxq_ = rxq;
+
+		/* Last RXQ in map catches overflows */
+		if (rxq_ == nr_rxqs - 1)
+			rxq_ = -1;
+
+		rec  =  &stats_rec->rxq[rxq];
+		prev = &stats_prev->rxq[rxq];
+		t = calc_period(rec, prev);
+		for (i = 0; i < nr_cpus; i++) {
+			struct datarec *r = &rec->cpu[i];
+			struct datarec *p = &prev->cpu[i];
+
+			pps = calc_pps     (r, p, t);
+			err = calc_errs_pps(r, p, t);
+			if (err > 0) {
+				if (rxq_ == -1)
+					errstr = "map-overflow-RXQ";
+				else
+					errstr = "err";
+			}
+			if (pps > 0)
+				printf(fmt_rx, "rx_queue_index",
+				       rxq_, i, pps, err, errstr);
+		}
+		pps  = calc_pps     (&rec->total, &prev->total, t);
+		err  = calc_errs_pps(&rec->total, &prev->total, t);
+		if (pps || err)
+			printf(fm2_rx, "rx_queue_index", rxq_, "sum", pps, err);
+	}
+}
+
+
+/* Pointer swap trick */
+static inline void swap(struct stats_record **a, struct stats_record **b)
+{
+	struct stats_record *tmp;
+
+	tmp = *a;
+	*a = *b;
+	*b = tmp;
+}
+
+static void stats_poll(int interval, int action)
+{
+	struct stats_record *record, *prev;
+
+	record = alloc_stats_record();
+	prev   = alloc_stats_record();
+	stats_collect(record);
+
+	while (1) {
+		swap(&prev, &record);
+		stats_collect(record);
+		stats_print(record, prev, action);
+		sleep(interval);
+	}
+
+	free_stats_record(record);
+	free_stats_record(prev);
+}
+
+
+int main(int argc, char **argv)
+{
+	struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
+	bool use_separators = true;
+	struct config cfg = { 0 };
+	char filename[256];
+	int longindex = 0;
+	int interval = 2;
+	__u32 key = 0;
+	int opt, err;
+
+	char action_str_buf[XDP_ACTION_MAX_STRLEN + 1 /* for \0 */] = { 0 };
+	int action = XDP_PASS; /* Default action */
+	char *action_str = NULL;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+
+	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
+		perror("setrlimit(RLIMIT_MEMLOCK)");
+		return 1;
+	}
+
+	if (load_bpf_file(filename)) {
+		fprintf(stderr, "ERR in load_bpf_file(): %s", bpf_log_buf);
+		return EXIT_FAIL;
+	}
+
+	if (!prog_fd[0]) {
+		fprintf(stderr, "ERR: load_bpf_file: %s\n", strerror(errno));
+		return EXIT_FAIL;
+	}
+
+	/* Parse commands line args */
+	while ((opt = getopt_long(argc, argv, "hSd:s:za",
+				  long_options, &longindex)) != -1) {
+		switch (opt) {
+		case 'd':
+			if (strlen(optarg) >= IF_NAMESIZE) {
+				fprintf(stderr, "ERR: --dev name too long\n");
+				goto error;
+			}
+			ifname = (char *)&ifname_buf;
+			strncpy(ifname, optarg, IF_NAMESIZE);
+			ifindex = if_nametoindex(ifname);
+			if (ifindex == 0) {
+				fprintf(stderr,
+					"ERR: --dev name unknown err(%d):%s\n",
+					errno, strerror(errno));
+				goto error;
+			}
+			break;
+		case 's':
+			interval = atoi(optarg);
+			break;
+		case 'S':
+			xdp_flags |= XDP_FLAGS_SKB_MODE;
+			break;
+		case 'z':
+			use_separators = false;
+			break;
+		case 'a':
+			action_str = (char *)&action_str_buf;
+			strncpy(action_str, optarg, XDP_ACTION_MAX_STRLEN);
+			break;
+		case 'h':
+		error:
+		default:
+			usage(argv);
+			return EXIT_FAIL_OPTION;
+		}
+	}
+	/* Required option */
+	if (ifindex == -1) {
+		fprintf(stderr, "ERR: required option --dev missing\n");
+		usage(argv);
+		return EXIT_FAIL_OPTION;
+	}
+	cfg.ifindex = ifindex;
+
+	/* Parse action string */
+	if (action_str) {
+		action = parse_xdp_action(action_str);
+		if (action < 0) {
+			fprintf(stderr, "ERR: Invalid XDP --action: %s\n",
+				action_str);
+			list_xdp_actions();
+			return EXIT_FAIL_OPTION;
+		}
+	}
+	cfg.action = action;
+
+	/* Trick to pretty printf with thousands separators use %' */
+	if (use_separators)
+		setlocale(LC_NUMERIC, "en_US");
+
+	/* User-side setup ifindex in config_map */
+	err = bpf_map_update_elem(map_fd[0], &key, &cfg, 0);
+	if (err) {
+		fprintf(stderr, "Store config failed (err:%d)\n", err);
+		exit(EXIT_FAIL_BPF);
+	}
+
+	/* Remove XDP program when program is interrupted */
+	signal(SIGINT, int_exit);
+
+	if (set_link_xdp_fd(ifindex, prog_fd[0], xdp_flags) < 0) {
+		fprintf(stderr, "link set xdp fd failed\n");
+		return EXIT_FAIL_XDP;
+	}
+
+	stats_poll(interval, action);
+	return EXIT_OK;
+}

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
@ 2017-12-13 12:27   ` Tariq Toukan
  2017-12-13 13:44     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Tariq Toukan @ 2017-12-13 12:27 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, Matan Barak, Saeed Mahameed, gospo, bjorn.topel,
	michael.chan, Tariq Toukan

Hi Jesper,
Thanks for taking care of the drop RQ.

In general, mlx5 part looks ok to me.
Find a few comments below. Mostly pointing out some typos.

On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
> The mlx5 driver have a special drop-RQ queue (one per interface) that
> simply drops all incoming traffic. It helps driver keep other HW
> objects (flow steering) alive upon down/up operations.  It is
> temporarily pointed by flow steering objects during the interface
> setup, and when interface is down. It lacks many fields that are set
> in a regular RQ (for example its state is never switched to
> MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explaination).
typo: explanation
> 
> The XDP RX-queue info API is extended with a queue-type, and mlx5 uses
> this kind of drop/sink-type (RXQ_TYPE_SINK) for this kind of sink queue.
> 
> Driver hook points for xdp_rxq_info:
>   * init+reg: mlx5e_alloc_rq()
>   * init+reg: mlx5e_alloc_drop_rq()
>   * unreg   : mlx5e_free_rq()
> 
> Tested on actual hardware with samples/bpf program
> 
> Cc: Saeed Mahameed <saeedm@mellanox.com>
> Cc: Matan Barak <matanb@mellanox.com>
> Cc: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/en.h      |    4 ++++
>   drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   14 +++++++++++++
>   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    1 +
>   include/net/xdp.h                                 |   23 +++++++++++++++++++++
>   net/core/xdp.c                                    |    6 +++++
>   5 files changed, 48 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index c0872b3284cb..fe10a042783b 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -46,6 +46,7 @@
>   #include <linux/mlx5/transobj.h>
>   #include <linux/rhashtable.h>
>   #include <net/switchdev.h>
> +#include <net/xdp.h>
>   #include "wq.h"
>   #include "mlx5_core.h"
>   #include "en_stats.h"
> @@ -568,6 +569,9 @@ struct mlx5e_rq {
>   	u32                    rqn;
>   	struct mlx5_core_dev  *mdev;
>   	struct mlx5_core_mkey  umr_mkey;
> +
> +	/* XDP read-mostly */
> +	struct xdp_rxq_info xdp_rxq;
>   } ____cacheline_aligned_in_smp;
>   
>   struct mlx5e_channel {
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 0f5c012de52e..ea44b5f25e11 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>   	rq->ix      = c->ix;
>   	rq->mdev    = mdev;
>   
> +	/* XDP RX-queue info */
> +	xdp_rxq_info_init(&rq->xdp_rxq);
> +	rq->xdp_rxq.dev		= rq->netdev;
> +	rq->xdp_rxq.queue_index = rq->ix;
> +	xdp_rxq_info_reg(&rq->xdp_rxq);
> +
You don't set type here. This is ok as long as the following hold:
1) RXQ_TYPE_DEFAULT is zero
2) xdp_rxq is zalloc'ed.
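The two conditions above can be illustrated with a small stand-alone C sketch. Note the struct and init helper below are user-space stand-ins for the patch's kernel code, not the real definitions:

```c
#include <string.h>

/* Stand-ins for the patch's definitions; illustrative only. */
#define RXQ_TYPE_DEFAULT 0
#define RXQ_TYPE_SINK    1

struct rxq_info_stub {
	unsigned int queue_index;
	unsigned int qtype;
};

/* Mirrors what xdp_rxq_info_init() does: a memset yields
 * qtype == RXQ_TYPE_DEFAULT only because that constant is zero. */
static void rxq_info_init_stub(struct rxq_info_stub *rxq)
{
	memset(rxq, 0, sizeof(*rxq));
}
```

If RXQ_TYPE_DEFAULT were ever made non-zero, every caller relying on the implicit zeroing would silently get the wrong type.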

>   	rq->xdp_prog = params->xdp_prog ? bpf_prog_inc(params->xdp_prog) : NULL;
>   	if (IS_ERR(rq->xdp_prog)) {
>   		err = PTR_ERR(rq->xdp_prog);
> @@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>   err_rq_wq_destroy:
>   	if (rq->xdp_prog)
>   		bpf_prog_put(rq->xdp_prog);
> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
>   	mlx5_wq_destroy(&rq->wq_ctrl);
>   
>   	return err;
> @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
>   	if (rq->xdp_prog)
>   		bpf_prog_put(rq->xdp_prog);
>   
> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
> +
>   	switch (rq->wq_type) {
>   	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
>   		mlx5e_rq_free_mpwqe_info(rq);
> @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct mlx5_core_dev *mdev,
>   	if (err)
>   		return err;
>   
> +	/* XDP RX-queue info for "Drop-RQ", packets never reach XDP */
> +	xdp_rxq_info_init(&rq->xdp_rxq);
> +	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
> +	xdp_rxq_info_reg(&rq->xdp_rxq);
> +
>   	rq->mdev = mdev;
>   
>   	return 0;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 5b499c7a698f..7b38480811d4 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -812,6 +812,7 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
>   	xdp_set_data_meta_invalid(&xdp);
>   	xdp.data_end = xdp.data + *len;
>   	xdp.data_hard_start = va;
> +	xdp.rxq = &rq->xdp_rxq;
>   
>   	act = bpf_prog_run_xdp(prog, &xdp);
>   	switch (act) {
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index e4acd198fd60..5be560d943e1 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -36,10 +36,33 @@ struct xdp_rxq_info {
>   	struct net_device *dev;
>   	u32 queue_index;
>   	u32 reg_state;
> +	u32 qtype;
>   } ____cacheline_aligned; /* perf critical, avoid false-sharing */
>   
>   void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
>   void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
>   void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
>   
> +/**
> + * DOC: XDP RX-queue type
> + *
> + * The XDP RX-queue info can have associated a type.
> + *
> + * @RXQ_TYPE_DEFAULT: default no specifik queue type need to be specified

typo: specific

> + *
> + * @RXQ_TYPE_SINK: indicate a fake queue that never reach XDP RX
> + *	code.  Some drivers have a need to maintain a lower layer
> + *	RX-queue as a sink queue, while reconfiguring other RX-queues.
> + */
> +#define RXQ_TYPE_DEFAULT	0
> +#define RXQ_TYPE_SINK		1
> +#define RXQ_TYPE_MAX		RXQ_TYPE_SINK

For definitions of incremental numbers, an enum might be best here; you can 
give them an enum type and use it in xdp_rxq_info->qtype.

> +
> +static inline
> +void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
> +{
> +	BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
> +	xdp_rxq->qtype = qtype;
> +}
> +
>   #endif /* __LINUX_NET_XDP_H__ */
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index a9d2dd7b1ede..2a111f5987f6 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
>   
>   void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
>   {
> +	if (xdp_rxq->qtype == RXQ_TYPE_SINK)
> +		goto skip_content_check;
> +
> +	/* Check information setup by driver code */
>   	WARN(!xdp_rxq->dev, "Missing net_device from driver");
>   	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
> +
> +skip_content_check:
>   	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
>   	xdp_rxq->reg_state = REG_STATE_REGISTRED;
typo: REGISTERED (introduced in a previous patch)
>   }
> 


* Re: [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info Jesper Dangaard Brouer
@ 2017-12-13 12:42   ` Tariq Toukan
  2017-12-13 14:00     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Tariq Toukan @ 2017-12-13 12:42 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, gospo, bjorn.topel, michael.chan, Tariq Toukan



On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
> Driver hook points for xdp_rxq_info:
>   * init+reg: mlx4_en_create_rx_ring
>   * unreg   : mlx4_en_destroy_rx_ring
> 
> Cc: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_netdev.c |    3 ++-
>   drivers/net/ethernet/mellanox/mlx4/en_rx.c     |   13 +++++++++++--
>   drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    4 +++-
>   3 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 99051a294fa6..0cfcf3089ae4 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -2172,8 +2172,9 @@ static int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
>   
>   		if (mlx4_en_create_rx_ring(priv, &priv->rx_ring[i],
>   					   prof->rx_ring_size, priv->stride,
> -					   node))
> +					   node, i))
>   			goto err;
> +
>   	}
>   
>   #ifdef CONFIG_RFS_ACCEL
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 85e28efcda33..2091c9734e6a 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -262,7 +262,7 @@ void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev)
>   
>   int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
>   			   struct mlx4_en_rx_ring **pring,
> -			   u32 size, u16 stride, int node)
> +			   u32 size, u16 stride, int node, int queue_index)
>   {
>   	struct mlx4_en_dev *mdev = priv->mdev;
>   	struct mlx4_en_rx_ring *ring;
> @@ -286,6 +286,12 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
>   	ring->log_stride = ffs(ring->stride) - 1;
>   	ring->buf_size = ring->size * ring->stride + TXBB_SIZE;
>   
> +	/* XDP RX-queue info */
> +	xdp_rxq_info_init(&ring->xdp_rxq);
> +	ring->xdp_rxq.dev = priv->dev;
> +	ring->xdp_rxq.queue_index = queue_index;
> +	xdp_rxq_info_reg(&ring->xdp_rxq);
> +
>   	tmp = size * roundup_pow_of_two(MLX4_EN_MAX_RX_FRAGS *
>   					sizeof(struct mlx4_en_rx_alloc));
>   	ring->rx_info = vzalloc_node(tmp, node);
> @@ -318,6 +324,7 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
>   	vfree(ring->rx_info);
>   	ring->rx_info = NULL;
>   err_ring:
> +	xdp_rxq_info_unreg(&ring->xdp_rxq);
>   	kfree(ring);
>   	*pring = NULL;
>   
> @@ -440,6 +447,7 @@ void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
>   					lockdep_is_held(&mdev->state_lock));
>   	if (old_prog)
>   		bpf_prog_put(old_prog);
> +	xdp_rxq_info_unreg(&ring->xdp_rxq);
>   	mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
>   	vfree(ring->rx_info);
>   	ring->rx_info = NULL;
> @@ -650,6 +658,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
>   	int cq_ring = cq->ring;
>   	bool doorbell_pending;
>   	struct mlx4_cqe *cqe;
> +	struct xdp_buff xdp;
>   	int polled = 0;
>   	int index;
>   
> @@ -664,6 +673,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
>   	/* Protect accesses to: ring->xdp_prog, priv->mac_hash list */
>   	rcu_read_lock();
>   	xdp_prog = rcu_dereference(ring->xdp_prog);
> +	xdp.rxq = &ring->xdp_rxq;

You moved this to update it only once, and not per packet. Right?
This is because all fields of struct xdp_buff used to be specific 
per-packet and filled in every iteration. Now you introduce a new field 
which holds a context.

Well, I still need to go over the infrastructure in your other patches, 
but at first glance it seems that we can use two separate structs: 
one for context, and one for per-packet info.
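The split described above — a read-mostly per-queue context referenced from a per-packet struct — can be sketched in stand-alone C. All types here are stubbed stand-ins; the real structs live in include/net/xdp.h and the driver headers:

```c
/* Stubbed stand-ins for the kernel structs; illustrative only. */
struct rxq_info_stub {          /* read-mostly, one per RX-queue */
	unsigned int queue_index;
};

struct xdp_buff_stub {          /* per-packet, refilled in napi_poll */
	const void *data;
	struct rxq_info_stub *rxq;
};

/* Process a batch: rxq is set once outside the per-packet loop,
 * mirroring the hunk above that hoists xdp.rxq out of the loop. */
static unsigned int poll_stub(struct rxq_info_stub *ring_rxq,
			      const void **pkts, int n)
{
	struct xdp_buff_stub xdp;
	unsigned int seen = 0;
	int i;

	xdp.rxq = ring_rxq;          /* per-queue context, set once */
	for (i = 0; i < n; i++) {
		xdp.data = pkts[i];  /* per-packet field, each iteration */
		seen += (xdp.rxq == ring_rxq);
	}
	return seen;
}
```

Every per-packet xdp_buff in the batch shares the same read-mostly context, so only the true per-packet fields are rewritten each iteration.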

>   	doorbell_pending = 0;
>   
>   	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
> @@ -748,7 +758,6 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
>   		 * read bytes but not past the end of the frag.
>   		 */
>   		if (xdp_prog) {
> -			struct xdp_buff xdp;
>   			dma_addr_t dma;
>   			void *orig_data;
>   			u32 act;
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> index 1856e279a7e0..bdfb4362b35a 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> @@ -46,6 +46,7 @@
>   #endif
>   #include <linux/cpu_rmap.h>
>   #include <linux/ptp_clock_kernel.h>
> +#include <net/xdp.h>
>   
>   #include <linux/mlx4/device.h>
>   #include <linux/mlx4/qp.h>
> @@ -353,6 +354,7 @@ struct mlx4_en_rx_ring {
>   	unsigned long dropped;
>   	int hwtstamp_rx_filter;
>   	cpumask_var_t affinity_mask;
> +	struct xdp_rxq_info xdp_rxq;
>   };
>   
>   struct mlx4_en_cq {
> @@ -716,7 +718,7 @@ void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev);
>   void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv);
>   int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
>   			   struct mlx4_en_rx_ring **pring,
> -			   u32 size, u16 stride, int node);
> +			   u32 size, u16 stride, int node, int queue_index);
>   void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
>   			     struct mlx4_en_rx_ring **pring,
>   			     u32 size, u16 stride);
> 


* Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype
  2017-12-13 12:27   ` Tariq Toukan
@ 2017-12-13 13:44     ` Jesper Dangaard Brouer
  2017-12-13 23:03       ` Saeed Mahameed
  0 siblings, 1 reply; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 13:44 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, dsahern,
	Matan Barak, Saeed Mahameed, gospo, bjorn.topel, michael.chan,
	brouer

On Wed, 13 Dec 2017 14:27:08 +0200
Tariq Toukan <tariqt@mellanox.com> wrote:

> Hi Jesper,
> Thanks for taking care of the drop RQ.
> 
> In general, mlx5 part looks ok to me.
> Find a few comments below. Mostly pointing out some typos.
> 
> On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
> > The mlx5 driver have a special drop-RQ queue (one per interface) that
> > simply drops all incoming traffic. It helps driver keep other HW
> > objects (flow steering) alive upon down/up operations.  It is
> > temporarily pointed by flow steering objects during the interface
> > setup, and when interface is down. It lacks many fields that are set
> > in a regular RQ (for example its state is never switched to
> > MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explaination).  
> typo: explanation

Fixed 

> > 
> > The XDP RX-queue info API is extended with a queue-type, and mlx5 uses
> > this kind of drop/sink-type (RXQ_TYPE_SINK) for this kind of sink queue.
> > 
> > Driver hook points for xdp_rxq_info:
> >   * init+reg: mlx5e_alloc_rq()
> >   * init+reg: mlx5e_alloc_drop_rq()
> >   * unreg   : mlx5e_free_rq()
> > 
> > Tested on actual hardware with samples/bpf program
> > 
> > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > Cc: Matan Barak <matanb@mellanox.com>
> > Cc: Tariq Toukan <tariqt@mellanox.com>
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> >   drivers/net/ethernet/mellanox/mlx5/core/en.h      |    4 ++++
> >   drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   14 +++++++++++++
> >   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    1 +
> >   include/net/xdp.h                                 |   23 +++++++++++++++++++++
> >   net/core/xdp.c                                    |    6 +++++
> >   5 files changed, 48 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > index c0872b3284cb..fe10a042783b 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > @@ -46,6 +46,7 @@
> >   #include <linux/mlx5/transobj.h>
> >   #include <linux/rhashtable.h>
> >   #include <net/switchdev.h>
> > +#include <net/xdp.h>
> >   #include "wq.h"
> >   #include "mlx5_core.h"
> >   #include "en_stats.h"
> > @@ -568,6 +569,9 @@ struct mlx5e_rq {
> >   	u32                    rqn;
> >   	struct mlx5_core_dev  *mdev;
> >   	struct mlx5_core_mkey  umr_mkey;
> > +
> > +	/* XDP read-mostly */
> > +	struct xdp_rxq_info xdp_rxq;
> >   } ____cacheline_aligned_in_smp;
> >   
> >   struct mlx5e_channel {
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index 0f5c012de52e..ea44b5f25e11 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
> >   	rq->ix      = c->ix;
> >   	rq->mdev    = mdev;
> >   
> > +	/* XDP RX-queue info */
> > +	xdp_rxq_info_init(&rq->xdp_rxq);
> > +	rq->xdp_rxq.dev		= rq->netdev;
> > +	rq->xdp_rxq.queue_index = rq->ix;
> > +	xdp_rxq_info_reg(&rq->xdp_rxq);
> > +  
> You don't set type here. This is ok as long as the following hold:
> 1) RXQ_TYPE_DEFAULT is zero

True

> 2) xdp_rxq is zalloc'ed.

The xdp_rxq memory area is part of the rq allocation, but in
xdp_rxq_info_init() I memset/zero the area explicitly.

 
> >   	rq->xdp_prog = params->xdp_prog ?
> > bpf_prog_inc(params->xdp_prog) : NULL; if (IS_ERR(rq->xdp_prog)) {
> >   		err = PTR_ERR(rq->xdp_prog);
> > @@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel
> > *c, err_rq_wq_destroy:
> >   	if (rq->xdp_prog)
> >   		bpf_prog_put(rq->xdp_prog);
> > +	xdp_rxq_info_unreg(&rq->xdp_rxq);
> >   	mlx5_wq_destroy(&rq->wq_ctrl);
> >   
> >   	return err;
> > @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
> >   	if (rq->xdp_prog)
> >   		bpf_prog_put(rq->xdp_prog);
> >   
> > +	xdp_rxq_info_unreg(&rq->xdp_rxq);
> > +
> >   	switch (rq->wq_type) {
> >   	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
> >   		mlx5e_rq_free_mpwqe_info(rq);
> > @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct
> > mlx5_core_dev *mdev, if (err)
> >   		return err;
> >   
> > +	/* XDP RX-queue info for "Drop-RQ", packets never reach
> > XDP */
> > +	xdp_rxq_info_init(&rq->xdp_rxq);
> > +	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
> > +	xdp_rxq_info_reg(&rq->xdp_rxq);
> > +
> >   	rq->mdev = mdev;
> >   
> >   	return 0;
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index
> > 5b499c7a698f..7b38480811d4 100644 ---
> > a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -812,6 +812,7
> > @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
> > xdp_set_data_meta_invalid(&xdp); xdp.data_end = xdp.data + *len;
> >   	xdp.data_hard_start = va;
> > +	xdp.rxq = &rq->xdp_rxq;
> >   
> >   	act = bpf_prog_run_xdp(prog, &xdp);
> >   	switch (act) {
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index e4acd198fd60..5be560d943e1 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -36,10 +36,33 @@ struct xdp_rxq_info {
> >   	struct net_device *dev;
> >   	u32 queue_index;
> >   	u32 reg_state;
> > +	u32 qtype;
> >   } ____cacheline_aligned; /* perf critical, avoid false-sharing */
> >   
> >   void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
> >   void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
> >   void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
> >   
> > +/**
> > + * DOC: XDP RX-queue type
> > + *
> > + * The XDP RX-queue info can have associated a type.
> > + *
> > + * @RXQ_TYPE_DEFAULT: default no specifik queue type need to be
> > specified  
> 
> typo: specific

Thanks, this is a Danish typo (it's spelled that way in Danish).
 
> > + *
> > + * @RXQ_TYPE_SINK: indicate a fake queue that never reach XDP RX
> > + *	code.  Some drivers have a need to maintain a lower layer
> > + *	RX-queue as a sink queue, while reconfiguring other
> > RX-queues.
> > + */
> > +#define RXQ_TYPE_DEFAULT	0
> > +#define RXQ_TYPE_SINK		1
> > +#define RXQ_TYPE_MAX		RXQ_TYPE_SINK  
> 
> For definitions of incremental numbers, an enum might be best here; you can 
> give them an enum type and use it in xdp_rxq_info->qtype.

I use defines to make the below BUILD_BUG_ON work, as enums do not
get expanded to their values in the C-preprocessor stage.

> > +
> > +static inline
> > +void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
> > +{
> > +	BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
> > +	xdp_rxq->qtype = qtype;
> > +}
> > +
> >   #endif /* __LINUX_NET_XDP_H__ */
> > diff --git a/net/core/xdp.c b/net/core/xdp.c
> > index a9d2dd7b1ede..2a111f5987f6 100644
> > --- a/net/core/xdp.c
> > +++ b/net/core/xdp.c
> > @@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
> >   
> >   void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
> >   {
> > +	if (xdp_rxq->qtype == RXQ_TYPE_SINK)
> > +		goto skip_content_check;
> > +
> > +	/* Check information setup by driver code */
> >   	WARN(!xdp_rxq->dev, "Missing net_device from driver");
> >   	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver"); +
> > +skip_content_check:
> >   	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
> >     xdp_rxq->reg_state = REG_STATE_REGISTRED;  
> typo: REGISTERED (introduced in a previous patch)

Thanks for catching that! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info
  2017-12-13 12:42   ` Tariq Toukan
@ 2017-12-13 14:00     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-13 14:00 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, dsahern, gospo,
	bjorn.topel, michael.chan, brouer

On Wed, 13 Dec 2017 14:42:25 +0200
Tariq Toukan <tariqt@mellanox.com> wrote:

> > @@ -650,6 +658,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
> >   	int cq_ring = cq->ring;
> >   	bool doorbell_pending;
> >   	struct mlx4_cqe *cqe;
> > +	struct xdp_buff xdp;
> >   	int polled = 0;
> >   	int index;
> >   
> > @@ -664,6 +673,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
> >   	/* Protect accesses to: ring->xdp_prog, priv->mac_hash list */
> >   	rcu_read_lock();
> >   	xdp_prog = rcu_dereference(ring->xdp_prog);
> > +	xdp.rxq = &ring->xdp_rxq;  
> 
> You moved this to update it only once, and not per packet. Right?
> This is because all fields of struct xdp_buff used to be specific 
> per-packet and filled in every iteration. Now you introduce a new field 
> which holds a context.
> 
> Well, I still need to go over the infrastructure in your other patches, 
> but at first glance it seems that we can use two separate structs: 
> one for context, and one for per-packet info.

These are two separate structs.  I guess what you are suggesting is
passing them as separate structs to bpf_prog_run_xdp()?

The reason I like/want to have xdp_buff point to xdp_rxq_info is that
I want this information transferred through to the XDP_REDIRECT calls.

The plan (after this patchset) is to implement a kfree_xdp_buff() that
can free/return xdp frames directly to the driver (needed in error cases
in the redirect code) by checking if the rxq->dev has defined an
ndo_xdp_return() function.
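Since kfree_xdp_buff() is not part of this patchset, the following is only a speculative user-space sketch of the dispatch idea described above — ndo_xdp_return and every type here are hypothetical stubs, not existing kernel API:

```c
#include <stdlib.h>

/* Hypothetical stubs: neither kfree_xdp_buff() nor ndo_xdp_return
 * exist in this patchset; this only sketches the described plan. */
struct netdev_stub {
	void (*ndo_xdp_return)(void *frame);
};

struct rxq_info_stub {
	struct netdev_stub *dev;
};

struct xdp_buff_stub {
	void *data_hard_start;
	struct rxq_info_stub *rxq;
};

static int recycled; /* counts frames handed back to the fake driver */

static void fake_ndo_xdp_return(void *frame)
{
	(void)frame;
	recycled++;
}

/* Return the frame to the originating driver when it offers a
 * recycle hook; otherwise fall back to a generic free. */
static void kfree_xdp_buff_stub(struct xdp_buff_stub *xdp)
{
	struct netdev_stub *dev = xdp->rxq ? xdp->rxq->dev : NULL;

	if (dev && dev->ndo_xdp_return)
		dev->ndo_xdp_return(xdp->data_hard_start);
	else
		free(xdp->data_hard_start);
}
```

The point of carrying rxq inside xdp_buff is visible here: the redirect error path can find its way back to the originating device without any extra argument plumbing.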

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info Jesper Dangaard Brouer
@ 2017-12-13 22:50   ` Saeed Mahameed
  2017-12-18  9:47     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Saeed Mahameed @ 2017-12-13 22:50 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, dsahern, gospo, bjorn.topel, michael.chan



On 12/13/2017 3:20 AM, Jesper Dangaard Brouer wrote:
> Hook points for xdp_rxq_info:
>   * init+reg: netif_alloc_rx_queues
>   * unreg   : netif_free_rx_queues
> 
> The net_device have some members (num_rx_queues + real_num_rx_queues)
> and data-area (dev->_rx with struct netdev_rx_queue's) that were
> primarily used for exporting information about RPS (CONFIG_RPS) queues
> to sysfs (CONFIG_SYSFS).
> 
> For generic XDP extend struct netdev_rx_queue with the xdp_rxq_info,
> and remove some of the CONFIG_SYSFS ifdefs.
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   include/linux/netdevice.h |    2 ++
>   net/core/dev.c            |   60 ++++++++++++++++++++++++++++++++++++++-------
>   2 files changed, 52 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index cc4ce7456e38..43595b037872 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -44,6 +44,7 @@
>   #include <net/dcbnl.h>
>   #endif
>   #include <net/netprio_cgroup.h>
> +#include <net/xdp.h>
>   
>   #include <linux/netdev_features.h>
>   #include <linux/neighbour.h>
> @@ -686,6 +687,7 @@ struct netdev_rx_queue {
>   #endif
>   	struct kobject			kobj;
>   	struct net_device		*dev;
> +	struct xdp_rxq_info		xdp_rxq;
>   } ____cacheline_aligned_in_smp;
>   

Instead of duplicating this xdp_rxq_info and having two instances of it
for drivers that do support XDP (the generic one and the driver-internal
xdp_rxq_info), drivers can use the generic netdev_rx_queue.xdp_rxq to 
register their own xdp_rxq_info.
I suggest the following API for drivers to use:

xdp_rxq_info_reg(netdev, rxq_index)
{
	rxqueue = dev->_rx + rxq_index;
	xdp_rxq = rxqueue.xdp_rxq;
	xdp_rxq_info_init(xdp_rxq);
	xdp_rxq.dev = netdev;
	xdp_rxq.queue_index = rxq_index;
}

xdp_rxq_info_unreg(netdev, rxq_index)
{
...
}

This way you avoid having the drivers themselves manage the
xdp_rxq_info structure, and you reduce the duplicated per-driver code
that inits and fills the xdp_rxq_info.

-Saeed.

>   /*
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 6bea8931bb62..44932d6194a2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3873,9 +3873,33 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
>   	return NET_RX_DROP;
>   }
>   
> +static struct netdev_rx_queue *netif_get_rxqueue(struct sk_buff *skb)
> +{
> +	struct net_device *dev = skb->dev;
> +	struct netdev_rx_queue *rxqueue;
> +
> +	rxqueue = dev->_rx;
> +
> +	if (skb_rx_queue_recorded(skb)) {
> +		u16 index = skb_get_rx_queue(skb);
> +
> +		if (unlikely(index >= dev->real_num_rx_queues)) {
> +			WARN_ONCE(dev->real_num_rx_queues > 1,
> +				  "%s received packet on queue %u, but number "
> +				  "of RX queues is %u\n",
> +				  dev->name, index, dev->real_num_rx_queues);
> +
> +			return rxqueue; /* Return first rxqueue */
> +		}
> +		rxqueue += index;
> +	}
> +	return rxqueue;
> +}
> +
>   static u32 netif_receive_generic_xdp(struct sk_buff *skb,
>   				     struct bpf_prog *xdp_prog)
>   {
> +	struct netdev_rx_queue *rxqueue;
>   	u32 metalen, act = XDP_DROP;
>   	struct xdp_buff xdp;
>   	void *orig_data;
> @@ -3919,6 +3943,9 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
>   	xdp.data_hard_start = skb->data - skb_headroom(skb);
>   	orig_data = xdp.data;
>   
> +	rxqueue = netif_get_rxqueue(skb);
> +	xdp.rxq = &rxqueue->xdp_rxq;
> +
>   	act = bpf_prog_run_xdp(xdp_prog, &xdp);
>   
>   	off = xdp.data - orig_data;
> @@ -7538,7 +7565,6 @@ void netif_stacked_transfer_operstate(const struct net_device *rootdev,
>   }
>   EXPORT_SYMBOL(netif_stacked_transfer_operstate);
>   
> -#ifdef CONFIG_SYSFS
>   static int netif_alloc_rx_queues(struct net_device *dev)
>   {
>   	unsigned int i, count = dev->num_rx_queues;
> @@ -7553,11 +7579,31 @@ static int netif_alloc_rx_queues(struct net_device *dev)
>   
>   	dev->_rx = rx;
>   
> -	for (i = 0; i < count; i++)
> +	for (i = 0; i < count; i++) {
>   		rx[i].dev = dev;
> +
> +		/* XDP RX-queue setup */
> +		xdp_rxq_info_init(&rx[i].xdp_rxq);
> +		rx[i].xdp_rxq.dev = dev;
> +		rx[i].xdp_rxq.queue_index = i;
> +		xdp_rxq_info_reg(&rx[i].xdp_rxq);
> +	}
>   	return 0;
>   }
> -#endif
> +
> +static void netif_free_rx_queues(struct net_device *dev)
> +{
> +	unsigned int i, count = dev->num_rx_queues;
> +	struct netdev_rx_queue *rx;
> +
> +	if (!dev->_rx) /* netif_alloc_rx_queues alloc failed */
> +		return;
> +
> +	rx = dev->_rx;
> +
> +	for (i = 0; i < count; i++)
> +		xdp_rxq_info_unreg(&rx[i].xdp_rxq);
> +}
>   
>   static void netdev_init_one_queue(struct net_device *dev,
>   				  struct netdev_queue *queue, void *_unused)
> @@ -8118,12 +8164,10 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>   		return NULL;
>   	}
>   
> -#ifdef CONFIG_SYSFS
>   	if (rxqs < 1) {
>   		pr_err("alloc_netdev: Unable to allocate device with zero RX queues\n");
>   		return NULL;
>   	}
> -#endif
>   
>   	alloc_size = sizeof(struct net_device);
>   	if (sizeof_priv) {
> @@ -8180,12 +8224,10 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>   	if (netif_alloc_netdev_queues(dev))
>   		goto free_all;
>   
> -#ifdef CONFIG_SYSFS
>   	dev->num_rx_queues = rxqs;
>   	dev->real_num_rx_queues = rxqs;
>   	if (netif_alloc_rx_queues(dev))
>   		goto free_all;
> -#endif
>   
>   	strcpy(dev->name, name);
>   	dev->name_assign_type = name_assign_type;
> @@ -8224,9 +8266,7 @@ void free_netdev(struct net_device *dev)
>   
>   	might_sleep();
>   	netif_free_tx_queues(dev);
> -#ifdef CONFIG_SYSFS
> -	kvfree(dev->_rx);
> -#endif
> +	netif_free_rx_queues(dev);
>   
>   	kfree(rcu_dereference_protected(dev->ingress_queue, 1));
>   
> 


* Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype
  2017-12-13 13:44     ` Jesper Dangaard Brouer
@ 2017-12-13 23:03       ` Saeed Mahameed
  2017-12-14  6:46         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Saeed Mahameed @ 2017-12-13 23:03 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Tariq Toukan
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, dsahern,
	Matan Barak, gospo, bjorn.topel, michael.chan



On 12/13/2017 5:44 AM, Jesper Dangaard Brouer wrote:
> On Wed, 13 Dec 2017 14:27:08 +0200
> Tariq Toukan <tariqt@mellanox.com> wrote:
> 
>> Hi Jesper,
>> Thanks for taking care of the drop RQ.
>>
>> In general, mlx5 part looks ok to me.
>> Find a few comments below. Mostly pointing out some typos.
>>
>> On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
>>> The mlx5 driver have a special drop-RQ queue (one per interface) that
>>> simply drops all incoming traffic. It helps driver keep other HW
>>> objects (flow steering) alive upon down/up operations.  It is
>>> temporarily pointed by flow steering objects during the interface
>>> setup, and when interface is down. It lacks many fields that are set
>>> in a regular RQ (for example its state is never switched to
>>> MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explaination).
>> typo: explanation
> 
> Fixed
> 
>>>
>>> The XDP RX-queue info API is extended with a queue-type, and mlx5 uses
>>> this kind of drop/sink-type (RXQ_TYPE_SINK) for this kind of sink queue.
>>>
>>> Driver hook points for xdp_rxq_info:
>>>    * init+reg: mlx5e_alloc_rq()
>>>    * init+reg: mlx5e_alloc_drop_rq()
>>>    * unreg   : mlx5e_free_rq()
>>>
>>> Tested on actual hardware with samples/bpf program
>>>
>>> Cc: Saeed Mahameed <saeedm@mellanox.com>
>>> Cc: Matan Barak <matanb@mellanox.com>
>>> Cc: Tariq Toukan <tariqt@mellanox.com>
>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>> ---
>>>    drivers/net/ethernet/mellanox/mlx5/core/en.h      |    4 ++++
>>>    drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   14 +++++++++++++
>>>    drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    1 +
>>>    include/net/xdp.h                                 |   23 +++++++++++++++++++++
>>>    net/core/xdp.c                                    |    6 +++++
>>>    5 files changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> index c0872b3284cb..fe10a042783b 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> @@ -46,6 +46,7 @@
>>>    #include <linux/mlx5/transobj.h>
>>>    #include <linux/rhashtable.h>
>>>    #include <net/switchdev.h>
>>> +#include <net/xdp.h>
>>>    #include "wq.h"
>>>    #include "mlx5_core.h"
>>>    #include "en_stats.h"
>>> @@ -568,6 +569,9 @@ struct mlx5e_rq {
>>>    	u32                    rqn;
>>>    	struct mlx5_core_dev  *mdev;
>>>    	struct mlx5_core_mkey  umr_mkey;
>>> +
>>> +	/* XDP read-mostly */
>>> +	struct xdp_rxq_info xdp_rxq;
>>>    } ____cacheline_aligned_in_smp;
>>>    
>>>    struct mlx5e_channel {
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index 0f5c012de52e..ea44b5f25e11 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>>>    	rq->ix      = c->ix;
>>>    	rq->mdev    = mdev;
>>>    
>>> +	/* XDP RX-queue info */
>>> +	xdp_rxq_info_init(&rq->xdp_rxq);
>>> +	rq->xdp_rxq.dev		= rq->netdev;
>>> +	rq->xdp_rxq.queue_index = rq->ix;
>>> +	xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +

See my comment below and my comment on patch #12. I believe we can reduce 
the amount of code duplication and have a more generic way to register 
XDP RXQs, without the need for drivers to take care of xdp_rxq_info 
declaration and handling.

>> You don't set type here. This is ok as long as the following hold:
>> 1) RXQ_TYPE_DEFAULT is zero
> 
> True
> 
>> 2) xdp_rxq is zalloc'ed.
> 
> xdp_rxq memory area is part of rq allocation, but in
> xdp_rxq_info_init() I memset/zero the area explicit.
> 
>   
>>>    	rq->xdp_prog = params->xdp_prog ?
>>> bpf_prog_inc(params->xdp_prog) : NULL; if (IS_ERR(rq->xdp_prog)) {
>>>    		err = PTR_ERR(rq->xdp_prog);
>>> @@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel
>>> *c, err_rq_wq_destroy:
>>>    	if (rq->xdp_prog)
>>>    		bpf_prog_put(rq->xdp_prog);
>>> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
>>>    	mlx5_wq_destroy(&rq->wq_ctrl);
>>>    
>>>    	return err;
>>> @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
>>>    	if (rq->xdp_prog)
>>>    		bpf_prog_put(rq->xdp_prog);
>>>    
>>> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
>>> +
>>>    	switch (rq->wq_type) {
>>>    	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
>>>    		mlx5e_rq_free_mpwqe_info(rq);
>>> @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct
>>> mlx5_core_dev *mdev, if (err)
>>>    		return err;
>>>    
>>> +	/* XDP RX-queue info for "Drop-RQ", packets never reach
>>> XDP */
>>> +	xdp_rxq_info_init(&rq->xdp_rxq);
>>> +	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
>>> +	xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +

I don't see why you need this. This RQ is not even assigned to any 
netdev_rxq! It is a pure HW object that drops traffic in HW when the 
netdev is down; it even has no buffers or napi handling. Just ignore its 
existence for the sake of the mlx5 xdp_rxq_info reg/unreg stuff and 
remove RXQ_TYPE_SINK. Bottom line: it is not a real RQ, and XDP has 
nothing to do with it.

>>>    	rq->mdev = mdev;
>>>    
>>>    	return 0;
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index
>>> 5b499c7a698f..7b38480811d4 100644 ---
>>> a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++
>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -812,6 +812,7
>>> @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
>>> xdp_set_data_meta_invalid(&xdp); xdp.data_end = xdp.data + *len;
>>>    	xdp.data_hard_start = va;
>>> +	xdp.rxq = &rq->xdp_rxq;
>>>    
>>>    	act = bpf_prog_run_xdp(prog, &xdp);
>>>    	switch (act) {
>>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>>> index e4acd198fd60..5be560d943e1 100644
>>> --- a/include/net/xdp.h
>>> +++ b/include/net/xdp.h
>>> @@ -36,10 +36,33 @@ struct xdp_rxq_info {
>>>    	struct net_device *dev;
>>>    	u32 queue_index;
>>>    	u32 reg_state;
>>> +	u32 qtype;
>>>    } ____cacheline_aligned; /* perf critical, avoid false-sharing */
>>>    
>>>    void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
>>>    void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
>>>    void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
>>>    
>>> +/**
>>> + * DOC: XDP RX-queue type
>>> + *
>>> + * The XDP RX-queue info can have associated a type.
>>> + *
>>> + * @RXQ_TYPE_DEFAULT: default no specifik queue type need to be
>>> specified
>>
>> typo: specific
> 
> Thanks, this is a Danish typo (it's spelled that way in Danish).
>   
>>> + *
>>> + * @RXQ_TYPE_SINK: indicate a fake queue that never reach XDP RX
>>> + *	code.  Some drivers have a need to maintain a lower layer
>>> + *	RX-queue as a sink queue, while reconfiguring other
>>> RX-queues.
>>> + */
>>> +#define RXQ_TYPE_DEFAULT	0
>>> +#define RXQ_TYPE_SINK		1
>>> +#define RXQ_TYPE_MAX		RXQ_TYPE_SINK
>>
>> Definitions of incremental numbers, enum might be best here, you can
>> give them some enum type and use it in xdp_rxq_info->qtype.
> 
> I use defines to make the below BUILD_BUG_ON work, as enums do not
> get expanded to their values in the C-preprocessor stage.
> 
>>> +
>>> +static inline
>>> +void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
>>> +{
>>> +	BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
>>> +	xdp_rxq->qtype = qtype;
>>> +}
>>> +
>>>    #endif /* __LINUX_NET_XDP_H__ */
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index a9d2dd7b1ede..2a111f5987f6 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
>>>    
>>>    void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
>>>    {
>>> +	if (xdp_rxq->qtype == RXQ_TYPE_SINK)
>>> +		goto skip_content_check;
>>> +
>>> +	/* Check information setup by driver code */
>>>    	WARN(!xdp_rxq->dev, "Missing net_device from driver");
>>>    	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver"); +
>>> +skip_content_check:
>>>    	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
>>>      xdp_rxq->reg_state = REG_STATE_REGISTRED;
>> typo: REGISTERED (introduced in a previous patch)
> 
> Thanks for catching that! :-)
> 


* Re: [bpf-next V1-RFC PATCH 08/14] nfp: setup xdp_rxq_info
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 08/14] nfp: " Jesper Dangaard Brouer
@ 2017-12-14  2:34   ` Jakub Kicinski
  2017-12-18 20:25     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Jakub Kicinski @ 2017-12-14  2:34 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Daniel Borkmann, Alexei Starovoitov, dsahern, oss-drivers,
	Simon Horman, netdev, gospo, bjorn.topel, michael.chan

On Wed, 13 Dec 2017 12:20:01 +0100, Jesper Dangaard Brouer wrote:
> Driver hook points for xdp_rxq_info:
>  * init+reg: nfp_net_rx_ring_alloc
>  * unreg   : nfp_net_rx_ring_free
> 
> In struct nfp_net_rx_ring, member @size was moved into a hole on 64-bit.
> Thus, the size remains the same after adding member @xdp_rxq.
> 
> Cc: oss-drivers@netronome.com
> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
> Cc: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> index 3801c52098d5..0e564cfabe7e 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> @@ -47,6 +47,7 @@
>  #include <linux/netdevice.h>
>  #include <linux/pci.h>
>  #include <linux/io-64-nonatomic-hi-lo.h>
> +#include <net/xdp.h>
>  
>  #include "nfp_net_ctrl.h"
>  
> @@ -350,6 +351,7 @@ struct nfp_net_rx_buf {
>   * @rxds:       Virtual address of FL/RX ring in host memory
>   * @dma:        DMA address of the FL/RX ring
>   * @size:       Size, in bytes, of the FL/RX ring (needed to free)
> + * @xdp_rxq:    RX-ring info avail for XDP
>   */
>  struct nfp_net_rx_ring {
>  	struct nfp_net_r_vector *r_vec;
> @@ -361,13 +363,14 @@ struct nfp_net_rx_ring {
>  	u32 idx;
>  
>  	int fl_qcidx;
> +	unsigned int size;
>  	u8 __iomem *qcp_fl;
>  
>  	struct nfp_net_rx_buf *rxbufs;
>  	struct nfp_net_rx_desc *rxds;
>  
>  	dma_addr_t dma;
> -	unsigned int size;
> +	struct xdp_rxq_info xdp_rxq;
>  } ____cacheline_aligned;

The @size member is not in the hole on purpose.  IIRC all the members
up to @dma are in the first cacheline.  All things which are not needed
on the fast path are after @dma.  IOW @size is not used on the fast
path and the hole is for fast path stuff :)

>  /**
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> index ad3e9f6a61e5..6474aecd0451 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> @@ -2252,6 +2253,7 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
>  	struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
>  	struct nfp_net_dp *dp = &r_vec->nfp_net->dp;
>  
> +	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
>  	kfree(rx_ring->rxbufs);
>  
>  	if (rx_ring->rxds)
> @@ -2277,6 +2279,12 @@ nfp_net_rx_ring_alloc(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring)
>  {
>  	int sz;
>  
> +	/* XDP RX-queue info */
> +	xdp_rxq_info_init(&rx_ring->xdp_rxq);
> +	rx_ring->xdp_rxq.dev = dp->netdev;
> +	rx_ring->xdp_rxq.queue_index = rx_ring->idx;
> +	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
> +
>  	rx_ring->cnt = dp->rxd_cnt;
>  	rx_ring->size = sizeof(*rx_ring->rxds) * rx_ring->cnt;
>  	rx_ring->rxds = dma_zalloc_coherent(dp->dev, rx_ring->size,

The nfp driver implements the prepare/commit for reallocating rings.  
I don't think it matters now, but there can be 2 sets of rings with the
same ID allocated during reconfiguration (see nfp_net_ring_reconfig()).
Maybe place the register/unregister in nfp_net_open_stack() and
nfp_net_close_stack() respectively?

Perhaps that won't be necessary, only cleaner :)  I'm not sure how the
redirect between drivers is intended to work WRT freeing rings and
unloading drivers while packets fly...


* Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 01/14] xdp: base API for " Jesper Dangaard Brouer
@ 2017-12-14  2:34   ` David Ahern
  2017-12-18 10:55     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: David Ahern @ 2017-12-14  2:34 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: netdev, gospo, bjorn.topel, michael.chan

On 12/13/17 4:19 AM, Jesper Dangaard Brouer wrote:
> +
> +void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
> +{
> +	xdp_rxq->reg_state = REG_STATE_UNREGISTRED;
> +}
> +EXPORT_SYMBOL_GPL(xdp_rxq_info_unreg);
> +
> +void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
> +{
> +	if (xdp_rxq->reg_state == REG_STATE_REGISTRED) {
> +		WARN(1, "Missing unregister, handled but fix driver\n");
> +		xdp_rxq_info_unreg(xdp_rxq);
> +	}
> +	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
> +	xdp_rxq->queue_index = U32_MAX;
> +	xdp_rxq->reg_state = REG_STATE_NEW;
> +}
> +EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
> +
> +void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
> +{
> +	WARN(!xdp_rxq->dev, "Missing net_device from driver");
> +	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
> +	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
> +	xdp_rxq->reg_state = REG_STATE_REGISTRED;
> +}
> +EXPORT_SYMBOL_GPL(xdp_rxq_info_reg);
> 

Rather than WARN()s, why not make the _reg and _init functions return an
int that indicates an error? For example, you don't want to continue if
the dev is expected but missing.


* Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype
  2017-12-13 23:03       ` Saeed Mahameed
@ 2017-12-14  6:46         ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-14  6:46 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Tariq Toukan, Daniel Borkmann, Alexei Starovoitov, netdev,
	dsahern, Matan Barak, gospo, bjorn.topel, michael.chan, brouer


On Wed, 13 Dec 2017 15:03:33 -0800 Saeed Mahameed <saeedm@mellanox.com> wrote:

> >>> @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
> >>>    	if (rq->xdp_prog)
> >>>    		bpf_prog_put(rq->xdp_prog);
> >>>    
> >>> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
> >>> +
> >>>    	switch (rq->wq_type) {
> >>>    	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
> >>>    		mlx5e_rq_free_mpwqe_info(rq);
> >>> @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct
> >>> mlx5_core_dev *mdev, if (err)
> >>>    		return err;
> >>>    
> >>> +	/* XDP RX-queue info for "Drop-RQ", packets never reach
> >>> XDP */
> >>> +	xdp_rxq_info_init(&rq->xdp_rxq);
> >>> +	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
> >>> +	xdp_rxq_info_reg(&rq->xdp_rxq);
> >>> +  
> 
> I don't see why you need this. This RQ is not even assigned to any 
> netdev_rxq! It is a pure HW object that drops traffic in HW when the 
> netdev is down; it even has no buffers or napi handling. Just ignore its 
> existence for the sake of the mlx5 xdp_rxq_info reg/unreg stuff and 
> remove RXQ_TYPE_SINK. Bottom line: it is not a real RQ, and XDP has 
> nothing to do with it.

I need it here, because the take-down/free code-path is the same for
these two types of RQ's.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info
  2017-12-13 22:50   ` Saeed Mahameed
@ 2017-12-18  9:47     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18  9:47 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, dsahern, gospo,
	bjorn.topel, michael.chan, brouer

On Wed, 13 Dec 2017 14:50:07 -0800
Saeed Mahameed <saeedm@mellanox.com> wrote:

> On 12/13/2017 3:20 AM, Jesper Dangaard Brouer wrote:
> > Hook points for xdp_rxq_info:
> >   * init+reg: netif_alloc_rx_queues
> >   * unreg   : netif_free_rx_queues
> > 
> > The net_device have some members (num_rx_queues + real_num_rx_queues)
> > and data-area (dev->_rx with struct netdev_rx_queue's) that were
> > primarily used for exporting information about RPS (CONFIG_RPS) queues
> > to sysfs (CONFIG_SYSFS).
> > 
> > For generic XDP extend struct netdev_rx_queue with the xdp_rxq_info,
> > and remove some of the CONFIG_SYSFS ifdefs.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> >   include/linux/netdevice.h |    2 ++
> >   net/core/dev.c            |   60 ++++++++++++++++++++++++++++++++++++++-------
> >   2 files changed, 52 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index cc4ce7456e38..43595b037872 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -44,6 +44,7 @@
> >   #include <net/dcbnl.h>
> >   #endif
> >   #include <net/netprio_cgroup.h>
> > +#include <net/xdp.h>
> >   
> >   #include <linux/netdev_features.h>
> >   #include <linux/neighbour.h>
> > @@ -686,6 +687,7 @@ struct netdev_rx_queue {
> >   #endif
> >   	struct kobject			kobj;
> >   	struct net_device		*dev;
> > +	struct xdp_rxq_info		xdp_rxq;
> >   } ____cacheline_aligned_in_smp;
> >     
> 
> Instead of duplicating this xdp_rxq_info and having 2 instances of it for 
> drivers that do support XDP (the generic one and the driver-internal 
> xdp_rxq_info), drivers can use the generic netdev_rx_queue.xdp_rxq to 
> register their own xdp_rxq_info.
> I suggest the following API for drivers to use:
> 
> xdp_rxq_info_reg(netdev, rxq_index)
> {
> 	rxqueue = dev->_rx + rxq_index;
> 	xdp_rxq = &rxqueue->xdp_rxq;
> 	xdp_rxq_info_init(xdp_rxq);
> 	xdp_rxq->dev = netdev;
> 	xdp_rxq->queue_index = rxq_index;
> }
> 
> xdp_rxq_info_unreg(netdev, rxq_index)
> {
> ...
> }
> 
> This way you avoid having drivers manage the xdp_rxq_info structure 
> themselves, and reduce the duplicated init/fill code in each driver.

Having the xdp_rxq_info struct memory associated with the net_device,
and not the driver's RX-rings, makes it harder for the drivers to use
this API.  Drivers (e.g. i40e), to minimize "downtime" when e.g.
changing the number of queues (ethtool set_channels), allocate and set
up new (RX+TX+XDP_TX)-rings _before_ taking down the current RX-rings.
Thus, you have the problem of a driver wanting to register two
xdp_rxq_info's at the same time, before unregistering the old ones.

Another issue I've seen is that drivers do funny tricks, and you
cannot always assume that rxq_index is valid and bind directly
to dev->_rx + rxq_index.  E.g. in the i40e driver they have
vsi->num_queue_pairs they iterate over, and one of the
vsi->rx_rings[i] is a special flow-director (FDIR) ring that comes at
the end of the rx_rings[] array but internally has queue_index zero.

I was also hoping to unify/integrate the generic-XDP REDIRECT
code path, by assigning the net_device-stored xdp_rxq_info another
qtype, "SKB-MODE-TYPE", allowing us to share more of the native-XDP
redirect code path with SKB-XDP frames from generic-XDP.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [Intel-wired-lan] [bpf-next V1-RFC PATCH 03/14] i40e: setup xdp_rxq_info
  2017-12-13 11:19   ` [Intel-wired-lan] " Jesper Dangaard Brouer
@ 2017-12-18 10:52     ` Björn Töpel
  -1 siblings, 0 replies; 39+ messages in thread
From: Björn Töpel @ 2017-12-18 10:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Daniel Borkmann, Alexei Starovoitov, Netdev, intel-wired-lan,
	dsahern, gospo, Björn Töpel, michael.chan, Karlsson,
	Magnus

2017-12-13 12:19 GMT+01:00 Jesper Dangaard Brouer <brouer@redhat.com>:
> The i40e driver has a special "FDIR" RX-ring (I40E_VSI_FDIR) which is
> a sideband channel for configuring/updating the flow director tables.
> This (i40e_vsi_)type does not invoke XDP-ebpf code. Thus, mark this
> type as an XDP RX-queue type RXQ_TYPE_SINK.
>
> Driver hook points for xdp_rxq_info:
>  * init+reg: i40e_setup_rx_descriptors (via i40e_vsi_setup_rx_resources)
>  * unreg   : i40e_free_rx_resources    (via i40e_vsi_free_rx_resources)
>

Firstly, thanks for working on this, Jesper. This is very valuable to
the AF_PACKET V4^W^WAF_XDP work.

> Tested on actual hardware with samples/bpf program.
>
> Cc: intel-wired-lan@lists.osuosl.org
> Cc: Björn Töpel <bjorn.topel@intel.com>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c |   18 +++++++++++++++++-
>  drivers/net/ethernet/intel/i40e/i40e_txrx.h |    3 +++
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index 4566d66ffc7c..d1bfc4232ca9 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -27,6 +27,7 @@
>  #include <linux/prefetch.h>
>  #include <net/busy_poll.h>
>  #include <linux/bpf_trace.h>
> +#include <net/xdp.h>
>  #include "i40e.h"
>  #include "i40e_trace.h"
>  #include "i40e_prototype.h"
> @@ -1236,6 +1237,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
>  void i40e_free_rx_resources(struct i40e_ring *rx_ring)
>  {
>         i40e_clean_rx_ring(rx_ring);
> +       xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
>         rx_ring->xdp_prog = NULL;
>         kfree(rx_ring->rx_bi);
>         rx_ring->rx_bi = NULL;
> @@ -1283,6 +1285,18 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
>         rx_ring->next_to_clean = 0;
>         rx_ring->next_to_use = 0;
>
> +       /* XDP RX-queue info */
> +       xdp_rxq_info_init(&rx_ring->xdp_rxq);
> +
> +       /* Flow director side channel does not invoke XDP/bpf */
> +       if (rx_ring->vsi->type == I40E_VSI_FDIR)
> +               xdp_rxq_info_type(&rx_ring->xdp_rxq, RXQ_TYPE_SINK);

For me, it doesn't make sense to expose the FD Rx ring outside the
scope of i40e driver. Check against VSI_MAIN in setup/cleanup so the
FD ring is not exposed outside the driver. Maybe you can get rid of
the RXQ_TYPE_SINK as well then.

> +       else
> +               rx_ring->xdp_rxq.dev = rx_ring->netdev;
> +
> +       rx_ring->xdp_rxq.queue_index = rx_ring->queue_index;
> +       xdp_rxq_info_reg(&rx_ring->xdp_rxq);
> +
>         rx_ring->xdp_prog = rx_ring->vsi->xdp_prog;
>
>         return 0;
> @@ -2068,11 +2082,13 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
>         struct sk_buff *skb = rx_ring->skb;
>         u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
>         bool failure = false, xdp_xmit = false;
> +       struct xdp_buff xdp;
> +
> +       xdp.rxq = &rx_ring->xdp_rxq;
>
>         while (likely(total_rx_packets < (unsigned int)budget)) {
>                 struct i40e_rx_buffer *rx_buffer;
>                 union i40e_rx_desc *rx_desc;
> -               struct xdp_buff xdp;
>                 unsigned int size;
>                 u16 vlan_tag;
>                 u8 rx_ptype;
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> index fbae1182e2ea..2d08760fc4ce 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> @@ -27,6 +27,8 @@
>  #ifndef _I40E_TXRX_H_
>  #define _I40E_TXRX_H_
>
> +#include <net/xdp.h>
> +
>  /* Interrupt Throttling and Rate Limiting Goodies */
>
>  #define I40E_MAX_ITR               0x0FF0  /* reg uses 2 usec resolution */
> @@ -428,6 +430,7 @@ struct i40e_ring {
>                                          */
>
>         struct i40e_channel *ch;
> +       struct xdp_rxq_info xdp_rxq;
>  } ____cacheline_internodealigned_in_smp;
>
>  static inline bool ring_uses_build_skb(struct i40e_ring *ring)
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


* Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-14  2:34   ` David Ahern
@ 2017-12-18 10:55     ` Jesper Dangaard Brouer
  2017-12-18 13:23       ` David Ahern
  2017-12-21 16:59       ` Jesper Dangaard Brouer
  0 siblings, 2 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18 10:55 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, gospo, bjorn.topel,
	michael.chan, brouer

On Wed, 13 Dec 2017 19:34:40 -0700
David Ahern <dsahern@gmail.com> wrote:

> On 12/13/17 4:19 AM, Jesper Dangaard Brouer wrote:
> > +
> > +void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
> > +{
> > +	xdp_rxq->reg_state = REG_STATE_UNREGISTRED;
> > +}
> > +EXPORT_SYMBOL_GPL(xdp_rxq_info_unreg);
> > +
> > +void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
> > +{
> > +	if (xdp_rxq->reg_state == REG_STATE_REGISTRED) {
> > +		WARN(1, "Missing unregister, handled but fix driver\n");
> > +		xdp_rxq_info_unreg(xdp_rxq);
> > +	}
> > +	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
> > +	xdp_rxq->queue_index = U32_MAX;
> > +	xdp_rxq->reg_state = REG_STATE_NEW;
> > +}
> > +EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
> > +
> > +void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
> > +{
> > +	WARN(!xdp_rxq->dev, "Missing net_device from driver");
> > +	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
> > +	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
> > +	xdp_rxq->reg_state = REG_STATE_REGISTRED;
> > +}
> > +EXPORT_SYMBOL_GPL(xdp_rxq_info_reg);
> >   
> 
> Rather than WARN()'s why not make the _reg and _init functions return an
> int that indicates an error? For example you don't want to continue if
> the dev is expected but missing.

Handling return-errors in the drivers complicates the driver code, as it
involves unraveling and deallocating other RX-rings etc. (that were
already allocated) if the reg fails. (Also notice the next patch will
allow dev == NULL, if the right qtype is set).

I'm not completely rejecting your idea, as this is a good optimization
trick: moving validation checks to setup-time allows fewer validation
checks at runtime.  I sort of actually already did this, as I allow bpf
to deref dev without a NULL check.  I would argue this is good enough,
as we will crash in a predictable way, and the above WARN will point to
which driver violated the API.

If people think it is valuable, I can change this API to return an err?

I guess it would be more future-proof to do this, as we (Bjørn,
Michael, Andy) want to extend this to implement an XDP frame/mem return
code-path.  And the register call will likely have to allocate some
resource that could fail, which needs to be handled...


If we do this, we might as well (slab) allocate the xdp_rxq_info
structure to reduce the bloat in the drivers' RX-rings to a single
pointer (and a pointer to xdp_rxq_info is what xdp_buff.rxq needs).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Intel-wired-lan] [bpf-next V1-RFC PATCH 03/14] i40e: setup xdp_rxq_info
  2017-12-18 10:52     ` Björn Töpel
@ 2017-12-18 13:05       ` Jesper Dangaard Brouer
  -1 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18 13:05 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Daniel Borkmann, Alexei Starovoitov, Netdev, intel-wired-lan,
	dsahern, gospo, Björn Töpel, michael.chan, Karlsson,
	Magnus, brouer

On Mon, 18 Dec 2017 11:52:08 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> > +       /* Flow director side channel does not invoke XDP/bpf */
> > +       if (rx_ring->vsi->type == I40E_VSI_FDIR)
> > +               xdp_rxq_info_type(&rx_ring->xdp_rxq, RXQ_TYPE_SINK);  
> 
> For me, it doesn't make sense to expose the FD Rx ring outside the
> scope of i40e driver. Check against VSI_MAIN in setup/cleanup so the
> FD ring is not exposed outside the driver. Maybe you can get rid of
> the RXQ_TYPE_SINK as well then.

Okay, I took this advice and changed the patch to check against
I40E_VSI_MAIN instead.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Intel-wired-lan] [bpf-next V1-RFC PATCH 03/14] i40e: setup xdp_rxq_info
@ 2017-12-18 13:05       ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18 13:05 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, 18 Dec 2017 11:52:08 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> > +       /* Flow director side channel does not invoke XDP/bpf */
> > +       if (rx_ring->vsi->type == I40E_VSI_FDIR)
> > +               xdp_rxq_info_type(&rx_ring->xdp_rxq, RXQ_TYPE_SINK);  
> 
> For me, it doesn't make sense to expose the FD Rx ring outside the
> scope of i40e driver. Check against VSI_MAIN in setup/cleanup so the
> FD ring is not exposed outside the driver. Maybe you can get rid of
> the RXQ_TYPE_SINK as well then.

Okay, I took this advice and changed the patch to check against
I40E_VSI_MAIN instead.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-18 10:55     ` Jesper Dangaard Brouer
@ 2017-12-18 13:23       ` David Ahern
  2017-12-18 15:52         ` Jesper Dangaard Brouer
  2017-12-21 16:59       ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 39+ messages in thread
From: David Ahern @ 2017-12-18 13:23 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, gospo, bjorn.topel,
	michael.chan

On 12/18/17 3:55 AM, Jesper Dangaard Brouer wrote:
> 
> Handling return-errors in the drivers complicated the driver code, as it
> involves unraveling and deallocating other RX-rings etc (that were
> already allocated) if the reg fails. (Also notice next patch will allow
> dev == NULL, if right ptype is set).
> 
> I'm not completely rejecting you idea, as this is a good optimization
> trick, which is to move validation checks to setup-time, thus allowing
> less validation checks at runtime.  I sort-of actually already did
> this, as I allow bpf to deref dev without NULL check.  I would argue
> this is good enough, as we will crash in a predictable way, as above
> WARN will point to which driver violated the API.
> 
> If people think it is valuable I can change this API to return an err?

Saeed's suggested API in a comment on patch 12 also removes most of the
WARN_ONs as it sets the device and index:

xdp_rxq_info_reg(netdev, rxq_index)
{
    rxqueue = netdev->_rx + rxq_index;
    xdp_rxq = rxqueue.xdp_rxq;
    xdp_rxq_info_init(xdp_rxq);
    xdp_rxq.dev = netdev;
    xdp_rxq.queue_index = rxq_index;
}

xdp_rxq_info_unreg(netdev, rxq_index)
{
...
}

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-18 13:23       ` David Ahern
@ 2017-12-18 15:52         ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18 15:52 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, gospo, bjorn.topel,
	michael.chan, brouer

On Mon, 18 Dec 2017 06:23:40 -0700
David Ahern <dsahern@gmail.com> wrote:

> On 12/18/17 3:55 AM, Jesper Dangaard Brouer wrote:
> > 
> > Handling return-errors in the drivers complicates the driver code, as it
> > involves unwinding and deallocating other RX-rings etc. (that were
> > already allocated) if the reg fails. (Also notice the next patch will
> > allow dev == NULL, if the right ptype is set).
> > 
> > I'm not completely rejecting your idea, as this is a good optimization
> > trick: move validation checks to setup-time, thus allowing fewer
> > validation checks at runtime.  I sort-of already did this, as I allow
> > bpf to deref dev without a NULL check.  I would argue this is good
> > enough, as we will crash in a predictable way, and the above WARN will
> > point to which driver violated the API.
> > 
> > If people think it is valuable I can change this API to return an err?  
> 
> Saeed's suggested API in a comment on patch 12 also removes most of the
> WARN_ONs as it sets the device and index:
> 
> xdp_rxq_info_reg(netdev, rxq_index)
> {
>     rxqueue = netdev->_rx + rxq_index;
>     xdp_rxq = rxqueue.xdp_rxq;
>         xdp_rxq_info_init(xdp_rxq);
>     xdp_rxq.dev = netdev;
>     xdp_rxq.queue_index = rxq_index;
> }
> 
> xdp_rxq_info_unreg(netdev, rxq_index)
> {
> ...
> }

No, we still need the other WARN_ON's.

I don't understand why you think the above API is better.  In case
netdev==NULL the system will simply crash on the deref of netdev.  That
happened for both the i40e and mlx5 drivers when I was adding this.
The WARN_ON helped me quickly identify the issue, and in both drivers it
was a non-critical error, as these queues are not used by XDP.  IMHO a
better experience for the driver developer.

IMHO WARN_ON's are a good thing.  For example the:

 if (xdp_rxq->reg_state == REG_STATE_REGISTERED)
   WARN(1, "Missing unregister, handled but fix driver\n");

just helped me identify a bug in the i40e driver.  It turns out that
changing the RX-ring queue size via ethtool -G|--set-ring (_not_ the
number of RX-rings, but frames per RX-ring) makes i40e_set_ringparam()
allocate some temp RX-rings and copy struct contents around, causing
this strange issue.  It will not crash with the current simple struct
contents, but later it would cause a hard-to-debug issue.  I'm happy I
could catch this now, instead of later as a strange crash.

The WARN's are there to assist driver developers when using this API
in their drivers (better than a crash/BUG_ON, as they don't have to dig
up their serial console cable).  For me it is also part of the
documentation, as it documents the API assumptions/assertions together
with a small text message.
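As a toy illustration of how that reg_state check catches a missing
unregister (e.g. the i40e set_ringparam case above), here is a
userspace model.  The WARN() is replaced by a flag so it can be
exercised outside the kernel -- this is illustrative only, not the
kernel code:

```c
/* Toy model of the reg_state sanity check: re-initializing a
 * still-registered xdp_rxq_info (i.e. a missing unregister) trips a
 * warning instead of silently carrying stale state. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum reg_state { REG_STATE_NEW = 0, REG_STATE_REGISTERED, REG_STATE_UNREGISTERED };

struct xdp_rxq_info {
	uint32_t queue_index;
	enum reg_state reg_state;
};

static bool warned_missing_unreg;	/* stands in for the kernel WARN() */

static void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
{
	xdp_rxq->reg_state = REG_STATE_UNREGISTERED;
}

static void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
{
	if (xdp_rxq->reg_state == REG_STATE_REGISTERED) {
		/* WARN(1, "Missing unregister, handled but fix driver\n") */
		warned_missing_unreg = true;
		xdp_rxq_info_unreg(xdp_rxq);
	}
	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
	xdp_rxq->queue_index = UINT32_MAX;
	xdp_rxq->reg_state = REG_STATE_NEW;
}

static void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
{
	xdp_rxq->reg_state = REG_STATE_REGISTERED;
}
```

A driver copying ring structs around (as i40e_set_ringparam() does)
would hit the warning path on the next init, pointing straight at the
offending call site.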

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 08/14] nfp: setup xdp_rxq_info
  2017-12-14  2:34   ` Jakub Kicinski
@ 2017-12-18 20:25     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-18 20:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Daniel Borkmann, Alexei Starovoitov, dsahern, oss-drivers,
	Simon Horman, netdev, gospo, bjorn.topel, michael.chan, brouer

On Wed, 13 Dec 2017 18:34:27 -0800
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Wed, 13 Dec 2017 12:20:01 +0100, Jesper Dangaard Brouer wrote:
> > Driver hook points for xdp_rxq_info:
> >  * init+reg: nfp_net_rx_ring_alloc
> >  * unreg   : nfp_net_rx_ring_free
> > 
> > In struct nfp_net_rx_ring moved member @size into a hole on 64-bit.
> > Thus, the size remains the same after adding member @xdp_rxq.
> > 
> > Cc: oss-drivers@netronome.com
> > Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Cc: Simon Horman <simon.horman@netronome.com>
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> 
> > diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> > index 3801c52098d5..0e564cfabe7e 100644
> > --- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
> > +++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> > @@ -47,6 +47,7 @@
> >  #include <linux/netdevice.h>
> >  #include <linux/pci.h>
> >  #include <linux/io-64-nonatomic-hi-lo.h>
> > +#include <net/xdp.h>
> >  
> >  #include "nfp_net_ctrl.h"
> >  
> > @@ -350,6 +351,7 @@ struct nfp_net_rx_buf {
> >   * @rxds:       Virtual address of FL/RX ring in host memory
> >   * @dma:        DMA address of the FL/RX ring
> >   * @size:       Size, in bytes, of the FL/RX ring (needed to free)
> > + * @xdp_rxq:    RX-ring info avail for XDP
> >   */
> >  struct nfp_net_rx_ring {
> >  	struct nfp_net_r_vector *r_vec;
> > @@ -361,13 +363,14 @@ struct nfp_net_rx_ring {
> >  	u32 idx;
> >  
> >  	int fl_qcidx;
> > +	unsigned int size;
> >  	u8 __iomem *qcp_fl;
> >  
> >  	struct nfp_net_rx_buf *rxbufs;
> >  	struct nfp_net_rx_desc *rxds;
> >  
> >  	dma_addr_t dma;
> > -	unsigned int size;
> > +	struct xdp_rxq_info xdp_rxq;
> >  } ____cacheline_aligned;  
> 
> The @size member is not in the hole on purpose.  IIRC all the members
> up to @dma are in the first cacheline.  All things which are not
> needed on the fast path are after @dma.  IOW @size is not used on the
> fast path and the hole is for fast path stuff :)

Yes, I did notice @size was not used on the fast path, but it didn't
hurt to move it up.  I was just excited to see I could add this without
increasing the rx_ring struct size.

I'm more and more considering Ahern's suggestion of returning an err,
and if I do so, I also want to do a proper allocation of xdp_rxq_info,
which means this will be converted into a pointer instead (and thus
have a much smaller effect on the rx_ring size).


> >  /**
> > diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > index ad3e9f6a61e5..6474aecd0451 100644
> > --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > @@ -2252,6 +2253,7 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
> >  	struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
> >  	struct nfp_net_dp *dp = &r_vec->nfp_net->dp;
> >  
> > +	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
> >  	kfree(rx_ring->rxbufs);
> >  
> >  	if (rx_ring->rxds)
> > @@ -2277,6 +2279,12 @@ nfp_net_rx_ring_alloc(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring)
> >  {
> >  	int sz;
> >  
> > +	/* XDP RX-queue info */
> > +	xdp_rxq_info_init(&rx_ring->xdp_rxq);
> > +	rx_ring->xdp_rxq.dev = dp->netdev;
> > +	rx_ring->xdp_rxq.queue_index = rx_ring->idx;
> > +	xdp_rxq_info_reg(&rx_ring->xdp_rxq);
> > +
> >  	rx_ring->cnt = dp->rxd_cnt;
> >  	rx_ring->size = sizeof(*rx_ring->rxds) * rx_ring->cnt;
> >  	rx_ring->rxds = dma_zalloc_coherent(dp->dev, rx_ring->size,  
> 
> The nfp driver implements the prepare/commit for reallocating rings.  
> I don't think it matters now, but there can be 2 sets of rings with the
> same ID allocated during reconfiguration (see nfp_net_ring_reconfig()).
> Maybe place the register/unregister in nfp_net_open_stack() and
> nfp_net_close_stack() respectively?

Going over your driver code again, I do think I handle this
correctly in nfp_net_rx_ring_free() / nfp_net_rx_ring_alloc().

Your nfp_net_open_stack() / nfp_net_close_stack() calls don't support
failing, which conflicts with Ahern's suggestion.

As I explained in another reply, I do want to support having 2 sets
of rings during reconfiguration, as many drivers do this.  This is also
the reason I cannot use the net_device->_rx[] area.

> Perhaps that won't be necessary, only cleaner :)  I'm not sure how is
> the redirect between drivers intended to work WRT freeing rings and
> unloading drivers while packets fly...

I do have a plan for handling in-flight packets when a driver is being
unloaded... that is the reason for having the unreg call.  (Sorry, I
should have included you in that offlist discussion.)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 10/14] tun: setup xdp_rxq_info
  2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 10/14] tun: " Jesper Dangaard Brouer
@ 2017-12-20  7:48   ` Jason Wang
  2017-12-21 15:42     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 39+ messages in thread
From: Jason Wang @ 2017-12-20  7:48 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: Willem de Bruijn, Michael S. Tsirkin, netdev, dsahern, gospo,
	bjorn.topel, michael.chan



On 2017年12月13日 19:20, Jesper Dangaard Brouer wrote:
> Driver hook points for xdp_rxq_info:
>   * init+reg: tun_attach
>   * unreg   : __tun_detach
>
> I've done some manual testing of this tun driver, but I would
> appreciate good review and someone else running their use-case tests,
> as I'm not 100% sure I understand the tfile->detached semantics.
>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   drivers/net/tun.c |   25 ++++++++++++++++++++++++-
>   1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index e367d6310353..f1df08c2c541 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -180,6 +180,7 @@ struct tun_file {
>   	struct list_head next;
>   	struct tun_struct *detached;
>   	struct skb_array tx_array;
> +	struct xdp_rxq_info xdp_rxq;
>   };
>   
>   struct tun_flow_entry {
> @@ -687,8 +688,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>   			    tun->dev->reg_state == NETREG_REGISTERED)
>   				unregister_netdevice(tun->dev);
>   		}
> -		if (tun)
> +		if (tun) {
>   			skb_array_cleanup(&tfile->tx_array);
> +			xdp_rxq_info_unreg(&tfile->xdp_rxq);
> +		}
>   		sock_put(&tfile->sk);
>   	}
>   }
> @@ -728,11 +731,15 @@ static void tun_detach_all(struct net_device *dev)
>   		tun_napi_del(tun, tfile);
>   		/* Drop read queue */
>   		tun_queue_purge(tfile);
> +		skb_array_cleanup(&tfile->tx_array);

Looks like this is unnecessary; the skb array will be cleaned up only
when the fd is closed, otherwise there will be a double free.

Other looks good.

Thanks

> +		xdp_rxq_info_unreg(&tfile->xdp_rxq);
>   		sock_put(&tfile->sk);
>   	}
>   	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
>   		tun_enable_queue(tfile);
>   		tun_queue_purge(tfile);
> +		skb_array_cleanup(&tfile->tx_array);
> +		xdp_rxq_info_unreg(&tfile->xdp_rxq);
>   		sock_put(&tfile->sk);
>   	}
>   	BUG_ON(tun->numdisabled != 0);
> @@ -784,6 +791,21 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
>   
>   	tfile->queue_index = tun->numqueues;
>   	tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
> +
> +	if (tfile->detached) {
> +		/* Re-attach detached tfile, updating XDP queue_index */
> +		WARN_ON(!xdp_rxq_info_is_reg(&tfile->xdp_rxq));
> +
> +		if (tfile->xdp_rxq.queue_index    != tfile->queue_index)
> +			tfile->xdp_rxq.queue_index = tfile->queue_index;
> +	} else {
> +		/* Setup XDP RX-queue info, for new tfile getting attached */
> +		xdp_rxq_info_init(&tfile->xdp_rxq);
> +		tfile->xdp_rxq.dev = tun->dev;
> +		tfile->xdp_rxq.queue_index = tfile->queue_index;
> +		xdp_rxq_info_reg(&tfile->xdp_rxq);
> +	}
> +
>   	rcu_assign_pointer(tfile->tun, tun);
>   	rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
>   	tun->numqueues++;
> @@ -1508,6 +1530,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>   		xdp.data = buf + pad;
>   		xdp_set_data_meta_invalid(&xdp);
>   		xdp.data_end = xdp.data + len;
> +		xdp.rxq = &tfile->xdp_rxq;
>   		orig_data = xdp.data;
>   		act = bpf_prog_run_xdp(xdp_prog, &xdp);
>   
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 10/14] tun: setup xdp_rxq_info
  2017-12-20  7:48   ` Jason Wang
@ 2017-12-21 15:42     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-21 15:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: Daniel Borkmann, Alexei Starovoitov, Willem de Bruijn,
	Michael S. Tsirkin, netdev, dsahern, gospo, bjorn.topel,
	michael.chan, brouer

On Wed, 20 Dec 2017 15:48:01 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index e367d6310353..f1df08c2c541 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -180,6 +180,7 @@ struct tun_file {
> >   	struct list_head next;
> >   	struct tun_struct *detached;
> >   	struct skb_array tx_array;
> > +	struct xdp_rxq_info xdp_rxq;
> >   };
> >   
> >   struct tun_flow_entry {
> > @@ -687,8 +688,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
> >   			    tun->dev->reg_state == NETREG_REGISTERED)
> >   				unregister_netdevice(tun->dev);
> >   		}
> > -		if (tun)
> > +		if (tun) {
> >   			skb_array_cleanup(&tfile->tx_array);
> > +			xdp_rxq_info_unreg(&tfile->xdp_rxq);
> > +		}
> >   		sock_put(&tfile->sk);
> >   	}
> >   }
> > @@ -728,11 +731,15 @@ static void tun_detach_all(struct net_device *dev)
> >   		tun_napi_del(tun, tfile);
> >   		/* Drop read queue */
> >   		tun_queue_purge(tfile);
> > +		skb_array_cleanup(&tfile->tx_array);  
> 
> Looks like this is unnecessary; the skb array will be cleaned up only
> when the fd is closed, otherwise there will be a double free.

Which code path is called on "fd close" that calls skb_array_cleanup()?
(Is it __tun_detach()?)

Then, I guess I don't need the below xdp_rxq_info_unreg() either, right?
 
> > +		xdp_rxq_info_unreg(&tfile->xdp_rxq);
> >   		sock_put(&tfile->sk);

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP rx-queue info concept
  2017-12-18 10:55     ` Jesper Dangaard Brouer
  2017-12-18 13:23       ` David Ahern
@ 2017-12-21 16:59       ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 39+ messages in thread
From: Jesper Dangaard Brouer @ 2017-12-21 16:59 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Alexei Starovoitov, netdev, gospo, bjorn.topel,
	michael.chan, brouer, Saeed Mahameed

On Mon, 18 Dec 2017 11:55:01 +0100
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> On Wed, 13 Dec 2017 19:34:40 -0700
> David Ahern <dsahern@gmail.com> wrote:
> 
> > On 12/13/17 4:19 AM, Jesper Dangaard Brouer wrote:  
> > > +
> > > +void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
> > > +{
> > > +	xdp_rxq->reg_state = REG_STATE_UNREGISTRED;
> > > +}
> > > +EXPORT_SYMBOL_GPL(xdp_rxq_info_unreg);
> > > +
> > > +void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
> > > +{
> > > +	if (xdp_rxq->reg_state == REG_STATE_REGISTRED) {
> > > +		WARN(1, "Missing unregister, handled but fix driver\n");
> > > +		xdp_rxq_info_unreg(xdp_rxq);
> > > +	}
> > > +	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
> > > +	xdp_rxq->queue_index = U32_MAX;
> > > +	xdp_rxq->reg_state = REG_STATE_NEW;
> > > +}
> > > +EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
> > > +
> > > +void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
> > > +{
> > > +	WARN(!xdp_rxq->dev, "Missing net_device from driver");
> > > +	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
> > > +	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
> > > +	xdp_rxq->reg_state = REG_STATE_REGISTRED;
> > > +}
> > > +EXPORT_SYMBOL_GPL(xdp_rxq_info_reg);
> > >     
> > 
> > Rather than WARN()'s why not make the _reg and _init functions return an
> > int that indicates an error? For example you don't want to continue if
> > the dev is expected but missing.  
> 
> Handling return-errors in the drivers complicates the driver code, as it
> involves unwinding and deallocating other RX-rings etc. (that were
> already allocated) if the reg fails. (Also notice the next patch will
> allow dev == NULL, if the right ptype is set).
> 
> I'm not completely rejecting your idea, as this is a good optimization
> trick: move validation checks to setup-time, thus allowing fewer
> validation checks at runtime.  I sort-of already did this, as I allow
> bpf to deref dev without a NULL check.  I would argue this is good
> enough, as we will crash in a predictable way, and the above WARN will
> point to which driver violated the API.
> 
> If people think it is valuable I can change this API to return an err?

I will take Ahern's suggestion of returning an err-code, but only from
xdp_rxq_info_reg().  And I'm going to make xdp_rxq_info_init an
internal function (which Saeed also implicitly suggested).
I'm working through the drivers now, and only two drivers don't have a
proper error-return path for the case where xdp_rxq_info_reg() fails.

I've also extended xdp_rxq_info_reg() to take args dev + idx, to reduce
the code-lines (given we now also have to check the return code, this
got too big).  Thus, reg is a single call with an if-return-check.


> I guess it would be more future-proof to do this, as we (Bjørn,
> Michael, Andy) want to extend this to implement an XDP frame/mem return
> code-path.  And the register call will likely have to allocate some
> resource that could fail, which needs to be handled...

I'm mostly doing it for the above reason, as I'm hoping to avoid
touching every XDP driver once again.  It is a real pain.

> If we do this, we might as well (slab) alloc the xdp_rxq_info
> structure to reduce the bloat in the drivers' RX-rings to a single
> pointer (and a pointer to xdp_rxq_info is what xdp_buff.rxq needs).

I've dropped my idea of (slab) allocating the xdp_rxq_info structure.
I started coding this up, but realized the number of lines added per
driver got excessive for no apparent gain (e.g. I also needed to
take the NUMA node into account in some drivers).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2017-12-21 16:59 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-13 11:19 [bpf-next V1-RFC PATCH 00/14] xdp: new XDP rx-queue info concept Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 01/14] xdp: base API for " Jesper Dangaard Brouer
2017-12-14  2:34   ` David Ahern
2017-12-18 10:55     ` Jesper Dangaard Brouer
2017-12-18 13:23       ` David Ahern
2017-12-18 15:52         ` Jesper Dangaard Brouer
2017-12-21 16:59       ` Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and extend with qtype Jesper Dangaard Brouer
2017-12-13 12:27   ` Tariq Toukan
2017-12-13 13:44     ` Jesper Dangaard Brouer
2017-12-13 23:03       ` Saeed Mahameed
2017-12-14  6:46         ` Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 03/14] i40e: setup xdp_rxq_info Jesper Dangaard Brouer
2017-12-13 11:19   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2017-12-18 10:52   ` Björn Töpel
2017-12-18 10:52     ` Björn Töpel
2017-12-18 13:05     ` Jesper Dangaard Brouer
2017-12-18 13:05       ` Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 04/14] ixgbe: " Jesper Dangaard Brouer
2017-12-13 11:19   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 05/14] xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 06/14] mlx4: setup xdp_rxq_info Jesper Dangaard Brouer
2017-12-13 12:42   ` Tariq Toukan
2017-12-13 14:00     ` Jesper Dangaard Brouer
2017-12-13 11:19 ` [bpf-next V1-RFC PATCH 07/14] bnxt_en: " Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 08/14] nfp: " Jesper Dangaard Brouer
2017-12-14  2:34   ` Jakub Kicinski
2017-12-18 20:25     ` Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 09/14] thunderx: " Jesper Dangaard Brouer
2017-12-13 11:20   ` Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 10/14] tun: " Jesper Dangaard Brouer
2017-12-20  7:48   ` Jason Wang
2017-12-21 15:42     ` Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 11/14] virtio_net: " Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 12/14] xdp: generic XDP handling of xdp_rxq_info Jesper Dangaard Brouer
2017-12-13 22:50   ` Saeed Mahameed
2017-12-18  9:47     ` Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 13/14] bpf: finally expose xdp_rxq_info to XDP bpf-programs Jesper Dangaard Brouer
2017-12-13 11:20 ` [bpf-next V1-RFC PATCH 14/14] samples/bpf: program demonstrating access to xdp_rxq_info Jesper Dangaard Brouer
