All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
@ 2017-02-21 19:34 Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 1/8] " Tom Herbert
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

This patch set generalizes XDP by making the hooks in drivers to be
generic. This has a number of advantages:

  - Allows a means to pipeline XDP programs together
  - Reduces the amount of code and complexity needed in drivers to
    manage XDP
  - Provides a more structured environment that is extensible to new
    features while being mostly transparent to the drivers
  - Allow XDP programs to be set per device or per queue
  - Moves management of BPF programs out of driver into a common
    infrastructure

The generic XDP infrastructure is based on an xdp_hook structure that
contains callback functions and private data structure that can be
populated by the user of XDP. The XDP hooks are registered either on a
netdev or a napi (both maintain a list of XDP hooks). Allowing per
netdev hooks makes management of XDP a lot simpler when the intent is
for the hook to apply to the whole device (as is the case with XDP_BPF
so far).  Multiple xdp hooks may be registered on a device or napi
instance, the order of execution is indicated in the priority field of
the xdp_hook structure. Execution of the list contains to the end or
until a program returns something other than XDP_PASS. If both
napi XDP hooks and device hooks are enabled, the NAPI hooks are run
first.

The xdp_hook structure contains a "hookfn" field that is the function
executes a hook. The "priv" structure is private data that is provided
as an argument to hookfn-- in the case of a BPF hook this is simply
the bpf_prog.

Hooks may be registered by xdp_register_dev_hook or
xdp_register_napi_hook, and subsequently they can be unregistered
but xdp_unregister_dev_hook and xdp_unregister_napi_hook. The
identifier for a hook is the pointer to the template hook that was
used to register the hook. xdp_find_dev_hook and
xdp_find_napi_hook will return whether a hook has been registered
and optionally return the contents of the hook. xdp_bpf_check_prog
is to check if the driver is okay with running the program (uses the
XDP_CHECK_BPF_PROG ndo command described below).

Driver interface:

Drivers no longer deal with BPF programs for the most part, instead
they call into the XDP interface.

There are two functions of interest for use in the receive data path:
  - xdp_hook_run_needed_check: returns true if there is an XDP
    program registered on the napi instance or its device
  - xdp_hook_run, xdp_hook_run_ret_last: runs the XDP programs for
    the hooks registered for the given napi instance or its device.
    The latter variant returns a pointer to the last XDP hook that
    was run (useful for reporting).

The ndo_xdp defines a new set of commands for this interface. A driver
should implement these commands:
  - XDP_MODE_ON: Initialize device to use XDP. Called when first XDP
		 program is registered on a device (including on a NAPI
		 instance).
  - XDP_MODE_OFF: XDP is finished on the device. Called after the last
		  XDP hook has been unregistered for a device.
  - XDP_CHECK_BPF_PROG: Check if a BPF program is acceptable to a device
		  to run.
  - XDP_OFFLOAD_BPF: Offload the associated BPF program (e.g. Netronome).

A new net feature is added NETIF_F_XDP so that a driver indicates
that is supports XDP.

This patch set:
  - Adds the infrastructure described above include xdp.c and xdp.h files.
  - Modifies mlx4, mlx5, qede, nfp, and virt_net drivers to use the new
    interface. That is mostly removed the management of BPF programs and
    changing to call the new interface.

v2:
  - Eliminate use of nfhooks like lists. Just use use simple array for
    the hooks
  - Modify more drivers that now support XDP

v3:
  - Don't allow non-BPF hooks to be set

Tested:

Tested XDP_DROP and XDP_TX on mlx5. No regression or other issues noted.

Testing other drviers is TBD.

Tom Herbert (8):
  xdp: Infrastructure to generalize XDP
  mlx4: Changes to use generic XDP infrastructure
  nfp: Changes to use generic XDP infrastructure
  qede: Changes to use generic XDP infrastructure
  virt_net: Changes to use generic XDP infrastructure
  mlx5: Changes to use generic XDP infrastructure
  bnxt: Changes to use generic XDP infrastructure
  xdp: Cleanup after API changes

 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |  14 -
 drivers/net/ethernet/broadcom/bnxt/bnxt.h          |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c      |  46 ++--
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c     |  92 ++-----
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |  27 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c         |   1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h       |   1 -
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 108 ++------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  12 +-
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c   |   1 +
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |   5 +-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 170 ++++++------
 .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  12 +-
 drivers/net/ethernet/qlogic/qede/qede.h            |   3 +-
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c    |   2 +-
 drivers/net/ethernet/qlogic/qede/qede_filter.c     |  39 ++-
 drivers/net/ethernet/qlogic/qede/qede_fp.c         |  36 ++-
 drivers/net/ethernet/qlogic/qede/qede_main.c       |  23 +-
 drivers/net/virtio_net.c                           |  99 +++----
 include/linux/filter.h                             |  11 +-
 include/linux/netdev_features.h                    |   3 +-
 include/linux/netdevice.h                          |  27 +-
 include/net/xdp.h                                  | 296 ++++++++++++++++++++
 include/trace/events/xdp.h                         |  16 +-
 net/core/Makefile                                  |   2 +-
 net/core/dev.c                                     |  52 ++--
 net/core/filter.c                                  |   7 +-
 net/core/rtnetlink.c                               |  14 +-
 net/core/xdp.c                                     | 306 +++++++++++++++++++++
 30 files changed, 934 insertions(+), 496 deletions(-)
 create mode 100644 include/net/xdp.h
 create mode 100644 net/core/xdp.c

-- 
2.9.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 1/8] xdp: Infrastructure to generalize XDP
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 2/8] mlx4: Changes to use generic XDP infrastructure Tom Herbert
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

This patch creates an infrastructure for registering and running code at
XDP hooks in drivers. This extends and generalizes the original XDP/BPF
interface. It abstract out management and running of BPF programs out of
drivers.

An XDP hook is defined by the xdp_hook structure. A pointer to this
structure is passed into the XDP register function to set up a hook.
The XDP register function mallocs its own xdp_hook structure and copies
the values from the xdp_hook passed in. The register function also saves
the pointer value of the xdp_hook argument; this pointer is used in
subsequently calls to XDP to identify the registered hook.

The interface is defined in net/xdp.h. This includes the definition of
xdp_hook, functions to register and unregister hooks on a device
or individual instances of napi, and xdp_hook_run that is called by
drivers to run the hooks.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c |   1 +
 include/linux/filter.h                           |  10 +-
 include/linux/netdev_features.h                  |   3 +-
 include/linux/netdevice.h                        |  16 ++
 include/net/xdp.h                                | 296 ++++++++++++++++++++++
 include/trace/events/xdp.h                       |  31 +++
 kernel/bpf/core.c                                |   1 +
 net/core/Makefile                                |   2 +-
 net/core/dev.c                                   |  52 ++--
 net/core/filter.c                                |   1 +
 net/core/rtnetlink.c                             |  14 +-
 net/core/xdp.c                                   | 306 +++++++++++++++++++++++
 12 files changed, 698 insertions(+), 35 deletions(-)
 create mode 100644 include/net/xdp.h
 create mode 100644 net/core/xdp.c

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
index 335beb8..d294fb2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -38,6 +38,7 @@
 #include <linux/filter.h>
 #include <linux/pkt_cls.h>
 #include <linux/unistd.h>
+#include <net/xdp.h>
 
 #include "nfp_asm.h"
 #include "nfp_bpf.h"
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 0c1cc91..53b737f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -434,7 +434,7 @@ struct sk_filter {
 	struct bpf_prog	*prog;
 };
 
-#define BPF_PROG_RUN(filter, ctx)  (*filter->bpf_func)(ctx, filter->insnsi)
+#define BPF_PROG_RUN(filter, ctx)  (*(filter)->bpf_func)(ctx, (filter)->insnsi)
 
 #define BPF_SKB_CB_LEN QDISC_CB_PRIV_LEN
 
@@ -443,12 +443,6 @@ struct bpf_skb_data_end {
 	void *data_end;
 };
 
-struct xdp_buff {
-	void *data;
-	void *data_end;
-	void *data_hard_start;
-};
-
 /* compute the linear packet data range [data, data_end) which
  * will be accessed by cls_bpf, act_bpf and lwt programs
  */
@@ -510,6 +504,8 @@ static inline u32 bpf_prog_run_clear_cb(const struct bpf_prog *prog,
 	return BPF_PROG_RUN(prog, skb);
 }
 
+struct xdp_buff;
+
 static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
 					    struct xdp_buff *xdp)
 {
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 9a04195..f22d379 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -71,8 +71,8 @@ enum {
 	NETIF_F_HW_VLAN_STAG_RX_BIT,	/* Receive VLAN STAG HW acceleration */
 	NETIF_F_HW_VLAN_STAG_FILTER_BIT,/* Receive filtering on VLAN STAGs */
 	NETIF_F_HW_L2FW_DOFFLOAD_BIT,	/* Allow L2 Forwarding in Hardware */
-
 	NETIF_F_HW_TC_BIT,		/* Offload TC infrastructure */
+	NETIF_F_XDP_BIT,		/* Support XDP interface */
 
 	/*
 	 * Add your fresh new feature above and remember to update
@@ -134,6 +134,7 @@ enum {
 #define NETIF_F_HW_VLAN_STAG_TX	__NETIF_F(HW_VLAN_STAG_TX)
 #define NETIF_F_HW_L2FW_DOFFLOAD	__NETIF_F(HW_L2FW_DOFFLOAD)
 #define NETIF_F_HW_TC		__NETIF_F(HW_TC)
+#define NETIF_F_XDP		__NETIF_F(XDP)
 
 #define for_each_netdev_feature(mask_addr, bit)	\
 	for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f40f0ab..57ac7ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -324,6 +324,7 @@ struct napi_struct {
 	struct sk_buff		*skb;
 	struct hrtimer		timer;
 	struct list_head	dev_list;
+	struct xdp_hook_set __rcu *xdp_hooks;
 	struct hlist_node	napi_hash_node;
 	unsigned int		napi_id;
 };
@@ -822,12 +823,25 @@ enum xdp_netdev_command {
 	 * return true if a program is currently attached and running.
 	 */
 	XDP_QUERY_PROG,
+	/* Initialize device to use XDP. Called when first XDP program is
+	 * registered on a device (including on a NAPI instance).
+	 */
+	XDP_MODE_ON,
+	/* XDP is finished on the device. Called after the last XDP hook
+	 * has been removed from a device.
+	 */
+	XDP_MODE_OFF,
+	/* Check if device is okay with the proposed BPF program to be loaded */
+	XDP_CHECK_BPF_PROG,
+	/* Offload a BPF program to the device */
+	XDP_OFFLOAD_BPF,
 };
 
 struct netdev_xdp {
 	enum xdp_netdev_command command;
 	union {
 		/* XDP_SETUP_PROG */
+		/* XDP_CHECK_BPF_PROG */
 		struct bpf_prog *prog;
 		/* XDP_QUERY_PROG */
 		bool prog_attached;
@@ -1668,6 +1682,8 @@ struct net_device {
 	struct list_head	close_list;
 	struct list_head	ptype_all;
 	struct list_head	ptype_specific;
+	struct xdp_hook_set __rcu *xdp_hooks;
+	unsigned int		xdp_hook_cnt;
 
 	struct {
 		struct list_head upper;
diff --git a/include/net/xdp.h b/include/net/xdp.h
new file mode 100644
index 0000000..56b3cf2
--- /dev/null
+++ b/include/net/xdp.h
@@ -0,0 +1,296 @@
+/*
+ * eXpress Data Path (XDP)
+ *
+ * Copyright (c) 2017 Tom Herbert <tom@herbertland.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __NET_XDP_H_
+#define __NET_XDP_H_
+
+#include <linux/filter.h>
+#include <linux/netdevice.h>
+#include <linux/static_key.h>
+
+/* XDP data structure.
+ *
+ * Fields:
+ *   data - pointer to first byte of data
+ *   data_end - pointer to last byte
+ *   data_hard_start - point to first possible byte
+ *
+ * Length is deduced by xdp->data_end - xdp->data.
+ */
+struct xdp_buff {
+	void *data;
+	void *data_end;
+	void *data_hard_start;
+};
+
+typedef unsigned int xdp_hookfn(const void *priv, struct xdp_buff *xdp);
+typedef void xdp_put_privfn(const void *priv);
+
+#define XDP_TAG_SIZE	8 /* Should be at least BPF_TAG_SIZE */
+
+/* xdp_hook struct
+ *
+ * This structure contains the ops and data for an XDP hook. A pointer
+ * to this structure providing the definition of a hook is passed into
+ * the XDP register function to set up a hook. The XDP register function
+ * mallocs its own xdp_hook structure and copies the values from the
+ * xdp_hook definition. The register function also saves the pointer value
+ * of the xdp_hook definition argument; this pointer is used in subsequent
+ * calls to XDP to find or unregister the hook.
+ *
+ * Fields:
+ *
+ *   priority - priority for insertion into set. The set is ordered lowest to
+ *	highest priority.
+ *   priv - private data associated with hook. This is passed as an argument
+ *	to the hook function. This is a bpf_prog structure.
+ *   put_priv - function call when XDP is done with private data.
+ *   def - point to definitions of xdp_hook. The pointer value is saved as
+ *      a refernce the instance of hook loaded (used to find and unregister a
+ *      hook).
+ *   tag - readable tag for reporting purposes
+ */
+struct xdp_hook {
+	int priority;
+	void __rcu *priv;
+	const struct xdp_hook *def;
+	u8 tag[XDP_TAG_SIZE];
+};
+
+/* xdp_hook_set
+ *
+ * This structure holds a set of XDP hooks in an array of size num. This
+ * structure is used in netdevice to refer to the XDP hooks for a whole
+ * device or in the napi structure to contain the hooks for an individual
+ * RX queue.
+ */
+struct xdp_hook_set {
+	unsigned int num;
+	struct rcu_head rcu;
+	struct xdp_hook hooks[0];
+};
+
+#define XDP_SET_SIZE(_num) (sizeof(struct xdp_hook_set) + ((_num) * \
+	sizeof(struct xdp_hook)))
+
+extern struct xdp_hook xdp_bpf_hook;
+
+extern struct static_key_false xdp_napi_hooks_needed;
+extern struct static_key_false xdp_dev_hooks_needed;
+
+/* Check if XDP hooks are set for a napi or its device */
+static inline bool xdp_hook_run_needed_check(struct net_device *dev,
+					     struct napi_struct *napi)
+{
+	return ((static_branch_unlikely(&xdp_dev_hooks_needed) &&
+		dev->xdp_hooks) ||
+		(static_branch_unlikely(&xdp_napi_hooks_needed) &&
+		 napi->xdp_hooks));
+}
+
+static inline int __xdp_run_one_hook(struct xdp_hook *hook,
+				     struct xdp_buff *xdp)
+{
+	void *priv = rcu_dereference(hook->priv);
+
+	return BPF_PROG_RUN((struct bpf_prog *)priv, (void *)xdp);
+}
+
+/* Core function to run the XDP hooks. This must be as fast as possible */
+static inline int __xdp_hook_run(struct xdp_hook_set *hook_set,
+				 struct xdp_buff *xdp,
+				 struct xdp_hook **last_hook)
+{
+	struct xdp_hook *hook;
+	int i, ret;
+
+	if (unlikely(!hook_set))
+		return XDP_PASS;
+
+	hook = &hook_set->hooks[0];
+	ret = __xdp_run_one_hook(hook, xdp);
+	*last_hook = hook;
+
+	for (i = 1; i < hook_set->num; i++) {
+		if (ret != XDP_PASS)
+			break;
+		hook = &hook_set->hooks[i];
+		ret = __xdp_run_one_hook(hook, xdp);
+		*last_hook = hook;
+	}
+
+	return ret;
+}
+
+/* Run the XDP hooks for a napi device and return a reference to the last
+ * hook processed. Called from a driver's receive routine. RCU
+ * read lock must be held.
+ */
+static inline int xdp_hook_run_ret_last(struct napi_struct *napi,
+					struct xdp_buff *xdp,
+					struct xdp_hook **last_hook)
+{
+	struct net_device *dev = napi->dev;
+	struct xdp_hook_set *hook_set;
+	int ret = XDP_PASS;
+
+	if (static_branch_unlikely(&xdp_napi_hooks_needed)) {
+		/* Run hooks in napi first */
+		hook_set = rcu_dereference(napi->xdp_hooks);
+		ret = __xdp_hook_run(hook_set, xdp, last_hook);
+
+		/* Check for dev hooks now taking into account that
+		 * we need to check for XDP_PASS having been
+		 * returned only if they are need (this is why
+		 * we don't do a fall through).
+		 */
+		if (static_branch_unlikely(&xdp_dev_hooks_needed)) {
+			if (ret != XDP_PASS)
+				return ret;
+			hook_set = rcu_dereference(dev->xdp_hooks);
+			ret = __xdp_hook_run(hook_set, xdp, last_hook);
+		}
+	} else if (static_branch_unlikely(&xdp_dev_hooks_needed)) {
+		/* Now run device hooks */
+		hook_set = rcu_dereference(dev->xdp_hooks);
+		ret = __xdp_hook_run(hook_set, xdp, last_hook);
+	}
+
+	return ret;
+}
+
+/* Run the XDP hooks for a napi device. Called from a driver's receive
+ * routine. RCU read lock must be held.
+ */
+static inline int xdp_hook_run(struct napi_struct *napi,
+			       struct xdp_buff *xdp)
+{
+	struct xdp_hook *last_hook;
+
+	return xdp_hook_run_ret_last(napi, xdp, &last_hook);
+}
+
+/* Register an XDP hook
+ *    dev: Assoicated net_device
+ *    hook_set: Hook set
+ *    def: Definition of the hook. The values are copied from this to a
+ *	   malloc'ed structure. The base_def pointer is saved as a
+ *	   reference to the hook to manage it
+ *    change: Change hook if it exists
+ *    dev_hook: Is a hook on a net_device (as oppsed to a napi instance)
+ */
+int __xdp_register_hook(struct net_device *dev,
+			struct xdp_hook_set __rcu **hook_set,
+			const struct xdp_hook *base_def,
+			bool change, bool dev_hook);
+
+/* Register an XDP hook on a device */
+static inline int xdp_register_dev_hook(struct net_device *dev,
+					const struct xdp_hook *def)
+{
+	return __xdp_register_hook(dev, &dev->xdp_hooks, def, false, true);
+}
+
+/* Register an XDP hook on a napi instance */
+static inline int xdp_register_napi_hook(struct napi_struct *napi,
+					 const struct xdp_hook *def)
+{
+	return __xdp_register_hook(napi->dev, &napi->xdp_hooks, def, false,
+				   false);
+}
+
+/* Change an XDP hook.
+ *
+ *    - If the hook does not exist (xdp_hook_ops does not match a hook set on
+ *      the device), then attempt to register the hook.
+ *    - Else, change the private data (priv field in xdp_hook_ops) in the
+ *      existing hook to be the new one (in reg). All the other fields in
+ *      xdp_hook_ops are ignored in that case.
+ */
+
+/* Change a device XDP hook */
+static inline int xdp_change_dev_hook(struct net_device *dev,
+				      const struct xdp_hook *reg)
+{
+	return __xdp_register_hook(dev, &dev->xdp_hooks, reg, true, true);
+}
+
+/* Change a napi XDP hook */
+static inline int xdp_change_napi_hook(struct napi_struct *napi,
+				       const struct xdp_hook *reg)
+{
+	return __xdp_register_hook(napi->dev, &napi->xdp_hooks, reg, true,
+				   false);
+}
+
+int __xdp_unregister_hook(struct net_device *dev,
+			  struct xdp_hook_set __rcu **hook_set,
+			  const struct xdp_hook *def, bool dev_hook);
+
+/* Unregister device XDP hook */
+static inline int xdp_unregister_dev_hook(struct net_device *dev,
+					   const struct xdp_hook *def)
+{
+	return __xdp_unregister_hook(dev, &dev->xdp_hooks, def, true);
+}
+
+/* Unregister a napi XDP hook */
+static inline int xdp_unregister_napi_hook(struct napi_struct *napi,
+					    const struct xdp_hook *def)
+{
+	return __xdp_unregister_hook(napi->dev, &napi->xdp_hooks, def, false);
+}
+
+/* Unregister all XDP hooks associated with a device (both the device hooks
+ * and hooks on all napi instances). This function is called when the netdev
+ * is being freed.
+ */
+void xdp_unregister_all_hooks(struct net_device *dev);
+
+/* Unregister all XDP hooks for a given xdp_hook_ops in a net. This walks
+ * all devices in net and napis for each device to unregister matching hooks.
+ * This can be called when a module that had registered some number of hooks
+ * is being unloaded.
+ */
+void xdp_unregister_net_hooks(struct net *net, struct xdp_hook *def);
+
+/* Find a registered device hook.
+ *   - If hook is found *ret is set to the values in the registered hook and
+ *     true is returned.
+ *   - Else false is returned.
+ */
+bool __xdp_find_hook(struct xdp_hook_set **hook_set,
+		     const struct xdp_hook *def,
+		     struct xdp_hook *ret);
+
+/* Find a device XDP hook. */
+static inline bool xdp_find_dev_hook(struct net_device *dev,
+				     const struct xdp_hook *def,
+				     struct xdp_hook *ret)
+{
+	return __xdp_find_hook(&dev->xdp_hooks, def, ret);
+}
+
+/* Find a napi XDP hook. */
+static inline bool xdp_find_napi_hook(struct napi_struct *napi,
+				      const struct xdp_hook *def,
+				      struct xdp_hook *ret)
+{
+	return __xdp_find_hook(&napi->xdp_hooks, def, ret);
+}
+
+int xdp_bpf_check_prog(struct net_device *dev, struct bpf_prog *prog);
+
+static inline void xdp_warn_invalid_action(u32 act)
+{
+	WARN_ONCE(1, "Illegal XDP return value %u, expect packet loss\n", act);
+}
+
+#endif /* __NET_XDP_H_ */
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index 1b61357..9ca6306 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -7,6 +7,7 @@
 #include <linux/netdevice.h>
 #include <linux/filter.h>
 #include <linux/tracepoint.h>
+#include <net/xdp.h>
 
 #define __XDP_ACT_MAP(FN)	\
 	FN(ABORTED)		\
@@ -48,6 +49,36 @@ TRACE_EVENT(xdp_exception,
 		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB))
 );
 
+/* Temporary trace function. This will be renamed to xdp_exception after all
+ * the calling drivers have been patched.
+ */
+TRACE_EVENT(xdp_hook_exception,
+
+	TP_PROTO(const struct net_device *dev,
+		 const struct xdp_hook *hook, u32 act),
+
+	TP_ARGS(dev, hook, act),
+
+	TP_STRUCT__entry(
+		__string(name, dev->name)
+		__array(u8, prog_tag, 8)
+		__field(u32, act)
+	),
+
+	TP_fast_assign(
+		BUILD_BUG_ON(sizeof(__entry->prog_tag) !=
+						sizeof(hook->tag));
+		memcpy(__entry->prog_tag, hook->tag, sizeof(hook->tag));
+			__assign_str(name, dev->name);
+			__entry->act = act;
+		),
+
+	TP_printk("prog=%s device=%s action=%s",
+		  __print_hex_str(__entry->prog_tag, 8),
+		  __get_str(name),
+		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB))
+);
+
 #endif /* _TRACE_XDP_H */
 
 #include <trace/define_trace.h>
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f45827e2..04f2e30 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1412,6 +1412,7 @@ int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
 #include <linux/bpf_trace.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_exception);
+EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_hook_exception);
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(bpf_prog_get_type);
 EXPORT_TRACEPOINT_SYMBOL_GPL(bpf_prog_put_rcu);
diff --git a/net/core/Makefile b/net/core/Makefile
index 79f9479..52410db 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
 
 obj-y		     += dev.o ethtool.o dev_addr_lists.o dst.o netevent.o \
 			neighbour.o rtnetlink.o utils.o link_watch.o filter.o \
-			sock_diag.o dev_ioctl.o tso.o sock_reuseport.o
+			sock_diag.o dev_ioctl.o tso.o sock_reuseport.o xdp.o
 
 obj-$(CONFIG_XFRM) += flow.o
 obj-y += net-sysfs.o
diff --git a/net/core/dev.c b/net/core/dev.c
index 05d19c6..81bdf24 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -140,6 +140,8 @@
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
 #include <linux/crash_dump.h>
+#include <linux/filter.h>
+#include <net/xdp.h>
 
 #include "net-sysfs.h"
 
@@ -6615,6 +6617,24 @@ int dev_change_proto_down(struct net_device *dev, bool proto_down)
 }
 EXPORT_SYMBOL(dev_change_proto_down);
 
+/* Run a BPF/XDP program. RCU read lock must be held */
+static u32 dev_bpf_prog_run_xdp(const void *priv,
+				struct xdp_buff *xdp)
+{
+	const struct bpf_prog *prog = (const struct bpf_prog *)priv;
+
+	return BPF_PROG_RUN(prog, (void *)xdp);
+}
+
+static void dev_bpf_prog_put_xdp(const void *priv)
+{
+	bpf_prog_put((struct bpf_prog *)priv);
+}
+
+struct xdp_hook xdp_bpf_hook = {
+	.priority = 0,
+};
+
 /**
  *	dev_change_xdp_fd - set or clear a bpf program for a device rx path
  *	@dev: device
@@ -6627,7 +6647,6 @@ int dev_change_xdp_fd(struct net_device *dev, int fd, u32 flags)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 	struct bpf_prog *prog = NULL;
-	struct netdev_xdp xdp;
 	int err;
 
 	ASSERT_RTNL();
@@ -6635,29 +6654,27 @@ int dev_change_xdp_fd(struct net_device *dev, int fd, u32 flags)
 	if (!ops->ndo_xdp)
 		return -EOPNOTSUPP;
 	if (fd >= 0) {
-		if (flags & XDP_FLAGS_UPDATE_IF_NOEXIST) {
-			memset(&xdp, 0, sizeof(xdp));
-			xdp.command = XDP_QUERY_PROG;
-
-			err = ops->ndo_xdp(dev, &xdp);
-			if (err < 0)
-				return err;
-			if (xdp.prog_attached)
-				return -EBUSY;
-		}
+		if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) &&
+		    xdp_find_dev_hook(dev, &xdp_bpf_hook, NULL))
+			return -EBUSY;
 
 		prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_XDP);
 		if (IS_ERR(prog))
 			return PTR_ERR(prog);
 	}
 
-	memset(&xdp, 0, sizeof(xdp));
-	xdp.command = XDP_SETUP_PROG;
-	xdp.prog = prog;
+	if (prog) {
+		err = xdp_bpf_check_prog(dev, prog);
+		if (err >= 0) {
+			rcu_assign_pointer(xdp_bpf_hook.priv, prog);
+			err = xdp_register_dev_hook(dev, &xdp_bpf_hook);
+		}
 
-	err = ops->ndo_xdp(dev, &xdp);
-	if (err < 0 && prog)
-		bpf_prog_put(prog);
+		if (err < 0)
+			bpf_prog_put(prog);
+	} else {
+		err = xdp_unregister_dev_hook(dev, &xdp_bpf_hook);
+	}
 
 	return err;
 }
@@ -7698,6 +7715,7 @@ void free_netdev(struct net_device *dev)
 	struct napi_struct *p, *n;
 
 	might_sleep();
+	xdp_unregister_all_hooks(dev);
 	netif_free_tx_queues(dev);
 #ifdef CONFIG_SYSFS
 	kvfree(dev->_rx);
diff --git a/net/core/filter.c b/net/core/filter.c
index e466e004..9a5de43 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -52,6 +52,7 @@
 #include <net/dst_metadata.h>
 #include <net/dst.h>
 #include <net/sock_reuseport.h>
+#include <net/xdp.h>
 
 /**
  *	sk_filter_trim_cap - run a packet through a socket filter
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c4e84c5..b2f5772 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -56,6 +56,7 @@
 #include <net/fib_rules.h>
 #include <net/rtnetlink.h>
 #include <net/net_namespace.h>
+#include <net/xdp.h>
 
 struct rtnl_link {
 	rtnl_doit_func		doit;
@@ -901,7 +902,7 @@ static size_t rtnl_xdp_size(const struct net_device *dev)
 	size_t xdp_size = nla_total_size(0) +	/* nest IFLA_XDP */
 			  nla_total_size(1);	/* XDP_ATTACHED */
 
-	if (!dev->netdev_ops->ndo_xdp)
+	if (!(dev->features & NETIF_F_XDP))
 		return 0;
 	else
 		return xdp_size;
@@ -1251,20 +1252,15 @@ static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
 
 static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
 {
-	struct netdev_xdp xdp_op = {};
 	struct nlattr *xdp;
 	int err;
 
-	if (!dev->netdev_ops->ndo_xdp)
-		return 0;
 	xdp = nla_nest_start(skb, IFLA_XDP);
 	if (!xdp)
 		return -EMSGSIZE;
-	xdp_op.command = XDP_QUERY_PROG;
-	err = dev->netdev_ops->ndo_xdp(dev, &xdp_op);
-	if (err)
-		goto err_cancel;
-	err = nla_put_u8(skb, IFLA_XDP_ATTACHED, xdp_op.prog_attached);
+
+	err = nla_put_u8(skb, IFLA_XDP_ATTACHED,
+			 xdp_find_dev_hook(dev, &xdp_bpf_hook, NULL));
 	if (err)
 		goto err_cancel;
 
diff --git a/net/core/xdp.c b/net/core/xdp.c
new file mode 100644
index 0000000..627671a
--- /dev/null
+++ b/net/core/xdp.c
@@ -0,0 +1,306 @@
+/*
+ * eXpress Data Path
+ *
+ * Copyright (c) 2017 Tom Herbert <tom@herbertland.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ */
+#include <linux/bpf.h>
+#include <net/xdp.h>
+
+DEFINE_STATIC_KEY_FALSE(xdp_dev_hooks_needed);
+EXPORT_SYMBOL(xdp_dev_hooks_needed);
+
+DEFINE_STATIC_KEY_FALSE(xdp_napi_hooks_needed);
+EXPORT_SYMBOL(xdp_napi_hooks_needed);
+
+static DEFINE_MUTEX(xdp_hook_mutex);
+
+int __xdp_register_hook(struct net_device *dev,
+			struct xdp_hook_set __rcu **xdp_hooks,
+			const struct xdp_hook *def,
+			bool change, bool dev_hook)
+{
+	struct xdp_hook_set *new_hooks = NULL, *old_hooks;
+	struct xdp_hook *hook;
+	int index, targindex = 0;
+	int i, err;
+
+	mutex_lock(&xdp_hook_mutex);
+
+	old_hooks = rcu_dereference(*xdp_hooks);
+
+	if (old_hooks) {
+		/* Walk over hooks, see if hook is already registered and
+		 * determine insertion point.
+		 */
+
+		for (index = 0; index < old_hooks->num; index++) {
+			hook = &old_hooks->hooks[index];
+			if (hook->def != def) {
+				if (def->priority < hook->priority)
+					targindex = index;
+				continue;
+			}
+
+			if (change) {
+				void *old_priv;
+
+				/* Only allow changing priv field in an existing
+				 * hook.
+				 */
+				old_priv = rcu_dereference_protected(hook->priv,
+					lockdep_is_held(&xdp_hook_mutex));
+				rcu_assign_pointer(hook->priv, def->priv);
+				if (old_priv)
+					bpf_prog_put((struct bpf_prog *)old_priv);
+				goto out;
+			} else {
+				/* Already registered */
+				err = -EALREADY;
+				goto err;
+			}
+		}
+	}
+
+	/* Need to add new hook set. index holds number of entries in hooks
+	 * set (zero if hooks set is NULL). targindex holds index to insert
+	 * new hook.
+	 */
+	new_hooks = kzalloc(XDP_SET_SIZE(index + 1), GFP_KERNEL);
+	if (!new_hooks) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	/* Initialize XDP in driver */
+	if (!dev->xdp_hook_cnt && dev->netdev_ops->ndo_xdp) {
+		struct netdev_xdp xdp_op = {};
+
+		xdp_op.command = XDP_MODE_ON;
+		err = dev->netdev_ops->ndo_xdp(dev, &xdp_op);
+		if (err)
+			goto err;
+	}
+
+	if (old_hooks) {
+		for (i = 0; i < targindex; i++)
+			new_hooks->hooks[i] = old_hooks->hooks[i];
+
+		for (i++; i < index + 1; i++)
+			new_hooks->hooks[i] = old_hooks->hooks[i - 1];
+	}
+
+	new_hooks->hooks[targindex] = *def;
+	rcu_assign_pointer(new_hooks->hooks[targindex].priv, def->priv);
+	new_hooks->num = index + 1;
+	rcu_assign_pointer(*xdp_hooks, new_hooks);
+
+	if (old_hooks)
+		kfree_rcu(old_hooks, rcu);
+
+	if (dev_hook)
+		static_branch_inc(&xdp_dev_hooks_needed);
+	else
+		static_branch_inc(&xdp_napi_hooks_needed);
+
+	dev->xdp_hook_cnt++;
+
+out:
+	mutex_unlock(&xdp_hook_mutex);
+
+	return 0;
+
+err:
+	mutex_unlock(&xdp_hook_mutex);
+	kfree(new_hooks);
+	return err;
+}
+EXPORT_SYMBOL_GPL(__xdp_register_hook);
+
+int __xdp_unregister_hook(struct net_device *dev,
+			  struct xdp_hook_set __rcu **xdp_hooks,
+			  const struct xdp_hook *def,
+			  bool dev_hook)
+{
+	struct xdp_hook_set *old_hooks, *new_hooks = NULL;
+	struct xdp_hook *hook;
+	int i, index;
+	int err = 0;
+
+	old_hooks = rcu_dereference(*xdp_hooks);
+
+	mutex_lock(&xdp_hook_mutex);
+
+	for (index = 0; index < old_hooks->num; index++) {
+		hook = &old_hooks->hooks[index];
+		if (hook->def != def)
+			continue;
+
+		if (old_hooks->num > 1) {
+			new_hooks = kzalloc(XDP_SET_SIZE(
+				old_hooks->num  - 1), GFP_KERNEL);
+
+			if (!new_hooks) {
+				err = -ENOMEM;
+				goto out;
+			}
+			for (i = 0; i < index; i++)
+				new_hooks->hooks[i] = old_hooks->hooks[i];
+			for (i++; i < index; i++)
+				new_hooks->hooks[i - 1] = old_hooks->hooks[i];
+
+			new_hooks->num = old_hooks->num - 1;
+		}
+
+		break;
+	}
+
+	if (index >= old_hooks->num)
+		goto out;
+
+	rcu_assign_pointer(*xdp_hooks, new_hooks);
+
+	if (old_hooks)
+		kfree_rcu(old_hooks, rcu);
+
+	dev->xdp_hook_cnt--;
+
+	if (dev_hook)
+		static_branch_dec(&xdp_dev_hooks_needed);
+	else
+		static_branch_dec(&xdp_napi_hooks_needed);
+
+	if (hook->priv)
+		bpf_prog_put((struct bpf_prog *)hook->priv);
+
+	if (!dev->xdp_hook_cnt && dev->netdev_ops->ndo_xdp) {
+		struct netdev_xdp xdp_op = {};
+
+		xdp_op.command = XDP_MODE_OFF;
+		dev->netdev_ops->ndo_xdp(dev, &xdp_op);
+	}
+
+out:
+	mutex_unlock(&xdp_hook_mutex);
+	synchronize_net();
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(__xdp_unregister_hook);
+
+static void __xdp_unregister_hooks(struct net_device *dev,
+				   struct xdp_hook_set __rcu **xdp_hooks,
+				   bool dev_hook)
+{
+	struct xdp_hook_set *old_hooks;
+	int i;
+
+	mutex_lock(&xdp_hook_mutex);
+
+	old_hooks = rcu_dereference(*xdp_hooks);
+
+	if (!old_hooks) {
+		mutex_unlock(&xdp_hook_mutex);
+		return;
+	}
+
+	for (i = 0; i < old_hooks->num; i++) {
+		if (dev_hook)
+			static_branch_dec(&xdp_dev_hooks_needed);
+		else
+			static_branch_dec(&xdp_napi_hooks_needed);
+		dev->xdp_hook_cnt--;
+	}
+
+	rcu_assign_pointer(*xdp_hooks, NULL);
+
+	if (!dev->xdp_hook_cnt && dev->netdev_ops->ndo_xdp) {
+		struct netdev_xdp xdp_op = {};
+
+		xdp_op.command = XDP_MODE_OFF;
+		dev->netdev_ops->ndo_xdp(dev, &xdp_op);
+	}
+
+	mutex_unlock(&xdp_hook_mutex);
+
+	kfree_rcu(old_hooks, rcu);
+}
+
+void xdp_unregister_all_hooks(struct net_device *dev)
+{
+	struct napi_struct *napi;
+
+	/* Unregister NAPI hooks for device */
+	list_for_each_entry(napi, &dev->napi_list, dev_list)
+		__xdp_unregister_hooks(dev, &napi->xdp_hooks, false);
+
+	/* Unregister device hooks */
+	__xdp_unregister_hooks(dev, &dev->xdp_hooks, true);
+}
+EXPORT_SYMBOL_GPL(xdp_unregister_all_hooks);
+
+void xdp_unregister_net_hooks(struct net *net, struct xdp_hook *def)
+{
+	struct net_device *dev;
+	struct napi_struct *napi;
+
+	list_for_each_entry_rcu(dev, &net->dev_base_head, dev_list) {
+		list_for_each_entry(napi, &dev->napi_list, dev_list)
+			xdp_unregister_napi_hook(napi, def);
+
+		xdp_unregister_dev_hook(dev, def);
+	}
+}
+EXPORT_SYMBOL_GPL(xdp_unregister_net_hooks);
+
+bool __xdp_find_hook(struct xdp_hook_set __rcu **xdp_hooks,
+		     const struct xdp_hook *def,
+		     struct xdp_hook *ret)
+{
+	struct xdp_hook_set *old_hooks;
+	struct xdp_hook *hook;
+	bool retval = false;
+	int index;
+
+	rcu_read_lock();
+
+	old_hooks = rcu_dereference(*xdp_hooks);
+
+	if (!old_hooks)
+		goto out;
+
+	for (index = 0; index < old_hooks->num; index++) {
+		hook = &old_hooks->hooks[index];
+		if (hook->def != def)
+			continue;
+
+		if (ret)
+			*ret = *hook;
+		retval = true;
+		goto out;
+	}
+
+out:
+	rcu_read_unlock();
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__xdp_find_hook);
+
+int xdp_bpf_check_prog(struct net_device *dev, struct bpf_prog *prog)
+{
+	if (dev->netdev_ops->ndo_xdp) {
+		struct netdev_xdp xdp_op = {};
+
+		xdp_op.command = XDP_CHECK_BPF_PROG;
+		xdp_op.prog = prog;
+
+		return dev->netdev_ops->ndo_xdp(dev, &xdp_op);
+	} else {
+		return -EOPNOTSUPP;
+	}
+}
+EXPORT_SYMBOL_GPL(xdp_bpf_check_prog);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 2/8] mlx4: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 1/8] " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 3/8] nfp: " Tom Herbert
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 92 +++++---------------------
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     | 27 ++++----
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     |  1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  1 -
 4 files changed, 29 insertions(+), 92 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index afe4444..e1dbfe7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -42,6 +42,7 @@
 #include <net/busy_poll.h>
 #include <net/vxlan.h>
 #include <net/devlink.h>
+#include <net/xdp.h>
 
 #include <linux/mlx4/driver.h>
 #include <linux/mlx4/device.h>
@@ -2199,8 +2200,7 @@ int mlx4_en_try_alloc_resources(struct mlx4_en_priv *priv,
 				struct mlx4_en_port_profile *prof,
 				bool carry_xdp_prog)
 {
-	struct bpf_prog *xdp_prog;
-	int i, t;
+	int t;
 
 	mlx4_en_copy_priv(tmp, priv, prof);
 
@@ -2215,22 +2215,6 @@ int mlx4_en_try_alloc_resources(struct mlx4_en_priv *priv,
 		return -ENOMEM;
 	}
 
-	/* All rx_rings has the same xdp_prog.  Pick the first one. */
-	xdp_prog = rcu_dereference_protected(
-		priv->rx_ring[0]->xdp_prog,
-		lockdep_is_held(&priv->mdev->state_lock));
-
-	if (xdp_prog && carry_xdp_prog) {
-		xdp_prog = bpf_prog_add(xdp_prog, tmp->rx_ring_num);
-		if (IS_ERR(xdp_prog)) {
-			mlx4_en_free_resources(tmp);
-			return PTR_ERR(xdp_prog);
-		}
-		for (i = 0; i < tmp->rx_ring_num; i++)
-			rcu_assign_pointer(tmp->rx_ring[i]->xdp_prog,
-					   xdp_prog);
-	}
-
 	return 0;
 }
 
@@ -2717,42 +2701,20 @@ static int mlx4_en_set_tx_maxrate(struct net_device *dev, int queue_index, u32 m
 	return err;
 }
 
-static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
+static int mlx4_xdp_init(struct net_device *dev, bool enable)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_en_dev *mdev = priv->mdev;
 	struct mlx4_en_port_profile new_prof;
-	struct bpf_prog *old_prog;
 	struct mlx4_en_priv *tmp;
 	int tx_changed = 0;
-	int xdp_ring_num;
 	int port_up = 0;
-	int err;
-	int i;
+	int xdp_ring_num, err;
 
-	xdp_ring_num = prog ? priv->rx_ring_num : 0;
+	xdp_ring_num = enable ? ALIGN(priv->rx_ring_num, MLX4_EN_NUM_UP) : 0;
 
-	/* No need to reconfigure buffers when simply swapping the
-	 * program for a new one.
-	 */
-	if (priv->tx_ring_num[TX_XDP] == xdp_ring_num) {
-		if (prog) {
-			prog = bpf_prog_add(prog, priv->rx_ring_num - 1);
-			if (IS_ERR(prog))
-				return PTR_ERR(prog);
-		}
-		mutex_lock(&mdev->state_lock);
-		for (i = 0; i < priv->rx_ring_num; i++) {
-			old_prog = rcu_dereference_protected(
-					priv->rx_ring[i]->xdp_prog,
-					lockdep_is_held(&mdev->state_lock));
-			rcu_assign_pointer(priv->rx_ring[i]->xdp_prog, prog);
-			if (old_prog)
-				bpf_prog_put(old_prog);
-		}
-		mutex_unlock(&mdev->state_lock);
+	if (priv->tx_ring_num[TX_XDP] == xdp_ring_num)
 		return 0;
-	}
 
 	if (!mlx4_en_check_xdp_mtu(dev, dev->mtu))
 		return -EOPNOTSUPP;
@@ -2761,14 +2723,6 @@ static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	if (!tmp)
 		return -ENOMEM;
 
-	if (prog) {
-		prog = bpf_prog_add(prog, priv->rx_ring_num - 1);
-		if (IS_ERR(prog)) {
-			err = PTR_ERR(prog);
-			goto out;
-		}
-	}
-
 	mutex_lock(&mdev->state_lock);
 	memcpy(&new_prof, priv->prof, sizeof(struct mlx4_en_port_profile));
 	new_prof.tx_ring_num[TX_XDP] = xdp_ring_num;
@@ -2781,11 +2735,8 @@ static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	}
 
 	err = mlx4_en_try_alloc_resources(priv, tmp, &new_prof, false);
-	if (err) {
-		if (prog)
-			bpf_prog_sub(prog, priv->rx_ring_num - 1);
+	if (err)
 		goto unlock_out;
-	}
 
 	if (priv->port_up) {
 		port_up = 1;
@@ -2796,15 +2747,6 @@ static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	if (tx_changed)
 		netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
 
-	for (i = 0; i < priv->rx_ring_num; i++) {
-		old_prog = rcu_dereference_protected(
-					priv->rx_ring[i]->xdp_prog,
-					lockdep_is_held(&mdev->state_lock));
-		rcu_assign_pointer(priv->rx_ring[i]->xdp_prog, prog);
-		if (old_prog)
-			bpf_prog_put(old_prog);
-	}
-
 	if (port_up) {
 		err = mlx4_en_start_port(dev);
 		if (err) {
@@ -2816,26 +2758,24 @@ static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 
 unlock_out:
 	mutex_unlock(&mdev->state_lock);
-out:
 	kfree(tmp);
 	return err;
 }
 
-static bool mlx4_xdp_attached(struct net_device *dev)
+static int mlx4_xdp_check_bpf(struct net_device *dev, struct bpf_prog *prog)
 {
-	struct mlx4_en_priv *priv = netdev_priv(dev);
-
-	return !!priv->tx_ring_num[TX_XDP];
+	return 0;
 }
 
 static int mlx4_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 {
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		return mlx4_xdp_set(dev, xdp->prog);
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = mlx4_xdp_attached(dev);
-		return 0;
+	case XDP_MODE_ON:
+		return mlx4_xdp_init(dev, true);
+	case XDP_MODE_OFF:
+		return mlx4_xdp_init(dev, false);
+	case XDP_CHECK_BPF_PROG:
+		return mlx4_xdp_check_bpf(dev, xdp->prog);
 	default:
 		return -EINVAL;
 	}
@@ -3335,7 +3275,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 
 	dev->vlan_features = dev->hw_features;
 
-	dev->hw_features |= NETIF_F_RXCSUM | NETIF_F_RXHASH;
+	dev->hw_features |= NETIF_F_RXCSUM | NETIF_F_RXHASH | NETIF_F_XDP;
 	dev->features = dev->hw_features | NETIF_F_HIGHDMA |
 			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
 			NETIF_F_HW_VLAN_CTAG_FILTER;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index d85e644..a8fddc0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -43,6 +43,7 @@
 #include <linux/if_vlan.h>
 #include <linux/vmalloc.h>
 #include <linux/irq.h>
+#include <net/xdp.h>
 
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ip6_checksum.h>
@@ -547,13 +548,7 @@ void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
 {
 	struct mlx4_en_dev *mdev = priv->mdev;
 	struct mlx4_en_rx_ring *ring = *pring;
-	struct bpf_prog *old_prog;
 
-	old_prog = rcu_dereference_protected(
-					ring->xdp_prog,
-					lockdep_is_held(&mdev->state_lock));
-	if (old_prog)
-		bpf_prog_put(old_prog);
 	mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
 	vfree(ring->rx_info);
 	ring->rx_info = NULL;
@@ -802,7 +797,6 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	struct mlx4_en_rx_ring *ring = priv->rx_ring[cq->ring];
 	struct mlx4_en_rx_alloc *frags;
 	struct mlx4_en_rx_desc *rx_desc;
-	struct bpf_prog *xdp_prog;
 	int doorbell_pending;
 	struct sk_buff *skb;
 	int index;
@@ -813,6 +807,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	int factor = priv->cqe_factor;
 	u64 timestamp;
 	bool l2_tunnel;
+	bool run_xdp;
 
 	if (unlikely(!priv->port_up))
 		return 0;
@@ -820,9 +815,9 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	if (unlikely(budget <= 0))
 		return polled;
 
-	/* Protect accesses to: ring->xdp_prog, priv->mac_hash list */
+	/* Protect accesses to: XDP hooks, priv->mac_hash list */
 	rcu_read_lock();
-	xdp_prog = rcu_dereference(ring->xdp_prog);
+	run_xdp = xdp_hook_run_needed_check(dev, &cq->napi);
 	doorbell_pending = 0;
 
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
@@ -895,13 +890,14 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		l2_tunnel = (dev->hw_enc_features & NETIF_F_RXCSUM) &&
 			(cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_L2_TUNNEL));
 
-		/* A bpf program gets first chance to drop the packet. It may
+		/* An xdp program gets first chance to drop the packet. It may
 		 * read bytes but not past the end of the frag.
 		 */
-		if (xdp_prog) {
+		if (run_xdp) {
 			struct xdp_buff xdp;
 			dma_addr_t dma;
 			void *orig_data;
+			struct xdp_hook *last_hook;
 			u32 act;
 
 			dma = be64_to_cpu(rx_desc->data[0].addr);
@@ -914,7 +910,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 			xdp.data_end = xdp.data + length;
 			orig_data = xdp.data;
 
-			act = bpf_prog_run_xdp(xdp_prog, &xdp);
+			act = xdp_hook_run_ret_last(&cq->napi, &xdp,
+						    &last_hook);
 
 			if (xdp.data != orig_data) {
 				length = xdp.data_end - xdp.data;
@@ -930,12 +927,12 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 							length, cq->ring,
 							&doorbell_pending)))
 					goto consumed;
-				trace_xdp_exception(dev, xdp_prog, act);
+				trace_xdp_hook_exception(dev, last_hook, act);
 				goto xdp_drop_no_cnt; /* Drop on xmit failure */
 			default:
-				bpf_warn_invalid_xdp_action(act);
+				xdp_warn_invalid_action(act);
 			case XDP_ABORTED:
-				trace_xdp_exception(dev, xdp_prog, act);
+				trace_xdp_hook_exception(dev, last_hook, act);
 			case XDP_DROP:
 				ring->xdp_drop++;
 xdp_drop_no_cnt:
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 3ed4219..870acb7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -43,6 +43,7 @@
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/moduleparam.h>
+#include <net/xdp.h>
 
 #include "mlx4_en.h"
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 4941b69..4e7667e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -348,7 +348,6 @@ struct mlx4_en_rx_ring {
 	u8  fcs_del;
 	void *buf;
 	void *rx_info;
-	struct bpf_prog __rcu *xdp_prog;
 	struct mlx4_en_page_cache page_cache;
 	unsigned long bytes;
 	unsigned long packets;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 3/8] nfp: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 1/8] " Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 2/8] mlx4: Changes to use generic XDP infrastructure Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 23:10   ` Jakub Kicinski
  2017-02-21 19:34 ` [PATCH RFC v3 4/8] qede: " Tom Herbert
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |   5 +-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 172 ++++++++++-----------
 .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  12 +-
 3 files changed, 87 insertions(+), 102 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index e614a37..732e40b 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -503,14 +503,13 @@ struct nfp_net {
 	unsigned is_vf:1;
 	unsigned bpf_offload_skip_sw:1;
 	unsigned bpf_offload_xdp:1;
+	unsigned xdp_enabled:1;
 
 	u32 ctrl;
 	u32 fl_bufsz;
 
 	u32 rx_offset;
 
-	struct bpf_prog *xdp_prog;
-
 	struct nfp_net_tx_ring *tx_rings;
 	struct nfp_net_rx_ring *rx_rings;
 
@@ -788,7 +787,7 @@ void
 nfp_net_irqs_assign(struct nfp_net *nn, struct msix_entry *irq_entries,
 		    unsigned int n);
 int
-nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
+nfp_net_ring_reconfig(struct nfp_net *nn,
 		      struct nfp_net_ring_set *rx, struct nfp_net_ring_set *tx);
 
 #ifdef CONFIG_NFP_DEBUG
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 074259c..6200035 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -65,6 +65,7 @@
 
 #include <net/pkt_cls.h>
 #include <net/vxlan.h>
+#include <net/xdp.h>
 
 #include "nfp_net_ctrl.h"
 #include "nfp_net.h"
@@ -1169,10 +1170,10 @@ nfp_net_napi_alloc_one(struct nfp_net *nn, int direction, dma_addr_t *dma_addr)
 {
 	void *frag;
 
-	if (!nn->xdp_prog)
-		frag = napi_alloc_frag(nn->fl_bufsz);
-	else
+	if (nn->xdp_enabled)
 		frag = page_address(alloc_page(GFP_ATOMIC | __GFP_COLD));
+	else
+		frag = napi_alloc_frag(nn->fl_bufsz);
 	if (!frag) {
 		nn_warn_ratelimit(nn, "Failed to alloc receive page frag\n");
 		return NULL;
@@ -1180,7 +1181,7 @@ nfp_net_napi_alloc_one(struct nfp_net *nn, int direction, dma_addr_t *dma_addr)
 
 	*dma_addr = nfp_net_dma_map_rx(nn, frag, nn->fl_bufsz, direction);
 	if (dma_mapping_error(&nn->pdev->dev, *dma_addr)) {
-		nfp_net_free_frag(frag, nn->xdp_prog);
+		nfp_net_free_frag(frag, nn->xdp_enabled);
 		nn_warn_ratelimit(nn, "Failed to map DMA RX buffer\n");
 		return NULL;
 	}
@@ -1251,17 +1252,15 @@ static void nfp_net_rx_ring_reset(struct nfp_net_rx_ring *rx_ring)
  * nfp_net_rx_ring_bufs_free() - Free any buffers currently on the RX ring
  * @nn:		NFP Net device
  * @rx_ring:	RX ring to remove buffers from
- * @xdp:	Whether XDP is enabled
  *
  * Assumes that the device is stopped and buffers are in [0, ring->cnt - 1)
  * entries.  After device is disabled nfp_net_rx_ring_reset() must be called
  * to restore required ring geometry.
  */
 static void
-nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
-			  bool xdp)
+nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
 {
-	int direction = xdp ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
+	int direction = nn->xdp_enabled ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
 	unsigned int i;
 
 	for (i = 0; i < rx_ring->cnt - 1; i++) {
@@ -1274,7 +1273,7 @@ nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
 
 		nfp_net_dma_unmap_rx(nn, rx_ring->rxbufs[i].dma_addr,
 				     rx_ring->bufsz, direction);
-		nfp_net_free_frag(rx_ring->rxbufs[i].frag, xdp);
+		nfp_net_free_frag(rx_ring->rxbufs[i].frag, nn->xdp_enabled);
 		rx_ring->rxbufs[i].dma_addr = 0;
 		rx_ring->rxbufs[i].frag = NULL;
 	}
@@ -1287,8 +1286,7 @@ nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
  * @xdp:	Whether XDP is enabled
  */
 static int
-nfp_net_rx_ring_bufs_alloc(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
-			   bool xdp)
+nfp_net_rx_ring_bufs_alloc(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
 {
 	struct nfp_net_rx_buf *rxbufs;
 	unsigned int i;
@@ -1298,9 +1296,9 @@ nfp_net_rx_ring_bufs_alloc(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
 	for (i = 0; i < rx_ring->cnt - 1; i++) {
 		rxbufs[i].frag =
 			nfp_net_rx_alloc_one(rx_ring, &rxbufs[i].dma_addr,
-					     rx_ring->bufsz, xdp);
+					     rx_ring->bufsz, nn->xdp_enabled);
 		if (!rxbufs[i].frag) {
-			nfp_net_rx_ring_bufs_free(nn, rx_ring, xdp);
+			nfp_net_rx_ring_bufs_free(nn, rx_ring);
 			return -ENOMEM;
 		}
 	}
@@ -1516,16 +1514,6 @@ nfp_net_tx_xdp_buf(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
 	return true;
 }
 
-static int nfp_net_run_xdp(struct bpf_prog *prog, void *data, unsigned int len)
-{
-	struct xdp_buff xdp;
-
-	xdp.data = data;
-	xdp.data_end = data + len;
-
-	return bpf_prog_run_xdp(prog, &xdp);
-}
-
 /**
  * nfp_net_rx() - receive up to @budget packets on @rx_ring
  * @rx_ring:   RX ring to receive from
@@ -1542,19 +1530,20 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 	struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
 	struct nfp_net *nn = r_vec->nfp_net;
 	struct nfp_net_tx_ring *tx_ring;
-	struct bpf_prog *xdp_prog;
 	unsigned int true_bufsz;
 	struct sk_buff *skb;
+	bool run_xdp;
 	int pkts_polled = 0;
 	int rx_dma_map_dir;
 	int idx;
 
 	rcu_read_lock();
-	xdp_prog = READ_ONCE(nn->xdp_prog);
-	rx_dma_map_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
-	true_bufsz = xdp_prog ? PAGE_SIZE : nn->fl_bufsz;
+	rx_dma_map_dir = nn->xdp_enabled ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
+	true_bufsz = nn->xdp_enabled ? PAGE_SIZE : nn->fl_bufsz;
 	tx_ring = r_vec->xdp_ring;
 
+	run_xdp = xdp_hook_run_needed_check(nn->netdev, &r_vec->napi);
+
 	while (pkts_polled < budget) {
 		unsigned int meta_len, data_len, data_off, pkt_len, pkt_off;
 		struct nfp_net_rx_buf *rxbuf;
@@ -1605,15 +1594,21 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 		r_vec->rx_bytes += pkt_len;
 		u64_stats_update_end(&r_vec->rx_sync);
 
-		if (xdp_prog && !(rxd->rxd.flags & PCIE_DESC_RX_BPF &&
-				  nn->bpf_offload_xdp)) {
+		if (run_xdp && !(rxd->rxd.flags & PCIE_DESC_RX_BPF &&
+				 nn->bpf_offload_xdp)) {
 			int act;
+			struct xdp_buff xdp;
+			struct xdp_hook *last_hook;
 
 			dma_sync_single_for_cpu(&nn->pdev->dev,
 						rxbuf->dma_addr + pkt_off,
 						pkt_len, DMA_FROM_DEVICE);
-			act = nfp_net_run_xdp(xdp_prog, rxbuf->frag + data_off,
-					      pkt_len);
+
+			xdp.data = rxbuf->frag + data_off;
+			xdp.data_end = xdp.data + pkt_len;
+
+			act = xdp_hook_run_ret_last(&r_vec->napi, &xdp,
+						    &last_hook);
 			switch (act) {
 			case XDP_PASS:
 				break;
@@ -1621,12 +1616,15 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 				if (unlikely(!nfp_net_tx_xdp_buf(nn, rx_ring,
 								 tx_ring, rxbuf,
 								 pkt_off, pkt_len)))
-					trace_xdp_exception(nn->netdev, xdp_prog, act);
+					trace_xdp_hook_exception(nn->netdev,
+								 last_hook,
+								 act);
 				continue;
 			default:
-				bpf_warn_invalid_xdp_action(act);
+				xdp_warn_invalid_action(act);
 			case XDP_ABORTED:
-				trace_xdp_exception(nn->netdev, xdp_prog, act);
+				trace_xdp_hook_exception(nn->netdev, last_hook,
+							 act);
 			case XDP_DROP:
 				nfp_net_rx_give_one(rx_ring, rxbuf->frag,
 						    rxbuf->dma_addr);
@@ -1679,7 +1677,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 		napi_gro_receive(&rx_ring->r_vec->napi, skb);
 	}
 
-	if (xdp_prog && tx_ring->wr_ptr_add)
+	if (run_xdp && tx_ring->wr_ptr_add)
 		nfp_net_tx_xmit_more_flush(tx_ring);
 	rcu_read_unlock();
 
@@ -1910,8 +1908,7 @@ nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring, unsigned int fl_bufsz,
 }
 
 static struct nfp_net_rx_ring *
-nfp_net_rx_ring_set_prepare(struct nfp_net *nn, struct nfp_net_ring_set *s,
-			    bool xdp)
+nfp_net_rx_ring_set_prepare(struct nfp_net *nn, struct nfp_net_ring_set *s)
 {
 	unsigned int fl_bufsz =	nfp_net_calc_fl_bufsz(nn, s->mtu);
 	struct nfp_net_rx_ring *rings;
@@ -1927,7 +1924,7 @@ nfp_net_rx_ring_set_prepare(struct nfp_net *nn, struct nfp_net_ring_set *s,
 		if (nfp_net_rx_ring_alloc(&rings[r], fl_bufsz, s->dcnt))
 			goto err_free_prev;
 
-		if (nfp_net_rx_ring_bufs_alloc(nn, &rings[r], xdp))
+		if (nfp_net_rx_ring_bufs_alloc(nn, &rings[r]))
 			goto err_free_ring;
 	}
 
@@ -1935,7 +1932,7 @@ nfp_net_rx_ring_set_prepare(struct nfp_net *nn, struct nfp_net_ring_set *s,
 
 err_free_prev:
 	while (r--) {
-		nfp_net_rx_ring_bufs_free(nn, &rings[r], xdp);
+		nfp_net_rx_ring_bufs_free(nn, &rings[r]);
 err_free_ring:
 		nfp_net_rx_ring_free(&rings[r]);
 	}
@@ -1961,14 +1958,13 @@ nfp_net_rx_ring_set_swap(struct nfp_net *nn, struct nfp_net_ring_set *s)
 }
 
 static void
-nfp_net_rx_ring_set_free(struct nfp_net *nn, struct nfp_net_ring_set *s,
-			 bool xdp)
+nfp_net_rx_ring_set_free(struct nfp_net *nn, struct nfp_net_ring_set *s)
 {
 	struct nfp_net_rx_ring *rings = s->rings;
 	unsigned int r;
 
 	for (r = 0; r < s->n_rings; r++) {
-		nfp_net_rx_ring_bufs_free(nn, &rings[r], xdp);
+		nfp_net_rx_ring_bufs_free(nn, &rings[r]);
 		nfp_net_rx_ring_free(&rings[r]);
 	}
 
@@ -2304,7 +2300,7 @@ static int nfp_net_netdev_open(struct net_device *netdev)
 			goto err_cleanup_vec_p;
 	}
 
-	nn->rx_rings = nfp_net_rx_ring_set_prepare(nn, &rx, nn->xdp_prog);
+	nn->rx_rings = nfp_net_rx_ring_set_prepare(nn, &rx);
 	if (!nn->rx_rings) {
 		err = -ENOMEM;
 		goto err_cleanup_vec;
@@ -2352,7 +2348,7 @@ static int nfp_net_netdev_open(struct net_device *netdev)
 err_free_rings:
 	nfp_net_tx_ring_set_free(nn, &tx);
 err_free_rx_rings:
-	nfp_net_rx_ring_set_free(nn, &rx, nn->xdp_prog);
+	nfp_net_rx_ring_set_free(nn, &rx);
 err_cleanup_vec:
 	r = nn->num_r_vecs;
 err_cleanup_vec_p:
@@ -2393,7 +2389,7 @@ static void nfp_net_close_free_all(struct nfp_net *nn)
 	unsigned int r;
 
 	for (r = 0; r < nn->num_rx_rings; r++) {
-		nfp_net_rx_ring_bufs_free(nn, &nn->rx_rings[r], nn->xdp_prog);
+		nfp_net_rx_ring_bufs_free(nn, &nn->rx_rings[r]);
 		nfp_net_rx_ring_free(&nn->rx_rings[r]);
 	}
 	for (r = 0; r < nn->num_tx_rings; r++)
@@ -2474,7 +2470,6 @@ static void nfp_net_rss_init_itbl(struct nfp_net *nn)
 static int
 nfp_net_ring_swap_enable(struct nfp_net *nn, unsigned int *num_vecs,
 			 unsigned int *stack_tx_rings,
-			 struct bpf_prog **xdp_prog,
 			 struct nfp_net_ring_set *rx,
 			 struct nfp_net_ring_set *tx)
 {
@@ -2488,7 +2483,6 @@ nfp_net_ring_swap_enable(struct nfp_net *nn, unsigned int *num_vecs,
 
 	swap(*num_vecs, nn->num_r_vecs);
 	swap(*stack_tx_rings, nn->num_stack_tx_rings);
-	*xdp_prog = xchg(&nn->xdp_prog, *xdp_prog);
 
 	for (r = 0; r <	nn->max_r_vecs; r++)
 		nfp_net_vector_assign_rings(nn, &nn->r_vecs[r], r);
@@ -2512,11 +2506,11 @@ nfp_net_ring_swap_enable(struct nfp_net *nn, unsigned int *num_vecs,
 }
 
 static int
-nfp_net_check_config(struct nfp_net *nn, struct bpf_prog *xdp_prog,
+nfp_net_check_config(struct nfp_net *nn,
 		     struct nfp_net_ring_set *rx, struct nfp_net_ring_set *tx)
 {
 	/* XDP-enabled tests */
-	if (!xdp_prog)
+	if (!nn->xdp_enabled)
 		return 0;
 	if (rx && nfp_net_calc_fl_bufsz(nn, rx->mtu) > PAGE_SIZE) {
 		nn_warn(nn, "MTU too large w/ XDP enabled\n");
@@ -2531,7 +2525,7 @@ nfp_net_check_config(struct nfp_net *nn, struct bpf_prog *xdp_prog,
 }
 
 static void
-nfp_net_ring_reconfig_down(struct nfp_net *nn, struct bpf_prog **xdp_prog,
+nfp_net_ring_reconfig_down(struct nfp_net *nn,
 			   struct nfp_net_ring_set *rx,
 			   struct nfp_net_ring_set *tx,
 			   unsigned int stack_tx_rings, unsigned int num_vecs)
@@ -2544,31 +2538,30 @@ nfp_net_ring_reconfig_down(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 	nn->num_tx_rings = tx ? tx->n_rings : nn->num_tx_rings;
 	nn->num_stack_tx_rings = stack_tx_rings;
 	nn->num_r_vecs = num_vecs;
-	*xdp_prog = xchg(&nn->xdp_prog, *xdp_prog);
 
 	if (!netif_is_rxfh_configured(nn->netdev))
 		nfp_net_rss_init_itbl(nn);
 }
 
 int
-nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
+nfp_net_ring_reconfig(struct nfp_net *nn,
 		      struct nfp_net_ring_set *rx, struct nfp_net_ring_set *tx)
 {
 	unsigned int stack_tx_rings, num_vecs, r;
 	int err;
 
 	stack_tx_rings = tx ? tx->n_rings : nn->num_tx_rings;
-	if (*xdp_prog)
+	if (nn->xdp_enabled)
 		stack_tx_rings -= rx ? rx->n_rings : nn->num_rx_rings;
 
 	num_vecs = max(rx ? rx->n_rings : nn->num_rx_rings, stack_tx_rings);
 
-	err = nfp_net_check_config(nn, *xdp_prog, rx, tx);
+	err = nfp_net_check_config(nn, rx, tx);
 	if (err)
 		return err;
 
 	if (!netif_running(nn->netdev)) {
-		nfp_net_ring_reconfig_down(nn, xdp_prog, rx, tx,
+		nfp_net_ring_reconfig_down(nn, rx, tx,
 					   stack_tx_rings, num_vecs);
 		return 0;
 	}
@@ -2582,7 +2575,7 @@ nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 		}
 	}
 	if (rx) {
-		if (!nfp_net_rx_ring_set_prepare(nn, rx, *xdp_prog)) {
+		if (!nfp_net_rx_ring_set_prepare(nn, rx)) {
 			err = -ENOMEM;
 			goto err_cleanup_vecs;
 		}
@@ -2599,7 +2592,7 @@ nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 	nfp_net_clear_config_and_disable(nn);
 
 	err = nfp_net_ring_swap_enable(nn, &num_vecs, &stack_tx_rings,
-				       xdp_prog, rx, tx);
+				       rx, tx);
 	if (err) {
 		int err2;
 
@@ -2607,7 +2600,7 @@ nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 
 		/* Try with old configuration and old rings */
 		err2 = nfp_net_ring_swap_enable(nn, &num_vecs, &stack_tx_rings,
-						xdp_prog, rx, tx);
+						rx, tx);
 		if (err2)
 			nn_err(nn, "Can't restore ring config - FW communication failed (%d,%d)\n",
 			       err, err2);
@@ -2616,7 +2609,7 @@ nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 		nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
 
 	if (rx)
-		nfp_net_rx_ring_set_free(nn, rx, *xdp_prog);
+		nfp_net_rx_ring_set_free(nn, rx);
 	if (tx)
 		nfp_net_tx_ring_set_free(nn, tx);
 
@@ -2626,7 +2619,7 @@ nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
 
 err_free_rx:
 	if (rx)
-		nfp_net_rx_ring_set_free(nn, rx, *xdp_prog);
+		nfp_net_rx_ring_set_free(nn, rx);
 err_cleanup_vecs:
 	for (r = num_vecs - 1; r >= nn->num_r_vecs; r--)
 		nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
@@ -2642,7 +2635,7 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 		.dcnt = nn->rxd_cnt,
 	};
 
-	return nfp_net_ring_reconfig(nn, &nn->xdp_prog, &rx, NULL);
+	return nfp_net_ring_reconfig(nn, &rx, NULL);
 }
 
 static void nfp_net_stat64(struct net_device *netdev,
@@ -2938,7 +2931,17 @@ static int nfp_net_xdp_offload(struct nfp_net *nn, struct bpf_prog *prog)
 	return ret;
 }
 
-static int nfp_net_xdp_setup(struct nfp_net *nn, struct bpf_prog *prog)
+static int nfp_xdp_check_bpf(struct nfp_net *nn, struct bpf_prog *prog)
+{
+	if (prog && prog->xdp_adjust_head) {
+		nn_err(nn, "Does not support bpf_xdp_adjust_head()\n");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int nfp_net_xdp_init(struct nfp_net *nn, bool enable)
 {
 	struct nfp_net_ring_set rx = {
 		.n_rings = nn->num_rx_rings,
@@ -2951,33 +2954,17 @@ static int nfp_net_xdp_setup(struct nfp_net *nn, struct bpf_prog *prog)
 	};
 	int err;
 
-	if (prog && prog->xdp_adjust_head) {
-		nn_err(nn, "Does not support bpf_xdp_adjust_head()\n");
-		return -EOPNOTSUPP;
-	}
-	if (!prog && !nn->xdp_prog)
-		return 0;
-	if (prog && nn->xdp_prog) {
-		prog = xchg(&nn->xdp_prog, prog);
-		bpf_prog_put(prog);
-		nfp_net_xdp_offload(nn, nn->xdp_prog);
+	if (nn->xdp_enabled == enable)
 		return 0;
-	}
-
-	tx.n_rings += prog ? nn->num_rx_rings : -nn->num_rx_rings;
 
-	/* We need RX reconfig to remap the buffers (BIDIR vs FROM_DEV) */
-	err = nfp_net_ring_reconfig(nn, &prog, &rx, &tx);
-	if (err)
-		return err;
+	nn->xdp_enabled = enable;
 
-	/* @prog got swapped and is now the old one */
-	if (prog)
-		bpf_prog_put(prog);
+	tx.n_rings += enable ? nn->num_rx_rings : -nn->num_rx_rings;
 
-	nfp_net_xdp_offload(nn, nn->xdp_prog);
+	/* We need RX reconfig to remap the buffers (BIDIR vs FROM_DEV) */
+	err = nfp_net_ring_reconfig(nn, &rx, &tx);
 
-	return 0;
+	return err;
 }
 
 static int nfp_net_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
@@ -2985,11 +2972,14 @@ static int nfp_net_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
 	struct nfp_net *nn = netdev_priv(netdev);
 
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		return nfp_net_xdp_setup(nn, xdp->prog);
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = !!nn->xdp_prog;
-		return 0;
+	case XDP_MODE_ON:
+		return nfp_net_xdp_init(nn, true);
+	case XDP_MODE_OFF:
+		return nfp_net_xdp_init(nn, false);
+	case XDP_CHECK_BPF_PROG:
+		return nfp_xdp_check_bpf(nn, xdp->prog);
+	case XDP_OFFLOAD_BPF:
+		return nfp_net_xdp_offload(nn, xdp->prog);
 	default:
 		return -EINVAL;
 	}
@@ -3175,7 +3165,7 @@ int nfp_net_netdev_init(struct net_device *netdev)
 	 * and netdev->hw_features advertises which features are
 	 * supported.  By default we enable most features.
 	 */
-	netdev->hw_features = NETIF_F_HIGHDMA;
+	netdev->hw_features = NETIF_F_HIGHDMA | NETIF_F_XDP;
 	if (nn->cap & NFP_NET_CFG_CTRL_RXCSUM) {
 		netdev->hw_features |= NETIF_F_RXCSUM;
 		nn->ctrl |= NFP_NET_CFG_CTRL_RXCSUM;
@@ -3274,8 +3264,6 @@ void nfp_net_netdev_clean(struct net_device *netdev)
 {
 	struct nfp_net *nn = netdev_priv(netdev);
 
-	if (nn->xdp_prog)
-		bpf_prog_put(nn->xdp_prog);
 	if (nn->bpf_offload_xdp)
 		nfp_net_xdp_offload(nn, NULL);
 	unregister_netdev(nn->netdev);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 2649f75..1377c7e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -201,8 +201,7 @@ static int nfp_net_set_ring_size(struct nfp_net *nn, u32 rxd_cnt, u32 txd_cnt)
 	if (nn->txd_cnt != txd_cnt)
 		reconfig_tx = &tx;
 
-	return nfp_net_ring_reconfig(nn, &nn->xdp_prog,
-				     reconfig_rx, reconfig_tx);
+	return nfp_net_ring_reconfig(nn, reconfig_rx, reconfig_tx);
 }
 
 static int nfp_net_set_ringparam(struct net_device *netdev,
@@ -737,7 +736,7 @@ static void nfp_net_get_channels(struct net_device *netdev,
 	unsigned int num_tx_rings;
 
 	num_tx_rings = nn->num_tx_rings;
-	if (nn->xdp_prog)
+	if (nn->xdp_enabled)
 		num_tx_rings -= nn->num_rx_rings;
 
 	channel->max_rx = min(nn->max_rx_rings, nn->max_r_vecs);
@@ -767,15 +766,14 @@ static int nfp_net_set_num_rings(struct nfp_net *nn, unsigned int total_rx,
 	if (nn->num_rx_rings != total_rx)
 		reconfig_rx = &rx;
 	if (nn->num_stack_tx_rings != total_tx ||
-	    (nn->xdp_prog && reconfig_rx))
+	    (nn->xdp_enabled && reconfig_rx))
 		reconfig_tx = &tx;
 
 	/* nfp_net_check_config() will catch tx.n_rings > nn->max_tx_rings */
-	if (nn->xdp_prog)
+	if (nn->xdp_enabled)
 		tx.n_rings += total_rx;
 
-	return nfp_net_ring_reconfig(nn, &nn->xdp_prog,
-				     reconfig_rx, reconfig_tx);
+	return nfp_net_ring_reconfig(nn, reconfig_rx, reconfig_tx);
 }
 
 static int nfp_net_set_channels(struct net_device *netdev,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 4/8] qede: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (2 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 3/8] nfp: " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 5/8] virt_net: " Tom Herbert
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/qlogic/qede/qede.h         |  3 +-
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  2 +-
 drivers/net/ethernet/qlogic/qede/qede_filter.c  | 39 ++++++++++---------------
 drivers/net/ethernet/qlogic/qede/qede_fp.c      | 36 +++++++++++++----------
 drivers/net/ethernet/qlogic/qede/qede_main.c    | 23 ++++-----------
 5 files changed, 45 insertions(+), 58 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index f2aaef2..fdd2874 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -217,10 +217,9 @@ struct qede_dev {
 	u16				geneve_dst_port;
 
 	bool wol_enabled;
+	bool xdp_enabled;
 
 	struct qede_rdma_dev		rdma_info;
-
-	struct bpf_prog *xdp_prog;
 };
 
 enum QEDE_STATE {
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 8979531..5d22764 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -342,7 +342,7 @@ static int qede_get_sset_count(struct net_device *dev, int stringset)
 		num_stats += QEDE_RSS_COUNT(edev) * QEDE_NUM_RQSTATS;
 
 		/* Account for XDP statistics [if needed] */
-		if (edev->xdp_prog)
+		if (edev->xdp_enabled)
 			num_stats += QEDE_RSS_COUNT(edev) * QEDE_NUM_TQSTATS;
 		return num_stats;
 
diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c
index 107c3fd..9c9db44 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_filter.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c
@@ -426,7 +426,7 @@ int qede_set_features(struct net_device *dev, netdev_features_t features)
 		 * aggregations, so no need to actually reload.
 		 */
 		__qede_lock(edev);
-		if (edev->xdp_prog)
+		if (edev->xdp_enabled)
 			args.func(edev, &args);
 		else
 			qede_reload(edev, &args, true);
@@ -506,29 +506,21 @@ void qede_udp_tunnel_del(struct net_device *dev, struct udp_tunnel_info *ti)
 	schedule_delayed_work(&edev->sp_task, 0);
 }
 
-static void qede_xdp_reload_func(struct qede_dev *edev,
-				 struct qede_reload_args *args)
+static int qede_xdp_check_bpf(struct qede_dev *edev, struct bpf_prog *prog)
 {
-	struct bpf_prog *old;
-
-	old = xchg(&edev->xdp_prog, args->u.new_prog);
-	if (old)
-		bpf_prog_put(old);
-}
-
-static int qede_xdp_set(struct qede_dev *edev, struct bpf_prog *prog)
-{
-	struct qede_reload_args args;
-
 	if (prog && prog->xdp_adjust_head) {
 		DP_ERR(edev, "Does not support bpf_xdp_adjust_head()\n");
 		return -EOPNOTSUPP;
 	}
 
-	/* If we're called, there was already a bpf reference increment */
-	args.func = &qede_xdp_reload_func;
-	args.u.new_prog = prog;
-	qede_reload(edev, &args, false);
+	return 0;
+}
+
+static int qede_xdp_init(struct qede_dev *edev, bool enable)
+{
+	edev->xdp_enabled = enable;
+
+	qede_reload(edev, NULL, false);
 
 	return 0;
 }
@@ -538,11 +530,12 @@ int qede_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 	struct qede_dev *edev = netdev_priv(dev);
 
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		return qede_xdp_set(edev, xdp->prog);
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = !!edev->xdp_prog;
-		return 0;
+	case XDP_MODE_OFF:
+		return qede_xdp_init(edev, true);
+	case XDP_MODE_ON:
+		return qede_xdp_init(edev, false);
+	case XDP_CHECK_BPF_PROG:
+		return qede_xdp_check_bpf(edev, xdp->prog);
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index 1e65038..7423317 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -41,6 +41,7 @@
 #include <linux/if_vlan.h>
 #include <net/ip6_checksum.h>
 #include "qede_ptp.h"
+#include <net/xdp.h>
 
 #include <linux/qed/qed_if.h>
 #include "qede.h"
@@ -988,13 +989,14 @@ static bool qede_pkt_is_ip_fragmented(struct eth_fast_path_rx_reg_cqe *cqe,
 static bool qede_rx_xdp(struct qede_dev *edev,
 			struct qede_fastpath *fp,
 			struct qede_rx_queue *rxq,
-			struct bpf_prog *prog,
 			struct sw_rx_data *bd,
 			struct eth_fast_path_rx_reg_cqe *cqe)
 {
 	u16 len = le16_to_cpu(cqe->len_on_first_bd);
 	struct xdp_buff xdp;
 	enum xdp_action act;
+	struct xdp_hook *last_hook;
+	bool retval = false;
 
 	xdp.data = page_address(bd->data) + cqe->placement_offset;
 	xdp.data_end = xdp.data + len;
@@ -1004,11 +1006,13 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 	 * side for map helpers.
 	 */
 	rcu_read_lock();
-	act = bpf_prog_run_xdp(prog, &xdp);
-	rcu_read_unlock();
 
-	if (act == XDP_PASS)
-		return true;
+	act = xdp_hook_run_ret_last(&fp->napi, &xdp, &last_hook);
+
+	if (act == XDP_PASS) {
+		retval = true;
+		goto out;
+	}
 
 	/* Count number of packets not to be passed to stack */
 	rxq->xdp_no_pass++;
@@ -1018,8 +1022,8 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 		/* We need the replacement buffer before transmit. */
 		if (qede_alloc_rx_buffer(rxq, true)) {
 			qede_recycle_rx_bd_ring(rxq, 1);
-			trace_xdp_exception(edev->ndev, prog, act);
-			return false;
+			trace_xdp_hook_exception(edev->ndev, last_hook, act);
+			goto out;
 		}
 
 		/* Now if there's a transmission problem, we'd still have to
@@ -1029,22 +1033,25 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 			dma_unmap_page(rxq->dev, bd->mapping,
 				       PAGE_SIZE, DMA_BIDIRECTIONAL);
 			__free_page(bd->data);
-			trace_xdp_exception(edev->ndev, prog, act);
+			trace_xdp_hook_exception(edev->ndev, last_hook, act);
 		}
 
 		/* Regardless, we've consumed an Rx BD */
 		qede_rx_bd_ring_consume(rxq);
-		return false;
+		goto out;
 
 	default:
-		bpf_warn_invalid_xdp_action(act);
+		xdp_warn_invalid_action(act);
 	case XDP_ABORTED:
-		trace_xdp_exception(edev->ndev, prog, act);
+		trace_xdp_hook_exception(edev->ndev, last_hook, act);
 	case XDP_DROP:
 		qede_recycle_rx_bd_ring(rxq, cqe->bd_num);
 	}
 
-	return false;
+out:
+	rcu_read_unlock();
+
+	return retval;
 }
 
 static struct sk_buff *qede_rx_allocate_skb(struct qede_dev *edev,
@@ -1189,7 +1196,6 @@ static int qede_rx_process_cqe(struct qede_dev *edev,
 			       struct qede_fastpath *fp,
 			       struct qede_rx_queue *rxq)
 {
-	struct bpf_prog *xdp_prog = READ_ONCE(rxq->xdp_prog);
 	struct eth_fast_path_rx_reg_cqe *fp_cqe;
 	u16 len, pad, bd_cons_idx, parse_flag;
 	enum eth_rx_cqe_type cqe_type;
@@ -1227,8 +1233,8 @@ static int qede_rx_process_cqe(struct qede_dev *edev,
 	pad = fp_cqe->placement_offset;
 
 	/* Run eBPF program if one is attached */
-	if (xdp_prog)
-		if (!qede_rx_xdp(edev, fp, rxq, xdp_prog, bd, fp_cqe))
+	if (xdp_hook_run_needed_check(edev->ndev, &fp->napi))
+		if (!qede_rx_xdp(edev, fp, rxq, bd, fp_cqe))
 			return 1;
 
 	/* If this is an error packet then drop it */
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 3a78c3f..312da1c 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -581,7 +581,7 @@ static void qede_init_ndev(struct qede_dev *edev)
 {
 	struct net_device *ndev = edev->ndev;
 	struct pci_dev *pdev = edev->pdev;
-	u32 hw_features;
+	netdev_features_t hw_features;
 
 	pci_set_drvdata(pdev, ndev);
 
@@ -601,7 +601,7 @@ static void qede_init_ndev(struct qede_dev *edev)
 	/* user-changeble features */
 	hw_features = NETIF_F_GRO | NETIF_F_SG |
 		      NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-		      NETIF_F_TSO | NETIF_F_TSO6;
+		      NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_XDP;
 
 	/* Encap features*/
 	hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |
@@ -730,7 +730,7 @@ static int qede_alloc_fp_array(struct qede_dev *edev)
 			if (!fp->rxq)
 				goto err;
 
-			if (edev->xdp_prog) {
+			if (edev->xdp_enabled) {
 				fp->xdp_tx = kzalloc(sizeof(*fp->xdp_tx),
 						     GFP_KERNEL);
 				if (!fp->xdp_tx)
@@ -952,9 +952,7 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
 
 	pci_set_drvdata(pdev, NULL);
 
-	/* Release edev's reference to XDP's bpf if such exist */
-	if (edev->xdp_prog)
-		bpf_prog_put(edev->xdp_prog);
+	free_netdev(ndev);
 
 	/* Use global ops since we've freed edev */
 	qed_ops->common->slowpath_stop(cdev);
@@ -1114,7 +1112,7 @@ static int qede_alloc_sge_mem(struct qede_dev *edev, struct qede_rx_queue *rxq)
 	int i;
 
 	/* Don't perform FW aggregations in case of XDP */
-	if (edev->xdp_prog)
+	if (edev->xdp_enabled)
 		edev->gro_disable = 1;
 
 	if (edev->gro_disable)
@@ -1172,7 +1170,7 @@ static int qede_alloc_mem_rxq(struct qede_dev *edev, struct qede_rx_queue *rxq)
 	/* Segment size to spilt a page in multiple equal parts,
 	 * unless XDP is used in which case we'd use the entire page.
 	 */
-	if (!edev->xdp_prog)
+	if (!edev->xdp_enabled)
 		rxq->rx_buf_seg_size = roundup_pow_of_two(rxq->rx_buf_size);
 	else
 		rxq->rx_buf_seg_size = PAGE_SIZE;
@@ -1625,8 +1623,6 @@ static int qede_stop_queues(struct qede_dev *edev)
 			rc = qede_stop_txq(edev, fp->xdp_tx, i);
 			if (rc)
 				return rc;
-
-			bpf_prog_put(fp->rxq->xdp_prog);
 		}
 	}
 
@@ -1770,13 +1766,6 @@ static int qede_start_queues(struct qede_dev *edev, bool clear_stats)
 			rc = qede_start_txq(edev, fp, fp->xdp_tx, i, XDP_PI);
 			if (rc)
 				goto out;
-
-			fp->rxq->xdp_prog = bpf_prog_add(edev->xdp_prog, 1);
-			if (IS_ERR(fp->rxq->xdp_prog)) {
-				rc = PTR_ERR(fp->rxq->xdp_prog);
-				fp->rxq->xdp_prog = NULL;
-				goto out;
-			}
 		}
 
 		if (fp->type & QEDE_FASTPATH_TX) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 5/8] virt_net: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (3 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 4/8] qede: " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 6/8] mlx5: " Tom Herbert
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/virtio_net.c | 99 +++++++++++++++++++-----------------------------
 1 file changed, 39 insertions(+), 60 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bf95016..1f5eddf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -95,8 +95,6 @@ struct receive_queue {
 
 	struct napi_struct napi;
 
-	struct bpf_prog __rcu *xdp_prog;
-
 	/* Chain pages by the private ptr. */
 	struct page *pages;
 
@@ -142,6 +140,9 @@ struct virtnet_info {
 	/* Host can handle any s/g split between our header and packet data */
 	bool any_header_sg;
 
+	/* XDP has been enabled in device */
+	bool xdp_enabled;
+
 	/* Packet virtio header size */
 	u8 hdr_len;
 
@@ -394,19 +395,19 @@ static struct sk_buff *receive_small(struct net_device *dev,
 				     void *buf, unsigned int len)
 {
 	struct sk_buff *skb;
-	struct bpf_prog *xdp_prog;
 	unsigned int xdp_headroom = virtnet_get_headroom(vi);
 	unsigned int header_offset = VIRTNET_RX_PAD + xdp_headroom;
 	unsigned int headroom = vi->hdr_len + header_offset;
 	unsigned int buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	unsigned int delta = 0;
+	struct xdp_hook *last_hook;
+
 	len -= vi->hdr_len;
 
 	rcu_read_lock();
-	xdp_prog = rcu_dereference(rq->xdp_prog);
-	if (xdp_prog) {
-		struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
+	if (xdp_hook_run_needed_check(dev, &rq->napi)) {
+		struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
 		struct xdp_buff xdp;
 		void *orig_data;
 		u32 act;
@@ -418,8 +419,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
 		xdp.data = xdp.data_hard_start + xdp_headroom;
 		xdp.data_end = xdp.data + len;
 		orig_data = xdp.data;
-		act = bpf_prog_run_xdp(xdp_prog, &xdp);
-
+		act = xdp_hook_run_ret_last(&rq->napi, &xdp, &last_hook);
 		switch (act) {
 		case XDP_PASS:
 			/* Recalculate length in case bpf program changed it */
@@ -427,13 +427,13 @@ static struct sk_buff *receive_small(struct net_device *dev,
 			break;
 		case XDP_TX:
 			if (unlikely(!virtnet_xdp_xmit(vi, rq, &xdp)))
-				trace_xdp_exception(vi->dev, xdp_prog, act);
+				trace_xdp_hook_exception(vi->dev, last_hook, act);
 			rcu_read_unlock();
 			goto xdp_xmit;
 		default:
-			bpf_warn_invalid_xdp_action(act);
+			xdp_warn_invalid_action(act);
 		case XDP_ABORTED:
-			trace_xdp_exception(vi->dev, xdp_prog, act);
+			trace_xdp_hook_exception(vi->dev, last_hook, act);
 		case XDP_DROP:
 			goto err_xdp;
 		}
@@ -557,16 +557,15 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	struct page *page = virt_to_head_page(buf);
 	int offset = buf - page_address(page);
 	struct sk_buff *head_skb, *curr_skb;
-	struct bpf_prog *xdp_prog;
 	unsigned int truesize;
 
 	head_skb = NULL;
 
 	rcu_read_lock();
-	xdp_prog = rcu_dereference(rq->xdp_prog);
-	if (xdp_prog) {
+	if (xdp_hook_run_needed_check(dev, &rq->napi)) {
 		struct page *xdp_page;
 		struct xdp_buff xdp;
+		struct xdp_hook *last_hook;
 		void *data;
 		u32 act;
 
@@ -597,7 +596,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		xdp.data_hard_start = data - VIRTIO_XDP_HEADROOM + vi->hdr_len;
 		xdp.data = data + vi->hdr_len;
 		xdp.data_end = xdp.data + (len - vi->hdr_len);
-		act = bpf_prog_run_xdp(xdp_prog, &xdp);
+		act = xdp_hook_run_ret_last(&rq->napi, &xdp, &last_hook);
 
 		switch (act) {
 		case XDP_PASS:
@@ -620,16 +619,16 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			break;
 		case XDP_TX:
 			if (unlikely(!virtnet_xdp_xmit(vi, rq, &xdp)))
-				trace_xdp_exception(vi->dev, xdp_prog, act);
+				trace_xdp_hook_exception(vi->dev, last_hook, act);
 			ewma_pkt_len_add(&rq->mrg_avg_pkt_len, len);
 			if (unlikely(xdp_page != page))
 				goto err_xdp;
 			rcu_read_unlock();
 			goto xdp_xmit;
 		default:
-			bpf_warn_invalid_xdp_action(act);
+			xdp_warn_invalid_action(act);
 		case XDP_ABORTED:
-			trace_xdp_exception(vi->dev, xdp_prog, act);
+			trace_xdp_hook_exception(vi->dev, last_hook, act);
 		case XDP_DROP:
 			if (unlikely(xdp_page != page))
 				__free_pages(xdp_page, 0);
@@ -1606,7 +1605,7 @@ static int virtnet_set_channels(struct net_device *dev,
 	 * also when XDP is loaded all RX queues have XDP programs so we only
 	 * need to check a single RX queue.
 	 */
-	if (vi->rq[0].xdp_prog)
+	if (vi->xdp_enabled)
 		return -EINVAL;
 
 	get_online_cpus();
@@ -1778,11 +1777,20 @@ static int virtnet_reset(struct virtnet_info *vi, int curr_qp, int xdp_qp)
 	return ret;
 }
 
-static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
+static int virtnet_xdp_check_bpf(struct net_device *dev, struct bpf_prog *prog)
+{
+	if (prog && prog->xdp_adjust_head) {
+		netdev_warn(dev, "Does not support bpf_xdp_adjust_head()\n");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int virtnet_xdp_init(struct net_device *dev, bool enable)
 {
 	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
 	struct virtnet_info *vi = netdev_priv(dev);
-	struct bpf_prog *old_prog;
 	u16 xdp_qp = 0, curr_qp;
 	int i, err;
 
@@ -1805,7 +1813,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	}
 
 	curr_qp = vi->curr_queue_pairs - vi->xdp_queue_pairs;
-	if (prog)
+	if (enable)
 		xdp_qp = nr_cpu_ids;
 
 	/* XDP requires extra queues for XDP_TX */
@@ -1815,12 +1823,6 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 		return -ENOMEM;
 	}
 
-	if (prog) {
-		prog = bpf_prog_add(prog, vi->max_queue_pairs - 1);
-		if (IS_ERR(prog))
-			return PTR_ERR(prog);
-	}
-
 	/* Changing the headroom in buffers is a disruptive operation because
 	 * existing buffers must be flushed and reallocated. This will happen
 	 * when a xdp program is initially added or xdp is disabled by removing
@@ -1836,12 +1838,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 
 	netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp);
 
-	for (i = 0; i < vi->max_queue_pairs; i++) {
-		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
-		rcu_assign_pointer(vi->rq[i].xdp_prog, prog);
-		if (old_prog)
-			bpf_prog_put(old_prog);
-	}
+	vi->xdp_enabled = enable;
 
 	return 0;
 
@@ -1850,31 +1847,18 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	 * error up to user space for resolution. The underlying reset hung on
 	 * us so not much we can do here.
 	 */
-	if (prog)
-		bpf_prog_sub(prog, vi->max_queue_pairs - 1);
 	return err;
 }
 
-static bool virtnet_xdp_query(struct net_device *dev)
-{
-	struct virtnet_info *vi = netdev_priv(dev);
-	int i;
-
-	for (i = 0; i < vi->max_queue_pairs; i++) {
-		if (vi->rq[i].xdp_prog)
-			return true;
-	}
-	return false;
-}
-
 static int virtnet_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 {
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		return virtnet_xdp_set(dev, xdp->prog);
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = virtnet_xdp_query(dev);
-		return 0;
+	case XDP_MODE_ON:
+		return virtnet_xdp_init(dev, true);
+	case XDP_MODE_OFF:
+		return virtnet_xdp_init(dev, false);
+	case XDP_CHECK_BPF_PROG:
+		return virtnet_xdp_check_bpf(dev, xdp->prog);
 	default:
 		return -EINVAL;
 	}
@@ -1955,17 +1939,11 @@ static void virtnet_free_queues(struct virtnet_info *vi)
 
 static void _free_receive_bufs(struct virtnet_info *vi)
 {
-	struct bpf_prog *old_prog;
 	int i;
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		while (vi->rq[i].pages)
 			__free_pages(get_a_page(&vi->rq[i], GFP_KERNEL), 0);
-
-		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
-		RCU_INIT_POINTER(vi->rq[i].xdp_prog, NULL);
-		if (old_prog)
-			bpf_prog_put(old_prog);
 	}
 }
 
@@ -2274,7 +2252,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	/* Do we support "hardware" checksums? */
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_CSUM)) {
 		/* This opens up the world of extra features. */
-		dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_SG;
+		dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_SG |
+				    NETIF_F_XDP;
 		if (csum)
 			dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG;
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 6/8] mlx5: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (4 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 5/8] virt_net: " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 7/8] bnxt: " Tom Herbert
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 108 ++++++----------------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |  12 +--
 3 files changed, 36 insertions(+), 87 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 95ca03c..0255423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -381,7 +381,6 @@ struct mlx5e_rq {
 	u16                    rx_headroom;
 
 	struct mlx5e_rx_am     am; /* Adaptive Moderation */
-	struct bpf_prog       *xdp_prog;
 
 	/* control */
 	struct mlx5_wq_ctrl    wq_ctrl;
@@ -695,7 +694,7 @@ struct mlx5e_priv {
 	/* priv data path fields - start */
 	struct mlx5e_sq            **txq_to_sq_map;
 	int channeltc_to_txq_map[MLX5E_MAX_NUM_CHANNELS][MLX5E_MAX_NUM_TC];
-	struct bpf_prog *xdp_prog;
+	bool			   xdp_enabled;
 	/* priv data path fields - end */
 
 	unsigned long              state;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3cce628..07decbf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -36,6 +36,7 @@
 #include <linux/mlx5/fs.h>
 #include <net/vxlan.h>
 #include <linux/bpf.h>
+#include <net/xdp.h>
 #include "en.h"
 #include "en_tc.h"
 #include "eswitch.h"
@@ -113,7 +114,7 @@ static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 static void mlx5e_set_rq_priv_params(struct mlx5e_priv *priv)
 {
 	u8 rq_type = mlx5e_check_fragmented_striding_rq_cap(priv->mdev) &&
-		    !priv->xdp_prog ?
+		    !priv->xdp_enabled ?
 		    MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ :
 		    MLX5_WQ_TYPE_LINKED_LIST;
 	mlx5e_set_rq_type_params(priv, rq_type);
@@ -568,14 +569,12 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
 	rq->ix      = c->ix;
 	rq->priv    = c->priv;
 
-	rq->xdp_prog = priv->xdp_prog ? bpf_prog_inc(priv->xdp_prog) : NULL;
-	if (IS_ERR(rq->xdp_prog)) {
-		err = PTR_ERR(rq->xdp_prog);
-		rq->xdp_prog = NULL;
-		goto err_rq_wq_destroy;
-	}
-
-	if (rq->xdp_prog) {
+	if (priv->xdp_enabled) {
+		/* Note XDP is checked whether it is enabled for the device. If
+		 * XDP programs are set per ring as opposed to setting program
+		 * across the device this could be adjusted to account for
+		 * that.
+		 */
 		rq->buff.map_dir = DMA_BIDIRECTIONAL;
 		rq->rx_headroom = XDP_PACKET_HEADROOM;
 	} else {
@@ -662,8 +661,6 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
 	mlx5_core_destroy_mkey(mdev, &rq->umr_mkey);
 
 err_rq_wq_destroy:
-	if (rq->xdp_prog)
-		bpf_prog_put(rq->xdp_prog);
 	mlx5_wq_destroy(&rq->wq_ctrl);
 
 	return err;
@@ -673,9 +670,6 @@ static void mlx5e_destroy_rq(struct mlx5e_rq *rq)
 {
 	int i;
 
-	if (rq->xdp_prog)
-		bpf_prog_put(rq->xdp_prog);
-
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		mlx5e_rq_free_mpwqe_info(rq);
@@ -1547,7 +1541,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	c->netdev   = priv->netdev;
 	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.mkey.key);
 	c->num_tc   = priv->params.num_tc;
-	c->xdp      = !!priv->xdp_prog;
+	c->xdp      = priv->xdp_enabled;
 
 	if (priv->params.rx_am_enabled)
 		rx_cq_profile = mlx5e_am_get_def_profile(priv->params.rx_cq_period_mode);
@@ -3196,96 +3190,55 @@ static void mlx5e_tx_timeout(struct net_device *dev)
 		schedule_work(&priv->tx_timeout_work);
 }
 
-static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
+static int mlx5e_xdp_init(struct net_device *netdev, bool enable)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
-	struct bpf_prog *old_prog;
 	int err = 0;
-	bool reset, was_opened;
-	int i;
+	bool was_opened;
 
 	mutex_lock(&priv->state_lock);
 
-	if ((netdev->features & NETIF_F_LRO) && prog) {
+	if (priv->xdp_enabled == enable)
+		goto unlock;
+
+	if ((netdev->features & NETIF_F_LRO) && enable) {
 		netdev_warn(netdev, "can't set XDP while LRO is on, disable LRO first\n");
 		err = -EINVAL;
 		goto unlock;
 	}
 
+	/* Committing new XDP state */
+	priv->xdp_enabled = enable;
+
 	was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state);
-	/* no need for full reset when exchanging programs */
-	reset = (!priv->xdp_prog || !prog);
 
-	if (was_opened && reset)
+	if (was_opened)
 		mlx5e_close_locked(netdev);
-	if (was_opened && !reset) {
-		/* num_channels is invariant here, so we can take the
-		 * batched reference right upfront.
-		 */
-		prog = bpf_prog_add(prog, priv->params.num_channels);
-		if (IS_ERR(prog)) {
-			err = PTR_ERR(prog);
-			goto unlock;
-		}
-	}
-
-	/* exchange programs, extra prog reference we got from caller
-	 * as long as we don't fail from this point onwards.
-	 */
-	old_prog = xchg(&priv->xdp_prog, prog);
-	if (old_prog)
-		bpf_prog_put(old_prog);
 
-	if (reset) /* change RQ type according to priv->xdp_prog */
-		mlx5e_set_rq_priv_params(priv);
+	mlx5e_set_rq_priv_params(priv);
 
-	if (was_opened && reset)
+	if (was_opened)
 		mlx5e_open_locked(netdev);
 
-	if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset)
-		goto unlock;
-
-	/* exchanging programs w/o reset, we update ref counts on behalf
-	 * of the channels RQs here.
-	 */
-	for (i = 0; i < priv->params.num_channels; i++) {
-		struct mlx5e_channel *c = priv->channel[i];
-
-		clear_bit(MLX5E_RQ_STATE_ENABLED, &c->rq.state);
-		napi_synchronize(&c->napi);
-		/* prevent mlx5e_poll_rx_cq from accessing rq->xdp_prog */
-
-		old_prog = xchg(&c->rq.xdp_prog, prog);
-
-		set_bit(MLX5E_RQ_STATE_ENABLED, &c->rq.state);
-		/* napi_schedule in case we have missed anything */
-		set_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
-		napi_schedule(&c->napi);
-
-		if (old_prog)
-			bpf_prog_put(old_prog);
-	}
-
 unlock:
 	mutex_unlock(&priv->state_lock);
 	return err;
 }
 
-static bool mlx5e_xdp_attached(struct net_device *dev)
+static int mlx5_xdp_check_bpf(struct net_device *dev, struct bpf_prog *prog)
 {
-	struct mlx5e_priv *priv = netdev_priv(dev);
-
-	return !!priv->xdp_prog;
+	return 0;
 }
 
 static int mlx5e_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 {
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		return mlx5e_xdp_set(dev, xdp->prog);
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = mlx5e_xdp_attached(dev);
-		return 0;
+	case XDP_MODE_ON:
+		return mlx5e_xdp_init(dev, true);
+	case XDP_MODE_OFF:
+		return mlx5e_xdp_init(dev, false);
+	case XDP_CHECK_BPF_PROG:
+		return mlx5_xdp_check_bpf(dev, xdp->prog);
 	default:
 		return -EINVAL;
 	}
@@ -3706,9 +3659,6 @@ static void mlx5e_nic_init(struct mlx5_core_dev *mdev,
 static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 {
 	mlx5e_vxlan_cleanup(priv);
-
-	if (priv->xdp_prog)
-		bpf_prog_put(priv->xdp_prog);
 }
 
 static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b039b87..50ab4b9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -721,18 +721,18 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
 				   struct mlx5e_dma_info *di,
 				   void *va, u16 *rx_headroom, u32 *len)
 {
-	const struct bpf_prog *prog = READ_ONCE(rq->xdp_prog);
 	struct xdp_buff xdp;
+	struct xdp_hook *last_hook;
 	u32 act;
 
-	if (!prog)
+	if (!xdp_hook_run_needed_check(rq->netdev, rq->cq.napi))
 		return false;
 
 	xdp.data = va + *rx_headroom;
 	xdp.data_end = xdp.data + *len;
 	xdp.data_hard_start = va;
 
-	act = bpf_prog_run_xdp(prog, &xdp);
+	act = xdp_hook_run_ret_last(rq->cq.napi, &xdp, &last_hook);
 	switch (act) {
 	case XDP_PASS:
 		*rx_headroom = xdp.data - xdp.data_hard_start;
@@ -740,12 +740,12 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
 		return false;
 	case XDP_TX:
 		if (unlikely(!mlx5e_xmit_xdp_frame(rq, di, &xdp)))
-			trace_xdp_exception(rq->netdev, prog, act);
+			trace_xdp_hook_exception(rq->netdev, last_hook, act);
 		return true;
 	default:
-		bpf_warn_invalid_xdp_action(act);
+		xdp_warn_invalid_action(act);
 	case XDP_ABORTED:
-		trace_xdp_exception(rq->netdev, prog, act);
+		trace_xdp_hook_exception(rq->netdev, last_hook, act);
 	case XDP_DROP:
 		rq->stats.xdp_drop++;
 		mlx5e_page_release(rq, di, true);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 7/8] bnxt: Changes to use generic XDP infrastructure
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (5 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 6/8] mlx5: " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 19:34 ` [PATCH RFC v3 8/8] xdp: Cleanup after API changes Tom Herbert
  2017-02-21 23:09 ` [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Jakub Kicinski
  8 siblings, 0 replies; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

Change XDP program management functional interface to correspond to new
XDP API.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 14 --------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 46 +++++++++++++++------------
 3 files changed, 27 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6dacdf1..d8b9733 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2098,9 +2098,6 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
 		struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
 		struct bnxt_ring_struct *ring;
 
-		if (rxr->xdp_prog)
-			bpf_prog_put(rxr->xdp_prog);
-
 		kfree(rxr->rx_tpa);
 		rxr->rx_tpa = NULL;
 
@@ -2388,15 +2385,6 @@ static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr)
 	ring = &rxr->rx_ring_struct;
 	bnxt_init_rxbd_pages(ring, type);
 
-	if (BNXT_RX_PAGE_MODE(bp) && bp->xdp_prog) {
-		rxr->xdp_prog = bpf_prog_add(bp->xdp_prog, 1);
-		if (IS_ERR(rxr->xdp_prog)) {
-			int rc = PTR_ERR(rxr->xdp_prog);
-
-			rxr->xdp_prog = NULL;
-			return rc;
-		}
-	}
 	prod = rxr->rx_prod;
 	for (i = 0; i < bp->rx_ring_size; i++) {
 		if (bnxt_alloc_rx_data(bp, rxr, prod, GFP_KERNEL) != 0) {
@@ -7198,8 +7186,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	bnxt_dcb_free(bp);
 	kfree(bp->edev);
 	bp->edev = NULL;
-	if (bp->xdp_prog)
-		bpf_prog_put(bp->xdp_prog);
 	bnxt_cleanup_pci(bp);
 	free_netdev(dev);
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index faf26a2..61bb375 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1180,7 +1180,7 @@ struct bnxt {
 	u8			num_leds;
 	struct bnxt_led_info	leds[BNXT_MAX_LED];
 
-	struct bpf_prog		*xdp_prog;
+	bool			xdp_enabled;
 };
 
 #define BNXT_RX_STATS_OFFSET(counter)			\
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index 899c30f..3cfdc94 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -85,18 +85,18 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
 bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 		 struct page *page, u8 **data_ptr, unsigned int *len, u8 *event)
 {
-	struct bpf_prog *xdp_prog = READ_ONCE(rxr->xdp_prog);
 	struct bnxt_tx_ring_info *txr;
 	struct bnxt_sw_rx_bd *rx_buf;
 	struct pci_dev *pdev;
 	struct xdp_buff xdp;
+	struct xdp_hook *last_hook;
 	dma_addr_t mapping;
 	void *orig_data;
 	u32 tx_avail;
 	u32 offset;
 	u32 act;
 
-	if (!xdp_prog)
+	if (!xdp_hook_run_needed_check(bp->dev, &rxr->bnapi->napi))
 		return false;
 
 	pdev = bp->pdev;
@@ -113,7 +113,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 	dma_sync_single_for_cpu(&pdev->dev, mapping + offset, *len, bp->rx_dir);
 
 	rcu_read_lock();
-	act = bpf_prog_run_xdp(xdp_prog, &xdp);
+	act = xdp_hook_run_ret_last(&rxr->bnapi->napi, &xdp, &last_hook);
 	rcu_read_unlock();
 
 	tx_avail = bnxt_tx_avail(bp, txr);
@@ -134,7 +134,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 
 	case XDP_TX:
 		if (tx_avail < 2) {
-			trace_xdp_exception(bp->dev, xdp_prog, act);
+			trace_xdp_hook_exception(bp->dev, last_hook, act);
 			bnxt_reuse_rx_data(rxr, cons, page);
 			return true;
 		}
@@ -147,10 +147,10 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 		bnxt_reuse_rx_data(rxr, cons, page);
 		return true;
 	default:
-		bpf_warn_invalid_xdp_action(act);
+		xdp_warn_invalid_action(act);
 		/* Fall thru */
 	case XDP_ABORTED:
-		trace_xdp_exception(bp->dev, xdp_prog, act);
+		trace_xdp_hook_exception(bp->dev, last_hook, act);
 		/* Fall thru */
 	case XDP_DROP:
 		bnxt_reuse_rx_data(rxr, cons, page);
@@ -160,13 +160,15 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 }
 
 /* Under rtnl_lock */
-static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
+static int bnxt_xdp_init(struct bnxt *bp, bool enable)
 {
 	struct net_device *dev = bp->dev;
 	int tx_xdp = 0, rc, tc;
-	struct bpf_prog *old;
 
-	if (prog && bp->dev->mtu > BNXT_MAX_PAGE_MODE_MTU) {
+	if (bp->xdp_enabled == enable)
+		return 0;
+
+	if (enable && bp->dev->mtu > BNXT_MAX_PAGE_MODE_MTU) {
 		netdev_warn(dev, "MTU %d larger than largest XDP supported MTU %d.\n",
 			    bp->dev->mtu, BNXT_MAX_PAGE_MODE_MTU);
 		return -EOPNOTSUPP;
@@ -175,7 +177,7 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
 		netdev_warn(dev, "ethtool rx/tx channels must be combined to support XDP.\n");
 		return -EOPNOTSUPP;
 	}
-	if (prog)
+	if (enable)
 		tx_xdp = bp->rx_nr_rings;
 
 	tc = netdev_get_num_tc(dev);
@@ -190,11 +192,7 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
 	if (netif_running(dev))
 		bnxt_close_nic(bp, true, false);
 
-	old = xchg(&bp->xdp_prog, prog);
-	if (old)
-		bpf_prog_put(old);
-
-	if (prog) {
+	if (enable) {
 		bnxt_set_rx_skb_mode(bp, true);
 	} else {
 		int rx, tx;
@@ -210,6 +208,7 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
 	bp->tx_nr_rings = bp->tx_nr_rings_per_tc * tc + tx_xdp;
 	bp->cp_nr_rings = max_t(int, bp->tx_nr_rings, bp->rx_nr_rings);
 	bp->num_stat_ctxs = bp->cp_nr_rings;
+	bp->xdp_enabled = enable;
 	bnxt_set_tpa_flags(bp);
 	bnxt_set_ring_params(bp);
 
@@ -219,18 +218,25 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
 	return 0;
 }
 
+static int bnxt_xdp_check_bpf(struct net_device *dev, struct bpf_prog *prog)
+{
+	return 0;
+}
+
 int bnxt_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 {
 	struct bnxt *bp = netdev_priv(dev);
 	int rc;
 
 	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		rc = bnxt_xdp_set(bp, xdp->prog);
+	case XDP_MODE_ON:
+		rc = bnxt_xdp_init(bp, true);
+		break;
+	case XDP_MODE_OFF:
+		rc = bnxt_xdp_init(bp, false);
 		break;
-	case XDP_QUERY_PROG:
-		xdp->prog_attached = !!bp->xdp_prog;
-		rc = 0;
+	case XDP_CHECK_BPF_PROG:
+		rc = bnxt_xdp_check_bpf(dev, xdp->prog);
 		break;
 	default:
 		rc = -EINVAL;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC v3 8/8] xdp: Cleanup after API changes
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (6 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 7/8] bnxt: " Tom Herbert
@ 2017-02-21 19:34 ` Tom Herbert
  2017-02-21 21:06   ` David Miller
  2017-02-21 23:09 ` [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Jakub Kicinski
  8 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 19:34 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team

This patch:
  - Change trace_xdp_hook_exception to trace_xdp_exception
  - Remove XDP_SETUP_PROG and XDP_QUERY_PROG constants
  - Remove bpf_warn_invalid_xdp_action

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c      |  4 +--
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |  4 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  4 +--
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |  8 +++---
 drivers/net/ethernet/qlogic/qede/qede_fp.c         |  6 ++---
 drivers/net/virtio_net.c                           |  8 +++---
 include/linux/filter.h                             |  1 -
 include/linux/netdevice.h                          | 15 -----------
 include/trace/events/xdp.h                         | 29 ----------------------
 kernel/bpf/core.c                                  |  1 -
 net/core/filter.c                                  |  6 -----
 11 files changed, 16 insertions(+), 70 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index 3cfdc94..e894b67 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -134,7 +134,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 
 	case XDP_TX:
 		if (tx_avail < 2) {
-			trace_xdp_hook_exception(bp->dev, last_hook, act);
+			trace_xdp_exception(bp->dev, last_hook, act);
 			bnxt_reuse_rx_data(rxr, cons, page);
 			return true;
 		}
@@ -150,7 +150,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 		xdp_warn_invalid_action(act);
 		/* Fall thru */
 	case XDP_ABORTED:
-		trace_xdp_hook_exception(bp->dev, last_hook, act);
+		trace_xdp_exception(bp->dev, last_hook, act);
 		/* Fall thru */
 	case XDP_DROP:
 		bnxt_reuse_rx_data(rxr, cons, page);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index a8fddc0..d8648fe 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -927,12 +927,12 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 							length, cq->ring,
 							&doorbell_pending)))
 					goto consumed;
-				trace_xdp_hook_exception(dev, last_hook, act);
+				trace_xdp_exception(dev, last_hook, act);
 				goto xdp_drop_no_cnt; /* Drop on xmit failure */
 			default:
 				xdp_warn_invalid_action(act);
 			case XDP_ABORTED:
-				trace_xdp_hook_exception(dev, last_hook, act);
+				trace_xdp_exception(dev, last_hook, act);
 			case XDP_DROP:
 				ring->xdp_drop++;
 xdp_drop_no_cnt:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 50ab4b9..1be1eef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -740,12 +740,12 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
 		return false;
 	case XDP_TX:
 		if (unlikely(!mlx5e_xmit_xdp_frame(rq, di, &xdp)))
-			trace_xdp_hook_exception(rq->netdev, last_hook, act);
+			trace_xdp_exception(rq->netdev, last_hook, act);
 		return true;
 	default:
 		xdp_warn_invalid_action(act);
 	case XDP_ABORTED:
-		trace_xdp_hook_exception(rq->netdev, last_hook, act);
+		trace_xdp_exception(rq->netdev, last_hook, act);
 	case XDP_DROP:
 		rq->stats.xdp_drop++;
 		mlx5e_page_release(rq, di, true);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 6200035..82c4975 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1616,15 +1616,13 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 				if (unlikely(!nfp_net_tx_xdp_buf(nn, rx_ring,
 								 tx_ring, rxbuf,
 								 pkt_off, pkt_len)))
-					trace_xdp_hook_exception(nn->netdev,
-								 last_hook,
-								 act);
+					trace_xdp_exception(nn->netdev,
+							    last_hook, act);
 				continue;
 			default:
 				xdp_warn_invalid_action(act);
 			case XDP_ABORTED:
-				trace_xdp_hook_exception(nn->netdev, last_hook,
-							 act);
+				trace_xdp_exception(nn->netdev, last_hook, act);
 			case XDP_DROP:
 				nfp_net_rx_give_one(rx_ring, rxbuf->frag,
 						    rxbuf->dma_addr);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index 7423317..bd72718 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -1022,7 +1022,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 		/* We need the replacement buffer before transmit. */
 		if (qede_alloc_rx_buffer(rxq, true)) {
 			qede_recycle_rx_bd_ring(rxq, 1);
-			trace_xdp_hook_exception(edev->ndev, last_hook, act);
+			trace_xdp_exception(edev->ndev, last_hook, act);
 			goto out;
 		}
 
@@ -1033,7 +1033,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 			dma_unmap_page(rxq->dev, bd->mapping,
 				       PAGE_SIZE, DMA_BIDIRECTIONAL);
 			__free_page(bd->data);
-			trace_xdp_hook_exception(edev->ndev, last_hook, act);
+			trace_xdp_exception(edev->ndev, last_hook, act);
 		}
 
 		/* Regardless, we've consumed an Rx BD */
@@ -1043,7 +1043,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 	default:
 		xdp_warn_invalid_action(act);
 	case XDP_ABORTED:
-		trace_xdp_hook_exception(edev->ndev, last_hook, act);
+		trace_xdp_exception(edev->ndev, last_hook, act);
 	case XDP_DROP:
 		qede_recycle_rx_bd_ring(rxq, cqe->bd_num);
 	}
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1f5eddf..12d5102 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -427,13 +427,13 @@ static struct sk_buff *receive_small(struct net_device *dev,
 			break;
 		case XDP_TX:
 			if (unlikely(!virtnet_xdp_xmit(vi, rq, &xdp)))
-				trace_xdp_hook_exception(vi->dev, last_hook, act);
+				trace_xdp_exception(vi->dev, last_hook, act);
 			rcu_read_unlock();
 			goto xdp_xmit;
 		default:
 			xdp_warn_invalid_action(act);
 		case XDP_ABORTED:
-			trace_xdp_hook_exception(vi->dev, last_hook, act);
+			trace_xdp_exception(vi->dev, last_hook, act);
 		case XDP_DROP:
 			goto err_xdp;
 		}
@@ -619,7 +619,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			break;
 		case XDP_TX:
 			if (unlikely(!virtnet_xdp_xmit(vi, rq, &xdp)))
-				trace_xdp_hook_exception(vi->dev, last_hook, act);
+				trace_xdp_exception(vi->dev, last_hook, act);
 			ewma_pkt_len_add(&rq->mrg_avg_pkt_len, len);
 			if (unlikely(xdp_page != page))
 				goto err_xdp;
@@ -628,7 +628,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		default:
 			xdp_warn_invalid_action(act);
 		case XDP_ABORTED:
-			trace_xdp_hook_exception(vi->dev, last_hook, act);
+			trace_xdp_exception(vi->dev, last_hook, act);
 		case XDP_DROP:
 			if (unlikely(xdp_page != page))
 				__free_pages(xdp_page, 0);
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 53b737f..a9578e9 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -632,7 +632,6 @@ bool bpf_helper_changes_pkt_data(void *func);
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
-void bpf_warn_invalid_xdp_action(u32 act);
 
 #ifdef CONFIG_BPF_JIT
 extern int bpf_jit_enable;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 57ac7ea..4024088 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -811,18 +811,6 @@ struct tc_to_netdev {
  * to the netdevice through the xdp op.
  */
 enum xdp_netdev_command {
-	/* Set or clear a bpf program used in the earliest stages of packet
-	 * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee
-	 * is responsible for calling bpf_prog_put on any old progs that are
-	 * stored. In case of error, the callee need not release the new prog
-	 * reference, but on success it takes ownership and must bpf_prog_put
-	 * when it is no longer used.
-	 */
-	XDP_SETUP_PROG,
-	/* Check if a bpf program is set on the device.  The callee should
-	 * return true if a program is currently attached and running.
-	 */
-	XDP_QUERY_PROG,
 	/* Initialize device to use XDP. Called when first XDP program is
 	 * registered on a device (including on a NAPI instance).
 	 */
@@ -840,11 +828,8 @@ enum xdp_netdev_command {
 struct netdev_xdp {
 	enum xdp_netdev_command command;
 	union {
-		/* XDP_SETUP_PROG */
 		/* XDP_CHECK_BPF_PROG */
 		struct bpf_prog *prog;
-		/* XDP_QUERY_PROG */
-		bool prog_attached;
 	};
 };
 
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index 9ca6306..d225de6 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -26,35 +26,6 @@ __XDP_ACT_MAP(__XDP_ACT_TP_FN)
 TRACE_EVENT(xdp_exception,
 
 	TP_PROTO(const struct net_device *dev,
-		 const struct bpf_prog *xdp, u32 act),
-
-	TP_ARGS(dev, xdp, act),
-
-	TP_STRUCT__entry(
-		__string(name, dev->name)
-		__array(u8, prog_tag, 8)
-		__field(u32, act)
-	),
-
-	TP_fast_assign(
-		BUILD_BUG_ON(sizeof(__entry->prog_tag) != sizeof(xdp->tag));
-		memcpy(__entry->prog_tag, xdp->tag, sizeof(xdp->tag));
-		__assign_str(name, dev->name);
-		__entry->act = act;
-	),
-
-	TP_printk("prog=%s device=%s action=%s",
-		  __print_hex_str(__entry->prog_tag, 8),
-		  __get_str(name),
-		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB))
-);
-
-/* Temporary trace function. This will be renamed to xdp_exception after all
- * the calling drivers have been patched.
- */
-TRACE_EVENT(xdp_hook_exception,
-
-	TP_PROTO(const struct net_device *dev,
 		 const struct xdp_hook *hook, u32 act),
 
 	TP_ARGS(dev, hook, act),
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 04f2e30..f45827e2 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1412,7 +1412,6 @@ int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
 #include <linux/bpf_trace.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_exception);
-EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_hook_exception);
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(bpf_prog_get_type);
 EXPORT_TRACEPOINT_SYMBOL_GPL(bpf_prog_put_rcu);
diff --git a/net/core/filter.c b/net/core/filter.c
index 9a5de43..15337d4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2991,12 +2991,6 @@ static bool xdp_is_valid_access(int off, int size,
 	return __is_valid_xdp_access(off, size);
 }
 
-void bpf_warn_invalid_xdp_action(u32 act)
-{
-	WARN_ONCE(1, "Illegal XDP return value %u, expect packet loss\n", act);
-}
-EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);
-
 static u32 bpf_convert_ctx_access(enum bpf_access_type type,
 				  const struct bpf_insn *si,
 				  struct bpf_insn *insn_buf,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 8/8] xdp: Cleanup after API changes
  2017-02-21 19:34 ` [PATCH RFC v3 8/8] xdp: Cleanup after API changes Tom Herbert
@ 2017-02-21 21:06   ` David Miller
  2017-02-21 21:24     ` Tom Herbert
  0 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2017-02-21 21:06 UTC (permalink / raw)
  To: tom; +Cc: netdev, kernel-team

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 21 Feb 2017 11:34:17 -0800

>   - Change trace_xdp_hook_exception to trace_xdp_exception

Just give it this final name in patch #1 where you introduce it.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 8/8] xdp: Cleanup after API changes
  2017-02-21 21:06   ` David Miller
@ 2017-02-21 21:24     ` Tom Herbert
  2017-02-21 22:14       ` David Miller
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-21 21:24 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Kernel Network Developers, Kernel Team

On Tue, Feb 21, 2017 at 1:06 PM, David Miller <davem@davemloft.net> wrote:
> From: Tom Herbert <tom@herbertland.com>
> Date: Tue, 21 Feb 2017 11:34:17 -0800
>
>>   - Change trace_xdp_hook_exception to trace_xdp_exception
>
> Just give it this final name in patch #1 where you introduce it.

That will break compilation of the intervening patches. That is okay?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 8/8] xdp: Cleanup after API changes
  2017-02-21 21:24     ` Tom Herbert
@ 2017-02-21 22:14       ` David Miller
  0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2017-02-21 22:14 UTC (permalink / raw)
  To: tom; +Cc: netdev, kernel-team

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 21 Feb 2017 13:24:43 -0800

> On Tue, Feb 21, 2017 at 1:06 PM, David Miller <davem@davemloft.net> wrote:
>> From: Tom Herbert <tom@herbertland.com>
>> Date: Tue, 21 Feb 2017 11:34:17 -0800
>>
>>>   - Change trace_xdp_hook_exception to trace_xdp_exception
>>
>> Just give it this final name in patch #1 where you introduce it.
> 
> That will break compilation of the intervening patches. That is okay?

The trace hook doesn't even exist until patch #1, so yes if you name it
properly from the start any references you have to it within followon
patches need to be adjusted as well.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
                   ` (7 preceding siblings ...)
  2017-02-21 19:34 ` [PATCH RFC v3 8/8] xdp: Cleanup after API changes Tom Herbert
@ 2017-02-21 23:09 ` Jakub Kicinski
  2017-02-22  0:25   ` Tom Herbert
  8 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2017-02-21 23:09 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, kernel-team

Hi, 

I still have a few worries about this patch set...

On Tue, 21 Feb 2017 11:34:09 -0800, Tom Herbert wrote:
> This patch set generalizes XDP by making the hooks in drivers to be
> generic. This has a number of advantages:
> 
>   - Allows a means to pipeline XDP programs together

Does it add anything beyond what we already can achieve with tail
calls?  Is there expectations that users will install independent
black box/pre-compiled programs on the same interface?

>   - Reduces the amount of code and complexity needed in drivers to
>     manage XDP

Other than for Mellanox drivers the savings are in *teen lines of code?

>   - Provides a more structured environment that is extensible to new
>     features while being mostly transparent to the drivers

So far all features we added required explicit driver support.
Checksumming support as an example will require driver changes, too.
Generalized way to call programs is probably not going to buy us much?

>   - Allow XDP programs to be set per device or per queue

Some drivers already provide that even though we don't have a user API
to set it.

>   - Moves management of BPF programs out of driver into a common
>     infrastructure

I appreciate this one, but IMHO this generalized infrastructure is way
too complicated for the purpose.  Why not simply place an XDP program
pointer into netdev and napi and provide driver with a helper to install
the program there?  Or am I simply not understanding the benefits of
the generalized hooks?

I probably have a slightly unique perspective caring about offloads but
IMHO the cost of this patchset outweighs the benefits right now.  You
haven't actually implement the OFFLOAD command at all (grep for
XDP_OFFLOAD_BPF) but having to deal with the intermediary layer will
certainly complicate things here.  

There are also miscellaneous optimizations which are possible when
drivers can inspect the program, like only reserving headroom if
program may use header adjust...  it does seem to me that this set
introduces a lot of complexity and only superficial driver
simplifications.  I would personally rather wait with infrastructure
like this until John's TX-to-other netdev is well formulated.

Obviously this set will also be a major pain when backporting things,
too.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 3/8] nfp: Changes to use generic XDP infrastructure
  2017-02-21 19:34 ` [PATCH RFC v3 3/8] nfp: " Tom Herbert
@ 2017-02-21 23:10   ` Jakub Kicinski
  0 siblings, 0 replies; 22+ messages in thread
From: Jakub Kicinski @ 2017-02-21 23:10 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, kernel-team

On Tue, 21 Feb 2017 11:34:12 -0800, Tom Herbert wrote:
> Change XDP program management functional interface to correspond to new
> XDP API.
> 
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
>  drivers/net/ethernet/netronome/nfp/nfp_net.h       |   5 +-
>  .../net/ethernet/netronome/nfp/nfp_net_common.c    | 172 ++++++++++-----------
>  .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  12 +-
>  3 files changed, 87 insertions(+), 102 deletions(-)
> 
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> index e614a37..732e40b 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
> @@ -503,14 +503,13 @@ struct nfp_net {
>  	unsigned is_vf:1;
>  	unsigned bpf_offload_skip_sw:1;
>  	unsigned bpf_offload_xdp:1;
> +	unsigned xdp_enabled:1;
>  
>  	u32 ctrl;
>  	u32 fl_bufsz;
>  
>  	u32 rx_offset;
>  
> -	struct bpf_prog *xdp_prog;
> -
>  	struct nfp_net_tx_ring *tx_rings;
>  	struct nfp_net_rx_ring *rx_rings;
>  
> @@ -788,7 +787,7 @@ void
>  nfp_net_irqs_assign(struct nfp_net *nn, struct msix_entry *irq_entries,
>  		    unsigned int n);
>  int
> -nfp_net_ring_reconfig(struct nfp_net *nn, struct bpf_prog **xdp_prog,
> +nfp_net_ring_reconfig(struct nfp_net *nn,
>  		      struct nfp_net_ring_set *rx, struct nfp_net_ring_set *tx);
>  
>  #ifdef CONFIG_NFP_DEBUG
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> index 074259c..6200035 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> @@ -65,6 +65,7 @@
>  
>  #include <net/pkt_cls.h>
>  #include <net/vxlan.h>
> +#include <net/xdp.h>
>  
>  #include "nfp_net_ctrl.h"
>  #include "nfp_net.h"
> @@ -1169,10 +1170,10 @@ nfp_net_napi_alloc_one(struct nfp_net *nn, int direction, dma_addr_t *dma_addr)
>  {
>  	void *frag;
>  
> -	if (!nn->xdp_prog)
> -		frag = napi_alloc_frag(nn->fl_bufsz);
> -	else
> +	if (nn->xdp_enabled)
>  		frag = page_address(alloc_page(GFP_ATOMIC | __GFP_COLD));
> +	else
> +		frag = napi_alloc_frag(nn->fl_bufsz);
>  	if (!frag) {
>  		nn_warn_ratelimit(nn, "Failed to alloc receive page frag\n");
>  		return NULL;
> @@ -1180,7 +1181,7 @@ nfp_net_napi_alloc_one(struct nfp_net *nn, int direction, dma_addr_t *dma_addr)
>  
>  	*dma_addr = nfp_net_dma_map_rx(nn, frag, nn->fl_bufsz, direction);
>  	if (dma_mapping_error(&nn->pdev->dev, *dma_addr)) {
> -		nfp_net_free_frag(frag, nn->xdp_prog);
> +		nfp_net_free_frag(frag, nn->xdp_enabled);
>  		nn_warn_ratelimit(nn, "Failed to map DMA RX buffer\n");
>  		return NULL;
>  	}
> @@ -1251,17 +1252,15 @@ static void nfp_net_rx_ring_reset(struct nfp_net_rx_ring *rx_ring)
>   * nfp_net_rx_ring_bufs_free() - Free any buffers currently on the RX ring
>   * @nn:		NFP Net device
>   * @rx_ring:	RX ring to remove buffers from
> - * @xdp:	Whether XDP is enabled
>   *
>   * Assumes that the device is stopped and buffers are in [0, ring->cnt - 1)
>   * entries.  After device is disabled nfp_net_rx_ring_reset() must be called
>   * to restore required ring geometry.
>   */
>  static void
> -nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring,
> -			  bool xdp)
> +nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)

You can't do that.  NFP has runtime reconfiguration mechanism which
allocates new rings before it stops the device.  All those methods need
to take XDP program or xdp bool as parameter.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-21 23:09 ` [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Jakub Kicinski
@ 2017-02-22  0:25   ` Tom Herbert
  2017-02-22  1:29     ` Jakub Kicinski
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-22  0:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Tue, Feb 21, 2017 at 3:09 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> Hi,
>
> I still have a few worries about this patch set...
>
> On Tue, 21 Feb 2017 11:34:09 -0800, Tom Herbert wrote:
>> This patch set generalizes XDP by making the hooks in drivers to be
>> generic. This has a number of advantages:
>>
>>   - Allows a means to pipeline XDP programs together
>
> Does it add anything beyond what we already can achieve with tail
> calls?  Is there expectations that users will install independent
> black box/pre-compiled programs on the same interface?
>
>>   - Reduces the amount of code and complexity needed in drivers to
>>     manage XDP
>
> Other than for Mellanox drivers the savings are in *teen lines of code?
>
>>   - Provides a more structured environment that is extensible to new
>>     features while being mostly transparent to the drivers
>
> So far all features we added required explicit driver support.
> Checksumming support as an example will require driver changes, too.
> Generalized way to call programs is probably not going to buy us much?
>
Hi Jakub,

What is the the concern with checksumming? Isn't that just an issue of
defining fields in xdp_data and driver populating with the appropriate
information?

Tom

>>   - Allow XDP programs to be set per device or per queue
>
> Some drivers already provide that even though we don't have a user API
> to set it.
>
>>   - Moves management of BPF programs out of driver into a common
>>     infrastructure
>
> I appreciate this one, but IMHO this generalized infrastructure is way
> too complicated for the purpose.  Why not simply place an XDP program
> pointer into netdev and napi and provide driver with a helper to install
> the program there?  Or am I simply not understanding the benefits of
> the generalized hooks?
>
> I probably have a slightly unique perspective caring about offloads but
> IMHO the cost of this patchset outweighs the benefits right now.  You
> haven't actually implement the OFFLOAD command at all (grep for
> XDP_OFFLOAD_BPF) but having to deal with the intermediary layer will
> certainly complicate things here.
>
> There are also miscellaneous optimizations which are possible when
> drivers can inspect the program, like only reserving headroom if
> program may use header adjust...  it does seem to me that this set
> introduces a lot of complexity and only superficial driver
> simplifications.  I would personally rather wait with infrastructure
> like this until John's TX-to-other netdev is well formulated.
>
> Obviously this set will also be a major pain when backporting things,
> too.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  0:25   ` Tom Herbert
@ 2017-02-22  1:29     ` Jakub Kicinski
  2017-02-22  2:04       ` Tom Herbert
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2017-02-22  1:29 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Tue, 21 Feb 2017 16:25:15 -0800, Tom Herbert wrote:
> >>   - Provides a more structured environment that is extensible to new
> >>     features while being mostly transparent to the drivers  
> >
> > So far all features we added required explicit driver support.
> > Checksumming support as an example will require driver changes, too.
> > Generalized way to call programs is probably not going to buy us much?
> >  
> Hi Jakub,
> 
> What is the the concern with checksumming? Isn't that just an issue of
> defining fields in xdp_data and driver populating with the appropriate
> information?

Thanks for replying!  I was just using checksumming as an example.  I
was trying to counter the argument that the hook infrastructure will
allow for features to be implemented mostly transparently to the
drivers.  I may be wrong but it seems that it doesn't buy us anything
as far as the features which were so far discussed go.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  1:29     ` Jakub Kicinski
@ 2017-02-22  2:04       ` Tom Herbert
  2017-02-22  2:23         ` Jakub Kicinski
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-22  2:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Tue, Feb 21, 2017 at 5:29 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 21 Feb 2017 16:25:15 -0800, Tom Herbert wrote:
>> >>   - Provides a more structured environment that is extensible to new
>> >>     features while being mostly transparent to the drivers
>> >
>> > So far all features we added required explicit driver support.
>> > Checksumming support as an example will require driver changes, too.
>> > Generalized way to call programs is probably not going to buy us much?
>> >
>> Hi Jakub,
>>
>> What is the the concern with checksumming? Isn't that just an issue of
>> defining fields in xdp_data and driver populating with the appropriate
>> information?
>
> Thanks for replying!  I was just using checksumming as an example.  I
> was trying to counter the argument that the hook infrastructure will
> allow for features to be implemented mostly transparently to theom
> drivers.  I may be wrong but it seems that it doesn't buy us anything
> as far as the features which were so far discussed go.

But I still don't understand the example. In the case of checksumming
the driver just needs to populate the values in xdp_data regarding
csum. We could basically reuse the same interface as that in skbuff.
The stack then can handle the logic of verifying checksums (verifying
a csum-complete value for instance). The driver only passes
information to the stack concerning a packet, it doesn't need to act
on the information. For instance, it's not the driver's prerogative to
verify the checksum itself. In general the less drivers have to do and
the more we push complexity of common functionality into the stack,
the better.

Tom

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  2:04       ` Tom Herbert
@ 2017-02-22  2:23         ` Jakub Kicinski
  2017-02-22  2:54           ` Tom Herbert
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2017-02-22  2:23 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Tue, 21 Feb 2017 18:04:49 -0800, Tom Herbert wrote:
> On Tue, Feb 21, 2017 at 5:29 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> > On Tue, 21 Feb 2017 16:25:15 -0800, Tom Herbert wrote:  
> >> >>   - Provides a more structured environment that is extensible to new
> >> >>     features while being mostly transparent to the drivers  
> >> >
> >> > So far all features we added required explicit driver support.
> >> > Checksumming support as an example will require driver changes, too.
> >> > Generalized way to call programs is probably not going to buy us much?
> >> >  
> >> Hi Jakub,
> >>
> >> What is the the concern with checksumming? Isn't that just an issue of
> >> defining fields in xdp_data and driver populating with the appropriate
> >> information?  
> >
> > Thanks for replying!  I was just using checksumming as an example.  I
> > was trying to counter the argument that the hook infrastructure will
> > allow for features to be implemented mostly transparently to theom
> > drivers.  I may be wrong but it seems that it doesn't buy us anything
> > as far as the features which were so far discussed go.  
> 
> But I still don't understand the example. In the case of checksumming
> the driver just needs to populate the values in xdp_data regarding
> csum. We could basically reuse the same interface as that in skbuff.
> The stack then can handle the logic of verifying checksums (verifying
> a csum-complete value for instance). The driver only passes
> information to the stack concerning a packet, it doesn't need to act
> on the information. For instance, it's not the driver's prerogative to
> verify the checksum itself. In general the less drivers have to do and
> the more we push complexity of common functionality into the stack,
> the better.

Sorry, I thought your patchset abstracts away populating struct xdp. 
So my initial statement should have read "Checksumming will require
driver changes, only".  Yes, drivers will just have to populate the
appropriate XDP buffer field.  What are the XDP features you foresee
this hook infrastructure to help drivers with?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  2:23         ` Jakub Kicinski
@ 2017-02-22  2:54           ` Tom Herbert
  2017-02-22  3:34             ` David Miller
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-22  2:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Tue, Feb 21, 2017 at 6:23 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 21 Feb 2017 18:04:49 -0800, Tom Herbert wrote:
>> On Tue, Feb 21, 2017 at 5:29 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>> > On Tue, 21 Feb 2017 16:25:15 -0800, Tom Herbert wrote:
>> >> >>   - Provides a more structured environment that is extensible to new
>> >> >>     features while being mostly transparent to the drivers
>> >> >
>> >> > So far all features we added required explicit driver support.
>> >> > Checksumming support as an example will require driver changes, too.
>> >> > Generalized way to call programs is probably not going to buy us much?
>> >> >
>> >> Hi Jakub,
>> >>
>> >> What is the the concern with checksumming? Isn't that just an issue of
>> >> defining fields in xdp_data and driver populating with the appropriate
>> >> information?
>> >
>> > Thanks for replying!  I was just using checksumming as an example.  I
>> > was trying to counter the argument that the hook infrastructure will
>> > allow for features to be implemented mostly transparently to theom
>> > drivers.  I may be wrong but it seems that it doesn't buy us anything
>> > as far as the features which were so far discussed go.
>>
>> But I still don't understand the example. In the case of checksumming
>> the driver just needs to populate the values in xdp_data regarding
>> csum. We could basically reuse the same interface as that in skbuff.
>> The stack then can handle the logic of verifying checksums (verifying
>> a csum-complete value for instance). The driver only passes
>> information to the stack concerning a packet, it doesn't need to act
>> on the information. For instance, it's not the driver's prerogative to
>> verify the checksum itself. In general the less drivers have to do and
>> the more we push complexity of common functionality into the stack,
>> the better.
>
> Sorry, I thought your patchset abstracts away populating struct xdp.
> So my initial statement should have read "Checksumming will require
> driver changes, only".  Yes, drivers will just have to populate the
> appropriate XDP buffer field.  What are the XDP features you foresee
> this hook infrastructure to help drivers with?

It is part of the direction to take XDP beyond the first use case of
BPF and leverage the high performance processing model in a much
broader context. For instance, as Saeed alluded to we should be a able
to create an skb-less interface into the stack-- this is essentially
what we need for TXDP (expedited processing for the transport layer).
Also, I think that we should also be able to extrapolate a good RX
batch interface from an API like this (please see discussion on that).

Thanks,
Tom

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  2:54           ` Tom Herbert
@ 2017-02-22  3:34             ` David Miller
  2017-02-22  4:02               ` Tom Herbert
  0 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2017-02-22  3:34 UTC (permalink / raw)
  To: tom; +Cc: kubakici, netdev, kernel-team

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 21 Feb 2017 18:54:53 -0800

> It is part of the direction to take XDP beyond the first use case of
> BPF and leverage the high performance processing model in a much
> broader context.

And I've stated repeatedly that it's too early to be looking
that far into the future.

Look Tom, if all you want to do is create infrastructure so that
you can very easily slither XDP for modules in somewhere, please
just stop now.  Right now I am completely not interested in even
entertaining patches which facilitate that.

If instead, you are genuinely interested in making the process of
writing XDP support for drivers easier, that extremely useful right
now so please just stick to that specific focus.

I really see no other use case for the XDP hook abstraction other
than to support XDP module support, which I've said is in no way
proven to be necessary.

We don't even know what eBPF XDP itself is fully capable of yet,
so please stop reaching like this.

Thank you.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  3:34             ` David Miller
@ 2017-02-22  4:02               ` Tom Herbert
  2017-02-22  4:31                 ` David Miller
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Herbert @ 2017-02-22  4:02 UTC (permalink / raw)
  To: David Miller; +Cc: Jakub Kicinski, Linux Kernel Network Developers, Kernel Team

On Tue, Feb 21, 2017 at 7:34 PM, David Miller <davem@davemloft.net> wrote:
> From: Tom Herbert <tom@herbertland.com>
> Date: Tue, 21 Feb 2017 18:54:53 -0800
>
>> It is part of the direction to take XDP beyond the first use case of
>> BPF and leverage the high performance processing model in a much
>> broader context.
>
> And I've stated repeatedly that it's too early to be looking
> that far into the future.
>
> Look Tom, if all you want to do is create infrastructure so that
> you can very easily slither XDP for modules in somewhere, please
> just stop now.  Right now I am completely not interested in even
> entertaining patches which facilitate that.
>
> If instead, you are genuinely interested in making the process of
> writing XDP support for drivers easier, that extremely useful right
> now so please just stick to that specific focus.
>
> I really see no other use case for the XDP hook abstraction other
> than to support XDP module support, which I've said is in no way
> proven to be necessary.
>
> We don't even know what eBPF XDP itself is fully capable of yet,
> so please stop reaching like this.
>

I took out the parts about allowing non-BPF hooks in the patchset as
you requested.I believe code that is cleaner than what is currently
there, and the fact that the API is extensible to allow other uses is
a hallmark of good design.

Tom

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP
  2017-02-22  4:02               ` Tom Herbert
@ 2017-02-22  4:31                 ` David Miller
  0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2017-02-22  4:31 UTC (permalink / raw)
  To: tom; +Cc: kubakici, netdev, kernel-team

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 21 Feb 2017 20:02:07 -0800

> I took out the parts about allowing non-BPF hooks in the patchset as
> you requested.I believe code that is cleaner than what is currently
> there, and the fact that the API is extensible to allow other uses
> is a hallmark of good design.

For BPF only cases, there is no need for a chain of callbacks.  BPF
has tailcalls, which can be used to solve whatever problem chained
XDP stuff would be used for.

The only other possible use of chained XDP is non-BPF uses, thus my
and others's fundamental objection.

It's not a cleanup or a simplification for driver writers, you're
adding a new feature, and one whose need is not at all clear.

I also object to chaining on another level.  The whole idea of XDP
programs are that they exist in a very _precise_ and specific context
and do one thing.  If there are chains, there are dependencies, and
thus the context is variable.  We do not want this, trust me.

Is it such a huge deal to defer this chaining thing to after your
other simplifications?  All of these things you are trying to add are
blocking the less controversial and useful parts of your work.

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-02-22  4:31 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-21 19:34 [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 1/8] " Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 2/8] mlx4: Changes to use generic XDP infrastructure Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 3/8] nfp: " Tom Herbert
2017-02-21 23:10   ` Jakub Kicinski
2017-02-21 19:34 ` [PATCH RFC v3 4/8] qede: " Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 5/8] virt_net: " Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 6/8] mlx5: " Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 7/8] bnxt: " Tom Herbert
2017-02-21 19:34 ` [PATCH RFC v3 8/8] xdp: Cleanup after API changes Tom Herbert
2017-02-21 21:06   ` David Miller
2017-02-21 21:24     ` Tom Herbert
2017-02-21 22:14       ` David Miller
2017-02-21 23:09 ` [PATCH RFC v3 0/8] xdp: Infrastructure to generalize XDP Jakub Kicinski
2017-02-22  0:25   ` Tom Herbert
2017-02-22  1:29     ` Jakub Kicinski
2017-02-22  2:04       ` Tom Herbert
2017-02-22  2:23         ` Jakub Kicinski
2017-02-22  2:54           ` Tom Herbert
2017-02-22  3:34             ` David Miller
2017-02-22  4:02               ` Tom Herbert
2017-02-22  4:31                 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.