All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/8] xdp: make stack perform remove and tests
@ 2017-12-01  1:35 Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 1/8] net: xdp: avoid output parameters when querying XDP prog Jakub Kicinski
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

Hi!

The purpose of this series is to add a software model of BPF offloads
to make it easier for everyone to test them and make some of the more
arcane rules and assumptions more clear.

The series starts with 3 patches aiming to make XDP handling in the
drivers less error prone.  Currently driver authors have to remember
to free XDP programs if XDP is active during unregister.  With this
series the core will disable XDP on its own.  It will take place
after close, drivers are not expected to perform reconfiguration
when disabling XDP on a downed device.

Next two patches add the software netdev driver, followed by a python 
test which exercises all the corner cases which came to my mind.

Test needs to be run as root.  It will print basic information to
stdout, but can also create a more detailed log of all commands
when --log option is passed.  Log is in Emacs Org-mode format.

  ./tools/testing/selftests/bpf/test_offload.py --log /tmp/log

Last two patches replace the SR-IOV API implementation of dummy.

v2:
 - free device from the release function;
 - use bus-based name generatin instead of netdev name.
v1:
 - replace the SR-IOV API implementation of dummy;
 - make the dev_xdp_uninstall() also handle the XDP generic (Daniel).


Jakub Kicinski (8):
  net: xdp: avoid output parameters when querying XDP prog
  net: xdp: report flags program was installed with on query
  net: xdp: make the stack take care of the tear down
  netdevsim: add software driver for testing offloads
  netdevsim: add bpf offload support
  selftests/bpf: add offload test based on netdevsim
  netdevsim: add SR-IOV functionality
  net: dummy: remove fake SR-IOV functionality

 MAINTAINERS                                        |   5 +
 drivers/net/Kconfig                                |  11 +
 drivers/net/Makefile                               |   1 +
 drivers/net/dummy.c                                | 215 +------
 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   2 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   3 -
 drivers/net/ethernet/netronome/nfp/bpf/main.c      |   7 -
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |   4 +-
 drivers/net/ethernet/qlogic/qede/qede_main.c       |   4 -
 drivers/net/netdevsim/Makefile                     |   7 +
 drivers/net/netdevsim/bpf.c                        | 373 +++++++++++
 drivers/net/netdevsim/netdev.c                     | 501 +++++++++++++++
 drivers/net/netdevsim/netdevsim.h                  |  78 +++
 drivers/net/tun.c                                  |   4 -
 include/linux/netdevice.h                          |   5 +-
 net/core/dev.c                                     |  53 +-
 net/core/rtnetlink.c                               |   6 +-
 tools/testing/selftests/bpf/Makefile               |   5 +-
 tools/testing/selftests/bpf/sample_ret0.c          |   7 +
 tools/testing/selftests/bpf/test_offload.py        | 681 +++++++++++++++++++++
 20 files changed, 1714 insertions(+), 258 deletions(-)
 create mode 100644 drivers/net/netdevsim/Makefile
 create mode 100644 drivers/net/netdevsim/bpf.c
 create mode 100644 drivers/net/netdevsim/netdev.c
 create mode 100644 drivers/net/netdevsim/netdevsim.h
 create mode 100644 tools/testing/selftests/bpf/sample_ret0.c
 create mode 100755 tools/testing/selftests/bpf/test_offload.py

-- 
2.14.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 1/8] net: xdp: avoid output parameters when querying XDP prog
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 2/8] net: xdp: report flags program was installed with on query Jakub Kicinski
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

The output parameters will get unwieldy if we want to add more
information about the program.  Simply pass the entire
struct netdev_bpf in.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 include/linux/netdevice.h |  3 ++-
 net/core/dev.c            | 24 ++++++++++++++----------
 net/core/rtnetlink.c      |  6 +++++-
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ef789e1d679e..667bdd3ad33e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3330,7 +3330,8 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		      int fd, u32 flags);
-u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t xdp_op, u32 *prog_id);
+void __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op,
+		     struct netdev_bpf *xdp);
 
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
 int dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 07ed21d64f92..3f271c9cb5e0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7073,17 +7073,21 @@ int dev_change_proto_down(struct net_device *dev, bool proto_down)
 }
 EXPORT_SYMBOL(dev_change_proto_down);
 
-u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t bpf_op, u32 *prog_id)
+void __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op,
+		     struct netdev_bpf *xdp)
 {
-	struct netdev_bpf xdp;
-
-	memset(&xdp, 0, sizeof(xdp));
-	xdp.command = XDP_QUERY_PROG;
+	memset(xdp, 0, sizeof(*xdp));
+	xdp->command = XDP_QUERY_PROG;
 
 	/* Query must always succeed. */
-	WARN_ON(bpf_op(dev, &xdp) < 0);
-	if (prog_id)
-		*prog_id = xdp.prog_id;
+	WARN_ON(bpf_op(dev, xdp) < 0);
+}
+
+static u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t bpf_op)
+{
+	struct netdev_bpf xdp;
+
+	__dev_xdp_query(dev, bpf_op, &xdp);
 
 	return xdp.prog_attached;
 }
@@ -7134,10 +7138,10 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		bpf_chk = generic_xdp_install;
 
 	if (fd >= 0) {
-		if (bpf_chk && __dev_xdp_attached(dev, bpf_chk, NULL))
+		if (bpf_chk && __dev_xdp_attached(dev, bpf_chk))
 			return -EEXIST;
 		if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) &&
-		    __dev_xdp_attached(dev, bpf_op, NULL))
+		    __dev_xdp_attached(dev, bpf_op))
 			return -EBUSY;
 
 		prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a91fc8..9c4cb584bfb0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1261,6 +1261,7 @@ static u8 rtnl_xdp_attached_mode(struct net_device *dev, u32 *prog_id)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 	const struct bpf_prog *generic_xdp_prog;
+	struct netdev_bpf xdp;
 
 	ASSERT_RTNL();
 
@@ -1273,7 +1274,10 @@ static u8 rtnl_xdp_attached_mode(struct net_device *dev, u32 *prog_id)
 	if (!ops->ndo_bpf)
 		return XDP_ATTACHED_NONE;
 
-	return __dev_xdp_attached(dev, ops->ndo_bpf, prog_id);
+	__dev_xdp_query(dev, ops->ndo_bpf, &xdp);
+	*prog_id = xdp.prog_id;
+
+	return xdp.prog_attached;
 }
 
 static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 2/8] net: xdp: report flags program was installed with on query
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 1/8] net: xdp: avoid output parameters when querying XDP prog Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 3/8] net: xdp: make the stack take care of the tear down Jakub Kicinski
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

Some drivers enforce that flags on program replacement and
removal must match the flags passed on install.  This leaves
the possibility open to enable simultaneous loading
of XDP programs both to HW and DRV.

Allow such drivers to report the flags back to the stack.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 1 +
 include/linux/netdevice.h                           | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 1a603fdd9e80..ea6bbf1efefc 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -3392,6 +3392,7 @@ static int nfp_net_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
 		if (nn->dp.bpf_offload_xdp)
 			xdp->prog_attached = XDP_ATTACHED_HW;
 		xdp->prog_id = nn->xdp_prog ? nn->xdp_prog->aux->id : 0;
+		xdp->flags = nn->xdp_prog ? nn->xdp_flags : 0;
 		return 0;
 	case BPF_OFFLOAD_VERIFIER_PREP:
 		return nfp_app_bpf_verifier_prep(nn->app, nn, xdp);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 667bdd3ad33e..cc4ce7456e38 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -820,6 +820,8 @@ struct netdev_bpf {
 		struct {
 			u8 prog_attached;
 			u32 prog_id;
+			/* flags with which program was installed */
+			u32 prog_flags;
 		};
 		/* BPF_OFFLOAD_VERIFIER_PREP */
 		struct {
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 3/8] net: xdp: make the stack take care of the tear down
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 1/8] net: xdp: avoid output parameters when querying XDP prog Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 2/8] net: xdp: report flags program was installed with on query Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 4/8] netdevsim: add software driver for testing offloads Jakub Kicinski
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Jakub Kicinski, Saeed Mahameed, Michael Chan, Ariel Elior,
	John Fastabend

Since day one of XDP drivers had to remember to free the program
on the remove path.  This leads to code duplication and is error
prone.  Make the stack query the installed programs on unregister
and if something is installed, remove the program.  Freeing of
program attached to XDP generic is moved from free_netdev() as well.

Because the remove will now be called before notifiers are
invoked, BPF offload state of the program will not get destroyed
before uninstall.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Michael Chan <michael.chan@broadcom.com>
CC: Ariel Elior <Ariel.Elior@cavium.com>
CC: John Fastabend <john.fastabend@gmail.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |  2 --
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  3 ---
 drivers/net/ethernet/netronome/nfp/bpf/main.c      |  7 ------
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |  3 ---
 drivers/net/ethernet/qlogic/qede/qede_main.c       |  4 ---
 drivers/net/tun.c                                  |  4 ---
 net/core/dev.c                                     | 29 ++++++++++++++++------
 7 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 33c49ad697e4..413ad2444ba2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7800,8 +7800,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	bnxt_dcb_free(bp);
 	kfree(bp->edev);
 	bp->edev = NULL;
-	if (bp->xdp_prog)
-		bpf_prog_put(bp->xdp_prog);
 	bnxt_cleanup_pci(bp);
 	free_netdev(dev);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d2b057a3e512..0f5c012de52e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4308,9 +4308,6 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 {
 	mlx5e_ipsec_cleanup(priv);
 	mlx5e_vxlan_cleanup(priv);
-
-	if (priv->channels.params.xdp_prog)
-		bpf_prog_put(priv->channels.params.xdp_prog);
 }
 
 static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.c b/drivers/net/ethernet/netronome/nfp/bpf/main.c
index e379b78e86ef..54bfd7846f6d 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.c
@@ -82,12 +82,6 @@ static const char *nfp_bpf_extra_cap(struct nfp_app *app, struct nfp_net *nn)
 	return nfp_net_ebpf_capable(nn) ? "BPF" : "";
 }
 
-static void nfp_bpf_vnic_free(struct nfp_app *app, struct nfp_net *nn)
-{
-	if (nn->dp.bpf_offload_xdp)
-		nfp_bpf_xdp_offload(app, nn, NULL);
-}
-
 static int nfp_bpf_setup_tc_block_cb(enum tc_setup_type type,
 				     void *type_data, void *cb_priv)
 {
@@ -168,7 +162,6 @@ const struct nfp_app_type app_bpf = {
 	.extra_cap	= nfp_bpf_extra_cap,
 
 	.vnic_alloc	= nfp_app_nic_vnic_alloc,
-	.vnic_free	= nfp_bpf_vnic_free,
 
 	.setup_tc	= nfp_bpf_setup_tc,
 	.tc_busy	= nfp_bpf_tc_busy,
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index ea6bbf1efefc..ad3e9f6a61e5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -3562,9 +3562,6 @@ struct nfp_net *nfp_net_alloc(struct pci_dev *pdev, bool needs_netdev,
  */
 void nfp_net_free(struct nfp_net *nn)
 {
-	if (nn->xdp_prog)
-		bpf_prog_put(nn->xdp_prog);
-
 	if (nn->dp.netdev)
 		free_netdev(nn->dp.netdev);
 	else
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 8f9b3eb82137..57332b3e5e64 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1068,10 +1068,6 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
 
 	pci_set_drvdata(pdev, NULL);
 
-	/* Release edev's reference to XDP's bpf if such exist */
-	if (edev->xdp_prog)
-		bpf_prog_put(edev->xdp_prog);
-
 	/* Use global ops since we've freed edev */
 	qed_ops->common->slowpath_stop(cdev);
 	if (system_state == SYSTEM_POWER_OFF)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 6a7bde9bc4b2..6f7e8e45c961 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -673,7 +673,6 @@ static void tun_detach(struct tun_file *tfile, bool clean)
 static void tun_detach_all(struct net_device *dev)
 {
 	struct tun_struct *tun = netdev_priv(dev);
-	struct bpf_prog *xdp_prog = rtnl_dereference(tun->xdp_prog);
 	struct tun_file *tfile, *tmp;
 	int i, n = tun->numqueues;
 
@@ -708,9 +707,6 @@ static void tun_detach_all(struct net_device *dev)
 	}
 	BUG_ON(tun->numdisabled != 0);
 
-	if (xdp_prog)
-		bpf_prog_put(xdp_prog);
-
 	if (tun->flags & IFF_PERSIST)
 		module_put(THIS_MODULE);
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 3f271c9cb5e0..6bea8931bb62 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7110,6 +7110,27 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
 	return bpf_op(dev, &xdp);
 }
 
+static void dev_xdp_uninstall(struct net_device *dev)
+{
+	struct netdev_bpf xdp;
+	bpf_op_t ndo_bpf;
+
+	/* Remove generic XDP */
+	WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL, 0, NULL));
+
+	/* Remove from the driver */
+	ndo_bpf = dev->netdev_ops->ndo_bpf;
+	if (!ndo_bpf)
+		return;
+
+	__dev_xdp_query(dev, ndo_bpf, &xdp);
+	if (xdp.prog_attached == XDP_ATTACHED_NONE)
+		return;
+
+	/* Program removal should always succeed */
+	WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags, NULL));
+}
+
 /**
  *	dev_change_xdp_fd - set or clear a bpf program for a device rx path
  *	@dev: device
@@ -7240,6 +7261,7 @@ static void rollback_registered_many(struct list_head *head)
 		/* Shutdown queueing discipline. */
 		dev_shutdown(dev);
 
+		dev_xdp_uninstall(dev);
 
 		/* Notify protocols, that we are about to destroy
 		 * this device. They should clean all the things.
@@ -8199,7 +8221,6 @@ EXPORT_SYMBOL(alloc_netdev_mqs);
 void free_netdev(struct net_device *dev)
 {
 	struct napi_struct *p, *n;
-	struct bpf_prog *prog;
 
 	might_sleep();
 	netif_free_tx_queues(dev);
@@ -8218,12 +8239,6 @@ void free_netdev(struct net_device *dev)
 	free_percpu(dev->pcpu_refcnt);
 	dev->pcpu_refcnt = NULL;
 
-	prog = rcu_dereference_protected(dev->xdp_prog, 1);
-	if (prog) {
-		bpf_prog_put(prog);
-		static_key_slow_dec(&generic_xdp_needed);
-	}
-
 	/*  Compatibility with error handling in drivers */
 	if (dev->reg_state == NETREG_UNINITIALIZED) {
 		netdev_freemem(dev);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 4/8] netdevsim: add software driver for testing offloads
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
                   ` (2 preceding siblings ...)
  2017-12-01  1:35 ` [PATCH net-next v2 3/8] net: xdp: make the stack take care of the tear down Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 5/8] netdevsim: add bpf offload support Jakub Kicinski
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

To be able to run selftests without any hardware required we
need a software model.  The model can also serve as an example
implementation for those implementing actual HW offloads.
The dummy driver have previously been extended to test SR-IOV,
but the general consensus seems to be against adding further
features to it.

Add a new driver for purposes of software modelling only.
eBPF and SR-IOV will be added here shortly, others are invited
to further extend the driver with their offload models.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 MAINTAINERS                       |   5 ++
 drivers/net/Kconfig               |  11 ++++
 drivers/net/Makefile              |   1 +
 drivers/net/netdevsim/Makefile    |   6 ++
 drivers/net/netdevsim/netdev.c    | 118 ++++++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h |  26 +++++++++
 6 files changed, 167 insertions(+)
 create mode 100644 drivers/net/netdevsim/Makefile
 create mode 100644 drivers/net/netdevsim/netdev.c
 create mode 100644 drivers/net/netdevsim/netdevsim.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 77d819b458a9..010e46a38373 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9599,6 +9599,11 @@ NETWORKING [WIRELESS]
 L:	linux-wireless@vger.kernel.org
 Q:	http://patchwork.kernel.org/project/linux-wireless/list/
 
+NETDEVSIM
+M:	Jakub Kicinski <jakub.kicinski@netronome.com>
+S:	Maintained
+F:	drivers/net/netdevsim/*
+
 NETXEN (1/10) GbE SUPPORT
 M:	Manish Chopra <manish.chopra@cavium.com>
 M:	Rahul Verma <rahul.verma@cavium.com>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0936da592e12..944ec3c9282c 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -497,4 +497,15 @@ config THUNDERBOLT_NET
 
 source "drivers/net/hyperv/Kconfig"
 
+config NETDEVSIM
+	tristate "Simulated networking device"
+	depends on DEBUG_FS
+	help
+	  This driver is a developer testing tool and software model that can
+	  be used to test various control path networking APIs, especially
+	  HW-offload related.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called netdevsim.
+
 endif # NETDEVICES
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 766f62d02a0b..04c3b747812c 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -78,3 +78,4 @@ obj-$(CONFIG_FUJITSU_ES) += fjes/
 
 thunderbolt-net-y += thunderbolt.o
 obj-$(CONFIG_THUNDERBOLT_NET) += thunderbolt-net.o
+obj-$(CONFIG_NETDEVSIM) += netdevsim/
diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
new file mode 100644
index 000000000000..07867bfe873b
--- /dev/null
+++ b/drivers/net/netdevsim/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_NETDEVSIM) += netdevsim.o
+
+netdevsim-objs := \
+	netdev.o \
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
new file mode 100644
index 000000000000..7599c72c477a
--- /dev/null
+++ b/drivers/net/netdevsim/netdev.c
@@ -0,0 +1,118 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/slab.h>
+#include <net/netlink.h>
+#include <net/rtnetlink.h>
+
+#include "netdevsim.h"
+
+static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	u64_stats_update_begin(&ns->syncp);
+	ns->tx_packets++;
+	ns->tx_bytes += skb->len;
+	u64_stats_update_end(&ns->syncp);
+
+	dev_kfree_skb(skb);
+
+	return NETDEV_TX_OK;
+}
+
+static void nsim_set_rx_mode(struct net_device *dev)
+{
+}
+
+static void
+nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	unsigned int start;
+
+	do {
+		start = u64_stats_fetch_begin(&ns->syncp);
+		stats->tx_bytes = ns->tx_bytes;
+		stats->tx_packets = ns->tx_packets;
+	} while (u64_stats_fetch_retry(&ns->syncp, start));
+}
+
+static const struct net_device_ops nsim_netdev_ops = {
+	.ndo_start_xmit		= nsim_start_xmit,
+	.ndo_set_rx_mode	= nsim_set_rx_mode,
+	.ndo_set_mac_address	= eth_mac_addr,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_get_stats64	= nsim_get_stats64,
+};
+
+static void nsim_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+	eth_hw_addr_random(dev);
+
+	dev->netdev_ops = &nsim_netdev_ops;
+	dev->needs_free_netdev = true;
+
+	dev->tx_queue_len = 0;
+	dev->flags |= IFF_NOARP;
+	dev->flags &= ~IFF_MULTICAST;
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
+			   IFF_NO_QUEUE;
+	dev->features |= NETIF_F_HIGHDMA |
+			 NETIF_F_SG |
+			 NETIF_F_FRAGLIST |
+			 NETIF_F_HW_CSUM |
+			 NETIF_F_TSO;
+	dev->max_mtu = ETH_MAX_MTU;
+}
+
+static int nsim_validate(struct nlattr *tb[], struct nlattr *data[],
+			 struct netlink_ext_ack *extack)
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+	return 0;
+}
+
+static struct rtnl_link_ops nsim_link_ops __read_mostly = {
+	.kind		= DRV_NAME,
+	.priv_size	= sizeof(struct netdevsim),
+	.setup		= nsim_setup,
+	.validate	= nsim_validate,
+};
+
+static int __init nsim_module_init(void)
+{
+	return rtnl_link_register(&nsim_link_ops);
+}
+
+static void __exit nsim_module_exit(void)
+{
+	rtnl_link_unregister(&nsim_link_ops);
+}
+
+module_init(nsim_module_init);
+module_exit(nsim_module_exit);
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_RTNL_LINK(DRV_NAME);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
new file mode 100644
index 000000000000..4558c6f11598
--- /dev/null
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/u64_stats_sync.h>
+
+#define DRV_NAME	"netdevsim"
+
+struct netdevsim {
+	u64 tx_packets;
+	u64 tx_bytes;
+	struct u64_stats_sync syncp;
+};
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 5/8] netdevsim: add bpf offload support
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
                   ` (3 preceding siblings ...)
  2017-12-01  1:35 ` [PATCH net-next v2 4/8] netdevsim: add software driver for testing offloads Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 6/8] selftests/bpf: add offload test based on netdevsim Jakub Kicinski
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

Add support for loading programs for netdevsim devices and
expose the related information via DebugFS.  Both offload
of XDP and cls_bpf programs is supported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 drivers/net/netdevsim/Makefile    |   1 +
 drivers/net/netdevsim/bpf.c       | 373 ++++++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdev.c    | 116 +++++++++++-
 drivers/net/netdevsim/netdevsim.h |  40 ++++
 4 files changed, 529 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/netdevsim/bpf.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index 07867bfe873b..074ddebbc41d 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_NETDEVSIM) += netdevsim.o
 
 netdevsim-objs := \
 	netdev.o \
+	bpf.o \
diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
new file mode 100644
index 000000000000..8e4398a50903
--- /dev/null
+++ b/drivers/net/netdevsim/bpf.c
@@ -0,0 +1,373 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/rtnetlink.h>
+#include <net/pkt_cls.h>
+
+#include "netdevsim.h"
+
+struct nsim_bpf_bound_prog {
+	struct netdevsim *ns;
+	struct bpf_prog *prog;
+	struct dentry *ddir;
+	const char *state;
+	bool is_loaded;
+	struct list_head l;
+};
+
+static int nsim_debugfs_bpf_string_read(struct seq_file *file, void *data)
+{
+	const char **str = file->private;
+
+	if (*str)
+		seq_printf(file, "%s\n", *str);
+
+	return 0;
+}
+
+static int nsim_debugfs_bpf_string_open(struct inode *inode, struct file *f)
+{
+	return single_open(f, nsim_debugfs_bpf_string_read, inode->i_private);
+}
+
+static const struct file_operations nsim_bpf_string_fops = {
+	.owner = THIS_MODULE,
+	.open = nsim_debugfs_bpf_string_open,
+	.release = single_release,
+	.read = seq_read,
+	.llseek = seq_lseek
+};
+
+static int
+nsim_bpf_verify_insn(struct bpf_verifier_env *env, int insn_idx, int prev_insn)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	state = env->prog->aux->offload->dev_priv;
+	if (state->ns->bpf_bind_verifier_delay && !insn_idx)
+		msleep(state->ns->bpf_bind_verifier_delay);
+
+	return 0;
+}
+
+static const struct bpf_ext_analyzer_ops nsim_bpf_analyzer_ops = {
+	.insn_hook = nsim_bpf_verify_insn,
+};
+
+static bool nsim_xdp_offload_active(struct netdevsim *ns)
+{
+	return ns->xdp_prog_mode == XDP_ATTACHED_HW;
+}
+
+static void nsim_prog_set_loaded(struct bpf_prog *prog, bool loaded)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	if (!prog || !prog->aux->offload)
+		return;
+
+	state = prog->aux->offload->dev_priv;
+	state->is_loaded = loaded;
+}
+
+static int
+nsim_bpf_offload(struct netdevsim *ns, struct bpf_prog *prog, bool oldprog)
+{
+	nsim_prog_set_loaded(ns->bpf_offloaded, false);
+
+	WARN(!!ns->bpf_offloaded != oldprog,
+	     "bad offload state, expected offload %sto be active",
+	     oldprog ? "" : "not ");
+	ns->bpf_offloaded = prog;
+	ns->bpf_offloaded_id = prog ? prog->aux->id : 0;
+	nsim_prog_set_loaded(prog, true);
+
+	return 0;
+}
+
+int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
+			       void *type_data, void *cb_priv)
+{
+	struct tc_cls_bpf_offload *cls_bpf = type_data;
+	struct bpf_prog *prog = cls_bpf->prog;
+	struct netdevsim *ns = cb_priv;
+	bool skip_sw;
+
+	if (type != TC_SETUP_CLSBPF ||
+	    !tc_can_offload(ns->netdev) ||
+	    cls_bpf->common.protocol != htons(ETH_P_ALL) ||
+	    cls_bpf->common.chain_index)
+		return -EOPNOTSUPP;
+
+	skip_sw = cls_bpf->gen_flags & TCA_CLS_FLAGS_SKIP_SW;
+
+	if (nsim_xdp_offload_active(ns))
+		return -EBUSY;
+
+	if (!ns->bpf_tc_accept)
+		return -EOPNOTSUPP;
+	/* Note: progs without skip_sw will probably not be dev bound */
+	if (prog && !prog->aux->offload && !ns->bpf_tc_non_bound_accept)
+		return -EOPNOTSUPP;
+
+	switch (cls_bpf->command) {
+	case TC_CLSBPF_REPLACE:
+		return nsim_bpf_offload(ns, prog, true);
+	case TC_CLSBPF_ADD:
+		return nsim_bpf_offload(ns, prog, false);
+	case TC_CLSBPF_DESTROY:
+		return nsim_bpf_offload(ns, NULL, true);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+int nsim_bpf_disable_tc(struct netdevsim *ns)
+{
+	if (ns->bpf_offloaded && !nsim_xdp_offload_active(ns))
+		return -EBUSY;
+	return 0;
+}
+
+static int nsim_xdp_offload_prog(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	if (!nsim_xdp_offload_active(ns) && !bpf->prog)
+		return 0;
+	if (!nsim_xdp_offload_active(ns) && bpf->prog && ns->bpf_offloaded) {
+		NSIM_EA(bpf->extack, "TC program is already loaded");
+		return -EBUSY;
+	}
+
+	return nsim_bpf_offload(ns, bpf->prog, nsim_xdp_offload_active(ns));
+}
+
+static int nsim_xdp_set_prog(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	int err;
+
+	if (ns->xdp_prog && (bpf->flags ^ ns->xdp_flags) & XDP_FLAGS_MODES) {
+		NSIM_EA(bpf->extack, "program loaded with different flags");
+		return -EBUSY;
+	}
+
+	if (bpf->command == XDP_SETUP_PROG && !ns->bpf_xdpdrv_accept) {
+		NSIM_EA(bpf->extack, "driver XDP disabled in DebugFS");
+		return -EOPNOTSUPP;
+	}
+	if (bpf->command == XDP_SETUP_PROG_HW && !ns->bpf_xdpoffload_accept) {
+		NSIM_EA(bpf->extack, "XDP offload disabled in DebugFS");
+		return -EOPNOTSUPP;
+	}
+
+	if (bpf->command == XDP_SETUP_PROG_HW) {
+		err = nsim_xdp_offload_prog(ns, bpf);
+		if (err)
+			return err;
+	}
+
+	if (ns->xdp_prog)
+		bpf_prog_put(ns->xdp_prog);
+
+	ns->xdp_prog = bpf->prog;
+	ns->xdp_flags = bpf->flags;
+
+	if (!bpf->prog)
+		ns->xdp_prog_mode = XDP_ATTACHED_NONE;
+	else if (bpf->command == XDP_SETUP_PROG)
+		ns->xdp_prog_mode = XDP_ATTACHED_DRV;
+	else
+		ns->xdp_prog_mode = XDP_ATTACHED_HW;
+
+	return 0;
+}
+
+int nsim_bpf_create_prog(struct netdevsim *ns, struct bpf_prog *prog)
+{
+	struct nsim_bpf_bound_prog *state;
+	char name[16];
+	int err;
+
+	state = kzalloc(sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return -ENOMEM;
+
+	state->ns = ns;
+	state->prog = prog;
+	state->state = "verify";
+
+	/* Program id is not populated yet when we create the state. */
+	sprintf(name, "%u", ns->prog_id_gen++);
+	state->ddir = debugfs_create_dir(name, ns->ddir_bpf_bound_progs);
+	if (IS_ERR(state->ddir)) {
+		err = PTR_ERR(state->ddir);
+		kfree(state);
+		return err;
+	}
+
+	debugfs_create_u32("id", 0400, state->ddir, &prog->aux->id);
+	debugfs_create_file("state", 0400, state->ddir,
+			    &state->state, &nsim_bpf_string_fops);
+	debugfs_create_bool("loaded", 0400, state->ddir, &state->is_loaded);
+
+	list_add_tail(&state->l, &ns->bpf_bound_progs);
+
+	prog->aux->offload->dev_priv = state;
+
+	return 0;
+}
+
+void nsim_bpf_destroy_prog(struct bpf_prog *prog)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	state = prog->aux->offload->dev_priv;
+	WARN(state->is_loaded,
+	     "offload state destroyed while program still bound");
+	debugfs_remove_recursive(state->ddir);
+	list_del(&state->l);
+	kfree(state);
+}
+
+static int nsim_setup_prog_checks(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	if (bpf->prog && bpf->prog->aux->offload) {
+		NSIM_EA(bpf->extack, "attempt to load offloaded prog to drv");
+		return -EINVAL;
+	}
+	if (ns->netdev->mtu > NSIM_XDP_MAX_MTU) {
+		NSIM_EA(bpf->extack, "MTU too large w/ XDP enabled");
+		return -EINVAL;
+	}
+	if (nsim_xdp_offload_active(ns)) {
+		NSIM_EA(bpf->extack, "xdp offload active, can't load drv prog");
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static int
+nsim_setup_prog_hw_checks(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	if (!bpf->prog)
+		return 0;
+
+	if (!bpf->prog->aux->offload) {
+		NSIM_EA(bpf->extack, "xdpoffload of non-bound program");
+		return -EINVAL;
+	}
+	if (bpf->prog->aux->offload->netdev != ns->netdev) {
+		NSIM_EA(bpf->extack, "program bound to different dev");
+		return -EINVAL;
+	}
+
+	state = bpf->prog->aux->offload->dev_priv;
+	if (WARN_ON(strcmp(state->state, "xlated"))) {
+		NSIM_EA(bpf->extack, "offloading program in bad state");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	struct nsim_bpf_bound_prog *state;
+	int err;
+
+	ASSERT_RTNL();
+
+	switch (bpf->command) {
+	case BPF_OFFLOAD_VERIFIER_PREP:
+		if (!ns->bpf_bind_accept)
+			return -EOPNOTSUPP;
+
+		err = nsim_bpf_create_prog(ns, bpf->verifier.prog);
+		if (err)
+			return err;
+
+		bpf->verifier.ops = &nsim_bpf_analyzer_ops;
+		return 0;
+	case BPF_OFFLOAD_TRANSLATE:
+		state = bpf->offload.prog->aux->offload->dev_priv;
+
+		state->state = "xlated";
+		return 0;
+	case BPF_OFFLOAD_DESTROY:
+		nsim_bpf_destroy_prog(bpf->offload.prog);
+		return 0;
+	case XDP_QUERY_PROG:
+		bpf->prog_attached = ns->xdp_prog_mode;
+		bpf->prog_id = ns->xdp_prog ? ns->xdp_prog->aux->id : 0;
+		bpf->prog_flags = ns->xdp_prog ? ns->xdp_flags : 0;
+		return 0;
+	case XDP_SETUP_PROG:
+		err = nsim_setup_prog_checks(ns, bpf);
+		if (err)
+			return err;
+
+		return nsim_xdp_set_prog(ns, bpf);
+	case XDP_SETUP_PROG_HW:
+		err = nsim_setup_prog_hw_checks(ns, bpf);
+		if (err)
+			return err;
+
+		return nsim_xdp_set_prog(ns, bpf);
+	default:
+		return -EINVAL;
+	}
+}
+
+int nsim_bpf_init(struct netdevsim *ns)
+{
+	INIT_LIST_HEAD(&ns->bpf_bound_progs);
+
+	debugfs_create_u32("bpf_offloaded_id", 0400, ns->ddir,
+			   &ns->bpf_offloaded_id);
+
+	ns->bpf_bind_accept = true;
+	debugfs_create_bool("bpf_bind_accept", 0600, ns->ddir,
+			    &ns->bpf_bind_accept);
+	debugfs_create_u32("bpf_bind_verifier_delay", 0600, ns->ddir,
+			   &ns->bpf_bind_verifier_delay);
+	ns->ddir_bpf_bound_progs =
+		debugfs_create_dir("bpf_bound_progs", ns->ddir);
+
+	ns->bpf_tc_accept = true;
+	debugfs_create_bool("bpf_tc_accept", 0600, ns->ddir,
+			    &ns->bpf_tc_accept);
+	debugfs_create_bool("bpf_tc_non_bound_accept", 0600, ns->ddir,
+			    &ns->bpf_tc_non_bound_accept);
+	ns->bpf_xdpdrv_accept = true;
+	debugfs_create_bool("bpf_xdpdrv_accept", 0600, ns->ddir,
+			    &ns->bpf_xdpdrv_accept);
+	ns->bpf_xdpoffload_accept = true;
+	debugfs_create_bool("bpf_xdpoffload_accept", 0600, ns->ddir,
+			    &ns->bpf_xdpoffload_accept);
+
+	return 0;
+}
+
+void nsim_bpf_uninit(struct netdevsim *ns)
+{
+	WARN_ON(!list_empty(&ns->bpf_bound_progs));
+	WARN_ON(ns->xdp_prog);
+	WARN_ON(ns->bpf_offloaded);
+}
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 7599c72c477a..828c1ce49a8b 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -13,16 +13,45 @@
  * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
  */
 
+#include <linux/debugfs.h>
 #include <linux/etherdevice.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/netdevice.h>
 #include <linux/slab.h>
 #include <net/netlink.h>
+#include <net/pkt_cls.h>
 #include <net/rtnetlink.h>
 
 #include "netdevsim.h"
 
+static int nsim_init(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	int err;
+
+	ns->netdev = dev;
+	ns->ddir = debugfs_create_dir(netdev_name(dev), nsim_ddir);
+
+	err = nsim_bpf_init(ns);
+	if (err)
+		goto err_debugfs_destroy;
+
+	return 0;
+
+err_debugfs_destroy:
+	debugfs_remove_recursive(ns->ddir);
+	return err;
+}
+
+static void nsim_uninit(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	debugfs_remove_recursive(ns->ddir);
+	nsim_bpf_uninit(ns);
+}
+
 static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct netdevsim *ns = netdev_priv(dev);
@@ -41,6 +70,19 @@ static void nsim_set_rx_mode(struct net_device *dev)
 {
 }
 
+static int nsim_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (ns->xdp_prog_mode == XDP_ATTACHED_DRV &&
+	    new_mtu > NSIM_XDP_MAX_MTU)
+		return -EBUSY;
+
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
 static void
 nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
@@ -54,12 +96,66 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	} while (u64_stats_fetch_retry(&ns->syncp, start));
 }
 
+static int
+nsim_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv)
+{
+	return nsim_bpf_setup_tc_block_cb(type, type_data, cb_priv);
+}
+
+static int
+nsim_setup_tc_block(struct net_device *dev, struct tc_block_offload *f)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (f->binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
+		return -EOPNOTSUPP;
+
+	switch (f->command) {
+	case TC_BLOCK_BIND:
+		return tcf_block_cb_register(f->block, nsim_setup_tc_block_cb,
+					     ns, ns);
+	case TC_BLOCK_UNBIND:
+		tcf_block_cb_unregister(f->block, nsim_setup_tc_block_cb, ns);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+nsim_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_BLOCK:
+		return nsim_setup_tc_block(dev, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+nsim_set_features(struct net_device *dev, netdev_features_t features)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if ((dev->features & NETIF_F_HW_TC) > (features & NETIF_F_HW_TC))
+		return nsim_bpf_disable_tc(ns);
+
+	return 0;
+}
+
 static const struct net_device_ops nsim_netdev_ops = {
+	.ndo_init		= nsim_init,
+	.ndo_uninit		= nsim_uninit,
 	.ndo_start_xmit		= nsim_start_xmit,
 	.ndo_set_rx_mode	= nsim_set_rx_mode,
 	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_change_mtu		= nsim_change_mtu,
 	.ndo_get_stats64	= nsim_get_stats64,
+	.ndo_setup_tc		= nsim_setup_tc,
+	.ndo_set_features	= nsim_set_features,
+	.ndo_bpf		= nsim_bpf,
 };
 
 static void nsim_setup(struct net_device *dev)
@@ -80,6 +176,7 @@ static void nsim_setup(struct net_device *dev)
 			 NETIF_F_FRAGLIST |
 			 NETIF_F_HW_CSUM |
 			 NETIF_F_TSO;
+	dev->hw_features |= NETIF_F_HW_TC;
 	dev->max_mtu = ETH_MAX_MTU;
 }
 
@@ -102,14 +199,31 @@ static struct rtnl_link_ops nsim_link_ops __read_mostly = {
 	.validate	= nsim_validate,
 };
 
+struct dentry *nsim_ddir;
+
 static int __init nsim_module_init(void)
 {
-	return rtnl_link_register(&nsim_link_ops);
+	int err;
+
+	nsim_ddir = debugfs_create_dir(DRV_NAME, NULL);
+	if (IS_ERR(nsim_ddir))
+		return PTR_ERR(nsim_ddir);
+
+	err = rtnl_link_register(&nsim_link_ops);
+	if (err)
+		goto err_debugfs_destroy;
+
+	return 0;
+
+err_debugfs_destroy:
+	debugfs_remove_recursive(nsim_ddir);
+	return err;
 }
 
 static void __exit nsim_module_exit(void)
 {
 	rtnl_link_unregister(&nsim_link_ops);
+	debugfs_remove_recursive(nsim_ddir);
 }
 
 module_init(nsim_module_init);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 4558c6f11598..8779e6a8f885 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -14,13 +14,53 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/list.h>
 #include <linux/netdevice.h>
 #include <linux/u64_stats_sync.h>
 
 #define DRV_NAME	"netdevsim"
 
+#define NSIM_XDP_MAX_MTU	4000
+
+#define NSIM_EA(extack, msg)	NL_SET_ERR_MSG_MOD((extack), msg)
+
+struct bpf_prog;
+struct dentry;
+
 struct netdevsim {
+	struct net_device *netdev;
+
 	u64 tx_packets;
 	u64 tx_bytes;
 	struct u64_stats_sync syncp;
+
+	struct dentry *ddir;
+
+	struct bpf_prog	*bpf_offloaded;
+	u32 bpf_offloaded_id;
+
+	u32 xdp_flags;
+	int xdp_prog_mode;
+	struct bpf_prog	*xdp_prog;
+
+	u32 prog_id_gen;
+
+	bool bpf_bind_accept;
+	u32 bpf_bind_verifier_delay;
+	struct dentry *ddir_bpf_bound_progs;
+	struct list_head bpf_bound_progs;
+
+	bool bpf_tc_accept;
+	bool bpf_tc_non_bound_accept;
+	bool bpf_xdpdrv_accept;
+	bool bpf_xdpoffload_accept;
 };
+
+extern struct dentry *nsim_ddir;
+
+int nsim_bpf_init(struct netdevsim *ns);
+void nsim_bpf_uninit(struct netdevsim *ns);
+int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf);
+int nsim_bpf_disable_tc(struct netdevsim *ns);
+int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
+			       void *type_data, void *cb_priv);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 6/8] selftests/bpf: add offload test based on netdevsim
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
                   ` (4 preceding siblings ...)
  2017-12-01  1:35 ` [PATCH net-next v2 5/8] netdevsim: add bpf offload support Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality Jakub Kicinski
  2017-12-01  1:35 ` [PATCH net-next v2 8/8] net: dummy: remove fake " Jakub Kicinski
  7 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers, Jakub Kicinski

Add a test of BPF offload control path interfaces based on
just-added netdevsim driver.  Perform various checks of both
the stack and the expected driver behaviour.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/testing/selftests/bpf/Makefile        |   5 +-
 tools/testing/selftests/bpf/sample_ret0.c   |   7 +
 tools/testing/selftests/bpf/test_offload.py | 681 ++++++++++++++++++++++++++++
 3 files changed, 691 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/sample_ret0.c
 create mode 100755 tools/testing/selftests/bpf/test_offload.py

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 333a48655ee0..2c9d8c63c6fa 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -17,9 +17,10 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
 
 TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test_obj_id.o \
 	test_pkt_md_access.o test_xdp_redirect.o test_xdp_meta.o sockmap_parse_prog.o     \
-	sockmap_verdict_prog.o dev_cgroup.o
+	sockmap_verdict_prog.o dev_cgroup.o sample_ret0.o
 
-TEST_PROGS := test_kmod.sh test_xdp_redirect.sh test_xdp_meta.sh
+TEST_PROGS := test_kmod.sh test_xdp_redirect.sh test_xdp_meta.sh \
+	test_offload.py
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/bpf/sample_ret0.c b/tools/testing/selftests/bpf/sample_ret0.c
new file mode 100644
index 000000000000..fec99750d6ea
--- /dev/null
+++ b/tools/testing/selftests/bpf/sample_ret0.c
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) */
+
+/* Sample program which should always load for testing control paths. */
+int func()
+{
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/test_offload.py b/tools/testing/selftests/bpf/test_offload.py
new file mode 100755
index 000000000000..3914f7a4585a
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_offload.py
@@ -0,0 +1,681 @@
+#!/usr/bin/python3
+
+# Copyright (C) 2017 Netronome Systems, Inc.
+#
+# This software is licensed under the GNU General License Version 2,
+# June 1991 as shown in the file COPYING in the top-level directory of this
+# source tree.
+#
+# THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+# WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+# BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+# FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+# OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+# THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+from datetime import datetime
+import argparse
+import json
+import os
+import pprint
+import subprocess
+import time
+
+logfile = None
+log_level = 1
+bpf_test_dir = os.path.dirname(os.path.realpath(__file__))
+pp = pprint.PrettyPrinter()
+devs = [] # devices we created for clean up
+files = [] # files to be removed
+
+def log_get_sec(level=0):
+    return "*" * (log_level + level)
+
+def log_level_inc(add=1):
+    global log_level
+    log_level += add
+
+def log_level_dec(sub=1):
+    global log_level
+    log_level -= sub
+
+def log_level_set(level):
+    global log_level
+    log_level = level
+
+def log(header, data, level=None):
+    """
+    Output to an optional log.
+    """
+    if logfile is None:
+        return
+    if level is not None:
+        log_level_set(level)
+
+    if not isinstance(data, str):
+        data = pp.pformat(data)
+
+    if len(header):
+        logfile.write("\n" + log_get_sec() + " ")
+        logfile.write(header)
+    if len(header) and len(data.strip()):
+        logfile.write("\n")
+    logfile.write(data)
+
+def skip(cond, msg):
+    if not cond:
+        return
+    print("SKIP: " + msg)
+    log("SKIP: " + msg, "", level=1)
+    os.sys.exit(0)
+
+def fail(cond, msg):
+    if not cond:
+        return
+    print("FAIL: " + msg)
+    log("FAIL: " + msg, "", level=1)
+    os.sys.exit(1)
+
+def start_test(msg):
+    log(msg, "", level=1)
+    log_level_inc()
+    print(msg)
+
+def cmd(cmd, shell=True, include_stderr=False, background=False, fail=True):
+    """
+    Run a command in subprocess and return tuple of (retval, stdout);
+    optionally return stderr as well as third value.
+    """
+    proc = subprocess.Popen(cmd, shell=shell, stdout=subprocess.PIPE,
+                            stderr=subprocess.PIPE)
+    if background:
+        msg = "%s START: %s" % (log_get_sec(1),
+                                datetime.now().strftime("%H:%M:%S.%f"))
+        log("BKG " + proc.args, msg)
+        return proc
+
+    return cmd_result(proc, include_stderr=include_stderr, fail=fail)
+
+def cmd_result(proc, include_stderr=False, fail=False):
+    stdout, stderr = proc.communicate()
+    stdout = stdout.decode("utf-8")
+    stderr = stderr.decode("utf-8")
+    proc.stdout.close()
+    proc.stderr.close()
+
+    stderr = "\n" + stderr
+    if stderr[-1] == "\n":
+        stderr = stderr[:-1]
+
+    sec = log_get_sec(1)
+    log("CMD " + proc.args,
+        "RETCODE: %d\n%s STDOUT:\n%s%s STDERR:%s\n%s END: %s" %
+        (proc.returncode, sec, stdout, sec, stderr,
+         sec, datetime.now().strftime("%H:%M:%S.%f")))
+
+    if proc.returncode != 0 and fail:
+        if len(stderr) > 0 and stderr[-1] == "\n":
+            stderr = stderr[:-1]
+        raise Exception("Command failed: %s\n%s" % (proc.args, stderr))
+
+    if include_stderr:
+        return proc.returncode, stdout, stderr
+    else:
+        return proc.returncode, stdout
+
+def rm(f):
+    cmd("rm -f %s" % (f))
+    if f in files:
+        files.remove(f)
+
+def tool(name, args, flags, JSON=True, fail=True):
+    params = ""
+    if JSON:
+        params += "%s " % (flags["json"])
+
+    ret, out = cmd(name + " " + params + args, fail=fail)
+    if JSON and len(out.strip()) != 0:
+        return ret, json.loads(out)
+    else:
+        return ret, out
+
+def bpftool(args, JSON=True, fail=True):
+    return tool("bpftool", args, {"json":"-p"}, JSON=JSON, fail=fail)
+
+def bpftool_prog_list(expected=None):
+    _, progs = bpftool("prog show", JSON=True, fail=True)
+    if expected is not None:
+        if len(progs) != expected:
+            fail(True, "%d BPF programs loaded, expected %d" %
+                 (len(progs), expected))
+    return progs
+
+def bpftool_prog_list_wait(expected=0, n_retry=20):
+    for i in range(n_retry):
+        nprogs = len(bpftool_prog_list())
+        if nprogs == expected:
+            return
+        time.sleep(0.05)
+    raise Exception("Time out waiting for program counts to stabilize want %d, have %d" % (expected, nprogs))
+
+def ip(args, force=False, JSON=True, fail=True):
+    if force:
+        args = "-force " + args
+    return tool("ip", args, {"json":"-j"}, JSON=JSON, fail=fail)
+
+def tc(args, JSON=True, fail=True):
+    return tool("tc", args, {"json":"-p"}, JSON=JSON, fail=fail)
+
+def ethtool(dev, opt, args, fail=True):
+    return cmd("ethtool %s %s %s" % (opt, dev["ifname"], args), fail=fail)
+
+def bpf_obj(name, sec=".text", path=bpf_test_dir,):
+    return "obj %s sec %s" % (os.path.join(path, name), sec)
+
+def bpf_pinned(name):
+    return "pinned %s" % (name)
+
+def bpf_bytecode(bytecode):
+    return "bytecode \"%s\"" % (bytecode)
+
+class DebugfsDir:
+    """
+    Class for accessing DebugFS directories as a dictionary.
+    """
+
+    def __init__(self, path):
+        self.path = path
+        self._dict = self._debugfs_dir_read(path)
+
+    def __len__(self):
+        return len(self._dict.keys())
+
+    def __getitem__(self, key):
+        if type(key) is int:
+            key = list(self._dict.keys())[key]
+        return self._dict[key]
+
+    def __setitem__(self, key, value):
+        log("DebugFS set %s = %s" % (key, value), "")
+        log_level_inc()
+
+        cmd("echo '%s' > %s/%s" % (value, self.path, key))
+        log_level_dec()
+
+        _, out = cmd('cat %s/%s' % (self.path, key))
+        self._dict[key] = out.strip()
+
+    def _debugfs_dir_read(self, path):
+        dfs = {}
+
+        log("DebugFS state for %s" % (path), "")
+        log_level_inc(add=2)
+
+        _, out = cmd('ls ' + path)
+        for f in out.split():
+            p = os.path.join(path, f)
+            if os.path.isfile(p):
+                _, out = cmd('cat %s/%s' % (path, f))
+                dfs[f] = out.strip()
+            elif os.path.isdir(p):
+                dfs[f] = DebugfsDir(p)
+            else:
+                raise Exception("%s is neither file nor directory" % (p))
+
+        log_level_dec()
+        log("DebugFS state", dfs)
+        log_level_dec()
+
+        return dfs
+
+class NetdevSim:
+    """
+    Class for netdevsim netdevice and its attributes.
+    """
+
+    def __init__(self):
+        self.dev = self._netdevsim_create()
+        devs.append(self)
+
+        self.dfs_dir = '/sys/kernel/debug/netdevsim/%s' % (self.dev['ifname'])
+        self.dfs_refresh()
+
+    def __getitem__(self, key):
+        return self.dev[key]
+
+    def _netdevsim_create(self):
+        _, old  = ip("link show")
+        ip("link add sim%d type netdevsim")
+        _, new  = ip("link show")
+
+        for dev in new:
+            f = filter(lambda x: x["ifname"] == dev["ifname"], old)
+            if len(list(f)) == 0:
+                return dev
+
+        raise Exception("failed to create netdevsim device")
+
+    def remove(self):
+        devs.remove(self)
+        ip("link del dev %s" % (self.dev["ifname"]))
+
+    def dfs_refresh(self):
+        self.dfs = DebugfsDir(self.dfs_dir)
+        return self.dfs
+
+    def dfs_num_bound_progs(self):
+        path = os.path.join(self.dfs_dir, "bpf_bound_progs")
+        _, progs = cmd('ls %s' % (path))
+        return len(progs.split())
+
+    def dfs_get_bound_progs(self, expected):
+        progs = DebugfsDir(os.path.join(self.dfs_dir, "bpf_bound_progs"))
+        if expected is not None:
+            if len(progs) != expected:
+                fail(True, "%d BPF programs bound, expected %d" %
+                     (len(progs), expected))
+        return progs
+
+    def wait_for_flush(self, bound=0, total=0, n_retry=20):
+        for i in range(n_retry):
+            nbound = self.dfs_num_bound_progs()
+            nprogs = len(bpftool_prog_list())
+            if nbound == bound and nprogs == total:
+                return
+            time.sleep(0.05)
+        raise Exception("Time out waiting for program counts to stabilize want %d/%d, have %d bound, %d loaded" % (bound, total, nbound, nprogs))
+
+    def set_mtu(self, mtu, fail=True):
+        return ip("link set dev %s mtu %d" % (self.dev["ifname"], mtu),
+                  fail=fail)
+
+    def set_xdp(self, bpf, mode, force=False, fail=True):
+        return ip("link set dev %s xdp%s %s" % (self.dev["ifname"], mode, bpf),
+                  force=force, fail=fail)
+
+    def unset_xdp(self, mode, force=False, fail=True):
+        return ip("link set dev %s xdp%s off" % (self.dev["ifname"], mode),
+                  force=force, fail=fail)
+
+    def ip_link_show(self, xdp):
+        _, link = ip("link show dev %s" % (self['ifname']))
+        if len(link) > 1:
+            raise Exception("Multiple objects on ip link show")
+        if len(link) < 1:
+            return {}
+        fail(xdp != "xdp" in link,
+             "XDP program not reporting in iplink (reported %s, expected %s)" %
+             ("xdp" in link, xdp))
+        return link[0]
+
+    def tc_add_ingress(self):
+        tc("qdisc add dev %s ingress" % (self['ifname']))
+
+    def tc_del_ingress(self):
+        tc("qdisc del dev %s ingress" % (self['ifname']))
+
+    def tc_flush_filters(self, bound=0, total=0):
+        self.tc_del_ingress()
+        self.tc_add_ingress()
+        self.wait_for_flush(bound=bound, total=total)
+
+    def tc_show_ingress(self, expected=None):
+        # No JSON support, oh well...
+        flags = ["skip_sw", "skip_hw", "in_hw"]
+        named = ["protocol", "pref", "chain", "handle", "id", "tag"]
+
+        args = "-s filter show dev %s ingress" % (self['ifname'])
+        _, out = tc(args, JSON=False)
+
+        filters = []
+        lines = out.split('\n')
+        for line in lines:
+            words = line.split()
+            if "handle" not in words:
+                continue
+            fltr = {}
+            for flag in flags:
+                fltr[flag] = flag in words
+            for name in named:
+                try:
+                    idx = words.index(name)
+                    fltr[name] = words[idx + 1]
+                except ValueError:
+                    pass
+            filters.append(fltr)
+
+        if expected is not None:
+            fail(len(filters) != expected,
+                 "%d ingress filters loaded, expected %d" %
+                 (len(filters), expected))
+        return filters
+
+    def cls_bpf_add_filter(self, bpf, da=False, skip_sw=False, skip_hw=False,
+                           fail=True):
+        params = ""
+        if da:
+            params += " da"
+        if skip_sw:
+            params += " skip_sw"
+        if skip_hw:
+            params += " skip_hw"
+        return tc("filter add dev %s ingress bpf %s %s" %
+                  (self['ifname'], bpf, params), fail=fail)
+
+    def set_ethtool_tc_offloads(self, enable, fail=True):
+        args = "hw-tc-offload %s" % ("on" if enable else "off")
+        return ethtool(self, "-K", args, fail=fail)
+
+################################################################################
+def clean_up():
+    for dev in devs:
+        dev.remove()
+    for f in files:
+        cmd("rm -f %s" % (f))
+
+def pin_prog(file_name, idx=0):
+    progs = bpftool_prog_list(expected=(idx + 1))
+    prog = progs[idx]
+    bpftool("prog pin id %d %s" % (prog["id"], file_name))
+    files.append(file_name)
+
+    return file_name, bpf_pinned(file_name)
+
+# Parse command line
+parser = argparse.ArgumentParser()
+parser.add_argument("--log", help="output verbose log to given file")
+args = parser.parse_args()
+if args.log:
+    logfile = open(args.log, 'w+')
+    logfile.write("# -*-Org-*-")
+
+log("Prepare...", "", level=1)
+log_level_inc()
+
+# Check permissions
+skip(os.getuid() != 0, "test must be run as root")
+
+# Check tools
+ret, progs = bpftool("prog", fail=False)
+skip(ret != 0, "bpftool not installed")
+# Check no BPF programs are loaded
+skip(len(progs) != 0, "BPF programs already loaded on the system")
+
+# Check netdevsim
+ret, out = cmd("modprobe netdevsim", fail=False)
+skip(ret != 0, "netdevsim module could not be loaded")
+
+# Check debugfs
+_, out = cmd("mount")
+if out.find("/sys/kernel/debug type debugfs") == -1:
+    cmd("mount -t debugfs none /sys/kernel/debug")
+
+# Check samples are compiled
+samples = ["sample_ret0.o"]
+for s in samples:
+    ret, out = cmd("ls %s/%s" % (bpf_test_dir, s), fail=False)
+    skip(ret != 0, "sample %s/%s not found, please compile it" %
+         (bpf_test_dir, s))
+
+try:
+    obj = bpf_obj("sample_ret0.o")
+    bytecode = bpf_bytecode("1,6 0 0 4294967295,")
+
+    start_test("Test destruction of generic XDP...")
+    sim = NetdevSim()
+    sim.set_xdp(obj, "generic")
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.tc_add_ingress()
+
+    start_test("Test TC non-offloaded...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_hw=True, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+
+    start_test("Test TC non-offloaded isn't getting bound...")
+    ret, _ = sim.cls_bpf_add_filter(obj, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+
+    sim.tc_flush_filters()
+
+    start_test("Test TC offloads are off by default...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret == 0, "TC filter loaded without enabling TC offloads")
+    sim.wait_for_flush()
+
+    sim.set_ethtool_tc_offloads(True)
+    sim.dfs["bpf_tc_non_bound_accept"] = "Y"
+
+    start_test("Test TC offload by default...")
+    ret, _ = sim.cls_bpf_add_filter(obj, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+    ingress = sim.tc_show_ingress(expected=1)
+    fltr = ingress[0]
+    fail(not fltr["in_hw"], "Filter not offloaded by default")
+
+    sim.tc_flush_filters()
+
+    start_test("Test TC cBPF bytcode tries offload by default...")
+    ret, _ = sim.cls_bpf_add_filter(bytecode, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+    ingress = sim.tc_show_ingress(expected=1)
+    fltr = ingress[0]
+    fail(not fltr["in_hw"], "Bytecode not offloaded by default")
+
+    sim.tc_flush_filters()
+    sim.dfs["bpf_tc_non_bound_accept"] = "N"
+
+    start_test("Test TC cBPF unbound bytecode doesn't offload...")
+    ret, _ = sim.cls_bpf_add_filter(bytecode, skip_sw=True, fail=False)
+    fail(ret == 0, "TC bytecode loaded for offload")
+    sim.wait_for_flush()
+
+    start_test("Test TC offloads work...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret != 0, "TC filter did not load with TC offloads enabled")
+
+    start_test("Test TC offload basics...")
+    dfs = sim.dfs_get_bound_progs(expected=1)
+    progs = bpftool_prog_list(expected=1)
+    ingress = sim.tc_show_ingress(expected=1)
+
+    dprog = dfs[0]
+    prog = progs[0]
+    fltr = ingress[0]
+    fail(fltr["skip_hw"], "TC does reports 'skip_hw' on offloaded filter")
+    fail(not fltr["in_hw"], "TC does not report 'in_hw' for offloaded filter")
+    fail(not fltr["skip_sw"], "TC does not report 'skip_sw' back")
+
+    start_test("Test TC offload is device-bound...")
+    fail(str(prog["id"]) != fltr["id"], "Program IDs don't match")
+    fail(prog["tag"] != fltr["tag"], "Program tags don't match")
+    fail(fltr["id"] != dprog["id"], "Program IDs don't match")
+    fail(dprog["state"] != "xlated", "Offloaded program state not translated")
+    fail(dprog["loaded"] != "Y", "Offloaded program is not loaded")
+
+    start_test("Test disabling TC offloads is rejected while filters installed...")
+    ret, _ = sim.set_ethtool_tc_offloads(False, fail=False)
+    fail(ret == 0, "Driver should refuse to disable TC offloads with filters installed...")
+
+    start_test("Test qdisc removal frees things...")
+    sim.tc_flush_filters()
+    sim.tc_show_ingress(expected=0)
+
+    start_test("Test disabling TC offloads is OK without filters...")
+    ret, _ = sim.set_ethtool_tc_offloads(False, fail=False)
+    fail(ret != 0,
+         "Driver refused to disable TC offloads without filters installed...")
+
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test destroying device gets rid of TC filters...")
+    sim.cls_bpf_add_filter(obj, skip_sw=True)
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test destroying device gets rid of XDP...")
+    sim.set_xdp(obj, "offload")
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test XDP prog reporting...")
+    sim.set_xdp(obj, "drv")
+    ipl = sim.ip_link_show(xdp=True)
+    progs = bpftool_prog_list(expected=1)
+    fail(ipl["xdp"]["prog"]["id"] != progs[0]["id"],
+         "Loaded program has wrong ID")
+
+    start_test("Test XDP prog replace without force...")
+    ret, _ = sim.set_xdp(obj, "drv", fail=False)
+    fail(ret == 0, "Replaced XDP program without -force")
+    sim.wait_for_flush(total=1)
+
+    start_test("Test XDP prog replace with force...")
+    ret, _ = sim.set_xdp(obj, "drv", force=True, fail=False)
+    fail(ret != 0, "Could not replace XDP program with -force")
+    bpftool_prog_list_wait(expected=1)
+    ipl = sim.ip_link_show(xdp=True)
+    progs = bpftool_prog_list(expected=1)
+    fail(ipl["xdp"]["prog"]["id"] != progs[0]["id"],
+         "Loaded program has wrong ID")
+
+    start_test("Test XDP prog replace with bad flags...")
+    ret, _ = sim.set_xdp(obj, "offload", force=True, fail=False)
+    fail(ret == 0, "Replaced XDP program with a program in different mode")
+    ret, _ = sim.set_xdp(obj, "", force=True, fail=False)
+    fail(ret == 0, "Replaced XDP program with a program in different mode")
+
+    start_test("Test XDP prog remove with bad flags...")
+    ret, _ = sim.unset_xdp("offload", force=True, fail=False)
+    fail(ret == 0, "Removed program with a bad mode mode")
+    ret, _ = sim.unset_xdp("", force=True, fail=False)
+    fail(ret == 0, "Removed program with a bad mode mode")
+
+    start_test("Test MTU restrictions...")
+    ret, _ = sim.set_mtu(9000, fail=False)
+    fail(ret == 0,
+         "Driver should refuse to increase MTU to 9000 with XDP loaded...")
+    sim.unset_xdp("drv")
+    bpftool_prog_list_wait(expected=0)
+    sim.set_mtu(9000)
+    ret, _ = sim.set_xdp(obj, "drv", fail=False)
+    fail(ret == 0, "Driver should refuse to load program with MTU of 9000...")
+    sim.set_mtu(1500)
+
+    sim.wait_for_flush()
+    start_test("Test XDP offload...")
+    sim.set_xdp(obj, "offload")
+    ipl = sim.ip_link_show(xdp=True)
+    link_xdp = ipl["xdp"]["prog"]
+    progs = bpftool_prog_list(expected=1)
+    prog = progs[0]
+    fail(link_xdp["id"] != prog["id"], "Loaded program has wrong ID")
+
+    start_test("Test XDP offload is device bound...")
+    dfs = sim.dfs_get_bound_progs(expected=1)
+    dprog = dfs[0]
+
+    fail(prog["id"] != link_xdp["id"], "Program IDs don't match")
+    fail(prog["tag"] != link_xdp["tag"], "Program tags don't match")
+    fail(str(link_xdp["id"]) != dprog["id"], "Program IDs don't match")
+    fail(dprog["state"] != "xlated", "Offloaded program state not translated")
+    fail(dprog["loaded"] != "Y", "Offloaded program is not loaded")
+
+    start_test("Test removing XDP program many times...")
+    sim.unset_xdp("offload")
+    sim.unset_xdp("offload")
+    sim.unset_xdp("drv")
+    sim.unset_xdp("drv")
+    sim.unset_xdp("")
+    sim.unset_xdp("")
+    bpftool_prog_list_wait(expected=0)
+
+    start_test("Test attempt to use a program for a wrong device...")
+    sim2 = NetdevSim()
+    sim2.set_xdp(obj, "offload")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp")
+
+    ret, _ = sim.set_xdp(pinned, "offload", fail=False)
+    fail(ret == 0, "Pinned program loaded for a different device accepted")
+    sim2.remove()
+    ret, _ = sim.set_xdp(pinned, "offload", fail=False)
+    fail(ret == 0, "Pinned program loaded for a removed device accepted")
+    rm(pin_file)
+    bpftool_prog_list_wait(expected=0)
+
+    start_test("Test mixing of TC and XDP...")
+    sim.tc_add_ingress()
+    sim.set_xdp(obj, "offload")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret == 0, "Loading TC when XDP active should fail")
+    sim.unset_xdp("offload")
+    sim.wait_for_flush()
+
+    sim.cls_bpf_add_filter(obj, skip_sw=True)
+    ret, _ = sim.set_xdp(obj, "offload", fail=False)
+    fail(ret == 0, "Loading XDP when TC active should fail")
+
+    start_test("Test binding TC from pinned...")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp")
+    sim.tc_flush_filters(bound=1, total=1)
+    sim.cls_bpf_add_filter(pinned, da=True, skip_sw=True)
+    sim.tc_flush_filters(bound=1, total=1)
+
+    start_test("Test binding XDP from pinned...")
+    sim.set_xdp(obj, "offload")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp2", idx=1)
+
+    sim.set_xdp(pinned, "offload", force=True)
+    sim.unset_xdp("offload")
+    sim.set_xdp(pinned, "offload", force=True)
+    sim.unset_xdp("offload")
+
+    start_test("Test offload of wrong type fails...")
+    ret, _ = sim.cls_bpf_add_filter(pinned, da=True, skip_sw=True, fail=False)
+    fail(ret == 0, "Managed to attach XDP program to TC")
+
+    start_test("Test asking for TC offload of two filters...")
+    sim.cls_bpf_add_filter(obj, da=True, skip_sw=True)
+    sim.cls_bpf_add_filter(obj, da=True, skip_sw=True)
+    # The above will trigger a splat until TC cls_bpf drivers are fixed
+
+    sim.tc_flush_filters(bound=2, total=2)
+
+    start_test("Test if netdev removal waits for translation...")
+    delay_msec = 500
+    sim.dfs["bpf_bind_verifier_delay"] = delay_msec
+    start = time.time()
+    cmd_line = "tc filter add dev %s ingress bpf %s da skip_sw" % \
+               (sim['ifname'], obj)
+    tc_proc = cmd(cmd_line, background=True, fail=False)
+    # Wait for the verifier to start
+    while sim.dfs_num_bound_progs() <= 2:
+        pass
+    sim.remove()
+    end = time.time()
+    ret, _ = cmd_result(tc_proc, fail=False)
+    time_diff = end - start
+    log("Time", "start:\t%s\nend:\t%s\ndiff:\t%s" % (start, end, time_diff))
+
+    fail(ret == 0, "Managed to load TC filter on a unregistering device")
+    delay_sec = delay_msec * 0.001
+    fail(time_diff < delay_sec, "Removal process took %s, expected %s" %
+         (time_diff, delay_sec))
+
+    print("%s: OK" % (os.path.basename(__file__)))
+
+finally:
+    log("Clean up...", "", level=1)
+    log_level_inc()
+    clean_up()
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
                   ` (5 preceding siblings ...)
  2017-12-01  1:35 ` [PATCH net-next v2 6/8] selftests/bpf: add offload test based on netdevsim Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01 13:43   ` Phil Sutter
  2017-12-01  1:35 ` [PATCH net-next v2 8/8] net: dummy: remove fake " Jakub Kicinski
  7 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Jakub Kicinski, Phil Sutter, Sabrina Dubroca

dummy driver was extended with VF-related netdev APIs for testing
SR-IOV-related software.  netdevsim did not exist back then.
Implement SR-IOV functionality in netdevsim.  Notable difference
is that since netdevsim has no module parameters, we will actually
create a device with sriov_numvfs attribute for each netdev.
The zero MAC address is accepted as some HW use it to mean any
address is allowed.  Link state is also now validated.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
CC: Phil Sutter <phil@nwl.cc>
CC: Sabrina Dubroca  <sd@queasysnail.net>
---
 drivers/net/netdevsim/netdev.c    | 273 +++++++++++++++++++++++++++++++++++++-
 drivers/net/netdevsim/netdevsim.h |  12 ++
 2 files changed, 283 insertions(+), 2 deletions(-)

diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 828c1ce49a8b..a7a2a2290957 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -25,6 +25,124 @@
 
 #include "netdevsim.h"
 
+struct nsim_vf_config {
+	int link_state;
+	u16 min_tx_rate;
+	u16 max_tx_rate;
+	u16 vlan;
+	__be16 vlan_proto;
+	u16 qos;
+	u8 vf_mac[ETH_ALEN];
+	bool spoofchk_enabled;
+	bool trusted;
+	bool rss_query_enabled;
+};
+
+static u32 nsim_dev_id;
+
+static int nsim_num_vf(struct device *dev)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	return ns->num_vfs;
+}
+
+static struct bus_type nsim_bus = {
+	.name		= DRV_NAME,
+	.dev_name	= DRV_NAME,
+	.num_vf		= nsim_num_vf,
+};
+
+static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
+{
+	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
+				GFP_KERNEL);
+	if (!ns->vfconfigs)
+		return -ENOMEM;
+	ns->num_vfs = num_vfs;
+
+	return 0;
+}
+
+static void nsim_vfs_disable(struct netdevsim *ns)
+{
+	kfree(ns->vfconfigs);
+	ns->vfconfigs = NULL;
+	ns->num_vfs = 0;
+}
+
+static ssize_t
+nsim_numvfs_store(struct device *dev, struct device_attribute *attr,
+		  const char *buf, size_t count)
+{
+	struct netdevsim *ns = to_nsim(dev);
+	unsigned int num_vfs;
+	int ret;
+
+	ret = kstrtouint(buf, 0, &num_vfs);
+	if (ret)
+		return ret;
+
+	rtnl_lock();
+	if (ns->num_vfs == num_vfs)
+		goto exit_good;
+	if (ns->num_vfs && num_vfs) {
+		ret = -EBUSY;
+		goto exit_unlock;
+	}
+
+	if (num_vfs) {
+		ret = nsim_vfs_enable(ns, num_vfs);
+		if (ret)
+			goto exit_unlock;
+	} else {
+		nsim_vfs_disable(ns);
+	}
+exit_good:
+	ret = count;
+exit_unlock:
+	rtnl_unlock();
+
+	return ret;
+}
+
+static ssize_t
+nsim_numvfs_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	return sprintf(buf, "%u\n", ns->num_vfs);
+}
+
+static struct device_attribute nsim_numvfs_attr =
+	__ATTR(sriov_numvfs, 0664, nsim_numvfs_show, nsim_numvfs_store);
+
+static struct attribute *nsim_dev_attrs[] = {
+	&nsim_numvfs_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group nsim_dev_attr_group = {
+	.attrs = nsim_dev_attrs,
+};
+
+static const struct attribute_group *nsim_dev_attr_groups[] = {
+	&nsim_dev_attr_group,
+	NULL,
+};
+
+static void nsim_dev_release(struct device *dev)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	free_netdev(ns->netdev);
+}
+
+struct device_type nsim_dev_type = {
+	.groups = nsim_dev_attr_groups,
+	.release = nsim_dev_release,
+};
+
 static int nsim_init(struct net_device *dev)
 {
 	struct netdevsim *ns = netdev_priv(dev);
@@ -37,8 +155,19 @@ static int nsim_init(struct net_device *dev)
 	if (err)
 		goto err_debugfs_destroy;
 
+	ns->dev.id = nsim_dev_id++;
+	ns->dev.bus = &nsim_bus;
+	ns->dev.type = &nsim_dev_type;
+	err = device_register(&ns->dev);
+	if (err)
+		goto err_bpf_uninit;
+
+	SET_NETDEV_DEV(dev, &ns->dev);
+
 	return 0;
 
+err_bpf_uninit:
+	nsim_bpf_uninit(ns);
 err_debugfs_destroy:
 	debugfs_remove_recursive(ns->ddir);
 	return err;
@@ -50,6 +179,14 @@ static void nsim_uninit(struct net_device *dev)
 
 	debugfs_remove_recursive(ns->ddir);
 	nsim_bpf_uninit(ns);
+	nsim_vfs_disable(ns);
+}
+
+static void nsim_free(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	device_unregister(&ns->dev);
 }
 
 static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -122,6 +259,123 @@ nsim_setup_tc_block(struct net_device *dev, struct tc_block_offload *f)
 	}
 }
 
+static int nsim_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	/* Only refuse multicast addresses, zero address can mean unset/any. */
+	if (vf >= ns->num_vfs || is_multicast_ether_addr(mac))
+		return -EINVAL;
+	memcpy(ns->vfconfigs[vf].vf_mac, mac, ETH_ALEN);
+
+	return 0;
+}
+
+static int nsim_set_vf_vlan(struct net_device *dev, int vf,
+			    u16 vlan, u8 qos, __be16 vlan_proto)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs || vlan > 4095 || qos > 7)
+		return -EINVAL;
+
+	ns->vfconfigs[vf].vlan = vlan;
+	ns->vfconfigs[vf].qos = qos;
+	ns->vfconfigs[vf].vlan_proto = vlan_proto;
+
+	return 0;
+}
+
+static int nsim_set_vf_rate(struct net_device *dev, int vf, int min, int max)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	ns->vfconfigs[vf].min_tx_rate = min;
+	ns->vfconfigs[vf].max_tx_rate = max;
+
+	return 0;
+}
+
+static int nsim_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].spoofchk_enabled = val;
+
+	return 0;
+}
+
+static int nsim_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].rss_query_enabled = val;
+
+	return 0;
+}
+
+static int nsim_set_vf_trust(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].trusted = val;
+
+	return 0;
+}
+
+static int
+nsim_get_vf_config(struct net_device *dev, int vf, struct ifla_vf_info *ivi)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	ivi->vf = vf;
+	ivi->linkstate = ns->vfconfigs[vf].link_state;
+	ivi->min_tx_rate = ns->vfconfigs[vf].min_tx_rate;
+	ivi->max_tx_rate = ns->vfconfigs[vf].max_tx_rate;
+	ivi->vlan = ns->vfconfigs[vf].vlan;
+	ivi->vlan_proto = ns->vfconfigs[vf].vlan_proto;
+	ivi->qos = ns->vfconfigs[vf].qos;
+	memcpy(&ivi->mac, ns->vfconfigs[vf].vf_mac, ETH_ALEN);
+	ivi->spoofchk = ns->vfconfigs[vf].spoofchk_enabled;
+	ivi->trusted = ns->vfconfigs[vf].trusted;
+	ivi->rss_query_en = ns->vfconfigs[vf].rss_query_enabled;
+
+	return 0;
+}
+
+static int nsim_set_vf_link_state(struct net_device *dev, int vf, int state)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	switch (state) {
+	case IFLA_VF_LINK_STATE_AUTO:
+	case IFLA_VF_LINK_STATE_ENABLE:
+	case IFLA_VF_LINK_STATE_DISABLE:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ns->vfconfigs[vf].link_state = state;
+
+	return 0;
+}
+
 static int
 nsim_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data)
 {
@@ -153,6 +407,14 @@ static const struct net_device_ops nsim_netdev_ops = {
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_change_mtu		= nsim_change_mtu,
 	.ndo_get_stats64	= nsim_get_stats64,
+	.ndo_set_vf_mac		= nsim_set_vf_mac,
+	.ndo_set_vf_vlan	= nsim_set_vf_vlan,
+	.ndo_set_vf_rate	= nsim_set_vf_rate,
+	.ndo_set_vf_spoofchk	= nsim_set_vf_spoofchk,
+	.ndo_set_vf_trust	= nsim_set_vf_trust,
+	.ndo_get_vf_config	= nsim_get_vf_config,
+	.ndo_set_vf_link_state	= nsim_set_vf_link_state,
+	.ndo_set_vf_rss_query_en = nsim_set_vf_rss_query_en,
 	.ndo_setup_tc		= nsim_setup_tc,
 	.ndo_set_features	= nsim_set_features,
 	.ndo_bpf		= nsim_bpf,
@@ -164,7 +426,7 @@ static void nsim_setup(struct net_device *dev)
 	eth_hw_addr_random(dev);
 
 	dev->netdev_ops = &nsim_netdev_ops;
-	dev->needs_free_netdev = true;
+	dev->priv_destructor = nsim_free;
 
 	dev->tx_queue_len = 0;
 	dev->flags |= IFF_NOARP;
@@ -209,12 +471,18 @@ static int __init nsim_module_init(void)
 	if (IS_ERR(nsim_ddir))
 		return PTR_ERR(nsim_ddir);
 
-	err = rtnl_link_register(&nsim_link_ops);
+	err = bus_register(&nsim_bus);
 	if (err)
 		goto err_debugfs_destroy;
 
+	err = rtnl_link_register(&nsim_link_ops);
+	if (err)
+		goto err_unreg_bus;
+
 	return 0;
 
+err_unreg_bus:
+	bus_unregister(&nsim_bus);
 err_debugfs_destroy:
 	debugfs_remove_recursive(nsim_ddir);
 	return err;
@@ -223,6 +491,7 @@ static int __init nsim_module_init(void)
 static void __exit nsim_module_exit(void)
 {
 	rtnl_link_unregister(&nsim_link_ops);
+	bus_unregister(&nsim_bus);
 	debugfs_remove_recursive(nsim_ddir);
 }
 
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 8779e6a8f885..32270de9395a 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -13,6 +13,7 @@
  * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
  */
 
+#include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/netdevice.h>
@@ -26,6 +27,7 @@
 
 struct bpf_prog;
 struct dentry;
+struct nsim_vf_config;
 
 struct netdevsim {
 	struct net_device *netdev;
@@ -34,8 +36,13 @@ struct netdevsim {
 	u64 tx_bytes;
 	struct u64_stats_sync syncp;
 
+	struct device dev;
+
 	struct dentry *ddir;
 
+	unsigned int num_vfs;
+	struct nsim_vf_config *vfconfigs;
+
 	struct bpf_prog	*bpf_offloaded;
 	u32 bpf_offloaded_id;
 
@@ -64,3 +71,8 @@ int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf);
 int nsim_bpf_disable_tc(struct netdevsim *ns);
 int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
 			       void *type_data, void *cb_priv);
+
+static inline struct netdevsim *to_nsim(struct device *ptr)
+{
+	return container_of(ptr, struct netdevsim, dev);
+}
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v2 8/8] net: dummy: remove fake SR-IOV functionality
  2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
                   ` (6 preceding siblings ...)
  2017-12-01  1:35 ` [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality Jakub Kicinski
@ 2017-12-01  1:35 ` Jakub Kicinski
  2017-12-01 13:46   ` Phil Sutter
  7 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01  1:35 UTC (permalink / raw)
  To: netdev
  Cc: davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Jakub Kicinski, Phil Sutter, Sabrina Dubroca

netdevsim driver seems like a better place for fake SR-IOV
functionality.  Remove the code previously added to dummy.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
CC: Phil Sutter <phil@nwl.cc>
CC: Sabrina Dubroca  <sd@queasysnail.net>
---
 drivers/net/dummy.c | 215 +---------------------------------------------------
 1 file changed, 1 insertion(+), 214 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 58483af80bdb..30b1c8512049 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -42,48 +42,7 @@
 #define DRV_NAME	"dummy"
 #define DRV_VERSION	"1.0"
 
-#undef pr_fmt
-#define pr_fmt(fmt) DRV_NAME ": " fmt
-
 static int numdummies = 1;
-static int num_vfs;
-
-struct vf_data_storage {
-	u8	vf_mac[ETH_ALEN];
-	u16	pf_vlan; /* When set, guest VLAN config not allowed. */
-	u16	pf_qos;
-	__be16	vlan_proto;
-	u16	min_tx_rate;
-	u16	max_tx_rate;
-	u8	spoofchk_enabled;
-	bool	rss_query_enabled;
-	u8	trusted;
-	int	link_state;
-};
-
-struct dummy_priv {
-	struct vf_data_storage	*vfinfo;
-};
-
-static int dummy_num_vf(struct device *dev)
-{
-	return num_vfs;
-}
-
-static struct bus_type dummy_bus = {
-	.name	= "dummy",
-	.num_vf	= dummy_num_vf,
-};
-
-static void release_dummy_parent(struct device *dev)
-{
-}
-
-static struct device dummy_parent = {
-	.init_name	= "dummy",
-	.bus		= &dummy_bus,
-	.release	= release_dummy_parent,
-};
 
 /* fake multicast ability */
 static void set_multicast_list(struct net_device *dev)
@@ -133,25 +92,10 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static int dummy_dev_init(struct net_device *dev)
 {
-	struct dummy_priv *priv = netdev_priv(dev);
-
 	dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
 	if (!dev->dstats)
 		return -ENOMEM;
 
-	priv->vfinfo = NULL;
-
-	if (!num_vfs)
-		return 0;
-
-	dev->dev.parent = &dummy_parent;
-	priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage),
-			       GFP_KERNEL);
-	if (!priv->vfinfo) {
-		free_percpu(dev->dstats);
-		return -ENOMEM;
-	}
-
 	return 0;
 }
 
@@ -169,117 +113,6 @@ static int dummy_change_carrier(struct net_device *dev, bool new_carrier)
 	return 0;
 }
 
-static int dummy_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (!is_valid_ether_addr(mac) || (vf >= num_vfs))
-		return -EINVAL;
-
-	memcpy(priv->vfinfo[vf].vf_mac, mac, ETH_ALEN);
-
-	return 0;
-}
-
-static int dummy_set_vf_vlan(struct net_device *dev, int vf,
-			     u16 vlan, u8 qos, __be16 vlan_proto)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if ((vf >= num_vfs) || (vlan > 4095) || (qos > 7))
-		return -EINVAL;
-
-	priv->vfinfo[vf].pf_vlan = vlan;
-	priv->vfinfo[vf].pf_qos = qos;
-	priv->vfinfo[vf].vlan_proto = vlan_proto;
-
-	return 0;
-}
-
-static int dummy_set_vf_rate(struct net_device *dev, int vf, int min, int max)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].min_tx_rate = min;
-	priv->vfinfo[vf].max_tx_rate = max;
-
-	return 0;
-}
-
-static int dummy_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].spoofchk_enabled = val;
-
-	return 0;
-}
-
-static int dummy_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].rss_query_enabled = val;
-
-	return 0;
-}
-
-static int dummy_set_vf_trust(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].trusted = val;
-
-	return 0;
-}
-
-static int dummy_get_vf_config(struct net_device *dev,
-			       int vf, struct ifla_vf_info *ivi)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	ivi->vf = vf;
-	memcpy(&ivi->mac, priv->vfinfo[vf].vf_mac, ETH_ALEN);
-	ivi->vlan = priv->vfinfo[vf].pf_vlan;
-	ivi->qos = priv->vfinfo[vf].pf_qos;
-	ivi->spoofchk = priv->vfinfo[vf].spoofchk_enabled;
-	ivi->linkstate = priv->vfinfo[vf].link_state;
-	ivi->min_tx_rate = priv->vfinfo[vf].min_tx_rate;
-	ivi->max_tx_rate = priv->vfinfo[vf].max_tx_rate;
-	ivi->rss_query_en = priv->vfinfo[vf].rss_query_enabled;
-	ivi->trusted = priv->vfinfo[vf].trusted;
-	ivi->vlan_proto = priv->vfinfo[vf].vlan_proto;
-
-	return 0;
-}
-
-static int dummy_set_vf_link_state(struct net_device *dev, int vf, int state)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].link_state = state;
-
-	return 0;
-}
-
 static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_init		= dummy_dev_init,
 	.ndo_uninit		= dummy_dev_uninit,
@@ -289,14 +122,6 @@ static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_get_stats64	= dummy_get_stats64,
 	.ndo_change_carrier	= dummy_change_carrier,
-	.ndo_set_vf_mac		= dummy_set_vf_mac,
-	.ndo_set_vf_vlan	= dummy_set_vf_vlan,
-	.ndo_set_vf_rate	= dummy_set_vf_rate,
-	.ndo_set_vf_spoofchk	= dummy_set_vf_spoofchk,
-	.ndo_set_vf_trust	= dummy_set_vf_trust,
-	.ndo_get_vf_config	= dummy_get_vf_config,
-	.ndo_set_vf_link_state	= dummy_set_vf_link_state,
-	.ndo_set_vf_rss_query_en = dummy_set_vf_rss_query_en,
 };
 
 static void dummy_get_drvinfo(struct net_device *dev,
@@ -323,13 +148,6 @@ static const struct ethtool_ops dummy_ethtool_ops = {
 	.get_ts_info		= dummy_get_ts_info,
 };
 
-static void dummy_free_netdev(struct net_device *dev)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	kfree(priv->vfinfo);
-}
-
 static void dummy_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -338,7 +156,6 @@ static void dummy_setup(struct net_device *dev)
 	dev->netdev_ops = &dummy_netdev_ops;
 	dev->ethtool_ops = &dummy_ethtool_ops;
 	dev->needs_free_netdev = true;
-	dev->priv_destructor = dummy_free_netdev;
 
 	/* Fill in device structure with ethernet-generic values. */
 	dev->flags |= IFF_NOARP;
@@ -370,7 +187,6 @@ static int dummy_validate(struct nlattr *tb[], struct nlattr *data[],
 
 static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 	.kind		= DRV_NAME,
-	.priv_size	= sizeof(struct dummy_priv),
 	.setup		= dummy_setup,
 	.validate	= dummy_validate,
 };
@@ -379,16 +195,12 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
 
-module_param(num_vfs, int, 0);
-MODULE_PARM_DESC(num_vfs, "Number of dummy VFs per dummy device");
-
 static int __init dummy_init_one(void)
 {
 	struct net_device *dev_dummy;
 	int err;
 
-	dev_dummy = alloc_netdev(sizeof(struct dummy_priv),
-				 "dummy%d", NET_NAME_ENUM, dummy_setup);
+	dev_dummy = alloc_netdev(0, "dummy%d", NET_NAME_ENUM, dummy_setup);
 	if (!dev_dummy)
 		return -ENOMEM;
 
@@ -407,21 +219,6 @@ static int __init dummy_init_module(void)
 {
 	int i, err = 0;
 
-	if (num_vfs) {
-		err = bus_register(&dummy_bus);
-		if (err < 0) {
-			pr_err("registering dummy bus failed\n");
-			return err;
-		}
-
-		err = device_register(&dummy_parent);
-		if (err < 0) {
-			pr_err("registering dummy parent device failed\n");
-			bus_unregister(&dummy_bus);
-			return err;
-		}
-	}
-
 	rtnl_lock();
 	err = __rtnl_link_register(&dummy_link_ops);
 	if (err < 0)
@@ -437,22 +234,12 @@ static int __init dummy_init_module(void)
 out:
 	rtnl_unlock();
 
-	if (err && num_vfs) {
-		device_unregister(&dummy_parent);
-		bus_unregister(&dummy_bus);
-	}
-
 	return err;
 }
 
 static void __exit dummy_cleanup_module(void)
 {
 	rtnl_link_unregister(&dummy_link_ops);
-
-	if (num_vfs) {
-		device_unregister(&dummy_parent);
-		bus_unregister(&dummy_bus);
-	}
 }
 
 module_init(dummy_init_module);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01  1:35 ` [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality Jakub Kicinski
@ 2017-12-01 13:43   ` Phil Sutter
  2017-12-01 20:14     ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Phil Sutter @ 2017-12-01 13:43 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Thu, Nov 30, 2017 at 05:35:39PM -0800, Jakub Kicinski wrote:
[...]
> +static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
> +{
> +	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
> +				GFP_KERNEL);
> +	if (!ns->vfconfigs)
> +		return -ENOMEM;
> +	ns->num_vfs = num_vfs;
> +
> +	return 0;
> +}
> +
> +static void nsim_vfs_disable(struct netdevsim *ns)
> +{
> +	kfree(ns->vfconfigs);
> +	ns->vfconfigs = NULL;
> +	ns->num_vfs = 0;
> +}

Why not something like:

| static int nsim_vfs_set(struct netdevsim *ns, unsigned int num_vfs)
| {
| 	void *ptr = krealloc(ns->vfconfigs,
| 			     num_vfs * sizeof(struct nsim_vf_config),
| 			     GFP_KERNEL);
| 
| 	if (!ptr)
| 		return -ENOMEM;
| 
| 	ns->vfconfigs = ptr;
| 	ns->num_vfs = num_vfs;
| 	return 0;
| }

> +static ssize_t
> +nsim_numvfs_store(struct device *dev, struct device_attribute *attr,
> +		  const char *buf, size_t count)
> +{
> +	struct netdevsim *ns = to_nsim(dev);
> +	unsigned int num_vfs;
> +	int ret;
> +
> +	ret = kstrtouint(buf, 0, &num_vfs);
> +	if (ret)
> +		return ret;
> +
> +	rtnl_lock();
> +	if (ns->num_vfs == num_vfs)
> +		goto exit_good;

Then replace this:

> +	if (ns->num_vfs && num_vfs) {
> +		ret = -EBUSY;
> +		goto exit_unlock;
> +	}
> +
> +	if (num_vfs) {
> +		ret = nsim_vfs_enable(ns, num_vfs);
> +		if (ret)
> +			goto exit_unlock;
> +	} else {
> +		nsim_vfs_disable(ns);
> +	}

with just:

|	nsim_vfs_set(ns, num_vfs);

> +exit_good:
> +	ret = count;
> +exit_unlock:
> +	rtnl_unlock();
> +
> +	return ret;
> +}

[...]

> +static void nsim_free(struct net_device *dev)
> +{
> +	struct netdevsim *ns = netdev_priv(dev);
> +
> +	device_unregister(&ns->dev);
>  }

Shouldn't this also kfree(ns->vfconfigs)?

Cheers, Phil

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 8/8] net: dummy: remove fake SR-IOV functionality
  2017-12-01  1:35 ` [PATCH net-next v2 8/8] net: dummy: remove fake " Jakub Kicinski
@ 2017-12-01 13:46   ` Phil Sutter
  2017-12-01 20:19     ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Phil Sutter @ 2017-12-01 13:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Thu, Nov 30, 2017 at 05:35:40PM -0800, Jakub Kicinski wrote:
> netdevsim driver seems like a better place for fake SR-IOV
> functionality.  Remove the code previously added to dummy.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>

Acked-by: Phil Sutter <phil@nwl.cc>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01 13:43   ` Phil Sutter
@ 2017-12-01 20:14     ` Jakub Kicinski
  2017-12-01 21:36       ` Phil Sutter
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01 20:14 UTC (permalink / raw)
  To: Phil Sutter
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, 1 Dec 2017 14:43:06 +0100, Phil Sutter wrote:
> On Thu, Nov 30, 2017 at 05:35:39PM -0800, Jakub Kicinski wrote:
> [...]
> > +static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
> > +{
> > +	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
> > +				GFP_KERNEL);
> > +	if (!ns->vfconfigs)
> > +		return -ENOMEM;
> > +	ns->num_vfs = num_vfs;
> > +
> > +	return 0;
> > +}
> > +
> > +static void nsim_vfs_disable(struct netdevsim *ns)
> > +{
> > +	kfree(ns->vfconfigs);
> > +	ns->vfconfigs = NULL;
> > +	ns->num_vfs = 0;
> > +}  
> 
> Why not something like:
> 
> | static int nsim_vfs_set(struct netdevsim *ns, unsigned int num_vfs)
> | {
> | 	void *ptr = krealloc(ns->vfconfigs,
> | 			     num_vfs * sizeof(struct nsim_vf_config),
> | 			     GFP_KERNEL);
> | 
> | 	if (!ptr)
> | 		return -ENOMEM;
> | 
> | 	ns->vfconfigs = ptr;
> | 	ns->num_vfs = num_vfs;
> | 	return 0;
> | }

Um.  It either frees or allocates, never reallocates so I felt realloc
is misleading.  ZERO_SIZE_PTR is less clearly a NULL than a NULL.  I
will have to specify __GFP_ZERO.  It's not a calloc so there could be
potentially some overflows?

> > +static ssize_t
> > +nsim_numvfs_store(struct device *dev, struct device_attribute *attr,
> > +		  const char *buf, size_t count)
> > +{
> > +	struct netdevsim *ns = to_nsim(dev);
> > +	unsigned int num_vfs;
> > +	int ret;
> > +
> > +	ret = kstrtouint(buf, 0, &num_vfs);
> > +	if (ret)
> > +		return ret;
> > +
> > +	rtnl_lock();
> > +	if (ns->num_vfs == num_vfs)
> > +		goto exit_good;  
> 
> Then replace this:
> 
> > +	if (ns->num_vfs && num_vfs) {
> > +		ret = -EBUSY;
> > +		goto exit_unlock;
> > +	}
> > +
> > +	if (num_vfs) {
> > +		ret = nsim_vfs_enable(ns, num_vfs);
> > +		if (ret)
> > +			goto exit_unlock;
> > +	} else {
> > +		nsim_vfs_disable(ns);
> > +	}  
> 
> with just:
> 
> |	nsim_vfs_set(ns, num_vfs);

I'm trying to mirror the PCI subsystem behaviour here, which only
allows enable or disable, not increase.  I felt we should follow how
real devices behave:

	/* enable VFs */
	if (pdev->sriov->num_VFs) {
		dev_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
			 pdev->sriov->num_VFs, num_vfs);
		return -EBUSY;
	}

So IOW this is intentional.

> > +	ret = count;
> > +exit_unlock:
> > +	rtnl_unlock();
> > +
> > +	return ret;
> > +}  
> 
> [...]
> 
> > +static void nsim_free(struct net_device *dev)
> > +{
> > +	struct netdevsim *ns = netdev_priv(dev);
> > +
> > +	device_unregister(&ns->dev);
> >  }  
> 
> Shouldn't this also kfree(ns->vfconfigs)?

It's in uninit, I will move it to release.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 8/8] net: dummy: remove fake SR-IOV functionality
  2017-12-01 13:46   ` Phil Sutter
@ 2017-12-01 20:19     ` Jakub Kicinski
  2017-12-01 21:41       ` Phil Sutter
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01 20:19 UTC (permalink / raw)
  To: Phil Sutter
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, 1 Dec 2017 14:46:34 +0100, Phil Sutter wrote:
> On Thu, Nov 30, 2017 at 05:35:40PM -0800, Jakub Kicinski wrote:
> > netdevsim driver seems like a better place for fake SR-IOV
> > functionality.  Remove the code previously added to dummy.
> > 
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>  
> 
> Acked-by: Phil Sutter <phil@nwl.cc>

Thanks!

Did you have an opportunity to run your tests against this?  I didn't
find anything that uses dummy's SR-IOV in selftests.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01 20:14     ` Jakub Kicinski
@ 2017-12-01 21:36       ` Phil Sutter
  2017-12-01 21:45         ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Phil Sutter @ 2017-12-01 21:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, Dec 01, 2017 at 12:14:07PM -0800, Jakub Kicinski wrote:
> On Fri, 1 Dec 2017 14:43:06 +0100, Phil Sutter wrote:
> > On Thu, Nov 30, 2017 at 05:35:39PM -0800, Jakub Kicinski wrote:
> > [...]
> > > +static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
> > > +{
> > > +	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
> > > +				GFP_KERNEL);
> > > +	if (!ns->vfconfigs)
> > > +		return -ENOMEM;
> > > +	ns->num_vfs = num_vfs;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void nsim_vfs_disable(struct netdevsim *ns)
> > > +{
> > > +	kfree(ns->vfconfigs);
> > > +	ns->vfconfigs = NULL;
> > > +	ns->num_vfs = 0;
> > > +}  
> > 
> > Why not something like:
> > 
> > | static int nsim_vfs_set(struct netdevsim *ns, unsigned int num_vfs)
> > | {
> > | 	void *ptr = krealloc(ns->vfconfigs,
> > | 			     num_vfs * sizeof(struct nsim_vf_config),
> > | 			     GFP_KERNEL);
> > | 
> > | 	if (!ptr)
> > | 		return -ENOMEM;
> > | 
> > | 	ns->vfconfigs = ptr;
> > | 	ns->num_vfs = num_vfs;
> > | 	return 0;
> > | }
> 
> Um.  It either frees or allocates, never reallocates so I felt realloc
> is misleading.  ZERO_SIZE_PTR is less clearly a NULL than a NULL.  I
> will have to specify __GFP_ZERO.  It's not a calloc so there could be
> potentially some overflows?

I don't understand: How can overflows happen if I use malloc() instead
of calloc()?

> > > +static ssize_t
> > > +nsim_numvfs_store(struct device *dev, struct device_attribute *attr,
> > > +		  const char *buf, size_t count)
> > > +{
> > > +	struct netdevsim *ns = to_nsim(dev);
> > > +	unsigned int num_vfs;
> > > +	int ret;
> > > +
> > > +	ret = kstrtouint(buf, 0, &num_vfs);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	rtnl_lock();
> > > +	if (ns->num_vfs == num_vfs)
> > > +		goto exit_good;  
> > 
> > Then replace this:
> > 
> > > +	if (ns->num_vfs && num_vfs) {
> > > +		ret = -EBUSY;
> > > +		goto exit_unlock;
> > > +	}
> > > +
> > > +	if (num_vfs) {
> > > +		ret = nsim_vfs_enable(ns, num_vfs);
> > > +		if (ret)
> > > +			goto exit_unlock;
> > > +	} else {
> > > +		nsim_vfs_disable(ns);
> > > +	}  
> > 
> > with just:
> > 
> > |	nsim_vfs_set(ns, num_vfs);
> 
> I'm trying to mirror the PCI subsystem behaviour here, which only
> allows enable or disable, not increase.  I felt we should follow how
> real devices behave:
> 
> 	/* enable VFs */
> 	if (pdev->sriov->num_VFs) {
> 		dev_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
> 			 pdev->sriov->num_VFs, num_vfs);
> 		return -EBUSY;
> 	}
> 
> So IOW this is intentional.

Ah, I see. Yes, then it makes sense! Keeping this virtual VF
functionality as close to real ones as possible is certainly feasible.

> > > +	ret = count;
> > > +exit_unlock:
> > > +	rtnl_unlock();
> > > +
> > > +	return ret;
> > > +}  
> > 
> > [...]
> > 
> > > +static void nsim_free(struct net_device *dev)
> > > +{
> > > +	struct netdevsim *ns = netdev_priv(dev);
> > > +
> > > +	device_unregister(&ns->dev);
> > >  }  
> > 
> > Shouldn't this also kfree(ns->vfconfigs)?
> 
> It's in uninit, I will move it to release.

Oh, I missed that. If you're certain this won't lead to memleaks, no
objection from my side. :)

Cheers, Phil

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 8/8] net: dummy: remove fake SR-IOV functionality
  2017-12-01 20:19     ` Jakub Kicinski
@ 2017-12-01 21:41       ` Phil Sutter
  0 siblings, 0 replies; 18+ messages in thread
From: Phil Sutter @ 2017-12-01 21:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, Dec 01, 2017 at 12:19:52PM -0800, Jakub Kicinski wrote:
> On Fri, 1 Dec 2017 14:46:34 +0100, Phil Sutter wrote:
> > On Thu, Nov 30, 2017 at 05:35:40PM -0800, Jakub Kicinski wrote:
> > > netdevsim driver seems like a better place for fake SR-IOV
> > > functionality.  Remove the code previously added to dummy.
> > > 
> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > > Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>  
> > 
> > Acked-by: Phil Sutter <phil@nwl.cc>
> 
> Thanks!
> 
> Did you have an opportunity to run your tests against this?  I didn't
> find anything that uses dummy's SR-IOV in selftests.

In fact, at Red Hat nobody uses dummy for iproute SR-IOV testing yet
(which was the motivation for it in the first place). Hence why I didn't
see a problem with moving it from dummy over to something else.

Hopefully upstream iproute will at some point contain a testsuite which
makes use of this, but sadly that's still wishful thinking. :(

Cheers, Phil

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01 21:36       ` Phil Sutter
@ 2017-12-01 21:45         ` Jakub Kicinski
  2017-12-01 21:58           ` Phil Sutter
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01 21:45 UTC (permalink / raw)
  To: Phil Sutter
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, 1 Dec 2017 22:36:52 +0100, Phil Sutter wrote:
> On Fri, Dec 01, 2017 at 12:14:07PM -0800, Jakub Kicinski wrote:
> > On Fri, 1 Dec 2017 14:43:06 +0100, Phil Sutter wrote:  
> > > On Thu, Nov 30, 2017 at 05:35:39PM -0800, Jakub Kicinski wrote:
> > > [...]  
> > > > +static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
> > > > +{
> > > > +	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
> > > > +				GFP_KERNEL);
> > > > +	if (!ns->vfconfigs)
> > > > +		return -ENOMEM;
> > > > +	ns->num_vfs = num_vfs;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static void nsim_vfs_disable(struct netdevsim *ns)
> > > > +{
> > > > +	kfree(ns->vfconfigs);
> > > > +	ns->vfconfigs = NULL;
> > > > +	ns->num_vfs = 0;
> > > > +}    
> > > 
> > > Why not something like:
> > > 
> > > | static int nsim_vfs_set(struct netdevsim *ns, unsigned int num_vfs)
> > > | {
> > > | 	void *ptr = krealloc(ns->vfconfigs,
> > > | 			     num_vfs * sizeof(struct nsim_vf_config),
> > > | 			     GFP_KERNEL);
> > > | 
> > > | 	if (!ptr)
> > > | 		return -ENOMEM;
> > > | 
> > > | 	ns->vfconfigs = ptr;
> > > | 	ns->num_vfs = num_vfs;
> > > | 	return 0;
> > > | }  
> > 
> > Um.  It either frees or allocates, never reallocates so I felt realloc
> > is misleading.  ZERO_SIZE_PTR is less clearly a NULL than a NULL.  I
> > will have to specify __GFP_ZERO.  It's not a calloc so there could be
> > potentially some overflows?  
> 
> I don't understand: How can overflows happen if I use malloc() instead
> of calloc()?

The multiplication may overflow.  That's why we have kmalloc_array().
Note this explicit check in kmalloc_array() (which is also called by
kcalloc):

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

Where:

#define SIZE_MAX	(~(size_t)0)

> > > > +	ret = count;
> > > > +exit_unlock:
> > > > +	rtnl_unlock();
> > > > +
> > > > +	return ret;
> > > > +}    
> > > 
> > > [...]
> > >   
> > > > +static void nsim_free(struct net_device *dev)
> > > > +{
> > > > +	struct netdevsim *ns = netdev_priv(dev);
> > > > +
> > > > +	device_unregister(&ns->dev);
> > > >  }    
> > > 
> > > Shouldn't this also kfree(ns->vfconfigs)?  
> > 
> > It's in uninit, I will move it to release.  
> 
> Oh, I missed that. If you're certain this won't lead to memleaks, no
> objection from my side. :)

OK, I will respin v3 with the free moved :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01 21:45         ` Jakub Kicinski
@ 2017-12-01 21:58           ` Phil Sutter
  2017-12-01 22:14             ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Phil Sutter @ 2017-12-01 21:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, Dec 01, 2017 at 01:45:09PM -0800, Jakub Kicinski wrote:
> On Fri, 1 Dec 2017 22:36:52 +0100, Phil Sutter wrote:
> > On Fri, Dec 01, 2017 at 12:14:07PM -0800, Jakub Kicinski wrote:
> > > On Fri, 1 Dec 2017 14:43:06 +0100, Phil Sutter wrote:  
> > > > On Thu, Nov 30, 2017 at 05:35:39PM -0800, Jakub Kicinski wrote:
> > > > [...]  
> > > > > +static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
> > > > > +{
> > > > > +	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
> > > > > +				GFP_KERNEL);
> > > > > +	if (!ns->vfconfigs)
> > > > > +		return -ENOMEM;
> > > > > +	ns->num_vfs = num_vfs;
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static void nsim_vfs_disable(struct netdevsim *ns)
> > > > > +{
> > > > > +	kfree(ns->vfconfigs);
> > > > > +	ns->vfconfigs = NULL;
> > > > > +	ns->num_vfs = 0;
> > > > > +}    
> > > > 
> > > > Why not something like:
> > > > 
> > > > | static int nsim_vfs_set(struct netdevsim *ns, unsigned int num_vfs)
> > > > | {
> > > > | 	void *ptr = krealloc(ns->vfconfigs,
> > > > | 			     num_vfs * sizeof(struct nsim_vf_config),
> > > > | 			     GFP_KERNEL);
> > > > | 
> > > > | 	if (!ptr)
> > > > | 		return -ENOMEM;
> > > > | 
> > > > | 	ns->vfconfigs = ptr;
> > > > | 	ns->num_vfs = num_vfs;
> > > > | 	return 0;
> > > > | }  
> > > 
> > > Um.  It either frees or allocates, never reallocates so I felt realloc
> > > is misleading.  ZERO_SIZE_PTR is less clearly a NULL than a NULL.  I
> > > will have to specify __GFP_ZERO.  It's not a calloc so there could be
> > > potentially some overflows?  
> > 
> > I don't understand: How can overflows happen if I use malloc() instead
> > of calloc()?
> 
> The multiplication may overflow.  That's why we have kmalloc_array().
> Note this explicit check in kmalloc_array() (which is also called by
> kcalloc):
> 
> 	if (size != 0 && n > SIZE_MAX / size)
> 		return NULL;
> 
> Where:
> 
> #define SIZE_MAX	(~(size_t)0)

Ah, I see. Thanks for educating me on this!

> > > > > +	ret = count;
> > > > > +exit_unlock:
> > > > > +	rtnl_unlock();
> > > > > +
> > > > > +	return ret;
> > > > > +}    
> > > > 
> > > > [...]
> > > >   
> > > > > +static void nsim_free(struct net_device *dev)
> > > > > +{
> > > > > +	struct netdevsim *ns = netdev_priv(dev);
> > > > > +
> > > > > +	device_unregister(&ns->dev);
> > > > >  }    
> > > > 
> > > > Shouldn't this also kfree(ns->vfconfigs)?  
> > > 
> > > It's in uninit, I will move it to release.  
> > 
> > Oh, I missed that. If you're certain this won't lead to memleaks, no
> > objection from my side. :)
> 
> OK, I will respin v3 with the free moved :)

So it did leak? I'm glad the traffic I caused wasn't completely
pointless then. :)

Thanks, Phil

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality
  2017-12-01 21:58           ` Phil Sutter
@ 2017-12-01 22:14             ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2017-12-01 22:14 UTC (permalink / raw)
  To: Phil Sutter
  Cc: netdev, davem, daniel, alexei.starovoitov, jiri, oss-drivers,
	Sabrina Dubroca

On Fri, 1 Dec 2017 22:58:29 +0100, Phil Sutter wrote:
> > > > > > +	ret = count;
> > > > > > +exit_unlock:
> > > > > > +	rtnl_unlock();
> > > > > > +
> > > > > > +	return ret;
> > > > > > +}      
> > > > > 
> > > > > [...]
> > > > >     
> > > > > > +static void nsim_free(struct net_device *dev)
> > > > > > +{
> > > > > > +	struct netdevsim *ns = netdev_priv(dev);
> > > > > > +
> > > > > > +	device_unregister(&ns->dev);
> > > > > >  }      
> > > > > 
> > > > > Shouldn't this also kfree(ns->vfconfigs)?    
> > > > 
> > > > It's in uninit, I will move it to release.    
> > > 
> > > Oh, I missed that. If you're certain this won't lead to memleaks, no
> > > objection from my side. :)  
> > 
> > OK, I will respin v3 with the free moved :)  
> 
> So it did leak? I'm glad the traffic I caused wasn't completely
> pointless then. :)

There is a window where it could've been re-enabled and that
would leak, yes.  Thanks for catching it :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-12-01 22:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-01  1:35 [PATCH net-next v2 0/8] xdp: make stack perform remove and tests Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 1/8] net: xdp: avoid output parameters when querying XDP prog Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 2/8] net: xdp: report flags program was installed with on query Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 3/8] net: xdp: make the stack take care of the tear down Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 4/8] netdevsim: add software driver for testing offloads Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 5/8] netdevsim: add bpf offload support Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 6/8] selftests/bpf: add offload test based on netdevsim Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 7/8] netdevsim: add SR-IOV functionality Jakub Kicinski
2017-12-01 13:43   ` Phil Sutter
2017-12-01 20:14     ` Jakub Kicinski
2017-12-01 21:36       ` Phil Sutter
2017-12-01 21:45         ` Jakub Kicinski
2017-12-01 21:58           ` Phil Sutter
2017-12-01 22:14             ` Jakub Kicinski
2017-12-01  1:35 ` [PATCH net-next v2 8/8] net: dummy: remove fake " Jakub Kicinski
2017-12-01 13:46   ` Phil Sutter
2017-12-01 20:19     ` Jakub Kicinski
2017-12-01 21:41       ` Phil Sutter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.