netdev.vger.kernel.org archive mirror
* [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports
@ 2023-12-28  1:46 David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims David Wei
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

This patchset adds the ability to link two netdevsim ports together and
forward skbs between them, similar to veth. The goal is to use netdevsim
for testing features such as zero-copy Rx using io_uring.

This feature was tested locally on QEMU, and a selftest is included.

---
v4->v5:
- reduce nsim_dev_list_lock critical section
- fixed missing mutex unlock during unwind ladder
- rework nsim_dev_peer_write synchronization to take devlink lock as
  well as rtnl_lock
- return err msgs to user during linking if port doesn't exist or
  linking to self
- update tx stats outside of RCU lock

v3->v4:
- maintain a mutex protected list of probed nsim_devs instead of using
  nsim_bus_dev
- fixed synchronization issues by taking rtnl_lock
- track tx_dropped skbs

v2->v3:
- take lock when traversing nsim_bus_dev_list
- take device ref when getting a nsim_bus_dev
- return 0 if nsim_dev_peer_read cannot find the port
- address code formatting
- do not hard code values in selftests
- add Makefile for selftests

v1->v2:
- renamed debugfs file from "link" to "peer"
- replaced strsep() with sscanf() for consistency
- increased char[] buf sz to 22 for copying id + port from user
- added err msg w/ expected fmt when linking as a hint to user
- prevent linking port to itself
- protect peer ptr using RCU
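
One way to read the buffer size of 22 in the v1->v2 notes, assuming both
the id and the port are 32-bit unsigned values: the longest "id port"
string is 21 characters, and one more byte holds the terminating NUL.
The variable names below are illustrative only and do not come from the
patch.

```shell
# Illustrative arithmetic only; max_u32 and longest are not names from the patch.
max_u32=4294967295            # largest 32-bit unsigned value, 10 digits
longest="$max_u32 $max_u32"   # worst-case "id port" string
echo "${#longest}"            # prints 21; a 22-byte buffer leaves room for the NUL
```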

David Wei (5):
  netdevsim: maintain a list of probed netdevsims
  netdevsim: allow two netdevsim ports to be connected
  netdevsim: forward skbs from one connected port to another
  netdevsim: add selftest for forwarding skb between connected ports
  netdevsim: add Makefile for selftests

 MAINTAINERS                                   |   1 +
 drivers/net/netdevsim/dev.c                   | 153 ++++++++++++++++--
 drivers/net/netdevsim/netdev.c                |  27 +++-
 drivers/net/netdevsim/netdevsim.h             |   3 +
 .../selftests/drivers/net/netdevsim/Makefile  |  18 +++
 .../selftests/drivers/net/netdevsim/peer.sh   | 124 ++++++++++++++
 6 files changed, 310 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/netdevsim/Makefile
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/peer.sh

-- 
2.39.3



* [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims
  2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
@ 2023-12-28  1:46 ` David Wei
  2024-01-02 11:04   ` Jiri Pirko
  2023-12-28  1:46 ` [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected David Wei
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

Add a linked list, nsim_dev_list, of probed nsim_devs. Entries are added
in nsim_drv_probe() and removed in nsim_drv_remove(). A mutex,
nsim_dev_list_lock, protects the list.

Signed-off-by: David Wei <dw@davidwei.uk>
---
 drivers/net/netdevsim/dev.c       | 19 +++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index b4d3b9cde8bd..8d477aa99f94 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -35,6 +35,9 @@
 
 #include "netdevsim.h"
 
+static LIST_HEAD(nsim_dev_list);
+static DEFINE_MUTEX(nsim_dev_list_lock);
+
 static unsigned int
 nsim_dev_port_index(enum nsim_dev_port_type type, unsigned int port_index)
 {
@@ -1607,6 +1610,11 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 
 	nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY;
 	devl_unlock(devlink);
+
+	mutex_lock(&nsim_dev_list_lock);
+	list_add(&nsim_dev->list, &nsim_dev_list);
+	mutex_unlock(&nsim_dev_list_lock);
+
 	return 0;
 
 err_hwstats_exit:
@@ -1668,8 +1676,19 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
 {
 	struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
 	struct devlink *devlink = priv_to_devlink(nsim_dev);
+	struct nsim_dev *pos, *tmp;
+
+	mutex_lock(&nsim_dev_list_lock);
+	list_for_each_entry_safe(pos, tmp, &nsim_dev_list, list) {
+		if (pos == nsim_dev) {
+			list_del(&nsim_dev->list);
+			break;
+		}
+	}
+	mutex_unlock(&nsim_dev_list_lock);
 
 	devl_lock(devlink);
+
 	nsim_dev_reload_destroy(nsim_dev);
 
 	nsim_bpf_dev_exit(nsim_dev);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 028c825b86db..babb61d7790b 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -277,6 +277,7 @@ struct nsim_vf_config {
 
 struct nsim_dev {
 	struct nsim_bus_dev *nsim_bus_dev;
+	struct list_head list;
 	struct nsim_fib_data *fib_data;
 	struct nsim_trap_data *trap_data;
 	struct dentry *ddir;
-- 
2.39.3



* [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims David Wei
@ 2023-12-28  1:46 ` David Wei
  2024-01-02 11:11   ` Jiri Pirko
  2024-01-04  1:39   ` Jakub Kicinski
  2023-12-28  1:46 ` [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another David Wei
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

Add a debugfs file in
/sys/kernel/debug/netdevsim/netdevsimN/ports/A/peer

Writing "M B" to this file will link port A of netdevsim N with port B
of netdevsim M. Reading this file will return the linked netdevsim id
and port, if any.
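
As a usage sketch of the interface described above, the shell helpers
below wrap the peer file. The function names and the NSIM_DFS_ROOT
override are hypothetical and not part of the patch; only the path
layout and the "M B" format are taken from the description.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the patch. NSIM_DFS_ROOT is an assumed
# override so the functions can be tried against a scratch directory.
NSIM_DFS_ROOT=${NSIM_DFS_ROOT:-/sys/kernel/debug/netdevsim}

# Print the debugfs peer file for port $2 of netdevsim $1.
nsim_peer_path()
{
	echo "$NSIM_DFS_ROOT/netdevsim$1/ports/$2/peer"
}

# Link port $2 of netdevsim $1 with port $4 of netdevsim $3 by writing
# "<peer id> <peer port>" to the local port's peer file.
nsim_link()
{
	echo "$3 $4" > "$(nsim_peer_path "$1" "$2")"
}

# Print the linked "<id> <port>" pair; empty output means no peer is set.
nsim_peer_show()
{
	cat "$(nsim_peer_path "$1" "$2")"
}
```

With netdevsim 1 and 2 created and debugfs mounted, "nsim_link 1 0 2 0"
followed by "nsim_peer_show 2 0" should report netdevsim 1 port 0,
since the write handler sets the peer pointer on both ports.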

During nsim_dev_peer_write(), nsim_dev_list_lock prevents concurrent
modifications to nsim_dev and peer's devlink->lock prevents concurrent
modifications to the peer's port_list. rtnl_lock ensures netdevices do
not change during the critical section where a link is established.

The lock order is consistent with other parts that touch netdevsim and
should not deadlock.

During nsim_dev_peer_read(), RCU read critical section ensures valid
values even if stale.

Signed-off-by: David Wei <dw@davidwei.uk>
---
 drivers/net/netdevsim/dev.c       | 134 +++++++++++++++++++++++++++---
 drivers/net/netdevsim/netdev.c    |   6 ++
 drivers/net/netdevsim/netdevsim.h |   1 +
 3 files changed, 128 insertions(+), 13 deletions(-)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 8d477aa99f94..6d5e4ce08dfd 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -391,6 +391,124 @@ static const struct file_operations nsim_dev_rate_parent_fops = {
 	.owner = THIS_MODULE,
 };
 
+static struct nsim_dev *nsim_dev_find_by_id(unsigned int id)
+{
+	struct nsim_dev *dev;
+
+	list_for_each_entry(dev, &nsim_dev_list, list)
+		if (dev->nsim_bus_dev->dev.id == id)
+			return dev;
+
+	return NULL;
+}
+
+static struct nsim_dev_port *
+__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
+		       unsigned int port_index)
+{
+	struct nsim_dev_port *nsim_dev_port;
+
+	port_index = nsim_dev_port_index(type, port_index);
+	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
+		if (nsim_dev_port->port_index == port_index)
+			return nsim_dev_port;
+	return NULL;
+}
+
+static ssize_t nsim_dev_peer_read(struct file *file, char __user *data,
+				  size_t count, loff_t *ppos)
+{
+	struct nsim_dev_port *nsim_dev_port;
+	struct netdevsim *peer;
+	unsigned int id, port;
+	ssize_t ret = 0;
+	char buf[23];
+
+	nsim_dev_port = file->private_data;
+	rcu_read_lock();
+	peer = rcu_dereference(nsim_dev_port->ns->peer);
+	if (!peer) {
+		rcu_read_unlock();
+		return 0;
+	}
+
+	id = peer->nsim_bus_dev->dev.id;
+	port = peer->nsim_dev_port->port_index;
+	ret = scnprintf(buf, sizeof(buf), "%u %u\n", id, port);
+	ret = simple_read_from_buffer(data, count, ppos, buf, ret);
+
+	rcu_read_unlock();
+	return ret;
+}
+
+static ssize_t nsim_dev_peer_write(struct file *file,
+				   const char __user *data,
+				   size_t count, loff_t *ppos)
+{
+	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
+	struct nsim_dev *peer_dev;
+	unsigned int id, port;
+	char buf[22];
+	ssize_t ret;
+
+	if (count >= sizeof(buf))
+		return -ENOSPC;
+
+	ret = copy_from_user(buf, data, count);
+	if (ret)
+		return -EFAULT;
+	buf[count] = '\0';
+
+	ret = sscanf(buf, "%u %u", &id, &port);
+	if (ret != 2) {
+		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");
+		return -EINVAL;
+	}
+
+	ret = -EINVAL;
+	mutex_lock(&nsim_dev_list_lock);
+	peer_dev = nsim_dev_find_by_id(id);
+	if (!peer_dev) {
+		pr_err("Peer netdevsim %u does not exist\n", id);
+		goto out_mutex;
+	}
+
+	devl_lock(priv_to_devlink(peer_dev));
+	rtnl_lock();
+	nsim_dev_port = file->private_data;
+	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
+					       port);
+	if (!peer_dev_port) {
+		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
+		goto out_devl;
+	}
+
+	if (nsim_dev_port == peer_dev_port) {
+		pr_err("Cannot link netdevsim to itself\n");
+		goto out_devl;
+	}
+
+	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
+	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
+	ret = count;
+
+out_devl:
+	rtnl_unlock();
+	devl_unlock(priv_to_devlink(peer_dev));
+out_mutex:
+	mutex_unlock(&nsim_dev_list_lock);
+
+	return ret;
+}
+
+static const struct file_operations nsim_dev_peer_fops = {
+	.open = simple_open,
+	.read = nsim_dev_peer_read,
+	.write = nsim_dev_peer_write,
+	.llseek = generic_file_llseek,
+	.owner = THIS_MODULE,
+};
+
 static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
 				      struct nsim_dev_port *nsim_dev_port)
 {
@@ -421,6 +539,9 @@ static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
 	}
 	debugfs_create_symlink("dev", nsim_dev_port->ddir, dev_link_name);
 
+	debugfs_create_file("peer", 0600, nsim_dev_port->ddir,
+			    nsim_dev_port, &nsim_dev_peer_fops);
+
 	return 0;
 }
 
@@ -1704,19 +1825,6 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
 	dev_set_drvdata(&nsim_bus_dev->dev, NULL);
 }
 
-static struct nsim_dev_port *
-__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
-		       unsigned int port_index)
-{
-	struct nsim_dev_port *nsim_dev_port;
-
-	port_index = nsim_dev_port_index(type, port_index);
-	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
-		if (nsim_dev_port->port_index == port_index)
-			return nsim_dev_port;
-	return NULL;
-}
-
 int nsim_drv_port_add(struct nsim_bus_dev *nsim_bus_dev, enum nsim_dev_port_type type,
 		      unsigned int port_index)
 {
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index aecaf5f44374..434322f6a565 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -388,6 +388,7 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
 	ns->nsim_dev = nsim_dev;
 	ns->nsim_dev_port = nsim_dev_port;
 	ns->nsim_bus_dev = nsim_dev->nsim_bus_dev;
+	RCU_INIT_POINTER(ns->peer, NULL);
 	SET_NETDEV_DEV(dev, &ns->nsim_bus_dev->dev);
 	SET_NETDEV_DEVLINK_PORT(dev, &nsim_dev_port->devlink_port);
 	nsim_ethtool_init(ns);
@@ -407,8 +408,13 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
 void nsim_destroy(struct netdevsim *ns)
 {
 	struct net_device *dev = ns->netdev;
+	struct netdevsim *peer;
 
 	rtnl_lock();
+	peer = rtnl_dereference(ns->peer);
+	if (peer)
+		RCU_INIT_POINTER(peer->peer, NULL);
+	RCU_INIT_POINTER(ns->peer, NULL);
 	unregister_netdevice(dev);
 	if (nsim_dev_port_is_pf(ns->nsim_dev_port)) {
 		nsim_macsec_teardown(ns);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index babb61d7790b..24fc3fbda791 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -125,6 +125,7 @@ struct netdevsim {
 	} udp_ports;
 
 	struct nsim_ethtool ethtool;
+	struct netdevsim __rcu *peer;
 };
 
 struct netdevsim *
-- 
2.39.3



* [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected David Wei
@ 2023-12-28  1:46 ` David Wei
  2024-01-02 11:13   ` Jiri Pirko
  2024-01-02 11:20   ` Eric Dumazet
  2023-12-28  1:46 ` [PATCH net-next v5 4/5] netdevsim: add selftest for forwarding skb between connected ports David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 5/5] netdevsim: add Makefile for selftests David Wei
  4 siblings, 2 replies; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

Forward skbs sent from one netdevsim port to its connected netdevsim
port using dev_forward_skb(), similar in spirit to veth.

Add a tx_dropped variable to struct netdevsim, tracking the number of
skbs that could not be forwarded using dev_forward_skb().

The xmit() function accesses the peer ptr inside an RCU read-side
critical section. The rcu_read_lock() is functionally redundant because,
since v5.0, all softirqs are implicitly RCU read-side critical sections,
but it is useful for human readers.

If another CPU is concurrently in nsim_destroy(), then it will first set
the peer ptr to NULL. This does not affect any existing readers that
dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
a synchronize_rcu() before the netdev is actually unregistered and
freed. This ensures that any readers i.e. xmit() that got a non-NULL
peer will complete before the netdev is freed.

Any readers that run after the RCU_INIT_POINTER() but before
synchronize_rcu() will see a NULL peer and skip forwarding, which is safe.

The codepaths to nsim_destroy() and nsim_create() take both the newly
added nsim_dev_list_lock and rtnl_lock. This makes them safe against
concurrent attempts to link two netdevsims together.

Signed-off-by: David Wei <dw@davidwei.uk>
---
 drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
 drivers/net/netdevsim/netdevsim.h |  1 +
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 434322f6a565..0009d0f1243f 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -29,19 +29,34 @@
 static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct netdevsim *ns = netdev_priv(dev);
+	struct netdevsim *peer_ns;
+	int ret = NETDEV_TX_OK;
 
 	if (!nsim_ipsec_tx(ns, skb))
 		goto out;
 
+	rcu_read_lock();
+	peer_ns = rcu_dereference(ns->peer);
+	if (!peer_ns)
+		goto out_stats;
+
+	skb_tx_timestamp(skb);
+	if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP))
+		ret = NET_XMIT_DROP;
+
+out_stats:
+	rcu_read_unlock();
 	u64_stats_update_begin(&ns->syncp);
 	ns->tx_packets++;
 	ns->tx_bytes += skb->len;
+	if (ret == NET_XMIT_DROP)
+		ns->tx_dropped++;
 	u64_stats_update_end(&ns->syncp);
+	return ret;
 
 out:
 	dev_kfree_skb(skb);
-
-	return NETDEV_TX_OK;
+	return ret;
 }
 
 static void nsim_set_rx_mode(struct net_device *dev)
@@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 		start = u64_stats_fetch_begin(&ns->syncp);
 		stats->tx_bytes = ns->tx_bytes;
 		stats->tx_packets = ns->tx_packets;
+		stats->tx_dropped = ns->tx_dropped;
 	} while (u64_stats_fetch_retry(&ns->syncp, start));
 }
 
@@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
 	eth_hw_addr_random(dev);
 
 	dev->tx_queue_len = 0;
-	dev->flags |= IFF_NOARP;
 	dev->flags &= ~IFF_MULTICAST;
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
 			   IFF_NO_QUEUE;
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 24fc3fbda791..083b1ee7a1a2 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -98,6 +98,7 @@ struct netdevsim {
 
 	u64 tx_packets;
 	u64 tx_bytes;
+	u64 tx_dropped;
 	struct u64_stats_sync syncp;
 
 	struct nsim_bus_dev *nsim_bus_dev;
-- 
2.39.3



* [PATCH net-next v5 4/5] netdevsim: add selftest for forwarding skb between connected ports
  2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
                   ` (2 preceding siblings ...)
  2023-12-28  1:46 ` [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another David Wei
@ 2023-12-28  1:46 ` David Wei
  2023-12-28  1:46 ` [PATCH net-next v5 5/5] netdevsim: add Makefile for selftests David Wei
  4 siblings, 0 replies; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

Connect two netdevsim ports in different namespaces together, then send
packets between them using socat.
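
The pass criterion of the selftest can be sketched in isolation: "HI"
plus the newline that echo appends is three bytes, so the listener's
output file should hold exactly three bytes. expect_bytes is a
hypothetical helper and is not part of peer.sh.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if file $1 holds exactly $2 bytes.
expect_bytes()
{
	local count
	count=$(wc -c < "$1")
	[ "$count" -eq "$2" ]
}
```

In the selftest's terms, "expect_bytes $tmp_file 3" would stand in for
the inline wc -c comparison.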

Signed-off-by: David Wei <dw@davidwei.uk>
---
 .../selftests/drivers/net/netdevsim/peer.sh   | 124 ++++++++++++++++++
 1 file changed, 124 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/peer.sh

diff --git a/tools/testing/selftests/drivers/net/netdevsim/peer.sh b/tools/testing/selftests/drivers/net/netdevsim/peer.sh
new file mode 100755
index 000000000000..f123e6b7cd2f
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/netdevsim/peer.sh
@@ -0,0 +1,124 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+
+NSIM_DEV_1_ID=$((RANDOM % 1024))
+NSIM_DEV_1_SYS=/sys/bus/netdevsim/devices/netdevsim$NSIM_DEV_1_ID
+NSIM_DEV_1_DFS=/sys/kernel/debug/netdevsim/netdevsim$NSIM_DEV_1_ID
+NSIM_DEV_2_ID=$((RANDOM % 1024))
+NSIM_DEV_2_SYS=/sys/bus/netdevsim/devices/netdevsim$NSIM_DEV_2_ID
+NSIM_DEV_2_DFS=/sys/kernel/debug/netdevsim/netdevsim$NSIM_DEV_2_ID
+
+NSIM_DEV_SYS_NEW=/sys/bus/netdevsim/new_device
+NSIM_DEV_SYS_DEL=/sys/bus/netdevsim/del_device
+
+socat_check()
+{
+	if [ ! -x "$(command -v socat)" ]; then
+		echo "socat command not found. Skipping test"
+		return 1
+	fi
+
+	return 0
+}
+
+setup_ns()
+{
+	set -e
+	ip netns add nssv
+	ip netns add nscl
+
+	NSIM_DEV_1_NAME=$(find $NSIM_DEV_1_SYS/net -maxdepth 1 -type d ! \
+		-path $NSIM_DEV_1_SYS/net -exec basename {} \;)
+	NSIM_DEV_2_NAME=$(find $NSIM_DEV_2_SYS/net -maxdepth 1 -type d ! \
+		-path $NSIM_DEV_2_SYS/net -exec basename {} \;)
+
+	ip link set $NSIM_DEV_1_NAME netns nssv
+	ip link set $NSIM_DEV_2_NAME netns nscl
+
+	ip netns exec nssv ip addr add '192.168.1.1/24' dev $NSIM_DEV_1_NAME
+	ip netns exec nscl ip addr add '192.168.1.2/24' dev $NSIM_DEV_2_NAME
+
+	ip netns exec nssv ip link set dev $NSIM_DEV_1_NAME up
+	ip netns exec nscl ip link set dev $NSIM_DEV_2_NAME up
+	set +e
+}
+
+cleanup_ns()
+{
+	ip netns del nscl
+	ip netns del nssv
+}
+
+###
+### Code start
+###
+
+modprobe netdevsim
+
+# linking
+
+echo $NSIM_DEV_1_ID > $NSIM_DEV_SYS_NEW
+
+echo "$NSIM_DEV_2_ID 0" > ${NSIM_DEV_1_DFS}/ports/0/peer 2>/dev/null
+if [ $? -eq 0 ]; then
+	echo "linking with non-existent netdevsim should fail"
+	exit 1
+fi
+
+echo $NSIM_DEV_2_ID > $NSIM_DEV_SYS_NEW
+
+echo "$NSIM_DEV_2_ID 0" > ${NSIM_DEV_1_DFS}/ports/0/peer
+if [ $? -ne 0 ]; then
+	echo "linking netdevsim1 port0 with netdevsim2 port0 should succeed"
+	exit 1
+fi
+
+# argument error checking
+
+echo "$NSIM_DEV_2_ID 1" > ${NSIM_DEV_1_DFS}/ports/0/peer 2>/dev/null
+if [ $? -eq 0 ]; then
+	echo "linking with non-existent port in a netdevsim should fail"
+	exit 1
+fi
+
+echo "$NSIM_DEV_1_ID 0" > ${NSIM_DEV_1_DFS}/ports/0/peer 2>/dev/null
+if [ $? -eq 0 ]; then
+	echo "linking with self should fail"
+	exit 1
+fi
+
+echo "$NSIM_DEV_2_ID a" > ${NSIM_DEV_1_DFS}/ports/0/peer 2>/dev/null
+if [ $? -eq 0 ]; then
+	echo "invalid arg should fail"
+	exit 1
+fi
+
+# send/recv packets
+
+socat_check || exit 4
+
+setup_ns
+
+tmp_file=$(mktemp)
+ip netns exec nssv socat TCP-LISTEN:1234,fork $tmp_file &
+pid=$!
+res=0
+
+echo "HI" | ip netns exec nscl socat STDIN TCP:192.168.1.1:1234
+
+count=$(cat $tmp_file | wc -c)
+if [[ $count -ne 3 ]]; then
+	echo "expected 3 bytes, got $count"
+	res=1
+fi
+
+echo $NSIM_DEV_2_ID > $NSIM_DEV_SYS_DEL
+
+kill $pid
+echo $NSIM_DEV_1_ID > $NSIM_DEV_SYS_DEL
+
+cleanup_ns
+
+modprobe -r netdevsim
+
+exit $res
-- 
2.39.3



* [PATCH net-next v5 5/5] netdevsim: add Makefile for selftests
  2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
                   ` (3 preceding siblings ...)
  2023-12-28  1:46 ` [PATCH net-next v5 4/5] netdevsim: add selftest for forwarding skb between connected ports David Wei
@ 2023-12-28  1:46 ` David Wei
  4 siblings, 0 replies; 20+ messages in thread
From: David Wei @ 2023-12-28  1:46 UTC
  To: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni

Add a Makefile for netdevsim selftests and add the selftests path to
MAINTAINERS.

Signed-off-by: David Wei <dw@davidwei.uk>
---
 MAINTAINERS                                    |  1 +
 .../selftests/drivers/net/netdevsim/Makefile   | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)
 create mode 100644 tools/testing/selftests/drivers/net/netdevsim/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index dda78b4ce707..d086e49eac57 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14853,6 +14853,7 @@ NETDEVSIM
 M:	Jakub Kicinski <kuba@kernel.org>
 S:	Maintained
 F:	drivers/net/netdevsim/*
+F:	tools/testing/selftests/drivers/net/netdevsim/*
 
 NETEM NETWORK EMULATOR
 M:	Stephen Hemminger <stephen@networkplumber.org>
diff --git a/tools/testing/selftests/drivers/net/netdevsim/Makefile b/tools/testing/selftests/drivers/net/netdevsim/Makefile
new file mode 100644
index 000000000000..5bace0b7fb57
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/netdevsim/Makefile
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0+ OR MIT
+
+TEST_PROGS = devlink.sh \
+	devlink_in_netns.sh \
+	devlink_trap.sh \
+	ethtool-coalesce.sh \
+	ethtool-fec.sh \
+	ethtool-pause.sh \
+	ethtool-ring.sh \
+	fib.sh \
+	hw_stats_l3.sh \
+	nexthop.sh \
+	peer.sh \
+	psample.sh \
+	tc-mq-visibility.sh \
+	udp_tunnel_nic.sh \
+
+include ../../../lib.mk
-- 
2.39.3



* Re: [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims
  2023-12-28  1:46 ` [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims David Wei
@ 2024-01-02 11:04   ` Jiri Pirko
  2024-01-03 21:48     ` David Wei
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2024-01-02 11:04 UTC
  To: David Wei
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

Thu, Dec 28, 2023 at 02:46:29AM CET, dw@davidwei.uk wrote:
>Add a linked list, nsim_dev_list, of probed nsim_devs. Entries are added
>in nsim_drv_probe() and removed in nsim_drv_remove(). A mutex,
>nsim_dev_list_lock, protects the list.
>
>Signed-off-by: David Wei <dw@davidwei.uk>
>---
> drivers/net/netdevsim/dev.c       | 19 +++++++++++++++++++
> drivers/net/netdevsim/netdevsim.h |  1 +
> 2 files changed, 20 insertions(+)
>
>diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>index b4d3b9cde8bd..8d477aa99f94 100644
>--- a/drivers/net/netdevsim/dev.c
>+++ b/drivers/net/netdevsim/dev.c
>@@ -35,6 +35,9 @@
> 
> #include "netdevsim.h"
> 
>+static LIST_HEAD(nsim_dev_list);
>+static DEFINE_MUTEX(nsim_dev_list_lock);
>+
> static unsigned int
> nsim_dev_port_index(enum nsim_dev_port_type type, unsigned int port_index)
> {
>@@ -1607,6 +1610,11 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
> 
> 	nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY;
> 	devl_unlock(devlink);
>+
>+	mutex_lock(&nsim_dev_list_lock);
>+	list_add(&nsim_dev->list, &nsim_dev_list);
>+	mutex_unlock(&nsim_dev_list_lock);
>+
> 	return 0;
> 
> err_hwstats_exit:
>@@ -1668,8 +1676,19 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
> {
> 	struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
> 	struct devlink *devlink = priv_to_devlink(nsim_dev);
>+	struct nsim_dev *pos, *tmp;
>+
>+	mutex_lock(&nsim_dev_list_lock);
>+	list_for_each_entry_safe(pos, tmp, &nsim_dev_list, list) {
>+		if (pos == nsim_dev) {
>+			list_del(&nsim_dev->list);
>+			break;
>+		}
>+	}
>+	mutex_unlock(&nsim_dev_list_lock);

This is just:
	mutex_lock(&nsim_dev_list_lock);
	list_del(&nsim_dev->list);
	mutex_unlock(&nsim_dev_list_lock);

The loop is not good for anything.



> 
> 	devl_lock(devlink);
>+

Remove this leftover line addition.


> 	nsim_dev_reload_destroy(nsim_dev);
> 
> 	nsim_bpf_dev_exit(nsim_dev);
>diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>index 028c825b86db..babb61d7790b 100644
>--- a/drivers/net/netdevsim/netdevsim.h
>+++ b/drivers/net/netdevsim/netdevsim.h
>@@ -277,6 +277,7 @@ struct nsim_vf_config {
> 
> struct nsim_dev {
> 	struct nsim_bus_dev *nsim_bus_dev;
>+	struct list_head list;
> 	struct nsim_fib_data *fib_data;
> 	struct nsim_trap_data *trap_data;
> 	struct dentry *ddir;
>-- 
>2.39.3
>


* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2023-12-28  1:46 ` [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected David Wei
@ 2024-01-02 11:11   ` Jiri Pirko
  2024-01-03 21:56     ` David Wei
  2024-01-04  1:39   ` Jakub Kicinski
  1 sibling, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2024-01-02 11:11 UTC
  To: David Wei
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

Thu, Dec 28, 2023 at 02:46:30AM CET, dw@davidwei.uk wrote:
>Add a debugfs file in
>/sys/kernel/debug/netdevsim/netdevsimN/ports/A/peer
>
>Writing "M B" to this file will link port A of netdevsim N with port B
>of netdevsim M. Reading this file will return the linked netdevsim id
>and port, if any.
>
>During nsim_dev_peer_write(), nsim_dev_list_lock prevents concurrent
>modifications to nsim_dev and peer's devlink->lock prevents concurrent
>modifications to the peer's port_list. rtnl_lock ensures netdevices do
>not change during the critical section where a link is established.
>
>The lock order is consistent with other parts that touch netdevsim and
>should not deadlock.
>
>During nsim_dev_peer_read(), RCU read critical section ensures valid
>values even if stale.
>
>Signed-off-by: David Wei <dw@davidwei.uk>
>---
> drivers/net/netdevsim/dev.c       | 134 +++++++++++++++++++++++++++---
> drivers/net/netdevsim/netdev.c    |   6 ++
> drivers/net/netdevsim/netdevsim.h |   1 +
> 3 files changed, 128 insertions(+), 13 deletions(-)
>
>diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>index 8d477aa99f94..6d5e4ce08dfd 100644
>--- a/drivers/net/netdevsim/dev.c
>+++ b/drivers/net/netdevsim/dev.c
>@@ -391,6 +391,124 @@ static const struct file_operations nsim_dev_rate_parent_fops = {
> 	.owner = THIS_MODULE,
> };
> 
>+static struct nsim_dev *nsim_dev_find_by_id(unsigned int id)
>+{
>+	struct nsim_dev *dev;
>+
>+	list_for_each_entry(dev, &nsim_dev_list, list)
>+		if (dev->nsim_bus_dev->dev.id == id)
>+			return dev;
>+
>+	return NULL;
>+}
>+
>+static struct nsim_dev_port *
>+__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>+		       unsigned int port_index)
>+{
>+	struct nsim_dev_port *nsim_dev_port;
>+
>+	port_index = nsim_dev_port_index(type, port_index);
>+	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>+		if (nsim_dev_port->port_index == port_index)
>+			return nsim_dev_port;
>+	return NULL;
>+}
>+
>+static ssize_t nsim_dev_peer_read(struct file *file, char __user *data,
>+				  size_t count, loff_t *ppos)
>+{
>+	struct nsim_dev_port *nsim_dev_port;
>+	struct netdevsim *peer;
>+	unsigned int id, port;
>+	ssize_t ret = 0;
>+	char buf[23];
>+
>+	nsim_dev_port = file->private_data;
>+	rcu_read_lock();
>+	peer = rcu_dereference(nsim_dev_port->ns->peer);
>+	if (!peer) {
>+		rcu_read_unlock();
>+		return 0;
>+	}
>+
>+	id = peer->nsim_bus_dev->dev.id;
>+	port = peer->nsim_dev_port->port_index;
>+	ret = scnprintf(buf, sizeof(buf), "%u %u\n", id, port);
>+	ret = simple_read_from_buffer(data, count, ppos, buf, ret);
>+
>+	rcu_read_unlock();
>+	return ret;
>+}
>+
>+static ssize_t nsim_dev_peer_write(struct file *file,
>+				   const char __user *data,
>+				   size_t count, loff_t *ppos)
>+{
>+	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
>+	struct nsim_dev *peer_dev;
>+	unsigned int id, port;
>+	char buf[22];
>+	ssize_t ret;
>+
>+	if (count >= sizeof(buf))
>+		return -ENOSPC;
>+
>+	ret = copy_from_user(buf, data, count);
>+	if (ret)
>+		return -EFAULT;
>+	buf[count] = '\0';
>+
>+	ret = sscanf(buf, "%u %u", &id, &port);
>+	if (ret != 2) {
>+		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");
>+		return -EINVAL;
>+	}
>+
>+	ret = -EINVAL;
>+	mutex_lock(&nsim_dev_list_lock);
>+	peer_dev = nsim_dev_find_by_id(id);
>+	if (!peer_dev) {
>+		pr_err("Peer netdevsim %u does not exist\n", id);
>+		goto out_mutex;
>+	}
>+
>+	devl_lock(priv_to_devlink(peer_dev));

Why exactly do you take devlink instance mutex of the peer here?


>+	rtnl_lock();
>+	nsim_dev_port = file->private_data;
>+	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
>+					       port);
>+	if (!peer_dev_port) {
>+		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
>+		goto out_devl;
>+	}
>+
>+	if (nsim_dev_port == peer_dev_port) {
>+		pr_err("Cannot link netdevsim to itself\n");
>+		goto out_devl;
>+	}
>+
>+	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
>+	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
>+	ret = count;
>+
>+out_devl:
>+	rtnl_unlock();
>+	devl_unlock(priv_to_devlink(peer_dev));
>+out_mutex:
>+	mutex_unlock(&nsim_dev_list_lock);
>+
>+	return ret;
>+}
>+
>+static const struct file_operations nsim_dev_peer_fops = {
>+	.open = simple_open,
>+	.read = nsim_dev_peer_read,
>+	.write = nsim_dev_peer_write,
>+	.llseek = generic_file_llseek,
>+	.owner = THIS_MODULE,
>+};
>+
> static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
> 				      struct nsim_dev_port *nsim_dev_port)
> {
>@@ -421,6 +539,9 @@ static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
> 	}
> 	debugfs_create_symlink("dev", nsim_dev_port->ddir, dev_link_name);
> 
>+	debugfs_create_file("peer", 0600, nsim_dev_port->ddir,
>+			    nsim_dev_port, &nsim_dev_peer_fops);
>+
> 	return 0;
> }
> 
>@@ -1704,19 +1825,6 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
> 	dev_set_drvdata(&nsim_bus_dev->dev, NULL);
> }
> 
>-static struct nsim_dev_port *
>-__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>-		       unsigned int port_index)
>-{
>-	struct nsim_dev_port *nsim_dev_port;
>-
>-	port_index = nsim_dev_port_index(type, port_index);
>-	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>-		if (nsim_dev_port->port_index == port_index)
>-			return nsim_dev_port;
>-	return NULL;
>-}
>-
> int nsim_drv_port_add(struct nsim_bus_dev *nsim_bus_dev, enum nsim_dev_port_type type,
> 		      unsigned int port_index)
> {
>diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>index aecaf5f44374..434322f6a565 100644
>--- a/drivers/net/netdevsim/netdev.c
>+++ b/drivers/net/netdevsim/netdev.c
>@@ -388,6 +388,7 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
> 	ns->nsim_dev = nsim_dev;
> 	ns->nsim_dev_port = nsim_dev_port;
> 	ns->nsim_bus_dev = nsim_dev->nsim_bus_dev;
>+	RCU_INIT_POINTER(ns->peer, NULL);
> 	SET_NETDEV_DEV(dev, &ns->nsim_bus_dev->dev);
> 	SET_NETDEV_DEVLINK_PORT(dev, &nsim_dev_port->devlink_port);
> 	nsim_ethtool_init(ns);
>@@ -407,8 +408,13 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
> void nsim_destroy(struct netdevsim *ns)
> {
> 	struct net_device *dev = ns->netdev;
>+	struct netdevsim *peer;
> 
> 	rtnl_lock();
>+	peer = rtnl_dereference(ns->peer);
>+	if (peer)
>+		RCU_INIT_POINTER(peer->peer, NULL);
>+	RCU_INIT_POINTER(ns->peer, NULL);
> 	unregister_netdevice(dev);
> 	if (nsim_dev_port_is_pf(ns->nsim_dev_port)) {
> 		nsim_macsec_teardown(ns);
>diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>index babb61d7790b..24fc3fbda791 100644
>--- a/drivers/net/netdevsim/netdevsim.h
>+++ b/drivers/net/netdevsim/netdevsim.h
>@@ -125,6 +125,7 @@ struct netdevsim {
> 	} udp_ports;
> 
> 	struct nsim_ethtool ethtool;
>+	struct netdevsim __rcu *peer;
> };
> 
> struct netdevsim *
>-- 
>2.39.3
>


* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2023-12-28  1:46 ` [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another David Wei
@ 2024-01-02 11:13   ` Jiri Pirko
  2024-01-03 22:36     ` David Wei
  2024-01-02 11:20   ` Eric Dumazet
  1 sibling, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2024-01-02 11:13 UTC (permalink / raw)
  To: David Wei
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote:
>Forward skbs sent from one netdevsim port to its connected netdevsim
>port using dev_forward_skb, in a spirit similar to veth.
>
>Add a tx_dropped variable to struct netdevsim, tracking the number of
>skbs that could not be forwarded using dev_forward_skb().
>
>The xmit() function accessing the peer ptr is protected by an RCU read
>critical section. The rcu_read_lock() is functionally redundant as since
>v5.0 all softirqs are implicitly RCU read critical sections; but it is
>useful for human readers.
>
>If another CPU is concurrently in nsim_destroy(), then it will first set
>the peer ptr to NULL. This does not affect any existing readers that
>dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
>a synchronize_rcu() before the netdev is actually unregistered and
>freed. This ensures that any readers i.e. xmit() that got a non-NULL
>peer will complete before the netdev is freed.
>
>Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
>will dereference NULL, making it safe.
>
>The codepath to nsim_destroy() and nsim_create() takes both the newly
>added nsim_dev_list_lock and rtnl_lock. This makes it safe with

I don't see the rtnl_lock taken in those functions.


Otherwise, this patch looks fine to me.


>concurrent calls to linking two netdevsims together.
>
>Signed-off-by: David Wei <dw@davidwei.uk>
>---
> drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
> drivers/net/netdevsim/netdevsim.h |  1 +
> 2 files changed, 19 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>index 434322f6a565..0009d0f1243f 100644
>--- a/drivers/net/netdevsim/netdev.c
>+++ b/drivers/net/netdevsim/netdev.c
>@@ -29,19 +29,34 @@
> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> 	struct netdevsim *ns = netdev_priv(dev);
>+	struct netdevsim *peer_ns;
>+	int ret = NETDEV_TX_OK;
> 
> 	if (!nsim_ipsec_tx(ns, skb))
> 		goto out;
> 
>+	rcu_read_lock();
>+	peer_ns = rcu_dereference(ns->peer);
>+	if (!peer_ns)
>+		goto out_stats;
>+
>+	skb_tx_timestamp(skb);
>+	if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP))
>+		ret = NET_XMIT_DROP;
>+
>+out_stats:
>+	rcu_read_unlock();
> 	u64_stats_update_begin(&ns->syncp);
> 	ns->tx_packets++;
> 	ns->tx_bytes += skb->len;
>+	if (ret == NET_XMIT_DROP)
>+		ns->tx_dropped++;
> 	u64_stats_update_end(&ns->syncp);
>+	return ret;
> 
> out:
> 	dev_kfree_skb(skb);
>-
>-	return NETDEV_TX_OK;
>+	return ret;
> }
> 
> static void nsim_set_rx_mode(struct net_device *dev)
>@@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> 		start = u64_stats_fetch_begin(&ns->syncp);
> 		stats->tx_bytes = ns->tx_bytes;
> 		stats->tx_packets = ns->tx_packets;
>+		stats->tx_dropped = ns->tx_dropped;
> 	} while (u64_stats_fetch_retry(&ns->syncp, start));
> }
> 
>@@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
> 	eth_hw_addr_random(dev);
> 
> 	dev->tx_queue_len = 0;
>-	dev->flags |= IFF_NOARP;
> 	dev->flags &= ~IFF_MULTICAST;
> 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
> 			   IFF_NO_QUEUE;
>diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>index 24fc3fbda791..083b1ee7a1a2 100644
>--- a/drivers/net/netdevsim/netdevsim.h
>+++ b/drivers/net/netdevsim/netdevsim.h
>@@ -98,6 +98,7 @@ struct netdevsim {
> 
> 	u64 tx_packets;
> 	u64 tx_bytes;
>+	u64 tx_dropped;
> 	struct u64_stats_sync syncp;
> 
> 	struct nsim_bus_dev *nsim_bus_dev;
>-- 
>2.39.3
>


* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2023-12-28  1:46 ` [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another David Wei
  2024-01-02 11:13   ` Jiri Pirko
@ 2024-01-02 11:20   ` Eric Dumazet
  2024-01-03 21:57     ` David Wei
  1 sibling, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-01-02 11:20 UTC (permalink / raw)
  To: David Wei
  Cc: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev,
	David S. Miller, Paolo Abeni

On Thu, Dec 28, 2023 at 2:46 AM David Wei <dw@davidwei.uk> wrote:
>
> Forward skbs sent from one netdevsim port to its connected netdevsim
> port using dev_forward_skb, in a spirit similar to veth.
>
> Add a tx_dropped variable to struct netdevsim, tracking the number of
> skbs that could not be forwarded using dev_forward_skb().
>
> The xmit() function accessing the peer ptr is protected by an RCU read
> critical section. The rcu_read_lock() is functionally redundant as since
> v5.0 all softirqs are implicitly RCU read critical sections; but it is
> useful for human readers.
>
> If another CPU is concurrently in nsim_destroy(), then it will first set
> the peer ptr to NULL. This does not affect any existing readers that
> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
> a synchronize_rcu() before the netdev is actually unregistered and
> freed. This ensures that any readers i.e. xmit() that got a non-NULL
> peer will complete before the netdev is freed.
>
> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
> will dereference NULL, making it safe.
>
> The codepath to nsim_destroy() and nsim_create() takes both the newly
> added nsim_dev_list_lock and rtnl_lock. This makes it safe with
> concurrent calls to linking two netdevsims together.
>
> Signed-off-by: David Wei <dw@davidwei.uk>
> ---
>  drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
>  drivers/net/netdevsim/netdevsim.h |  1 +
>  2 files changed, 19 insertions(+), 3 deletions(-)
>


> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
>         eth_hw_addr_random(dev);
>
>         dev->tx_queue_len = 0;
> -       dev->flags |= IFF_NOARP;

This part seems to be unrelated to this patch ?

>         dev->flags &= ~IFF_MULTICAST;
>         dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
>                            IFF_NO_QUEUE;


* Re: [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims
  2024-01-02 11:04   ` Jiri Pirko
@ 2024-01-03 21:48     ` David Wei
  0 siblings, 0 replies; 20+ messages in thread
From: David Wei @ 2024-01-03 21:48 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On 2024-01-02 03:04, Jiri Pirko wrote:
> Thu, Dec 28, 2023 at 02:46:29AM CET, dw@davidwei.uk wrote:
>> In this patch I added a linked list nsim_dev_list of probed nsim_devs,
>> added during nsim_drv_probe() and removed during nsim_drv_remove(). A
>> mutex nsim_dev_list_lock protects the list.
>>
>> Signed-off-by: David Wei <dw@davidwei.uk>
>> ---
>> drivers/net/netdevsim/dev.c       | 19 +++++++++++++++++++
>> drivers/net/netdevsim/netdevsim.h |  1 +
>> 2 files changed, 20 insertions(+)
>>
>> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>> index b4d3b9cde8bd..8d477aa99f94 100644
>> --- a/drivers/net/netdevsim/dev.c
>> +++ b/drivers/net/netdevsim/dev.c
>> @@ -35,6 +35,9 @@
>>
>> #include "netdevsim.h"
>>
>> +static LIST_HEAD(nsim_dev_list);
>> +static DEFINE_MUTEX(nsim_dev_list_lock);
>> +
>> static unsigned int
>> nsim_dev_port_index(enum nsim_dev_port_type type, unsigned int port_index)
>> {
>> @@ -1607,6 +1610,11 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
>>
>> 	nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY;
>> 	devl_unlock(devlink);
>> +
>> +	mutex_lock(&nsim_dev_list_lock);
>> +	list_add(&nsim_dev->list, &nsim_dev_list);
>> +	mutex_unlock(&nsim_dev_list_lock);
>> +
>> 	return 0;
>>
>> err_hwstats_exit:
>> @@ -1668,8 +1676,19 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
>> {
>> 	struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
>> 	struct devlink *devlink = priv_to_devlink(nsim_dev);
>> +	struct nsim_dev *pos, *tmp;
>> +
>> +	mutex_lock(&nsim_dev_list_lock);
>> +	list_for_each_entry_safe(pos, tmp, &nsim_dev_list, list) {
>> +		if (pos == nsim_dev) {
>> +			list_del(&nsim_dev->list);
>> +			break;
>> +		}
>> +	}
>> +	mutex_unlock(&nsim_dev_list_lock);
> 
> This is just:
> 	mutex_lock(&nsim_dev_list_lock);
> 	list_del(&nsim_dev->list);
> 	mutex_unlock(&nsim_dev_list_lock);
> 
> The loop is not good for anything.

Thanks, will fix this.
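
For reference, a minimal userspace sketch of why the search loop adds
nothing: in a circular doubly linked list, a node can be unlinked using
only its own prev/next pointers. The names below (struct list_node,
list_unlink(), etc.) are hypothetical stand-ins and not the kernel's
<linux/list.h> API; in particular the kernel's list_del() poisons the
removed node's pointers rather than setting them to NULL.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for an intrusive list node. */
struct list_node {
	struct list_node *prev, *next;
};

/* An empty circular list points at itself. */
static void list_init(struct list_node *head)
{
	head->prev = head;
	head->next = head;
}

/* Insert n right after head. */
static void list_add_head(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink n using only its own neighbours. No traversal from the head
 * is needed, provided n is known to be on a list.
 */
static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = NULL;
	n->next = NULL;
}

/* Count entries, excluding the head itself. */
static size_t list_len(const struct list_node *head)
{
	const struct list_node *p;
	size_t len = 0;

	for (p = head->next; p != head; p = p->next)
		len++;
	return len;
}
```

The precondition is that the node really is on the list, which holds in
nsim_drv_remove() as long as every successfully probed nsim_dev was
added in nsim_drv_probe().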

> 
> 
> 
>>
>> 	devl_lock(devlink);
>> +
> 
> Remove this leftover line addition.

Ditto.

> 
> 
>> 	nsim_dev_reload_destroy(nsim_dev);
>>
>> 	nsim_bpf_dev_exit(nsim_dev);
>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>> index 028c825b86db..babb61d7790b 100644
>> --- a/drivers/net/netdevsim/netdevsim.h
>> +++ b/drivers/net/netdevsim/netdevsim.h
>> @@ -277,6 +277,7 @@ struct nsim_vf_config {
>>
>> struct nsim_dev {
>> 	struct nsim_bus_dev *nsim_bus_dev;
>> +	struct list_head list;
>> 	struct nsim_fib_data *fib_data;
>> 	struct nsim_trap_data *trap_data;
>> 	struct dentry *ddir;
>> -- 
>> 2.39.3
>>


* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2024-01-02 11:11   ` Jiri Pirko
@ 2024-01-03 21:56     ` David Wei
  2024-01-04  9:30       ` Jiri Pirko
  0 siblings, 1 reply; 20+ messages in thread
From: David Wei @ 2024-01-03 21:56 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On 2024-01-02 03:11, Jiri Pirko wrote:
> Thu, Dec 28, 2023 at 02:46:30AM CET, dw@davidwei.uk wrote:
>> Add a debugfs file in
>> /sys/kernel/debug/netdevsim/netdevsimN/ports/A/peer
>>
>> Writing "M B" to this file will link port A of netdevsim N with port B
>> of netdevsim M. Reading this file will return the linked netdevsim id
>> and port, if any.
>>
>> During nsim_dev_peer_write(), nsim_dev_list_lock prevents concurrent
>> modifications to nsim_dev and peer's devlink->lock prevents concurrent
>> modifications to the peer's port_list. rtnl_lock ensures netdevices do
>> not change during the critical section where a link is established.
>>
>> The lock order is consistent with other parts that touch netdevsim and
>> should not deadlock.
>>
>> During nsim_dev_peer_read(), RCU read critical section ensures valid
>> values even if stale.
>>
>> Signed-off-by: David Wei <dw@davidwei.uk>
>> ---
>> drivers/net/netdevsim/dev.c       | 134 +++++++++++++++++++++++++++---
>> drivers/net/netdevsim/netdev.c    |   6 ++
>> drivers/net/netdevsim/netdevsim.h |   1 +
>> 3 files changed, 128 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>> index 8d477aa99f94..6d5e4ce08dfd 100644
>> --- a/drivers/net/netdevsim/dev.c
>> +++ b/drivers/net/netdevsim/dev.c
>> @@ -391,6 +391,124 @@ static const struct file_operations nsim_dev_rate_parent_fops = {
>> 	.owner = THIS_MODULE,
>> };
>>
>> +static struct nsim_dev *nsim_dev_find_by_id(unsigned int id)
>> +{
>> +	struct nsim_dev *dev;
>> +
>> +	list_for_each_entry(dev, &nsim_dev_list, list)
>> +		if (dev->nsim_bus_dev->dev.id == id)
>> +			return dev;
>> +
>> +	return NULL;
>> +}
>> +
>> +static struct nsim_dev_port *
>> +__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>> +		       unsigned int port_index)
>> +{
>> +	struct nsim_dev_port *nsim_dev_port;
>> +
>> +	port_index = nsim_dev_port_index(type, port_index);
>> +	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>> +		if (nsim_dev_port->port_index == port_index)
>> +			return nsim_dev_port;
>> +	return NULL;
>> +}
>> +
>> +static ssize_t nsim_dev_peer_read(struct file *file, char __user *data,
>> +				  size_t count, loff_t *ppos)
>> +{
>> +	struct nsim_dev_port *nsim_dev_port;
>> +	struct netdevsim *peer;
>> +	unsigned int id, port;
>> +	ssize_t ret = 0;
>> +	char buf[23];
>> +
>> +	nsim_dev_port = file->private_data;
>> +	rcu_read_lock();
>> +	peer = rcu_dereference(nsim_dev_port->ns->peer);
>> +	if (!peer) {
>> +		rcu_read_unlock();
>> +		return 0;
>> +	}
>> +
>> +	id = peer->nsim_bus_dev->dev.id;
>> +	port = peer->nsim_dev_port->port_index;
>> +	ret = scnprintf(buf, sizeof(buf), "%u %u\n", id, port);
>> +	ret = simple_read_from_buffer(data, count, ppos, buf, ret);
>> +
>> +	rcu_read_unlock();
>> +	return ret;
>> +}
>> +
>> +static ssize_t nsim_dev_peer_write(struct file *file,
>> +				   const char __user *data,
>> +				   size_t count, loff_t *ppos)
>> +{
>> +	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
>> +	struct nsim_dev *peer_dev;
>> +	unsigned int id, port;
>> +	char buf[22];
>> +	ssize_t ret;
>> +
>> +	if (count >= sizeof(buf))
>> +		return -ENOSPC;
>> +
>> +	ret = copy_from_user(buf, data, count);
>> +	if (ret)
>> +		return -EFAULT;
>> +	buf[count] = '\0';
>> +
>> +	ret = sscanf(buf, "%u %u", &id, &port);
>> +	if (ret != 2) {
>> +		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	ret = -EINVAL;
>> +	mutex_lock(&nsim_dev_list_lock);
>> +	peer_dev = nsim_dev_find_by_id(id);
>> +	if (!peer_dev) {
>> +		pr_err("Peer netdevsim %u does not exist\n", id);
>> +		goto out_mutex;
>> +	}
>> +
>> +	devl_lock(priv_to_devlink(peer_dev));
> 
> Why exactly do you take devlink instance mutex of the peer here?

To make sure that the peer's port list does not change. Ports can be
added or removed at will from nsim_drv_port_add() and nsim_drv_port_del(),
which both take the devlink lock.

> 
> 
>> +	rtnl_lock();
>> +	nsim_dev_port = file->private_data;
>> +	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
>> +					       port);
>> +	if (!peer_dev_port) {
>> +		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
>> +		goto out_devl;
>> +	}
>> +
>> +	if (nsim_dev_port == peer_dev_port) {
>> +		pr_err("Cannot link netdevsim to itself\n");
>> +		goto out_devl;
>> +	}
>> +
>> +	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
>> +	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
>> +	ret = count;
>> +
>> +out_devl:
>> +	rtnl_unlock();
>> +	devl_unlock(priv_to_devlink(peer_dev));
>> +out_mutex:
>> +	mutex_unlock(&nsim_dev_list_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct file_operations nsim_dev_peer_fops = {
>> +	.open = simple_open,
>> +	.read = nsim_dev_peer_read,
>> +	.write = nsim_dev_peer_write,
>> +	.llseek = generic_file_llseek,
>> +	.owner = THIS_MODULE,
>> +};
>> +
>> static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
>> 				      struct nsim_dev_port *nsim_dev_port)
>> {
>> @@ -421,6 +539,9 @@ static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
>> 	}
>> 	debugfs_create_symlink("dev", nsim_dev_port->ddir, dev_link_name);
>>
>> +	debugfs_create_file("peer", 0600, nsim_dev_port->ddir,
>> +			    nsim_dev_port, &nsim_dev_peer_fops);
>> +
>> 	return 0;
>> }
>>
>> @@ -1704,19 +1825,6 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
>> 	dev_set_drvdata(&nsim_bus_dev->dev, NULL);
>> }
>>
>> -static struct nsim_dev_port *
>> -__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>> -		       unsigned int port_index)
>> -{
>> -	struct nsim_dev_port *nsim_dev_port;
>> -
>> -	port_index = nsim_dev_port_index(type, port_index);
>> -	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>> -		if (nsim_dev_port->port_index == port_index)
>> -			return nsim_dev_port;
>> -	return NULL;
>> -}
>> -
>> int nsim_drv_port_add(struct nsim_bus_dev *nsim_bus_dev, enum nsim_dev_port_type type,
>> 		      unsigned int port_index)
>> {
>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>> index aecaf5f44374..434322f6a565 100644
>> --- a/drivers/net/netdevsim/netdev.c
>> +++ b/drivers/net/netdevsim/netdev.c
>> @@ -388,6 +388,7 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
>> 	ns->nsim_dev = nsim_dev;
>> 	ns->nsim_dev_port = nsim_dev_port;
>> 	ns->nsim_bus_dev = nsim_dev->nsim_bus_dev;
>> +	RCU_INIT_POINTER(ns->peer, NULL);
>> 	SET_NETDEV_DEV(dev, &ns->nsim_bus_dev->dev);
>> 	SET_NETDEV_DEVLINK_PORT(dev, &nsim_dev_port->devlink_port);
>> 	nsim_ethtool_init(ns);
>> @@ -407,8 +408,13 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
>> void nsim_destroy(struct netdevsim *ns)
>> {
>> 	struct net_device *dev = ns->netdev;
>> +	struct netdevsim *peer;
>>
>> 	rtnl_lock();
>> +	peer = rtnl_dereference(ns->peer);
>> +	if (peer)
>> +		RCU_INIT_POINTER(peer->peer, NULL);
>> +	RCU_INIT_POINTER(ns->peer, NULL);
>> 	unregister_netdevice(dev);
>> 	if (nsim_dev_port_is_pf(ns->nsim_dev_port)) {
>> 		nsim_macsec_teardown(ns);
>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>> index babb61d7790b..24fc3fbda791 100644
>> --- a/drivers/net/netdevsim/netdevsim.h
>> +++ b/drivers/net/netdevsim/netdevsim.h
>> @@ -125,6 +125,7 @@ struct netdevsim {
>> 	} udp_ports;
>>
>> 	struct nsim_ethtool ethtool;
>> +	struct netdevsim __rcu *peer;
>> };
>>
>> struct netdevsim *
>> -- 
>> 2.39.3
>>


* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2024-01-02 11:20   ` Eric Dumazet
@ 2024-01-03 21:57     ` David Wei
  0 siblings, 0 replies; 20+ messages in thread
From: David Wei @ 2024-01-03 21:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, Jiri Pirko, Sabrina Dubroca, netdev,
	David S. Miller, Paolo Abeni

On 2024-01-02 03:20, Eric Dumazet wrote:
> On Thu, Dec 28, 2023 at 2:46 AM David Wei <dw@davidwei.uk> wrote:
>>
>> Forward skbs sent from one netdevsim port to its connected netdevsim
>> port using dev_forward_skb, in a spirit similar to veth.
>>
>> Add a tx_dropped variable to struct netdevsim, tracking the number of
>> skbs that could not be forwarded using dev_forward_skb().
>>
>> The xmit() function accessing the peer ptr is protected by an RCU read
>> critical section. The rcu_read_lock() is functionally redundant as since
>> v5.0 all softirqs are implicitly RCU read critical sections; but it is
>> useful for human readers.
>>
>> If another CPU is concurrently in nsim_destroy(), then it will first set
>> the peer ptr to NULL. This does not affect any existing readers that
>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
>> a synchronize_rcu() before the netdev is actually unregistered and
>> freed. This ensures that any readers i.e. xmit() that got a non-NULL
>> peer will complete before the netdev is freed.
>>
>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
>> will dereference NULL, making it safe.
>>
>> The codepath to nsim_destroy() and nsim_create() takes both the newly
>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with
>> concurrent calls to linking two netdevsims together.
>>
>> Signed-off-by: David Wei <dw@davidwei.uk>
>> ---
>>  drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
>>  drivers/net/netdevsim/netdevsim.h |  1 +
>>  2 files changed, 19 insertions(+), 3 deletions(-)
>>
> 
> 
>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
>>         eth_hw_addr_random(dev);
>>
>>         dev->tx_queue_len = 0;
>> -       dev->flags |= IFF_NOARP;
> 
> This part seems to be unrelated to this patch ?

Hi Eric, I found that this change is needed for skb forwarding to work.
Would you prefer me splitting this change into its own patch?

> 
>>         dev->flags &= ~IFF_MULTICAST;
>>         dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
>>                            IFF_NO_QUEUE;


* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2024-01-02 11:13   ` Jiri Pirko
@ 2024-01-03 22:36     ` David Wei
  2024-01-04  9:31       ` Jiri Pirko
  0 siblings, 1 reply; 20+ messages in thread
From: David Wei @ 2024-01-03 22:36 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On 2024-01-02 03:13, Jiri Pirko wrote:
> Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote:
>> Forward skbs sent from one netdevsim port to its connected netdevsim
>> port using dev_forward_skb, in a spirit similar to veth.
>>
>> Add a tx_dropped variable to struct netdevsim, tracking the number of
>> skbs that could not be forwarded using dev_forward_skb().
>>
>> The xmit() function accessing the peer ptr is protected by an RCU read
>> critical section. The rcu_read_lock() is functionally redundant as since
>> v5.0 all softirqs are implicitly RCU read critical sections; but it is
>> useful for human readers.
>>
>> If another CPU is concurrently in nsim_destroy(), then it will first set
>> the peer ptr to NULL. This does not affect any existing readers that
>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
>> a synchronize_rcu() before the netdev is actually unregistered and
>> freed. This ensures that any readers i.e. xmit() that got a non-NULL
>> peer will complete before the netdev is freed.
>>
>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
>> will dereference NULL, making it safe.
>>
>> The codepath to nsim_destroy() and nsim_create() takes both the newly
>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with
> 
> I don't see the rtnl_lock taken in those functions.
> 
> 
> Otherwise, this patch looks fine to me.

For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For
nsim_destroy(), rtnl_lock is taken directly in the function.

What I mean here is, in the netdevsim device modification paths locks
are taken in this order:

devl_lock -> rtnl_lock

nsim_dev_list_lock is taken outside (not nested) of these.

In nsim_dev_peer_write() where two ports are linked, locks are taken in
this order:

nsim_dev_list_lock -> devl_lock -> rtnl_lock

This will not cause deadlocks and ensures that two ports being linked
are both valid.
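
The ordering argument above can be sketched in userspace. This is only
an illustration under stated assumptions: the three pthread mutexes
stand in for nsim_dev_list_lock, the devlink instance lock and
rtnl_lock, and the rank check is a hypothetical, much simplified
imitation of what lockdep verifies. None of the sim_* names exist in
the kernel.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical ranks: a lock may only be taken above the held rank. */
enum sim_rank { SIM_RANK_DEV_LIST = 1, SIM_RANK_DEVL = 2, SIM_RANK_RTNL = 3 };

struct sim_lock {
	pthread_mutex_t mutex;
	int rank;
};

static struct sim_lock sim_dev_list = { PTHREAD_MUTEX_INITIALIZER, SIM_RANK_DEV_LIST };
static struct sim_lock sim_devl = { PTHREAD_MUTEX_INITIALIZER, SIM_RANK_DEVL };
static struct sim_lock sim_rtnl = { PTHREAD_MUTEX_INITIALIZER, SIM_RANK_RTNL };

/* Highest rank held by the calling thread; 0 means no lock is held. */
static _Thread_local int sim_held_rank;

/* Refuse (without locking) any acquisition that would invert the order. */
static bool sim_lock_acquire(struct sim_lock *l, int *prev)
{
	if (l->rank <= sim_held_rank)
		return false;
	pthread_mutex_lock(&l->mutex);
	*prev = sim_held_rank;
	sim_held_rank = l->rank;
	return true;
}

static void sim_lock_release(struct sim_lock *l, int prev)
{
	sim_held_rank = prev;
	pthread_mutex_unlock(&l->mutex);
}

/* Linking path: dev list lock -> devlink lock -> rtnl. */
bool sim_link_path(void)
{
	int p0 = 0, p1 = 0, p2 = 0;
	bool ok = false;

	if (!sim_lock_acquire(&sim_dev_list, &p0))
		return false;
	if (!sim_lock_acquire(&sim_devl, &p1))
		goto out_unlock_dev_list;
	if (!sim_lock_acquire(&sim_rtnl, &p2))
		goto out_unlock_devl;
	ok = true;	/* the two peer pointers would be assigned here */
	sim_lock_release(&sim_rtnl, p2);
out_unlock_devl:
	sim_lock_release(&sim_devl, p1);
out_unlock_dev_list:
	sim_lock_release(&sim_dev_list, p0);
	return ok;
}

/* Device modification path: devlink lock -> rtnl. */
bool sim_modify_path(void)
{
	int p0 = 0, p1 = 0;
	bool ok;

	if (!sim_lock_acquire(&sim_devl, &p0))
		return false;
	ok = sim_lock_acquire(&sim_rtnl, &p1);
	if (ok)
		sim_lock_release(&sim_rtnl, p1);
	sim_lock_release(&sim_devl, p0);
	return ok;
}

/* Deliberately wrong order (rtnl -> devlink lock); the check rejects it. */
bool sim_inverted_path(void)
{
	int p0 = 0, p1 = 0;
	bool ok;

	if (!sim_lock_acquire(&sim_rtnl, &p0))
		return false;
	ok = sim_lock_acquire(&sim_devl, &p1);
	if (ok)
		sim_lock_release(&sim_devl, p1);
	sim_lock_release(&sim_rtnl, p0);
	return ok;
}
```

Both sim_link_path() and sim_modify_path() respect the same partial
order, so neither can wait on a lock the other holds out of order.
sim_inverted_path() shows the pattern that would have to exist
somewhere for an ABBA deadlock to be possible.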

> 
> 
>> concurrent calls to linking two netdevsims together.
>>
>> Signed-off-by: David Wei <dw@davidwei.uk>
>> ---
>> drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
>> drivers/net/netdevsim/netdevsim.h |  1 +
>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>> index 434322f6a565..0009d0f1243f 100644
>> --- a/drivers/net/netdevsim/netdev.c
>> +++ b/drivers/net/netdevsim/netdev.c
>> @@ -29,19 +29,34 @@
>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
>> {
>> 	struct netdevsim *ns = netdev_priv(dev);
>> +	struct netdevsim *peer_ns;
>> +	int ret = NETDEV_TX_OK;
>>
>> 	if (!nsim_ipsec_tx(ns, skb))
>> 		goto out;
>>
>> +	rcu_read_lock();
>> +	peer_ns = rcu_dereference(ns->peer);
>> +	if (!peer_ns)
>> +		goto out_stats;
>> +
>> +	skb_tx_timestamp(skb);
>> +	if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP))
>> +		ret = NET_XMIT_DROP;
>> +
>> +out_stats:
>> +	rcu_read_unlock();
>> 	u64_stats_update_begin(&ns->syncp);
>> 	ns->tx_packets++;
>> 	ns->tx_bytes += skb->len;
>> +	if (ret == NET_XMIT_DROP)
>> +		ns->tx_dropped++;
>> 	u64_stats_update_end(&ns->syncp);
>> +	return ret;
>>
>> out:
>> 	dev_kfree_skb(skb);
>> -
>> -	return NETDEV_TX_OK;
>> +	return ret;
>> }
>>
>> static void nsim_set_rx_mode(struct net_device *dev)
>> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>> 		start = u64_stats_fetch_begin(&ns->syncp);
>> 		stats->tx_bytes = ns->tx_bytes;
>> 		stats->tx_packets = ns->tx_packets;
>> +		stats->tx_dropped = ns->tx_dropped;
>> 	} while (u64_stats_fetch_retry(&ns->syncp, start));
>> }
>>
>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
>> 	eth_hw_addr_random(dev);
>>
>> 	dev->tx_queue_len = 0;
>> -	dev->flags |= IFF_NOARP;
>> 	dev->flags &= ~IFF_MULTICAST;
>> 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
>> 			   IFF_NO_QUEUE;
>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>> index 24fc3fbda791..083b1ee7a1a2 100644
>> --- a/drivers/net/netdevsim/netdevsim.h
>> +++ b/drivers/net/netdevsim/netdevsim.h
>> @@ -98,6 +98,7 @@ struct netdevsim {
>>
>> 	u64 tx_packets;
>> 	u64 tx_bytes;
>> +	u64 tx_dropped;
>> 	struct u64_stats_sync syncp;
>>
>> 	struct nsim_bus_dev *nsim_bus_dev;
>> -- 
>> 2.39.3
>>


* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2023-12-28  1:46 ` [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected David Wei
  2024-01-02 11:11   ` Jiri Pirko
@ 2024-01-04  1:39   ` Jakub Kicinski
  2024-01-09 16:57     ` David Wei
  1 sibling, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2024-01-04  1:39 UTC (permalink / raw)
  To: David Wei
  Cc: Jiri Pirko, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On Wed, 27 Dec 2023 17:46:30 -0800 David Wei wrote:
> +static ssize_t nsim_dev_peer_write(struct file *file,
> +				   const char __user *data,
> +				   size_t count, loff_t *ppos)
> +{
> +	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
> +	struct nsim_dev *peer_dev;
> +	unsigned int id, port;
> +	char buf[22];
> +	ssize_t ret;
> +
> +	if (count >= sizeof(buf))
> +		return -ENOSPC;
> +
> +	ret = copy_from_user(buf, data, count);
> +	if (ret)
> +		return -EFAULT;
> +	buf[count] = '\0';
> +
> +	ret = sscanf(buf, "%u %u", &id, &port);
> +	if (ret != 2) {
> +		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");

netif_err() or dev_err() ? Granted the rest of the file seems to use
pr_err(), but I'm not sure why...

> +		return -EINVAL;
> +	}

Could you put a sleep() here and test removing the device while some
thread is stuck here? I don't recall exactly but I thought debugfs
remove waits for concurrent reads and writes which could be problematic
given we take all the locks under the sun here..

> +	ret = -EINVAL;
> +	mutex_lock(&nsim_dev_list_lock);
> +	peer_dev = nsim_dev_find_by_id(id);
> +	if (!peer_dev) {
> +		pr_err("Peer netdevsim %u does not exist\n", id);
> +		goto out_mutex;
> +	}
> +
> +	devl_lock(priv_to_devlink(peer_dev));
> +	rtnl_lock();
> +	nsim_dev_port = file->private_data;
> +	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
> +					       port);
> +	if (!peer_dev_port) {
> +		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
> +		goto out_devl;
> +	}
> +
> +	if (nsim_dev_port == peer_dev_port) {
> +		pr_err("Cannot link netdevsim to itself\n");
> +		goto out_devl;
> +	}
> +
> +	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
> +	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
> +	ret = count;
> +
> +out_devl:

out_unlock_rtnl

> +	rtnl_unlock();
> +	devl_unlock(priv_to_devlink(peer_dev));
> +out_mutex:

out_unlock_dev_list

> +	mutex_unlock(&nsim_dev_list_lock);
> +
> +	return ret;
> +}
> +
> +static const struct file_operations nsim_dev_peer_fops = {
> +	.open = simple_open,
> +	.read = nsim_dev_peer_read,
> +	.write = nsim_dev_peer_write,
> +	.llseek = generic_file_llseek,

You don't support seek, you want some form of no_seek here.

> +	.owner = THIS_MODULE,
> +};


* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2024-01-03 21:56     ` David Wei
@ 2024-01-04  9:30       ` Jiri Pirko
  0 siblings, 0 replies; 20+ messages in thread
From: Jiri Pirko @ 2024-01-04  9:30 UTC (permalink / raw)
  To: David Wei
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

Wed, Jan 03, 2024 at 10:56:36PM CET, dw@davidwei.uk wrote:
>On 2024-01-02 03:11, Jiri Pirko wrote:
>> Thu, Dec 28, 2023 at 02:46:30AM CET, dw@davidwei.uk wrote:
>>> Add a debugfs file in
>>> /sys/kernel/debug/netdevsim/netdevsimN/ports/A/peer
>>>
>>> Writing "M B" to this file will link port A of netdevsim N with port B
>>> of netdevsim M. Reading this file will return the linked netdevsim id
>>> and port, if any.
>>>
>>> During nsim_dev_peer_write(), nsim_dev_list_lock prevents concurrent
>>> modifications to nsim_dev and peer's devlink->lock prevents concurrent
>>> modifications to the peer's port_list. rtnl_lock ensures netdevices do
>>> not change during the critical section where a link is established.
>>>
>>> The lock order is consistent with other parts that touch netdevsim and
>>> should not deadlock.
>>>
>>> During nsim_dev_peer_read(), RCU read critical section ensures valid
>>> values even if stale.
>>>
>>> Signed-off-by: David Wei <dw@davidwei.uk>
>>> ---
>>> drivers/net/netdevsim/dev.c       | 134 +++++++++++++++++++++++++++---
>>> drivers/net/netdevsim/netdev.c    |   6 ++
>>> drivers/net/netdevsim/netdevsim.h |   1 +
>>> 3 files changed, 128 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>>> index 8d477aa99f94..6d5e4ce08dfd 100644
>>> --- a/drivers/net/netdevsim/dev.c
>>> +++ b/drivers/net/netdevsim/dev.c
>>> @@ -391,6 +391,124 @@ static const struct file_operations nsim_dev_rate_parent_fops = {
>>> 	.owner = THIS_MODULE,
>>> };
>>>
>>> +static struct nsim_dev *nsim_dev_find_by_id(unsigned int id)
>>> +{
>>> +	struct nsim_dev *dev;
>>> +
>>> +	list_for_each_entry(dev, &nsim_dev_list, list)
>>> +		if (dev->nsim_bus_dev->dev.id == id)
>>> +			return dev;
>>> +
>>> +	return NULL;
>>> +}
>>> +
>>> +static struct nsim_dev_port *
>>> +__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>>> +		       unsigned int port_index)
>>> +{
>>> +	struct nsim_dev_port *nsim_dev_port;
>>> +
>>> +	port_index = nsim_dev_port_index(type, port_index);
>>> +	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>>> +		if (nsim_dev_port->port_index == port_index)
>>> +			return nsim_dev_port;
>>> +	return NULL;
>>> +}
>>> +
>>> +static ssize_t nsim_dev_peer_read(struct file *file, char __user *data,
>>> +				  size_t count, loff_t *ppos)
>>> +{
>>> +	struct nsim_dev_port *nsim_dev_port;
>>> +	struct netdevsim *peer;
>>> +	unsigned int id, port;
>>> +	ssize_t ret = 0;
>>> +	char buf[23];
>>> +
>>> +	nsim_dev_port = file->private_data;
>>> +	rcu_read_lock();
>>> +	peer = rcu_dereference(nsim_dev_port->ns->peer);
>>> +	if (!peer) {
>>> +		rcu_read_unlock();
>>> +		return 0;
>>> +	}
>>> +
>>> +	id = peer->nsim_bus_dev->dev.id;
>>> +	port = peer->nsim_dev_port->port_index;
>>> +	ret = scnprintf(buf, sizeof(buf), "%u %u\n", id, port);
>>> +	ret = simple_read_from_buffer(data, count, ppos, buf, ret);
>>> +
>>> +	rcu_read_unlock();
>>> +	return ret;
>>> +}
>>> +
>>> +static ssize_t nsim_dev_peer_write(struct file *file,
>>> +				   const char __user *data,
>>> +				   size_t count, loff_t *ppos)
>>> +{
>>> +	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
>>> +	struct nsim_dev *peer_dev;
>>> +	unsigned int id, port;
>>> +	char buf[22];
>>> +	ssize_t ret;
>>> +
>>> +	if (count >= sizeof(buf))
>>> +		return -ENOSPC;
>>> +
>>> +	ret = copy_from_user(buf, data, count);
>>> +	if (ret)
>>> +		return -EFAULT;
>>> +	buf[count] = '\0';
>>> +
>>> +	ret = sscanf(buf, "%u %u", &id, &port);
>>> +	if (ret != 2) {
>>> +		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	ret = -EINVAL;
>>> +	mutex_lock(&nsim_dev_list_lock);
>>> +	peer_dev = nsim_dev_find_by_id(id);
>>> +	if (!peer_dev) {
>>> +		pr_err("Peer netdevsim %u does not exist\n", id);
>>> +		goto out_mutex;
>>> +	}
>>> +
>>> +	devl_lock(priv_to_devlink(peer_dev));
>> 
>> Why exactly do you take devlink instance mutex of the peer here?
>
>To make sure that the port list does not change. Ports can be added or removed
>at will from nsim_drv_port_add() and nsim_drv_port_del() which both take
>the devlink lock.

Ok.

>
>> 
>> 
>>> +	rtnl_lock();
>>> +	nsim_dev_port = file->private_data;
>>> +	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
>>> +					       port);
>>> +	if (!peer_dev_port) {
>>> +		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
>>> +		goto out_devl;
>>> +	}
>>> +
>>> +	if (nsim_dev_port == peer_dev_port) {
>>> +		pr_err("Cannot link netdevsim to itself\n");
>>> +		goto out_devl;
>>> +	}
>>> +
>>> +	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
>>> +	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
>>> +	ret = count;
>>> +
>>> +out_devl:
>>> +	rtnl_unlock();
>>> +	devl_unlock(priv_to_devlink(peer_dev));
>>> +out_mutex:
>>> +	mutex_unlock(&nsim_dev_list_lock);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static const struct file_operations nsim_dev_peer_fops = {
>>> +	.open = simple_open,
>>> +	.read = nsim_dev_peer_read,
>>> +	.write = nsim_dev_peer_write,
>>> +	.llseek = generic_file_llseek,
>>> +	.owner = THIS_MODULE,
>>> +};
>>> +
>>> static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
>>> 				      struct nsim_dev_port *nsim_dev_port)
>>> {
>>> @@ -421,6 +539,9 @@ static int nsim_dev_port_debugfs_init(struct nsim_dev *nsim_dev,
>>> 	}
>>> 	debugfs_create_symlink("dev", nsim_dev_port->ddir, dev_link_name);
>>>
>>> +	debugfs_create_file("peer", 0600, nsim_dev_port->ddir,
>>> +			    nsim_dev_port, &nsim_dev_peer_fops);
>>> +
>>> 	return 0;
>>> }
>>>
>>> @@ -1704,19 +1825,6 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
>>> 	dev_set_drvdata(&nsim_bus_dev->dev, NULL);
>>> }
>>>
>>> -static struct nsim_dev_port *
>>> -__nsim_dev_port_lookup(struct nsim_dev *nsim_dev, enum nsim_dev_port_type type,
>>> -		       unsigned int port_index)
>>> -{
>>> -	struct nsim_dev_port *nsim_dev_port;
>>> -
>>> -	port_index = nsim_dev_port_index(type, port_index);
>>> -	list_for_each_entry(nsim_dev_port, &nsim_dev->port_list, list)
>>> -		if (nsim_dev_port->port_index == port_index)
>>> -			return nsim_dev_port;
>>> -	return NULL;
>>> -}
>>> -
>>> int nsim_drv_port_add(struct nsim_bus_dev *nsim_bus_dev, enum nsim_dev_port_type type,
>>> 		      unsigned int port_index)
>>> {
>>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>>> index aecaf5f44374..434322f6a565 100644
>>> --- a/drivers/net/netdevsim/netdev.c
>>> +++ b/drivers/net/netdevsim/netdev.c
>>> @@ -388,6 +388,7 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
>>> 	ns->nsim_dev = nsim_dev;
>>> 	ns->nsim_dev_port = nsim_dev_port;
>>> 	ns->nsim_bus_dev = nsim_dev->nsim_bus_dev;
>>> +	RCU_INIT_POINTER(ns->peer, NULL);
>>> 	SET_NETDEV_DEV(dev, &ns->nsim_bus_dev->dev);
>>> 	SET_NETDEV_DEVLINK_PORT(dev, &nsim_dev_port->devlink_port);
>>> 	nsim_ethtool_init(ns);
>>> @@ -407,8 +408,13 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
>>> void nsim_destroy(struct netdevsim *ns)
>>> {
>>> 	struct net_device *dev = ns->netdev;
>>> +	struct netdevsim *peer;
>>>
>>> 	rtnl_lock();
>>> +	peer = rtnl_dereference(ns->peer);
>>> +	if (peer)
>>> +		RCU_INIT_POINTER(peer->peer, NULL);
>>> +	RCU_INIT_POINTER(ns->peer, NULL);
>>> 	unregister_netdevice(dev);
>>> 	if (nsim_dev_port_is_pf(ns->nsim_dev_port)) {
>>> 		nsim_macsec_teardown(ns);
>>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>>> index babb61d7790b..24fc3fbda791 100644
>>> --- a/drivers/net/netdevsim/netdevsim.h
>>> +++ b/drivers/net/netdevsim/netdevsim.h
>>> @@ -125,6 +125,7 @@ struct netdevsim {
>>> 	} udp_ports;
>>>
>>> 	struct nsim_ethtool ethtool;
>>> +	struct netdevsim __rcu *peer;
>>> };
>>>
>>> struct netdevsim *
>>> -- 
>>> 2.39.3
>>>

* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2024-01-03 22:36     ` David Wei
@ 2024-01-04  9:31       ` Jiri Pirko
  2024-01-09 16:58         ` David Wei
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2024-01-04  9:31 UTC (permalink / raw)
  To: David Wei
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

Wed, Jan 03, 2024 at 11:36:36PM CET, dw@davidwei.uk wrote:
>On 2024-01-02 03:13, Jiri Pirko wrote:
>> Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote:
>>> Forward skbs sent from one netdevsim port to its connected netdevsim
>>> port using dev_forward_skb, in a spirit similar to veth.
>>>
>>> Add a tx_dropped variable to struct netdevsim, tracking the number of
>>> skbs that could not be forwarded using dev_forward_skb().
>>>
>>> The xmit() function accessing the peer ptr is protected by an RCU read
>>> critical section. The rcu_read_lock() is functionally redundant because,
>>> since v5.0, all softirqs are implicitly RCU read critical sections, but it
>>> is useful for human readers.
>>>
>>> If another CPU is concurrently in nsim_destroy(), then it will first set
>>> the peer ptr to NULL. This does not affect any existing readers that
>>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
>>> a synchronize_rcu() before the netdev is actually unregistered and
>>> freed. This ensures that any readers i.e. xmit() that got a non-NULL
>>> peer will complete before the netdev is freed.
>>>
>>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
>>> will dereference NULL, making it safe.
>>>
>>> The codepath to nsim_destroy() and nsim_create() takes both the newly
>>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with
>> 
>> I don't see the rtnl_lock taken in those functions.
>> 
>> 
>> Otherwise, this patch looks fine to me.
>
>For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For
>nsim_destroy(), rtnl_lock is taken directly in the function.
>
>What I mean here is, in the netdevsim device modification paths locks
>are taken in this order:
>
>devl_lock -> rtnl_lock
>
>nsim_dev_list_lock is taken outside (not nested) of these.
>
>In nsim_dev_peer_write() where two ports are linked, locks are taken in
>this order:
>
>nsim_dev_list_lock -> devl_lock -> rtnl_lock
>
>This will not cause deadlocks and ensures that two ports being linked
>are both valid.

Okay. Perhaps it would be good to document this in a comment somewhere in
the code?


>
>> 
>> 
>>> concurrent calls to linking two netdevsims together.
>>>
>>> Signed-off-by: David Wei <dw@davidwei.uk>
>>> ---
>>> drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
>>> drivers/net/netdevsim/netdevsim.h |  1 +
>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>>> index 434322f6a565..0009d0f1243f 100644
>>> --- a/drivers/net/netdevsim/netdev.c
>>> +++ b/drivers/net/netdevsim/netdev.c
>>> @@ -29,19 +29,34 @@
>>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>> {
>>> 	struct netdevsim *ns = netdev_priv(dev);
>>> +	struct netdevsim *peer_ns;
>>> +	int ret = NETDEV_TX_OK;
>>>
>>> 	if (!nsim_ipsec_tx(ns, skb))
>>> 		goto out;
>>>
>>> +	rcu_read_lock();
>>> +	peer_ns = rcu_dereference(ns->peer);
>>> +	if (!peer_ns)
>>> +		goto out_stats;
>>> +
>>> +	skb_tx_timestamp(skb);
>>> +	if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP))
>>> +		ret = NET_XMIT_DROP;
>>> +
>>> +out_stats:
>>> +	rcu_read_unlock();
>>> 	u64_stats_update_begin(&ns->syncp);
>>> 	ns->tx_packets++;
>>> 	ns->tx_bytes += skb->len;
>>> +	if (ret == NET_XMIT_DROP)
>>> +		ns->tx_dropped++;
>>> 	u64_stats_update_end(&ns->syncp);
>>> +	return ret;
>>>
>>> out:
>>> 	dev_kfree_skb(skb);
>>> -
>>> -	return NETDEV_TX_OK;
>>> +	return ret;
>>> }
>>>
>>> static void nsim_set_rx_mode(struct net_device *dev)
>>> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>>> 		start = u64_stats_fetch_begin(&ns->syncp);
>>> 		stats->tx_bytes = ns->tx_bytes;
>>> 		stats->tx_packets = ns->tx_packets;
>>> +		stats->tx_dropped = ns->tx_dropped;
>>> 	} while (u64_stats_fetch_retry(&ns->syncp, start));
>>> }
>>>
>>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
>>> 	eth_hw_addr_random(dev);
>>>
>>> 	dev->tx_queue_len = 0;
>>> -	dev->flags |= IFF_NOARP;
>>> 	dev->flags &= ~IFF_MULTICAST;
>>> 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
>>> 			   IFF_NO_QUEUE;
>>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>>> index 24fc3fbda791..083b1ee7a1a2 100644
>>> --- a/drivers/net/netdevsim/netdevsim.h
>>> +++ b/drivers/net/netdevsim/netdevsim.h
>>> @@ -98,6 +98,7 @@ struct netdevsim {
>>>
>>> 	u64 tx_packets;
>>> 	u64 tx_bytes;
>>> +	u64 tx_dropped;
>>> 	struct u64_stats_sync syncp;
>>>
>>> 	struct nsim_bus_dev *nsim_bus_dev;
>>> -- 
>>> 2.39.3
>>>

* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2024-01-04  1:39   ` Jakub Kicinski
@ 2024-01-09 16:57     ` David Wei
  2024-01-10  1:53       ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: David Wei @ 2024-01-09 16:57 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jiri Pirko, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On 2024-01-03 17:39, Jakub Kicinski wrote:
> On Wed, 27 Dec 2023 17:46:30 -0800 David Wei wrote:
>> +static ssize_t nsim_dev_peer_write(struct file *file,
>> +				   const char __user *data,
>> +				   size_t count, loff_t *ppos)
>> +{
>> +	struct nsim_dev_port *nsim_dev_port, *peer_dev_port;
>> +	struct nsim_dev *peer_dev;
>> +	unsigned int id, port;
>> +	char buf[22];
>> +	ssize_t ret;
>> +
>> +	if (count >= sizeof(buf))
>> +		return -ENOSPC;
>> +
>> +	ret = copy_from_user(buf, data, count);
>> +	if (ret)
>> +		return -EFAULT;
>> +	buf[count] = '\0';
>> +
>> +	ret = sscanf(buf, "%u %u", &id, &port);
>> +	if (ret != 2) {
>> +		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");
> 
> netif_err() or dev_err() ? Granted the rest of the file seems to use
> pr_err(), but I'm not sure why...

I can change it to use one of these two in this patchset, then I can
change the others separately in another patch. How does that sound?

> 
>> +		return -EINVAL;
>> +	}
> 
> Could you put a sleep() here and test removing the device while some
> thread is stuck here? I don't recall exactly but I thought debugfs
> remove waits for concurrent reads and writes which could be problematic
> given we take all the locks under the sun here..

Yep, I'll test this.

> 
>> +	ret = -EINVAL;
>> +	mutex_lock(&nsim_dev_list_lock);
>> +	peer_dev = nsim_dev_find_by_id(id);
>> +	if (!peer_dev) {
>> +		pr_err("Peer netdevsim %u does not exist\n", id);
>> +		goto out_mutex;
>> +	}
>> +
>> +	devl_lock(priv_to_devlink(peer_dev));
>> +	rtnl_lock();
>> +	nsim_dev_port = file->private_data;
>> +	peer_dev_port = __nsim_dev_port_lookup(peer_dev, NSIM_DEV_PORT_TYPE_PF,
>> +					       port);
>> +	if (!peer_dev_port) {
>> +		pr_err("Peer netdevsim %u port %u does not exist\n", id, port);
>> +		goto out_devl;
>> +	}
>> +
>> +	if (nsim_dev_port == peer_dev_port) {
>> +		pr_err("Cannot link netdevsim to itself\n");
>> +		goto out_devl;
>> +	}
>> +
>> +	rcu_assign_pointer(nsim_dev_port->ns->peer, peer_dev_port->ns);
>> +	rcu_assign_pointer(peer_dev_port->ns->peer, nsim_dev_port->ns);
>> +	ret = count;
>> +
>> +out_devl:
> 
> out_unlock_rtnl
> 
>> +	rtnl_unlock();
>> +	devl_unlock(priv_to_devlink(peer_dev));
>> +out_mutex:
> 
> out_unlock_dev_list
> 
>> +	mutex_unlock(&nsim_dev_list_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct file_operations nsim_dev_peer_fops = {
>> +	.open = simple_open,
>> +	.read = nsim_dev_peer_read,
>> +	.write = nsim_dev_peer_write,
>> +	.llseek = generic_file_llseek,
> 
> You don't support seek, you want some form of no_seek here.
> 
>> +	.owner = THIS_MODULE,
>> +};
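
For illustration only (not part of the patch): a minimal userspace sketch
of the "id port" parsing and the buffer bound check discussed above.
parse_peer() and PEER_BUF_SZ are hypothetical names, and memcpy() stands in
for copy_from_user(). The 22-byte size matches buf[] in
nsim_dev_peer_write(): two 10-digit unsigned values, one space, and the
terminating NUL.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define PEER_BUF_SZ 22	/* same size as buf[] in nsim_dev_peer_write() */

/* Hypothetical userspace helper; the kernel code uses copy_from_user(). */
static int parse_peer(const char *data, size_t count,
		      unsigned int *id, unsigned int *port)
{
	char buf[PEER_BUF_SZ];

	assert(data && id && port);

	/* Leave room for the terminating NUL, as the patch does. */
	if (count >= sizeof(buf))
		return -ENOSPC;

	memcpy(buf, data, count);
	buf[count] = '\0';

	/* Both fields must parse, otherwise the write is rejected. */
	if (sscanf(buf, "%u %u", id, port) != 2)
		return -EINVAL;

	return 0;
}
```

With this sketch, "3 1\n" parses to id 3 and port 1, a lone "3" is rejected
with -EINVAL, and any input of 22 bytes or more is rejected with -ENOSPC
before it is copied.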

* Re: [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another
  2024-01-04  9:31       ` Jiri Pirko
@ 2024-01-09 16:58         ` David Wei
  0 siblings, 0 replies; 20+ messages in thread
From: David Wei @ 2024-01-09 16:58 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jakub Kicinski, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On 2024-01-04 01:31, Jiri Pirko wrote:
> Wed, Jan 03, 2024 at 11:36:36PM CET, dw@davidwei.uk wrote:
>> On 2024-01-02 03:13, Jiri Pirko wrote:
>>> Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote:
>>>> Forward skbs sent from one netdevsim port to its connected netdevsim
>>>> port using dev_forward_skb, in a spirit similar to veth.
>>>>
>>>> Add a tx_dropped variable to struct netdevsim, tracking the number of
>>>> skbs that could not be forwarded using dev_forward_skb().
>>>>
>>>> The xmit() function accessing the peer ptr is protected by an RCU read
>>>> critical section. The rcu_read_lock() is functionally redundant because,
>>>> since v5.0, all softirqs are implicitly RCU read critical sections, but it
>>>> is useful for human readers.
>>>>
>>>> If another CPU is concurrently in nsim_destroy(), then it will first set
>>>> the peer ptr to NULL. This does not affect any existing readers that
>>>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is
>>>> a synchronize_rcu() before the netdev is actually unregistered and
>>>> freed. This ensures that any readers i.e. xmit() that got a non-NULL
>>>> peer will complete before the netdev is freed.
>>>>
>>>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu()
>>>> will dereference NULL, making it safe.
>>>>
>>>> The codepath to nsim_destroy() and nsim_create() takes both the newly
>>>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with
>>>
>>> I don't see the rtnl_lock taken in those functions.
>>>
>>>
>>> Otherwise, this patch looks fine to me.
>>
>> For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For
>> nsim_destroy(), rtnl_lock is taken directly in the function.
>>
>> What I mean here is, in the netdevsim device modification paths locks
>> are taken in this order:
>>
>> devl_lock -> rtnl_lock
>>
>> nsim_dev_list_lock is taken outside (not nested) of these.
>>
>> In nsim_dev_peer_write() where two ports are linked, locks are taken in
>> this order:
>>
>> nsim_dev_list_lock -> devl_lock -> rtnl_lock
>>
>> This will not cause deadlocks and ensures that two ports being linked
>> are both valid.
> 
> Okay. Perhaps it would be good to document this in a comment somewhere in
> the code?

Yep, I'll add this.
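
As a rough idea of what such a comment could say, here is a userspace
sketch that models the ordering described above with pthread mutexes. It is
not the kernel code: the three mutexes are stand-ins for
nsim_dev_list_lock, the devlink instance lock and rtnl_lock, and
link_ports_sketch() and modify_device_sketch() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-ins for the kernel locks; all names are hypothetical. */
static pthread_mutex_t dev_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t devl_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rtnl_mutex = PTHREAD_MUTEX_INITIALIZER;

static int links_made;	/* only written with all three locks held */

static void lock_checked(pthread_mutex_t *m)
{
	int err = pthread_mutex_lock(m);

	assert(err == 0);
	(void)err;
}

static void unlock_checked(pthread_mutex_t *m)
{
	int err = pthread_mutex_unlock(m);

	assert(err == 0);
	(void)err;
}

/*
 * Lock ordering, outermost first:
 *
 *   nsim_dev_list_lock -> devl_lock -> rtnl_lock
 *
 * Device modification paths take devl_lock -> rtnl_lock and do not nest
 * nsim_dev_list_lock around them, so every path acquires the locks it
 * uses in the same relative order.
 */
static int link_ports_sketch(void)
{
	int n;

	lock_checked(&dev_list_mutex);
	lock_checked(&devl_mutex);
	lock_checked(&rtnl_mutex);
	n = ++links_made;		/* models the two rcu_assign_pointer() calls */
	unlock_checked(&rtnl_mutex);
	unlock_checked(&devl_mutex);
	unlock_checked(&dev_list_mutex);

	return n;
}

/* Models a port add/del path: only the devl_lock -> rtnl_lock suffix. */
static void modify_device_sketch(void)
{
	lock_checked(&devl_mutex);
	lock_checked(&rtnl_mutex);
	unlock_checked(&rtnl_mutex);
	unlock_checked(&devl_mutex);
}
```

Both functions acquire the locks they share in the same order, which is
the property that rules out an ABBA deadlock between them.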

> 
> 
>>
>>>
>>>
>>>> concurrent calls to linking two netdevsims together.
>>>>
>>>> Signed-off-by: David Wei <dw@davidwei.uk>
>>>> ---
>>>> drivers/net/netdevsim/netdev.c    | 21 ++++++++++++++++++---
>>>> drivers/net/netdevsim/netdevsim.h |  1 +
>>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>>>> index 434322f6a565..0009d0f1243f 100644
>>>> --- a/drivers/net/netdevsim/netdev.c
>>>> +++ b/drivers/net/netdevsim/netdev.c
>>>> @@ -29,19 +29,34 @@
>>>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>> {
>>>> 	struct netdevsim *ns = netdev_priv(dev);
>>>> +	struct netdevsim *peer_ns;
>>>> +	int ret = NETDEV_TX_OK;
>>>>
>>>> 	if (!nsim_ipsec_tx(ns, skb))
>>>> 		goto out;
>>>>
>>>> +	rcu_read_lock();
>>>> +	peer_ns = rcu_dereference(ns->peer);
>>>> +	if (!peer_ns)
>>>> +		goto out_stats;
>>>> +
>>>> +	skb_tx_timestamp(skb);
>>>> +	if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP))
>>>> +		ret = NET_XMIT_DROP;
>>>> +
>>>> +out_stats:
>>>> +	rcu_read_unlock();
>>>> 	u64_stats_update_begin(&ns->syncp);
>>>> 	ns->tx_packets++;
>>>> 	ns->tx_bytes += skb->len;
>>>> +	if (ret == NET_XMIT_DROP)
>>>> +		ns->tx_dropped++;
>>>> 	u64_stats_update_end(&ns->syncp);
>>>> +	return ret;
>>>>
>>>> out:
>>>> 	dev_kfree_skb(skb);
>>>> -
>>>> -	return NETDEV_TX_OK;
>>>> +	return ret;
>>>> }
>>>>
>>>> static void nsim_set_rx_mode(struct net_device *dev)
>>>> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>>>> 		start = u64_stats_fetch_begin(&ns->syncp);
>>>> 		stats->tx_bytes = ns->tx_bytes;
>>>> 		stats->tx_packets = ns->tx_packets;
>>>> +		stats->tx_dropped = ns->tx_dropped;
>>>> 	} while (u64_stats_fetch_retry(&ns->syncp, start));
>>>> }
>>>>
>>>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev)
>>>> 	eth_hw_addr_random(dev);
>>>>
>>>> 	dev->tx_queue_len = 0;
>>>> -	dev->flags |= IFF_NOARP;
>>>> 	dev->flags &= ~IFF_MULTICAST;
>>>> 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
>>>> 			   IFF_NO_QUEUE;
>>>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>>>> index 24fc3fbda791..083b1ee7a1a2 100644
>>>> --- a/drivers/net/netdevsim/netdevsim.h
>>>> +++ b/drivers/net/netdevsim/netdevsim.h
>>>> @@ -98,6 +98,7 @@ struct netdevsim {
>>>>
>>>> 	u64 tx_packets;
>>>> 	u64 tx_bytes;
>>>> +	u64 tx_dropped;
>>>> 	struct u64_stats_sync syncp;
>>>>
>>>> 	struct nsim_bus_dev *nsim_bus_dev;
>>>> -- 
>>>> 2.39.3
>>>>

* Re: [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected
  2024-01-09 16:57     ` David Wei
@ 2024-01-10  1:53       ` Jakub Kicinski
  0 siblings, 0 replies; 20+ messages in thread
From: Jakub Kicinski @ 2024-01-10  1:53 UTC (permalink / raw)
  To: David Wei
  Cc: Jiri Pirko, Sabrina Dubroca, netdev, David S. Miller,
	Eric Dumazet, Paolo Abeni

On Tue, 9 Jan 2024 08:57:59 -0800 David Wei wrote:
> >> +	ret = sscanf(buf, "%u %u", &id, &port);
> >> +	if (ret != 2) {
> >> +		pr_err("Format is peer netdevsim \"id port\" (uint uint)\n");  
> > 
> > netif_err() or dev_err() ? Granted the rest of the file seems to use
> > pr_err(), but I'm not sure why...  
> 
> I can change it to use one of these two in this patchset, then I can
> change the others separately in another patch. How does that sound?

Separate patch and separate series. Let's not load more unrelated
patches into this series :)

Thread overview: 20+ messages
2023-12-28  1:46 [PATCH net-next v5 0/5] netdevsim: link and forward skbs between ports David Wei
2023-12-28  1:46 ` [PATCH net-next v5 1/5] netdevsim: maintain a list of probed netdevsims David Wei
2024-01-02 11:04   ` Jiri Pirko
2024-01-03 21:48     ` David Wei
2023-12-28  1:46 ` [PATCH net-next v5 2/5] netdevsim: allow two netdevsim ports to be connected David Wei
2024-01-02 11:11   ` Jiri Pirko
2024-01-03 21:56     ` David Wei
2024-01-04  9:30       ` Jiri Pirko
2024-01-04  1:39   ` Jakub Kicinski
2024-01-09 16:57     ` David Wei
2024-01-10  1:53       ` Jakub Kicinski
2023-12-28  1:46 ` [PATCH net-next v5 3/5] netdevsim: forward skbs from one connected port to another David Wei
2024-01-02 11:13   ` Jiri Pirko
2024-01-03 22:36     ` David Wei
2024-01-04  9:31       ` Jiri Pirko
2024-01-09 16:58         ` David Wei
2024-01-02 11:20   ` Eric Dumazet
2024-01-03 21:57     ` David Wei
2023-12-28  1:46 ` [PATCH net-next v5 4/5] netdevsim: add selftest for forwarding skb between connected ports David Wei
2023-12-28  1:46 ` [PATCH net-next v5 5/5] netdevsim: add Makefile for selftests David Wei
