All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC net-next 0/2] explicit netdev refs
@ 2021-11-17 17:47 Jakub Kicinski
  2021-11-17 17:47 ` [RFC net-next 1/2] net: add netdev_refs debug Jakub Kicinski
  2021-11-17 17:47 ` [RFC net-next 2/2] vlan: use new netdev_refs infra Jakub Kicinski
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Kicinski @ 2021-11-17 17:47 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, Jakub Kicinski

syzbot keeps accumulating netdev ref leak errors. These are notoriously
hard to debug. We can't do much about all the old code at this point,
this set tries to provide an API for the new code to use.

It basically adds a explicitly reference counted pointer (there's probably
a better name for this). It's a structure which wraps the netdev pointer
and helps us doing sanity checking.

There isn't much that's netdev-specific here, but we probably still want
to keep our own wrappers even if the main struct gets generalized, so
that we can keep the helpers typed.

Sending as an RFC because vlan refcounting has a bug which needs to be
fixed first [1]. Explicit refs catch it and spew warnings appropriately.

[1] https://lore.kernel.org/all/87k0h9bb9x.fsf@nvidia.com/

Jakub Kicinski (2):
  net: add netdev_refs debug
  vlan: use new netdev_refs infra

 MAINTAINERS                                  |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c |   3 +-
 include/linux/if_vlan.h                      |  11 +-
 include/linux/netdev_refs.h                  | 104 +++++++++++++++++++
 lib/Kconfig.debug                            |   7 ++
 net/8021q/vlan.c                             |  13 +--
 net/8021q/vlan_core.c                        |   4 +-
 net/8021q/vlan_dev.c                         |  63 +++++------
 net/8021q/vlan_gvrp.c                        |   5 +-
 net/8021q/vlan_mvrp.c                        |   5 +-
 net/8021q/vlan_netlink.c                     |   4 +-
 net/8021q/vlanproc.c                         |   6 +-
 net/core/dev.c                               |   8 ++
 13 files changed, 186 insertions(+), 48 deletions(-)
 create mode 100644 include/linux/netdev_refs.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC net-next 1/2] net: add netdev_refs debug
  2021-11-17 17:47 [RFC net-next 0/2] explicit netdev refs Jakub Kicinski
@ 2021-11-17 17:47 ` Jakub Kicinski
  2021-11-17 18:24   ` Leon Romanovsky
  2021-11-17 19:13   ` Eric Dumazet
  2021-11-17 17:47 ` [RFC net-next 2/2] vlan: use new netdev_refs infra Jakub Kicinski
  1 sibling, 2 replies; 7+ messages in thread
From: Jakub Kicinski @ 2021-11-17 17:47 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, Jakub Kicinski

Debugging netdev ref leaks is still pretty hard. Eric added
optional use of a normal refcount which is useful for tracking
abuse of existing users.

For new code, however, it'd be great if we could actually track
the refs per-user. Allowing us to detect leaks where they happen.
This patch introduces a netdev_ref type and uses the debug_objects
infra to track refs being lost or misused.

In the future we can extend this structure to also catch those
who fail to release the ref on unregistering notification.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 MAINTAINERS                 |   1 +
 include/linux/netdev_refs.h | 104 ++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug           |   7 +++
 net/core/dev.c              |   8 +++
 4 files changed, 120 insertions(+)
 create mode 100644 include/linux/netdev_refs.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 4c74516e4353..47fe27175c9f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18482,6 +18482,7 @@ F:	include/uapi/linux/pkt_sched.h
 F:	include/uapi/linux/tc_act/
 F:	include/uapi/linux/tc_ematch/
 F:	net/sched/
+F:	tools/testing/selftests/tc-testing/
 
 TC90522 MEDIA DRIVER
 M:	Akihiro Tsukada <tskd08@gmail.com>
diff --git a/include/linux/netdev_refs.h b/include/linux/netdev_refs.h
new file mode 100644
index 000000000000..326772ea0a63
--- /dev/null
+++ b/include/linux/netdev_refs.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _LINUX_NETDEV_REFS_H
+#define _LINUX_NETDEV_REFS_H
+
+#include <linux/debugobjects.h>
+#include <linux/netdevice.h>
+
+/* Explicit netdevice references
+ * struct netdev_ref is a storage for a reference. It's equivalent
+ * to a netdev pointer, but when debug is enabled it performs extra checks.
+ * Most users will want to take a reference with netdev_hold(), access it
+ * via netdev_ref_ptr() and release with netdev_put().
+ */
+
+struct netdev_ref {
+	struct net_device *dev;
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	refcount_t cnt;
+#endif
+};
+
+extern const struct debug_obj_descr netdev_ref_debug_descr;
+
+/* Store a raw, unprotected pointer */
+static inline void __netdev_ref_store(struct netdev_ref *ref,
+				      struct net_device *dev)
+{
+	ref->dev = dev;
+
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	refcount_set(&ref->cnt, 0);
+	debug_object_init(ref, &netdev_ref_debug_descr);
+#endif
+}
+
+/* Convert a previously stored unprotected pointer to a normal ref */
+static inline void __netdev_hold_stored(struct netdev_ref *ref)
+{
+	dev_hold(ref->dev);
+
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	refcount_set(&ref->cnt, 1);
+	debug_object_activate(ref, &netdev_ref_debug_descr);
+#endif
+}
+
+/* Take a reference on a netdev and store it in @ref */
+static inline void netdev_hold(struct netdev_ref *ref, struct net_device *dev)
+{
+	__netdev_ref_store(ref, dev);
+	__netdev_hold_stored(ref);
+}
+
+/* Release a reference on a netdev previously acquired by netdev_hold() */
+static inline void netdev_put(struct netdev_ref *ref)
+{
+	dev_put(ref->dev);
+
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	WARN_ON(refcount_read(&ref->cnt) != 1);
+	debug_object_deactivate(ref, &netdev_ref_debug_descr);
+#endif
+}
+
+/* Increase refcount of a reference, reference must be valid -
+ * initialized by netdev_hold() or equivalent set of sub-functions.
+ */
+static inline void netdev_ref_get(struct netdev_ref *ref)
+{
+	dev_hold(ref->dev);
+
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	refcount_inc(&ref->cnt);
+#endif
+}
+
+/* Release a reference with unknown number of refs */
+static inline void netdev_ref_put(struct netdev_ref *ref)
+{
+	dev_put(ref->dev);
+
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	if (refcount_dec_and_test(&ref->cnt))
+		debug_object_deactivate(ref, &netdev_ref_debug_descr);
+#endif
+}
+
+/* Unprotected access to a pointer stored by __netdev_ref_store() */
+static inline struct net_device *__netdev_ref_ptr(const struct netdev_ref *ref)
+{
+	return ref->dev;
+}
+
+/* Netdev pointer access on a normal ref */
+static inline struct net_device *netdev_ref_ptr(const struct netdev_ref *ref)
+{
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+	WARN_ON(!refcount_read(&ref->cnt));
+#endif
+	return ref->dev;
+}
+
+#endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9ef7ce18b4f5..e07b1cbb4228 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -655,6 +655,13 @@ config DEBUG_OBJECTS_PERCPU_COUNTER
 	  percpu counter routines to track the life time of percpu counter
 	  objects and validate the percpu counter operations.
 
+config DEBUG_OBJECTS_NETDEV_REFS
+	bool "Debug net_device references"
+	depends on DEBUG_OBJECTS
+	help
+	  If you say Y here, additional code will be inserted into the
+	  net_device reference routines to catch incorrect use.
+
 config DEBUG_OBJECTS_ENABLE_DEFAULT
 	int "debug_objects bootup default value (0-1)"
 	range 0 1
diff --git a/net/core/dev.c b/net/core/dev.c
index 92c9258cbf28..c8c9be59de89 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -150,6 +150,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/prandom.h>
 #include <linux/once_lite.h>
+#include <linux/netdev_refs.h>
 
 #include "net-sysfs.h"
 
@@ -158,6 +159,13 @@ static DEFINE_SPINLOCK(ptype_lock);
 struct list_head ptype_base[PTYPE_HASH_SIZE] __read_mostly;
 struct list_head ptype_all __read_mostly;	/* Taps */
 
+const struct debug_obj_descr netdev_ref_debug_descr = {
+	.name		= "netdev_ref",
+};
+#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
+EXPORT_SYMBOL(netdev_ref_debug_descr);
+#endif
+
 static int netif_rx_internal(struct sk_buff *skb);
 static int call_netdevice_notifiers_info(unsigned long val,
 					 struct netdev_notifier_info *info);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC net-next 2/2] vlan: use new netdev_refs infra
  2021-11-17 17:47 [RFC net-next 0/2] explicit netdev refs Jakub Kicinski
  2021-11-17 17:47 ` [RFC net-next 1/2] net: add netdev_refs debug Jakub Kicinski
@ 2021-11-17 17:47 ` Jakub Kicinski
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2021-11-17 17:47 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, Jakub Kicinski

Example user of the new netdev_refs infra.

VLAN is not the simplest - the ref ptr is stored on the IOCTL path
and netlink path but only taken during netdev registration.

No changes to the assembly output with debug disabled,
save for the few places where explicit temporary variables
were added.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c |  3 +-
 include/linux/if_vlan.h                      | 11 +++-
 net/8021q/vlan.c                             | 13 ++--
 net/8021q/vlan_core.c                        |  4 +-
 net/8021q/vlan_dev.c                         | 63 ++++++++++----------
 net/8021q/vlan_gvrp.c                        |  5 +-
 net/8021q/vlan_mvrp.c                        |  5 +-
 net/8021q/vlan_netlink.c                     |  4 +-
 net/8021q/vlanproc.c                         |  6 +-
 9 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index e6a4a768b10b..459c17fc9287 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1250,9 +1250,10 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp,
 	dst_dev = rt->dst.dev;
 	if (is_vlan_dev(dst_dev)) {
 #if IS_ENABLED(CONFIG_VLAN_8021Q)
+		struct net_device *real_dev = __vlan_dev_real_dev(dst_dev);
 		struct vlan_dev_priv *vlan = vlan_dev_priv(dst_dev);
 
-		if (vlan->real_dev != real_dst_dev) {
+		if (real_dev != real_dst_dev) {
 			netdev_info(bp->dev,
 				    "dst_dev(%s) doesn't use PF-if(%s)\n",
 				    netdev_name(dst_dev),
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 41a518336673..bb46fa2b1327 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -8,6 +8,7 @@
 #define _LINUX_IF_VLAN_H_
 
 #include <linux/netdevice.h>
+#include <linux/netdev_refs.h>
 #include <linux/etherdevice.h>
 #include <linux/rtnetlink.h>
 #include <linux/bug.h>
@@ -176,7 +177,7 @@ struct vlan_dev_priv {
 	u16					vlan_id;
 	u16					flags;
 
-	struct net_device			*real_dev;
+	struct netdev_ref			real_dev;
 	unsigned char				real_dev_addr[ETH_ALEN];
 
 	struct proc_dir_entry			*dent;
@@ -191,6 +192,14 @@ static inline struct vlan_dev_priv *vlan_dev_priv(const struct net_device *dev)
 	return netdev_priv(dev);
 }
 
+static inline struct net_device *
+__vlan_dev_real_dev(const struct net_device *dev)
+{
+	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+	return netdev_ref_ptr(&vlan->real_dev);
+}
+
 static inline u16
 vlan_dev_get_egress_qos_mask(struct net_device *dev, u32 skprio)
 {
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index a3a0a5e994f5..b5a8677e89ad 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -89,7 +89,7 @@ static void vlan_stacked_transfer_operstate(const struct net_device *rootdev,
 void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 	struct vlan_info *vlan_info;
 	struct vlan_group *grp;
 	u16 vlan_id = vlan->vlan_id;
@@ -148,7 +148,7 @@ int vlan_check_real_dev(struct net_device *real_dev,
 int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = __netdev_ref_ptr(&vlan->real_dev);
 	u16 vlan_id = vlan->vlan_id;
 	struct vlan_info *vlan_info;
 	struct vlan_group *grp;
@@ -185,7 +185,7 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack)
 		goto out_unregister_netdev;
 
 	/* Account for reference in struct vlan_dev_priv */
-	dev_hold(real_dev);
+	__netdev_hold_stored(&vlan->real_dev);
 
 	vlan_stacked_transfer_operstate(real_dev, dev, vlan);
 	linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
@@ -272,7 +272,7 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	vlan = vlan_dev_priv(new_dev);
 	vlan->vlan_proto = htons(ETH_P_8021Q);
 	vlan->vlan_id = vlan_id;
-	vlan->real_dev = real_dev;
+	__netdev_ref_store(&vlan->real_dev, real_dev);
 	vlan->dent = NULL;
 	vlan->flags = VLAN_FLAG_REORDER_HDR;
 
@@ -321,6 +321,7 @@ static void vlan_transfer_features(struct net_device *dev,
 				   struct net_device *vlandev)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(vlandev);
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 
 	vlandev->gso_max_size = dev->gso_max_size;
 	vlandev->gso_max_segs = dev->gso_max_segs;
@@ -335,8 +336,8 @@ static void vlan_transfer_features(struct net_device *dev,
 #endif
 
 	vlandev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
-	vlandev->priv_flags |= (vlan->real_dev->priv_flags & IFF_XMIT_DST_RELEASE);
-	vlandev->hw_enc_features = vlan_tnl_features(vlan->real_dev);
+	vlandev->priv_flags |= (real_dev->priv_flags & IFF_XMIT_DST_RELEASE);
+	vlandev->hw_enc_features = vlan_tnl_features(real_dev);
 
 	netdev_update_features(vlandev);
 }
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 59bc13b5f14f..ab183724374d 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -101,10 +101,10 @@ EXPORT_SYMBOL(__vlan_find_dev_deep_rcu);
 
 struct net_device *vlan_dev_real_dev(const struct net_device *dev)
 {
-	struct net_device *ret = vlan_dev_priv(dev)->real_dev;
+	struct net_device *ret = __vlan_dev_real_dev(dev);
 
 	while (is_vlan_dev(ret))
-		ret = vlan_dev_priv(ret)->real_dev;
+		ret = __vlan_dev_real_dev(ret);
 
 	return ret;
 }
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index ab6dee28536d..c78f2cbc42c3 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -78,7 +78,7 @@ static int vlan_dev_hard_header(struct sk_buff *skb, struct net_device *dev,
 		saddr = dev->dev_addr;
 
 	/* Now make the underlying real hard header */
-	dev = vlan->real_dev;
+	dev = netdev_ref_ptr(&vlan->real_dev);
 	rc = dev_hard_header(skb, dev, type, daddr, saddr, len + vhdrlen);
 	if (rc > 0)
 		rc += vhdrlen;
@@ -116,7 +116,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 		__vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci);
 	}
 
-	skb->dev = vlan->real_dev;
+	skb->dev = netdev_ref_ptr(&vlan->real_dev);
 	len = skb->len;
 	if (unlikely(netpoll_tx_running(dev)))
 		return vlan_netpoll_send_skb(vlan, skb);
@@ -140,7 +140,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 
 static int vlan_dev_change_mtu(struct net_device *dev, int new_mtu)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	unsigned int max_mtu = real_dev->mtu;
 
 	if (netif_reduces_vlan_mtu(real_dev))
@@ -241,7 +241,7 @@ int vlan_dev_change_flags(const struct net_device *dev, u32 flags, u32 mask)
 
 void vlan_dev_get_realdev_name(const struct net_device *dev, char *result, size_t size)
 {
-	strscpy_pad(result, vlan_dev_priv(dev)->real_dev->name, size);
+	strscpy_pad(result, __vlan_dev_real_dev(dev)->name, size);
 }
 
 bool vlan_dev_inherit_address(struct net_device *dev,
@@ -258,7 +258,7 @@ bool vlan_dev_inherit_address(struct net_device *dev,
 static int vlan_dev_open(struct net_device *dev)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 	int err;
 
 	if (!(real_dev->flags & IFF_UP) &&
@@ -310,7 +310,7 @@ static int vlan_dev_open(struct net_device *dev)
 static int vlan_dev_stop(struct net_device *dev)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 
 	dev_mc_unsync(real_dev, dev);
 	dev_uc_unsync(real_dev, dev);
@@ -329,7 +329,7 @@ static int vlan_dev_stop(struct net_device *dev)
 
 static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	struct sockaddr *addr = p;
 	int err;
 
@@ -355,7 +355,7 @@ static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
 
 static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	struct ifreq ifrr;
 	int err = -EOPNOTSUPP;
@@ -385,7 +385,7 @@ static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 
 static int vlan_dev_neigh_setup(struct net_device *dev, struct neigh_parms *pa)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int err = 0;
 
@@ -399,7 +399,7 @@ static int vlan_dev_neigh_setup(struct net_device *dev, struct neigh_parms *pa)
 static int vlan_dev_fcoe_ddp_setup(struct net_device *dev, u16 xid,
 				   struct scatterlist *sgl, unsigned int sgc)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int rc = 0;
 
@@ -411,7 +411,7 @@ static int vlan_dev_fcoe_ddp_setup(struct net_device *dev, u16 xid,
 
 static int vlan_dev_fcoe_ddp_done(struct net_device *dev, u16 xid)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int len = 0;
 
@@ -423,7 +423,7 @@ static int vlan_dev_fcoe_ddp_done(struct net_device *dev, u16 xid)
 
 static int vlan_dev_fcoe_enable(struct net_device *dev)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int rc = -EINVAL;
 
@@ -434,7 +434,7 @@ static int vlan_dev_fcoe_enable(struct net_device *dev)
 
 static int vlan_dev_fcoe_disable(struct net_device *dev)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int rc = -EINVAL;
 
@@ -446,7 +446,7 @@ static int vlan_dev_fcoe_disable(struct net_device *dev)
 static int vlan_dev_fcoe_ddp_target(struct net_device *dev, u16 xid,
 				    struct scatterlist *sgl, unsigned int sgc)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int rc = 0;
 
@@ -460,7 +460,7 @@ static int vlan_dev_fcoe_ddp_target(struct net_device *dev, u16 xid,
 #ifdef NETDEV_FCOE_WWNN
 static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	const struct net_device_ops *ops = real_dev->netdev_ops;
 	int rc = -EINVAL;
 
@@ -472,7 +472,7 @@ static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
 
 static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 
 	if (dev->flags & IFF_UP) {
 		if (change & IFF_ALLMULTI)
@@ -484,8 +484,10 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
 
 static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
 {
-	dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
-	dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
+	struct net_device *real_dev = __vlan_dev_real_dev(vlan_dev);
+
+	dev_mc_sync(real_dev, vlan_dev);
+	dev_uc_sync(real_dev, vlan_dev);
 }
 
 /*
@@ -529,7 +531,7 @@ static int vlan_passthru_hard_header(struct sk_buff *skb, struct net_device *dev
 				     unsigned int len)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 
 	if (saddr == NULL)
 		saddr = dev->dev_addr;
@@ -552,7 +554,7 @@ static const struct net_device_ops vlan_netdev_ops;
 static int vlan_dev_init(struct net_device *dev)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 
 	netif_carrier_off(dev);
 
@@ -636,7 +638,7 @@ void vlan_dev_uninit(struct net_device *dev)
 static netdev_features_t vlan_dev_fix_features(struct net_device *dev,
 	netdev_features_t features)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 	netdev_features_t old_features = features;
 	netdev_features_t lower_features;
 
@@ -659,9 +661,9 @@ static netdev_features_t vlan_dev_fix_features(struct net_device *dev,
 static int vlan_ethtool_get_link_ksettings(struct net_device *dev,
 					   struct ethtool_link_ksettings *cmd)
 {
-	const struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 
-	return __ethtool_get_link_ksettings(vlan->real_dev, cmd);
+	return __ethtool_get_link_ksettings(real_dev, cmd);
 }
 
 static void vlan_ethtool_get_drvinfo(struct net_device *dev,
@@ -676,13 +678,14 @@ static int vlan_ethtool_get_ts_info(struct net_device *dev,
 				    struct ethtool_ts_info *info)
 {
 	const struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	const struct ethtool_ops *ops = vlan->real_dev->ethtool_ops;
-	struct phy_device *phydev = vlan->real_dev->phydev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
+	const struct ethtool_ops *ops = real_dev->ethtool_ops;
+	struct phy_device *phydev = real_dev->phydev;
 
 	if (phy_has_tsinfo(phydev)) {
 		return phy_ts_info(phydev, info);
 	} else if (ops->get_ts_info) {
-		return ops->get_ts_info(vlan->real_dev, info);
+		return ops->get_ts_info(real_dev, info);
 	} else {
 		info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
 			SOF_TIMESTAMPING_SOFTWARE;
@@ -735,7 +738,7 @@ static void vlan_dev_poll_controller(struct net_device *dev)
 static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *npinfo)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
-	struct net_device *real_dev = vlan->real_dev;
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 	struct netpoll *netpoll;
 	int err = 0;
 
@@ -771,7 +774,7 @@ static void vlan_dev_netpoll_cleanup(struct net_device *dev)
 
 static int vlan_dev_get_iflink(const struct net_device *dev)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 
 	return real_dev->ifindex;
 }
@@ -785,7 +788,7 @@ static int vlan_dev_fill_forward_path(struct net_device_path_ctx *ctx,
 	path->encap.id = vlan->vlan_id;
 	path->encap.proto = vlan->vlan_proto;
 	path->dev = ctx->dev;
-	ctx->dev = vlan->real_dev;
+	ctx->dev = netdev_ref_ptr(&vlan->real_dev);
 	if (ctx->num_vlans >= ARRAY_SIZE(ctx->vlan))
 		return -ENOSPC;
 
@@ -845,7 +848,7 @@ static void vlan_dev_free(struct net_device *dev)
 	vlan->vlan_pcpu_stats = NULL;
 
 	/* Get rid of the vlan's reference to real_dev */
-	dev_put(vlan->real_dev);
+	netdev_put(&vlan->real_dev);
 }
 
 void vlan_setup(struct net_device *dev)
diff --git a/net/8021q/vlan_gvrp.c b/net/8021q/vlan_gvrp.c
index 6b34b72aa466..aa8df972f93c 100644
--- a/net/8021q/vlan_gvrp.c
+++ b/net/8021q/vlan_gvrp.c
@@ -31,7 +31,8 @@ int vlan_gvrp_request_join(const struct net_device *dev)
 
 	if (vlan->vlan_proto != htons(ETH_P_8021Q))
 		return 0;
-	return garp_request_join(vlan->real_dev, &vlan_gvrp_app,
+	return garp_request_join(netdev_ref_ptr(&vlan->real_dev),
+				 &vlan_gvrp_app,
 				 &vlan_id, sizeof(vlan_id), GVRP_ATTR_VID);
 }
 
@@ -42,7 +43,7 @@ void vlan_gvrp_request_leave(const struct net_device *dev)
 
 	if (vlan->vlan_proto != htons(ETH_P_8021Q))
 		return;
-	garp_request_leave(vlan->real_dev, &vlan_gvrp_app,
+	garp_request_leave(netdev_ref_ptr(&vlan->real_dev), &vlan_gvrp_app,
 			   &vlan_id, sizeof(vlan_id), GVRP_ATTR_VID);
 }
 
diff --git a/net/8021q/vlan_mvrp.c b/net/8021q/vlan_mvrp.c
index 689eceeaa360..ab534307fb89 100644
--- a/net/8021q/vlan_mvrp.c
+++ b/net/8021q/vlan_mvrp.c
@@ -37,7 +37,8 @@ int vlan_mvrp_request_join(const struct net_device *dev)
 
 	if (vlan->vlan_proto != htons(ETH_P_8021Q))
 		return 0;
-	return mrp_request_join(vlan->real_dev, &vlan_mrp_app,
+	return mrp_request_join(netdev_ref_ptr(&vlan->real_dev),
+				&vlan_mrp_app,
 				&vlan_id, sizeof(vlan_id), MVRP_ATTR_VID);
 }
 
@@ -48,7 +49,7 @@ void vlan_mvrp_request_leave(const struct net_device *dev)
 
 	if (vlan->vlan_proto != htons(ETH_P_8021Q))
 		return;
-	mrp_request_leave(vlan->real_dev, &vlan_mrp_app,
+	mrp_request_leave(netdev_ref_ptr(&vlan->real_dev), &vlan_mrp_app,
 			  &vlan_id, sizeof(vlan_id), MVRP_ATTR_VID);
 }
 
diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index 0db85aeb119b..7499f73d9961 100644
--- a/net/8021q/vlan_netlink.c
+++ b/net/8021q/vlan_netlink.c
@@ -166,7 +166,7 @@ static int vlan_newlink(struct net *src_net, struct net_device *dev,
 
 	vlan->vlan_proto = proto;
 	vlan->vlan_id	 = nla_get_u16(data[IFLA_VLAN_ID]);
-	vlan->real_dev	 = real_dev;
+	__netdev_ref_store(&vlan->real_dev, real_dev);
 	dev->priv_flags |= (real_dev->priv_flags & IFF_XMIT_DST_RELEASE);
 	vlan->flags	 = VLAN_FLAG_REORDER_HDR;
 
@@ -274,7 +274,7 @@ static int vlan_fill_info(struct sk_buff *skb, const struct net_device *dev)
 
 static struct net *vlan_get_link_net(const struct net_device *dev)
 {
-	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct net_device *real_dev = __vlan_dev_real_dev(dev);
 
 	return dev_net(real_dev);
 }
diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
index ec87dea23719..5784fb5fa2a9 100644
--- a/net/8021q/vlanproc.c
+++ b/net/8021q/vlanproc.c
@@ -231,9 +231,10 @@ static int vlan_seq_show(struct seq_file *seq, void *v)
 	} else {
 		const struct net_device *vlandev = v;
 		const struct vlan_dev_priv *vlan = vlan_dev_priv(vlandev);
+		struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 
 		seq_printf(seq, "%-15s| %d  | %s\n",  vlandev->name,
-			   vlan->vlan_id,    vlan->real_dev->name);
+			   vlan->vlan_id,    real_dev->name);
 	}
 	return 0;
 }
@@ -242,6 +243,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
 {
 	struct net_device *vlandev = (struct net_device *) seq->private;
 	const struct vlan_dev_priv *vlan = vlan_dev_priv(vlandev);
+	struct net_device *real_dev = netdev_ref_ptr(&vlan->real_dev);
 	struct rtnl_link_stats64 temp;
 	const struct rtnl_link_stats64 *stats;
 	static const char fmt64[] = "%30s %12llu\n";
@@ -262,7 +264,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
 	seq_puts(seq, "\n");
 	seq_printf(seq, fmt64, "total frames transmitted", stats->tx_packets);
 	seq_printf(seq, fmt64, "total bytes transmitted", stats->tx_bytes);
-	seq_printf(seq, "Device: %s", vlan->real_dev->name);
+	seq_printf(seq, "Device: %s", real_dev->name);
 	/* now show all PRIORITY mappings relating to this VLAN */
 	seq_printf(seq, "\nINGRESS priority mappings: "
 			"0:%u  1:%u  2:%u  3:%u  4:%u  5:%u  6:%u 7:%u\n",
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC net-next 1/2] net: add netdev_refs debug
  2021-11-17 17:47 ` [RFC net-next 1/2] net: add netdev_refs debug Jakub Kicinski
@ 2021-11-17 18:24   ` Leon Romanovsky
  2021-11-17 18:35     ` Jakub Kicinski
  2021-11-17 19:13   ` Eric Dumazet
  1 sibling, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2021-11-17 18:24 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, eric.dumazet

On Wed, Nov 17, 2021 at 09:47:22AM -0800, Jakub Kicinski wrote:
> Debugging netdev ref leaks is still pretty hard. Eric added
> optional use of a normal refcount which is useful for tracking
> abuse of existing users.
> 
> For new code, however, it'd be great if we could actually track
> the refs per-user. Allowing us to detect leaks where they happen.
> This patch introduces a netdev_ref type and uses the debug_objects
> infra to track refs being lost or misused.
> 
> In the future we can extend this structure to also catch those
> who fail to release the ref on unregistering notification.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  MAINTAINERS                 |   1 +
>  include/linux/netdev_refs.h | 104 ++++++++++++++++++++++++++++++++++++
>  lib/Kconfig.debug           |   7 +++
>  net/core/dev.c              |   8 +++
>  4 files changed, 120 insertions(+)
>  create mode 100644 include/linux/netdev_refs.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4c74516e4353..47fe27175c9f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -18482,6 +18482,7 @@ F:	include/uapi/linux/pkt_sched.h
>  F:	include/uapi/linux/tc_act/
>  F:	include/uapi/linux/tc_ematch/
>  F:	net/sched/
> +F:	tools/testing/selftests/tc-testing/
>  
>  TC90522 MEDIA DRIVER
>  M:	Akihiro Tsukada <tskd08@gmail.com>
> diff --git a/include/linux/netdev_refs.h b/include/linux/netdev_refs.h
> new file mode 100644
> index 000000000000..326772ea0a63
> --- /dev/null
> +++ b/include/linux/netdev_refs.h
> @@ -0,0 +1,104 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +#ifndef _LINUX_NETDEV_REFS_H
> +#define _LINUX_NETDEV_REFS_H
> +
> +#include <linux/debugobjects.h>
> +#include <linux/netdevice.h>
> +
> +/* Explicit netdevice references
> + * struct netdev_ref is a storage for a reference. It's equivalent
> + * to a netdev pointer, but when debug is enabled it performs extra checks.
> + * Most users will want to take a reference with netdev_hold(), access it
> + * via netdev_ref_ptr() and release with netdev_put().
> + */
> +
> +struct netdev_ref {
> +	struct net_device *dev;
> +#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
> +	refcount_t cnt;
> +#endif
> +};
> +
> +extern const struct debug_obj_descr netdev_ref_debug_descr;
> +
> +/* Store a raw, unprotected pointer */
> +static inline void __netdev_ref_store(struct netdev_ref *ref,
> +				      struct net_device *dev)
> +{
> +	ref->dev = dev;
> +
> +#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
> +	refcount_set(&ref->cnt, 0);

This is very uncommon pattern. I would expect that first pointer access
will start from 1, like all refcount_t users. If you still prefer to
start from 0, i suggest you to use atomic_t. 

IMHO, much better will be to use kref for this type of reference counting.

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC net-next 1/2] net: add netdev_refs debug
  2021-11-17 18:24   ` Leon Romanovsky
@ 2021-11-17 18:35     ` Jakub Kicinski
  2021-11-17 19:27       ` Leon Romanovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2021-11-17 18:35 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: netdev, davem, eric.dumazet

On Wed, 17 Nov 2021 20:24:17 +0200 Leon Romanovsky wrote:
> > +/* Store a raw, unprotected pointer */
> > +static inline void __netdev_ref_store(struct netdev_ref *ref,
> > +				      struct net_device *dev)
> > +{
> > +	ref->dev = dev;
> > +
> > +#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
> > +	refcount_set(&ref->cnt, 0);  
> 
> This is very uncommon pattern. I would expect that first pointer access
> will start from 1, like all refcount_t users. If you still prefer to
> start from 0, i suggest you to use atomic_t. 

It's not really "starting from 0", it's more of a "setting the count
to invalid". It can't escape from this state with a simple inc.

> IMHO, much better will be to use kref for this type of reference counting.

I'm not really sure if the netdev_ref_{get,put}() part is necessary.

Main pattern I'm trying to replace is a pointer which is optionally 
holding a reference (0, 1 references). Rather than a pointer which 
is ref counted (1..n references).

IOW __netdev_ref_store() is much more likely to be used than
netdev_ref_get(), that's also why netdev_put() does not invalidate
the pointer itself.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC net-next 1/2] net: add netdev_refs debug
  2021-11-17 17:47 ` [RFC net-next 1/2] net: add netdev_refs debug Jakub Kicinski
  2021-11-17 18:24   ` Leon Romanovsky
@ 2021-11-17 19:13   ` Eric Dumazet
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 19:13 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem

On Wed, Nov 17, 2021 at 9:47 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Debugging netdev ref leaks is still pretty hard. Eric added
> optional use of a normal refcount which is useful for tracking
> abuse of existing users.
>
> For new code, however, it'd be great if we could actually track
> the refs per-user. Allowing us to detect leaks where they happen.
> This patch introduces a netdev_ref type and uses the debug_objects
> infra to track refs being lost or misused.
>
> In the future we can extend this structure to also catch those
> who fail to release the ref on unregistering notification.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---

Interesting, I was working on something very different, let me send
the RFC (it is absolutely not complete at this point)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC net-next 1/2] net: add netdev_refs debug
  2021-11-17 18:35     ` Jakub Kicinski
@ 2021-11-17 19:27       ` Leon Romanovsky
  0 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2021-11-17 19:27 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, eric.dumazet

On Wed, Nov 17, 2021 at 10:35:45AM -0800, Jakub Kicinski wrote:
> On Wed, 17 Nov 2021 20:24:17 +0200 Leon Romanovsky wrote:
> > > +/* Store a raw, unprotected pointer */
> > > +static inline void __netdev_ref_store(struct netdev_ref *ref,
> > > +				      struct net_device *dev)
> > > +{
> > > +	ref->dev = dev;
> > > +
> > > +#ifdef CONFIG_DEBUG_OBJECTS_NETDEV_REFS
> > > +	refcount_set(&ref->cnt, 0);  
> > 
> > This is very uncommon pattern. I would expect that first pointer access
> > will start from 1, like all refcount_t users. If you still prefer to
> > start from 0, i suggest you to use atomic_t. 
> 
> It's not really "starting from 0", it's more of a "setting the count
> to invalid". It can't escape from this state with a simple inc.

I understand it and this is what raises eyebrows. The refcount_t type
has very clear semantics which you are stretching too far.

Let's see what Eric had in mind for his RFC.

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-17 19:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-17 17:47 [RFC net-next 0/2] explicit netdev refs Jakub Kicinski
2021-11-17 17:47 ` [RFC net-next 1/2] net: add netdev_refs debug Jakub Kicinski
2021-11-17 18:24   ` Leon Romanovsky
2021-11-17 18:35     ` Jakub Kicinski
2021-11-17 19:27       ` Leon Romanovsky
2021-11-17 19:13   ` Eric Dumazet
2021-11-17 17:47 ` [RFC net-next 2/2] vlan: use new netdev_refs infra Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.