All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
@ 2021-12-05  4:21 Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
                   ` (24 more replies)
  0 siblings, 25 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Two first patches add a generic infrastructure, that will be used
to get tracking of refcount increments/decrements.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

The third patch adds dev_hold_track() and dev_put_track() helpers
(CONFIG_NET_DEV_REFCNT_TRACKER)

Then a series of 20 patches converts some dev_hold()/dev_put()
pairs to new hepers : dev_hold_track() and dev_put_track().

Hopefully this will be used by developpers and syzbot to
root cause bugs that cause netdevice dismantles freezes.

With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
some class of bugs, but too late (when too many dev_put()
were happening).

Another series will be sent after this one is merged.

v3: moved NET_DEV_REFCNT_TRACKER to net/Kconfig.debug
    added "depends on DEBUG_KERNEL && STACKTRACE_SUPPORT"
    to hopefully get rid of kbuild reports for ARCH=nios2
    Reworded patch 3 changelog.
    Added missing htmldocs (Jakub)

v2: added four additional patches,
    added netdev_tracker_alloc() and netdev_tracker_free()
    addressed build error (kernel bots),
    use GFP_ATOMIC in test_ref_tracker_timer_func()

Eric Dumazet (23):
  lib: add reference counting tracking infrastructure
  lib: add tests for reference tracker
  net: add net device refcount tracker infrastructure
  net: add net device refcount tracker to struct netdev_rx_queue
  net: add net device refcount tracker to struct netdev_queue
  net: add net device refcount tracker to ethtool_phys_id()
  net: add net device refcount tracker to dev_ifsioc()
  drop_monitor: add net device refcount tracker
  net: dst: add net device refcount tracking to dst_entry
  ipv6: add net device refcount tracker to rt6_probe_deferred()
  sit: add net device refcount tracking to ip_tunnel
  ipv6: add net device refcount tracker to struct ip6_tnl
  net: add net device refcount tracker to struct neighbour
  net: add net device refcount tracker to struct pneigh_entry
  net: add net device refcount tracker to struct neigh_parms
  net: add net device refcount tracker to struct netdev_adjacent
  ipv6: add net device refcount tracker to struct inet6_dev
  ipv4: add net device refcount tracker to struct in_device
  net/sched: add net device refcount tracker to struct Qdisc
  net: linkwatch: add net device refcount tracker
  net: failover: add net device refcount tracker
  ipmr, ip6mr: add net device refcount tracker to struct vif_device
  netpoll: add net device refcount tracker to struct netpoll

 drivers/net/netconsole.c    |   2 +-
 include/linux/inetdevice.h  |   2 +
 include/linux/mroute_base.h |   2 +
 include/linux/netdevice.h   |  68 ++++++++++++++++++
 include/linux/netpoll.h     |   1 +
 include/linux/ref_tracker.h |  73 +++++++++++++++++++
 include/net/devlink.h       |   4 ++
 include/net/dst.h           |   1 +
 include/net/failover.h      |   1 +
 include/net/if_inet6.h      |   1 +
 include/net/ip6_tunnel.h    |   1 +
 include/net/ip_tunnels.h    |   3 +
 include/net/neighbour.h     |   3 +
 include/net/sch_generic.h   |   2 +-
 lib/Kconfig                 |   5 ++
 lib/Kconfig.debug           |  15 ++++
 lib/Makefile                |   4 +-
 lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
 lib/test_ref_tracker.c      | 115 +++++++++++++++++++++++++++++
 net/Kconfig.debug           |  11 +++
 net/core/dev.c              |  10 ++-
 net/core/dev_ioctl.c        |   5 +-
 net/core/drop_monitor.c     |   6 +-
 net/core/dst.c              |   8 +--
 net/core/failover.c         |   4 +-
 net/core/link_watch.c       |   4 +-
 net/core/neighbour.c        |  18 ++---
 net/core/net-sysfs.c        |   8 +--
 net/core/netpoll.c          |   4 +-
 net/ethtool/ioctl.c         |   5 +-
 net/ipv4/devinet.c          |   4 +-
 net/ipv4/ipmr.c             |   3 +-
 net/ipv4/route.c            |   7 +-
 net/ipv6/addrconf.c         |   4 +-
 net/ipv6/addrconf_core.c    |   2 +-
 net/ipv6/ip6_gre.c          |   8 +--
 net/ipv6/ip6_tunnel.c       |   4 +-
 net/ipv6/ip6_vti.c          |   4 +-
 net/ipv6/ip6mr.c            |   3 +-
 net/ipv6/route.c            |  10 +--
 net/ipv6/sit.c              |   4 +-
 net/sched/sch_generic.c     |   4 +-
 42 files changed, 521 insertions(+), 62 deletions(-)
 create mode 100644 include/linux/ref_tracker.h
 create mode 100644 lib/ref_tracker.c
 create mode 100644 lib/test_ref_tracker.c
 create mode 100644 net/Kconfig.debug

-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
@ 2021-12-05  4:21 ` Eric Dumazet
  2021-12-08 14:09   ` Andrzej Hajda
  2021-12-15 10:18   ` Jiri Slaby
  2021-12-05  4:21 ` [PATCH v3 net-next 02/23] lib: add tests for reference tracker Eric Dumazet
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.

This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
---
 include/linux/ref_tracker.h |  73 +++++++++++++++++++
 lib/Kconfig                 |   5 ++
 lib/Makefile                |   2 +
 lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
 4 files changed, 220 insertions(+)
 create mode 100644 include/linux/ref_tracker.h
 create mode 100644 lib/ref_tracker.c

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
new file mode 100644
index 0000000000000000000000000000000000000000..c11c9db5825cf933acf529c83db441a818135f29
--- /dev/null
+++ b/include/linux/ref_tracker.h
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#ifndef _LINUX_REF_TRACKER_H
+#define _LINUX_REF_TRACKER_H
+#include <linux/refcount.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+
+struct ref_tracker;
+
+struct ref_tracker_dir {
+#ifdef CONFIG_REF_TRACKER
+	spinlock_t		lock;
+	unsigned int		quarantine_avail;
+	refcount_t		untracked;
+	struct list_head	list; /* List of active trackers */
+	struct list_head	quarantine; /* List of dead trackers */
+#endif
+};
+
+#ifdef CONFIG_REF_TRACKER
+static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
+					unsigned int quarantine_count)
+{
+	INIT_LIST_HEAD(&dir->list);
+	INIT_LIST_HEAD(&dir->quarantine);
+	spin_lock_init(&dir->lock);
+	dir->quarantine_avail = quarantine_count;
+	refcount_set(&dir->untracked, 1);
+}
+
+void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
+
+void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+			   unsigned int display_limit);
+
+int ref_tracker_alloc(struct ref_tracker_dir *dir,
+		      struct ref_tracker **trackerp, gfp_t gfp);
+
+int ref_tracker_free(struct ref_tracker_dir *dir,
+		     struct ref_tracker **trackerp);
+
+#else /* CONFIG_REF_TRACKER */
+
+static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
+					unsigned int quarantine_count)
+{
+}
+
+static inline void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
+{
+}
+
+static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+					 unsigned int display_limit)
+{
+}
+
+static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
+				    struct ref_tracker **trackerp,
+				    gfp_t gfp)
+{
+	return 0;
+}
+
+static inline int ref_tracker_free(struct ref_tracker_dir *dir,
+				   struct ref_tracker **trackerp)
+{
+	return 0;
+}
+
+#endif
+
+#endif /* _LINUX_REF_TRACKER_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 5e7165e6a346c9bec878b78c8c8c3d175fc98dfd..655b0e43f260bfca63240794191e3f1890b2e801 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -680,6 +680,11 @@ config STACK_HASH_ORDER
 	 Select the hash size as a power of 2 for the stackdepot hash table.
 	 Choose a lower value to reduce the memory impact.
 
+config REF_TRACKER
+	bool
+	depends on STACKTRACE_SUPPORT
+	select STACKDEPOT
+
 config SBITMAP
 	bool
 
diff --git a/lib/Makefile b/lib/Makefile
index 364c23f1557816f73aebd8304c01224a4846ac6c..c1fd9243ddb9cc1ac5252d7eb8009f9290782c4a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -270,6 +270,8 @@ obj-$(CONFIG_STACKDEPOT) += stackdepot.o
 KASAN_SANITIZE_stackdepot.o := n
 KCOV_INSTRUMENT_stackdepot.o := n
 
+obj-$(CONFIG_REF_TRACKER) += ref_tracker.o
+
 libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
 	       fdt_empty_tree.o fdt_addresses.o
 $(foreach file, $(libfdt_files), \
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
new file mode 100644
index 0000000000000000000000000000000000000000..0ae2e66dcf0fdb976f4cb99e747c9448b37f22cc
--- /dev/null
+++ b/lib/ref_tracker.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/export.h>
+#include <linux/ref_tracker.h>
+#include <linux/slab.h>
+#include <linux/stacktrace.h>
+#include <linux/stackdepot.h>
+
+#define REF_TRACKER_STACK_ENTRIES 16
+
+struct ref_tracker {
+	struct list_head	head;   /* anchor into dir->list or dir->quarantine */
+	bool			dead;
+	depot_stack_handle_t	alloc_stack_handle;
+	depot_stack_handle_t	free_stack_handle;
+};
+
+void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
+{
+	struct ref_tracker *tracker, *n;
+	unsigned long flags;
+	bool leak = false;
+
+	spin_lock_irqsave(&dir->lock, flags);
+	list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
+		list_del(&tracker->head);
+		kfree(tracker);
+		dir->quarantine_avail++;
+	}
+	list_for_each_entry_safe(tracker, n, &dir->list, head) {
+		pr_err("leaked reference.\n");
+		if (tracker->alloc_stack_handle)
+			stack_depot_print(tracker->alloc_stack_handle);
+		leak = true;
+		list_del(&tracker->head);
+		kfree(tracker);
+	}
+	spin_unlock_irqrestore(&dir->lock, flags);
+	WARN_ON_ONCE(leak);
+	WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
+}
+EXPORT_SYMBOL(ref_tracker_dir_exit);
+
+void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+			   unsigned int display_limit)
+{
+	struct ref_tracker *tracker;
+	unsigned long flags;
+	unsigned int i = 0;
+
+	spin_lock_irqsave(&dir->lock, flags);
+	list_for_each_entry(tracker, &dir->list, head) {
+		if (i < display_limit) {
+			pr_err("leaked reference.\n");
+			if (tracker->alloc_stack_handle)
+				stack_depot_print(tracker->alloc_stack_handle);
+			i++;
+		} else {
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&dir->lock, flags);
+}
+EXPORT_SYMBOL(ref_tracker_dir_print);
+
+int ref_tracker_alloc(struct ref_tracker_dir *dir,
+		      struct ref_tracker **trackerp,
+		      gfp_t gfp)
+{
+	unsigned long entries[REF_TRACKER_STACK_ENTRIES];
+	struct ref_tracker *tracker;
+	unsigned int nr_entries;
+	unsigned long flags;
+
+	*trackerp = tracker = kzalloc(sizeof(*tracker), gfp | __GFP_NOFAIL);
+	if (unlikely(!tracker)) {
+		pr_err_once("memory allocation failure, unreliable refcount tracker.\n");
+		refcount_inc(&dir->untracked);
+		return -ENOMEM;
+	}
+	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
+	nr_entries = filter_irq_stacks(entries, nr_entries);
+	tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
+
+	spin_lock_irqsave(&dir->lock, flags);
+	list_add(&tracker->head, &dir->list);
+	spin_unlock_irqrestore(&dir->lock, flags);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ref_tracker_alloc);
+
+int ref_tracker_free(struct ref_tracker_dir *dir,
+		     struct ref_tracker **trackerp)
+{
+	unsigned long entries[REF_TRACKER_STACK_ENTRIES];
+	struct ref_tracker *tracker = *trackerp;
+	depot_stack_handle_t stack_handle;
+	unsigned int nr_entries;
+	unsigned long flags;
+
+	if (!tracker) {
+		refcount_dec(&dir->untracked);
+		return -EEXIST;
+	}
+	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
+	nr_entries = filter_irq_stacks(entries, nr_entries);
+	stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
+
+	spin_lock_irqsave(&dir->lock, flags);
+	if (tracker->dead) {
+		pr_err("reference already released.\n");
+		if (tracker->alloc_stack_handle) {
+			pr_err("allocated in:\n");
+			stack_depot_print(tracker->alloc_stack_handle);
+		}
+		if (tracker->free_stack_handle) {
+			pr_err("freed in:\n");
+			stack_depot_print(tracker->free_stack_handle);
+		}
+		spin_unlock_irqrestore(&dir->lock, flags);
+		WARN_ON_ONCE(1);
+		return -EINVAL;
+	}
+	tracker->dead = true;
+
+	tracker->free_stack_handle = stack_handle;
+
+	list_move_tail(&tracker->head, &dir->quarantine);
+	if (!dir->quarantine_avail) {
+		tracker = list_first_entry(&dir->quarantine, struct ref_tracker, head);
+		list_del(&tracker->head);
+	} else {
+		dir->quarantine_avail--;
+		tracker = NULL;
+	}
+	spin_unlock_irqrestore(&dir->lock, flags);
+
+	kfree(tracker);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ref_tracker_free);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 02/23] lib: add tests for reference tracker
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
@ 2021-12-05  4:21 ` Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 03/23] net: add net device refcount tracker infrastructure Eric Dumazet
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

This module uses reference tracker, forcing two issues.

1) Double free of a tracker

2) leak of two trackers, one being allocated from softirq context.

"modprobe test_ref_tracker" would emit the following traces.
(Use scripts/decode_stacktrace.sh if necessary)

[  171.648681] reference already released.
[  171.653213] allocated in:
[  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
[  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
[  171.656528]  do_one_initcall+0x9c/0x220
[  171.656532]  do_init_module+0x60/0x240
[  171.656536]  load_module+0x32b5/0x3610
[  171.656538]  __do_sys_init_module+0x148/0x1a0
[  171.656540]  __x64_sys_init_module+0x1d/0x20
[  171.656542]  do_syscall_64+0x4a/0xb0
[  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.656549] freed in:
[  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
[  171.659523]  do_one_initcall+0x9c/0x220
[  171.659525]  do_init_module+0x60/0x240
[  171.659527]  load_module+0x32b5/0x3610
[  171.659529]  __do_sys_init_module+0x148/0x1a0
[  171.659532]  __x64_sys_init_module+0x1d/0x20
[  171.659534]  do_syscall_64+0x4a/0xb0
[  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659575] ------------[ cut here ]------------
[  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
[  171.659581] Modules linked in: test_ref_tracker(+)
[  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
[  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
[  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
[  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
[  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
[  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
[  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
[  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.659614] Call Trace:
[  171.659615]  <TASK>
[  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
[  171.659636]  ? do_one_initcall+0x9c/0x220
[  171.659638]  ? do_init_module+0x60/0x240
[  171.659641]  ? load_module+0x32b5/0x3610
[  171.659644]  ? __do_sys_init_module+0x148/0x1a0
[  171.659646]  ? __x64_sys_init_module+0x1d/0x20
[  171.659649]  ? do_syscall_64+0x4a/0xb0
[  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659656]  ? 0xffffffffc040a000
[  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
[  171.659663]  do_one_initcall+0x9c/0x220
[  171.659666]  do_init_module+0x60/0x240
[  171.659669]  load_module+0x32b5/0x3610
[  171.659672]  __do_sys_init_module+0x148/0x1a0
[  171.659676]  __x64_sys_init_module+0x1d/0x20
[  171.659678]  do_syscall_64+0x4a/0xb0
[  171.659694]  ? exc_page_fault+0x6e/0x140
[  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659698] RIP: 0033:0x7f97ea3dbe7a
[  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.659709]  </TASK>
[  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
[  171.659712] leaked reference.
[  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
[  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
[  171.663397]  call_timer_fn+0x31/0x140
[  171.663401]  expire_timers+0x46/0x110
[  171.663403]  __run_timers+0x16f/0x1b0
[  171.663404]  run_timer_softirq+0x1d/0x40
[  171.663406]  __do_softirq+0x148/0x2d3
[  171.663408] leaked reference.
[  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
[  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
[  171.667104]  do_one_initcall+0x9c/0x220
[  171.667106]  do_init_module+0x60/0x240
[  171.667108]  load_module+0x32b5/0x3610
[  171.667111]  __do_sys_init_module+0x148/0x1a0
[  171.667113]  __x64_sys_init_module+0x1d/0x20
[  171.667115]  do_syscall_64+0x4a/0xb0
[  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667131] ------------[ cut here ]------------
[  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
[  171.667136] Modules linked in: test_ref_tracker(+)
[  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
[  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
[  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
[  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
[  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
[  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
[  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
[  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.667169] Call Trace:
[  171.667170]  <TASK>
[  171.667171]  ? 0xffffffffc040a000
[  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
[  171.667175]  do_one_initcall+0x9c/0x220
[  171.667179]  do_init_module+0x60/0x240
[  171.667182]  load_module+0x32b5/0x3610
[  171.667186]  __do_sys_init_module+0x148/0x1a0
[  171.667189]  __x64_sys_init_module+0x1d/0x20
[  171.667192]  do_syscall_64+0x4a/0xb0
[  171.667194]  ? exc_page_fault+0x6e/0x140
[  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667199] RIP: 0033:0x7f97ea3dbe7a
[  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.667209]  </TASK>
[  171.667210] ---[ end trace f5dbd6afa41e60aa ]---

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 lib/Kconfig.debug      |  10 ++++
 lib/Makefile           |   2 +-
 lib/test_ref_tracker.c | 115 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 lib/test_ref_tracker.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 5c12bde10996cf97b5f075d318089b1be73f71d7..633c2c5cb45bd435a7684dbe2f2eca477c871463 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2106,6 +2106,16 @@ config BACKTRACE_SELF_TEST
 
 	  Say N if you are unsure.
 
+config TEST_REF_TRACKER
+	tristate "Self test for reference tracker"
+	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
+	select REF_TRACKER
+	help
+	  This option provides a kernel module performing tests
+	  using reference tracker infrastructure.
+
+	  Say N if you are unsure.
+
 config RBTREE_TEST
 	tristate "Red-Black tree test"
 	depends on DEBUG_KERNEL
diff --git a/lib/Makefile b/lib/Makefile
index c1fd9243ddb9cc1ac5252d7eb8009f9290782c4a..b213a7bbf3fda2eb9f234fb7473b8f1b617bed6b 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -101,7 +101,7 @@ obj-$(CONFIG_TEST_LOCKUP) += test_lockup.o
 obj-$(CONFIG_TEST_HMM) += test_hmm.o
 obj-$(CONFIG_TEST_FREE_PAGES) += test_free_pages.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
-
+obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
 # off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/test_ref_tracker.c b/lib/test_ref_tracker.c
new file mode 100644
index 0000000000000000000000000000000000000000..19d7dec70cc62f0a7b274cd5ec8bf8cec62b0af3
--- /dev/null
+++ b/lib/test_ref_tracker.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Referrence tracker self test.
+ *
+ * Copyright (c) 2021 Eric Dumazet <edumazet@google.com>
+ */
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+#include <linux/ref_tracker.h>
+#include <linux/slab.h>
+#include <linux/timer.h>
+
+static struct ref_tracker_dir ref_dir;
+static struct ref_tracker *tracker[20];
+
+#define TRT_ALLOC(X) static noinline void 				\
+	alloctest_ref_tracker_alloc##X(struct ref_tracker_dir *dir, 	\
+				    struct ref_tracker **trackerp)	\
+	{								\
+		ref_tracker_alloc(dir, trackerp, GFP_KERNEL);		\
+	}
+
+TRT_ALLOC(1)
+TRT_ALLOC(2)
+TRT_ALLOC(3)
+TRT_ALLOC(4)
+TRT_ALLOC(5)
+TRT_ALLOC(6)
+TRT_ALLOC(7)
+TRT_ALLOC(8)
+TRT_ALLOC(9)
+TRT_ALLOC(10)
+TRT_ALLOC(11)
+TRT_ALLOC(12)
+TRT_ALLOC(13)
+TRT_ALLOC(14)
+TRT_ALLOC(15)
+TRT_ALLOC(16)
+TRT_ALLOC(17)
+TRT_ALLOC(18)
+TRT_ALLOC(19)
+
+#undef TRT_ALLOC
+
+static noinline void
+alloctest_ref_tracker_free(struct ref_tracker_dir *dir,
+			   struct ref_tracker **trackerp)
+{
+	ref_tracker_free(dir, trackerp);
+}
+
+
+static struct timer_list test_ref_tracker_timer;
+static atomic_t test_ref_timer_done = ATOMIC_INIT(0);
+
+static void test_ref_tracker_timer_func(struct timer_list *t)
+{
+	ref_tracker_alloc(&ref_dir, &tracker[0], GFP_ATOMIC);
+	atomic_set(&test_ref_timer_done, 1);
+}
+
+static int __init test_ref_tracker_init(void)
+{
+	int i;
+
+	ref_tracker_dir_init(&ref_dir, 100);
+
+	timer_setup(&test_ref_tracker_timer, test_ref_tracker_timer_func, 0);
+	mod_timer(&test_ref_tracker_timer, jiffies + 1);
+
+	alloctest_ref_tracker_alloc1(&ref_dir, &tracker[1]);
+	alloctest_ref_tracker_alloc2(&ref_dir, &tracker[2]);
+	alloctest_ref_tracker_alloc3(&ref_dir, &tracker[3]);
+	alloctest_ref_tracker_alloc4(&ref_dir, &tracker[4]);
+	alloctest_ref_tracker_alloc5(&ref_dir, &tracker[5]);
+	alloctest_ref_tracker_alloc6(&ref_dir, &tracker[6]);
+	alloctest_ref_tracker_alloc7(&ref_dir, &tracker[7]);
+	alloctest_ref_tracker_alloc8(&ref_dir, &tracker[8]);
+	alloctest_ref_tracker_alloc9(&ref_dir, &tracker[9]);
+	alloctest_ref_tracker_alloc10(&ref_dir, &tracker[10]);
+	alloctest_ref_tracker_alloc11(&ref_dir, &tracker[11]);
+	alloctest_ref_tracker_alloc12(&ref_dir, &tracker[12]);
+	alloctest_ref_tracker_alloc13(&ref_dir, &tracker[13]);
+	alloctest_ref_tracker_alloc14(&ref_dir, &tracker[14]);
+	alloctest_ref_tracker_alloc15(&ref_dir, &tracker[15]);
+	alloctest_ref_tracker_alloc16(&ref_dir, &tracker[16]);
+	alloctest_ref_tracker_alloc17(&ref_dir, &tracker[17]);
+	alloctest_ref_tracker_alloc18(&ref_dir, &tracker[18]);
+	alloctest_ref_tracker_alloc19(&ref_dir, &tracker[19]);
+
+	/* free all trackers but first 0 and 1. */
+	for (i = 2; i < ARRAY_SIZE(tracker); i++)
+		alloctest_ref_tracker_free(&ref_dir, &tracker[i]);
+
+	/* Attempt to free an already freed tracker. */
+	alloctest_ref_tracker_free(&ref_dir, &tracker[2]);
+
+	while (!atomic_read(&test_ref_timer_done))
+		msleep(1);
+
+	/* This should warn about tracker[0] & tracker[1] being not freed. */
+	ref_tracker_dir_exit(&ref_dir);
+
+	return 0;
+}
+
+static void __exit test_ref_tracker_exit(void)
+{
+}
+
+module_init(test_ref_tracker_init);
+module_exit(test_ref_tracker_exit);
+
+MODULE_LICENSE("GPL v2");
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 03/23] net: add net device refcount tracker infrastructure
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 02/23] lib: add tests for reference tracker Eric Dumazet
@ 2021-12-05  4:21 ` Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 04/23] net: add net device refcount tracker to struct netdev_rx_queue Eric Dumazet
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get precise stack traces
of the point dev_hold_track() happened, at device dismantle phase.

We will also get a stack trace if too many dev_put_track() for the same
netdevice_tracker are attempted.

This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/netdevice.h | 45 +++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug         |  5 +++++
 net/Kconfig.debug         | 11 ++++++++++
 net/core/dev.c            |  3 +++
 4 files changed, 64 insertions(+)
 create mode 100644 net/Kconfig.debug

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 65117f01d5f2a9c9e815b6c967d5e9e4c94af0ae..143d60ed004732e4a086e66fdcf7b3d362c1dc20 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -48,6 +48,7 @@
 #include <uapi/linux/pkt_cls.h>
 #include <linux/hashtable.h>
 #include <linux/rbtree.h>
+#include <linux/ref_tracker.h>
 
 struct netpoll_info;
 struct device;
@@ -300,6 +301,12 @@ enum netdev_state_t {
 };
 
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+typedef struct ref_tracker *netdevice_tracker;
+#else
+typedef struct {} netdevice_tracker;
+#endif
+
 struct gro_list {
 	struct list_head	list;
 	int			count;
@@ -1865,6 +1872,7 @@ enum netdev_ml_priv_type {
  *	@proto_down_reason:	reason a netdev interface is held down
  *	@pcpu_refcnt:		Number of references to this device
  *	@dev_refcnt:		Number of references to this device
+ *	@refcnt_tracker:	Tracker directory for tracked references to this device
  *	@todo_list:		Delayed register/unregister
  *	@link_watch_list:	XXX: need comments on this one
  *
@@ -2178,6 +2186,7 @@ struct net_device {
 #else
 	refcount_t		dev_refcnt;
 #endif
+	struct ref_tracker_dir	refcnt_tracker;
 
 	struct list_head	link_watch_list;
 
@@ -3805,6 +3814,7 @@ void netdev_run_todo(void);
  *	@dev: network device
  *
  * Release reference to device to allow it to be freed.
+ * Try using dev_put_track() instead.
  */
 static inline void dev_put(struct net_device *dev)
 {
@@ -3822,6 +3832,7 @@ static inline void dev_put(struct net_device *dev)
  *	@dev: network device
  *
  * Hold reference to device to keep it from being freed.
+ * Try using dev_hold_track() instead.
  */
 static inline void dev_hold(struct net_device *dev)
 {
@@ -3834,6 +3845,40 @@ static inline void dev_hold(struct net_device *dev)
 	}
 }
 
+static inline void netdev_tracker_alloc(struct net_device *dev,
+					netdevice_tracker *tracker, gfp_t gfp)
+{
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	ref_tracker_alloc(&dev->refcnt_tracker, tracker, gfp);
+#endif
+}
+
+static inline void netdev_tracker_free(struct net_device *dev,
+				       netdevice_tracker *tracker)
+{
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	ref_tracker_free(&dev->refcnt_tracker, tracker);
+#endif
+}
+
+static inline void dev_hold_track(struct net_device *dev,
+				  netdevice_tracker *tracker, gfp_t gfp)
+{
+	if (dev) {
+		dev_hold(dev);
+		netdev_tracker_alloc(dev, tracker, gfp);
+	}
+}
+
+static inline void dev_put_track(struct net_device *dev,
+				 netdevice_tracker *tracker)
+{
+	if (dev) {
+		netdev_tracker_free(dev, tracker);
+		dev_put(dev);
+	}
+}
+
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
  * who is responsible for serialization of these calls.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 633c2c5cb45bd435a7684dbe2f2eca477c871463..6504b97f8dfd786b6a7b9cdb2c2eda5aaa2c2b03 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -598,6 +598,11 @@ config DEBUG_MISC
 	  Say Y here if you need to enable miscellaneous debug code that should
 	  be under a more specific debug option but isn't.
 
+menu "Networking Debugging"
+
+source "net/Kconfig.debug"
+
+endmenu # "Networking Debugging"
 
 menu "Memory Debugging"
 
diff --git a/net/Kconfig.debug b/net/Kconfig.debug
new file mode 100644
index 0000000000000000000000000000000000000000..ee8d67a784291fb126d95324db05183b5a78ab70
--- /dev/null
+++ b/net/Kconfig.debug
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config NET_DEV_REFCNT_TRACKER
+	bool "Enable net device refcount tracking"
+	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
+	select REF_TRACKER
+	default n
+	help
+	  Enable debugging feature to track device references.
+	  This adds memory and cpu costs.
+
diff --git a/net/core/dev.c b/net/core/dev.c
index aba8acc1238c2213b9fc0c47c93568f731f375e5..1740d6cfe86b58359cceaec7ee9cc015a3843723 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9864,6 +9864,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
 			       netdev_unregister_timeout_secs * HZ)) {
 			pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 				 dev->name, refcnt);
+			ref_tracker_dir_print(&dev->refcnt_tracker, 10);
 			warning_time = jiffies;
 		}
 	}
@@ -10154,6 +10155,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	dev = PTR_ALIGN(p, NETDEV_ALIGN);
 	dev->padded = (char *)dev - (char *)p;
 
+	ref_tracker_dir_init(&dev->refcnt_tracker, 128);
 #ifdef CONFIG_PCPU_DEV_REFCNT
 	dev->pcpu_refcnt = alloc_percpu(int);
 	if (!dev->pcpu_refcnt)
@@ -10270,6 +10272,7 @@ void free_netdev(struct net_device *dev)
 	list_for_each_entry_safe(p, n, &dev->napi_list, dev_list)
 		netif_napi_del(p);
 
+	ref_tracker_dir_exit(&dev->refcnt_tracker);
 #ifdef CONFIG_PCPU_DEV_REFCNT
 	free_percpu(dev->pcpu_refcnt);
 	dev->pcpu_refcnt = NULL;
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 04/23] net: add net device refcount tracker to struct netdev_rx_queue
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (2 preceding siblings ...)
  2021-12-05  4:21 ` [PATCH v3 net-next 03/23] net: add net device refcount tracker infrastructure Eric Dumazet
@ 2021-12-05  4:21 ` Eric Dumazet
  2021-12-05  4:21 ` [PATCH v3 net-next 05/23] net: add net device refcount tracker to struct netdev_queue Eric Dumazet
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

This helps debugging net device refcount leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/netdevice.h | 2 ++
 net/core/net-sysfs.c      | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 143d60ed004732e4a086e66fdcf7b3d362c1dc20..3d691fadd56907212b8562432bb5fe0ced0fd962 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -741,6 +741,8 @@ struct netdev_rx_queue {
 #endif
 	struct kobject			kobj;
 	struct net_device		*dev;
+	netdevice_tracker		dev_tracker;
+
 #ifdef CONFIG_XDP_SOCKETS
 	struct xsk_buff_pool            *pool;
 #endif
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index affe34d71d313498d8fae4a319696250aa3aef0a..27a7ac2e516f65dbfdb2a2319e6faa27c7dd8f31 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1004,7 +1004,7 @@ static void rx_queue_release(struct kobject *kobj)
 #endif
 
 	memset(kobj, 0, sizeof(*kobj));
-	dev_put(queue->dev);
+	dev_put_track(queue->dev, &queue->dev_tracker);
 }
 
 static const void *rx_queue_namespace(struct kobject *kobj)
@@ -1044,7 +1044,7 @@ static int rx_queue_add_kobject(struct net_device *dev, int index)
 	/* Kobject_put later will trigger rx_queue_release call which
 	 * decreases dev refcount: Take that reference here
 	 */
-	dev_hold(queue->dev);
+	dev_hold_track(queue->dev, &queue->dev_tracker, GFP_KERNEL);
 
 	kobj->kset = dev->queues_kset;
 	error = kobject_init_and_add(kobj, &rx_queue_ktype, NULL,
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 05/23] net: add net device refcount tracker to struct netdev_queue
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (3 preceding siblings ...)
  2021-12-05  4:21 ` [PATCH v3 net-next 04/23] net: add net device refcount tracker to struct netdev_rx_queue Eric Dumazet
@ 2021-12-05  4:21 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 06/23] net: add net device refcount tracker to ethtool_phys_id() Eric Dumazet
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:21 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

This will help debugging pesky netdev reference leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/netdevice.h | 2 ++
 net/core/net-sysfs.c      | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3d691fadd56907212b8562432bb5fe0ced0fd962..b4f704337f657ebc46e02c9e5e7f5d2c2c64685e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -586,6 +586,8 @@ struct netdev_queue {
  * read-mostly part
  */
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
+
 	struct Qdisc __rcu	*qdisc;
 	struct Qdisc		*qdisc_sleeping;
 #ifdef CONFIG_SYSFS
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 27a7ac2e516f65dbfdb2a2319e6faa27c7dd8f31..3b2cdbbdc858e06fb0a482a9b7fc778e501ba1e0 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1607,7 +1607,7 @@ static void netdev_queue_release(struct kobject *kobj)
 	struct netdev_queue *queue = to_netdev_queue(kobj);
 
 	memset(kobj, 0, sizeof(*kobj));
-	dev_put(queue->dev);
+	dev_put_track(queue->dev, &queue->dev_tracker);
 }
 
 static const void *netdev_queue_namespace(struct kobject *kobj)
@@ -1647,7 +1647,7 @@ static int netdev_queue_add_kobject(struct net_device *dev, int index)
 	/* Kobject_put later will trigger netdev_queue_release call
 	 * which decreases dev refcount: Take that reference here
 	 */
-	dev_hold(queue->dev);
+	dev_hold_track(queue->dev, &queue->dev_tracker, GFP_KERNEL);
 
 	kobj->kset = dev->queues_kset;
 	error = kobject_init_and_add(kobj, &netdev_queue_ktype, NULL,
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 06/23] net: add net device refcount tracker to ethtool_phys_id()
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (4 preceding siblings ...)
  2021-12-05  4:21 ` [PATCH v3 net-next 05/23] net: add net device refcount tracker to struct netdev_queue Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 07/23] net: add net device refcount tracker to dev_ifsioc() Eric Dumazet
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

This helper might hold a netdev reference for a long time,
lets add reference tracking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ethtool/ioctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index fa8aa5ec19ba1cfff611a159c5b8bcd02f74c5ca..9a113d89352123acf56ec598c191bb75985c6be5 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1989,6 +1989,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 	struct ethtool_value id;
 	static bool busy;
 	const struct ethtool_ops *ops = dev->ethtool_ops;
+	netdevice_tracker dev_tracker;
 	int rc;
 
 	if (!ops->set_phys_id)
@@ -2008,7 +2009,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 	 * removal of the device.
 	 */
 	busy = true;
-	dev_hold(dev);
+	dev_hold_track(dev, &dev_tracker, GFP_KERNEL);
 	rtnl_unlock();
 
 	if (rc == 0) {
@@ -2032,7 +2033,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 	}
 
 	rtnl_lock();
-	dev_put(dev);
+	dev_put_track(dev, &dev_tracker);
 	busy = false;
 
 	(void) ops->set_phys_id(dev, ETHTOOL_ID_INACTIVE);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 07/23] net: add net device refcount tracker to dev_ifsioc()
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (5 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 06/23] net: add net device refcount tracker to ethtool_phys_id() Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 08/23] drop_monitor: add net device refcount tracker Eric Dumazet
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev_ioctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index cbab5fec64b12df9355b154bdcaa0f48701966e8..1d309a6669325ad26c59892c4e78ca14dec017d9 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -313,6 +313,7 @@ static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
 	int err;
 	struct net_device *dev = __dev_get_by_name(net, ifr->ifr_name);
 	const struct net_device_ops *ops;
+	netdevice_tracker dev_tracker;
 
 	if (!dev)
 		return -ENODEV;
@@ -381,10 +382,10 @@ static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
 			return -ENODEV;
 		if (!netif_is_bridge_master(dev))
 			return -EOPNOTSUPP;
-		dev_hold(dev);
+		dev_hold_track(dev, &dev_tracker, GFP_KERNEL);
 		rtnl_unlock();
 		err = br_ioctl_call(net, netdev_priv(dev), cmd, ifr, NULL);
-		dev_put(dev);
+		dev_put_track(dev, &dev_tracker);
 		rtnl_lock();
 		return err;
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 08/23] drop_monitor: add net device refcount tracker
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (6 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 07/23] net: add net device refcount tracker to dev_ifsioc() Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 09/23] net: dst: add net device refcount tracking to dst_entry Eric Dumazet
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/devlink.h   | 4 ++++
 net/core/drop_monitor.c | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 043fcec8b0aadf041aba35b8339c93ac9336b551..3276a29f2b814cda1fdce80f52c126c1cd444272 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -664,13 +664,17 @@ struct devlink_health_reporter_ops {
  * @trap_name: Trap name.
  * @trap_group_name: Trap group name.
  * @input_dev: Input netdevice.
+ * @dev_tracker: refcount tracker for @input_dev.
  * @fa_cookie: Flow action user cookie.
  * @trap_type: Trap type.
  */
 struct devlink_trap_metadata {
 	const char *trap_name;
 	const char *trap_group_name;
+
 	struct net_device *input_dev;
+	netdevice_tracker dev_tracker;
+
 	const struct flow_action_cookie *fa_cookie;
 	enum devlink_trap_type trap_type;
 };
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 49442cae6f69d5e9d93d00b53ab8f5a0563c1d37..3d0ab2eec91667bdfac93878a046fc727fc22b99 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -850,7 +850,7 @@ net_dm_hw_metadata_copy(const struct devlink_trap_metadata *metadata)
 	}
 
 	hw_metadata->input_dev = metadata->input_dev;
-	dev_hold(hw_metadata->input_dev);
+	dev_hold_track(hw_metadata->input_dev, &hw_metadata->dev_tracker, GFP_ATOMIC);
 
 	return hw_metadata;
 
@@ -864,9 +864,9 @@ net_dm_hw_metadata_copy(const struct devlink_trap_metadata *metadata)
 }
 
 static void
-net_dm_hw_metadata_free(const struct devlink_trap_metadata *hw_metadata)
+net_dm_hw_metadata_free(struct devlink_trap_metadata *hw_metadata)
 {
-	dev_put(hw_metadata->input_dev);
+	dev_put_track(hw_metadata->input_dev, &hw_metadata->dev_tracker);
 	kfree(hw_metadata->fa_cookie);
 	kfree(hw_metadata->trap_name);
 	kfree(hw_metadata->trap_group_name);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 09/23] net: dst: add net device refcount tracking to dst_entry
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (7 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 08/23] drop_monitor: add net device refcount tracker Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 10/23] ipv6: add net device refcount tracker to rt6_probe_deferred() Eric Dumazet
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/netdevice.h | 17 +++++++++++++++++
 include/net/dst.h         |  1 +
 net/core/dst.c            |  8 ++++----
 net/ipv4/route.c          |  7 ++++---
 net/ipv6/route.c          |  5 +++--
 5 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b4f704337f657ebc46e02c9e5e7f5d2c2c64685e..afed3b10491b92da880a8cd13181ff041cc54673 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3883,6 +3883,23 @@ static inline void dev_put_track(struct net_device *dev,
 	}
 }
 
+static inline void dev_replace_track(struct net_device *odev,
+				     struct net_device *ndev,
+				     netdevice_tracker *tracker,
+				     gfp_t gfp)
+{
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	if (odev)
+		ref_tracker_free(&odev->refcnt_tracker, tracker);
+#endif
+	dev_hold(ndev);
+	dev_put(odev);
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	if (ndev)
+		ref_tracker_alloc(&ndev->refcnt_tracker, tracker, gfp);
+#endif
+}
+
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
  * who is responsible for serialization of these calls.
diff --git a/include/net/dst.h b/include/net/dst.h
index a057319aabefac52075edb038358439ceec23a60..6aa252c3fc55ccaee58faebf265510469e91d780 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -77,6 +77,7 @@ struct dst_entry {
 #ifndef CONFIG_64BIT
 	atomic_t		__refcnt;	/* 32-bit offset 64 */
 #endif
+	netdevice_tracker	dev_tracker;
 };
 
 struct dst_metrics {
diff --git a/net/core/dst.c b/net/core/dst.c
index 497ef9b3fc6abb5fd9c74e5730c3aedc3277f850..d16c2c9bfebd3dadd4c8dbc4f14836574bb52bbe 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -49,7 +49,7 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops,
 	      unsigned short flags)
 {
 	dst->dev = dev;
-	dev_hold(dev);
+	dev_hold_track(dev, &dst->dev_tracker, GFP_ATOMIC);
 	dst->ops = ops;
 	dst_init_metrics(dst, dst_default_metrics.metrics, true);
 	dst->expires = 0UL;
@@ -117,7 +117,7 @@ struct dst_entry *dst_destroy(struct dst_entry * dst)
 
 	if (dst->ops->destroy)
 		dst->ops->destroy(dst);
-	dev_put(dst->dev);
+	dev_put_track(dst->dev, &dst->dev_tracker);
 
 	lwtstate_put(dst->lwtstate);
 
@@ -159,8 +159,8 @@ void dst_dev_put(struct dst_entry *dst)
 	dst->input = dst_discard;
 	dst->output = dst_discard_out;
 	dst->dev = blackhole_netdev;
-	dev_hold(dst->dev);
-	dev_put(dev);
+	dev_replace_track(dev, blackhole_netdev, &dst->dev_tracker,
+			  GFP_ATOMIC);
 }
 EXPORT_SYMBOL(dst_dev_put);
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 243a0c52be42b60226a38dce980956b33e583d80..843a7a3699feeb24f3b9af5efaff87724214cbce 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1531,8 +1531,9 @@ void rt_flush_dev(struct net_device *dev)
 			if (rt->dst.dev != dev)
 				continue;
 			rt->dst.dev = blackhole_netdev;
-			dev_hold(rt->dst.dev);
-			dev_put(dev);
+			dev_replace_track(dev, blackhole_netdev,
+					  &rt->dst.dev_tracker,
+					  GFP_ATOMIC);
 		}
 		spin_unlock_bh(&ul->lock);
 	}
@@ -2819,7 +2820,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
 		new->output = dst_discard_out;
 
 		new->dev = net->loopback_dev;
-		dev_hold(new->dev);
+		dev_hold_track(new->dev, &new->dev_tracker, GFP_ATOMIC);
 
 		rt->rt_is_input = ort->rt_is_input;
 		rt->rt_iif = ort->rt_iif;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f0d29fcb20945b5a48097c89dc57daedeed9d177..ba4dc94d76d63c98ff49c41b712231f81eb8af40 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -182,8 +182,9 @@ static void rt6_uncached_list_flush_dev(struct net *net, struct net_device *dev)
 
 			if (rt_dev == dev) {
 				rt->dst.dev = blackhole_netdev;
-				dev_hold(rt->dst.dev);
-				dev_put(rt_dev);
+				dev_replace_track(rt_dev, blackhole_netdev,
+						  &rt->dst.dev_tracker,
+						  GFP_ATOMIC);
 			}
 		}
 		spin_unlock_bh(&ul->lock);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 10/23] ipv6: add net device refcount tracker to rt6_probe_deferred()
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (8 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 09/23] net: dst: add net device refcount tracking to dst_entry Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 11/23] sit: add net device refcount tracking to ip_tunnel Eric Dumazet
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/route.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ba4dc94d76d63c98ff49c41b712231f81eb8af40..8d834f901b483edf75c493620d38f979a4bcbf69 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -593,6 +593,7 @@ struct __rt6_probe_work {
 	struct work_struct work;
 	struct in6_addr target;
 	struct net_device *dev;
+	netdevice_tracker dev_tracker;
 };
 
 static void rt6_probe_deferred(struct work_struct *w)
@@ -603,7 +604,7 @@ static void rt6_probe_deferred(struct work_struct *w)
 
 	addrconf_addr_solict_mult(&work->target, &mcaddr);
 	ndisc_send_ns(work->dev, &work->target, &mcaddr, NULL, 0);
-	dev_put(work->dev);
+	dev_put_track(work->dev, &work->dev_tracker);
 	kfree(work);
 }
 
@@ -657,7 +658,7 @@ static void rt6_probe(struct fib6_nh *fib6_nh)
 	} else {
 		INIT_WORK(&work->work, rt6_probe_deferred);
 		work->target = *nh_gw;
-		dev_hold(dev);
+		dev_hold_track(dev, &work->dev_tracker, GFP_KERNEL);
 		work->dev = dev;
 		schedule_work(&work->work);
 	}
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 11/23] sit: add net device refcount tracking to ip_tunnel
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (9 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 10/23] ipv6: add net device refcount tracker to rt6_probe_deferred() Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 12/23] ipv6: add net device refcount tracker to struct ip6_tnl Eric Dumazet
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Note that other ip_tunnel users do not seem to hold a reference
on tunnel->dev. Probably needs some investigations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/ip_tunnels.h | 3 +++
 net/ipv6/sit.c           | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index bc3b13ec93c9dcd3f5d4c0ad8598200912172863..0219fe907b261952ec1f140d105cd74d7eda6b40 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -104,7 +104,10 @@ struct metadata_dst;
 struct ip_tunnel {
 	struct ip_tunnel __rcu	*next;
 	struct hlist_node hash_node;
+
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
+
 	struct net		*net;	/* netns for packet i/o */
 
 	unsigned long	err_time;	/* Time when the last ICMP error
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 1b57ee36d6682e04085aa271c6c5c09e6e3a7b7e..057c0f83c8007fb0756ca0d3a2231fab8eebaacb 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -521,7 +521,7 @@ static void ipip6_tunnel_uninit(struct net_device *dev)
 		ipip6_tunnel_del_prl(tunnel, NULL);
 	}
 	dst_cache_reset(&tunnel->dst_cache);
-	dev_put(dev);
+	dev_put_track(dev, &tunnel->dev_tracker);
 }
 
 static int ipip6_err(struct sk_buff *skb, u32 info)
@@ -1463,7 +1463,7 @@ static int ipip6_tunnel_init(struct net_device *dev)
 		dev->tstats = NULL;
 		return err;
 	}
-	dev_hold(dev);
+	dev_hold_track(dev, &tunnel->dev_tracker, GFP_KERNEL);
 	return 0;
 }
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 12/23] ipv6: add net device refcount tracker to struct ip6_tnl
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (10 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 11/23] sit: add net device refcount tracking to ip_tunnel Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 13/23] net: add net device refcount tracker to struct neighbour Eric Dumazet
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/ip6_tunnel.h | 1 +
 net/ipv6/ip6_gre.c       | 8 ++++----
 net/ipv6/ip6_tunnel.c    | 4 ++--
 net/ipv6/ip6_vti.c       | 4 ++--
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 028eaea1c854493fdab40f655216230e991e2fc5..a38c4f1e4e5c641dcede4d7fedfcdbfadbac430e 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -46,6 +46,7 @@ struct __ip6_tnl_parm {
 struct ip6_tnl {
 	struct ip6_tnl __rcu *next;	/* next tunnel in list */
 	struct net_device *dev;	/* virtual device associated with tunnel */
+	netdevice_tracker dev_tracker;
 	struct net *net;	/* netns for packet i/o */
 	struct __ip6_tnl_parm parms;	/* tunnel configuration parameters */
 	struct flowi fl;	/* flowi template for xmit */
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index d831d243969327b2cd4b539199e02a897a6c14f7..110839a88bc2ef098df38196f997335501305ad7 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -403,7 +403,7 @@ static void ip6erspan_tunnel_uninit(struct net_device *dev)
 	ip6erspan_tunnel_unlink_md(ign, t);
 	ip6gre_tunnel_unlink(ign, t);
 	dst_cache_reset(&t->dst_cache);
-	dev_put(dev);
+	dev_put_track(dev, &t->dev_tracker);
 }
 
 static void ip6gre_tunnel_uninit(struct net_device *dev)
@@ -416,7 +416,7 @@ static void ip6gre_tunnel_uninit(struct net_device *dev)
 	if (ign->fb_tunnel_dev == dev)
 		WRITE_ONCE(ign->fb_tunnel_dev, NULL);
 	dst_cache_reset(&t->dst_cache);
-	dev_put(dev);
+	dev_put_track(dev, &t->dev_tracker);
 }
 
 
@@ -1496,7 +1496,7 @@ static int ip6gre_tunnel_init_common(struct net_device *dev)
 	}
 	ip6gre_tnl_init_features(dev);
 
-	dev_hold(dev);
+	dev_hold_track(dev, &tunnel->dev_tracker, GFP_KERNEL);
 	return 0;
 
 cleanup_dst_cache_init:
@@ -1888,7 +1888,7 @@ static int ip6erspan_tap_init(struct net_device *dev)
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 	ip6erspan_tnl_link_config(tunnel, 1);
 
-	dev_hold(dev);
+	dev_hold_track(dev, &tunnel->dev_tracker, GFP_KERNEL);
 	return 0;
 
 cleanup_dst_cache_init:
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 484aca492cc06858319e87ba9c989de9f378e896..fe786df4f8493396b803002d25701affd59ee96c 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -383,7 +383,7 @@ ip6_tnl_dev_uninit(struct net_device *dev)
 	else
 		ip6_tnl_unlink(ip6n, t);
 	dst_cache_reset(&t->dst_cache);
-	dev_put(dev);
+	dev_put_track(dev, &t->dev_tracker);
 }
 
 /**
@@ -1883,7 +1883,7 @@ ip6_tnl_dev_init_gen(struct net_device *dev)
 	dev->min_mtu = ETH_MIN_MTU;
 	dev->max_mtu = IP6_MAX_MTU - dev->hard_header_len;
 
-	dev_hold(dev);
+	dev_hold_track(dev, &t->dev_tracker, GFP_KERNEL);
 	return 0;
 
 destroy_dst:
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 527e9ead7449e5229db11b73622ff723847ffc96..ed9b6d6ca65e0173c9f03e580cca2747ad023a99 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -293,7 +293,7 @@ static void vti6_dev_uninit(struct net_device *dev)
 		RCU_INIT_POINTER(ip6n->tnls_wc[0], NULL);
 	else
 		vti6_tnl_unlink(ip6n, t);
-	dev_put(dev);
+	dev_put_track(dev, &t->dev_tracker);
 }
 
 static int vti6_input_proto(struct sk_buff *skb, int nexthdr, __be32 spi,
@@ -934,7 +934,7 @@ static inline int vti6_dev_init_gen(struct net_device *dev)
 	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
 	if (!dev->tstats)
 		return -ENOMEM;
-	dev_hold(dev);
+	dev_hold_track(dev, &t->dev_tracker, GFP_KERNEL);
 	return 0;
 }
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 13/23] net: add net device refcount tracker to struct neighbour
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (11 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 12/23] ipv6: add net device refcount tracker to struct ip6_tnl Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 14/23] net: add net device refcount tracker to struct pneigh_entry Eric Dumazet
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/neighbour.h | 1 +
 net/core/neighbour.c    | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 55af812c3ed5704813bd64aa760246255aa5f93f..190b07fe089ef5c900a0d97df0bc4d667d8bdcd6 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -158,6 +158,7 @@ struct neighbour {
 	struct list_head	managed_list;
 	struct rcu_head		rcu;
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
 	u8			primary_key[0];
 } __randomize_layout;
 
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 72ba027c34cfea6f38a9e78927c35048ebfe7a7f..fb340347e4d88f0058383697071cfb5bfbd9f925 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -624,7 +624,7 @@ ___neigh_create(struct neigh_table *tbl, const void *pkey,
 
 	memcpy(n->primary_key, pkey, key_len);
 	n->dev = dev;
-	dev_hold(dev);
+	dev_hold_track(dev, &n->dev_tracker, GFP_ATOMIC);
 
 	/* Protocol specific setup. */
 	if (tbl->constructor &&	(error = tbl->constructor(n)) < 0) {
@@ -880,7 +880,7 @@ void neigh_destroy(struct neighbour *neigh)
 	if (dev->netdev_ops->ndo_neigh_destroy)
 		dev->netdev_ops->ndo_neigh_destroy(dev, neigh);
 
-	dev_put(dev);
+	dev_put_track(dev, &neigh->dev_tracker);
 	neigh_parms_put(neigh->parms);
 
 	neigh_dbg(2, "neigh %p is destroyed\n", neigh);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 14/23] net: add net device refcount tracker to struct pneigh_entry
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (12 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 13/23] net: add net device refcount tracker to struct neighbour Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 15/23] net: add net device refcount tracker to struct neigh_parms Eric Dumazet
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/neighbour.h | 1 +
 net/core/neighbour.c    | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 190b07fe089ef5c900a0d97df0bc4d667d8bdcd6..5fffb783670a6d2432896a06d3044f6ac83feaf4 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -174,6 +174,7 @@ struct pneigh_entry {
 	struct pneigh_entry	*next;
 	possible_net_t		net;
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
 	u32			flags;
 	u8			protocol;
 	u8			key[];
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index fb340347e4d88f0058383697071cfb5bfbd9f925..56de74f8d2b1c896c478ded2c659b0207d7b5c75 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -771,10 +771,10 @@ struct pneigh_entry * pneigh_lookup(struct neigh_table *tbl,
 	write_pnet(&n->net, net);
 	memcpy(n->key, pkey, key_len);
 	n->dev = dev;
-	dev_hold(dev);
+	dev_hold_track(dev, &n->dev_tracker, GFP_KERNEL);
 
 	if (tbl->pconstructor && tbl->pconstructor(n)) {
-		dev_put(dev);
+		dev_put_track(dev, &n->dev_tracker);
 		kfree(n);
 		n = NULL;
 		goto out;
@@ -806,7 +806,7 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey,
 			write_unlock_bh(&tbl->lock);
 			if (tbl->pdestructor)
 				tbl->pdestructor(n);
-			dev_put(n->dev);
+			dev_put_track(n->dev, &n->dev_tracker);
 			kfree(n);
 			return 0;
 		}
@@ -839,7 +839,7 @@ static int pneigh_ifdown_and_unlock(struct neigh_table *tbl,
 		n->next = NULL;
 		if (tbl->pdestructor)
 			tbl->pdestructor(n);
-		dev_put(n->dev);
+		dev_put_track(n->dev, &n->dev_tracker);
 		kfree(n);
 	}
 	return -ENOENT;
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 15/23] net: add net device refcount tracker to struct neigh_parms
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (13 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 14/23] net: add net device refcount tracker to struct pneigh_entry Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 16/23] net: add net device refcount tracker to struct netdev_adjacent Eric Dumazet
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/neighbour.h | 1 +
 net/core/neighbour.c    | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 5fffb783670a6d2432896a06d3044f6ac83feaf4..937389e04c8e26b6f5a4c1362a90930145671889 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -70,6 +70,7 @@ enum {
 struct neigh_parms {
 	possible_net_t net;
 	struct net_device *dev;
+	netdevice_tracker dev_tracker;
 	struct list_head list;
 	int	(*neigh_setup)(struct neighbour *);
 	struct neigh_table *tbl;
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 56de74f8d2b1c896c478ded2c659b0207d7b5c75..dd271ffedf11cdd3a1b6bf5fad7b0ddcc5d41e80 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1666,13 +1666,13 @@ struct neigh_parms *neigh_parms_alloc(struct net_device *dev,
 		refcount_set(&p->refcnt, 1);
 		p->reachable_time =
 				neigh_rand_reach_time(NEIGH_VAR(p, BASE_REACHABLE_TIME));
-		dev_hold(dev);
+		dev_hold_track(dev, &p->dev_tracker, GFP_KERNEL);
 		p->dev = dev;
 		write_pnet(&p->net, net);
 		p->sysctl_table = NULL;
 
 		if (ops->ndo_neigh_setup && ops->ndo_neigh_setup(dev, p)) {
-			dev_put(dev);
+			dev_put_track(dev, &p->dev_tracker);
 			kfree(p);
 			return NULL;
 		}
@@ -1703,7 +1703,7 @@ void neigh_parms_release(struct neigh_table *tbl, struct neigh_parms *parms)
 	list_del(&parms->list);
 	parms->dead = 1;
 	write_unlock_bh(&tbl->lock);
-	dev_put(parms->dev);
+	dev_put_track(parms->dev, &parms->dev_tracker);
 	call_rcu(&parms->rcu_head, neigh_rcu_free_parms);
 }
 EXPORT_SYMBOL(neigh_parms_release);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 16/23] net: add net device refcount tracker to struct netdev_adjacent
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (14 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 15/23] net: add net device refcount tracker to struct neigh_parms Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 17/23] ipv6: add net device refcount tracker to struct inet6_dev Eric Dumazet
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1740d6cfe86b58359cceaec7ee9cc015a3843723..4420086f3aeb34614fc8222206dff2b2caa31d02 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6537,6 +6537,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 
 struct netdev_adjacent {
 	struct net_device *dev;
+	netdevice_tracker dev_tracker;
 
 	/* upper master flag, there can only be one master device per list */
 	bool master;
@@ -7301,7 +7302,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 	adj->ref_nr = 1;
 	adj->private = private;
 	adj->ignore = false;
-	dev_hold(adj_dev);
+	dev_hold_track(adj_dev, &adj->dev_tracker, GFP_KERNEL);
 
 	pr_debug("Insert adjacency: dev %s adj_dev %s adj->ref_nr %d; dev_hold on %s\n",
 		 dev->name, adj_dev->name, adj->ref_nr, adj_dev->name);
@@ -7330,8 +7331,8 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 	if (netdev_adjacent_is_neigh_list(dev, adj_dev, dev_list))
 		netdev_adjacent_sysfs_del(dev, adj_dev->name, dev_list);
 free_adj:
+	dev_put_track(adj_dev, &adj->dev_tracker);
 	kfree(adj);
-	dev_put(adj_dev);
 
 	return ret;
 }
@@ -7372,7 +7373,7 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
 	list_del_rcu(&adj->list);
 	pr_debug("adjacency: dev_put for %s, because link removed from %s to %s\n",
 		 adj_dev->name, dev->name, adj_dev->name);
-	dev_put(adj_dev);
+	dev_put_track(adj_dev, &adj->dev_tracker);
 	kfree_rcu(adj, rcu);
 }
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 17/23] ipv6: add net device refcount tracker to struct inet6_dev
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (15 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 16/23] net: add net device refcount tracker to struct netdev_adjacent Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 18/23] ipv4: add net device refcount tracker to struct in_device Eric Dumazet
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/if_inet6.h   | 1 +
 net/ipv6/addrconf.c      | 4 ++--
 net/ipv6/addrconf_core.c | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 653e7d0f65cb7a5e7458daf860215d1873c532e7..f026cf08a8e86c54ea5d9f1abddd5f0e3caf402b 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -160,6 +160,7 @@ struct ipv6_devstat {
 
 struct inet6_dev {
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
 
 	struct list_head	addr_list;
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3445f8017430f145496bad78afd6378bf5cb1c02..3eee17790a82fe6c528db4e821b11444cfa26866 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -405,13 +405,13 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
 	if (ndev->cnf.forwarding)
 		dev_disable_lro(dev);
 	/* We refer to the device */
-	dev_hold(dev);
+	dev_hold_track(dev, &ndev->dev_tracker, GFP_KERNEL);
 
 	if (snmp6_alloc_dev(ndev) < 0) {
 		netdev_dbg(dev, "%s: cannot allocate memory for statistics\n",
 			   __func__);
 		neigh_parms_release(&nd_tbl, ndev->nd_parms);
-		dev_put(dev);
+		dev_put_track(dev, &ndev->dev_tracker);
 		kfree(ndev);
 		return ERR_PTR(err);
 	}
diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c
index 1d4054bb345b72204179c17b4ebc69e11e3faf53..881d1477d24ad5af79fd744bee1e0792fcfa483d 100644
--- a/net/ipv6/addrconf_core.c
+++ b/net/ipv6/addrconf_core.c
@@ -263,7 +263,7 @@ void in6_dev_finish_destroy(struct inet6_dev *idev)
 #ifdef NET_REFCNT_DEBUG
 	pr_debug("%s: %s\n", __func__, dev ? dev->name : "NIL");
 #endif
-	dev_put(dev);
+	dev_put_track(dev, &idev->dev_tracker);
 	if (!idev->dead) {
 		pr_warn("Freeing alive inet6 device %p\n", idev);
 		return;
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 18/23] ipv4: add net device refcount tracker to struct in_device
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (16 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 17/23] ipv6: add net device refcount tracker to struct inet6_dev Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 19/23] net/sched: add net device refcount tracker to struct Qdisc Eric Dumazet
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/inetdevice.h | 2 ++
 net/ipv4/devinet.c         | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 518b484a7f07ebfa75cb8e4b41d78c15157f9aac..674aeead626089c2740ef844a7c8c8b799ee79dd 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -24,6 +24,8 @@ struct ipv4_devconf {
 
 struct in_device {
 	struct net_device	*dev;
+	netdevice_tracker	dev_tracker;
+
 	refcount_t		refcnt;
 	int			dead;
 	struct in_ifaddr	__rcu *ifa_list;/* IP ifaddr chain		*/
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 323e622ff9b745350a0ce63a238774281ab326e4..fba2bffd65f7967f390dcaf5183994af1ae5493b 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -243,7 +243,7 @@ void in_dev_finish_destroy(struct in_device *idev)
 #ifdef NET_REFCNT_DEBUG
 	pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL");
 #endif
-	dev_put(dev);
+	dev_put_track(dev, &idev->dev_tracker);
 	if (!idev->dead)
 		pr_err("Freeing alive in_device %p\n", idev);
 	else
@@ -271,7 +271,7 @@ static struct in_device *inetdev_init(struct net_device *dev)
 	if (IPV4_DEVCONF(in_dev->cnf, FORWARDING))
 		dev_disable_lro(dev);
 	/* Reference in_dev->dev */
-	dev_hold(dev);
+	dev_hold_track(dev, &in_dev->dev_tracker, GFP_KERNEL);
 	/* Account for reference dev->ip_ptr (below) */
 	refcount_set(&in_dev->refcnt, 1);
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 19/23] net/sched: add net device refcount tracker to struct Qdisc
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (17 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 18/23] ipv4: add net device refcount tracker to struct in_device Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 20/23] net: linkwatch: add net device refcount tracker Eric Dumazet
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sch_generic.h | 2 +-
 net/sched/sch_generic.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 22179b2fda72a0a1c2fddc07eeda40716a443a46..dbf202bb1a0edad582dbaefe72aac1632e219fbd 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -125,7 +125,7 @@ struct Qdisc {
 	spinlock_t		seqlock;
 
 	struct rcu_head		rcu;
-
+	netdevice_tracker	dev_tracker;
 	/* private data */
 	long privdata[] ____cacheline_aligned;
 };
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d33804d41c5c5a9047c808fd37ba65ae8875fc79..8c8fbf2e385679e46a1b7af47eeac059fb8468cc 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -973,7 +973,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	sch->enqueue = ops->enqueue;
 	sch->dequeue = ops->dequeue;
 	sch->dev_queue = dev_queue;
-	dev_hold(dev);
+	dev_hold_track(dev, &sch->dev_tracker, GFP_KERNEL);
 	refcount_set(&sch->refcnt, 1);
 
 	return sch;
@@ -1073,7 +1073,7 @@ static void qdisc_destroy(struct Qdisc *qdisc)
 		ops->destroy(qdisc);
 
 	module_put(ops->owner);
-	dev_put(qdisc_dev(qdisc));
+	dev_put_track(qdisc_dev(qdisc), &qdisc->dev_tracker);
 
 	trace_qdisc_destroy(qdisc);
 
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 20/23] net: linkwatch: add net device refcount tracker
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (18 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 19/23] net/sched: add net device refcount tracker to struct Qdisc Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 21/23] net: failover: " Eric Dumazet
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Add a netdevice_tracker inside struct net_device, to track
the self reference when a device is in lweventlist.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/netdevice.h | 2 ++
 net/core/link_watch.c     | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index afed3b10491b92da880a8cd13181ff041cc54673..cc17f26cd4ad59a4d1868951fee13f56ab73cd1a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1950,6 +1950,7 @@ enum netdev_ml_priv_type {
  *			keep a list of interfaces to be deleted.
  *
  *	@dev_addr_shadow:	Copy of @dev_addr to catch direct writes.
+ *	linkwatch_dev_tracker:	refcount tracker used by linkwatch.
  *
  *	FIXME: cleanup struct net_device such that network protocol info
  *	moves out.
@@ -2280,6 +2281,7 @@ struct net_device {
 	struct bpf_xdp_entity	xdp_state[__MAX_XDP_MODE];
 
 	u8 dev_addr_shadow[MAX_ADDR_LEN];
+	netdevice_tracker	linkwatch_dev_tracker;
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 9599afd0862dabfca4560de6ec36606bf3742fd6..d7d089963b1da4142a0863715382aa81241625eb 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -109,7 +109,7 @@ static void linkwatch_add_event(struct net_device *dev)
 	spin_lock_irqsave(&lweventlist_lock, flags);
 	if (list_empty(&dev->link_watch_list)) {
 		list_add_tail(&dev->link_watch_list, &lweventlist);
-		dev_hold(dev);
+		dev_hold_track(dev, &dev->linkwatch_dev_tracker, GFP_ATOMIC);
 	}
 	spin_unlock_irqrestore(&lweventlist_lock, flags);
 }
@@ -166,7 +166,7 @@ static void linkwatch_do_dev(struct net_device *dev)
 
 		netdev_state_change(dev);
 	}
-	dev_put(dev);
+	dev_put_track(dev, &dev->linkwatch_dev_tracker);
 }
 
 static void __linkwatch_run_queue(int urgent_only)
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 21/23] net: failover: add net device refcount tracker
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (19 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 20/23] net: linkwatch: add net device refcount tracker Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 22/23] ipmr, ip6mr: add net device refcount tracker to struct vif_device Eric Dumazet
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/failover.h | 1 +
 net/core/failover.c    | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/net/failover.h b/include/net/failover.h
index bb15438f39c7d98fe7c5637f08c3cfb23e7e79d2..f2b42b4b9cd6c1fb6561e7deca3c549dcf04f20f 100644
--- a/include/net/failover.h
+++ b/include/net/failover.h
@@ -25,6 +25,7 @@ struct failover_ops {
 struct failover {
 	struct list_head list;
 	struct net_device __rcu *failover_dev;
+	netdevice_tracker	dev_tracker;
 	struct failover_ops __rcu *ops;
 };
 
diff --git a/net/core/failover.c b/net/core/failover.c
index b5cd3c727285d7a1738118c246abce8d31dac08f..dcaa92a85ea23c54bc5ea68eb8a0f38fb31ff436 100644
--- a/net/core/failover.c
+++ b/net/core/failover.c
@@ -252,7 +252,7 @@ struct failover *failover_register(struct net_device *dev,
 		return ERR_PTR(-ENOMEM);
 
 	rcu_assign_pointer(failover->ops, ops);
-	dev_hold(dev);
+	dev_hold_track(dev, &failover->dev_tracker, GFP_KERNEL);
 	dev->priv_flags |= IFF_FAILOVER;
 	rcu_assign_pointer(failover->failover_dev, dev);
 
@@ -285,7 +285,7 @@ void failover_unregister(struct failover *failover)
 		    failover_dev->name);
 
 	failover_dev->priv_flags &= ~IFF_FAILOVER;
-	dev_put(failover_dev);
+	dev_put_track(failover_dev, &failover->dev_tracker);
 
 	spin_lock(&failover_lock);
 	list_del(&failover->list);
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 22/23] ipmr, ip6mr: add net device refcount tracker to struct vif_device
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (20 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 21/23] net: failover: " Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-05  4:22 ` [PATCH v3 net-next 23/23] netpoll: add net device refcount tracker to struct netpoll Eric Dumazet
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/mroute_base.h | 2 ++
 net/ipv4/ipmr.c             | 3 ++-
 net/ipv6/ip6mr.c            | 3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/mroute_base.h b/include/linux/mroute_base.h
index 8071148f29a6ec6a95df7e74bbfdeab5b5f6a644..e05ee9f001ffbf30f7b26ab5555e2db5cc058560 100644
--- a/include/linux/mroute_base.h
+++ b/include/linux/mroute_base.h
@@ -12,6 +12,7 @@
 /**
  * struct vif_device - interface representor for multicast routing
  * @dev: network device being used
+ * @dev_tracker: refcount tracker for @dev reference
  * @bytes_in: statistic; bytes ingressing
  * @bytes_out: statistic; bytes egresing
  * @pkt_in: statistic; packets ingressing
@@ -26,6 +27,7 @@
  */
 struct vif_device {
 	struct net_device *dev;
+	netdevice_tracker dev_tracker;
 	unsigned long bytes_in, bytes_out;
 	unsigned long pkt_in, pkt_out;
 	unsigned long rate_limit;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 2dda856ca260259e5626577e2b2993a6d9967aa6..4c7aca884fa9a35816008a5f3a1a58dd1baf6c06 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -696,7 +696,7 @@ static int vif_delete(struct mr_table *mrt, int vifi, int notify,
 	if (v->flags & (VIFF_TUNNEL | VIFF_REGISTER) && !notify)
 		unregister_netdevice_queue(dev, head);
 
-	dev_put(dev);
+	dev_put_track(dev, &v->dev_tracker);
 	return 0;
 }
 
@@ -896,6 +896,7 @@ static int vif_add(struct net *net, struct mr_table *mrt,
 	/* And finish update writing critical data */
 	write_lock_bh(&mrt_lock);
 	v->dev = dev;
+	netdev_tracker_alloc(dev, &v->dev_tracker, GFP_ATOMIC);
 	if (v->flags & VIFF_REGISTER)
 		mrt->mroute_reg_vif_num = vifi;
 	if (vifi+1 > mrt->maxvif)
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 36ed9efb88254003720549da52f39b11e9bf911f..a77a15a7f3dcb61c53a86e055b8a1507d9d591f8 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -746,7 +746,7 @@ static int mif6_delete(struct mr_table *mrt, int vifi, int notify,
 	if ((v->flags & MIFF_REGISTER) && !notify)
 		unregister_netdevice_queue(dev, head);
 
-	dev_put(dev);
+	dev_put_track(dev, &v->dev_tracker);
 	return 0;
 }
 
@@ -919,6 +919,7 @@ static int mif6_add(struct net *net, struct mr_table *mrt,
 	/* And finish update writing critical data */
 	write_lock_bh(&mrt_lock);
 	v->dev = dev;
+	netdev_tracker_alloc(dev, &v->dev_tracker, GFP_ATOMIC);
 #ifdef CONFIG_IPV6_PIMSM_V2
 	if (v->flags & MIFF_REGISTER)
 		mrt->mroute_reg_vif_num = vifi;
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 net-next 23/23] netpoll: add net device refcount tracker to struct netpoll
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (21 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 22/23] ipmr, ip6mr: add net device refcount tracker to struct vif_device Eric Dumazet
@ 2021-12-05  4:22 ` Eric Dumazet
  2021-12-06 23:23 ` [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Andrew Lunn
  2021-12-07  0:26 ` Jakub Kicinski
  24 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-05  4:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Dmitry Vyukov

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/netconsole.c | 2 +-
 include/linux/netpoll.h  | 1 +
 net/core/netpoll.c       | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ccecba908ded61370f8fc408ea53aa1ff305aca3..ab8cd555102083c2f0179898681489b987afe5b0 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -721,7 +721,7 @@ static int netconsole_netdev_event(struct notifier_block *this,
 				__netpoll_cleanup(&nt->np);
 
 				spin_lock_irqsave(&target_list_lock, flags);
-				dev_put(nt->np.dev);
+				dev_put_track(nt->np.dev, &nt->np.dev_tracker);
 				nt->np.dev = NULL;
 				nt->enabled = false;
 				stopped = true;
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index e6a2d72e0dc7a6929d32a2e994f24719e073121e..bd19c4b91e31204e85d30884720b761116d5c036 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,6 +24,7 @@ union inet_addr {
 
 struct netpoll {
 	struct net_device *dev;
+	netdevice_tracker dev_tracker;
 	char dev_name[IFNAMSIZ];
 	const char *name;
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index edfc0f8011f88a7d46d69e94c6343489369fa78c..db724463e7cd5089d85d8f75a77ad83bbece82dc 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -776,7 +776,7 @@ int netpoll_setup(struct netpoll *np)
 	err = __netpoll_setup(np, ndev);
 	if (err)
 		goto put;
-
+	netdev_tracker_alloc(ndev, &np->dev_tracker, GFP_KERNEL);
 	rtnl_unlock();
 	return 0;
 
@@ -853,7 +853,7 @@ void netpoll_cleanup(struct netpoll *np)
 	if (!np->dev)
 		goto out;
 	__netpoll_cleanup(np);
-	dev_put(np->dev);
+	dev_put_track(np->dev, &np->dev_tracker);
 	np->dev = NULL;
 out:
 	rtnl_unlock();
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (22 preceding siblings ...)
  2021-12-05  4:22 ` [PATCH v3 net-next 23/23] netpoll: add net device refcount tracker to struct netpoll Eric Dumazet
@ 2021-12-06 23:23 ` Andrew Lunn
  2021-12-06 23:44   ` Eric Dumazet
  2021-12-07  0:26 ` Jakub Kicinski
  24 siblings, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-06 23:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, netdev, Eric Dumazet, Dmitry Vyukov

On Sat, Dec 04, 2021 at 08:21:54PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Two first patches add a generic infrastructure, that will be used
> to get tracking of refcount increments/decrements.

Hi Eric

Using this i found:

[  774.108901] unregister_netdevice: waiting for eth0 to become free. Usage count = 4
[  774.110864] leaked reference.
[  774.110874]  dst_alloc+0x7a/0x180
[  774.110887]  ip6_dst_alloc+0x27/0x90
[  774.110894]  ip6_pol_route+0x257/0x430
[  774.110900]  ip6_pol_route_output+0x19/0x20
[  774.110905]  fib6_rule_lookup+0x18b/0x270
[  774.110914]  ip6_route_output_flags_noref+0xaa/0x110
[  774.110918]  ip6_route_output_flags+0x32/0xa0
[  774.110922]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[  774.110929]  ip6_dst_lookup_flow+0x43/0xa0
[  774.110934]  inet6_csk_route_socket+0x166/0x200
[  774.110943]  inet6_csk_xmit+0x56/0x130
[  774.110946]  __tcp_transmit_skb+0x53b/0xc30
[  774.110953]  __tcp_send_ack.part.0+0xc6/0x1a0
[  774.110958]  tcp_send_ack+0x1c/0x20
[  774.110964]  __tcp_ack_snd_check+0x42/0x200
[  774.110968]  tcp_rcv_established+0x27a/0x6f0
[  774.110973] leaked reference.
[  774.110975]  ipv6_add_dev+0x13e/0x4f0
[  774.110982]  addrconf_notify+0x2ca/0x950
[  774.110989]  raw_notifier_call_chain+0x49/0x60
[  774.111000]  call_netdevice_notifiers_info+0x50/0x90
[  774.111007]  __dev_change_net_namespace+0x30d/0x6c0
[  774.111016]  do_setlink+0xdc/0x10b0
[  774.111024]  __rtnl_newlink+0x608/0xa10
[  774.111031]  rtnl_newlink+0x49/0x70
[  774.111038]  rtnetlink_rcv_msg+0x14f/0x380
[  774.111046]  netlink_rcv_skb+0x55/0x100
[  774.111053]  rtnetlink_rcv+0x15/0x20
[  774.111059]  netlink_unicast+0x230/0x340
[  774.111064]  netlink_sendmsg+0x252/0x4b0
[  774.111075]  sock_sendmsg+0x65/0x70
[  774.111080]  ____sys_sendmsg+0x24e/0x290
[  774.111084]  ___sys_sendmsg+0x81/0xc0

I'm using GNS3 to simulate a network topology. So a collection of veth
pairs, bridges and tap interfaces spread over a few namespaces. The
network being simulated uses Segment Routing. And traceroute might also
involved in this somehow. I have 3 patches applied, to make traceroute
actually work when SRv6 is being used. You can find v3 here:

https://lore.kernel.org/netdev/20211203162926.3680281-3-andrew@lunn.ch/T/

I'm not sure if these patches are part of the problem or not. None of
the traces i've seen are directly on the ICMP path. traceroute is
using udp, and one of the traces above is for tcp, and the other looks
like it is moving an interface into a different namespace?

This is net-next from today.

Any ideas?

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-06 23:23 ` [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Andrew Lunn
@ 2021-12-06 23:44   ` Eric Dumazet
  2021-12-06 23:48     ` Eric Dumazet
  2021-12-07  0:00     ` Andrew Lunn
  0 siblings, 2 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-06 23:44 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 3:24 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Dec 04, 2021 at 08:21:54PM -0800, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Two first patches add a generic infrastructure, that will be used
> > to get tracking of refcount increments/decrements.
>
> Hi Eric
>
> Using this i found:
>
> [  774.108901] unregister_netdevice: waiting for eth0 to become free. Usage count = 4
> [  774.110864] leaked reference.
> [  774.110874]  dst_alloc+0x7a/0x180
> [  774.110887]  ip6_dst_alloc+0x27/0x90
> [  774.110894]  ip6_pol_route+0x257/0x430
> [  774.110900]  ip6_pol_route_output+0x19/0x20
> [  774.110905]  fib6_rule_lookup+0x18b/0x270
> [  774.110914]  ip6_route_output_flags_noref+0xaa/0x110
> [  774.110918]  ip6_route_output_flags+0x32/0xa0
> [  774.110922]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [  774.110929]  ip6_dst_lookup_flow+0x43/0xa0
> [  774.110934]  inet6_csk_route_socket+0x166/0x200
> [  774.110943]  inet6_csk_xmit+0x56/0x130
> [  774.110946]  __tcp_transmit_skb+0x53b/0xc30
> [  774.110953]  __tcp_send_ack.part.0+0xc6/0x1a0
> [  774.110958]  tcp_send_ack+0x1c/0x20
> [  774.110964]  __tcp_ack_snd_check+0x42/0x200
> [  774.110968]  tcp_rcv_established+0x27a/0x6f0
> [  774.110973] leaked reference.
> [  774.110975]  ipv6_add_dev+0x13e/0x4f0
> [  774.110982]  addrconf_notify+0x2ca/0x950
> [  774.110989]  raw_notifier_call_chain+0x49/0x60
> [  774.111000]  call_netdevice_notifiers_info+0x50/0x90
> [  774.111007]  __dev_change_net_namespace+0x30d/0x6c0
> [  774.111016]  do_setlink+0xdc/0x10b0
> [  774.111024]  __rtnl_newlink+0x608/0xa10
> [  774.111031]  rtnl_newlink+0x49/0x70
> [  774.111038]  rtnetlink_rcv_msg+0x14f/0x380
> [  774.111046]  netlink_rcv_skb+0x55/0x100
> [  774.111053]  rtnetlink_rcv+0x15/0x20
> [  774.111059]  netlink_unicast+0x230/0x340
> [  774.111064]  netlink_sendmsg+0x252/0x4b0
> [  774.111075]  sock_sendmsg+0x65/0x70
> [  774.111080]  ____sys_sendmsg+0x24e/0x290
> [  774.111084]  ___sys_sendmsg+0x81/0xc0
>
> I'm using GNS3 to simulate a network topology. So a collection of veth
> pairs, bridges and tap interfaces spread over a few namespaces. The
> network being simulated uses Segment Routing. And traceroute might also
> involved in this somehow. I have 3 patches applied, to make traceroute
> actually work when SRv6 is being used. You can find v3 here:
>
> https://lore.kernel.org/netdev/20211203162926.3680281-3-andrew@lunn.ch/T/
>
> I'm not sure if these patches are part of the problem or not. None of
> the traces i've seen are directly on the ICMP path. traceroute is
> using udp, and one of the traces above is for tcp, and the other looks
> like it is moving an interface into a different namespace?
>
> This is net-next from today.

I do not understand, net-next does not contain this stuff yet ?

I have other patches, this work is still in progress.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-06 23:44   ` Eric Dumazet
@ 2021-12-06 23:48     ` Eric Dumazet
  2021-12-07  0:00     ` Andrew Lunn
  1 sibling, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-06 23:48 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 3:44 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Dec 6, 2021 at 3:24 PM Andrew Lunn <andrew@lunn.ch> wrote:

> > I'm not sure if these patches are part of the problem or not. None of
> > the traces i've seen are directly on the ICMP path. traceroute is
> > using udp, and one of the traces above is for tcp, and the other looks
> > like it is moving an interface into a different namespace?
> >
> > This is net-next from today.
>
> I do not understand, net-next does not contain this stuff yet ?
>
> I have other patches, this work is still in progress.

(Total of 55 patches, but I do not want to send them and flood the mailing list)

I will send next series when first round is merged.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-06 23:44   ` Eric Dumazet
  2021-12-06 23:48     ` Eric Dumazet
@ 2021-12-07  0:00     ` Andrew Lunn
  2021-12-07  0:04       ` Eric Dumazet
  1 sibling, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-07  0:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 06, 2021 at 03:44:57PM -0800, Eric Dumazet wrote:
> On Mon, Dec 6, 2021 at 3:24 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sat, Dec 04, 2021 at 08:21:54PM -0800, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > Two first patches add a generic infrastructure, that will be used
> > > to get tracking of refcount increments/decrements.
> >
> > Hi Eric
> >
> > Using this i found:
> >
> > [  774.108901] unregister_netdevice: waiting for eth0 to become free. Usage count = 4
> > [  774.110864] leaked reference.
> > [  774.110874]  dst_alloc+0x7a/0x180
> > [  774.110887]  ip6_dst_alloc+0x27/0x90
> > [  774.110894]  ip6_pol_route+0x257/0x430
> > [  774.110900]  ip6_pol_route_output+0x19/0x20
> > [  774.110905]  fib6_rule_lookup+0x18b/0x270
> > [  774.110914]  ip6_route_output_flags_noref+0xaa/0x110
> > [  774.110918]  ip6_route_output_flags+0x32/0xa0
> > [  774.110922]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> > [  774.110929]  ip6_dst_lookup_flow+0x43/0xa0
> > [  774.110934]  inet6_csk_route_socket+0x166/0x200
> > [  774.110943]  inet6_csk_xmit+0x56/0x130
> > [  774.110946]  __tcp_transmit_skb+0x53b/0xc30
> > [  774.110953]  __tcp_send_ack.part.0+0xc6/0x1a0
> > [  774.110958]  tcp_send_ack+0x1c/0x20
> > [  774.110964]  __tcp_ack_snd_check+0x42/0x200
> > [  774.110968]  tcp_rcv_established+0x27a/0x6f0
> > [  774.110973] leaked reference.
> > [  774.110975]  ipv6_add_dev+0x13e/0x4f0
> > [  774.110982]  addrconf_notify+0x2ca/0x950
> > [  774.110989]  raw_notifier_call_chain+0x49/0x60
> > [  774.111000]  call_netdevice_notifiers_info+0x50/0x90
> > [  774.111007]  __dev_change_net_namespace+0x30d/0x6c0
> > [  774.111016]  do_setlink+0xdc/0x10b0
> > [  774.111024]  __rtnl_newlink+0x608/0xa10
> > [  774.111031]  rtnl_newlink+0x49/0x70
> > [  774.111038]  rtnetlink_rcv_msg+0x14f/0x380
> > [  774.111046]  netlink_rcv_skb+0x55/0x100
> > [  774.111053]  rtnetlink_rcv+0x15/0x20
> > [  774.111059]  netlink_unicast+0x230/0x340
> > [  774.111064]  netlink_sendmsg+0x252/0x4b0
> > [  774.111075]  sock_sendmsg+0x65/0x70
> > [  774.111080]  ____sys_sendmsg+0x24e/0x290
> > [  774.111084]  ___sys_sendmsg+0x81/0xc0
> >
> > I'm using GNS3 to simulate a network topology. So a collection of veth
> > pairs, bridges and tap interfaces spread over a few namespaces. The
> > network being simulated uses Segment Routing. And traceroute might also
> > involved in this somehow. I have 3 patches applied, to make traceroute
> > actually work when SRv6 is being used. You can find v3 here:
> >
> > https://lore.kernel.org/netdev/20211203162926.3680281-3-andrew@lunn.ch/T/
> >
> > I'm not sure if these patches are part of the problem or not. None of
> > the traces i've seen are directly on the ICMP path. traceroute is
> > using udp, and one of the traces above is for tcp, and the other looks
> > like it is moving an interface into a different namespace?
> >
> > This is net-next from today.
> 
> I do not understand, net-next does not contain this stuff yet ?

Hi Eric

I'm getting warnings like:

unregister_netdevice: waiting for eth0 to become free. Usage count = 4

which is what your patchset is supposed to help fix. So i applied what
has been posted so far, in the hope it would find the issue. It is
reporting something...

> I have other patches, this work is still in progress.

Is what is currently posted usable? Do these traces above point at the
real problem i have, or because there are more patches, i should not
trust the output?

      Andrew

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:00     ` Andrew Lunn
@ 2021-12-07  0:04       ` Eric Dumazet
  2021-12-07  0:12         ` Andrew Lunn
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-07  0:04 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 4:00 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Dec 06, 2021 at 03:44:57PM -0800, Eric Dumazet wrote:
> > On Mon, Dec 6, 2021 at 3:24 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Sat, Dec 04, 2021 at 08:21:54PM -0800, Eric Dumazet wrote:
> > > > From: Eric Dumazet <edumazet@google.com>
> > > >
> > > > Two first patches add a generic infrastructure, that will be used
> > > > to get tracking of refcount increments/decrements.
> > >
> > > Hi Eric
> > >
> > > Using this i found:
> > >
> > > [  774.108901] unregister_netdevice: waiting for eth0 to become free. Usage count = 4
> > > [  774.110864] leaked reference.
> > > [  774.110874]  dst_alloc+0x7a/0x180
> > > [  774.110887]  ip6_dst_alloc+0x27/0x90
> > > [  774.110894]  ip6_pol_route+0x257/0x430
> > > [  774.110900]  ip6_pol_route_output+0x19/0x20
> > > [  774.110905]  fib6_rule_lookup+0x18b/0x270
> > > [  774.110914]  ip6_route_output_flags_noref+0xaa/0x110
> > > [  774.110918]  ip6_route_output_flags+0x32/0xa0
> > > [  774.110922]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> > > [  774.110929]  ip6_dst_lookup_flow+0x43/0xa0
> > > [  774.110934]  inet6_csk_route_socket+0x166/0x200
> > > [  774.110943]  inet6_csk_xmit+0x56/0x130
> > > [  774.110946]  __tcp_transmit_skb+0x53b/0xc30
> > > [  774.110953]  __tcp_send_ack.part.0+0xc6/0x1a0
> > > [  774.110958]  tcp_send_ack+0x1c/0x20
> > > [  774.110964]  __tcp_ack_snd_check+0x42/0x200
> > > [  774.110968]  tcp_rcv_established+0x27a/0x6f0
> > > [  774.110973] leaked reference.
> > > [  774.110975]  ipv6_add_dev+0x13e/0x4f0
> > > [  774.110982]  addrconf_notify+0x2ca/0x950
> > > [  774.110989]  raw_notifier_call_chain+0x49/0x60
> > > [  774.111000]  call_netdevice_notifiers_info+0x50/0x90
> > > [  774.111007]  __dev_change_net_namespace+0x30d/0x6c0
> > > [  774.111016]  do_setlink+0xdc/0x10b0
> > > [  774.111024]  __rtnl_newlink+0x608/0xa10
> > > [  774.111031]  rtnl_newlink+0x49/0x70
> > > [  774.111038]  rtnetlink_rcv_msg+0x14f/0x380
> > > [  774.111046]  netlink_rcv_skb+0x55/0x100
> > > [  774.111053]  rtnetlink_rcv+0x15/0x20
> > > [  774.111059]  netlink_unicast+0x230/0x340
> > > [  774.111064]  netlink_sendmsg+0x252/0x4b0
> > > [  774.111075]  sock_sendmsg+0x65/0x70
> > > [  774.111080]  ____sys_sendmsg+0x24e/0x290
> > > [  774.111084]  ___sys_sendmsg+0x81/0xc0
> > >
> > > I'm using GNS3 to simulate a network topology. So a collection of veth
> > > pairs, bridges and tap interfaces spread over a few namespaces. The
> > > network being simulated uses Segment Routing. And traceroute might also
> > > involved in this somehow. I have 3 patches applied, to make traceroute
> > > actually work when SRv6 is being used. You can find v3 here:
> > >
> > > https://lore.kernel.org/netdev/20211203162926.3680281-3-andrew@lunn.ch/T/
> > >
> > > I'm not sure if these patches are part of the problem or not. None of
> > > the traces i've seen are directly on the ICMP path. traceroute is
> > > using udp, and one of the traces above is for tcp, and the other looks
> > > like it is moving an interface into a different namespace?
> > >
> > > This is net-next from today.
> >
> > I do not understand, net-next does not contain this stuff yet ?
>
> Hi Eric
>
> I'm getting warnings like:
>
> unregister_netdevice: waiting for eth0 to become free. Usage count = 4
>
> which is what your patchset is supposed to help fix. So i applied what
> has been posted so far, in the hope it would find the issue. It is
> reporting something...

I thought you were telling me that you got these new reports after the
patch set being applied ?

Or were they happening because of your other changes ?

>
> > I have other patches, this work is still in progress.
>
> Is what is currently posted usable? Do these traces above point at the
> real problem i have, or because there are more patches, i should not
> trust the output?

I think I have not worked yet on the XFRM side in patch set 1.
Are you using XFRM ?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:04       ` Eric Dumazet
@ 2021-12-07  0:12         ` Andrew Lunn
  2021-12-07  0:17           ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-07  0:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 06, 2021 at 04:04:19PM -0800, Eric Dumazet wrote:
> On Mon, Dec 6, 2021 at 4:00 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Mon, Dec 06, 2021 at 03:44:57PM -0800, Eric Dumazet wrote:
> > > On Mon, Dec 6, 2021 at 3:24 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > > >
> > > > On Sat, Dec 04, 2021 at 08:21:54PM -0800, Eric Dumazet wrote:
> > > > > From: Eric Dumazet <edumazet@google.com>
> > > > >
> > > > > Two first patches add a generic infrastructure, that will be used
> > > > > to get tracking of refcount increments/decrements.
> > > >
> > > > Hi Eric
> > > >
> > > > Using this i found:
> > > >
> > > > [  774.108901] unregister_netdevice: waiting for eth0 to become free. Usage count = 4
> > > > [  774.110864] leaked reference.
> > > > [  774.110874]  dst_alloc+0x7a/0x180
> > > > [  774.110887]  ip6_dst_alloc+0x27/0x90
> > > > [  774.110894]  ip6_pol_route+0x257/0x430
> > > > [  774.110900]  ip6_pol_route_output+0x19/0x20
> > > > [  774.110905]  fib6_rule_lookup+0x18b/0x270
> > > > [  774.110914]  ip6_route_output_flags_noref+0xaa/0x110
> > > > [  774.110918]  ip6_route_output_flags+0x32/0xa0
> > > > [  774.110922]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> > > > [  774.110929]  ip6_dst_lookup_flow+0x43/0xa0
> > > > [  774.110934]  inet6_csk_route_socket+0x166/0x200
> > > > [  774.110943]  inet6_csk_xmit+0x56/0x130
> > > > [  774.110946]  __tcp_transmit_skb+0x53b/0xc30
> > > > [  774.110953]  __tcp_send_ack.part.0+0xc6/0x1a0
> > > > [  774.110958]  tcp_send_ack+0x1c/0x20
> > > > [  774.110964]  __tcp_ack_snd_check+0x42/0x200
> > > > [  774.110968]  tcp_rcv_established+0x27a/0x6f0
> > > > [  774.110973] leaked reference.
> > > > [  774.110975]  ipv6_add_dev+0x13e/0x4f0
> > > > [  774.110982]  addrconf_notify+0x2ca/0x950
> > > > [  774.110989]  raw_notifier_call_chain+0x49/0x60
> > > > [  774.111000]  call_netdevice_notifiers_info+0x50/0x90
> > > > [  774.111007]  __dev_change_net_namespace+0x30d/0x6c0
> > > > [  774.111016]  do_setlink+0xdc/0x10b0
> > > > [  774.111024]  __rtnl_newlink+0x608/0xa10
> > > > [  774.111031]  rtnl_newlink+0x49/0x70
> > > > [  774.111038]  rtnetlink_rcv_msg+0x14f/0x380
> > > > [  774.111046]  netlink_rcv_skb+0x55/0x100
> > > > [  774.111053]  rtnetlink_rcv+0x15/0x20
> > > > [  774.111059]  netlink_unicast+0x230/0x340
> > > > [  774.111064]  netlink_sendmsg+0x252/0x4b0
> > > > [  774.111075]  sock_sendmsg+0x65/0x70
> > > > [  774.111080]  ____sys_sendmsg+0x24e/0x290
> > > > [  774.111084]  ___sys_sendmsg+0x81/0xc0
> > > >
> > > > I'm using GNS3 to simulate a network topology. So a collection of veth
> > > > pairs, bridges and tap interfaces spread over a few namespaces. The
> > > > network being simulated uses Segment Routing. And traceroute might also
> > > > involved in this somehow. I have 3 patches applied, to make traceroute
> > > > actually work when SRv6 is being used. You can find v3 here:
> > > >
> > > > https://lore.kernel.org/netdev/20211203162926.3680281-3-andrew@lunn.ch/T/
> > > >
> > > > I'm not sure if these patches are part of the problem or not. None of
> > > > the traces i've seen are directly on the ICMP path. traceroute is
> > > > using udp, and one of the traces above is for tcp, and the other looks
> > > > like it is moving an interface into a different namespace?
> > > >
> > > > This is net-next from today.
> > >
> > > I do not understand, net-next does not contain this stuff yet ?
> >
> > Hi Eric
> >
> > I'm getting warnings like:
> >
> > unregister_netdevice: waiting for eth0 to become free. Usage count = 4
> >
> > which is what your patchset is supposed to help fix. So i applied what
> > has been posted so far, in the hope it would find the issue. It is
> > reporting something...
> 
> I thought you were telling me that you got these new reports after the
> patch set being applied ?

Hi Eric

No. I applied them.

> Or were they happening because of your other changes ?

Hard to say. It looks like some sort of race condition. Sometimes when
i shut down the GNS3 simulation, i get the issues, sometimes not. I
don't have a good enough feeling to say either way, is it an existing
problem, or it is my code which is triggering it.

> > > I have other patches, this work is still in progress.
> >
> > Is what is currently posted usable? Do these traces above point at the
> > real problem i have, or because there are more patches, i should not
> > trust the output?
> 
> I think I have not worked yet on the XFRM side in patch set 1.
> Are you using XFRM ?

I don't think SRv6 uses XFRM.

  Andrew

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:12         ` Andrew Lunn
@ 2021-12-07  0:17           ` Eric Dumazet
  2021-12-07  0:21             ` Eric Dumazet
  2021-12-07  0:27             ` Andrew Lunn
  0 siblings, 2 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-07  0:17 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 4:12 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
>
> Hard to say. It looks like some sort of race condition. Sometimes when
> i shut down the GNS3 simulation, i get the issues, sometimes not. I
> don't have a good enough feeling to say either way, is it an existing
> problem, or it is my code which is triggering it.

OK got it.

I think it might be premature to use ref_tracker yet, until we also
have the netns one.
(Seeing the netns change path from your report, this might be relevant)

Path series adding netns tracking:

1fe7f3e6bf91 net: add networking namespace refcount tracker
14d34ec0eaad net: add netns refcount tracker to struct sock
648e1c8128a1 net: add netns refcount tracker to struct seq_net_private
fa5ec9628f3e net: sched: add netns refcount tracker to struct tcf_exts
fa9f11a0a627 netfilter: nfnetlink: add netns refcount tracker to
struct nfulnl_instance
8e3bbdc619d0 l2tp: add netns refcount tracker to l2tp_dfs_seq_data
323fd18ce64c ppp: add netns refcount tracker
d01d6c0df780 netfilter: nf_nat_masquerade: add netns refcount tracker
to masq_dev_work
1b7051234a99 SUNRPC: add netns refcount tracker to struct svc_xprt
44721a730a24 SUNRPC: add netns refcount tracker to struct gss_auth
648e8fd765b7 SUNRPC: add netns refcount tracker to struct rpc_xprt
c1d5973f3af0 net: initialize init_net earlier
75285dbd40cd net: add netns refcount tracker to struct nsproxy
0fbde1282785 vfs: add netns refcount tracker to struct fs_context
5a0c6bd0445f audit: add netns refcount tracker to struct audit_net
145f70501bfb audit: add netns refcount tracker to struct audit_reply
b5af80d1c341 audit: add netns refcount tracker to struct audit_netlink_list

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:17           ` Eric Dumazet
@ 2021-12-07  0:21             ` Eric Dumazet
  2021-12-07  0:27             ` Andrew Lunn
  1 sibling, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-07  0:21 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 4:17 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Dec 6, 2021 at 4:12 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> >
> > Hard to say. It looks like some sort of race condition. Sometimes when
> > i shut down the GNS3 simulation, i get the issues, sometimes not. I
> > don't have a good enough feeling to say either way, is it an existing
> > problem, or it is my code which is triggering it.
>
> OK got it.
>
> I think it might be premature to use ref_tracker yet, until we also
> have the netns one.
> (Seeing the netns change path from your report, this might be relevant)
>
> Path series adding netns tracking:
>
> 1fe7f3e6bf91 net: add networking namespace refcount tracker
> 14d34ec0eaad net: add netns refcount tracker to struct sock
> 648e1c8128a1 net: add netns refcount tracker to struct seq_net_private
> fa5ec9628f3e net: sched: add netns refcount tracker to struct tcf_exts
> fa9f11a0a627 netfilter: nfnetlink: add netns refcount tracker to
> struct nfulnl_instance
> 8e3bbdc619d0 l2tp: add netns refcount tracker to l2tp_dfs_seq_data
> 323fd18ce64c ppp: add netns refcount tracker
> d01d6c0df780 netfilter: nf_nat_masquerade: add netns refcount tracker
> to masq_dev_work
> 1b7051234a99 SUNRPC: add netns refcount tracker to struct svc_xprt
> 44721a730a24 SUNRPC: add netns refcount tracker to struct gss_auth
> 648e8fd765b7 SUNRPC: add netns refcount tracker to struct rpc_xprt
> c1d5973f3af0 net: initialize init_net earlier
> 75285dbd40cd net: add netns refcount tracker to struct nsproxy
> 0fbde1282785 vfs: add netns refcount tracker to struct fs_context
> 5a0c6bd0445f audit: add netns refcount tracker to struct audit_net
> 145f70501bfb audit: add netns refcount tracker to struct audit_reply
> b5af80d1c341 audit: add netns refcount tracker to struct audit_netlink_list

And remaining of netdev tracking patches would be

b445498bdb7a net: eql: add net device refcount tracker
2702cdbf4a6d vlan: add net device refcount tracker
1f1ef25dabe9 net: bridge: add net device refcount tracker
e418e7268655 net: watchdog: add net device refcount tracker
b08577d21d47 net: switchdev: add net device refcount tracker
57f014dc36db inet: add net device refcount tracker to struct fib_nh_common
a51ab6e951ea ax25: add net device refcount tracker
cd494c182dcb llc: add net device refcount tracker
9572141ddb29 pktgen add net device refcount tracker
d0b171bbc275 net/smc: add net device tracker to struct smc_pnetentry
c335545a38c0 netlink: add net device refcount tracker to struct ethnl_req_info
4bb89c4fef19 openvswitch: add net device refcount tracker to struct vport
86957122aab2 net: sched: act_mirred: add net device refcount tracker

I think the most important one for the leaks would be the " inet: add
net device refcount tracker to struct fib_nh_common"

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
                   ` (23 preceding siblings ...)
  2021-12-06 23:23 ` [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Andrew Lunn
@ 2021-12-07  0:26 ` Jakub Kicinski
  24 siblings, 0 replies; 52+ messages in thread
From: Jakub Kicinski @ 2021-12-07  0:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet, Dmitry Vyukov

On Sat,  4 Dec 2021 20:21:54 -0800 Eric Dumazet wrote:
> Two first patches add a generic infrastructure, that will be used
> to get tracking of refcount increments/decrements.
> 
> The general idea is to be able to precisely pair each decrement with
> a corresponding prior increment. Both share a cookie, basically
> a pointer to private data storing stack traces.
> 
> The third patch adds dev_hold_track() and dev_put_track() helpers
> (CONFIG_NET_DEV_REFCNT_TRACKER)
> 
> Then a series of 20 patches converts some dev_hold()/dev_put()
> pairs to new hepers : dev_hold_track() and dev_put_track().
> 
> Hopefully this will be used by developpers and syzbot to
> root cause bugs that cause netdevice dismantles freezes.
> 
> With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
> some class of bugs, but too late (when too many dev_put()
> were happening).

Applied with minor fixes to patches 3 and 20. Thanks!

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:17           ` Eric Dumazet
  2021-12-07  0:21             ` Eric Dumazet
@ 2021-12-07  0:27             ` Andrew Lunn
  2021-12-07  0:53               ` Eric Dumazet
  1 sibling, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-07  0:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 06, 2021 at 04:17:11PM -0800, Eric Dumazet wrote:
> On Mon, Dec 6, 2021 at 4:12 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> >
> > Hard to say. It looks like some sort of race condition. Sometimes when
> > i shut down the GNS3 simulation, i get the issues, sometimes not. I
> > don't have a good enough feeling to say either way, is it an existing
> > problem, or it is my code which is triggering it.
> 
> OK got it.
> 
> I think it might be premature to use ref_tracker yet, until we also
> have the netns one.

There is a lot of netns going on with GNS3. So it does sound too
early.

Could i get access to the full set of patches and try them out?

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:27             ` Andrew Lunn
@ 2021-12-07  0:53               ` Eric Dumazet
  2021-12-07 19:52                 ` Andrew Lunn
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-07  0:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Mon, Dec 6, 2021 at 4:27 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Dec 06, 2021 at 04:17:11PM -0800, Eric Dumazet wrote:
> > On Mon, Dec 6, 2021 at 4:12 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > >
> > > Hard to say. It looks like some sort of race condition. Sometimes when
> > > i shut down the GNS3 simulation, i get the issues, sometimes not. I
> > > don't have a good enough feeling to say either way, is it an existing
> > > problem, or it is my code which is triggering it.
> >
> > OK got it.
> >
> > I think it might be premature to use ref_tracker yet, until we also
> > have the netns one.
>
> There is a lot of netns going on with GNS3. So it does sound too
> early.
>
> Could i get access to the full set of patches and try them out?
>
> Thanks
>         Andrew

I just sent the netns tracker series.

I will shortly send the remainder of netdev tracking patches.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07  0:53               ` Eric Dumazet
@ 2021-12-07 19:52                 ` Andrew Lunn
  2021-12-07 20:00                   ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-07 19:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

> I just sent the netns tracker series.
> 
> I will shortly send the remainder of netdev tracking patches.

Hi Eric

Thanks for sending the full set of patches.

I applied them on top of net-next. And i can still reproduce my
issue. This is without any changes of mine, just net-next plus your
two new patchsets.

It is not easy to reproduce. I needed to start/stop the GNS3
simulation around 8 times before i triggered it. I don't know how GNS3
does it tairdown. It could be there are active daemons running in the
name spaces as it removes the veth and tap devices? That might explain
why some of the leaks seem to be from TCP connection setup attempts?

Given i now have your full patchset, do you think these traces are
valid? Are they pointing at real leaks?

[ 1423.515246] unregister_netdevice: waiting for eth0 to become free. Usage count = 9
[ 1423.515747] leaked reference.
[ 1423.515755]  dst_alloc+0x7a/0x180
[ 1423.515765]  ip6_dst_alloc+0x27/0x90
[ 1423.515771]  ip6_pol_route+0x257/0x430
[ 1423.515777]  ip6_pol_route_output+0x19/0x20
[ 1423.515781]  fib6_rule_lookup+0x18b/0x270
[ 1423.515789]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.515793]  ip6_route_output_flags+0x32/0xa0
[ 1423.515797]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[ 1423.515803]  ip6_dst_lookup_flow+0x43/0xa0
[ 1423.515808]  inet6_csk_route_socket+0x166/0x200
[ 1423.515815]  inet6_csk_xmit+0x56/0x130
[ 1423.515818]  __tcp_transmit_skb+0x53b/0xc30
[ 1423.515825]  __tcp_send_ack.part.0+0xc6/0x1a0
[ 1423.515829]  tcp_send_ack+0x1c/0x20
[ 1423.515834]  __tcp_ack_snd_check+0x42/0x200
[ 1423.515838]  tcp_rcv_established+0x27a/0x6f0
[ 1423.515842] leaked reference.
[ 1423.515843]  dst_alloc+0x7a/0x180
[ 1423.515848]  ip6_dst_alloc+0x27/0x90
[ 1423.515851]  ip6_pol_route+0x257/0x430
[ 1423.515855]  ip6_pol_route_output+0x19/0x20
[ 1423.515860]  fib6_rule_lookup+0x18b/0x270
[ 1423.515865]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.515869]  ip6_route_output_flags+0x32/0xa0
[ 1423.515872]  seg6_output_core+0x28d/0x320
[ 1423.515878]  seg6_output+0x33/0x120
[ 1423.515884]  lwtunnel_output+0x72/0xc0
[ 1423.515890]  ip6_local_out+0x61/0x70
[ 1423.515894]  ip6_send_skb+0x23/0x70
[ 1423.515898]  udp_v6_send_skb+0x207/0x460
[ 1423.515905]  udpv6_sendmsg+0xb13/0xdb0
[ 1423.515908]  inet6_sendmsg+0x65/0x70
[ 1423.515916]  sock_sendmsg+0x48/0x70
[ 1423.515920] leaked reference.
[ 1423.515922]  dst_alloc+0x7a/0x180
[ 1423.515926]  ip6_dst_alloc+0x27/0x90
[ 1423.515929]  ip6_pol_route+0x257/0x430
[ 1423.515933]  ip6_pol_route_output+0x19/0x20
[ 1423.515938]  fib6_rule_lookup+0x18b/0x270
[ 1423.515943]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.515946]  ip6_route_output_flags+0x32/0xa0
[ 1423.515949]  seg6_output_core+0x28d/0x320
[ 1423.515955]  seg6_output+0x33/0x120
[ 1423.515961]  lwtunnel_output+0x72/0xc0
[ 1423.515965]  ip6_local_out+0x61/0x70
[ 1423.515968]  ip6_send_skb+0x23/0x70
[ 1423.515973]  udp_v6_send_skb+0x207/0x460
[ 1423.515979]  udpv6_sendmsg+0xb13/0xdb0
[ 1423.515982]  inet6_sendmsg+0x65/0x70
[ 1423.515985]  sock_sendmsg+0x48/0x70
[ 1423.515988] leaked reference.
[ 1423.515990]  dst_alloc+0x7a/0x180
[ 1423.515993]  ip6_dst_alloc+0x27/0x90
[ 1423.515997]  ip6_pol_route+0x257/0x430
[ 1423.516001]  ip6_pol_route_output+0x19/0x20
[ 1423.516005]  fib6_rule_lookup+0x18b/0x270
[ 1423.516010]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.516014]  ip6_route_output_flags+0x32/0xa0
[ 1423.516017]  seg6_output_core+0x28d/0x320
[ 1423.516022]  seg6_output+0x33/0x120
[ 1423.516028]  lwtunnel_output+0x72/0xc0
[ 1423.516032]  ip6_local_out+0x61/0x70
[ 1423.516036]  ip6_send_skb+0x23/0x70
[ 1423.516040]  udp_v6_send_skb+0x207/0x460
[ 1423.516046]  udpv6_sendmsg+0xb13/0xdb0
[ 1423.516049]  inet6_sendmsg+0x65/0x70
[ 1423.516052]  sock_sendmsg+0x48/0x70
[ 1423.516055] leaked reference.
[ 1423.516057]  dst_alloc+0x7a/0x180
[ 1423.516061]  ip6_dst_alloc+0x27/0x90
[ 1423.516064]  ip6_pol_route+0x257/0x430
[ 1423.516068]  ip6_pol_route_output+0x19/0x20
[ 1423.516072]  fib6_rule_lookup+0x18b/0x270
[ 1423.516078]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.516081]  ip6_route_output_flags+0x32/0xa0
[ 1423.516084]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[ 1423.516088]  ip6_dst_lookup_flow+0x43/0xa0
[ 1423.516092]  inet6_csk_route_socket+0x166/0x200
[ 1423.516099]  inet6_csk_xmit+0x56/0x130
[ 1423.516102]  __tcp_transmit_skb+0x53b/0xc30
[ 1423.516106]  tcp_write_xmit+0x3d4/0x13f0
[ 1423.516110]  __tcp_push_pending_frames+0x37/0x100
[ 1423.516114]  tcp_push+0xd2/0x100
[ 1423.516122]  tcp_sendmsg_locked+0x2a7/0xdb0
[ 1423.516128] leaked reference.
[ 1423.516130]  dst_alloc+0x7a/0x180
[ 1423.516134]  ip6_dst_alloc+0x27/0x90
[ 1423.516137]  ip6_pol_route+0x257/0x430
[ 1423.516141]  ip6_pol_route_output+0x19/0x20
[ 1423.516146]  fib6_rule_lookup+0x18b/0x270
[ 1423.516151]  ip6_route_output_flags_noref+0xaa/0x110
[ 1423.516154]  ip6_route_output_flags+0x32/0xa0
[ 1423.516157]  ip6_dst_lookup_tail.constprop.0+0xde/0x240
[ 1423.516161]  ip6_dst_lookup_flow+0x43/0xa0
[ 1423.516166]  tcp_v6_connect+0x2a7/0x670
[ 1423.516171]  __inet_stream_connect+0xd1/0x3b0
[ 1423.516175]  inet_stream_connect+0x3b/0x60
[ 1423.516178]  __sys_connect_file+0x5f/0x70
[ 1423.516182]  __sys_connect+0xa2/0xd0
[ 1423.516186]  __x64_sys_connect+0x18/0x20
[ 1423.516190]  do_syscall_64+0x3b/0xc0
[ 1423.516197] leaked reference.
[ 1423.516198]  fib6_nh_init+0x30d/0x9c0
[ 1423.516203]  rtm_new_nexthop+0x68a/0x13a0
[ 1423.516208]  rtnetlink_rcv_msg+0x14f/0x380
[ 1423.516216]  netlink_rcv_skb+0x55/0x100
[ 1423.516222]  rtnetlink_rcv+0x15/0x20
[ 1423.516228]  netlink_unicast+0x230/0x340
[ 1423.516232]  netlink_sendmsg+0x252/0x4b0
[ 1423.516236]  sock_sendmsg+0x65/0x70
[ 1423.516240]  ____sys_sendmsg+0x24e/0x290
[ 1423.516243]  ___sys_sendmsg+0x81/0xc0
[ 1423.516247]  __sys_sendmsg+0x62/0xb0
[ 1423.516252]  __x64_sys_sendmsg+0x1d/0x20
[ 1423.516256]  do_syscall_64+0x3b/0xc0
[ 1423.516260]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1423.516266] leaked reference.
[ 1423.516267]  ipv6_add_dev+0x13e/0x4f0
[ 1423.516273]  addrconf_notify+0x2ca/0x950
[ 1423.516280]  raw_notifier_call_chain+0x49/0x60
[ 1423.516288]  call_netdevice_notifiers_info+0x50/0x90
[ 1423.516293]  __dev_change_net_namespace+0x30d/0x6c0
[ 1423.516299]  do_setlink+0xdc/0x10b0
[ 1423.516306]  __rtnl_newlink+0x608/0xa10
[ 1423.516312]  rtnl_newlink+0x49/0x70
[ 1423.516318]  rtnetlink_rcv_msg+0x14f/0x380
[ 1423.516323]  netlink_rcv_skb+0x55/0x100
[ 1423.516328]  rtnetlink_rcv+0x15/0x20
[ 1423.516333]  netlink_unicast+0x230/0x340
[ 1423.516337]  netlink_sendmsg+0x252/0x4b0
[ 1423.516345]  sock_sendmsg+0x65/0x70
[ 1423.516348]  ____sys_sendmsg+0x24e/0x290
[ 1423.516352]  ___sys_sendmsg+0x81/0xc0

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07 19:52                 ` Andrew Lunn
@ 2021-12-07 20:00                   ` Eric Dumazet
  2021-12-08 17:29                     ` Andrew Lunn
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-07 20:00 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Tue, Dec 7, 2021 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I just sent the netns tracker series.
> >
> > I will shortly send the remainder of netdev tracking patches.
>
> Hi Eric
>
> Thanks for sending the full set of patches.
>
> I applied them on top of net-next. And i can still reproduce my
> issue. This is without any changes of mine, just net-next plus your
> two new patchsets.
>
> It is not easy to reproduce. I needed to start/stop the GNS3
> simulation around 8 times before i triggered it. I don't know how GNS3
> does it tairdown. It could be there are active daemons running in the
> name spaces as it removes the veth and tap devices? That might explain
> why some of the leaks seem to be from TCP connection setup attempts?
>
> Given i now have your full patchset, do you think these traces are
> valid? Are they pointing at real leaks?
>
> [ 1423.515246] unregister_netdevice: waiting for eth0 to become free. Usage count = 9
> [ 1423.515747] leaked reference.
> [ 1423.515755]  dst_alloc+0x7a/0x180
> [ 1423.515765]  ip6_dst_alloc+0x27/0x90
> [ 1423.515771]  ip6_pol_route+0x257/0x430
> [ 1423.515777]  ip6_pol_route_output+0x19/0x20
> [ 1423.515781]  fib6_rule_lookup+0x18b/0x270
> [ 1423.515789]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.515793]  ip6_route_output_flags+0x32/0xa0
> [ 1423.515797]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [ 1423.515803]  ip6_dst_lookup_flow+0x43/0xa0
> [ 1423.515808]  inet6_csk_route_socket+0x166/0x200
> [ 1423.515815]  inet6_csk_xmit+0x56/0x130
> [ 1423.515818]  __tcp_transmit_skb+0x53b/0xc30
> [ 1423.515825]  __tcp_send_ack.part.0+0xc6/0x1a0
> [ 1423.515829]  tcp_send_ack+0x1c/0x20
> [ 1423.515834]  __tcp_ack_snd_check+0x42/0x200
> [ 1423.515838]  tcp_rcv_established+0x27a/0x6f0
> [ 1423.515842] leaked reference.
> [ 1423.515843]  dst_alloc+0x7a/0x180
> [ 1423.515848]  ip6_dst_alloc+0x27/0x90
> [ 1423.515851]  ip6_pol_route+0x257/0x430
> [ 1423.515855]  ip6_pol_route_output+0x19/0x20
> [ 1423.515860]  fib6_rule_lookup+0x18b/0x270
> [ 1423.515865]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.515869]  ip6_route_output_flags+0x32/0xa0
> [ 1423.515872]  seg6_output_core+0x28d/0x320
> [ 1423.515878]  seg6_output+0x33/0x120
> [ 1423.515884]  lwtunnel_output+0x72/0xc0
> [ 1423.515890]  ip6_local_out+0x61/0x70
> [ 1423.515894]  ip6_send_skb+0x23/0x70
> [ 1423.515898]  udp_v6_send_skb+0x207/0x460
> [ 1423.515905]  udpv6_sendmsg+0xb13/0xdb0
> [ 1423.515908]  inet6_sendmsg+0x65/0x70
> [ 1423.515916]  sock_sendmsg+0x48/0x70
> [ 1423.515920] leaked reference.
> [ 1423.515922]  dst_alloc+0x7a/0x180
> [ 1423.515926]  ip6_dst_alloc+0x27/0x90
> [ 1423.515929]  ip6_pol_route+0x257/0x430
> [ 1423.515933]  ip6_pol_route_output+0x19/0x20
> [ 1423.515938]  fib6_rule_lookup+0x18b/0x270
> [ 1423.515943]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.515946]  ip6_route_output_flags+0x32/0xa0
> [ 1423.515949]  seg6_output_core+0x28d/0x320
> [ 1423.515955]  seg6_output+0x33/0x120
> [ 1423.515961]  lwtunnel_output+0x72/0xc0
> [ 1423.515965]  ip6_local_out+0x61/0x70
> [ 1423.515968]  ip6_send_skb+0x23/0x70
> [ 1423.515973]  udp_v6_send_skb+0x207/0x460
> [ 1423.515979]  udpv6_sendmsg+0xb13/0xdb0
> [ 1423.515982]  inet6_sendmsg+0x65/0x70
> [ 1423.515985]  sock_sendmsg+0x48/0x70
> [ 1423.515988] leaked reference.
> [ 1423.515990]  dst_alloc+0x7a/0x180
> [ 1423.515993]  ip6_dst_alloc+0x27/0x90
> [ 1423.515997]  ip6_pol_route+0x257/0x430
> [ 1423.516001]  ip6_pol_route_output+0x19/0x20
> [ 1423.516005]  fib6_rule_lookup+0x18b/0x270
> [ 1423.516010]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.516014]  ip6_route_output_flags+0x32/0xa0
> [ 1423.516017]  seg6_output_core+0x28d/0x320
> [ 1423.516022]  seg6_output+0x33/0x120
> [ 1423.516028]  lwtunnel_output+0x72/0xc0
> [ 1423.516032]  ip6_local_out+0x61/0x70
> [ 1423.516036]  ip6_send_skb+0x23/0x70
> [ 1423.516040]  udp_v6_send_skb+0x207/0x460
> [ 1423.516046]  udpv6_sendmsg+0xb13/0xdb0
> [ 1423.516049]  inet6_sendmsg+0x65/0x70
> [ 1423.516052]  sock_sendmsg+0x48/0x70
> [ 1423.516055] leaked reference.
> [ 1423.516057]  dst_alloc+0x7a/0x180
> [ 1423.516061]  ip6_dst_alloc+0x27/0x90
> [ 1423.516064]  ip6_pol_route+0x257/0x430
> [ 1423.516068]  ip6_pol_route_output+0x19/0x20
> [ 1423.516072]  fib6_rule_lookup+0x18b/0x270
> [ 1423.516078]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.516081]  ip6_route_output_flags+0x32/0xa0
> [ 1423.516084]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [ 1423.516088]  ip6_dst_lookup_flow+0x43/0xa0
> [ 1423.516092]  inet6_csk_route_socket+0x166/0x200
> [ 1423.516099]  inet6_csk_xmit+0x56/0x130
> [ 1423.516102]  __tcp_transmit_skb+0x53b/0xc30
> [ 1423.516106]  tcp_write_xmit+0x3d4/0x13f0
> [ 1423.516110]  __tcp_push_pending_frames+0x37/0x100
> [ 1423.516114]  tcp_push+0xd2/0x100
> [ 1423.516122]  tcp_sendmsg_locked+0x2a7/0xdb0
> [ 1423.516128] leaked reference.
> [ 1423.516130]  dst_alloc+0x7a/0x180
> [ 1423.516134]  ip6_dst_alloc+0x27/0x90
> [ 1423.516137]  ip6_pol_route+0x257/0x430
> [ 1423.516141]  ip6_pol_route_output+0x19/0x20
> [ 1423.516146]  fib6_rule_lookup+0x18b/0x270
> [ 1423.516151]  ip6_route_output_flags_noref+0xaa/0x110
> [ 1423.516154]  ip6_route_output_flags+0x32/0xa0
> [ 1423.516157]  ip6_dst_lookup_tail.constprop.0+0xde/0x240
> [ 1423.516161]  ip6_dst_lookup_flow+0x43/0xa0
> [ 1423.516166]  tcp_v6_connect+0x2a7/0x670
> [ 1423.516171]  __inet_stream_connect+0xd1/0x3b0
> [ 1423.516175]  inet_stream_connect+0x3b/0x60
> [ 1423.516178]  __sys_connect_file+0x5f/0x70
> [ 1423.516182]  __sys_connect+0xa2/0xd0
> [ 1423.516186]  __x64_sys_connect+0x18/0x20
> [ 1423.516190]  do_syscall_64+0x3b/0xc0
> [ 1423.516197] leaked reference.
> [ 1423.516198]  fib6_nh_init+0x30d/0x9c0
> [ 1423.516203]  rtm_new_nexthop+0x68a/0x13a0
> [ 1423.516208]  rtnetlink_rcv_msg+0x14f/0x380
> [ 1423.516216]  netlink_rcv_skb+0x55/0x100
> [ 1423.516222]  rtnetlink_rcv+0x15/0x20
> [ 1423.516228]  netlink_unicast+0x230/0x340
> [ 1423.516232]  netlink_sendmsg+0x252/0x4b0
> [ 1423.516236]  sock_sendmsg+0x65/0x70
> [ 1423.516240]  ____sys_sendmsg+0x24e/0x290
> [ 1423.516243]  ___sys_sendmsg+0x81/0xc0
> [ 1423.516247]  __sys_sendmsg+0x62/0xb0
> [ 1423.516252]  __x64_sys_sendmsg+0x1d/0x20
> [ 1423.516256]  do_syscall_64+0x3b/0xc0
> [ 1423.516260]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 1423.516266] leaked reference.
> [ 1423.516267]  ipv6_add_dev+0x13e/0x4f0
> [ 1423.516273]  addrconf_notify+0x2ca/0x950
> [ 1423.516280]  raw_notifier_call_chain+0x49/0x60
> [ 1423.516288]  call_netdevice_notifiers_info+0x50/0x90
> [ 1423.516293]  __dev_change_net_namespace+0x30d/0x6c0
> [ 1423.516299]  do_setlink+0xdc/0x10b0
> [ 1423.516306]  __rtnl_newlink+0x608/0xa10
> [ 1423.516312]  rtnl_newlink+0x49/0x70
> [ 1423.516318]  rtnetlink_rcv_msg+0x14f/0x380
> [ 1423.516323]  netlink_rcv_skb+0x55/0x100
> [ 1423.516328]  rtnetlink_rcv+0x15/0x20
> [ 1423.516333]  netlink_unicast+0x230/0x340
> [ 1423.516337]  netlink_sendmsg+0x252/0x4b0
> [ 1423.516345]  sock_sendmsg+0x65/0x70
> [ 1423.516348]  ____sys_sendmsg+0x24e/0x290
> [ 1423.516352]  ___sys_sendmsg+0x81/0xc0

Thanks for the report.

This all involves IPv6, and this might point to something I hinted
about last week.

Can you try :

Note that I am about to travel, and will be unable to respond to
emails for about 20 hours.
Thanks !

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 8d834f901b48..043d270b65d6 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -163,9 +163,6 @@ static void rt6_uncached_list_flush_dev(struct net
*net, struct net_device *dev)
        struct net_device *loopback_dev = net->loopback_dev;
        int cpu;

-       if (dev == loopback_dev)
-               return;
-
        for_each_possible_cpu(cpu) {
                struct uncached_list *ul = per_cpu_ptr(&rt6_uncached_list, cpu);
                struct rt6_info *rt;
@@ -175,12 +172,13 @@ static void rt6_uncached_list_flush_dev(struct
net *net, struct net_device *dev)
                        struct inet6_dev *rt_idev = rt->rt6i_idev;
                        struct net_device *rt_dev = rt->dst.dev;

-                       if (rt_idev->dev == dev) {
+                       if (rt_idev->dev == dev && dev != loopback_dev) {
                                rt->rt6i_idev = in6_dev_get(loopback_dev);
                                in6_dev_put(rt_idev);
                        }

                        if (rt_dev == dev) {
+                               WARN_ON_ONCE(dev == loopback_dev);
                                rt->dst.dev = blackhole_netdev;
                                dev_replace_track(rt_dev, blackhole_netdev,
                                                  &rt->dst.dev_tracker,

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
@ 2021-12-08 14:09   ` Andrzej Hajda
  2021-12-08 14:27     ` Dmitry Vyukov
  2021-12-08 14:59     ` Jakub Kicinski
  2021-12-15 10:18   ` Jiri Slaby
  1 sibling, 2 replies; 52+ messages in thread
From: Andrzej Hajda @ 2021-12-08 14:09 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Dmitry Vyukov, Thomas Gleixner, Greg KH,
	Ingo Molnar

Hi Eric,


I've spotted this patchset thanks to LWN, anyway it was merged very 
quickly, I think it missed more broader review.

As the patch touches kernel lib I have added few people who could be 
interested.


On 05.12.2021 05:21, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> It can be hard to track where references are taken and released.
>
> In networking, we have annoying issues at device or netns dismantles,
> and we had various proposals to ease root causing them.
>
> This patch adds new infrastructure pairing refcount increases
> and decreases. This will self document code, because programmers
> will have to associate increments/decrements.
>
> This is controled by CONFIG_REF_TRACKER which can be selected
> by users of this feature.
>
> This adds both cpu and memory costs, and thus should probably be
> used with care.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> ---


Life is surprising, I was working on my own framework, solving the same 
issue, with intention to publish it in few days :)


My approach was little different:

1. Instead of creating separate framework I have extended debug_objects.

2. There were no additional fields in refcounted object and trackers - 
infrastructure of debug_objects was reused - debug_objects tracked both 
pointers of refcounted object and its users.


Have you considered using debug_object? it seems to be good place to put 
it there, I am not sure about performance differences.

One more thing about design - as I understand CONFIG_REF_TRACKER turns 
on all trackers in whole kernel, have you considered possibility/helpers 
to enable/disable tracking per class of objects?


>   include/linux/ref_tracker.h |  73 +++++++++++++++++++
>   lib/Kconfig                 |   5 ++
>   lib/Makefile                |   2 +
>   lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
>   4 files changed, 220 insertions(+)
>   create mode 100644 include/linux/ref_tracker.h
>   create mode 100644 lib/ref_tracker.c
>
> diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
> new file mode 100644
> index 0000000000000000000000000000000000000000..c11c9db5825cf933acf529c83db441a818135f29
> --- /dev/null
> +++ b/include/linux/ref_tracker.h
> @@ -0,0 +1,73 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#ifndef _LINUX_REF_TRACKER_H
> +#define _LINUX_REF_TRACKER_H
> +#include <linux/refcount.h>
> +#include <linux/types.h>
> +#include <linux/spinlock.h>
> +
> +struct ref_tracker;


With sth similar to:

#ifdef CONFIG_REF_TRACKER

typedef struct ref_tracker *ref_tracker_p;
#else
typedef struct {} ref_tracker_p;
#endif

you can eliminate unused field in user's structures.

Beside this it looks OK to me.


Regards

Andrzej


> +
> +struct ref_tracker_dir {
> +#ifdef CONFIG_REF_TRACKER
> +	spinlock_t		lock;
> +	unsigned int		quarantine_avail;
> +	refcount_t		untracked;
> +	struct list_head	list; /* List of active trackers */
> +	struct list_head	quarantine; /* List of dead trackers */
> +#endif
> +};
> +
> +#ifdef CONFIG_REF_TRACKER
> +static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
> +					unsigned int quarantine_count)
> +{
> +	INIT_LIST_HEAD(&dir->list);
> +	INIT_LIST_HEAD(&dir->quarantine);
> +	spin_lock_init(&dir->lock);
> +	dir->quarantine_avail = quarantine_count;
> +	refcount_set(&dir->untracked, 1);
> +}
> +
> +void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
> +
> +void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> +			   unsigned int display_limit);
> +
> +int ref_tracker_alloc(struct ref_tracker_dir *dir,
> +		      struct ref_tracker **trackerp, gfp_t gfp);
> +
> +int ref_tracker_free(struct ref_tracker_dir *dir,
> +		     struct ref_tracker **trackerp);
> +
> +#else /* CONFIG_REF_TRACKER */
> +
> +static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
> +					unsigned int quarantine_count)
> +{
> +}
> +
> +static inline void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
> +{
> +}
> +
> +static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> +					 unsigned int display_limit)
> +{
> +}
> +
> +static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
> +				    struct ref_tracker **trackerp,
> +				    gfp_t gfp)
> +{
> +	return 0;
> +}
> +
> +static inline int ref_tracker_free(struct ref_tracker_dir *dir,
> +				   struct ref_tracker **trackerp)
> +{
> +	return 0;
> +}
> +
> +#endif
> +
> +#endif /* _LINUX_REF_TRACKER_H */
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 5e7165e6a346c9bec878b78c8c8c3d175fc98dfd..655b0e43f260bfca63240794191e3f1890b2e801 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
>   	 Select the hash size as a power of 2 for the stackdepot hash table.
>   	 Choose a lower value to reduce the memory impact.
>   
> +config REF_TRACKER
> +	bool
> +	depends on STACKTRACE_SUPPORT
> +	select STACKDEPOT
> +
>   config SBITMAP
>   	bool
>   
> diff --git a/lib/Makefile b/lib/Makefile
> index 364c23f1557816f73aebd8304c01224a4846ac6c..c1fd9243ddb9cc1ac5252d7eb8009f9290782c4a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -270,6 +270,8 @@ obj-$(CONFIG_STACKDEPOT) += stackdepot.o
>   KASAN_SANITIZE_stackdepot.o := n
>   KCOV_INSTRUMENT_stackdepot.o := n
>   
> +obj-$(CONFIG_REF_TRACKER) += ref_tracker.o
> +
>   libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
>   	       fdt_empty_tree.o fdt_addresses.o
>   $(foreach file, $(libfdt_files), \
> diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0ae2e66dcf0fdb976f4cb99e747c9448b37f22cc
> --- /dev/null
> +++ b/lib/ref_tracker.c
> @@ -0,0 +1,140 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/export.h>
> +#include <linux/ref_tracker.h>
> +#include <linux/slab.h>
> +#include <linux/stacktrace.h>
> +#include <linux/stackdepot.h>
> +
> +#define REF_TRACKER_STACK_ENTRIES 16
> +
> +struct ref_tracker {
> +	struct list_head	head;   /* anchor into dir->list or dir->quarantine */
> +	bool			dead;
> +	depot_stack_handle_t	alloc_stack_handle;
> +	depot_stack_handle_t	free_stack_handle;
> +};
> +
> +void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
> +{
> +	struct ref_tracker *tracker, *n;
> +	unsigned long flags;
> +	bool leak = false;
> +
> +	spin_lock_irqsave(&dir->lock, flags);
> +	list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
> +		list_del(&tracker->head);
> +		kfree(tracker);
> +		dir->quarantine_avail++;
> +	}
> +	list_for_each_entry_safe(tracker, n, &dir->list, head) {
> +		pr_err("leaked reference.\n");
> +		if (tracker->alloc_stack_handle)
> +			stack_depot_print(tracker->alloc_stack_handle);
> +		leak = true;
> +		list_del(&tracker->head);
> +		kfree(tracker);
> +	}
> +	spin_unlock_irqrestore(&dir->lock, flags);
> +	WARN_ON_ONCE(leak);
> +	WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
> +}
> +EXPORT_SYMBOL(ref_tracker_dir_exit);
> +
> +void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> +			   unsigned int display_limit)
> +{
> +	struct ref_tracker *tracker;
> +	unsigned long flags;
> +	unsigned int i = 0;
> +
> +	spin_lock_irqsave(&dir->lock, flags);
> +	list_for_each_entry(tracker, &dir->list, head) {
> +		if (i < display_limit) {
> +			pr_err("leaked reference.\n");
> +			if (tracker->alloc_stack_handle)
> +				stack_depot_print(tracker->alloc_stack_handle);
> +			i++;
> +		} else {
> +			break;
> +		}
> +	}
> +	spin_unlock_irqrestore(&dir->lock, flags);
> +}
> +EXPORT_SYMBOL(ref_tracker_dir_print);
> +
> +int ref_tracker_alloc(struct ref_tracker_dir *dir,
> +		      struct ref_tracker **trackerp,
> +		      gfp_t gfp)
> +{
> +	unsigned long entries[REF_TRACKER_STACK_ENTRIES];
> +	struct ref_tracker *tracker;
> +	unsigned int nr_entries;
> +	unsigned long flags;
> +
> +	*trackerp = tracker = kzalloc(sizeof(*tracker), gfp | __GFP_NOFAIL);
> +	if (unlikely(!tracker)) {
> +		pr_err_once("memory allocation failure, unreliable refcount tracker.\n");
> +		refcount_inc(&dir->untracked);
> +		return -ENOMEM;
> +	}
> +	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +	nr_entries = filter_irq_stacks(entries, nr_entries);
> +	tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
> +
> +	spin_lock_irqsave(&dir->lock, flags);
> +	list_add(&tracker->head, &dir->list);
> +	spin_unlock_irqrestore(&dir->lock, flags);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(ref_tracker_alloc);
> +
> +int ref_tracker_free(struct ref_tracker_dir *dir,
> +		     struct ref_tracker **trackerp)
> +{
> +	unsigned long entries[REF_TRACKER_STACK_ENTRIES];
> +	struct ref_tracker *tracker = *trackerp;
> +	depot_stack_handle_t stack_handle;
> +	unsigned int nr_entries;
> +	unsigned long flags;
> +
> +	if (!tracker) {
> +		refcount_dec(&dir->untracked);
> +		return -EEXIST;
> +	}
> +	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +	nr_entries = filter_irq_stacks(entries, nr_entries);
> +	stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
> +
> +	spin_lock_irqsave(&dir->lock, flags);
> +	if (tracker->dead) {
> +		pr_err("reference already released.\n");
> +		if (tracker->alloc_stack_handle) {
> +			pr_err("allocated in:\n");
> +			stack_depot_print(tracker->alloc_stack_handle);
> +		}
> +		if (tracker->free_stack_handle) {
> +			pr_err("freed in:\n");
> +			stack_depot_print(tracker->free_stack_handle);
> +		}
> +		spin_unlock_irqrestore(&dir->lock, flags);
> +		WARN_ON_ONCE(1);
> +		return -EINVAL;
> +	}
> +	tracker->dead = true;
> +
> +	tracker->free_stack_handle = stack_handle;
> +
> +	list_move_tail(&tracker->head, &dir->quarantine);
> +	if (!dir->quarantine_avail) {
> +		tracker = list_first_entry(&dir->quarantine, struct ref_tracker, head);
> +		list_del(&tracker->head);
> +	} else {
> +		dir->quarantine_avail--;
> +		tracker = NULL;
> +	}
> +	spin_unlock_irqrestore(&dir->lock, flags);
> +
> +	kfree(tracker);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(ref_tracker_free);

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-08 14:09   ` Andrzej Hajda
@ 2021-12-08 14:27     ` Dmitry Vyukov
  2021-12-08 15:04       ` Andrzej Hajda
  2021-12-08 14:59     ` Jakub Kicinski
  1 sibling, 1 reply; 52+ messages in thread
From: Dmitry Vyukov @ 2021-12-08 14:27 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev,
	Eric Dumazet, Thomas Gleixner, Greg KH, Ingo Molnar

On Wed, 8 Dec 2021 at 15:09, Andrzej Hajda <andrzej.hajda@intel.com> wrote:
>
> Hi Eric,
>
>
> I've spotted this patchset thanks to LWN, anyway it was merged very
> quickly, I think it missed more broader review.
>
> As the patch touches kernel lib I have added few people who could be
> interested.
>
>
> On 05.12.2021 05:21, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > It can be hard to track where references are taken and released.
> >
> > In networking, we have annoying issues at device or netns dismantles,
> > and we had various proposals to ease root causing them.
> >
> > This patch adds new infrastructure pairing refcount increases
> > and decreases. This will self document code, because programmers
> > will have to associate increments/decrements.
> >
> > This is controled by CONFIG_REF_TRACKER which can be selected
> > by users of this feature.
> >
> > This adds both cpu and memory costs, and thus should probably be
> > used with care.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > ---
>
>
> Life is surprising, I was working on my own framework, solving the same
> issue, with intention to publish it in few days :)
>
>
> My approach was little different:
>
> 1. Instead of creating separate framework I have extended debug_objects.
>
> 2. There were no additional fields in refcounted object and trackers -
> infrastructure of debug_objects was reused - debug_objects tracked both
> pointers of refcounted object and its users.
>
> Have you considered using debug_object? it seems to be good place to put
> it there, I am not sure about performance differences.

Hi Andrzej,

How exactly did you do it? Do you have a link to your patch?
There still should be something similar to `struct ref_tracker` in
this patch, right? Or how do you match decrements with increments and
understand when a double-decrement happens?


> One more thing about design - as I understand CONFIG_REF_TRACKER turns
> on all trackers in whole kernel, have you considered possibility/helpers
> to enable/disable tracking per class of objects?
>
>
> >   include/linux/ref_tracker.h |  73 +++++++++++++++++++
> >   lib/Kconfig                 |   5 ++
> >   lib/Makefile                |   2 +
> >   lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
> >   4 files changed, 220 insertions(+)
> >   create mode 100644 include/linux/ref_tracker.h
> >   create mode 100644 lib/ref_tracker.c
> >
> > diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..c11c9db5825cf933acf529c83db441a818135f29
> > --- /dev/null
> > +++ b/include/linux/ref_tracker.h
> > @@ -0,0 +1,73 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#ifndef _LINUX_REF_TRACKER_H
> > +#define _LINUX_REF_TRACKER_H
> > +#include <linux/refcount.h>
> > +#include <linux/types.h>
> > +#include <linux/spinlock.h>
> > +
> > +struct ref_tracker;
>
>
> With sth similar to:
>
> #ifdef CONFIG_REF_TRACKER
>
> typedef struct ref_tracker *ref_tracker_p;
> #else
> typedef struct {} ref_tracker_p;
> #endif
>
> you can eliminate unused field in user's structures.
>
> Beside this it looks OK to me.
>
>
> Regards
>
> Andrzej
>
>
> > +
> > +struct ref_tracker_dir {
> > +#ifdef CONFIG_REF_TRACKER
> > +     spinlock_t              lock;
> > +     unsigned int            quarantine_avail;
> > +     refcount_t              untracked;
> > +     struct list_head        list; /* List of active trackers */
> > +     struct list_head        quarantine; /* List of dead trackers */
> > +#endif
> > +};
> > +
> > +#ifdef CONFIG_REF_TRACKER
> > +static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
> > +                                     unsigned int quarantine_count)
> > +{
> > +     INIT_LIST_HEAD(&dir->list);
> > +     INIT_LIST_HEAD(&dir->quarantine);
> > +     spin_lock_init(&dir->lock);
> > +     dir->quarantine_avail = quarantine_count;
> > +     refcount_set(&dir->untracked, 1);
> > +}
> > +
> > +void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
> > +
> > +void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> > +                        unsigned int display_limit);
> > +
> > +int ref_tracker_alloc(struct ref_tracker_dir *dir,
> > +                   struct ref_tracker **trackerp, gfp_t gfp);
> > +
> > +int ref_tracker_free(struct ref_tracker_dir *dir,
> > +                  struct ref_tracker **trackerp);
> > +
> > +#else /* CONFIG_REF_TRACKER */
> > +
> > +static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
> > +                                     unsigned int quarantine_count)
> > +{
> > +}
> > +
> > +static inline void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
> > +{
> > +}
> > +
> > +static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> > +                                      unsigned int display_limit)
> > +{
> > +}
> > +
> > +static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
> > +                                 struct ref_tracker **trackerp,
> > +                                 gfp_t gfp)
> > +{
> > +     return 0;
> > +}
> > +
> > +static inline int ref_tracker_free(struct ref_tracker_dir *dir,
> > +                                struct ref_tracker **trackerp)
> > +{
> > +     return 0;
> > +}
> > +
> > +#endif
> > +
> > +#endif /* _LINUX_REF_TRACKER_H */
> > diff --git a/lib/Kconfig b/lib/Kconfig
> > index 5e7165e6a346c9bec878b78c8c8c3d175fc98dfd..655b0e43f260bfca63240794191e3f1890b2e801 100644
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
> >        Select the hash size as a power of 2 for the stackdepot hash table.
> >        Choose a lower value to reduce the memory impact.
> >
> > +config REF_TRACKER
> > +     bool
> > +     depends on STACKTRACE_SUPPORT
> > +     select STACKDEPOT
> > +
> >   config SBITMAP
> >       bool
> >
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 364c23f1557816f73aebd8304c01224a4846ac6c..c1fd9243ddb9cc1ac5252d7eb8009f9290782c4a 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -270,6 +270,8 @@ obj-$(CONFIG_STACKDEPOT) += stackdepot.o
> >   KASAN_SANITIZE_stackdepot.o := n
> >   KCOV_INSTRUMENT_stackdepot.o := n
> >
> > +obj-$(CONFIG_REF_TRACKER) += ref_tracker.o
> > +
> >   libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
> >              fdt_empty_tree.o fdt_addresses.o
> >   $(foreach file, $(libfdt_files), \
> > diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0ae2e66dcf0fdb976f4cb99e747c9448b37f22cc
> > --- /dev/null
> > +++ b/lib/ref_tracker.c
> > @@ -0,0 +1,140 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/export.h>
> > +#include <linux/ref_tracker.h>
> > +#include <linux/slab.h>
> > +#include <linux/stacktrace.h>
> > +#include <linux/stackdepot.h>
> > +
> > +#define REF_TRACKER_STACK_ENTRIES 16
> > +
> > +struct ref_tracker {
> > +     struct list_head        head;   /* anchor into dir->list or dir->quarantine */
> > +     bool                    dead;
> > +     depot_stack_handle_t    alloc_stack_handle;
> > +     depot_stack_handle_t    free_stack_handle;
> > +};
> > +
> > +void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
> > +{
> > +     struct ref_tracker *tracker, *n;
> > +     unsigned long flags;
> > +     bool leak = false;
> > +
> > +     spin_lock_irqsave(&dir->lock, flags);
> > +     list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
> > +             list_del(&tracker->head);
> > +             kfree(tracker);
> > +             dir->quarantine_avail++;
> > +     }
> > +     list_for_each_entry_safe(tracker, n, &dir->list, head) {
> > +             pr_err("leaked reference.\n");
> > +             if (tracker->alloc_stack_handle)
> > +                     stack_depot_print(tracker->alloc_stack_handle);
> > +             leak = true;
> > +             list_del(&tracker->head);
> > +             kfree(tracker);
> > +     }
> > +     spin_unlock_irqrestore(&dir->lock, flags);
> > +     WARN_ON_ONCE(leak);
> > +     WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
> > +}
> > +EXPORT_SYMBOL(ref_tracker_dir_exit);
> > +
> > +void ref_tracker_dir_print(struct ref_tracker_dir *dir,
> > +                        unsigned int display_limit)
> > +{
> > +     struct ref_tracker *tracker;
> > +     unsigned long flags;
> > +     unsigned int i = 0;
> > +
> > +     spin_lock_irqsave(&dir->lock, flags);
> > +     list_for_each_entry(tracker, &dir->list, head) {
> > +             if (i < display_limit) {
> > +                     pr_err("leaked reference.\n");
> > +                     if (tracker->alloc_stack_handle)
> > +                             stack_depot_print(tracker->alloc_stack_handle);
> > +                     i++;
> > +             } else {
> > +                     break;
> > +             }
> > +     }
> > +     spin_unlock_irqrestore(&dir->lock, flags);
> > +}
> > +EXPORT_SYMBOL(ref_tracker_dir_print);
> > +
> > +int ref_tracker_alloc(struct ref_tracker_dir *dir,
> > +                   struct ref_tracker **trackerp,
> > +                   gfp_t gfp)
> > +{
> > +     unsigned long entries[REF_TRACKER_STACK_ENTRIES];
> > +     struct ref_tracker *tracker;
> > +     unsigned int nr_entries;
> > +     unsigned long flags;
> > +
> > +     *trackerp = tracker = kzalloc(sizeof(*tracker), gfp | __GFP_NOFAIL);
> > +     if (unlikely(!tracker)) {
> > +             pr_err_once("memory allocation failure, unreliable refcount tracker.\n");
> > +             refcount_inc(&dir->untracked);
> > +             return -ENOMEM;
> > +     }
> > +     nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> > +     nr_entries = filter_irq_stacks(entries, nr_entries);
> > +     tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
> > +
> > +     spin_lock_irqsave(&dir->lock, flags);
> > +     list_add(&tracker->head, &dir->list);
> > +     spin_unlock_irqrestore(&dir->lock, flags);
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(ref_tracker_alloc);
> > +
> > +int ref_tracker_free(struct ref_tracker_dir *dir,
> > +                  struct ref_tracker **trackerp)
> > +{
> > +     unsigned long entries[REF_TRACKER_STACK_ENTRIES];
> > +     struct ref_tracker *tracker = *trackerp;
> > +     depot_stack_handle_t stack_handle;
> > +     unsigned int nr_entries;
> > +     unsigned long flags;
> > +
> > +     if (!tracker) {
> > +             refcount_dec(&dir->untracked);
> > +             return -EEXIST;
> > +     }
> > +     nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> > +     nr_entries = filter_irq_stacks(entries, nr_entries);
> > +     stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
> > +
> > +     spin_lock_irqsave(&dir->lock, flags);
> > +     if (tracker->dead) {
> > +             pr_err("reference already released.\n");
> > +             if (tracker->alloc_stack_handle) {
> > +                     pr_err("allocated in:\n");
> > +                     stack_depot_print(tracker->alloc_stack_handle);
> > +             }
> > +             if (tracker->free_stack_handle) {
> > +                     pr_err("freed in:\n");
> > +                     stack_depot_print(tracker->free_stack_handle);
> > +             }
> > +             spin_unlock_irqrestore(&dir->lock, flags);
> > +             WARN_ON_ONCE(1);
> > +             return -EINVAL;
> > +     }
> > +     tracker->dead = true;
> > +
> > +     tracker->free_stack_handle = stack_handle;
> > +
> > +     list_move_tail(&tracker->head, &dir->quarantine);
> > +     if (!dir->quarantine_avail) {
> > +             tracker = list_first_entry(&dir->quarantine, struct ref_tracker, head);
> > +             list_del(&tracker->head);
> > +     } else {
> > +             dir->quarantine_avail--;
> > +             tracker = NULL;
> > +     }
> > +     spin_unlock_irqrestore(&dir->lock, flags);
> > +
> > +     kfree(tracker);
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(ref_tracker_free);

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-08 14:09   ` Andrzej Hajda
  2021-12-08 14:27     ` Dmitry Vyukov
@ 2021-12-08 14:59     ` Jakub Kicinski
  2021-12-08 15:11       ` Andrzej Hajda
  1 sibling, 1 reply; 52+ messages in thread
From: Jakub Kicinski @ 2021-12-08 14:59 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Eric Dumazet, David S . Miller, netdev, Eric Dumazet,
	Dmitry Vyukov, Thomas Gleixner, Greg KH, Ingo Molnar

On Wed, 8 Dec 2021 15:09:17 +0100 Andrzej Hajda wrote:
> On 05.12.2021 05:21, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > It can be hard to track where references are taken and released.
> >
> > In networking, we have annoying issues at device or netns dismantles,
> > and we had various proposals to ease root causing them.
> >
> > This patch adds new infrastructure pairing refcount increases
> > and decreases. This will self document code, because programmers
> > will have to associate increments/decrements.
> >
> > This is controled by CONFIG_REF_TRACKER which can be selected
> > by users of this feature.
> >
> > This adds both cpu and memory costs, and thus should probably be
> > used with care.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> 
> Life is surprising, I was working on my own framework, solving the same 
> issue, with intention to publish it in few days :)
>
> My approach was little different:
> 
> 1. Instead of creating separate framework I have extended debug_objects.
> 
> 2. There were no additional fields in refcounted object and trackers - 
> infrastructure of debug_objects was reused - debug_objects tracked both 
> pointers of refcounted object and its users.
> 
> 
> Have you considered using debug_object? it seems to be good place to put 
> it there, I am not sure about performance differences.

Something along these lines?

https://lore.kernel.org/all/20211117174723.2305681-1-kuba@kernel.org/

;)

> One more thing about design - as I understand CONFIG_REF_TRACKER turns 
> on all trackers in whole kernel, have you considered possibility/helpers 
> to enable/disable tracking per class of objects?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-08 14:27     ` Dmitry Vyukov
@ 2021-12-08 15:04       ` Andrzej Hajda
  0 siblings, 0 replies; 52+ messages in thread
From: Andrzej Hajda @ 2021-12-08 15:04 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev,
	Eric Dumazet, Thomas Gleixner, Greg KH, Ingo Molnar


On 08.12.2021 15:27, Dmitry Vyukov wrote:
> On Wed, 8 Dec 2021 at 15:09, Andrzej Hajda <andrzej.hajda@intel.com> wrote:
>> Hi Eric,
>>
>>
>> I've spotted this patchset thanks to LWN, anyway it was merged very
>> quickly, I think it missed more broader review.
>>
>> As the patch touches kernel lib I have added few people who could be
>> interested.
>>
>>
>> On 05.12.2021 05:21, Eric Dumazet wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> It can be hard to track where references are taken and released.
>>>
>>> In networking, we have annoying issues at device or netns dismantles,
>>> and we had various proposals to ease root causing them.
>>>
>>> This patch adds new infrastructure pairing refcount increases
>>> and decreases. This will self document code, because programmers
>>> will have to associate increments/decrements.
>>>
>>> This is controled by CONFIG_REF_TRACKER which can be selected
>>> by users of this feature.
>>>
>>> This adds both cpu and memory costs, and thus should probably be
>>> used with care.
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
>>> ---
>>
>> Life is surprising, I was working on my own framework, solving the same
>> issue, with intention to publish it in few days :)
>>
>>
>> My approach was little different:
>>
>> 1. Instead of creating separate framework I have extended debug_objects.
>>
>> 2. There were no additional fields in refcounted object and trackers -
>> infrastructure of debug_objects was reused - debug_objects tracked both
>> pointers of refcounted object and its users.
>>
>> Have you considered using debug_object? it seems to be good place to put
>> it there, I am not sure about performance differences.
> Hi Andrzej,
>
> How exactly did you do it? Do you have a link to your patch?
> There still should be something similar to `struct ref_tracker` in
> this patch, right? Or how do you match decrements with increments and
> understand when a double-decrement happens?


User during taking/dropping reference should pass pointer of the object 
who uses the reference (user).

And this pointer is tracked by debug_objects:

- on taking reference: the pointer is added to in-framework array 
associated with that reference,

- on dropping reference: framework checks if the pointer is in the 
array/quarantine, and the bug is accordingly reported.

- on destroying reference: bug is reported if users array is not empty,

- on taking/dropping reference to non-existing/destroyed object: bug is 
reported.


So instead of adding tracker field to user and passing it to the 
framework, address of the user itself is passed to the framework.


Regards

Andrzej



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-08 14:59     ` Jakub Kicinski
@ 2021-12-08 15:11       ` Andrzej Hajda
  0 siblings, 0 replies; 52+ messages in thread
From: Andrzej Hajda @ 2021-12-08 15:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Eric Dumazet, David S . Miller, netdev, Eric Dumazet,
	Dmitry Vyukov, Thomas Gleixner, Greg KH, Ingo Molnar


On 08.12.2021 15:59, Jakub Kicinski wrote:
> On Wed, 8 Dec 2021 15:09:17 +0100 Andrzej Hajda wrote:
>> On 05.12.2021 05:21, Eric Dumazet wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> It can be hard to track where references are taken and released.
>>>
>>> In networking, we have annoying issues at device or netns dismantles,
>>> and we had various proposals to ease root causing them.
>>>
>>> This patch adds new infrastructure pairing refcount increases
>>> and decreases. This will self document code, because programmers
>>> will have to associate increments/decrements.
>>>
>>> This is controled by CONFIG_REF_TRACKER which can be selected
>>> by users of this feature.
>>>
>>> This adds both cpu and memory costs, and thus should probably be
>>> used with care.
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
>> Life is surprising, I was working on my own framework, solving the same
>> issue, with intention to publish it in few days :)
>>
>> My approach was little different:
>>
>> 1. Instead of creating separate framework I have extended debug_objects.
>>
>> 2. There were no additional fields in refcounted object and trackers -
>> infrastructure of debug_objects was reused - debug_objects tracked both
>> pointers of refcounted object and its users.
>>
>>
>> Have you considered using debug_object? it seems to be good place to put
>> it there, I am not sure about performance differences.
> Something along these lines?
>
> https://lore.kernel.org/all/20211117174723.2305681-1-kuba@kernel.org/
>
> ;)


Ups, I was not clear in my last sentence, I meant "extend debug_objects 
with tracking users of the object" :)


Regards

Andrzej


>
>> One more thing about design - as I understand CONFIG_REF_TRACKER turns
>> on all trackers in whole kernel, have you considered possibility/helpers
>> to enable/disable tracking per class of objects?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-07 20:00                   ` Eric Dumazet
@ 2021-12-08 17:29                     ` Andrew Lunn
  2021-12-08 18:21                       ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Lunn @ 2021-12-08 17:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

> This all involves IPv6, and this might point to something I hinted
> about last week.
> 
> Can you try :
> 
> Note that I am about to travel, and will be unable to respond to
> emails for about 20 hours.

Hi Eric

Hope you had a good trip.

I tried your patch. No difference, it still reports leaks, and the
WARN_ON_ONCE did not fire.

I extended the tracker a little, adding timestamps into the struct
ref_tracker and printing the age of the leaked reference as part of
the dump. The patch is at the end.

There is possibly an interesting pattern here:

[  454.787491] unregister_netdevice: waiting for eth0 to become free. Usage count = 9
[  454.787696] leaked reference. Age    15.138884
[  454.787699]  dst_alloc+0x7a/0x180
[  454.787706]  ip6_dst_alloc+0x27/0x90
[  454.787709]  ip6_pol_route+0x257/0x430
[  454.787711]  ip6_pol_route_output+0x19/0x20
[  454.787713]  fib6_rule_lookup+0x18b/0x270
[  454.787717]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787719]  ip6_route_output_flags+0x32/0xa0
[  454.787720]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[  454.787724]  ip6_dst_lookup_flow+0x43/0xa0
[  454.787725]  inet6_csk_route_socket+0x166/0x200
[  454.787729]  inet6_csk_xmit+0x56/0x130
[  454.787731]  __tcp_transmit_skb+0x53b/0xc30
[  454.787734]  __tcp_send_ack.part.0+0xc6/0x1a0
[  454.787735]  tcp_send_ack+0x1c/0x20
[  454.787737]  __tcp_ack_snd_check+0x42/0x200
[  454.787739]  tcp_rcv_established+0x27a/0x6f0
[  454.787740] leaked reference. Age    30.240300
[  454.787741]  dst_alloc+0x7a/0x180
[  454.787743]  ip6_dst_alloc+0x27/0x90
[  454.787744]  ip6_pol_route+0x257/0x430
[  454.787746]  ip6_pol_route_output+0x19/0x20
[  454.787748]  fib6_rule_lookup+0x18b/0x270
[  454.787750]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787751]  ip6_route_output_flags+0x32/0xa0
[  454.787752]  seg6_output_core+0x28d/0x320
[  454.787755]  seg6_output+0x33/0x120
[  454.787757]  lwtunnel_output+0x72/0xc0
[  454.787760]  ip6_local_out+0x61/0x70
[  454.787762]  ip6_send_skb+0x23/0x70
[  454.787764]  udp_v6_send_skb+0x207/0x460
[  454.787767]  udpv6_sendmsg+0xb13/0xdb0
[  454.787768]  inet6_sendmsg+0x65/0x70
[  454.787769]  sock_sendmsg+0x48/0x70
[  454.787772] leaked reference. Age    31.594229
[  454.787773]  dst_alloc+0x7a/0x180
[  454.787775]  ip6_dst_alloc+0x27/0x90
[  454.787776]  ip6_pol_route+0x257/0x430
[  454.787778]  ip6_pol_route_output+0x19/0x20
[  454.787780]  fib6_rule_lookup+0x18b/0x270
[  454.787782]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787783]  ip6_route_output_flags+0x32/0xa0
[  454.787784]  seg6_output_core+0x28d/0x320
[  454.787786]  seg6_output+0x33/0x120
[  454.787789]  lwtunnel_output+0x72/0xc0
[  454.787790]  ip6_local_out+0x61/0x70
[  454.787792]  ip6_send_skb+0x23/0x70
[  454.787793]  udp_v6_send_skb+0x207/0x460
[  454.787796]  udpv6_sendmsg+0xb13/0xdb0
[  454.787797]  inet6_sendmsg+0x65/0x70
[  454.787798]  sock_sendmsg+0x48/0x70
[  454.787800] leaked reference. Age    35.147534
[  454.787800]  dst_alloc+0x7a/0x180
[  454.787802]  ip6_dst_alloc+0x27/0x90
[  454.787803]  ip6_pol_route+0x257/0x430
[  454.787805]  ip6_pol_route_output+0x19/0x20
[  454.787806]  fib6_rule_lookup+0x18b/0x270
[  454.787808]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787810]  ip6_route_output_flags+0x32/0xa0
[  454.787811]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[  454.787813]  ip6_dst_lookup_flow+0x43/0xa0
[  454.787814]  inet6_csk_route_socket+0x166/0x200
[  454.787817]  inet6_csk_xmit+0x56/0x130
[  454.787818]  __tcp_transmit_skb+0x53b/0xc30
[  454.787819]  __tcp_send_ack.part.0+0xc6/0x1a0
[  454.787821]  tcp_send_ack+0x1c/0x20
[  454.787823]  __tcp_ack_snd_check+0x42/0x200
[  454.787824]  tcp_rcv_established+0x27a/0x6f0
[  454.787826] leaked reference. Age    37.269626
[  454.787826]  dst_alloc+0x7a/0x180
[  454.787828]  ip6_dst_alloc+0x27/0x90
[  454.787829]  ip6_pol_route+0x257/0x430
[  454.787831]  ip6_pol_route_output+0x19/0x20
[  454.787833]  fib6_rule_lookup+0x18b/0x270
[  454.787835]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787836]  ip6_route_output_flags+0x32/0xa0
[  454.787837]  ip6_dst_lookup_tail.constprop.0+0xde/0x240
[  454.787839]  ip6_dst_lookup_flow+0x43/0xa0
[  454.787841]  tcp_v6_connect+0x2a7/0x670
[  454.787843]  __inet_stream_connect+0xd1/0x3b0
[  454.787845]  inet_stream_connect+0x3b/0x60
[  454.787846]  __sys_connect_file+0x5f/0x70
[  454.787848]  __sys_connect+0xa2/0xd0
[  454.787849]  __x64_sys_connect+0x18/0x20
[  454.787850]  do_syscall_64+0x3b/0xc0

Ages of 15 to 37 seconds. So these leaks look related to applications
using tcpv6.

[  454.787854] leaked reference. Age   385.273596
[  454.787855]  dst_alloc+0x7a/0x180
[  454.787857]  ip6_dst_alloc+0x27/0x90
[  454.787858]  ip6_pol_route+0x257/0x430
[  454.787859]  ip6_pol_route_output+0x19/0x20
[  454.787861]  fib6_rule_lookup+0x18b/0x270
[  454.787863]  ip6_route_output_flags_noref+0xaa/0x110
[  454.787864]  ip6_route_output_flags+0x32/0xa0
[  454.787866]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
[  454.787867]  ip6_dst_lookup_flow+0x43/0xa0
[  454.787869]  inet6_csk_route_req+0x11b/0x150
[  454.787871]  tcp_v6_route_req+0xa8/0x140
[  454.787873]  tcp_conn_request+0x349/0xcd0
[  454.787875]  tcp_v6_conn_request+0x64/0xd0
[  454.787877]  tcp_rcv_state_process+0x25b/0x1070
[  454.787878]  tcp_v6_do_rcv+0x1c4/0x4a0
[  454.787881]  tcp_v6_rcv+0xea3/0xee0
[  454.787883] leaked reference. Age   389.378759
[  454.787884]  fib6_nh_init+0x30d/0x9c0
[  454.787885]  rtm_new_nexthop+0x68a/0x13a0
[  454.787888]  rtnetlink_rcv_msg+0x14f/0x380
[  454.787891]  netlink_rcv_skb+0x55/0x100
[  454.787893]  rtnetlink_rcv+0x15/0x20
[  454.787895]  netlink_unicast+0x230/0x340
[  454.787897]  netlink_sendmsg+0x252/0x4b0
[  454.787899]  sock_sendmsg+0x65/0x70
[  454.787900]  ____sys_sendmsg+0x24e/0x290
[  454.787902]  ___sys_sendmsg+0x81/0xc0
[  454.787903]  __sys_sendmsg+0x62/0xb0
[  454.787905]  __x64_sys_sendmsg+0x1d/0x20
[  454.787907]  do_syscall_64+0x3b/0xc0
[  454.787908]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  454.787911] leaked reference. Age   402.739110
[  454.787911]  ipv6_add_dev+0x13e/0x4f0
[  454.787914]  addrconf_notify+0x2ca/0x950
[  454.787917]  raw_notifier_call_chain+0x49/0x60
[  454.787920]  call_netdevice_notifiers_info+0x50/0x90
[  454.787923]  __dev_change_net_namespace+0x30d/0x6c0
[  454.787926]  do_setlink+0xdc/0x10b0
[  454.787928]  __rtnl_newlink+0x608/0xa10
[  454.787931]  rtnl_newlink+0x49/0x70
[  454.787933]  rtnetlink_rcv_msg+0x14f/0x380
[  454.787935]  netlink_rcv_skb+0x55/0x100
[  454.787937]  rtnetlink_rcv+0x15/0x20
[  454.787939]  netlink_unicast+0x230/0x340
[  454.787941]  netlink_sendmsg+0x252/0x4b0
[  454.787942]  sock_sendmsg+0x65/0x70
[  454.787944]  ____sys_sendmsg+0x24e/0x290
[  454.787945]  ___sys_sendmsg+0x81/0xc0

Here the age is much longer, more like the age of the container and
early setup of the container. I took a quick look at this last trace,
and it is the netdev being moved from one namespace to another.

So it could be there are two leaks here?

   Andrew

rom d3a3149c4e8f5020942ac00fe3bce6a1303f10b7 Mon Sep 17 00:00:00 2001
From: Andrew Lunn <andrew@lunn.ch>
Date: Wed, 8 Dec 2021 10:25:05 -0600
Subject: [PATCH] lib: ref_tracker: Add timestamp for reference

It can be useful to know if the leaked reference is old, or recent.
Is the bug in interface create and release, or more dynamic like a
connection.

Add a timestamp to each reference record, and print the age when
dumping the records.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
 lib/ref_tracker.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 0ae2e66dcf0f..82a1c3681969 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -12,6 +12,7 @@ struct ref_tracker {
        bool                    dead;
        depot_stack_handle_t    alloc_stack_handle;
        depot_stack_handle_t    free_stack_handle;
+       ktime_t                 alloc_ktime;
 };
 
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
@@ -44,13 +45,17 @@ void ref_tracker_dir_print(struct ref_tracker_dir *dir,
                           unsigned int display_limit)
 {
        struct ref_tracker *tracker;
+       ktime_t now = ktime_get();
        unsigned long flags;
        unsigned int i = 0;
-
        spin_lock_irqsave(&dir->lock, flags);
+
        list_for_each_entry(tracker, &dir->list, head) {
                if (i < display_limit) {
-                       pr_err("leaked reference.\n");
+                       ktime_t age = ktime_sub(now, tracker->alloc_ktime);
+                       unsigned long rem_nsec = do_div(age, 1000000000);
+                       pr_err("leaked reference. Age %5lu.%06lu\n",
+                              (unsigned long)age, rem_nsec / 1000);
                        if (tracker->alloc_stack_handle)
                                stack_depot_print(tracker->alloc_stack_handle);
                        i++;
@@ -80,6 +85,7 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
        nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
        nr_entries = filter_irq_stacks(entries, nr_entries);
        tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
+       tracker->alloc_ktime = ktime_get();
 
        spin_lock_irqsave(&dir->lock, flags);
        list_add(&tracker->head, &dir->list);
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-08 17:29                     ` Andrew Lunn
@ 2021-12-08 18:21                       ` Eric Dumazet
  2021-12-08 18:53                         ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-08 18:21 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Wed, Dec 8, 2021 at 9:29 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > This all involves IPv6, and this might point to something I hinted
> > about last week.
> >
> > Can you try :
> >
> > Note that I am about to travel, and will be unable to respond to
> > emails for about 20 hours.
>
> Hi Eric
>
> Hope you had a good trip.
>
> I tried your patch. No difference, it still reports leaks, and the
> WARN_ON_ONCE did not fire.
>
> I extended the tracker a little, adding timestamps into the struct
> ref_tracker and printing the age of the leaked reference as part of
> the dump. The patch is at the end.
>
> There is possibly an interesting pattern here:
>
> [  454.787491] unregister_netdevice: waiting for eth0 to become free. Usage count = 9
> [  454.787696] leaked reference. Age    15.138884
> [  454.787699]  dst_alloc+0x7a/0x180
> [  454.787706]  ip6_dst_alloc+0x27/0x90
> [  454.787709]  ip6_pol_route+0x257/0x430
> [  454.787711]  ip6_pol_route_output+0x19/0x20
> [  454.787713]  fib6_rule_lookup+0x18b/0x270
> [  454.787717]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787719]  ip6_route_output_flags+0x32/0xa0
> [  454.787720]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [  454.787724]  ip6_dst_lookup_flow+0x43/0xa0
> [  454.787725]  inet6_csk_route_socket+0x166/0x200
> [  454.787729]  inet6_csk_xmit+0x56/0x130
> [  454.787731]  __tcp_transmit_skb+0x53b/0xc30
> [  454.787734]  __tcp_send_ack.part.0+0xc6/0x1a0
> [  454.787735]  tcp_send_ack+0x1c/0x20
> [  454.787737]  __tcp_ack_snd_check+0x42/0x200
> [  454.787739]  tcp_rcv_established+0x27a/0x6f0
> [  454.787740] leaked reference. Age    30.240300
> [  454.787741]  dst_alloc+0x7a/0x180
> [  454.787743]  ip6_dst_alloc+0x27/0x90
> [  454.787744]  ip6_pol_route+0x257/0x430
> [  454.787746]  ip6_pol_route_output+0x19/0x20
> [  454.787748]  fib6_rule_lookup+0x18b/0x270
> [  454.787750]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787751]  ip6_route_output_flags+0x32/0xa0
> [  454.787752]  seg6_output_core+0x28d/0x320
> [  454.787755]  seg6_output+0x33/0x120
> [  454.787757]  lwtunnel_output+0x72/0xc0
> [  454.787760]  ip6_local_out+0x61/0x70
> [  454.787762]  ip6_send_skb+0x23/0x70
> [  454.787764]  udp_v6_send_skb+0x207/0x460
> [  454.787767]  udpv6_sendmsg+0xb13/0xdb0
> [  454.787768]  inet6_sendmsg+0x65/0x70
> [  454.787769]  sock_sendmsg+0x48/0x70
> [  454.787772] leaked reference. Age    31.594229
> [  454.787773]  dst_alloc+0x7a/0x180
> [  454.787775]  ip6_dst_alloc+0x27/0x90
> [  454.787776]  ip6_pol_route+0x257/0x430
> [  454.787778]  ip6_pol_route_output+0x19/0x20
> [  454.787780]  fib6_rule_lookup+0x18b/0x270
> [  454.787782]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787783]  ip6_route_output_flags+0x32/0xa0
> [  454.787784]  seg6_output_core+0x28d/0x320
> [  454.787786]  seg6_output+0x33/0x120
> [  454.787789]  lwtunnel_output+0x72/0xc0
> [  454.787790]  ip6_local_out+0x61/0x70
> [  454.787792]  ip6_send_skb+0x23/0x70
> [  454.787793]  udp_v6_send_skb+0x207/0x460
> [  454.787796]  udpv6_sendmsg+0xb13/0xdb0
> [  454.787797]  inet6_sendmsg+0x65/0x70
> [  454.787798]  sock_sendmsg+0x48/0x70
> [  454.787800] leaked reference. Age    35.147534
> [  454.787800]  dst_alloc+0x7a/0x180
> [  454.787802]  ip6_dst_alloc+0x27/0x90
> [  454.787803]  ip6_pol_route+0x257/0x430
> [  454.787805]  ip6_pol_route_output+0x19/0x20
> [  454.787806]  fib6_rule_lookup+0x18b/0x270
> [  454.787808]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787810]  ip6_route_output_flags+0x32/0xa0
> [  454.787811]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [  454.787813]  ip6_dst_lookup_flow+0x43/0xa0
> [  454.787814]  inet6_csk_route_socket+0x166/0x200
> [  454.787817]  inet6_csk_xmit+0x56/0x130
> [  454.787818]  __tcp_transmit_skb+0x53b/0xc30
> [  454.787819]  __tcp_send_ack.part.0+0xc6/0x1a0
> [  454.787821]  tcp_send_ack+0x1c/0x20
> [  454.787823]  __tcp_ack_snd_check+0x42/0x200
> [  454.787824]  tcp_rcv_established+0x27a/0x6f0
> [  454.787826] leaked reference. Age    37.269626
> [  454.787826]  dst_alloc+0x7a/0x180
> [  454.787828]  ip6_dst_alloc+0x27/0x90
> [  454.787829]  ip6_pol_route+0x257/0x430
> [  454.787831]  ip6_pol_route_output+0x19/0x20
> [  454.787833]  fib6_rule_lookup+0x18b/0x270
> [  454.787835]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787836]  ip6_route_output_flags+0x32/0xa0
> [  454.787837]  ip6_dst_lookup_tail.constprop.0+0xde/0x240
> [  454.787839]  ip6_dst_lookup_flow+0x43/0xa0
> [  454.787841]  tcp_v6_connect+0x2a7/0x670
> [  454.787843]  __inet_stream_connect+0xd1/0x3b0
> [  454.787845]  inet_stream_connect+0x3b/0x60
> [  454.787846]  __sys_connect_file+0x5f/0x70
> [  454.787848]  __sys_connect+0xa2/0xd0
> [  454.787849]  __x64_sys_connect+0x18/0x20
> [  454.787850]  do_syscall_64+0x3b/0xc0
>
> Ages of 15 to 37 seconds. So these leaks look related to applications
> using tcpv6.
>
> [  454.787854] leaked reference. Age   385.273596
> [  454.787855]  dst_alloc+0x7a/0x180
> [  454.787857]  ip6_dst_alloc+0x27/0x90
> [  454.787858]  ip6_pol_route+0x257/0x430
> [  454.787859]  ip6_pol_route_output+0x19/0x20
> [  454.787861]  fib6_rule_lookup+0x18b/0x270
> [  454.787863]  ip6_route_output_flags_noref+0xaa/0x110
> [  454.787864]  ip6_route_output_flags+0x32/0xa0
> [  454.787866]  ip6_dst_lookup_tail.constprop.0+0x181/0x240
> [  454.787867]  ip6_dst_lookup_flow+0x43/0xa0
> [  454.787869]  inet6_csk_route_req+0x11b/0x150
> [  454.787871]  tcp_v6_route_req+0xa8/0x140
> [  454.787873]  tcp_conn_request+0x349/0xcd0
> [  454.787875]  tcp_v6_conn_request+0x64/0xd0
> [  454.787877]  tcp_rcv_state_process+0x25b/0x1070
> [  454.787878]  tcp_v6_do_rcv+0x1c4/0x4a0
> [  454.787881]  tcp_v6_rcv+0xea3/0xee0
> [  454.787883] leaked reference. Age   389.378759
> [  454.787884]  fib6_nh_init+0x30d/0x9c0
> [  454.787885]  rtm_new_nexthop+0x68a/0x13a0
> [  454.787888]  rtnetlink_rcv_msg+0x14f/0x380
> [  454.787891]  netlink_rcv_skb+0x55/0x100
> [  454.787893]  rtnetlink_rcv+0x15/0x20
> [  454.787895]  netlink_unicast+0x230/0x340
> [  454.787897]  netlink_sendmsg+0x252/0x4b0
> [  454.787899]  sock_sendmsg+0x65/0x70
> [  454.787900]  ____sys_sendmsg+0x24e/0x290
> [  454.787902]  ___sys_sendmsg+0x81/0xc0
> [  454.787903]  __sys_sendmsg+0x62/0xb0
> [  454.787905]  __x64_sys_sendmsg+0x1d/0x20
> [  454.787907]  do_syscall_64+0x3b/0xc0
> [  454.787908]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  454.787911] leaked reference. Age   402.739110
> [  454.787911]  ipv6_add_dev+0x13e/0x4f0
> [  454.787914]  addrconf_notify+0x2ca/0x950
> [  454.787917]  raw_notifier_call_chain+0x49/0x60
> [  454.787920]  call_netdevice_notifiers_info+0x50/0x90
> [  454.787923]  __dev_change_net_namespace+0x30d/0x6c0
> [  454.787926]  do_setlink+0xdc/0x10b0
> [  454.787928]  __rtnl_newlink+0x608/0xa10
> [  454.787931]  rtnl_newlink+0x49/0x70
> [  454.787933]  rtnetlink_rcv_msg+0x14f/0x380
> [  454.787935]  netlink_rcv_skb+0x55/0x100
> [  454.787937]  rtnetlink_rcv+0x15/0x20
> [  454.787939]  netlink_unicast+0x230/0x340
> [  454.787941]  netlink_sendmsg+0x252/0x4b0
> [  454.787942]  sock_sendmsg+0x65/0x70
> [  454.787944]  ____sys_sendmsg+0x24e/0x290
> [  454.787945]  ___sys_sendmsg+0x81/0xc0
>
> Here the age is much longer, more like the age of the container and
> early setup of the container. I took a quick look at this last trace,
> and it is the netdev being moved from one namespace to another.
>
> So it could be there are two leaks here?

Possibly something is wrong when a netdevice is going through
__dev_change_net_namespace(),
there might be a missing reparenting of some objects.

The ipv6_add_dev() presence might give us a hint.

>
>    Andrew
>
> rom d3a3149c4e8f5020942ac00fe3bce6a1303f10b7 Mon Sep 17 00:00:00 2001
> From: Andrew Lunn <andrew@lunn.ch>
> Date: Wed, 8 Dec 2021 10:25:05 -0600
> Subject: [PATCH] lib: ref_tracker: Add timestamp for reference
>
> It can be useful to know if the leaked reference is old, or recent.
> Is the bug in interface create and release, or more dynamic like a
> connection.
>
> Add a timestamp to each reference record, and print the age when
> dumping the records.
>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
>  lib/ref_tracker.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
> index 0ae2e66dcf0f..82a1c3681969 100644
> --- a/lib/ref_tracker.c
> +++ b/lib/ref_tracker.c
> @@ -12,6 +12,7 @@ struct ref_tracker {
>         bool                    dead;
>         depot_stack_handle_t    alloc_stack_handle;
>         depot_stack_handle_t    free_stack_handle;
> +       ktime_t                 alloc_ktime;

I do not think ns precision of a ktime_t is really needed.

jiffies should be enough, saving potential high cost of ktime_get_ns()

>  };
>
>  void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
> @@ -44,13 +45,17 @@ void ref_tracker_dir_print(struct ref_tracker_dir *dir,
>                            unsigned int display_limit)
>  {
>         struct ref_tracker *tracker;
> +       ktime_t now = ktime_get();
>         unsigned long flags;
>         unsigned int i = 0;
> -
>         spin_lock_irqsave(&dir->lock, flags);
> +
>         list_for_each_entry(tracker, &dir->list, head) {
>                 if (i < display_limit) {
> -                       pr_err("leaked reference.\n");
> +                       ktime_t age = ktime_sub(now, tracker->alloc_ktime);
> +                       unsigned long rem_nsec = do_div(age, 1000000000);
> +                       pr_err("leaked reference. Age %5lu.%06lu\n",
> +                              (unsigned long)age, rem_nsec / 1000);
>                         if (tracker->alloc_stack_handle)
>                                 stack_depot_print(tracker->alloc_stack_handle);
>                         i++;
> @@ -80,6 +85,7 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
>         nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
>         nr_entries = filter_irq_stacks(entries, nr_entries);
>         tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
> +       tracker->alloc_ktime = ktime_get();
>
>         spin_lock_irqsave(&dir->lock, flags);
>         list_add(&tracker->head, &dir->list);
> --
> 2.33.1
>
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking
  2021-12-08 18:21                       ` Eric Dumazet
@ 2021-12-08 18:53                         ` Eric Dumazet
  0 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-08 18:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Wed, Dec 8, 2021 at 10:21 AM Eric Dumazet <edumazet@google.com> wrote:
>

> I do not think ns precision of a ktime_t is really needed.
>
> jiffies should be enough, saving potential high cost of ktime_get_ns()
>

Also worth mentioning this is doubling size of ref_tracker.
(going from 32-byte to 64-bytes standard kmem cache)

I would advise creating a dedicated kmem_cache to avoid wasting memory.

current layout being:

struct ref_tracker {
 struct list_head           head;                 /*     0  0x10 */
 bool                       dead;                 /*  0x10   0x1 */

 /* XXX 3 bytes hole, try to pack */

 depot_stack_handle_t       alloc_stack_handle;   /*  0x14   0x4 */
 depot_stack_handle_t       free_stack_handle;    /*  0x18   0x4 */

 /* size: 32, cachelines: 1, members: 4 */
 /* sum members: 25, holes: 1, sum holes: 3 */
 /* padding: 4 */
 /* last cacheline: 32 bytes */
};

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
  2021-12-08 14:09   ` Andrzej Hajda
@ 2021-12-15 10:18   ` Jiri Slaby
  2021-12-15 10:38     ` Eric Dumazet
  1 sibling, 1 reply; 52+ messages in thread
From: Jiri Slaby @ 2021-12-15 10:18 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Dmitry Vyukov

On 05. 12. 21, 5:21, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> It can be hard to track where references are taken and released.
> 
> In networking, we have annoying issues at device or netns dismantles,
> and we had various proposals to ease root causing them.
...
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
>   	 Select the hash size as a power of 2 for the stackdepot hash table.
>   	 Choose a lower value to reduce the memory impact.
>   
> +config REF_TRACKER
> +	bool
> +	depends on STACKTRACE_SUPPORT
> +	select STACKDEPOT

Hi,

I have to:
+       select STACKDEPOT_ALWAYS_INIT
here. Otherwise I see this during boot:

> BUG: unable to handle page fault for address: 00000000001e6f80
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 1 PID: 1 Comm: swapper/0 Tainted: G          I       5.16.0-rc5-next-20211214-vanilla+ #46 2756e36611a8c8a8271884ae04571fc88e1cb566
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
> RIP: 0010:__stack_depot_save (lib/stackdepot.c:373) 
> Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> All code
> ========
>    0:	04 31                	add    $0x31,%al
>    2:	fb                   	sti    
>    3:	83 fe 03             	cmp    $0x3,%esi
>    6:	77 97                	ja     0xffffffffffffff9f
>    8:	83 fe 02             	cmp    $0x2,%esi
>    b:	74 7a                	je     0x87
>    d:	83 fe 03             	cmp    $0x3,%esi
>   10:	74 72                	je     0x84
>   12:	83 fe 01             	cmp    $0x1,%esi
>   15:	74 73                	je     0x8a
>   17:	48 8b 05 45 ec 11 02 	mov    0x211ec45(%rip),%rax        # 0x211ec63
>   1e:	89 d9                	mov    %ebx,%ecx
>   20:	81 e1 ff ff 0f 00    	and    $0xfffff,%ecx
>   26:	48 8d 0c c8          	lea    (%rax,%rcx,8),%rcx
>   2a:*	48 8b 29             	mov    (%rcx),%rbp		<-- trapping instruction
>   2d:	48 85 ed             	test   %rbp,%rbp
>   30:	75 12                	jne    0x44
>   32:	e9 9f 00 00 00       	jmp    0xd6
>   37:	48 8b 6d 00          	mov    0x0(%rbp),%rbp
>   3b:	48 85 ed             	test   %rbp,%rbp
>   3e:	0f                   	.byte 0xf
>   3f:	84                   	.byte 0x84
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	48 8b 29             	mov    (%rcx),%rbp
>    3:	48 85 ed             	test   %rbp,%rbp
>    6:	75 12                	jne    0x1a
>    8:	e9 9f 00 00 00       	jmp    0xac
>    d:	48 8b 6d 00          	mov    0x0(%rbp),%rbp
>   11:	48 85 ed             	test   %rbp,%rbp
>   14:	0f                   	.byte 0xf
>   15:	84                   	.byte 0x84
> RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0
> Call Trace:
> <TASK>
> ref_tracker_alloc (lib/ref_tracker.c:84) 
> net_rx_queue_update_kobjects (net/core/net-sysfs.c:1049 net/core/net-sysfs.c:1101) 
> netdev_register_kobject (net/core/net-sysfs.c:1761 net/core/net-sysfs.c:2012) 
> register_netdevice (net/core/dev.c:9660) 
> register_netdev (net/core/dev.c:9784) 
> loopback_net_init (drivers/net/loopback.c:217) 
> ops_init (net/core/net_namespace.c:140) 
> register_pernet_operations (net/core/net_namespace.c:1148 net/core/net_namespace.c:1217) 
> register_pernet_device (net/core/net_namespace.c:1304) 
> net_dev_init (net/core/dev.c:11014) 
> ? sysctl_core_init (net/core/dev.c:10958) 
> do_one_initcall (init/main.c:1303) 
> kernel_init_freeable (init/main.c:1377 init/main.c:1394 init/main.c:1413 init/main.c:1618) 
> ? rest_init (init/main.c:1499) 
> kernel_init (init/main.c:1509) 
> ret_from_fork (arch/x86/entry/entry_64.S:301) 
> </TASK>
> Modules linked in:
> CR2: 00000000001e6f80
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__stack_depot_save (lib/stackdepot.c:373) 
> Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> All code
> ========
>    0:	04 31                	add    $0x31,%al
>    2:	fb                   	sti    
>    3:	83 fe 03             	cmp    $0x3,%esi
>    6:	77 97                	ja     0xffffffffffffff9f
>    8:	83 fe 02             	cmp    $0x2,%esi
>    b:	74 7a                	je     0x87
>    d:	83 fe 03             	cmp    $0x3,%esi
>   10:	74 72                	je     0x84
>   12:	83 fe 01             	cmp    $0x1,%esi
>   15:	74 73                	je     0x8a
>   17:	48 8b 05 45 ec 11 02 	mov    0x211ec45(%rip),%rax        # 0x211ec63
>   1e:	89 d9                	mov    %ebx,%ecx
>   20:	81 e1 ff ff 0f 00    	and    $0xfffff,%ecx
>   26:	48 8d 0c c8          	lea    (%rax,%rcx,8),%rcx
>   2a:*	48 8b 29             	mov    (%rcx),%rbp		<-- trapping instruction
>   2d:	48 85 ed             	test   %rbp,%rbp
>   30:	75 12                	jne    0x44
>   32:	e9 9f 00 00 00       	jmp    0xd6
>   37:	48 8b 6d 00          	mov    0x0(%rbp),%rbp
>   3b:	48 85 ed             	test   %rbp,%rbp
>   3e:	0f                   	.byte 0xf
>   3f:	84                   	.byte 0x84
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	48 8b 29             	mov    (%rcx),%rbp
>    3:	48 85 ed             	test   %rbp,%rbp
>    6:	75 12                	jne    0x1a
>    8:	e9 9f 00 00 00       	jmp    0xac
>    d:	48 8b 6d 00          	mov    0x0(%rbp),%rbp
>   11:	48 85 ed             	test   %rbp,%rbp
>   14:	0f                   	.byte 0xf
>   15:	84                   	.byte 0x84
> RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0

regards,
-- 
js

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 10:18   ` Jiri Slaby
@ 2021-12-15 10:38     ` Eric Dumazet
  2021-12-15 10:41       ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-15 10:38 UTC (permalink / raw)
  To: Jiri Slaby, Vlastimil Babka
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Wed, Dec 15, 2021 at 2:18 AM Jiri Slaby <jirislaby@gmail.com> wrote:
>
> On 05. 12. 21, 5:21, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > It can be hard to track where references are taken and released.
> >
> > In networking, we have annoying issues at device or netns dismantles,
> > and we had various proposals to ease root causing them.
> ...
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
> >        Select the hash size as a power of 2 for the stackdepot hash table.
> >        Choose a lower value to reduce the memory impact.
> >
> > +config REF_TRACKER
> > +     bool
> > +     depends on STACKTRACE_SUPPORT
> > +     select STACKDEPOT
>
> Hi,
>
> I have to:
> +       select STACKDEPOT_ALWAYS_INIT
> here. Otherwise I see this during boot:
>

Thanks, I am adding Vlastimil Babka to the CC

This stuff has been added in
commit e88cc9f5e2e7a5d28a1adf12615840fab4cbebfd
Author: Vlastimil Babka <vbabka@suse.cz>
Date:   Tue Dec 14 21:50:42 2021 +0000

    lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()



> > BUG: unable to handle page fault for address: 00000000001e6f80
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0
> > Oops: 0000 [#1] PREEMPT SMP PTI
> > CPU: 1 PID: 1 Comm: swapper/0 Tainted: G          I       5.16.0-rc5-next-20211214-vanilla+ #46 2756e36611a8c8a8271884ae04571fc88e1cb566
> > Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
> > RIP: 0010:__stack_depot_save (lib/stackdepot.c:373)
> > Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> > All code
> > ========
> >    0: 04 31                   add    $0x31,%al
> >    2: fb                      sti
> >    3: 83 fe 03                cmp    $0x3,%esi
> >    6: 77 97                   ja     0xffffffffffffff9f
> >    8: 83 fe 02                cmp    $0x2,%esi
> >    b: 74 7a                   je     0x87
> >    d: 83 fe 03                cmp    $0x3,%esi
> >   10: 74 72                   je     0x84
> >   12: 83 fe 01                cmp    $0x1,%esi
> >   15: 74 73                   je     0x8a
> >   17: 48 8b 05 45 ec 11 02    mov    0x211ec45(%rip),%rax        # 0x211ec63
> >   1e: 89 d9                   mov    %ebx,%ecx
> >   20: 81 e1 ff ff 0f 00       and    $0xfffff,%ecx
> >   26: 48 8d 0c c8             lea    (%rax,%rcx,8),%rcx
> >   2a:*        48 8b 29                mov    (%rcx),%rbp              <-- trapping instruction
> >   2d: 48 85 ed                test   %rbp,%rbp
> >   30: 75 12                   jne    0x44
> >   32: e9 9f 00 00 00          jmp    0xd6
> >   37: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> >   3b: 48 85 ed                test   %rbp,%rbp
> >   3e: 0f                      .byte 0xf
> >   3f: 84                      .byte 0x84
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0: 48 8b 29                mov    (%rcx),%rbp
> >    3: 48 85 ed                test   %rbp,%rbp
> >    6: 75 12                   jne    0x1a
> >    8: e9 9f 00 00 00          jmp    0xac
> >    d: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> >   11: 48 85 ed                test   %rbp,%rbp
> >   14: 0f                      .byte 0xf
> >   15: 84                      .byte 0x84
> > RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> > RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> > RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> > RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> > R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> > R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> > FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0
> > Call Trace:
> > <TASK>
> > ref_tracker_alloc (lib/ref_tracker.c:84)
> > net_rx_queue_update_kobjects (net/core/net-sysfs.c:1049 net/core/net-sysfs.c:1101)
> > netdev_register_kobject (net/core/net-sysfs.c:1761 net/core/net-sysfs.c:2012)
> > register_netdevice (net/core/dev.c:9660)
> > register_netdev (net/core/dev.c:9784)
> > loopback_net_init (drivers/net/loopback.c:217)
> > ops_init (net/core/net_namespace.c:140)
> > register_pernet_operations (net/core/net_namespace.c:1148 net/core/net_namespace.c:1217)
> > register_pernet_device (net/core/net_namespace.c:1304)
> > net_dev_init (net/core/dev.c:11014)
> > ? sysctl_core_init (net/core/dev.c:10958)
> > do_one_initcall (init/main.c:1303)
> > kernel_init_freeable (init/main.c:1377 init/main.c:1394 init/main.c:1413 init/main.c:1618)
> > ? rest_init (init/main.c:1499)
> > kernel_init (init/main.c:1509)
> > ret_from_fork (arch/x86/entry/entry_64.S:301)
> > </TASK>
> > Modules linked in:
> > CR2: 00000000001e6f80
> > ---[ end trace 0000000000000000 ]---
> > RIP: 0010:__stack_depot_save (lib/stackdepot.c:373)
> > Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> > All code
> > ========
> >    0: 04 31                   add    $0x31,%al
> >    2: fb                      sti
> >    3: 83 fe 03                cmp    $0x3,%esi
> >    6: 77 97                   ja     0xffffffffffffff9f
> >    8: 83 fe 02                cmp    $0x2,%esi
> >    b: 74 7a                   je     0x87
> >    d: 83 fe 03                cmp    $0x3,%esi
> >   10: 74 72                   je     0x84
> >   12: 83 fe 01                cmp    $0x1,%esi
> >   15: 74 73                   je     0x8a
> >   17: 48 8b 05 45 ec 11 02    mov    0x211ec45(%rip),%rax        # 0x211ec63
> >   1e: 89 d9                   mov    %ebx,%ecx
> >   20: 81 e1 ff ff 0f 00       and    $0xfffff,%ecx
> >   26: 48 8d 0c c8             lea    (%rax,%rcx,8),%rcx
> >   2a:*        48 8b 29                mov    (%rcx),%rbp              <-- trapping instruction
> >   2d: 48 85 ed                test   %rbp,%rbp
> >   30: 75 12                   jne    0x44
> >   32: e9 9f 00 00 00          jmp    0xd6
> >   37: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> >   3b: 48 85 ed                test   %rbp,%rbp
> >   3e: 0f                      .byte 0xf
> >   3f: 84                      .byte 0x84
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0: 48 8b 29                mov    (%rcx),%rbp
> >    3: 48 85 ed                test   %rbp,%rbp
> >    6: 75 12                   jne    0x1a
> >    8: e9 9f 00 00 00          jmp    0xac
> >    d: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> >   11: 48 85 ed                test   %rbp,%rbp
> >   14: 0f                      .byte 0xf
> >   15: 84                      .byte 0x84
> > RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> > RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> > RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> > RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> > R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> > R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> > FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0
>
> regards,
> --
> js

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 10:38     ` Eric Dumazet
@ 2021-12-15 10:41       ` Eric Dumazet
  2021-12-15 10:57         ` Vlastimil Babka
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-15 10:41 UTC (permalink / raw)
  To: Jiri Slaby, Vlastimil Babka
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Dmitry Vyukov

On Wed, Dec 15, 2021 at 2:38 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Dec 15, 2021 at 2:18 AM Jiri Slaby <jirislaby@gmail.com> wrote:
> >
> > On 05. 12. 21, 5:21, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > It can be hard to track where references are taken and released.
> > >
> > > In networking, we have annoying issues at device or netns dismantles,
> > > and we had various proposals to ease root causing them.
> > ...
> > > --- a/lib/Kconfig
> > > +++ b/lib/Kconfig
> > > @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
> > >        Select the hash size as a power of 2 for the stackdepot hash table.
> > >        Choose a lower value to reduce the memory impact.
> > >
> > > +config REF_TRACKER
> > > +     bool
> > > +     depends on STACKTRACE_SUPPORT
> > > +     select STACKDEPOT
> >
> > Hi,
> >
> > I have to:
> > +       select STACKDEPOT_ALWAYS_INIT
> > here. Otherwise I see this during boot:
> >
>
> Thanks, I am adding Vlastimil Babka to the CC
>
> This stuff has been added in
> commit e88cc9f5e2e7a5d28a1adf12615840fab4cbebfd
> Author: Vlastimil Babka <vbabka@suse.cz>
> Date:   Tue Dec 14 21:50:42 2021 +0000
>
>     lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
>
>

(This is a problem because this patch is not yet in net-next, so I really do
not know how this issue should be handled)

>
> > > BUG: unable to handle page fault for address: 00000000001e6f80
> > > #PF: supervisor read access in kernel mode
> > > #PF: error_code(0x0000) - not-present page
> > > PGD 0 P4D 0
> > > Oops: 0000 [#1] PREEMPT SMP PTI
> > > CPU: 1 PID: 1 Comm: swapper/0 Tainted: G          I       5.16.0-rc5-next-20211214-vanilla+ #46 2756e36611a8c8a8271884ae04571fc88e1cb566
> > > Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
> > > RIP: 0010:__stack_depot_save (lib/stackdepot.c:373)
> > > Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> > > All code
> > > ========
> > >    0: 04 31                   add    $0x31,%al
> > >    2: fb                      sti
> > >    3: 83 fe 03                cmp    $0x3,%esi
> > >    6: 77 97                   ja     0xffffffffffffff9f
> > >    8: 83 fe 02                cmp    $0x2,%esi
> > >    b: 74 7a                   je     0x87
> > >    d: 83 fe 03                cmp    $0x3,%esi
> > >   10: 74 72                   je     0x84
> > >   12: 83 fe 01                cmp    $0x1,%esi
> > >   15: 74 73                   je     0x8a
> > >   17: 48 8b 05 45 ec 11 02    mov    0x211ec45(%rip),%rax        # 0x211ec63
> > >   1e: 89 d9                   mov    %ebx,%ecx
> > >   20: 81 e1 ff ff 0f 00       and    $0xfffff,%ecx
> > >   26: 48 8d 0c c8             lea    (%rax,%rcx,8),%rcx
> > >   2a:*        48 8b 29                mov    (%rcx),%rbp              <-- trapping instruction
> > >   2d: 48 85 ed                test   %rbp,%rbp
> > >   30: 75 12                   jne    0x44
> > >   32: e9 9f 00 00 00          jmp    0xd6
> > >   37: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> > >   3b: 48 85 ed                test   %rbp,%rbp
> > >   3e: 0f                      .byte 0xf
> > >   3f: 84                      .byte 0x84
> > >
> > > Code starting with the faulting instruction
> > > ===========================================
> > >    0: 48 8b 29                mov    (%rcx),%rbp
> > >    3: 48 85 ed                test   %rbp,%rbp
> > >    6: 75 12                   jne    0x1a
> > >    8: e9 9f 00 00 00          jmp    0xac
> > >    d: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> > >   11: 48 85 ed                test   %rbp,%rbp
> > >   14: 0f                      .byte 0xf
> > >   15: 84                      .byte 0x84
> > > RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> > > RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> > > RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> > > RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> > > R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> > > R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> > > FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0
> > > Call Trace:
> > > <TASK>
> > > ref_tracker_alloc (lib/ref_tracker.c:84)
> > > net_rx_queue_update_kobjects (net/core/net-sysfs.c:1049 net/core/net-sysfs.c:1101)
> > > netdev_register_kobject (net/core/net-sysfs.c:1761 net/core/net-sysfs.c:2012)
> > > register_netdevice (net/core/dev.c:9660)
> > > register_netdev (net/core/dev.c:9784)
> > > loopback_net_init (drivers/net/loopback.c:217)
> > > ops_init (net/core/net_namespace.c:140)
> > > register_pernet_operations (net/core/net_namespace.c:1148 net/core/net_namespace.c:1217)
> > > register_pernet_device (net/core/net_namespace.c:1304)
> > > net_dev_init (net/core/dev.c:11014)
> > > ? sysctl_core_init (net/core/dev.c:10958)
> > > do_one_initcall (init/main.c:1303)
> > > kernel_init_freeable (init/main.c:1377 init/main.c:1394 init/main.c:1413 init/main.c:1618)
> > > ? rest_init (init/main.c:1499)
> > > kernel_init (init/main.c:1509)
> > > ret_from_fork (arch/x86/entry/entry_64.S:301)
> > > </TASK>
> > > Modules linked in:
> > > CR2: 00000000001e6f80
> > > ---[ end trace 0000000000000000 ]---
> > > RIP: 0010:__stack_depot_save (lib/stackdepot.c:373)
> > > Code: 04 31 fb 83 fe 03 77 97 83 fe 02 74 7a 83 fe 03 74 72 83 fe 01 74 73 48 8b 05 45 ec 11 02 89 d9 81 e1 ff ff 0f 00 48 8d 0c c8 <48> 8b 29 48 85 ed 75 12 e9 9f 00 00 00 48 8b 6d 00 48 85 ed 0f 84
> > > All code
> > > ========
> > >    0: 04 31                   add    $0x31,%al
> > >    2: fb                      sti
> > >    3: 83 fe 03                cmp    $0x3,%esi
> > >    6: 77 97                   ja     0xffffffffffffff9f
> > >    8: 83 fe 02                cmp    $0x2,%esi
> > >    b: 74 7a                   je     0x87
> > >    d: 83 fe 03                cmp    $0x3,%esi
> > >   10: 74 72                   je     0x84
> > >   12: 83 fe 01                cmp    $0x1,%esi
> > >   15: 74 73                   je     0x8a
> > >   17: 48 8b 05 45 ec 11 02    mov    0x211ec45(%rip),%rax        # 0x211ec63
> > >   1e: 89 d9                   mov    %ebx,%ecx
> > >   20: 81 e1 ff ff 0f 00       and    $0xfffff,%ecx
> > >   26: 48 8d 0c c8             lea    (%rax,%rcx,8),%rcx
> > >   2a:*        48 8b 29                mov    (%rcx),%rbp              <-- trapping instruction
> > >   2d: 48 85 ed                test   %rbp,%rbp
> > >   30: 75 12                   jne    0x44
> > >   32: e9 9f 00 00 00          jmp    0xd6
> > >   37: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> > >   3b: 48 85 ed                test   %rbp,%rbp
> > >   3e: 0f                      .byte 0xf
> > >   3f: 84                      .byte 0x84
> > >
> > > Code starting with the faulting instruction
> > > ===========================================
> > >    0: 48 8b 29                mov    (%rcx),%rbp
> > >    3: 48 85 ed                test   %rbp,%rbp
> > >    6: 75 12                   jne    0x1a
> > >    8: e9 9f 00 00 00          jmp    0xac
> > >    d: 48 8b 6d 00             mov    0x0(%rbp),%rbp
> > >   11: 48 85 ed                test   %rbp,%rbp
> > >   14: 0f                      .byte 0xf
> > >   15: 84                      .byte 0x84
> > > RSP: 0000:ffffb3f700027b78 EFLAGS: 00010206
> > > RAX: 0000000000000000 RBX: 000000004ea3cdf0 RCX: 00000000001e6f80
> > > RDX: 000000000000000d RSI: 0000000000000002 RDI: 00000000793ec676
> > > RBP: ffff8b578094f4d0 R08: 0000000043abc8c3 R09: 000000000000000d
> > > R10: 0000000000000015 R11: 000000000000001c R12: 0000000000000001
> > > R13: 0000000000000cc0 R14: ffffb3f700027bd8 R15: 000000000000000d
> > > FS:  0000000000000000(0000) GS:ffff8b5845c80000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000001e6f80 CR3: 0000000199410000 CR4: 00000000000006e0
> >
> > regards,
> > --
> > js

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 10:41       ` Eric Dumazet
@ 2021-12-15 10:57         ` Vlastimil Babka
  2021-12-15 11:08           ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Vlastimil Babka @ 2021-12-15 10:57 UTC (permalink / raw)
  To: Eric Dumazet, Jiri Slaby, Andrew Morton
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev,
	Dmitry Vyukov, linux-mm


On 12/15/21 11:41, Eric Dumazet wrote:
> On Wed, Dec 15, 2021 at 2:38 AM Eric Dumazet <edumazet@google.com> wrote:
>>
>> On Wed, Dec 15, 2021 at 2:18 AM Jiri Slaby <jirislaby@gmail.com> wrote:
>> >
>> > On 05. 12. 21, 5:21, Eric Dumazet wrote:
>> > > From: Eric Dumazet <edumazet@google.com>
>> > >
>> > > It can be hard to track where references are taken and released.
>> > >
>> > > In networking, we have annoying issues at device or netns dismantles,
>> > > and we had various proposals to ease root causing them.
>> > ...
>> > > --- a/lib/Kconfig
>> > > +++ b/lib/Kconfig
>> > > @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
>> > >        Select the hash size as a power of 2 for the stackdepot hash table.
>> > >        Choose a lower value to reduce the memory impact.
>> > >
>> > > +config REF_TRACKER
>> > > +     bool
>> > > +     depends on STACKTRACE_SUPPORT
>> > > +     select STACKDEPOT
>> >
>> > Hi,
>> >
>> > I have to:
>> > +       select STACKDEPOT_ALWAYS_INIT
>> > here. Otherwise I see this during boot:
>> >
>>
>> Thanks, I am adding Vlastimil Babka to the CC
>>
>> This stuff has been added in
>> commit e88cc9f5e2e7a5d28a1adf12615840fab4cbebfd
>> Author: Vlastimil Babka <vbabka@suse.cz>
>> Date:   Tue Dec 14 21:50:42 2021 +0000
>>
>>     lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
>>
>>
> 
> (This is a problem because this patch is not yet in net-next, so I really do
> not know how this issue should be handled)

Looks like multiple new users of stackdepot start appearing as soon as I
touch it :)

The way we solved this with a new DRM user was Andrew adding a fixup to my
patch referenced above, in his "after-next" section of mm tree.
Should work here as well.

----8<----
From 0fa1f25925c05f8c5c4f776913d84904fb4c03a1 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Wed, 15 Dec 2021 11:52:10 +0100
Subject: [PATCH] lib/stackdepot: allow optional init and stack_table
 allocation by kvmalloc() - fixup4

Due to 4e66934eaadc ("lib: add reference counting tracking infrastructure")
landing recently to net-next adding a new stack depot user in lib/ref_tracker.c
we need to add an appropriate call to stack_depot_init() there as well.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/ref_tracker.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index c11c9db5825c..60f3453be23e 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -4,6 +4,7 @@
 #include <linux/refcount.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
+#include <linux/stackdepot.h>
 
 struct ref_tracker;
 
@@ -26,6 +27,7 @@ static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
 	spin_lock_init(&dir->lock);
 	dir->quarantine_avail = quarantine_count;
 	refcount_set(&dir->untracked, 1);
+	stack_depot_init();
 }
 
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 10:57         ` Vlastimil Babka
@ 2021-12-15 11:08           ` Eric Dumazet
  2021-12-15 11:09             ` Jiri Slaby
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2021-12-15 11:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Jiri Slaby, Andrew Morton, Eric Dumazet, David S . Miller,
	Jakub Kicinski, netdev, Dmitry Vyukov, linux-mm

On Wed, Dec 15, 2021 at 2:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
>
> On 12/15/21 11:41, Eric Dumazet wrote:
> > On Wed, Dec 15, 2021 at 2:38 AM Eric Dumazet <edumazet@google.com> wrote:
> >>
> >> On Wed, Dec 15, 2021 at 2:18 AM Jiri Slaby <jirislaby@gmail.com> wrote:
> >> >
> >> > On 05. 12. 21, 5:21, Eric Dumazet wrote:
> >> > > From: Eric Dumazet <edumazet@google.com>
> >> > >
> >> > > It can be hard to track where references are taken and released.
> >> > >
> >> > > In networking, we have annoying issues at device or netns dismantles,
> >> > > and we had various proposals to ease root causing them.
> >> > ...
> >> > > --- a/lib/Kconfig
> >> > > +++ b/lib/Kconfig
> >> > > @@ -680,6 +680,11 @@ config STACK_HASH_ORDER
> >> > >        Select the hash size as a power of 2 for the stackdepot hash table.
> >> > >        Choose a lower value to reduce the memory impact.
> >> > >
> >> > > +config REF_TRACKER
> >> > > +     bool
> >> > > +     depends on STACKTRACE_SUPPORT
> >> > > +     select STACKDEPOT
> >> >
> >> > Hi,
> >> >
> >> > I have to:
> >> > +       select STACKDEPOT_ALWAYS_INIT
> >> > here. Otherwise I see this during boot:
> >> >
> >>
> >> Thanks, I am adding Vlastimil Babka to the CC
> >>
> >> This stuff has been added in
> >> commit e88cc9f5e2e7a5d28a1adf12615840fab4cbebfd
> >> Author: Vlastimil Babka <vbabka@suse.cz>
> >> Date:   Tue Dec 14 21:50:42 2021 +0000
> >>
> >>     lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
> >>
> >>
> >
> > (This is a problem because this patch is not yet in net-next, so I really do
> > not know how this issue should be handled)
>
> Looks like multiple new users of stackdepot start appearing as soon as I
> touch it :)
>
> The way we solved this with a new DRM user was Andrew adding a fixup to my
> patch referenced above, in his "after-next" section of mm tree.
> Should work here as well.
>
> ----8<----
> From 0fa1f25925c05f8c5c4f776913d84904fb4c03a1 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Wed, 15 Dec 2021 11:52:10 +0100
> Subject: [PATCH] lib/stackdepot: allow optional init and stack_table
>  allocation by kvmalloc() - fixup4
>
> Due to 4e66934eaadc ("lib: add reference counting tracking infrastructure")
> landing recently to net-next adding a new stack depot user in lib/ref_tracker.c
> we need to add an appropriate call to stack_depot_init() there as well.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---

I guess this minimal fix will do.

In the future, when net-next (or net tree) has everything in place,
I will probably un-inline ref_tracker_dir_init() to avoid pulling
all these includes...

Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jiri Slab <jirislaby@gmail.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 11:08           ` Eric Dumazet
@ 2021-12-15 11:09             ` Jiri Slaby
  2021-12-15 11:25               ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Jiri Slaby @ 2021-12-15 11:09 UTC (permalink / raw)
  To: Eric Dumazet, Vlastimil Babka
  Cc: Andrew Morton, Eric Dumazet, David S . Miller, Jakub Kicinski,
	netdev, Dmitry Vyukov, linux-mm

On 15. 12. 21, 12:08, Eric Dumazet wrote:
> Reported-by: Jiri Slab <jirislaby@gmail.com>

(I am not the allocator :P.)

-- 
js

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure
  2021-12-15 11:09             ` Jiri Slaby
@ 2021-12-15 11:25               ` Eric Dumazet
  0 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2021-12-15 11:25 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Vlastimil Babka, Andrew Morton, Eric Dumazet, David S . Miller,
	Jakub Kicinski, netdev, Dmitry Vyukov, linux-mm

On Wed, Dec 15, 2021 at 3:09 AM Jiri Slaby <jirislaby@gmail.com> wrote:
>
> On 15. 12. 21, 12:08, Eric Dumazet wrote:
> > Reported-by: Jiri Slab <jirislaby@gmail.com>
>
> (I am not the allocator :P.)

Ah, it took me a while to understand, sorry for the typo ;)

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2021-12-15 11:25 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-05  4:21 [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Eric Dumazet
2021-12-05  4:21 ` [PATCH v3 net-next 01/23] lib: add reference counting tracking infrastructure Eric Dumazet
2021-12-08 14:09   ` Andrzej Hajda
2021-12-08 14:27     ` Dmitry Vyukov
2021-12-08 15:04       ` Andrzej Hajda
2021-12-08 14:59     ` Jakub Kicinski
2021-12-08 15:11       ` Andrzej Hajda
2021-12-15 10:18   ` Jiri Slaby
2021-12-15 10:38     ` Eric Dumazet
2021-12-15 10:41       ` Eric Dumazet
2021-12-15 10:57         ` Vlastimil Babka
2021-12-15 11:08           ` Eric Dumazet
2021-12-15 11:09             ` Jiri Slaby
2021-12-15 11:25               ` Eric Dumazet
2021-12-05  4:21 ` [PATCH v3 net-next 02/23] lib: add tests for reference tracker Eric Dumazet
2021-12-05  4:21 ` [PATCH v3 net-next 03/23] net: add net device refcount tracker infrastructure Eric Dumazet
2021-12-05  4:21 ` [PATCH v3 net-next 04/23] net: add net device refcount tracker to struct netdev_rx_queue Eric Dumazet
2021-12-05  4:21 ` [PATCH v3 net-next 05/23] net: add net device refcount tracker to struct netdev_queue Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 06/23] net: add net device refcount tracker to ethtool_phys_id() Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 07/23] net: add net device refcount tracker to dev_ifsioc() Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 08/23] drop_monitor: add net device refcount tracker Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 09/23] net: dst: add net device refcount tracking to dst_entry Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 10/23] ipv6: add net device refcount tracker to rt6_probe_deferred() Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 11/23] sit: add net device refcount tracking to ip_tunnel Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 12/23] ipv6: add net device refcount tracker to struct ip6_tnl Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 13/23] net: add net device refcount tracker to struct neighbour Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 14/23] net: add net device refcount tracker to struct pneigh_entry Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 15/23] net: add net device refcount tracker to struct neigh_parms Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 16/23] net: add net device refcount tracker to struct netdev_adjacent Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 17/23] ipv6: add net device refcount tracker to struct inet6_dev Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 18/23] ipv4: add net device refcount tracker to struct in_device Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 19/23] net/sched: add net device refcount tracker to struct Qdisc Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 20/23] net: linkwatch: add net device refcount tracker Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 21/23] net: failover: " Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 22/23] ipmr, ip6mr: add net device refcount tracker to struct vif_device Eric Dumazet
2021-12-05  4:22 ` [PATCH v3 net-next 23/23] netpoll: add net device refcount tracker to struct netpoll Eric Dumazet
2021-12-06 23:23 ` [PATCH v3 net-next 00/23] net: add preliminary netdev refcount tracking Andrew Lunn
2021-12-06 23:44   ` Eric Dumazet
2021-12-06 23:48     ` Eric Dumazet
2021-12-07  0:00     ` Andrew Lunn
2021-12-07  0:04       ` Eric Dumazet
2021-12-07  0:12         ` Andrew Lunn
2021-12-07  0:17           ` Eric Dumazet
2021-12-07  0:21             ` Eric Dumazet
2021-12-07  0:27             ` Andrew Lunn
2021-12-07  0:53               ` Eric Dumazet
2021-12-07 19:52                 ` Andrew Lunn
2021-12-07 20:00                   ` Eric Dumazet
2021-12-08 17:29                     ` Andrew Lunn
2021-12-08 18:21                       ` Eric Dumazet
2021-12-08 18:53                         ` Eric Dumazet
2021-12-07  0:26 ` Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.