* [PATCH net-next 0/3] net: device tracking improvements
@ 2022-02-04 18:36 Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 1/3] ref_tracker: implement use-after-free detection Eric Dumazet
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 18:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
Main goal of this series is to be able to detect the following case
which apparently is still haunting us.
dev_hold_track(dev, tracker_1, GFP_ATOMIC);
dev_hold(dev);
dev_put(dev);
dev_put(dev); // Should complain loudly here.
dev_put_track(dev, tracker_1); // instead of here (as before this series)
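To illustrate why a separate count moves the complaint one call earlier, here is a minimal userspace sketch of the five calls above. It is only a model under stated assumptions: `struct model_dev`, `model_hold()`, `model_put()` and the other `model_*` names are hypothetical and are not kernel APIs, and plain integers stand in for `refcount_t`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model of the scenario; not kernel code. */
struct model_dev {
	int shared;           /* one count for every reference (old view) */
	int tracked;          /* references taken with a tracker */
	int no_tracker;       /* references taken without a tracker */
	int step;             /* 1-based index of the current call */
	int first_bad_shared; /* call at which the shared count complains */
	int first_bad_split;  /* call at which the split counts complain */
};

static void model_hold(struct model_dev *d, int has_tracker)
{
	assert(d);
	d->step++;
	d->shared++;
	if (has_tracker)
		d->tracked++;
	else
		d->no_tracker++;
}

static void model_put(struct model_dev *d, int has_tracker)
{
	int *split = has_tracker ? &d->tracked : &d->no_tracker;

	d->step++;
	/* A put with nothing left to release is the bug we want to catch. */
	if (d->shared == 0 && !d->first_bad_shared)
		d->first_bad_shared = d->step;
	if (*split == 0 && !d->first_bad_split)
		d->first_bad_split = d->step;
	if (d->shared > 0)
		d->shared--;
	if (*split > 0)
		(*split)--;
}

/* Replays the five calls from the cover letter, in order. */
static void model_run_cover_letter_case(struct model_dev *d)
{
	model_hold(d, 1); /* dev_hold_track(dev, tracker_1, GFP_ATOMIC) */
	model_hold(d, 0); /* dev_hold(dev) */
	model_put(d, 0);  /* dev_put(dev) */
	model_put(d, 0);  /* dev_put(dev): one too many */
	model_put(d, 1);  /* dev_put_track(dev, tracker_1) */
}

static void model_report(const struct model_dev *d)
{
	printf("shared count complains at call %d\n", d->first_bad_shared);
	printf("split counts complain at call %d\n", d->first_bad_split);
}
```

In this model the shared count only notices the problem when the tracked reference is released, while the split counts flag the second untracked put directly.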
Eric Dumazet (3):
ref_tracker: implement use-after-free detection
ref_tracker: add a count of untracked references
net: refine dev_put()/dev_hold() debugging
include/linux/netdevice.h | 69 ++++++++++++++++++++++++-------------
include/linux/ref_tracker.h | 4 +++
lib/ref_tracker.c | 17 ++++++++-
net/core/dev.c | 2 +-
4 files changed, 67 insertions(+), 25 deletions(-)
--
2.35.0.263.gb82422642f-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net-next 1/3] ref_tracker: implement use-after-free detection
2022-02-04 18:36 [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
@ 2022-02-04 18:36 ` Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 2/3] ref_tracker: add a count of untracked references Eric Dumazet
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 18:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
Whenever ref_tracker_dir_exit() is called, mark the struct ref_tracker_dir
as dead.
Test the dead status from ref_tracker_alloc() and ref_tracker_free().
This should detect buggy dev_put()/dev_hold() calls happening too late
in the netdevice dismantle process.
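The idea can be sketched in userspace as follows, matching the diff in this patch, where the flag is cleared at init time and set in ref_tracker_dir_exit(). This is an assumption-laden model, not the kernel implementation: `struct sketch_dir` and the `sketch_*` helpers are hypothetical names, and a counter replaces WARN_ON_ONCE() so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ref_tracker_dir; only the flag is modeled. */
struct sketch_dir {
	bool dead;      /* set once the directory has been torn down */
	int late_calls; /* alloc/free calls that arrived after teardown */
};

static void sketch_dir_init(struct sketch_dir *dir)
{
	dir->dead = false;
	dir->late_calls = 0;
}

static void sketch_dir_exit(struct sketch_dir *dir)
{
	dir->dead = true;
}

/* Stands in for the WARN_ON_ONCE(dir->dead) checks; counts instead of warning. */
static void sketch_check_alive(struct sketch_dir *dir)
{
	assert(dir);
	if (dir->dead)
		dir->late_calls++;
}

/* Both entry points perform the same liveness check before any other work. */
static void sketch_alloc(struct sketch_dir *dir)
{
	sketch_check_alive(dir);
}

static void sketch_free(struct sketch_dir *dir)
{
	sketch_check_alive(dir);
}
```

A release that arrives after teardown is counted here, which corresponds to the point where the kernel version would emit its warning.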
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/ref_tracker.h | 2 ++
lib/ref_tracker.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 60f3453be23e6881725d383c55f93143fda1e7a2..a443abda937d86ff534225bf16b958a9da295a7d 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t lock;
unsigned int quarantine_avail;
refcount_t untracked;
+ bool dead;
struct list_head list; /* List of active trackers */
struct list_head quarantine; /* List of dead trackers */
#endif
@@ -26,6 +27,7 @@ static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
INIT_LIST_HEAD(&dir->quarantine);
spin_lock_init(&dir->lock);
dir->quarantine_avail = quarantine_count;
+ dir->dead = false;
refcount_set(&dir->untracked, 1);
stack_depot_init();
}
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index a6789c0c626b0f68ad67c264cd19177a63fb82d2..32ff6bd497f8e464eeb51a3628cb24bded0547da 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -20,6 +20,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
unsigned long flags;
bool leak = false;
+ dir->dead = true;
spin_lock_irqsave(&dir->lock, flags);
list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
list_del(&tracker->head);
@@ -72,6 +73,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
gfp_t gfp_mask = gfp;
unsigned long flags;
+ WARN_ON_ONCE(dir->dead);
+
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -100,6 +103,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
unsigned int nr_entries;
unsigned long flags;
+ WARN_ON_ONCE(dir->dead);
+
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
--
2.35.0.263.gb82422642f-goog
* [PATCH net-next 2/3] ref_tracker: add a count of untracked references
2022-02-04 18:36 [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 1/3] ref_tracker: implement use-after-free detection Eric Dumazet
@ 2022-02-04 18:36 ` Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 3/3] net: refine dev_put()/dev_hold() debugging Eric Dumazet
2022-02-04 21:51 ` [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
3 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 18:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
by a dev_hold_track().
To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.
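A rough userspace sketch of the NULL @trackerp handling is shown below. It is a simplified model, not the kernel code: `struct toy_dir`, `struct toy_tracker` and the `toy_*` functions are hypothetical, plain integers replace `refcount_t`, and the allocation-failure path is an assumption modeled on the existing `untracked` counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct ref_tracker and struct ref_tracker_dir. */
struct toy_tracker { int unused; };

struct toy_dir {
	int untracked;  /* bias of 1, as set by ref_tracker_dir_init() */
	int no_tracker; /* bias of 1; counts references with a NULL trackerp */
	int active;     /* number of live trackers */
};

static void toy_dir_init(struct toy_dir *dir)
{
	dir->untracked = 1;
	dir->no_tracker = 1;
	dir->active = 0;
}

static int toy_alloc(struct toy_dir *dir, struct toy_tracker **trackerp)
{
	assert(dir);
	if (!trackerp) {
		/* Caller has no tracker: account for it in the dedicated count. */
		dir->no_tracker++;
		return 0;
	}
	*trackerp = calloc(1, sizeof(**trackerp));
	if (!*trackerp) {
		dir->untracked++; /* assumed failure path for this model */
		return -ENOMEM;
	}
	dir->active++;
	return 0;
}

static int toy_free(struct toy_dir *dir, struct toy_tracker **trackerp)
{
	struct toy_tracker *tracker;

	if (!trackerp) {
		/* A rogue put drives this count below its bias. */
		dir->no_tracker--;
		return 0;
	}
	tracker = *trackerp;
	if (!tracker) {
		dir->untracked--;
		return -EEXIST;
	}
	free(tracker);
	dir->active--;
	return 0;
}

/* Mirrors the exit-time checks: every count must be back at its bias. */
static int toy_dir_balanced(const struct toy_dir *dir)
{
	return dir->active == 0 && dir->untracked == 1 && dir->no_tracker == 1;
}
```

Because NULL-tracker references never touch the tracker list, an extra put shows up as `no_tracker` dropping below its bias even when all tracked references are released correctly.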
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/ref_tracker.h | 2 ++
lib/ref_tracker.c | 12 +++++++++++-
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a443abda937d86ff534225bf16b958a9da295a7d..9ca353ab712b5e897d9b3e5cfcd7117b610dd01a 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t lock;
unsigned int quarantine_avail;
refcount_t untracked;
+ refcount_t no_tracker;
bool dead;
struct list_head list; /* List of active trackers */
struct list_head quarantine; /* List of dead trackers */
@@ -29,6 +30,7 @@ static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
dir->quarantine_avail = quarantine_count;
dir->dead = false;
refcount_set(&dir->untracked, 1);
+ refcount_set(&dir->no_tracker, 1);
stack_depot_init();
}
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 32ff6bd497f8e464eeb51a3628cb24bded0547da..9c0c2e09df666d19aba441f568762afbd1cad4d0 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -38,6 +38,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
+ WARN_ON_ONCE(refcount_read(&dir->no_tracker) != 1);
}
EXPORT_SYMBOL(ref_tracker_dir_exit);
@@ -75,6 +76,10 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
WARN_ON_ONCE(dir->dead);
+ if (!trackerp) {
+ refcount_inc(&dir->no_tracker);
+ return 0;
+ }
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -98,13 +103,18 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
struct ref_tracker **trackerp)
{
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
- struct ref_tracker *tracker = *trackerp;
depot_stack_handle_t stack_handle;
+ struct ref_tracker *tracker;
unsigned int nr_entries;
unsigned long flags;
WARN_ON_ONCE(dir->dead);
+ if (!trackerp) {
+ refcount_dec(&dir->no_tracker);
+ return 0;
+ }
+ tracker = *trackerp;
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
--
2.35.0.263.gb82422642f-goog
* [PATCH net-next 3/3] net: refine dev_put()/dev_hold() debugging
2022-02-04 18:36 [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 1/3] ref_tracker: implement use-after-free detection Eric Dumazet
2022-02-04 18:36 ` [PATCH net-next 2/3] ref_tracker: add a count of untracked references Eric Dumazet
@ 2022-02-04 18:36 ` Eric Dumazet
2022-02-04 21:51 ` [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
3 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 18:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
We are still chasing some syzbot reports where we think a rogue dev_put()
is called with no corresponding prior dev_hold().
Unfortunately it eats a reference on dev->dev_refcnt taken by an innocent
dev_hold_track(), meaning that the refcount saturation splat comes
too late to be useful.
Make 'not tracked' dev_put() and dev_hold() make better use of the
CONFIG_NET_DEV_REFCNT_TRACKER=y debug infrastructure.
The prior patch in this series allowed ref_tracker_alloc() and
ref_tracker_free() to be called with a NULL @trackerp parameter, and to use
a separate refcount for such calls. This lets us detect one dev_put() too
many even in the following case:
dev_hold_track(dev, tracker_1, GFP_ATOMIC);
dev_hold(dev);
dev_put(dev);
dev_put(dev); // Should complain loudly here.
dev_put_track(dev, tracker_1); // instead of here
Also add a comment clarifying the role of netdev_tracker_alloc().
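The upgrade path described in the new netdev_tracker_alloc() comment can be modeled in userspace roughly as below. This is a sketch under assumptions: `struct mini_dev` and the `mini_*` helpers are hypothetical names, integers replace `refcount_t`, and tracker allocation is reduced to a counter.

```c
#include <assert.h>

/* Hypothetical model of the counters involved; not kernel code. */
struct mini_dev {
	int dev_refcnt; /* overall reference count, 1 after allocation */
	int no_tracker; /* bias of 1 plus references held without a tracker */
	int trackers;   /* references that own a tracker */
};

static void mini_dev_init(struct mini_dev *d)
{
	d->dev_refcnt = 1;
	d->no_tracker = 1;
	d->trackers = 0;
}

/* Models dev_hold(): a reference without a tracker, as a lookup would take. */
static void mini_dev_hold(struct mini_dev *d)
{
	d->dev_refcnt++;
	d->no_tracker++;
}

/* Models netdev_tracker_alloc(): turn one untracked reference into a tracked one. */
static void mini_tracker_alloc(struct mini_dev *d)
{
	assert(d->no_tracker > 1); /* there must be an untracked reference to upgrade */
	d->no_tracker--;
	d->trackers++;
}

/* Models dev_put_track(): release a tracked reference. */
static void mini_dev_put_track(struct mini_dev *d)
{
	assert(d->trackers > 0);
	d->trackers--;
	d->dev_refcnt--;
}

/* All counts are back at their initial values once everything is released. */
static int mini_dev_balanced(const struct mini_dev *d)
{
	return d->dev_refcnt == 1 && d->no_tracker == 1 && d->trackers == 0;
}
```

The overall count is unchanged by the upgrade; only the accounting moves from the untracked bucket to the tracked one, which is why the later dev_put_track() leaves everything balanced.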
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/netdevice.h | 69 ++++++++++++++++++++++++++-------------
net/core/dev.c | 2 +-
2 files changed, 47 insertions(+), 24 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e490b84732d1654bf067b30f2bb0b0825f88dea9..3fb6fb67ed77e70314a699c9bdf8f4b26acfcc19 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3817,14 +3817,7 @@ extern unsigned int netdev_budget_usecs;
/* Called by rtnetlink.c:rtnl_unlock() */
void netdev_run_todo(void);
-/**
- * dev_put - release reference to device
- * @dev: network device
- *
- * Release reference to device to allow it to be freed.
- * Try using dev_put_track() instead.
- */
-static inline void dev_put(struct net_device *dev)
+static inline void __dev_put(struct net_device *dev)
{
if (dev) {
#ifdef CONFIG_PCPU_DEV_REFCNT
@@ -3835,14 +3828,7 @@ static inline void dev_put(struct net_device *dev)
}
}
-/**
- * dev_hold - get reference to device
- * @dev: network device
- *
- * Hold reference to device to keep it from being freed.
- * Try using dev_hold_track() instead.
- */
-static inline void dev_hold(struct net_device *dev)
+static inline void __dev_hold(struct net_device *dev)
{
if (dev) {
#ifdef CONFIG_PCPU_DEV_REFCNT
@@ -3853,11 +3839,24 @@ static inline void dev_hold(struct net_device *dev)
}
}
+static inline void __netdev_tracker_alloc(struct net_device *dev,
+ netdevice_tracker *tracker,
+ gfp_t gfp)
+{
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+ ref_tracker_alloc(&dev->refcnt_tracker, tracker, gfp);
+#endif
+}
+
+/* netdev_tracker_alloc() can upgrade a prior untracked reference
+ * taken by dev_get_by_name()/dev_get_by_index() to a tracked one.
+ */
static inline void netdev_tracker_alloc(struct net_device *dev,
netdevice_tracker *tracker, gfp_t gfp)
{
#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
- ref_tracker_alloc(&dev->refcnt_tracker, tracker, gfp);
+ refcount_dec(&dev->refcnt_tracker.no_tracker);
+ __netdev_tracker_alloc(dev, tracker, gfp);
#endif
}
@@ -3873,8 +3872,8 @@ static inline void dev_hold_track(struct net_device *dev,
netdevice_tracker *tracker, gfp_t gfp)
{
if (dev) {
- dev_hold(dev);
- netdev_tracker_alloc(dev, tracker, gfp);
+ __dev_hold(dev);
+ __netdev_tracker_alloc(dev, tracker, gfp);
}
}
@@ -3883,10 +3882,34 @@ static inline void dev_put_track(struct net_device *dev,
{
if (dev) {
netdev_tracker_free(dev, tracker);
- dev_put(dev);
+ __dev_put(dev);
}
}
+/**
+ * dev_hold - get reference to device
+ * @dev: network device
+ *
+ * Hold reference to device to keep it from being freed.
+ * Try using dev_hold_track() instead.
+ */
+static inline void dev_hold(struct net_device *dev)
+{
+ dev_hold_track(dev, NULL, GFP_ATOMIC);
+}
+
+/**
+ * dev_put - release reference to device
+ * @dev: network device
+ *
+ * Release reference to device to allow it to be freed.
+ * Try using dev_put_track() instead.
+ */
+static inline void dev_put(struct net_device *dev)
+{
+ dev_put_track(dev, NULL);
+}
+
static inline void dev_replace_track(struct net_device *odev,
struct net_device *ndev,
netdevice_tracker *tracker,
@@ -3895,11 +3918,11 @@ static inline void dev_replace_track(struct net_device *odev,
if (odev)
netdev_tracker_free(odev, tracker);
- dev_hold(ndev);
- dev_put(odev);
+ __dev_hold(ndev);
+ __dev_put(odev);
if (ndev)
- netdev_tracker_alloc(ndev, tracker, gfp);
+ __netdev_tracker_alloc(ndev, tracker, gfp);
}
/* Carrier loss detection, dial on demand. The functions netif_carrier_on
diff --git a/net/core/dev.c b/net/core/dev.c
index f79744d99413434ad28b26dee9aeeb2893a0e3ae..1eaa0b88e3ba5d800484656f2c3420af57050294 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10172,7 +10172,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
dev->pcpu_refcnt = alloc_percpu(int);
if (!dev->pcpu_refcnt)
goto free_dev;
- dev_hold(dev);
+ __dev_hold(dev);
#else
refcount_set(&dev->dev_refcnt, 1);
#endif
--
2.35.0.263.gb82422642f-goog
* Re: [PATCH net-next 0/3] net: device tracking improvements
2022-02-04 18:36 [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
` (2 preceding siblings ...)
2022-02-04 18:36 ` [PATCH net-next 3/3] net: refine dev_put()/dev_hold() debugging Eric Dumazet
@ 2022-02-04 21:51 ` Eric Dumazet
2022-02-04 22:24 ` Eric Dumazet
3 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 21:51 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S . Miller, Jakub Kicinski, netdev
On Fri, Feb 4, 2022 at 10:36 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> Main goal of this series is to be able to detect the following case
> which apparently is still haunting us.
>
> dev_hold_track(dev, tracker_1, GFP_ATOMIC);
> dev_hold(dev);
> dev_put(dev);
> dev_put(dev); // Should complain loudly here.
> dev_put_track(dev, tracker_1); // instead of here (as before this series)
Please do not merge.
I have missed some warnings in my tests, it seems I need to refine
things a bit more.
* Re: [PATCH net-next 0/3] net: device tracking improvements
2022-02-04 21:51 ` [PATCH net-next 0/3] net: device tracking improvements Eric Dumazet
@ 2022-02-04 22:24 ` Eric Dumazet
0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-02-04 22:24 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S . Miller, Jakub Kicinski, netdev
On Fri, Feb 4, 2022 at 1:51 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Feb 4, 2022 at 10:36 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Main goal of this series is to be able to detect the following case
> > which apparently is still haunting us.
> >
> > dev_hold_track(dev, tracker_1, GFP_ATOMIC);
> > dev_hold(dev);
> > dev_put(dev);
> > dev_put(dev); // Should complain loudly here.
> > dev_put_track(dev, tracker_1); // instead of here (as before this series)
>
>
> Please do not merge.
>
> I have missed some warnings in my tests, it seems I need to refine
> things a bit more.
I had to add the following on top of the last patch. I will send a V2 soon.
diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index b0f5344d1185be66d05cd1dc50cffc5ccfe883ef..95098d1a49bdf4cbc3ddeb4d345e4276f974a208 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -166,10 +166,10 @@ static void linkwatch_do_dev(struct net_device *dev)
netdev_state_change(dev);
}
- /* Note: our callers are responsible for
- * calling netdev_tracker_free().
+ /* Note: our callers are responsible for calling netdev_tracker_free().
+ * This is the reason we use __dev_put() instead of dev_put().
*/
- dev_put(dev);
+ __dev_put(dev);
}
static void __linkwatch_run_queue(int urgent_only)