All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] net: make unregister netdev warning timeout configurable
@ 2021-03-23  6:49 Dmitry Vyukov
  2021-03-24  9:32 ` Leon Romanovsky
  2021-03-24  9:40 ` Eric Dumazet
  0 siblings, 2 replies; 6+ messages in thread
From: Dmitry Vyukov @ 2021-03-23  6:49 UTC (permalink / raw)
  To: davem, edumazet, leon; +Cc: Dmitry Vyukov, netdev, linux-kernel

netdev_wait_allrefs() issues a warning if refcount does not drop to 0
after 10 seconds. While 10 second wait generally should not happen
under normal workload in normal environment, it seems to fire falsely
very often during fuzzing and/or in qemu emulation (~10x slower).
At least it's not possible to understand if it's really a false
positive or not. Automated testing generally bumps all timeouts
to very high values to avoid flake failures.
Add net.core.netdev_unregister_timeout_secs sysctl to make
the timeout configurable for automated testing systems.
Lowering the timeout may also be useful for e.g. manual bisection.
The default value matches the current behavior.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
Changes since v1:
 - use sysctl instead of a config
---
 Documentation/admin-guide/sysctl/net.rst | 11 +++++++++++
 include/linux/netdevice.h                |  1 +
 net/core/dev.c                           |  6 +++++-
 net/core/sysctl_net_core.c               | 10 ++++++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index f2ab8a5b6a4b8..2090bfc69aa50 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -311,6 +311,17 @@ permit to distribute the load on several cpus.
 If set to 1 (default), timestamps are sampled as soon as possible, before
 queueing.
 
+netdev_unregister_timeout_secs
+------------------------------
+
+Unregister network device timeout in seconds.
+This option controls the timeout (in seconds) used to issue a warning while
+waiting for a network device refcount to drop to 0 during device
+unregistration. A lower value may be useful during bisection to detect
+a leaked reference faster. A larger value may be useful to prevent false
+warnings on slow/loaded systems.
+Default value is 10, minimum 0, maximum 3600.
+
 optmem_max
 ----------
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 87a5d186faff4..179c5693f5119 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4611,6 +4611,7 @@ void dev_get_tstats64(struct net_device *dev, struct rtnl_link_stats64 *s);
 
 extern int		netdev_max_backlog;
 extern int		netdev_tstamp_prequeue;
+extern int		netdev_unregister_timeout_secs;
 extern int		weight_p;
 extern int		dev_weight_rx_bias;
 extern int		dev_weight_tx_bias;
diff --git a/net/core/dev.c b/net/core/dev.c
index 0f72ff5d34ba0..7accbd4a3bec1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10344,6 +10344,8 @@ int netdev_refcnt_read(const struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_refcnt_read);
 
+int netdev_unregister_timeout_secs __read_mostly = 10;
+
 #define WAIT_REFS_MIN_MSECS 1
 #define WAIT_REFS_MAX_MSECS 250
 /**
@@ -10405,7 +10407,9 @@ static void netdev_wait_allrefs(struct net_device *dev)
 
 		refcnt = netdev_refcnt_read(dev);
 
-		if (refcnt && time_after(jiffies, warning_time + 10 * HZ)) {
+		if (refcnt &&
+		    time_after(jiffies, warning_time +
+			       netdev_unregister_timeout_secs * HZ)) {
 			pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 				 dev->name, refcnt);
 			warning_time = jiffies;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 4567de519603b..d84c8a1b280e2 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -24,6 +24,7 @@
 
 static int two = 2;
 static int three = 3;
+static int int_3600 = 3600;
 static int min_sndbuf = SOCK_MIN_SNDBUF;
 static int min_rcvbuf = SOCK_MIN_RCVBUF;
 static int max_skb_frags = MAX_SKB_FRAGS;
@@ -570,6 +571,15 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ONE,
 	},
+	{
+		.procname	= "netdev_unregister_timeout_secs",
+		.data		= &netdev_unregister_timeout_secs,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= &int_3600,
+	},
 	{ }
 };
 

base-commit: e0c755a45f6fb6e81e3a62a94db0400ef0cdc046
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] net: make unregister netdev warning timeout configurable
  2021-03-23  6:49 [PATCH v2] net: make unregister netdev warning timeout configurable Dmitry Vyukov
@ 2021-03-24  9:32 ` Leon Romanovsky
  2021-03-24  9:40 ` Eric Dumazet
  1 sibling, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2021-03-24  9:32 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: davem, edumazet, netdev, linux-kernel

On Tue, Mar 23, 2021 at 07:49:23AM +0100, Dmitry Vyukov wrote:
> netdev_wait_allrefs() issues a warning if refcount does not drop to 0
> after 10 seconds. While 10 second wait generally should not happen
> under normal workload in normal environment, it seems to fire falsely
> very often during fuzzing and/or in qemu emulation (~10x slower).
> At least it's not possible to understand if it's really a false
> positive or not. Automated testing generally bumps all timeouts
> to very high values to avoid flake failures.
> Add net.core.netdev_unregister_timeout_secs sysctl to make
> the timeout configurable for automated testing systems.
> Lowering the timeout may also be useful for e.g. manual bisection.
> The default value matches the current behavior.
> 
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> ---
> Changes since v1:
>  - use sysctl instead of a config
> ---
>  Documentation/admin-guide/sysctl/net.rst | 11 +++++++++++
>  include/linux/netdevice.h                |  1 +
>  net/core/dev.c                           |  6 +++++-
>  net/core/sysctl_net_core.c               | 10 ++++++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)
> 

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] net: make unregister netdev warning timeout configurable
  2021-03-23  6:49 [PATCH v2] net: make unregister netdev warning timeout configurable Dmitry Vyukov
  2021-03-24  9:32 ` Leon Romanovsky
@ 2021-03-24  9:40 ` Eric Dumazet
  2021-03-25  7:39   ` Dmitry Vyukov
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2021-03-24  9:40 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: David Miller, Leon Romanovsky, netdev, LKML

On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> netdev_wait_allrefs() issues a warning if refcount does not drop to 0
> after 10 seconds. While 10 second wait generally should not happen
> under normal workload in normal environment, it seems to fire falsely
> very often during fuzzing and/or in qemu emulation (~10x slower).
> At least it's not possible to understand if it's really a false
> positive or not. Automated testing generally bumps all timeouts
> to very high values to avoid flake failures.
> Add net.core.netdev_unregister_timeout_secs sysctl to make
> the timeout configurable for automated testing systems.
> Lowering the timeout may also be useful for e.g. manual bisection.
> The default value matches the current behavior.
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
>
> ---
> Changes since v1:
>  - use sysctl instead of a config
> ---

>         },
> +       {
> +               .procname       = "netdev_unregister_timeout_secs",
> +               .data           = &netdev_unregister_timeout_secs,
> +               .maxlen         = sizeof(unsigned int),
> +               .mode           = 0644,
> +               .proc_handler   = proc_dointvec_minmax,
> +               .extra1         = SYSCTL_ZERO,
> +               .extra2         = &int_3600,
> +       },
>         { }
>  };
>

If we allow the sysctl to be 0, then we risk a flood of pr_emerg()
(one per jiffy ?)

If you really want the zero value, you need to change pr_emerg() to
pr_emerg_ratelimited()

Also, please base your patch on net-next, to avoid future merge conflicts
with my prior patch add2d7363107 "net: set initial device refcount to 1".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] net: make unregister netdev warning timeout configurable
  2021-03-24  9:40 ` Eric Dumazet
@ 2021-03-25  7:39   ` Dmitry Vyukov
  2021-03-25  8:46     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Vyukov @ 2021-03-25  7:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Leon Romanovsky, netdev, LKML

On Wed, Mar 24, 2021 at 10:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > netdev_wait_allrefs() issues a warning if refcount does not drop to 0
> > after 10 seconds. While 10 second wait generally should not happen
> > under normal workload in normal environment, it seems to fire falsely
> > very often during fuzzing and/or in qemu emulation (~10x slower).
> > At least it's not possible to understand if it's really a false
> > positive or not. Automated testing generally bumps all timeouts
> > to very high values to avoid flake failures.
> > Add net.core.netdev_unregister_timeout_secs sysctl to make
> > the timeout configurable for automated testing systems.
> > Lowering the timeout may also be useful for e.g. manual bisection.
> > The default value matches the current behavior.
> >
> > Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> > Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
> > Cc: netdev@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> >
> > ---
> > Changes since v1:
> >  - use sysctl instead of a config
> > ---
>
> >         },
> > +       {
> > +               .procname       = "netdev_unregister_timeout_secs",
> > +               .data           = &netdev_unregister_timeout_secs,
> > +               .maxlen         = sizeof(unsigned int),
> > +               .mode           = 0644,
> > +               .proc_handler   = proc_dointvec_minmax,
> > +               .extra1         = SYSCTL_ZERO,
> > +               .extra2         = &int_3600,
> > +       },
> >         { }
> >  };
> >
>
> If we allow the sysctl to be 0, then we risk a flood of pr_emerg()
> (one per jiffy ?)

My reasoning was that it's up to the user. Some spammy output on the
console for rare events is probably not the worst way how root can
misconfigure the kernel :)
It allows one to check (more or less) if we are reaching
unregister_netdevice with non-zero refcount, which may be useful for
some debugging maybe.
But I don't mind changing it to 1 (or 5) if you prefer. On syzbot we
only want to increase it.

> If you really want the zero value, you need to change pr_emerg() to
> pr_emerg_ratelimited()
>
> Also, please base your patch on net-next, to avoid future merge conflicts
> with my prior patch add2d7363107 "net: set initial device refcount to 1".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] net: make unregister netdev warning timeout configurable
  2021-03-25  7:39   ` Dmitry Vyukov
@ 2021-03-25  8:46     ` Eric Dumazet
  2021-03-25 10:32       ` Dmitry Vyukov
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2021-03-25  8:46 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: David Miller, Leon Romanovsky, netdev, LKML

On Thu, Mar 25, 2021 at 8:39 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, Mar 24, 2021 at 10:40 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > netdev_wait_allrefs() issues a warning if refcount does not drop to 0
> > > after 10 seconds. While 10 second wait generally should not happen
> > > under normal workload in normal environment, it seems to fire falsely
> > > very often during fuzzing and/or in qemu emulation (~10x slower).
> > > At least it's not possible to understand if it's really a false
> > > positive or not. Automated testing generally bumps all timeouts
> > > to very high values to avoid flake failures.
> > > Add net.core.netdev_unregister_timeout_secs sysctl to make
> > > the timeout configurable for automated testing systems.
> > > Lowering the timeout may also be useful for e.g. manual bisection.
> > > The default value matches the current behavior.
> > >
> > > Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> > > Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
> > > Cc: netdev@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > >
> > > ---
> > > Changes since v1:
> > >  - use sysctl instead of a config
> > > ---
> >
> > >         },
> > > +       {
> > > +               .procname       = "netdev_unregister_timeout_secs",
> > > +               .data           = &netdev_unregister_timeout_secs,
> > > +               .maxlen         = sizeof(unsigned int),
> > > +               .mode           = 0644,
> > > +               .proc_handler   = proc_dointvec_minmax,
> > > +               .extra1         = SYSCTL_ZERO,
> > > +               .extra2         = &int_3600,
> > > +       },
> > >         { }
> > >  };
> > >
> >
> > If we allow the sysctl to be 0, then we risk a flood of pr_emerg()
> > (one per jiffy ?)
>
> My reasoning was that it's up to the user. Some spammy output on the
> console for rare events is probably not the worst way how root can
> misconfigure the kernel :)
> It allows one to check (more or less) if we are reaching
> unregister_netdevice with non-zero refcount, which may be useful for
> some debugging maybe.
> But I don't mind changing it to 1 (or 5) if you prefer. On syzbot we
> only want to increase it.
>

Yes, please use a lower limit of one to avoid spurious reports.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] net: make unregister netdev warning timeout configurable
  2021-03-25  8:46     ` Eric Dumazet
@ 2021-03-25 10:32       ` Dmitry Vyukov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Vyukov @ 2021-03-25 10:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Leon Romanovsky, netdev, LKML

On Thu, Mar 25, 2021 at 9:46 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Mar 25, 2021 at 8:39 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Wed, Mar 24, 2021 at 10:40 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > >
> > > > netdev_wait_allrefs() issues a warning if refcount does not drop to 0
> > > > after 10 seconds. While 10 second wait generally should not happen
> > > > under normal workload in normal environment, it seems to fire falsely
> > > > very often during fuzzing and/or in qemu emulation (~10x slower).
> > > > At least it's not possible to understand if it's really a false
> > > > positive or not. Automated testing generally bumps all timeouts
> > > > to very high values to avoid flake failures.
> > > > Add net.core.netdev_unregister_timeout_secs sysctl to make
> > > > the timeout configurable for automated testing systems.
> > > > Lowering the timeout may also be useful for e.g. manual bisection.
> > > > The default value matches the current behavior.
> > > >
> > > > Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> > > > Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
> > > > Cc: netdev@vger.kernel.org
> > > > Cc: linux-kernel@vger.kernel.org
> > > >
> > > > ---
> > > > Changes since v1:
> > > >  - use sysctl instead of a config
> > > > ---
> > >
> > > >         },
> > > > +       {
> > > > +               .procname       = "netdev_unregister_timeout_secs",
> > > > +               .data           = &netdev_unregister_timeout_secs,
> > > > +               .maxlen         = sizeof(unsigned int),
> > > > +               .mode           = 0644,
> > > > +               .proc_handler   = proc_dointvec_minmax,
> > > > +               .extra1         = SYSCTL_ZERO,
> > > > +               .extra2         = &int_3600,
> > > > +       },
> > > >         { }
> > > >  };
> > > >
> > >
> > > If we allow the sysctl to be 0, then we risk a flood of pr_emerg()
> > > (one per jiffy ?)
> >
> > My reasoning was that it's up to the user. Some spammy output on the
> > console for rare events is probably not the worst way how root can
> > misconfigure the kernel :)
> > It allows one to check (more or less) if we are reaching
> > unregister_netdevice with non-zero refcount, which may be useful for
> > some debugging maybe.
> > But I don't mind changing it to 1 (or 5) if you prefer. On syzbot we
> > only want to increase it.
> >
>
> Yes, please use a lower limit of one to avoid spurious reports.

This commit is already in net-next and net-next is not rebased, right?
I sent a separate fix as "net: change netdev_unregister_timeout_secs
min value to 1".

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-25 10:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-23  6:49 [PATCH v2] net: make unregister netdev warning timeout configurable Dmitry Vyukov
2021-03-24  9:32 ` Leon Romanovsky
2021-03-24  9:40 ` Eric Dumazet
2021-03-25  7:39   ` Dmitry Vyukov
2021-03-25  8:46     ` Eric Dumazet
2021-03-25 10:32       ` Dmitry Vyukov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.