All of lore.kernel.org
 help / color / mirror / Atom feed
From: Antoine Tenart <atenart@kernel.org>
To: davem@davemloft.net, kuba@kernel.org
Cc: Antoine Tenart <atenart@kernel.org>,
	pabeni@redhat.com, gregkh@linuxfoundation.org,
	ebiederm@xmission.com, stephen@networkplumber.org,
	herbert@gondor.apana.org.au, juri.lelli@redhat.com,
	netdev@vger.kernel.org
Subject: [RFC PATCH net-next 1/9] net-sysfs: try not to restart the syscall if it will fail eventually
Date: Tue, 28 Sep 2021 14:54:52 +0200	[thread overview]
Message-ID: <20210928125500.167943-2-atenart@kernel.org> (raw)
In-Reply-To: <20210928125500.167943-1-atenart@kernel.org>

Due to deadlocks in the networking subsystem spotted 12 years ago[1],
a workaround was put in place[2] to avoid taking the rtnl lock when it
was not available and restarting the syscall (back to VFS, letting
userspace spin). The following construction is found a lot in the net
sysfs and sysctl code:

  if (!rtnl_trylock())
          return restart_syscall();

This can be problematic when multiple userspace threads use such
interfaces in a short period, making them to spin a lot. This happens
for example when adding and moving virtual interfaces: userspace
programs listening on events, such as systemd-udevd and NetworkManager,
do trigger actions reading files in sysfs. It gets worse when a lot of
virtual interfaces are created concurrently, say when creating
containers at boot time.

Returning early without hitting the above pattern when the syscall will
fail eventually does make things better. While it is not a fix for the
issue, it does ease things.

[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/
    https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
    and https://lore.kernel.org/netdev/20090226084924.16cb3e08@nehalam/
[2] Rightfully, those deadlocks are *hard* to solve.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/core/net-sysfs.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index f6197774048b..21c3fdeccf20 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -196,6 +196,12 @@ static ssize_t speed_show(struct device *dev,
 	struct net_device *netdev = to_net_dev(dev);
 	int ret = -EINVAL;
 
+	/* The check is also done in __ethtool_get_link_ksettings; this helps
+	 * returning early without hitting the trylock/restart below.
+	 */
+	if (!netdev->ethtool_ops->get_link_ksettings)
+		return -EOPNOTSUPP;
+
 	if (!rtnl_trylock())
 		return restart_syscall();
 
@@ -216,6 +222,12 @@ static ssize_t duplex_show(struct device *dev,
 	struct net_device *netdev = to_net_dev(dev);
 	int ret = -EINVAL;
 
+	/* The check is also done in __ethtool_get_link_ksettings; this helps
+	 * returning early without hitting the trylock/restart below.
+	 */
+	if (!netdev->ethtool_ops->get_link_ksettings)
+		return -EOPNOTSUPP;
+
 	if (!rtnl_trylock())
 		return restart_syscall();
 
@@ -478,6 +490,12 @@ static ssize_t phys_port_id_show(struct device *dev,
 	struct net_device *netdev = to_net_dev(dev);
 	ssize_t ret = -EINVAL;
 
+	/* The check is also done in dev_get_phys_port_id; this helps returning
+	 * early without hitting the trylock/restart below.
+	 */
+	if (!netdev->netdev_ops->ndo_get_phys_port_id)
+		return -EOPNOTSUPP;
+
 	if (!rtnl_trylock())
 		return restart_syscall();
 
@@ -500,6 +518,13 @@ static ssize_t phys_port_name_show(struct device *dev,
 	struct net_device *netdev = to_net_dev(dev);
 	ssize_t ret = -EINVAL;
 
+	/* The checks are also done in dev_get_phys_port_name; this helps
+	 * returning early without hitting the trylock/restart below.
+	 */
+	if (!netdev->netdev_ops->ndo_get_phys_port_name &&
+	    !netdev->netdev_ops->ndo_get_devlink_port)
+		return -EOPNOTSUPP;
+
 	if (!rtnl_trylock())
 		return restart_syscall();
 
@@ -522,6 +547,13 @@ static ssize_t phys_switch_id_show(struct device *dev,
 	struct net_device *netdev = to_net_dev(dev);
 	ssize_t ret = -EINVAL;
 
+	/* The checks are also done in dev_get_phys_port_name; this helps
+	 * returning early without hitting the trylock/restart below.
+	 */
+	if (!netdev->netdev_ops->ndo_get_port_parent_id &&
+	    !netdev->netdev_ops->ndo_get_devlink_port)
+		return -EOPNOTSUPP;
+
 	if (!rtnl_trylock())
 		return restart_syscall();
 
@@ -1226,6 +1258,12 @@ static ssize_t tx_maxrate_store(struct netdev_queue *queue,
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 
+	/* The check is also done later; this helps returning early without
+	 * hitting the trylock/restart below.
+	 */
+	if (!dev->netdev_ops->ndo_set_tx_maxrate)
+		return -EOPNOTSUPP;
+
 	err = kstrtou32(buf, 10, &rate);
 	if (err < 0)
 		return err;
-- 
2.31.1


  reply	other threads:[~2021-09-28 12:55 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-28 12:54 [RFC PATCH net-next 0/9] Userspace spinning on net-sysfs access Antoine Tenart
2021-09-28 12:54 ` Antoine Tenart [this message]
2021-09-28 12:54 ` [RFC PATCH net-next 2/9] net: split unlisting the net device from unlisting its node name Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 3/9] net: export netdev_name_node_lookup Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 4/9] bonding: use the correct function to check for netdev name collision Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 5/9] ppp: " Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 6/9] net: " Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 7/9] net: delay the removal of the name nodes until run_todo Antoine Tenart
2021-09-28 12:54 ` [RFC PATCH net-next 8/9] net: delay device_del " Antoine Tenart
2021-09-29  0:02   ` Jakub Kicinski
2021-09-29  8:26     ` Antoine Tenart
2021-09-29 13:31       ` Jakub Kicinski
2021-09-29 17:31         ` Antoine Tenart
2021-10-29  9:04           ` Antoine Tenart
2021-10-05 15:21         ` Antoine Tenart
2021-10-05 18:34           ` Jakub Kicinski
2021-09-28 12:55 ` [RFC PATCH net-next 9/9] net-sysfs: remove the use of rtnl_trylock/restart_syscall Antoine Tenart
2021-10-06  6:45   ` Michal Hocko
2021-10-06  8:03     ` Antoine Tenart
2021-10-06  8:55       ` Michal Hocko
2021-10-06  6:37 ` [RFC PATCH net-next 0/9] Userspace spinning on net-sysfs access Michal Hocko
2021-10-06  7:59   ` Antoine Tenart
2021-10-06  8:35     ` Michal Hocko
2021-10-29 14:33 ` Antoine Tenart
2021-10-29 15:45   ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210928125500.167943-2-atenart@kernel.org \
    --to=atenart@kernel.org \
    --cc=davem@davemloft.net \
    --cc=ebiederm@xmission.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=juri.lelli@redhat.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.