netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
@ 2021-08-01 10:35 Heiner Kallweit
  2021-08-01 10:36 ` [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops Heiner Kallweit
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 10:35 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if
  not accessing the device) will fail because netif_device_present()
  returns false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing
ethtool ops in kernel.

This patch series covers the typical case that the netdev parent is power-
managed, e.g. a PCI device. Not sure whether cases exist where the netdev
itself is power-managed. If yes then we may need an extension for this.
But the series as-is at least shouldn't cause problems in that case.

Heiner Kallweit (4):
  ethtool: runtime-resume netdev parent before ethtool ioctl ops
  ethtool: move implementation of ethnl_ops_begin/complete to netlink.c
  ethtool: move netif_device_present check from
    ethnl_parse_header_dev_get to ethnl_ops_begin
  ethtool: runtime-resume netdev parent in ethnl_ops_begin

 net/ethtool/ioctl.c   | 18 ++++++++++++++---
 net/ethtool/netlink.c | 45 +++++++++++++++++++++++++++++++++++++------
 net/ethtool/netlink.h | 15 ++-------------
 3 files changed, 56 insertions(+), 22 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
@ 2021-08-01 10:36 ` Heiner Kallweit
  2021-08-03 20:41   ` Grygorii Strashko
  2021-08-01 10:37 ` [PATCH net-next 2/4] ethtool: move implementation of ethnl_ops_begin/complete to netlink.c Heiner Kallweit
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 10:36 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
  accessing the device) will fail because netif_device_present() returns
  false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ethtool/ioctl.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index baa5d1004..b7ff9abe7 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -23,6 +23,7 @@
 #include <linux/rtnetlink.h>
 #include <linux/sched/signal.h>
 #include <linux/net.h>
+#include <linux/pm_runtime.h>
 #include <net/devlink.h>
 #include <net/xdp_sock_drv.h>
 #include <net/flow_offload.h>
@@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	int rc;
 	netdev_features_t old_features;
 
-	if (!dev || !netif_device_present(dev))
+	if (!dev)
 		return -ENODEV;
 
 	if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
@@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 			return -EPERM;
 	}
 
+	if (dev->dev.parent)
+		pm_runtime_get_sync(dev->dev.parent);
+
+	if (!netif_device_present(dev)) {
+		rc = -ENODEV;
+		goto out;
+	}
+
 	if (dev->ethtool_ops->begin) {
 		rc = dev->ethtool_ops->begin(dev);
-		if (rc  < 0)
-			return rc;
+		if (rc < 0)
+			goto out;
 	}
 	old_features = dev->features;
 
@@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 
 	if (old_features != dev->features)
 		netdev_features_change(dev);
+out:
+	if (dev->dev.parent)
+		pm_runtime_put(dev->dev.parent);
 
 	return rc;
 }
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH net-next 2/4] ethtool: move implementation of ethnl_ops_begin/complete to netlink.c
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
  2021-08-01 10:36 ` [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops Heiner Kallweit
@ 2021-08-01 10:37 ` Heiner Kallweit
  2021-08-01 10:40 ` [PATCH net-next 3/4] ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin Heiner Kallweit
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 10:37 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

In preparation of subsequent extensions to both functions move the
implementations from netlink.h to netlink.c.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ethtool/netlink.c | 14 ++++++++++++++
 net/ethtool/netlink.h | 15 ++-------------
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 73e0f5b62..ac720d684 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -29,6 +29,20 @@ const struct nla_policy ethnl_header_policy_stats[] = {
 							  ETHTOOL_FLAGS_STATS),
 };
 
+int ethnl_ops_begin(struct net_device *dev)
+{
+	if (dev && dev->ethtool_ops->begin)
+		return dev->ethtool_ops->begin(dev);
+	else
+		return 0;
+}
+
+void ethnl_ops_complete(struct net_device *dev)
+{
+	if (dev && dev->ethtool_ops->complete)
+		dev->ethtool_ops->complete(dev);
+}
+
 /**
  * ethnl_parse_header_dev_get() - parse request header
  * @req_info:    structure to put results into
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 3fc395c86..077aac392 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -247,19 +247,8 @@ struct ethnl_reply_data {
 	struct net_device		*dev;
 };
 
-static inline int ethnl_ops_begin(struct net_device *dev)
-{
-	if (dev && dev->ethtool_ops->begin)
-		return dev->ethtool_ops->begin(dev);
-	else
-		return 0;
-}
-
-static inline void ethnl_ops_complete(struct net_device *dev)
-{
-	if (dev && dev->ethtool_ops->complete)
-		dev->ethtool_ops->complete(dev);
-}
+int ethnl_ops_begin(struct net_device *dev);
+void ethnl_ops_complete(struct net_device *dev);
 
 /**
  * struct ethnl_request_ops - unified handling of GET requests
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH net-next 3/4] ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
  2021-08-01 10:36 ` [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops Heiner Kallweit
  2021-08-01 10:37 ` [PATCH net-next 2/4] ethtool: move implementation of ethnl_ops_begin/complete to netlink.c Heiner Kallweit
@ 2021-08-01 10:40 ` Heiner Kallweit
  2021-08-01 10:41 ` [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin Heiner Kallweit
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 10:40 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

If device is runtime-suspended and not accessible then it may be
flagged as not present. If checking whether device is present is
done too early then we may bail out before we have the chance to
runtime-resume the device. Therefore move this check to
ethnl_ops_begin(). This is in preparation of a follow-up patch
that tries to runtime-resume the device before executing ethtool
ops.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ethtool/netlink.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index ac720d684..e628d17f5 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -31,7 +31,13 @@ const struct nla_policy ethnl_header_policy_stats[] = {
 
 int ethnl_ops_begin(struct net_device *dev)
 {
-	if (dev && dev->ethtool_ops->begin)
+	if (!dev)
+		return 0;
+
+	if (!netif_device_present(dev))
+		return -ENODEV;
+
+	if (dev->ethtool_ops->begin)
 		return dev->ethtool_ops->begin(dev);
 	else
 		return 0;
@@ -115,12 +121,6 @@ int ethnl_parse_header_dev_get(struct ethnl_req_info *req_info,
 		return -EINVAL;
 	}
 
-	if (dev && !netif_device_present(dev)) {
-		dev_put(dev);
-		NL_SET_ERR_MSG(extack, "device not present");
-		return -ENODEV;
-	}
-
 	req_info->dev = dev;
 	req_info->flags = flags;
 	return 0;
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
                   ` (2 preceding siblings ...)
  2021-08-01 10:40 ` [PATCH net-next 3/4] ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin Heiner Kallweit
@ 2021-08-01 10:41 ` Heiner Kallweit
  2021-08-05 11:51   ` Julian Wiedmann
  2021-08-01 16:25 ` [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
  2021-08-03 12:00 ` patchwork-bot+netdevbpf
  5 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 10:41 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
  accessing the device) will fail because netif_device_present() returns
  false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ethtool/netlink.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index e628d17f5..417aaf9ca 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -2,6 +2,7 @@
 
 #include <net/sock.h>
 #include <linux/ethtool_netlink.h>
+#include <linux/pm_runtime.h>
 #include "netlink.h"
 
 static struct genl_family ethtool_genl_family;
@@ -31,22 +32,40 @@ const struct nla_policy ethnl_header_policy_stats[] = {
 
 int ethnl_ops_begin(struct net_device *dev)
 {
+	int ret;
+
 	if (!dev)
 		return 0;
 
-	if (!netif_device_present(dev))
-		return -ENODEV;
+	if (dev->dev.parent)
+		pm_runtime_get_sync(dev->dev.parent);
 
-	if (dev->ethtool_ops->begin)
-		return dev->ethtool_ops->begin(dev);
-	else
-		return 0;
+	if (!netif_device_present(dev)) {
+		ret = -ENODEV;
+		goto err;
+	}
+
+	if (dev->ethtool_ops->begin) {
+		ret = dev->ethtool_ops->begin(dev);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+err:
+	if (dev->dev.parent)
+		pm_runtime_put(dev->dev.parent);
+
+	return ret;
 }
 
 void ethnl_ops_complete(struct net_device *dev)
 {
 	if (dev && dev->ethtool_ops->complete)
 		dev->ethtool_ops->complete(dev);
+
+	if (dev->dev.parent)
+		pm_runtime_put(dev->dev.parent);
 }
 
 /**
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
                   ` (3 preceding siblings ...)
  2021-08-01 10:41 ` [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin Heiner Kallweit
@ 2021-08-01 16:25 ` Heiner Kallweit
  2021-08-02 14:15   ` Jakub Kicinski
  2021-08-03 12:00 ` patchwork-bot+netdevbpf
  5 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-01 16:25 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller; +Cc: netdev

On 01.08.2021 12:35, Heiner Kallweit wrote:
> If a network device is runtime-suspended then:
> - network device may be flagged as detached and all ethtool ops (even if
>   not accessing the device) will fail because netif_device_present()
>   returns false
> - ethtool ops may fail because device is not accessible (e.g. because being
>   in D3 in case of a PCI device)
> 
> It may not be desirable that userspace can't use even simple ethtool ops
> that not access the device if interface or link is down. To be more friendly
> to userspace let's ensure that device is runtime-resumed when executing
> ethtool ops in kernel.
> 
> This patch series covers the typical case that the netdev parent is power-
> managed, e.g. a PCI device. Not sure whether cases exist where the netdev
> itself is power-managed. If yes then we may need an extension for this.
> But the series as-is at least shouldn't cause problems in that case.
> 
> Heiner Kallweit (4):
>   ethtool: runtime-resume netdev parent before ethtool ioctl ops
>   ethtool: move implementation of ethnl_ops_begin/complete to netlink.c
>   ethtool: move netif_device_present check from
>     ethnl_parse_header_dev_get to ethnl_ops_begin
>   ethtool: runtime-resume netdev parent in ethnl_ops_begin
> 
>  net/ethtool/ioctl.c   | 18 ++++++++++++++---
>  net/ethtool/netlink.c | 45 +++++++++++++++++++++++++++++++++++++------
>  net/ethtool/netlink.h | 15 ++-------------
>  3 files changed, 56 insertions(+), 22 deletions(-)
> 

Patchwork is showing the following warning for all patches in the series.

netdev/cc_maintainers	warning	7 maintainers not CCed: ecree@solarflare.com andrew@lunn.ch magnus.karlsson@intel.com danieller@nvidia.com arnd@arndb.de irusskikh@marvell.com alexanderduyck@fb.com

This seems to be a false positive, e.g. address ecree@solarflare.com
doesn't exist at all in MAINTAINERS file.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-01 16:25 ` [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
@ 2021-08-02 14:15   ` Jakub Kicinski
  2021-08-02 16:42     ` Heiner Kallweit
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2021-08-02 14:15 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: David Miller, netdev

On Sun, 1 Aug 2021 18:25:52 +0200 Heiner Kallweit wrote:
> Patchwork is showing the following warning for all patches in the series.
> 
> netdev/cc_maintainers	warning	7 maintainers not CCed: ecree@solarflare.com andrew@lunn.ch magnus.karlsson@intel.com danieller@nvidia.com arnd@arndb.de irusskikh@marvell.com alexanderduyck@fb.com
> 
> This seems to be a false positive, e.g. address ecree@solarflare.com
> doesn't exist at all in MAINTAINERS file.

It gets the list from the get_maintainers script. It's one of the less
reliable tests, but I feel like efforts should be made primarily
towards improving get_maintainers rather than improving the test itself.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-02 14:15   ` Jakub Kicinski
@ 2021-08-02 16:42     ` Heiner Kallweit
  2021-08-02 16:54       ` Jakub Kicinski
  0 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-02 16:42 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Miller, netdev

On 02.08.2021 16:15, Jakub Kicinski wrote:
> On Sun, 1 Aug 2021 18:25:52 +0200 Heiner Kallweit wrote:
>> Patchwork is showing the following warning for all patches in the series.
>>
>> netdev/cc_maintainers	warning	7 maintainers not CCed: ecree@solarflare.com andrew@lunn.ch magnus.karlsson@intel.com danieller@nvidia.com arnd@arndb.de irusskikh@marvell.com alexanderduyck@fb.com
>>
>> This seems to be a false positive, e.g. address ecree@solarflare.com
>> doesn't exist at all in MAINTAINERS file.
> 
> It gets the list from the get_maintainers script. It's one of the less
> reliable tests, but I feel like efforts should be made primarily
> towards improving get_maintainers rather than improving the test itself.
> 
When running get_maintainers.pl for any of the patches in the series
I don't get these addresses. I run get_maintainers w/o options, maybe
you set some special option? That's what I get when running get_maintainers:

"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL])
Stephen Rothwell <sfr@canb.auug.org.au> (commit_signer:1/2=50%,authored:1/2=50%,added_lines:3144/3159=100%)
Heiner Kallweit <hkallweit1@gmail.com> (commit_signer:1/2=50%,authored:1/2=50%,removed_lines:3/3=100%)
netdev@vger.kernel.org (open list:NETWORKING [GENERAL])
linux-kernel@vger.kernel.org (open list)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-02 16:42     ` Heiner Kallweit
@ 2021-08-02 16:54       ` Jakub Kicinski
  2021-08-02 19:00         ` Heiner Kallweit
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2021-08-02 16:54 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: David Miller, netdev

On Mon, 2 Aug 2021 18:42:28 +0200 Heiner Kallweit wrote:
> On 02.08.2021 16:15, Jakub Kicinski wrote:
> > On Sun, 1 Aug 2021 18:25:52 +0200 Heiner Kallweit wrote:  
> >> Patchwork is showing the following warning for all patches in the series.
> >>
> >> netdev/cc_maintainers	warning	7 maintainers not CCed: ecree@solarflare.com andrew@lunn.ch magnus.karlsson@intel.com danieller@nvidia.com arnd@arndb.de irusskikh@marvell.com alexanderduyck@fb.com
> >>
> >> This seems to be a false positive, e.g. address ecree@solarflare.com
> >> doesn't exist at all in MAINTAINERS file.  
> > 
> > It gets the list from the get_maintainers script. It's one of the less
> > reliable tests, but I feel like efforts should be made primarily
> > towards improving get_maintainers rather than improving the test itself.
> >   
> When running get_maintainers.pl for any of the patches in the series
> I don't get these addresses. I run get_maintainers w/o options, maybe
> you set some special option? That's what I get when running get_maintainers:
> 
> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL])
> Stephen Rothwell <sfr@canb.auug.org.au> (commit_signer:1/2=50%,authored:1/2=50%,added_lines:3144/3159=100%)
> Heiner Kallweit <hkallweit1@gmail.com> (commit_signer:1/2=50%,authored:1/2=50%,removed_lines:3/3=100%)
> netdev@vger.kernel.org (open list:NETWORKING [GENERAL])
> linux-kernel@vger.kernel.org (open list)

Mm. Maybe your system doesn't have some perl module? Not sure what it
may be. With tip of net-next/master:

$ ./scripts/get_maintainer.pl /tmp/te/0002-ethtool-move-implementation-of-ethnl_ops_begin-compl.patch
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL],commit_signer:12/16=75%,commit_signer:15/18=83%)
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL],commit_signer:11/16=69%,authored:9/16=56%,added_lines:127/198=64%,removed_lines:41/57=72%,commit_signer:14/18=78%,authored:11/18=61%,added_lines:74/84=88%,removed_lines:35/52=67%)
Heiner Kallweit <hkallweit1@gmail.com> (commit_signer:3/16=19%,authored:3/16=19%,added_lines:46/198=23%,removed_lines:13/57=23%,authored:1/18=6%,removed_lines:13/52=25%)
Fernando Fernandez Mancera <ffmancera@riseup.net> (commit_signer:1/16=6%,authored:1/16=6%)
Vladyslav Tarasiuk <vladyslavt@nvidia.com> (commit_signer:1/16=6%,added_lines:11/198=6%,commit_signer:1/18=6%)
Yangbo Lu <yangbo.lu@nxp.com> (authored:1/16=6%,added_lines:10/198=5%,authored:1/18=6%)
Johannes Berg <johannes.berg@intel.com> (authored:1/16=6%)
Zheng Yongjun <zhengyongjun3@huawei.com> (commit_signer:1/18=6%)
Andrew Lunn <andrew@lunn.ch> (commit_signer:1/18=6%)
Danielle Ratson <danieller@nvidia.com> (authored:1/18=6%)
Ido Schimmel <idosch@nvidia.com> (authored:1/18=6%)
netdev@vger.kernel.org (open list:NETWORKING [GENERAL])
linux-kernel@vger.kernel.org (open list)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-02 16:54       ` Jakub Kicinski
@ 2021-08-02 19:00         ` Heiner Kallweit
  0 siblings, 0 replies; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-02 19:00 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Miller, netdev

On 02.08.2021 18:54, Jakub Kicinski wrote:
> On Mon, 2 Aug 2021 18:42:28 +0200 Heiner Kallweit wrote:
>> On 02.08.2021 16:15, Jakub Kicinski wrote:
>>> On Sun, 1 Aug 2021 18:25:52 +0200 Heiner Kallweit wrote:  
>>>> Patchwork is showing the following warning for all patches in the series.
>>>>
>>>> netdev/cc_maintainers	warning	7 maintainers not CCed: ecree@solarflare.com andrew@lunn.ch magnus.karlsson@intel.com danieller@nvidia.com arnd@arndb.de irusskikh@marvell.com alexanderduyck@fb.com
>>>>
>>>> This seems to be a false positive, e.g. address ecree@solarflare.com
>>>> doesn't exist at all in MAINTAINERS file.  
>>>
>>> It gets the list from the get_maintainers script. It's one of the less
>>> reliable tests, but I feel like efforts should be made primarily
>>> towards improving get_maintainers rather than improving the test itself.
>>>   
>> When running get_maintainers.pl for any of the patches in the series
>> I don't get these addresses. I run get_maintainers w/o options, maybe
>> you set some special option? That's what I get when running get_maintainers:
>>
>> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
>> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL])
>> Stephen Rothwell <sfr@canb.auug.org.au> (commit_signer:1/2=50%,authored:1/2=50%,added_lines:3144/3159=100%)
>> Heiner Kallweit <hkallweit1@gmail.com> (commit_signer:1/2=50%,authored:1/2=50%,removed_lines:3/3=100%)
>> netdev@vger.kernel.org (open list:NETWORKING [GENERAL])
>> linux-kernel@vger.kernel.org (open list)
> 
> Mm. Maybe your system doesn't have some perl module? Not sure what it
> may be. With tip of net-next/master:
> 
> $ ./scripts/get_maintainer.pl /tmp/te/0002-ethtool-move-implementation-of-ethnl_ops_begin-compl.patch
> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL],commit_signer:12/16=75%,commit_signer:15/18=83%)
> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL],commit_signer:11/16=69%,authored:9/16=56%,added_lines:127/198=64%,removed_lines:41/57=72%,commit_signer:14/18=78%,authored:11/18=61%,added_lines:74/84=88%,removed_lines:35/52=67%)
> Heiner Kallweit <hkallweit1@gmail.com> (commit_signer:3/16=19%,authored:3/16=19%,added_lines:46/198=23%,removed_lines:13/57=23%,authored:1/18=6%,removed_lines:13/52=25%)
> Fernando Fernandez Mancera <ffmancera@riseup.net> (commit_signer:1/16=6%,authored:1/16=6%)
> Vladyslav Tarasiuk <vladyslavt@nvidia.com> (commit_signer:1/16=6%,added_lines:11/198=6%,commit_signer:1/18=6%)
> Yangbo Lu <yangbo.lu@nxp.com> (authored:1/16=6%,added_lines:10/198=5%,authored:1/18=6%)
> Johannes Berg <johannes.berg@intel.com> (authored:1/16=6%)
> Zheng Yongjun <zhengyongjun3@huawei.com> (commit_signer:1/18=6%)
> Andrew Lunn <andrew@lunn.ch> (commit_signer:1/18=6%)
> Danielle Ratson <danieller@nvidia.com> (authored:1/18=6%)
> Ido Schimmel <idosch@nvidia.com> (authored:1/18=6%)
> netdev@vger.kernel.org (open list:NETWORKING [GENERAL])
> linux-kernel@vger.kernel.org (open list)
> 

Ah, maybe it's because I typically don't work with the full git repo
but just do a "git clone --depth 1". 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops
  2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
                   ` (4 preceding siblings ...)
  2021-08-01 16:25 ` [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
@ 2021-08-03 12:00 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 22+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-08-03 12:00 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: kuba, davem, netdev

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Sun, 1 Aug 2021 12:35:18 +0200 you wrote:
> If a network device is runtime-suspended then:
> - network device may be flagged as detached and all ethtool ops (even if
>   not accessing the device) will fail because netif_device_present()
>   returns false
> - ethtool ops may fail because device is not accessible (e.g. because being
>   in D3 in case of a PCI device)
> 
> [...]

Here is the summary with links:
  - [net-next,1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
    https://git.kernel.org/netdev/net-next/c/f32a21376573
  - [net-next,2/4] ethtool: move implementation of ethnl_ops_begin/complete to netlink.c
    https://git.kernel.org/netdev/net-next/c/c5ab51df03e2
  - [net-next,3/4] ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin
    https://git.kernel.org/netdev/net-next/c/41107ac22fcf
  - [net-next,4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin
    https://git.kernel.org/netdev/net-next/c/d43c65b05b84

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-01 10:36 ` [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops Heiner Kallweit
@ 2021-08-03 20:41   ` Grygorii Strashko
  2021-08-03 21:32     ` Heiner Kallweit
  0 siblings, 1 reply; 22+ messages in thread
From: Grygorii Strashko @ 2021-08-03 20:41 UTC (permalink / raw)
  To: Heiner Kallweit, Jakub Kicinski, David Miller; +Cc: netdev, Linux PM list



On 01/08/2021 13:36, Heiner Kallweit wrote:
> If a network device is runtime-suspended then:
> - network device may be flagged as detached and all ethtool ops (even if not
>    accessing the device) will fail because netif_device_present() returns
>    false
> - ethtool ops may fail because device is not accessible (e.g. because being
>    in D3 in case of a PCI device)
> 
> It may not be desirable that userspace can't use even simple ethtool ops
> that not access the device if interface or link is down. To be more friendly
> to userspace let's ensure that device is runtime-resumed when executing the
> respective ethtool op in kernel.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>   net/ethtool/ioctl.c | 18 +++++++++++++++---
>   1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> index baa5d1004..b7ff9abe7 100644
> --- a/net/ethtool/ioctl.c
> +++ b/net/ethtool/ioctl.c
> @@ -23,6 +23,7 @@
>   #include <linux/rtnetlink.h>
>   #include <linux/sched/signal.h>
>   #include <linux/net.h>
> +#include <linux/pm_runtime.h>
>   #include <net/devlink.h>
>   #include <net/xdp_sock_drv.h>
>   #include <net/flow_offload.h>
> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>   	int rc;
>   	netdev_features_t old_features;
>   
> -	if (!dev || !netif_device_present(dev))
> +	if (!dev)
>   		return -ENODEV;
>   
>   	if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>   			return -EPERM;
>   	}
>   
> +	if (dev->dev.parent)
> +		pm_runtime_get_sync(dev->dev.parent);

the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.

rpm_resume()
...
     if (!parent && dev->parent) {
  --> here

So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
enable PM runtime for netdev->dev and lets PM runtime do the job?

But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
at minimum it might be tricky and required very careful approach (especially in err path).
For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.


The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
but instead wake up device by runtime PM for allowed operations or just save requested configuration which
is applied at netdev->open() time then.
I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)

huh, see it's merged already, so...

> +
> +	if (!netif_device_present(dev)) {
> +		rc = -ENODEV;
> +		goto out;
> +	}
> +
>   	if (dev->ethtool_ops->begin) {
>   		rc = dev->ethtool_ops->begin(dev);
> -		if (rc  < 0)
> -			return rc;
> +		if (rc < 0)
> +			goto out;
>   	}
>   	old_features = dev->features;
>   
> @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>   
>   	if (old_features != dev->features)
>   		netdev_features_change(dev);
> +out:
> +	if (dev->dev.parent)
> +		pm_runtime_put(dev->dev.parent);
>   
>   	return rc;
>   }
> 

-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-03 20:41   ` Grygorii Strashko
@ 2021-08-03 21:32     ` Heiner Kallweit
  2021-08-04  8:43       ` Grygorii Strashko
  0 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-03 21:32 UTC (permalink / raw)
  To: Grygorii Strashko, Jakub Kicinski, David Miller; +Cc: netdev, Linux PM list

On 03.08.2021 22:41, Grygorii Strashko wrote:
> 
> 
> On 01/08/2021 13:36, Heiner Kallweit wrote:
>> If a network device is runtime-suspended then:
>> - network device may be flagged as detached and all ethtool ops (even if not
>>    accessing the device) will fail because netif_device_present() returns
>>    false
>> - ethtool ops may fail because device is not accessible (e.g. because being
>>    in D3 in case of a PCI device)
>>
>> It may not be desirable that userspace can't use even simple ethtool ops
>> that not access the device if interface or link is down. To be more friendly
>> to userspace let's ensure that device is runtime-resumed when executing the
>> respective ethtool op in kernel.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>   net/ethtool/ioctl.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>> index baa5d1004..b7ff9abe7 100644
>> --- a/net/ethtool/ioctl.c
>> +++ b/net/ethtool/ioctl.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/rtnetlink.h>
>>   #include <linux/sched/signal.h>
>>   #include <linux/net.h>
>> +#include <linux/pm_runtime.h>
>>   #include <net/devlink.h>
>>   #include <net/xdp_sock_drv.h>
>>   #include <net/flow_offload.h>
>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>       int rc;
>>       netdev_features_t old_features;
>>   -    if (!dev || !netif_device_present(dev))
>> +    if (!dev)
>>           return -ENODEV;
>>         if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>               return -EPERM;
>>       }
>>   +    if (dev->dev.parent)
>> +        pm_runtime_get_sync(dev->dev.parent);
> 
> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
> 
Not sure if there's any case yet where the netdev-embedded device is power-managed.
Typically only the parent (e.g. a PCI device) is.

> rpm_resume()
> ...
>     if (!parent && dev->parent) {
>  --> here
> 
Currently we don't get that far because we will bail out here already:

else if (dev->power.disable_depth > 0)
		retval = -EACCES;

If netdev-embedded device isn't power-managed then disable_depth is 1.

> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
> enable PM runtime for netdev->dev and lets PM runtime do the job?
> 
Where would netdev->dev be runtime-resumed so that netif_device_present() passes?
Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?
E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
have to resume netdev->dev.

> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -

The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
it in complete. This pattern is used in more than one driver and may be worth
being moved to the core.

> at minimum it might be tricky and required very careful approach (especially in err path).
> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.

The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
doesn't support RPM. I agree that checking the return code could make sense, but then we would
have to be careful which error codes we consider as failed.

> 
> 
> The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
> but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
> but instead wake up device by runtime PM for allowed operations or just save requested configuration which
> is applied at netdev->open() time then.
> I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)
> 
That's not a rare pattern when suspending or runtime-suspending to prevent different types
of access to a not accessible device. But yes, it's relatively big hammer ..

> huh, see it's merged already, so...
> 
>> +
>> +    if (!netif_device_present(dev)) {
>> +        rc = -ENODEV;
>> +        goto out;
>> +    }
>> +
>>       if (dev->ethtool_ops->begin) {
>>           rc = dev->ethtool_ops->begin(dev);
>> -        if (rc  < 0)
>> -            return rc;
>> +        if (rc < 0)
>> +            goto out;
>>       }
>>       old_features = dev->features;
>>   @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>         if (old_features != dev->features)
>>           netdev_features_change(dev);
>> +out:
>> +    if (dev->dev.parent)
>> +        pm_runtime_put(dev->dev.parent);
>>         return rc;
>>   }
>>
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-03 21:32     ` Heiner Kallweit
@ 2021-08-04  8:43       ` Grygorii Strashko
  2021-08-04 19:33         ` Heiner Kallweit
  0 siblings, 1 reply; 22+ messages in thread
From: Grygorii Strashko @ 2021-08-04  8:43 UTC (permalink / raw)
  To: Heiner Kallweit, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli



On 04/08/2021 00:32, Heiner Kallweit wrote:
> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>
>>
>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>> If a network device is runtime-suspended then:
>>> - network device may be flagged as detached and all ethtool ops (even if not
>>>     accessing the device) will fail because netif_device_present() returns
>>>     false
>>> - ethtool ops may fail because device is not accessible (e.g. because being
>>>     in D3 in case of a PCI device)
>>>
>>> It may not be desirable that userspace can't use even simple ethtool ops
>>> that not access the device if interface or link is down. To be more friendly
>>> to userspace let's ensure that device is runtime-resumed when executing the
>>> respective ethtool op in kernel.
>>>
>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>> ---
>>>    net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>    1 file changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>>> index baa5d1004..b7ff9abe7 100644
>>> --- a/net/ethtool/ioctl.c
>>> +++ b/net/ethtool/ioctl.c
>>> @@ -23,6 +23,7 @@
>>>    #include <linux/rtnetlink.h>
>>>    #include <linux/sched/signal.h>
>>>    #include <linux/net.h>
>>> +#include <linux/pm_runtime.h>
>>>    #include <net/devlink.h>
>>>    #include <net/xdp_sock_drv.h>
>>>    #include <net/flow_offload.h>
>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>        int rc;
>>>        netdev_features_t old_features;
>>>    -    if (!dev || !netif_device_present(dev))
>>> +    if (!dev)
>>>            return -ENODEV;
>>>          if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>                return -EPERM;
>>>        }
>>>    +    if (dev->dev.parent)
>>> +        pm_runtime_get_sync(dev->dev.parent);
>>
>> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
>>
> Not sure if there's any case yet where the netdev-embedded device is power-managed.
> Typically only the parent (e.g. a PCI device) is.
> 
>> rpm_resume()
>> ...
>>      if (!parent && dev->parent) {
>>   --> here
>>
> Currently we don't get that far because we will bail out here already:
> 
> else if (dev->power.disable_depth > 0)
> 		retval = -EACCES;
> 
> If netdev-embedded device isn't power-managed then disable_depth is 1.

Right. But if pm_runtime_enable() is added for ndev->dev then PM runtime will start working for it
and should handle parent properly - from my experience, every time any code need manipulate with "parent" or
smth. else to make PM runtime working it means smth. is wrong.

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index f6197774048b..33b72b788aa2 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device *ndev)
         }
  
         pm_runtime_set_memalloc_noio(dev, true);
+       pm_runtime_enable(dev);
  
         return error;
  }


> 
>> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
>> enable PM runtime for netdev->dev and lets PM runtime do the job?
>>
> Where would netdev->dev be runtime-resumed so that netif_device_present() passes?

That's the biggest issues here. Some driver uses netif_device_detach() in PM runtime and, this way, introduces custom dependency
between Core device PM (runtime) sate and Net core, other driver does not do.
Does it means every driver with PM runtime now have to be updated to indicate it PM state to Net core with netif_device_detach()?
Why? Why return value from pm_runtime_get calls is not enough?

Believe me it's terrible idea to introduce custom PM state dependency between PM runtime and Net core,
for example it took years to sync properly System wide suspend and PM runtime which are separate framworks.

By the way netif_device_detach() during System Wide suspend is looks perfectly valid, because entering
System wide Suspend should prohibit any access to netdev at some stage. And that's what 99% of network drivers are doing
(actually I can find only ./realtek/r8169_main.c which abuse netif_device_detach() function and,
I assume, it is your case)

> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?

No. as I know -  netdev->dev can be declared as pm_runtime_no_callbacks(&adap->dev);
I2C adapter might be a good example to check.

> E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
> have to resume netdev->dev.
> 
>> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
> 
> The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
> it in complete. This pattern is used in more than one driver and may be worth
> being moved to the core.

I'm not against code refactoring and optimization, but in my opinion it has to be done right from the beginning or
not done at all.

> 
>> at minimum it might be tricky and required very careful approach (especially in err path).
>> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
>> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.
> 
> The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
> doesn't support RPM.

And if 'parent' is not supporting PM runtime - it, as i see, should be handled by PM runtime core properly.

I agree that checking the return code could make sense, but then we would
> have to be careful which error codes we consider as failed.

huh. you can't 'try' pm_runtime_get_sync() and then align on netif_device_present() :(

might be, some how, it will work for r8169_main, but will not work for others.
- no checking pm_runtime_get_sync() err code will cause PM runtime 'usage_count' leak
- no checking pm_runtime_get_sync() err may cause to continue( for TI CPSW for example) with
   device in undefined PM state ("disabled" or "half-enabled") and so crash later.



> 
>>
>>
>> The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
>> but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
>> but instead wake up device by runtime PM for allowed operations or just save requested configuration which
>> is applied at netdev->open() time then.
>> I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)
>>
> That's not a rare pattern when suspending or runtime-suspending to prevent different types
> of access to a not accessible device. But yes, it's relatively big hammer ..

Again, netif_device_detach() seems correct for System wide suspend, but in my opinion - it's
not correct for PM runtime.

Sry, with all do respect, first corresponding driver has to be fixed and not Net core hacked to support it.

Further decisions is up to maintainers.


> 
>> huh, see it's merged already, so...
>>
>>> +
>>> +    if (!netif_device_present(dev)) {
>>> +        rc = -ENODEV;
>>> +        goto out;
>>> +    }
>>> +
>>>        if (dev->ethtool_ops->begin) {
>>>            rc = dev->ethtool_ops->begin(dev);
>>> -        if (rc  < 0)
>>> -            return rc;
>>> +        if (rc < 0)
>>> +            goto out;
>>>        }
>>>        old_features = dev->features;
>>>    @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>          if (old_features != dev->features)
>>>            netdev_features_change(dev);
>>> +out:
>>> +    if (dev->dev.parent)
>>> +        pm_runtime_put(dev->dev.parent);
>>>          return rc;
>>>    }
>>>
>>
> 

-- 
Best regards,
grygorii

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-04  8:43       ` Grygorii Strashko
@ 2021-08-04 19:33         ` Heiner Kallweit
  2021-08-05  8:20           ` Grygorii Strashko
  0 siblings, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-04 19:33 UTC (permalink / raw)
  To: Grygorii Strashko, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli

On 04.08.2021 10:43, Grygorii Strashko wrote:
> 
> 
> On 04/08/2021 00:32, Heiner Kallweit wrote:
>> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>>
>>>
>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>>> If a network device is runtime-suspended then:
>>>> - network device may be flagged as detached and all ethtool ops (even if not
>>>>     accessing the device) will fail because netif_device_present() returns
>>>>     false
>>>> - ethtool ops may fail because device is not accessible (e.g. because being
>>>>     in D3 in case of a PCI device)
>>>>
>>>> It may not be desirable that userspace can't use even simple ethtool ops
>>>> that not access the device if interface or link is down. To be more friendly
>>>> to userspace let's ensure that device is runtime-resumed when executing the
>>>> respective ethtool op in kernel.
>>>>
>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>> ---
>>>>    net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>>    1 file changed, 15 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>>>> index baa5d1004..b7ff9abe7 100644
>>>> --- a/net/ethtool/ioctl.c
>>>> +++ b/net/ethtool/ioctl.c
>>>> @@ -23,6 +23,7 @@
>>>>    #include <linux/rtnetlink.h>
>>>>    #include <linux/sched/signal.h>
>>>>    #include <linux/net.h>
>>>> +#include <linux/pm_runtime.h>
>>>>    #include <net/devlink.h>
>>>>    #include <net/xdp_sock_drv.h>
>>>>    #include <net/flow_offload.h>
>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>        int rc;
>>>>        netdev_features_t old_features;
>>>>    -    if (!dev || !netif_device_present(dev))
>>>> +    if (!dev)
>>>>            return -ENODEV;
>>>>          if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>                return -EPERM;
>>>>        }
>>>>    +    if (dev->dev.parent)
>>>> +        pm_runtime_get_sync(dev->dev.parent);
>>>
>>> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
>>>
>> Not sure if there's any case yet where the netdev-embedded device is power-managed.
>> Typically only the parent (e.g. a PCI device) is.
>>
>>> rpm_resume()
>>> ...
>>>      if (!parent && dev->parent) {
>>>   --> here
>>>
>> Currently we don't get that far because we will bail out here already:
>>
>> else if (dev->power.disable_depth > 0)
>>         retval = -EACCES;
>>
>> If netdev-embedded device isn't power-managed then disable_depth is 1.
> 
> Right. But if pm_runtime_enable() is added for ndev->dev then PM runtime will start working for it
> and should handle parent properly - from my experience, every time any code need manipulate with "parent" or
> smth. else to make PM runtime working it means smth. is wrong.
> 
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index f6197774048b..33b72b788aa2 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device *ndev)
>         }
>  
>         pm_runtime_set_memalloc_noio(dev, true);
> +       pm_runtime_enable(dev);
>  
>         return error;
>  }
> 
> 
>>
>>> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
>>> enable PM runtime for netdev->dev and lets PM runtime do the job?
>>>
>> Where would netdev->dev be runtime-resumed so that netif_device_present() passes?
> 
> That's the biggest issues here. Some driver uses netif_device_detach() in PM runtime and, this way, introduces custom dependency
> between Core device PM (runtime) sate and Net core, other driver does not do.
> Does it means every driver with PM runtime now have to be updated to indicate it PM state to Net core with netif_device_detach()?

No, that's not needed.

> Why? Why return value from pm_runtime_get calls is not enough?
> 
> Believe me it's terrible idea to introduce custom PM state dependency between PM runtime and Net core,
> for example it took years to sync properly System wide suspend and PM runtime which are separate framworks.
> 
> By the way netif_device_detach() during System Wide suspend is looks perfectly valid, because entering
> System wide Suspend should prohibit any access to netdev at some stage. And that's what 99% of network drivers are doing
> (actually I can find only ./realtek/r8169_main.c which abuse netif_device_detach() function and,
> I assume, it is your case)
> 
Actually I was inspired by the Intel drivers, see e.g. __igc_shutdown(). They also detach the
netdevice on runtime suspend. One reason is that several core functions check for device
presence before e.g. calling a ndo callback. Example: dev_set_mtu_ext()
Same applies for __dev_set_rx_mode(). Therefore I wondered whether cpsw_ndo_set_rx_mode()
- that does not include runtime-resuming the device - may be called when device is
runtime-suspended, e.g. if interface is up, but link is down.

>> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?
> 
> No. as I know -  netdev->dev can be declared as pm_runtime_no_callbacks(&adap->dev);
> I2C adapter might be a good example to check.
> 
>> E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
>> have to resume netdev->dev.
>>
>>> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
>>
>> The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
>> it in complete. This pattern is used in more than one driver and may be worth
>> being moved to the core.
> 
> I'm not against code refactoring and optimization, but in my opinion it has to be done right from the beginning or
> not done at all.
> 
>>
>>> at minimum it might be tricky and required very careful approach (especially in err path).
>>> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
>>> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.
>>
>> The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
>> doesn't support RPM.
> 
> And if 'parent' is not supporting PM runtime - it, as i see, should be handled by PM runtime core properly.
> 
> I agree that checking the return code could make sense, but then we would
>> have to be careful which error codes we consider as failed.
> 
> huh. you can't 'try' pm_runtime_get_sync() and then align on netif_device_present() :(
> 
> might be, some how, it will work for r8169_main, but will not work for others.
> - no checking pm_runtime_get_sync() err code will cause PM runtime 'usage_count' leak

No. pm_runtime_get_sync() always bumps the usage count, no matter whether it fails or not.
This makes it easy to deal with this. The problem you describe exists with
pm_runtime_resume_and_get(). That's why I wondered whether we should annotate this
function as __must_check. See here:
https://lore.kernel.org/linux-pm/CAJZ5v0gps0C2923VqM8876npvhcETsyN+ajAkBKX5kf49J0+Mg@mail.gmail.com/T/#t

> - no checking pm_runtime_get_sync() err may cause to continue( for TI CPSW for example) with
>   device in undefined PM state ("disabled" or "half-enabled") and so crash later.
> 
I'd say 95% of rpm callers don't check the return value. I'm not saying this is a good thing,
but obviously it doesn't cause relevant harm.

> 
> 
>>
>>>
>>>
>>> The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
>>> but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
>>> but instead wake up device by runtime PM for allowed operations or just save requested configuration which
>>> is applied at netdev->open() time then.
>>> I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)
>>>
>> That's not a rare pattern when suspending or runtime-suspending to prevent different types
>> of access to a not accessible device. But yes, it's relatively big hammer ..
> 
> Again, netif_device_detach() seems correct for System wide suspend, but in my opinion - it's
> not correct for PM runtime.
> 
> Sry, with all do respect, first corresponding driver has to be fixed and not Net core hacked to support it.
> 
> Further decisions is up to maintainers.
> 
> 
>>
>>> huh, see it's merged already, so...
>>>
>>>> +
>>>> +    if (!netif_device_present(dev)) {
>>>> +        rc = -ENODEV;
>>>> +        goto out;
>>>> +    }
>>>> +
>>>>        if (dev->ethtool_ops->begin) {
>>>>            rc = dev->ethtool_ops->begin(dev);
>>>> -        if (rc  < 0)
>>>> -            return rc;
>>>> +        if (rc < 0)
>>>> +            goto out;
>>>>        }
>>>>        old_features = dev->features;
>>>>    @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>          if (old_features != dev->features)
>>>>            netdev_features_change(dev);
>>>> +out:
>>>> +    if (dev->dev.parent)
>>>> +        pm_runtime_put(dev->dev.parent);
>>>>          return rc;
>>>>    }
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-04 19:33         ` Heiner Kallweit
@ 2021-08-05  8:20           ` Grygorii Strashko
  2021-08-05 11:11             ` Joakim Zhang
  2021-08-05 19:24             ` Heiner Kallweit
  0 siblings, 2 replies; 22+ messages in thread
From: Grygorii Strashko @ 2021-08-05  8:20 UTC (permalink / raw)
  To: Heiner Kallweit, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli



On 04/08/2021 22:33, Heiner Kallweit wrote:
> On 04.08.2021 10:43, Grygorii Strashko wrote:
>>
>>
>> On 04/08/2021 00:32, Heiner Kallweit wrote:
>>> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>>>
>>>>
>>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>>>> If a network device is runtime-suspended then:
>>>>> - network device may be flagged as detached and all ethtool ops (even if not
>>>>>      accessing the device) will fail because netif_device_present() returns
>>>>>      false
>>>>> - ethtool ops may fail because device is not accessible (e.g. because being
>>>>>      in D3 in case of a PCI device)
>>>>>
>>>>> It may not be desirable that userspace can't use even simple ethtool ops
>>>>> that not access the device if interface or link is down. To be more friendly
>>>>> to userspace let's ensure that device is runtime-resumed when executing the
>>>>> respective ethtool op in kernel.
>>>>>
>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>> ---
>>>>>     net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>>>     1 file changed, 15 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>>>>> index baa5d1004..b7ff9abe7 100644
>>>>> --- a/net/ethtool/ioctl.c
>>>>> +++ b/net/ethtool/ioctl.c
>>>>> @@ -23,6 +23,7 @@
>>>>>     #include <linux/rtnetlink.h>
>>>>>     #include <linux/sched/signal.h>
>>>>>     #include <linux/net.h>
>>>>> +#include <linux/pm_runtime.h>
>>>>>     #include <net/devlink.h>
>>>>>     #include <net/xdp_sock_drv.h>
>>>>>     #include <net/flow_offload.h>
>>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>         int rc;
>>>>>         netdev_features_t old_features;
>>>>>     -    if (!dev || !netif_device_present(dev))
>>>>> +    if (!dev)
>>>>>             return -ENODEV;
>>>>>           if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>                 return -EPERM;
>>>>>         }
>>>>>     +    if (dev->dev.parent)
>>>>> +        pm_runtime_get_sync(dev->dev.parent);
>>>>
>>>> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
>>>>
>>> Not sure if there's any case yet where the netdev-embedded device is power-managed.
>>> Typically only the parent (e.g. a PCI device) is.
>>>
>>>> rpm_resume()
>>>> ...
>>>>       if (!parent && dev->parent) {
>>>>    --> here
>>>>
>>> Currently we don't get that far because we will bail out here already:
>>>
>>> else if (dev->power.disable_depth > 0)
>>>          retval = -EACCES;
>>>
>>> If netdev-embedded device isn't power-managed then disable_depth is 1.
>>
>> Right. But if pm_runtime_enable() is added for ndev->dev then PM runtime will start working for it
>> and should handle parent properly - from my experience, every time any code need manipulate with "parent" or
>> smth. else to make PM runtime working it means smth. is wrong.
>>
>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>> index f6197774048b..33b72b788aa2 100644
>> --- a/net/core/net-sysfs.c
>> +++ b/net/core/net-sysfs.c
>> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device *ndev)
>>          }
>>   
>>          pm_runtime_set_memalloc_noio(dev, true);
>> +       pm_runtime_enable(dev);
>>   
>>          return error;
>>   }
>>
>>
>>>
>>>> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
>>>> enable PM runtime for netdev->dev and lets PM runtime do the job?
>>>>
>>> Where would netdev->dev be runtime-resumed so that netif_device_present() passes?
>>
>> That's the biggest issues here. Some driver uses netif_device_detach() in PM runtime and, this way, introduces custom dependency
>> between Core device PM (runtime) sate and Net core, other driver does not do.
>> Does it means every driver with PM runtime now have to be updated to indicate it PM state to Net core with netif_device_detach()?
> 
> No, that's not needed.
> 
>> Why? Why return value from pm_runtime_get calls is not enough?
>>
>> Believe me it's terrible idea to introduce custom PM state dependency between PM runtime and Net core,
>> for example it took years to sync properly System wide suspend and PM runtime which are separate framworks.
>>
>> By the way netif_device_detach() during System Wide suspend is looks perfectly valid, because entering
>> System wide Suspend should prohibit any access to netdev at some stage. And that's what 99% of network drivers are doing
>> (actually I can find only ./realtek/r8169_main.c which abuse netif_device_detach() function and,
>> I assume, it is your case)
>>
> Actually I was inspired by the Intel drivers, see e.g. __igc_shutdown(). They also detach the
> netdevice on runtime suspend. One reason is that several core functions check for device
> presence before e.g. calling a ndo callback. Example: dev_set_mtu_ext()

right and also:
- netlink - which you've hacked already
- 8021q: vlan_dev_ioctl/vlan_dev_neigh_setup/vlan_add_rx_filter_info/vlan_kill_rx_filter_info


> Same applies for __dev_set_rx_mode(). Therefore I wondered whether cpsw_ndo_set_rx_mode()
> - that does not include runtime-resuming the device - may be called when device is
> runtime-suspended, e.g. if interface is up, but link is down.

CPSW doesn't manage PM runtime in link status handler, as it has only on/off state and off state can cause full
context loss restore of which is expensive and hard to implement. And for most of netdev drivers no aggressive PM runtime
is implemented exactly because of that (mac/vlan/fdb/mdb/...). Common patterns:

(a)
.probe
  -get
.remove
  -put

(b)
.probe
  -get
  -put
.open
  -get
.close
  -put
.protect places which may be called when netif is down

The CPSW follows (b) and so cpsw_ndo_set_rx_mode() can't be called when when device is
runtime-suspended.

I assume, some hw like PCI, can have more PM states and in some of them keep HW context intact.


> 
>>> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?
>>
>> No. as I know -  netdev->dev can be declared as pm_runtime_no_callbacks(&adap->dev);
>> I2C adapter might be a good example to check.
>>
>>> E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
>>> have to resume netdev->dev.
>>>
>>>> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
>>>
>>> The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
>>> it in complete. This pattern is used in more than one driver and may be worth
>>> being moved to the core.
>>
>> I'm not against code refactoring and optimization, but in my opinion it has to be done right from the beginning or
>> not done at all.
>>
>>>
>>>> at minimum it might be tricky and required very careful approach (especially in err path).
>>>> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
>>>> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.
>>>
>>> The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
>>> doesn't support RPM.
>>
>> And if 'parent' is not supporting PM runtime - it, as i see, should be handled by PM runtime core properly.
>>
>> I agree that checking the return code could make sense, but then we would
>>> have to be careful which error codes we consider as failed.
>>
>> huh. you can't 'try' pm_runtime_get_sync() and then align on netif_device_present() :(
>>
>> might be, some how, it will work for r8169_main, but will not work for others.
>> - no checking pm_runtime_get_sync() err code will cause PM runtime 'usage_count' leak
> 
> No. pm_runtime_get_sync() always bumps the usage count, no matter whether it fails or not.


> This makes it easy to deal with this. The problem you describe exists with
> pm_runtime_resume_and_get(). That's why I wondered whether we should annotate this
> function as __must_check. See here:
> https://lore.kernel.org/linux-pm/CAJZ5v0gps0C2923VqM8876npvhcETsyN+ajAkBKX5kf49J0+Mg@mail.gmail.com/T/#t
> 
>> - no checking pm_runtime_get_sync() err may cause to continue( for TI CPSW for example) with
>>    device in undefined PM state ("disabled" or "half-enabled") and so crash later.
>>
> I'd say 95% of rpm callers don't check the return value. I'm not saying this is a good thing,
> but obviously it doesn't cause relevant harm.

this is completely wrong assumption as PM errors cause silent stuck, undefined behavior or dumps (sometimes delayed)
which is terribly hard to root cause.

yes. many drivers do not check, but over last few years more and more strict policies applied to avoid that and
in many case no checking return code - is red flag and patch reject.
Don't like that phrase ;), but "It doesn't mean that incorrect code has to be copy-pasted all over the places"

this is correct get pattern for get:
	ret = pm_runtime_get_sync(&pdev->dev);
	if (ret < 0) {
		pm_runtime_put_noidle(&pdev->dev);
		return ret;
	}

My strong opinion
  - PM runtime return code must be checked.
  - get rid of netif_device_detach() in r8169

by the way, have you tried below test with your driver (not sure how it works for you):

.rtl_open
  - pm_runtime_get_sync
  - pm_runtime_put_sync - usage_count == 0
.r8169_phylink_handler
  - pm_request_resume - why async? still usage_count == 0
.some ethtool request to go through dev_ethtool()
  - pm_runtime_get_sync
  - pm_runtime_put - async, usage_count == 0
    ^ would not it put r8169 in runtime-suspended state while link is still UP?
  

> 
>>
>>
>>>
>>>>
>>>>
>>>> The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
>>>> but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
>>>> but instead wake up device by runtime PM for allowed operations or just save requested configuration which
>>>> is applied at netdev->open() time then.
>>>> I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)
>>>>
>>> That's not a rare pattern when suspending or runtime-suspending to prevent different types
>>> of access to a not accessible device. But yes, it's relatively big hammer ..
>>
>> Again, netif_device_detach() seems correct for System wide suspend, but in my opinion - it's
>> not correct for PM runtime.
>>
>> Sry, with all do respect, first corresponding driver has to be fixed and not Net core hacked to support it.
>>
>> Further decisions is up to maintainers.
>>
>>
>>>
>>>> huh, see it's merged already, so...
>>>>
>>>>> +
>>>>> +    if (!netif_device_present(dev)) {
>>>>> +        rc = -ENODEV;
>>>>> +        goto out;
>>>>> +    }
>>>>> +
>>>>>         if (dev->ethtool_ops->begin) {
>>>>>             rc = dev->ethtool_ops->begin(dev);
>>>>> -        if (rc  < 0)
>>>>> -            return rc;
>>>>> +        if (rc < 0)
>>>>> +            goto out;
>>>>>         }
>>>>>         old_features = dev->features;
>>>>>     @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>           if (old_features != dev->features)
>>>>>             netdev_features_change(dev);
>>>>> +out:
>>>>> +    if (dev->dev.parent)
>>>>> +        pm_runtime_put(dev->dev.parent);
>>>>>           return rc;
>>>>>     }
>>>>>
>>>>
>>>
>>
> 

-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-05  8:20           ` Grygorii Strashko
@ 2021-08-05 11:11             ` Joakim Zhang
  2021-08-05 11:58               ` Grygorii Strashko
  2021-08-05 19:24             ` Heiner Kallweit
  1 sibling, 1 reply; 22+ messages in thread
From: Joakim Zhang @ 2021-08-05 11:11 UTC (permalink / raw)
  To: Grygorii Strashko, Heiner Kallweit, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli


> -----Original Message-----
> From: Grygorii Strashko <grygorii.strashko@ti.com>
> Sent: 2021年8月5日 16:21
> To: Heiner Kallweit <hkallweit1@gmail.com>; Jakub Kicinski
> <kuba@kernel.org>; David Miller <davem@davemloft.net>
> Cc: netdev@vger.kernel.org; Linux PM list <linux-pm@vger.kernel.org>;
> Andrew Lunn <andrew@lunn.ch>; Florian Fainelli <f.fainelli@gmail.com>
> Subject: Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent
> before ethtool ioctl ops
> 
> 
> 
> On 04/08/2021 22:33, Heiner Kallweit wrote:
> > On 04.08.2021 10:43, Grygorii Strashko wrote:
> >>
> >>
> >> On 04/08/2021 00:32, Heiner Kallweit wrote:
> >>> On 03.08.2021 22:41, Grygorii Strashko wrote:
> >>>>
> >>>>
> >>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
> >>>>> If a network device is runtime-suspended then:
> >>>>> - network device may be flagged as detached and all ethtool ops
> >>>>> (even if not
> >>>>>      accessing the device) will fail because
> >>>>> netif_device_present() returns
> >>>>>      false
> >>>>> - ethtool ops may fail because device is not accessible (e.g.
> >>>>> because being
> >>>>>      in D3 in case of a PCI device)
> >>>>>
> >>>>> It may not be desirable that userspace can't use even simple
> >>>>> ethtool ops that not access the device if interface or link is
> >>>>> down. To be more friendly to userspace let's ensure that device is
> >>>>> runtime-resumed when executing the respective ethtool op in kernel.
> >>>>>
> >>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> >>>>> ---
> >>>>>     net/ethtool/ioctl.c | 18 +++++++++++++++---
> >>>>>     1 file changed, 15 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index
> >>>>> baa5d1004..b7ff9abe7 100644
> >>>>> --- a/net/ethtool/ioctl.c
> >>>>> +++ b/net/ethtool/ioctl.c
> >>>>> @@ -23,6 +23,7 @@
> >>>>>     #include <linux/rtnetlink.h>
> >>>>>     #include <linux/sched/signal.h>
> >>>>>     #include <linux/net.h>
> >>>>> +#include <linux/pm_runtime.h>
> >>>>>     #include <net/devlink.h>
> >>>>>     #include <net/xdp_sock_drv.h>
> >>>>>     #include <net/flow_offload.h>
> >>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct
> >>>>> ifreq *ifr)
> >>>>>         int rc;
> >>>>>         netdev_features_t old_features;
> >>>>>     -    if (!dev || !netif_device_present(dev))
> >>>>> +    if (!dev)
> >>>>>             return -ENODEV;
> >>>>>           if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
> >>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct
> >>>>> ifreq *ifr)
> >>>>>                 return -EPERM;
> >>>>>         }
> >>>>>     +    if (dev->dev.parent)
> >>>>> +        pm_runtime_get_sync(dev->dev.parent);
> >>>>
> >>>> the PM Runtime should allow to wake up parent when child is resumed if
> everything is configured properly.
> >>>>
> >>> Not sure if there's any case yet where the netdev-embedded device is
> power-managed.
> >>> Typically only the parent (e.g. a PCI device) is.
> >>>
> >>>> rpm_resume()
> >>>> ...
> >>>>       if (!parent && dev->parent) {
> >>>>    --> here
> >>>>
> >>> Currently we don't get that far because we will bail out here already:
> >>>
> >>> else if (dev->power.disable_depth > 0)
> >>>          retval = -EACCES;
> >>>
> >>> If netdev-embedded device isn't power-managed then disable_depth is 1.
> >>
> >> Right. But if pm_runtime_enable() is added for ndev->dev then PM
> >> runtime will start working for it and should handle parent properly -
> >> from my experience, every time any code need manipulate with "parent" or
> smth. else to make PM runtime working it means smth. is wrong.
> >>
> >> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index
> >> f6197774048b..33b72b788aa2 100644
> >> --- a/net/core/net-sysfs.c
> >> +++ b/net/core/net-sysfs.c
> >> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device
> >> *ndev)
> >>          }
> >>
> >>          pm_runtime_set_memalloc_noio(dev, true);
> >> +       pm_runtime_enable(dev);
> >>
> >>          return error;
> >>   }
> >>
> >>
> >>>
> >>>> So, hence PM runtime calls are moved to from drivers to net_core
> >>>> wouldn't be more correct approach to enable PM runtime for netdev->dev
> and lets PM runtime do the job?
> >>>>
> >>> Where would netdev->dev be runtime-resumed so that
> netif_device_present() passes?
> >>
> >> That's the biggest issues here. Some driver uses
> >> netif_device_detach() in PM runtime and, this way, introduces custom
> dependency between Core device PM (runtime) sate and Net core, other driver
> does not do.
> >> Does it means every driver with PM runtime now have to be updated to
> indicate it PM state to Net core with netif_device_detach()?
> >
> > No, that's not needed.
> >
> >> Why? Why return value from pm_runtime_get calls is not enough?
> >>
> >> Believe me it's terrible idea to introduce custom PM state dependency
> >> between PM runtime and Net core, for example it took years to sync
> properly System wide suspend and PM runtime which are separate framworks.
> >>
> >> By the way netif_device_detach() during System Wide suspend is looks
> >> perfectly valid, because entering System wide Suspend should prohibit
> >> any access to netdev at some stage. And that's what 99% of network
> >> drivers are doing (actually I can find only ./realtek/r8169_main.c
> >> which abuse netif_device_detach() function and, I assume, it is your
> >> case)
> >>
> > Actually I was inspired by the Intel drivers, see e.g.
> > __igc_shutdown(). They also detach the netdevice on runtime suspend.
> > One reason is that several core functions check for device presence
> > before e.g. calling a ndo callback. Example: dev_set_mtu_ext()
> 
> right and also:
> - netlink - which you've hacked already
> - 8021q:
> vlan_dev_ioctl/vlan_dev_neigh_setup/vlan_add_rx_filter_info/vlan_kill_rx_filte
> r_info

Yes, there are many place need to do such check. I always face a problem that where I need to
runtime-resume the device, is there any suggestion? I always add it when an issue came out.

What confuse me it that, is there any document describe that which .ndo callback should be called
with interface up, instead .ndo callback _CAN_ be called with interface down? I think this can help
us decide when we need runtime-resume device.

After leaning all your discussion, from my point of view, it seems not a good choice to add RPM
to net core. It had better handled by driver itself.
 
> 
> > Same applies for __dev_set_rx_mode(). Therefore I wondered whether
> > cpsw_ndo_set_rx_mode()
> > - that does not include runtime-resuming the device - may be called
> > when device is runtime-suspended, e.g. if interface is up, but link is down.
> 
> CPSW doesn't manage PM runtime in link status handler, as it has only on/off
> state and off state can cause full context loss restore of which is expensive and
> hard to implement. And for most of netdev drivers no aggressive PM runtime is
> implemented exactly because of that (mac/vlan/fdb/mdb/...). Common
> patterns:
> 
> (a)
> .probe
>   -get
> .remove
>   -put
> 
> (b)
> .probe
>   -get
>   -put
> .open
>   -get
> .close
>   -put
> .protect places which may be called when netif is down
> 
> The CPSW follows (b) and so cpsw_ndo_set_rx_mode() can't be called when
> when device is runtime-suspended.
> 
> I assume, some hw like PCI, can have more PM states and in some of them
> keep HW context intact.
> 
> 
> >
> >>> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for
> netdev->dev?
> >>
> >> No. as I know -  netdev->dev can be declared as
> >> pm_runtime_no_callbacks(&adap->dev);
> >> I2C adapter might be a good example to check.
> >>
> >>> E.g. the parent runtime-resume can be triggered by a PCI PME, then
> >>> it would have to resume netdev->dev.
> >>>
> >>>> But, to be honest, I'm not sure adding PM runtime manipulation to
> >>>> the net core is a good idea -
> >>>
> >>> The TI CPSW driver runtime-resumes the device in begin ethtool op
> >>> and suspends it in complete. This pattern is used in more than one
> >>> driver and may be worth being moved to the core.
> >>
> >> I'm not against code refactoring and optimization, but in my opinion
> >> it has to be done right from the beginning or not done at all.
> >>
> >>>
> >>>> at minimum it might be tricky and required very careful approach
> (especially in err path).
> >>>> For example, even in this patch you do not check return value of
> >>>> pm_runtime_get_sync() and in commit bd869245a3dc ("net: core: try to
> runtime-resume detached device in __dev_open") also actualy.
> >>>
> >>> The pm_runtime_get_sync() calls are attempts here. We don't want to
> >>> bail out if a device doesn't support RPM.
> >>
> >> And if 'parent' is not supporting PM runtime - it, as i see, should be handled
> by PM runtime core properly.
> >>
> >> I agree that checking the return code could make sense, but then we
> >> would
> >>> have to be careful which error codes we consider as failed.
> >>
> >> huh. you can't 'try' pm_runtime_get_sync() and then align on
> >> netif_device_present() :(
> >>
> >> might be, some how, it will work for r8169_main, but will not work for
> others.
> >> - no checking pm_runtime_get_sync() err code will cause PM runtime
> >> 'usage_count' leak
> >
> > No. pm_runtime_get_sync() always bumps the usage count, no matter
> whether it fails or not.
> 
> 
> > This makes it easy to deal with this. The problem you describe exists
> > with pm_runtime_resume_and_get(). That's why I wondered whether we
> > should annotate this function as __must_check. See here:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Flinux-pm%2FCAJZ5v0gps0C2923VqM8876npvhcETsyN%2BajAk
> BKX5k
> >
> f49J0%2BMg%40mail.gmail.com%2FT%2F%23t&amp;data=04%7C01%7Cqiang
> qing.zh
> >
> ang%40nxp.com%7C392a231908da48658f4108d957e9f47c%7C686ea1d3bc2b
> 4c6fa92
> >
> cd99c5c301635%7C0%7C0%7C637637484629499490%7CUnknown%7CTWFpb
> GZsb3d8eyJ
> >
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7
> C1000
> >
> &amp;sdata=lbHf9xDmV7w7QkolIsc%2FdmN%2FPKcOCkb2quaMa1JWI2g%3D
> &amp;rese
> > rved=0
> >
> >> - no checking pm_runtime_get_sync() err may cause to continue( for TI
> >> CPSW for example) with
> >>    device in undefined PM state ("disabled" or "half-enabled") and so crash
> later.
> >>
> > I'd say 95% of rpm callers don't check the return value. I'm not
> > saying this is a good thing, but obviously it doesn't cause relevant harm.
> 
> this is completely wrong assumption as PM errors cause silent stuck, undefined
> behavior or dumps (sometimes delayed) which is terribly hard to root cause.
> 
> yes. many drivers do not check, but over last few years more and more strict
> policies applied to avoid that and in many case no checking return code - is red
> flag and patch reject.
> Don't like that phrase ;), but "It doesn't mean that incorrect code has to be
> copy-pasted all over the places"
> 
> this is correct get pattern for get:
> 	ret = pm_runtime_get_sync(&pdev->dev);
> 	if (ret < 0) {
> 		pm_runtime_put_noidle(&pdev->dev);
> 		return ret;
> 	}

Yes, it’s the correct usage for pm_runtime_get_sync.

Best Regards,
Joakim Zhang
> My strong opinion
>   - PM runtime return code must be checked.
>   - get rid of netif_device_detach() in r8169
> 
> by the way, have you tried below test with your driver (not sure how it works
> for you):
> 
> .rtl_open
>   - pm_runtime_get_sync
>   - pm_runtime_put_sync - usage_count == 0 .r8169_phylink_handler
>   - pm_request_resume - why async? still usage_count == 0 .some ethtool
> request to go through dev_ethtool()
>   - pm_runtime_get_sync
>   - pm_runtime_put - async, usage_count == 0
>     ^ would not it put r8169 in runtime-suspended state while link is still UP?
> 
> 
> >
> >>
> >>
> >>>
> >>>>
> >>>>
> >>>> The TI CPSW driver may also be placed in non reachable state when
> >>>> netdev is closed (and even lose context), but we do not use
> >>>> netif_device_detach() (so netdev is accessible through
> >>>> netdev_ops/ethtool_ops), but instead wake up device by runtime PM for
> allowed operations or just save requested configuration which is applied at
> netdev->open() time then.
> >>>> I feel that using netif_device_detach() in PM runtime sounds like a
> >>>> too heavy approach ;)
> >>>>
> >>> That's not a rare pattern when suspending or runtime-suspending to
> >>> prevent different types of access to a not accessible device. But yes, it's
> relatively big hammer ..
> >>
> >> Again, netif_device_detach() seems correct for System wide suspend,
> >> but in my opinion - it's not correct for PM runtime.
> >>
> >> Sry, with all do respect, first corresponding driver has to be fixed and not
> Net core hacked to support it.
> >>
> >> Further decisions is up to maintainers.
> >>
> >>
> >>>
> >>>> huh, see it's merged already, so...
> >>>>
> >>>>> +
> >>>>> +    if (!netif_device_present(dev)) {
> >>>>> +        rc = -ENODEV;
> >>>>> +        goto out;
> >>>>> +    }
> >>>>> +
> >>>>>         if (dev->ethtool_ops->begin) {
> >>>>>             rc = dev->ethtool_ops->begin(dev);
> >>>>> -        if (rc  < 0)
> >>>>> -            return rc;
> >>>>> +        if (rc < 0)
> >>>>> +            goto out;
> >>>>>         }
> >>>>>         old_features = dev->features;
> >>>>>     @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct
> >>>>> ifreq *ifr)
> >>>>>           if (old_features != dev->features)
> >>>>>             netdev_features_change(dev);
> >>>>> +out:
> >>>>> +    if (dev->dev.parent)
> >>>>> +        pm_runtime_put(dev->dev.parent);
> >>>>>           return rc;
> >>>>>     }
> >>>>>
> >>>>
> >>>
> >>
> >
> 
> --
> Best regards,
> grygorii

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin
  2021-08-01 10:41 ` [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin Heiner Kallweit
@ 2021-08-05 11:51   ` Julian Wiedmann
  2021-08-05 18:48     ` Heiner Kallweit
  0 siblings, 1 reply; 22+ messages in thread
From: Julian Wiedmann @ 2021-08-05 11:51 UTC (permalink / raw)
  To: Heiner Kallweit, Jakub Kicinski, David Miller; +Cc: netdev

On 01.08.21 13:41, Heiner Kallweit wrote:
> If a network device is runtime-suspended then:
> - network device may be flagged as detached and all ethtool ops (even if not
>   accessing the device) will fail because netif_device_present() returns
>   false
> - ethtool ops may fail because device is not accessible (e.g. because being
>   in D3 in case of a PCI device)
> 
> It may not be desirable that userspace can't use even simple ethtool ops
> that not access the device if interface or link is down. To be more friendly
> to userspace let's ensure that device is runtime-resumed when executing the
> respective ethtool op in kernel.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>  net/ethtool/netlink.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 

[...]

>  
>  void ethnl_ops_complete(struct net_device *dev)
>  {
>  	if (dev && dev->ethtool_ops->complete)
>  		dev->ethtool_ops->complete(dev);
> +
> +	if (dev->dev.parent)
> +		pm_runtime_put(dev->dev.parent);
>  }
>  
>  /**
> 

Hello Heiner,

Coverity complains that we checked dev != NULL earlier but now
unconditionally dereference it:


*** CID 1506213:  Null pointer dereferences  (FORWARD_NULL)
/net/ethtool/netlink.c: 67 in ethnl_ops_complete()
61     
62     void ethnl_ops_complete(struct net_device *dev)
63     {
64     	if (dev && dev->ethtool_ops->complete)
65     		dev->ethtool_ops->complete(dev);
66     
>>>     CID 1506213:  Null pointer dereferences  (FORWARD_NULL)
>>>     Dereferencing null pointer "dev".
67     	if (dev->dev.parent)
68     		pm_runtime_put(dev->dev.parent);
69     }
70     
71     /**
72      * ethnl_parse_header_dev_get() - parse request header

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-05 11:11             ` Joakim Zhang
@ 2021-08-05 11:58               ` Grygorii Strashko
  0 siblings, 0 replies; 22+ messages in thread
From: Grygorii Strashko @ 2021-08-05 11:58 UTC (permalink / raw)
  To: Joakim Zhang, Heiner Kallweit, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli



On 05/08/2021 14:11, Joakim Zhang wrote:
> 
>> -----Original Message-----
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>> Sent: 2021年8月5日 16:21
>> To: Heiner Kallweit <hkallweit1@gmail.com>; Jakub Kicinski
>> <kuba@kernel.org>; David Miller <davem@davemloft.net>
>> Cc: netdev@vger.kernel.org; Linux PM list <linux-pm@vger.kernel.org>;
>> Andrew Lunn <andrew@lunn.ch>; Florian Fainelli <f.fainelli@gmail.com>
>> Subject: Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent
>> before ethtool ioctl ops
>>
>>
>>
>> On 04/08/2021 22:33, Heiner Kallweit wrote:
>>> On 04.08.2021 10:43, Grygorii Strashko wrote:
>>>>
>>>>
>>>> On 04/08/2021 00:32, Heiner Kallweit wrote:
>>>>> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>>>>>
>>>>>>
>>>>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>>>>>> If a network device is runtime-suspended then:
>>>>>>> - network device may be flagged as detached and all ethtool ops
>>>>>>> (even if not
>>>>>>>       accessing the device) will fail because
>>>>>>> netif_device_present() returns
>>>>>>>       false
>>>>>>> - ethtool ops may fail because device is not accessible (e.g.
>>>>>>> because being
>>>>>>>       in D3 in case of a PCI device)
>>>>>>>
>>>>>>> It may not be desirable that userspace can't use even simple
>>>>>>> ethtool ops that not access the device if interface or link is
>>>>>>> down. To be more friendly to userspace let's ensure that device is
>>>>>>> runtime-resumed when executing the respective ethtool op in kernel.
>>>>>>>
>>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>>> ---
>>>>>>>      net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>>>>>      1 file changed, 15 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index
>>>>>>> baa5d1004..b7ff9abe7 100644
>>>>>>> --- a/net/ethtool/ioctl.c
>>>>>>> +++ b/net/ethtool/ioctl.c
>>>>>>> @@ -23,6 +23,7 @@
>>>>>>>      #include <linux/rtnetlink.h>
>>>>>>>      #include <linux/sched/signal.h>
>>>>>>>      #include <linux/net.h>
>>>>>>> +#include <linux/pm_runtime.h>
>>>>>>>      #include <net/devlink.h>
>>>>>>>      #include <net/xdp_sock_drv.h>
>>>>>>>      #include <net/flow_offload.h>
>>>>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct
>>>>>>> ifreq *ifr)
>>>>>>>          int rc;
>>>>>>>          netdev_features_t old_features;
>>>>>>>      -    if (!dev || !netif_device_present(dev))
>>>>>>> +    if (!dev)
>>>>>>>              return -ENODEV;
>>>>>>>            if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>>>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct
>>>>>>> ifreq *ifr)
>>>>>>>                  return -EPERM;
>>>>>>>          }
>>>>>>>      +    if (dev->dev.parent)
>>>>>>> +        pm_runtime_get_sync(dev->dev.parent);
>>>>>>
>>>>>> the PM Runtime should allow to wake up parent when child is resumed if
>> everything is configured properly.
>>>>>>
>>>>> Not sure if there's any case yet where the netdev-embedded device is
>> power-managed.
>>>>> Typically only the parent (e.g. a PCI device) is.
>>>>>
>>>>>> rpm_resume()
>>>>>> ...
>>>>>>        if (!parent && dev->parent) {
>>>>>>     --> here
>>>>>>
>>>>> Currently we don't get that far because we will bail out here already:
>>>>>
>>>>> else if (dev->power.disable_depth > 0)
>>>>>           retval = -EACCES;
>>>>>
>>>>> If netdev-embedded device isn't power-managed then disable_depth is 1.
>>>>
>>>> Right. But if pm_runtime_enable() is added for ndev->dev then PM
>>>> runtime will start working for it and should handle parent properly -
>>>> from my experience, every time any code need manipulate with "parent" or
>> smth. else to make PM runtime working it means smth. is wrong.
>>>>
>>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index
>>>> f6197774048b..33b72b788aa2 100644
>>>> --- a/net/core/net-sysfs.c
>>>> +++ b/net/core/net-sysfs.c
>>>> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device
>>>> *ndev)
>>>>           }
>>>>
>>>>           pm_runtime_set_memalloc_noio(dev, true);
>>>> +       pm_runtime_enable(dev);
>>>>
>>>>           return error;
>>>>    }
>>>>
>>>>
>>>>>
>>>>>> So, hence PM runtime calls are moved to from drivers to net_core
>>>>>> wouldn't be more correct approach to enable PM runtime for netdev->dev
>> and lets PM runtime do the job?
>>>>>>
>>>>> Where would netdev->dev be runtime-resumed so that
>> netif_device_present() passes?
>>>>
>>>> That's the biggest issues here. Some driver uses
>>>> netif_device_detach() in PM runtime and, this way, introduces custom
>> dependency between Core device PM (runtime) sate and Net core, other driver
>> does not do.
>>>> Does it means every driver with PM runtime now have to be updated to
>> indicate it PM state to Net core with netif_device_detach()?
>>>
>>> No, that's not needed.
>>>
>>>> Why? Why return value from pm_runtime_get calls is not enough?
>>>>
>>>> Believe me it's terrible idea to introduce custom PM state dependency
>>>> between PM runtime and Net core, for example it took years to sync
>> properly System wide suspend and PM runtime which are separate framworks.
>>>>
>>>> By the way netif_device_detach() during System Wide suspend is looks
>>>> perfectly valid, because entering System wide Suspend should prohibit
>>>> any access to netdev at some stage. And that's what 99% of network
>>>> drivers are doing (actually I can find only ./realtek/r8169_main.c
>>>> which abuse netif_device_detach() function and, I assume, it is your
>>>> case)
>>>>
>>> Actually I was inspired by the Intel drivers, see e.g.
>>> __igc_shutdown(). They also detach the netdevice on runtime suspend.
>>> One reason is that several core functions check for device presence
>>> before e.g. calling a ndo callback. Example: dev_set_mtu_ext()
>>
>> right and also:
>> - netlink - which you've hacked already
>> - 8021q:
>> vlan_dev_ioctl/vlan_dev_neigh_setup/vlan_add_rx_filter_info/vlan_kill_rx_filte
>> r_info
> 
> Yes, there are many place need to do such check. I always face a problem that where I need to
> runtime-resume the device, is there any suggestion? I always add it when an issue came out.
> 
> What confuse me it that, is there any document describe that which .ndo callback should be called
> with interface up, instead .ndo callback _CAN_ be called with interface down? I think this can help
> us decide when we need runtime-resume device.

In general, you can assume that any ndo can be called which are not part of data path (xmit, watchdog, irq/napi),
so the only option is to put netif down and go through every ndo testing.

> 
> After leaning all your discussion, from my point of view, it seems not a good choice to add RPM
> to net core. It had better handled by driver itself.


-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin
  2021-08-05 11:51   ` Julian Wiedmann
@ 2021-08-05 18:48     ` Heiner Kallweit
  0 siblings, 0 replies; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-05 18:48 UTC (permalink / raw)
  To: Julian Wiedmann, Jakub Kicinski, David Miller; +Cc: netdev

On 05.08.2021 13:51, Julian Wiedmann wrote:
> On 01.08.21 13:41, Heiner Kallweit wrote:
>> If a network device is runtime-suspended then:
>> - network device may be flagged as detached and all ethtool ops (even if not
>>   accessing the device) will fail because netif_device_present() returns
>>   false
>> - ethtool ops may fail because device is not accessible (e.g. because being
>>   in D3 in case of a PCI device)
>>
>> It may not be desirable that userspace can't use even simple ethtool ops
>> that not access the device if interface or link is down. To be more friendly
>> to userspace let's ensure that device is runtime-resumed when executing the
>> respective ethtool op in kernel.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>  net/ethtool/netlink.c | 31 +++++++++++++++++++++++++------
>>  1 file changed, 25 insertions(+), 6 deletions(-)
>>
> 
> [...]
> 
>>  
>>  void ethnl_ops_complete(struct net_device *dev)
>>  {
>>  	if (dev && dev->ethtool_ops->complete)
>>  		dev->ethtool_ops->complete(dev);
>> +
>> +	if (dev->dev.parent)
>> +		pm_runtime_put(dev->dev.parent);
>>  }
>>  
>>  /**
>>
> 
> Hello Heiner,
> 
> Coverity complains that we checked dev != NULL earlier but now
> unconditionally dereference it:
> 
Thanks for the hint. I wonder whether we have any valid case where
dev could be NULL. There are several places where dev is dereferenced
after the call to ethnl_ops_begin(). Just one example:
linkmodes_prepare_data()

Only ethnl_request_ops where allow_nodev_do is true is
ethnl_strset_request_ops. However in strset_prepare_data()
ethnl_ops_begin() is called only if dev isn't NULL.
Supposedly we should return an error from ethnl_ops_begin()
if dev is NULL.

> 
> *** CID 1506213:  Null pointer dereferences  (FORWARD_NULL)
> /net/ethtool/netlink.c: 67 in ethnl_ops_complete()
> 61     
> 62     void ethnl_ops_complete(struct net_device *dev)
> 63     {
> 64     	if (dev && dev->ethtool_ops->complete)
> 65     		dev->ethtool_ops->complete(dev);
> 66     
>>>>     CID 1506213:  Null pointer dereferences  (FORWARD_NULL)
>>>>     Dereferencing null pointer "dev".
> 67     	if (dev->dev.parent)
> 68     		pm_runtime_put(dev->dev.parent);
> 69     }
> 70     
> 71     /**
> 72      * ethnl_parse_header_dev_get() - parse request header
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-05  8:20           ` Grygorii Strashko
  2021-08-05 11:11             ` Joakim Zhang
@ 2021-08-05 19:24             ` Heiner Kallweit
  2021-08-05 20:00               ` Grygorii Strashko
  1 sibling, 1 reply; 22+ messages in thread
From: Heiner Kallweit @ 2021-08-05 19:24 UTC (permalink / raw)
  To: Grygorii Strashko, Jakub Kicinski, David Miller
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli

On 05.08.2021 10:20, Grygorii Strashko wrote:
> 
> 
> On 04/08/2021 22:33, Heiner Kallweit wrote:
>> On 04.08.2021 10:43, Grygorii Strashko wrote:
>>>
>>>
>>> On 04/08/2021 00:32, Heiner Kallweit wrote:
>>>> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>>>>
>>>>>
>>>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>>>>> If a network device is runtime-suspended then:
>>>>>> - network device may be flagged as detached and all ethtool ops (even if not
>>>>>>      accessing the device) will fail because netif_device_present() returns
>>>>>>      false
>>>>>> - ethtool ops may fail because device is not accessible (e.g. because being
>>>>>>      in D3 in case of a PCI device)
>>>>>>
>>>>>> It may not be desirable that userspace can't use even simple ethtool ops
>>>>>> that not access the device if interface or link is down. To be more friendly
>>>>>> to userspace let's ensure that device is runtime-resumed when executing the
>>>>>> respective ethtool op in kernel.
>>>>>>
>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>> ---
>>>>>>     net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>>>>     1 file changed, 15 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>>>>>> index baa5d1004..b7ff9abe7 100644
>>>>>> --- a/net/ethtool/ioctl.c
>>>>>> +++ b/net/ethtool/ioctl.c
>>>>>> @@ -23,6 +23,7 @@
>>>>>>     #include <linux/rtnetlink.h>
>>>>>>     #include <linux/sched/signal.h>
>>>>>>     #include <linux/net.h>
>>>>>> +#include <linux/pm_runtime.h>
>>>>>>     #include <net/devlink.h>
>>>>>>     #include <net/xdp_sock_drv.h>
>>>>>>     #include <net/flow_offload.h>
>>>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>>         int rc;
>>>>>>         netdev_features_t old_features;
>>>>>>     -    if (!dev || !netif_device_present(dev))
>>>>>> +    if (!dev)
>>>>>>             return -ENODEV;
>>>>>>           if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>>                 return -EPERM;
>>>>>>         }
>>>>>>     +    if (dev->dev.parent)
>>>>>> +        pm_runtime_get_sync(dev->dev.parent);
>>>>>
>>>>> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
>>>>>
>>>> Not sure if there's any case yet where the netdev-embedded device is power-managed.
>>>> Typically only the parent (e.g. a PCI device) is.
>>>>
>>>>> rpm_resume()
>>>>> ...
>>>>>       if (!parent && dev->parent) {
>>>>>    --> here
>>>>>
>>>> Currently we don't get that far because we will bail out here already:
>>>>
>>>> else if (dev->power.disable_depth > 0)
>>>>          retval = -EACCES;
>>>>
>>>> If netdev-embedded device isn't power-managed then disable_depth is 1.
>>>
>>> Right. But if pm_runtime_enable() is added for ndev->dev then PM runtime will start working for it
>>> and should handle parent properly - from my experience, every time any code need manipulate with "parent" or
>>> smth. else to make PM runtime working it means smth. is wrong.
>>>
>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>> index f6197774048b..33b72b788aa2 100644
>>> --- a/net/core/net-sysfs.c
>>> +++ b/net/core/net-sysfs.c
>>> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device *ndev)
>>>          }
>>>            pm_runtime_set_memalloc_noio(dev, true);
>>> +       pm_runtime_enable(dev);
>>>            return error;
>>>   }
>>>
>>>
>>>>
>>>>> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
>>>>> enable PM runtime for netdev->dev and lets PM runtime do the job?
>>>>>
>>>> Where would netdev->dev be runtime-resumed so that netif_device_present() passes?
>>>
>>> That's the biggest issues here. Some driver uses netif_device_detach() in PM runtime and, this way, introduces custom dependency
>>> between Core device PM (runtime) sate and Net core, other driver does not do.
>>> Does it means every driver with PM runtime now have to be updated to indicate it PM state to Net core with netif_device_detach()?
>>
>> No, that's not needed.
>>
>>> Why? Why return value from pm_runtime_get calls is not enough?
>>>
>>> Believe me it's terrible idea to introduce custom PM state dependency between PM runtime and Net core,
>>> for example it took years to sync properly System wide suspend and PM runtime which are separate framworks.
>>>
>>> By the way netif_device_detach() during System Wide suspend is looks perfectly valid, because entering
>>> System wide Suspend should prohibit any access to netdev at some stage. And that's what 99% of network drivers are doing
>>> (actually I can find only ./realtek/r8169_main.c which abuse netif_device_detach() function and,
>>> I assume, it is your case)
>>>
>> Actually I was inspired by the Intel drivers, see e.g. __igc_shutdown(). They also detach the
>> netdevice on runtime suspend. One reason is that several core functions check for device
>> presence before e.g. calling a ndo callback. Example: dev_set_mtu_ext()
> 
> right and also:
> - netlink - which you've hacked already
> - 8021q: vlan_dev_ioctl/vlan_dev_neigh_setup/vlan_add_rx_filter_info/vlan_kill_rx_filter_info
> 
> 
>> Same applies for __dev_set_rx_mode(). Therefore I wondered whether cpsw_ndo_set_rx_mode()
>> - that does not include runtime-resuming the device - may be called when device is
>> runtime-suspended, e.g. if interface is up, but link is down.
> 
> CPSW doesn't manage PM runtime in link status handler, as it has only on/off state and off state can cause full
> context loss restore of which is expensive and hard to implement. And for most of netdev drivers no aggressive PM runtime
> is implemented exactly because of that (mac/vlan/fdb/mdb/...). Common patterns:
> 
> (a)
> .probe
>  -get
> .remove
>  -put
> 
> (b)
> .probe
>  -get
>  -put
> .open
>  -get
> .close
>  -put
> .protect places which may be called when netif is down
> 
> The CPSW follows (b) and so cpsw_ndo_set_rx_mode() can't be called when when device is
> runtime-suspended.
> 
> I assume, some hw like PCI, can have more PM states and in some of them keep HW context intact.
> 
Exactly, there's no reason to keep PCI in D0 if link is down. Once NIC detects a cable was
plugged in it triggers a PCI PME and PCI core sets PCI bus from D3cold/D3hot to D0 and
runtime-resumes device.

> 
>>
>>>> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?
>>>
>>> No. as I know -  netdev->dev can be declared as pm_runtime_no_callbacks(&adap->dev);
>>> I2C adapter might be a good example to check.
>>>
>>>> E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
>>>> have to resume netdev->dev.
>>>>
>>>>> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
>>>>
>>>> The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
>>>> it in complete. This pattern is used in more than one driver and may be worth
>>>> being moved to the core.
>>>
>>> I'm not against code refactoring and optimization, but in my opinion it has to be done right from the beginning or
>>> not done at all.
>>>
>>>>
>>>>> at minimum it might be tricky and required very careful approach (especially in err path).
>>>>> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
>>>>> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.
>>>>
>>>> The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
>>>> doesn't support RPM.
>>>
>>> And if 'parent' is not supporting PM runtime - it, as i see, should be handled by PM runtime core properly.
>>>
>>> I agree that checking the return code could make sense, but then we would
>>>> have to be careful which error codes we consider as failed.
>>>
>>> huh. you can't 'try' pm_runtime_get_sync() and then align on netif_device_present() :(
>>>
>>> might be, some how, it will work for r8169_main, but will not work for others.
>>> - no checking pm_runtime_get_sync() err code will cause PM runtime 'usage_count' leak
>>
>> No. pm_runtime_get_sync() always bumps the usage count, no matter whether it fails or not.
> 
> 
>> This makes it easy to deal with this. The problem you describe exists with
>> pm_runtime_resume_and_get(). That's why I wondered whether we should annotate this
>> function as __must_check. See here:
>> https://lore.kernel.org/linux-pm/CAJZ5v0gps0C2923VqM8876npvhcETsyN+ajAkBKX5kf49J0+Mg@mail.gmail.com/T/#t
>>
>>> - no checking pm_runtime_get_sync() err may cause to continue( for TI CPSW for example) with
>>>    device in undefined PM state ("disabled" or "half-enabled") and so crash later.
>>>
>> I'd say 95% of rpm callers don't check the return value. I'm not saying this is a good thing,
>> but obviously it doesn't cause relevant harm.
> 
> this is completely wrong assumption as PM errors cause silent stuck, undefined behavior or dumps (sometimes delayed)
> which is terribly hard to root cause.
> 
> yes. many drivers do not check, but over last few years more and more strict policies applied to avoid that and
> in many case no checking return code - is red flag and patch reject.
> Don't like that phrase ;), but "It doesn't mean that incorrect code has to be copy-pasted all over the places"
> 
> this is correct get pattern for get:
>     ret = pm_runtime_get_sync(&pdev->dev);
>     if (ret < 0) {
>         pm_runtime_put_noidle(&pdev->dev);
>         return ret;
>     }
> 
That's exactly what pm_runtime_resume_and_get() does. IIRC this helper hasn't been
part of the API from the beginning and was added later.

> My strong opinion
>  - PM runtime return code must be checked.
>  - get rid of netif_device_detach() in r8169
> 
> by the way, have you tried below test with your driver (not sure how it works for you):
> 
> .rtl_open
>  - pm_runtime_get_sync
>  - pm_runtime_put_sync - usage_count == 0
> .r8169_phylink_handler
>  - pm_request_resume - why async? still usage_count == 0

pm_request_resume() is only meant to cancel a potentially scheduled runtime-suspend
if link has a short drop. In such a case link would be up again after ~ 3-4s,
timeout for runtime-suspending device after link drop is 10s.

> .some ethtool request to go through dev_ethtool()
>  - pm_runtime_get_sync
>  - pm_runtime_put - async, usage_count == 0
>    ^ would not it put r8169 in runtime-suspended state while link is still UP?
>  
No, see rtl8169_runtime_idle(). If link is up no runtime suspend is scheduled.

> 
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>> The TI CPSW driver may also be placed in non reachable state when netdev is closed (and even lose context),
>>>>> but we do not use netif_device_detach() (so netdev is accessible through netdev_ops/ethtool_ops),
>>>>> but instead wake up device by runtime PM for allowed operations or just save requested configuration which
>>>>> is applied at netdev->open() time then.
>>>>> I feel that using netif_device_detach() in PM runtime sounds like a too heavy approach ;)
>>>>>
>>>> That's not a rare pattern when suspending or runtime-suspending to prevent different types
>>>> of access to a not accessible device. But yes, it's relatively big hammer ..
>>>
>>> Again, netif_device_detach() seems correct for System wide suspend, but in my opinion - it's
>>> not correct for PM runtime.
>>>
>>> Sry, with all do respect, first corresponding driver has to be fixed and not Net core hacked to support it.
>>>
>>> Further decisions is up to maintainers.
>>>
>>>
>>>>
>>>>> huh, see it's merged already, so...
>>>>>
>>>>>> +
>>>>>> +    if (!netif_device_present(dev)) {
>>>>>> +        rc = -ENODEV;
>>>>>> +        goto out;
>>>>>> +    }
>>>>>> +
>>>>>>         if (dev->ethtool_ops->begin) {
>>>>>>             rc = dev->ethtool_ops->begin(dev);
>>>>>> -        if (rc  < 0)
>>>>>> -            return rc;
>>>>>> +        if (rc < 0)
>>>>>> +            goto out;
>>>>>>         }
>>>>>>         old_features = dev->features;
>>>>>>     @@ -2867,6 +2876,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>>           if (old_features != dev->features)
>>>>>>             netdev_features_change(dev);
>>>>>> +out:
>>>>>> +    if (dev->dev.parent)
>>>>>> +        pm_runtime_put(dev->dev.parent);
>>>>>>           return rc;
>>>>>>     }
>>>>>>
>>>>>
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops
  2021-08-05 19:24             ` Heiner Kallweit
@ 2021-08-05 20:00               ` Grygorii Strashko
  0 siblings, 0 replies; 22+ messages in thread
From: Grygorii Strashko @ 2021-08-05 20:00 UTC (permalink / raw)
  To: Heiner Kallweit, Jakub Kicinski, David Miller, Rafael J. Wysocki,
	Alan Stern
  Cc: netdev, Linux PM list, Andrew Lunn, Florian Fainelli



On 05/08/2021 22:24, Heiner Kallweit wrote:
> On 05.08.2021 10:20, Grygorii Strashko wrote:
>>
>>
>> On 04/08/2021 22:33, Heiner Kallweit wrote:
>>> On 04.08.2021 10:43, Grygorii Strashko wrote:
>>>>
>>>>
>>>> On 04/08/2021 00:32, Heiner Kallweit wrote:
>>>>> On 03.08.2021 22:41, Grygorii Strashko wrote:
>>>>>>
>>>>>>
>>>>>> On 01/08/2021 13:36, Heiner Kallweit wrote:
>>>>>>> If a network device is runtime-suspended then:
>>>>>>> - network device may be flagged as detached and all ethtool ops (even if not
>>>>>>>       accessing the device) will fail because netif_device_present() returns
>>>>>>>       false
>>>>>>> - ethtool ops may fail because device is not accessible (e.g. because being
>>>>>>>       in D3 in case of a PCI device)
>>>>>>>
>>>>>>> It may not be desirable that userspace can't use even simple ethtool ops
>>>>>>> that not access the device if interface or link is down. To be more friendly
>>>>>>> to userspace let's ensure that device is runtime-resumed when executing the
>>>>>>> respective ethtool op in kernel.
>>>>>>>
>>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>>> ---
>>>>>>>      net/ethtool/ioctl.c | 18 +++++++++++++++---
>>>>>>>      1 file changed, 15 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>>>>>>> index baa5d1004..b7ff9abe7 100644
>>>>>>> --- a/net/ethtool/ioctl.c
>>>>>>> +++ b/net/ethtool/ioctl.c
>>>>>>> @@ -23,6 +23,7 @@
>>>>>>>      #include <linux/rtnetlink.h>
>>>>>>>      #include <linux/sched/signal.h>
>>>>>>>      #include <linux/net.h>
>>>>>>> +#include <linux/pm_runtime.h>
>>>>>>>      #include <net/devlink.h>
>>>>>>>      #include <net/xdp_sock_drv.h>
>>>>>>>      #include <net/flow_offload.h>
>>>>>>> @@ -2589,7 +2590,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>>>          int rc;
>>>>>>>          netdev_features_t old_features;
>>>>>>>      -    if (!dev || !netif_device_present(dev))
>>>>>>> +    if (!dev)
>>>>>>>              return -ENODEV;
>>>>>>>            if (copy_from_user(&ethcmd, useraddr, sizeof(ethcmd)))
>>>>>>> @@ -2645,10 +2646,18 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>>>>>>>                  return -EPERM;
>>>>>>>          }
>>>>>>>      +    if (dev->dev.parent)
>>>>>>> +        pm_runtime_get_sync(dev->dev.parent);
>>>>>>
>>>>>> the PM Runtime should allow to wake up parent when child is resumed if everything is configured properly.
>>>>>>
>>>>> Not sure if there's any case yet where the netdev-embedded device is power-managed.
>>>>> Typically only the parent (e.g. a PCI device) is.
>>>>>
>>>>>> rpm_resume()
>>>>>> ...
>>>>>>        if (!parent && dev->parent) {
>>>>>>     --> here
>>>>>>
>>>>> Currently we don't get that far because we will bail out here already:
>>>>>
>>>>> else if (dev->power.disable_depth > 0)
>>>>>           retval = -EACCES;
>>>>>
>>>>> If netdev-embedded device isn't power-managed then disable_depth is 1.
>>>>
>>>> Right. But if pm_runtime_enable() is added for ndev->dev then PM runtime will start working for it
>>>> and should handle parent properly - from my experience, every time any code need manipulate with "parent" or
>>>> smth. else to make PM runtime working it means smth. is wrong.
>>>>
>>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>>> index f6197774048b..33b72b788aa2 100644
>>>> --- a/net/core/net-sysfs.c
>>>> +++ b/net/core/net-sysfs.c
>>>> @@ -1963,6 +1963,7 @@ int netdev_register_kobject(struct net_device *ndev)
>>>>           }
>>>>             pm_runtime_set_memalloc_noio(dev, true);
>>>> +       pm_runtime_enable(dev);
>>>>             return error;
>>>>    }
>>>>
>>>>
>>>>>
>>>>>> So, hence PM runtime calls are moved to from drivers to net_core wouldn't be more correct approach to
>>>>>> enable PM runtime for netdev->dev and lets PM runtime do the job?
>>>>>>
>>>>> Where would netdev->dev be runtime-resumed so that netif_device_present() passes?
>>>>
>>>> That's the biggest issues here. Some driver uses netif_device_detach() in PM runtime and, this way, introduces custom dependency
>>>> between Core device PM (runtime) sate and Net core, other driver does not do.
>>>> Does it means every driver with PM runtime now have to be updated to indicate it PM state to Net core with netif_device_detach()?
>>>
>>> No, that's not needed.
>>>
>>>> Why? Why return value from pm_runtime_get calls is not enough?
>>>>
>>>> Believe me it's terrible idea to introduce custom PM state dependency between PM runtime and Net core,
>>>> for example it took years to sync properly System wide suspend and PM runtime which are separate framworks.
>>>>
>>>> By the way netif_device_detach() during System Wide suspend is looks perfectly valid, because entering
>>>> System wide Suspend should prohibit any access to netdev at some stage. And that's what 99% of network drivers are doing
>>>> (actually I can find only ./realtek/r8169_main.c which abuse netif_device_detach() function and,
>>>> I assume, it is your case)
>>>>
>>> Actually I was inspired by the Intel drivers, see e.g. __igc_shutdown(). They also detach the
>>> netdevice on runtime suspend. One reason is that several core functions check for device
>>> presence before e.g. calling a ndo callback. Example: dev_set_mtu_ext()
>>
>> right and also:
>> - netlink - which you've hacked already
>> - 8021q: vlan_dev_ioctl/vlan_dev_neigh_setup/vlan_add_rx_filter_info/vlan_kill_rx_filter_info
>>
>>
>>> Same applies for __dev_set_rx_mode(). Therefore I wondered whether cpsw_ndo_set_rx_mode()
>>> - that does not include runtime-resuming the device - may be called when device is
>>> runtime-suspended, e.g. if interface is up, but link is down.
>>
>> CPSW doesn't manage PM runtime in link status handler, as it has only on/off state and off state can cause full
>> context loss restore of which is expensive and hard to implement. And for most of netdev drivers no aggressive PM runtime
>> is implemented exactly because of that (mac/vlan/fdb/mdb/...). Common patterns:
>>
>> (a)
>> .probe
>>   -get
>> .remove
>>   -put
>>
>> (b)
>> .probe
>>   -get
>>   -put
>> .open
>>   -get
>> .close
>>   -put
>> .protect places which may be called when netif is down
>>
>> The CPSW follows (b) and so cpsw_ndo_set_rx_mode() can't be called when when device is
>> runtime-suspended.
>>
>> I assume, some hw like PCI, can have more PM states and in some of them keep HW context intact.
>>
> Exactly, there's no reason to keep PCI in D0 if link is down. Once NIC detects a cable was
> plugged in it triggers a PCI PME and PCI core sets PCI bus from D3cold/D3hot to D0 and
> runtime-resumes device.
> 
>>
>>>
>>>>> Wouldn't we then need RPM ops for the parent (e.g. PCI) and for netdev->dev?
>>>>
>>>> No. as I know -  netdev->dev can be declared as pm_runtime_no_callbacks(&adap->dev);
>>>> I2C adapter might be a good example to check.
>>>>
>>>>> E.g. the parent runtime-resume can be triggered by a PCI PME, then it would
>>>>> have to resume netdev->dev.
>>>>>
>>>>>> But, to be honest, I'm not sure adding PM runtime manipulation to the net core is a good idea -
>>>>>
>>>>> The TI CPSW driver runtime-resumes the device in begin ethtool op and suspends
>>>>> it in complete. This pattern is used in more than one driver and may be worth
>>>>> being moved to the core.
>>>>
>>>> I'm not against code refactoring and optimization, but in my opinion it has to be done right from the beginning or
>>>> not done at all.
>>>>
>>>>>
>>>>>> at minimum it might be tricky and required very careful approach (especially in err path).
>>>>>> For example, even in this patch you do not check return value of pm_runtime_get_sync() and in
>>>>>> commit bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") also actualy.
>>>>>
>>>>> The pm_runtime_get_sync() calls are attempts here. We don't want to bail out if a device
>>>>> doesn't support RPM.
>>>>
>>>> And if 'parent' is not supporting PM runtime - it, as i see, should be handled by PM runtime core properly.
>>>>
>>>> I agree that checking the return code could make sense, but then we would
>>>>> have to be careful which error codes we consider as failed.
>>>>
>>>> huh. you can't 'try' pm_runtime_get_sync() and then align on netif_device_present() :(
>>>>
>>>> might be, some how, it will work for r8169_main, but will not work for others.
>>>> - no checking pm_runtime_get_sync() err code will cause PM runtime 'usage_count' leak
>>>
>>> No. pm_runtime_get_sync() always bumps the usage count, no matter whether it fails or not.
>>
>>
>>> This makes it easy to deal with this. The problem you describe exists with
>>> pm_runtime_resume_and_get(). That's why I wondered whether we should annotate this
>>> function as __must_check. See here:
>>> https://lore.kernel.org/linux-pm/CAJZ5v0gps0C2923VqM8876npvhcETsyN+ajAkBKX5kf49J0+Mg@mail.gmail.com/T/#t
>>>
>>>> - no checking pm_runtime_get_sync() err may cause to continue( for TI CPSW for example) with
>>>>     device in undefined PM state ("disabled" or "half-enabled") and so crash later.
>>>>
>>> I'd say 95% of rpm callers don't check the return value. I'm not saying this is a good thing,
>>> but obviously it doesn't cause relevant harm.
>>
>> this is completely wrong assumption as PM errors cause silent stuck, undefined behavior or dumps (sometimes delayed)
>> which is terribly hard to root cause.
>>
>> yes. many drivers do not check, but over last few years more and more strict policies applied to avoid that and
>> in many case no checking return code - is red flag and patch reject.
>> Don't like that phrase ;), but "It doesn't mean that incorrect code has to be copy-pasted all over the places"
>>
>> this is correct get pattern for get:
>>      ret = pm_runtime_get_sync(&pdev->dev);
>>      if (ret < 0) {
>>          pm_runtime_put_noidle(&pdev->dev);
>>          return ret;
>>      }
>>
> That's exactly what pm_runtime_resume_and_get() does. IIRC this helper hasn't been
> part of the API from the beginning and was added later.
> 
>> My strong opinion
>>   - PM runtime return code must be checked.
>>   - get rid of netif_device_detach() in r8169
>>
>> by the way, have you tried below test with your driver (not sure how it works for you):
>>
>> .rtl_open
>>   - pm_runtime_get_sync
>>   - pm_runtime_put_sync - usage_count == 0
>> .r8169_phylink_handler
>>   - pm_request_resume - why async? still usage_count == 0
> 
> pm_request_resume() is only meant to cancel a potentially scheduled runtime-suspend
> if link has a short drop. In such a case link would be up again after ~ 3-4s,
> timeout for runtime-suspending device after link drop is 10s.
> 
>> .some ethtool request to go through dev_ethtool()
>>   - pm_runtime_get_sync
>>   - pm_runtime_put - async, usage_count == 0
>>     ^ would not it put r8169 in runtime-suspended state while link is still UP?
>>   
> No, see rtl8169_runtime_idle(). If link is up no runtime suspend is scheduled.

really :) This one

static int rtl8169_runtime_idle(struct device *device)
{
	struct rtl8169_private *tp = dev_get_drvdata(device);

	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
		pm_schedule_suspend(device, 10000);

	return -EBUSY;
}

Sry, but you really need to take pause and rump up on PM runtime, with all do respect.

Pay attention on PM runtime autosupend, so you can do properly get_sync on link up and use autosuspend on Link down.

Hope PM people can comment here also.

-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-08-05 20:01 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-01 10:35 [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
2021-08-01 10:36 ` [PATCH net-next 1/4] ethtool: runtime-resume netdev parent before ethtool ioctl ops Heiner Kallweit
2021-08-03 20:41   ` Grygorii Strashko
2021-08-03 21:32     ` Heiner Kallweit
2021-08-04  8:43       ` Grygorii Strashko
2021-08-04 19:33         ` Heiner Kallweit
2021-08-05  8:20           ` Grygorii Strashko
2021-08-05 11:11             ` Joakim Zhang
2021-08-05 11:58               ` Grygorii Strashko
2021-08-05 19:24             ` Heiner Kallweit
2021-08-05 20:00               ` Grygorii Strashko
2021-08-01 10:37 ` [PATCH net-next 2/4] ethtool: move implementation of ethnl_ops_begin/complete to netlink.c Heiner Kallweit
2021-08-01 10:40 ` [PATCH net-next 3/4] ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin Heiner Kallweit
2021-08-01 10:41 ` [PATCH net-next 4/4] ethtool: runtime-resume netdev parent in ethnl_ops_begin Heiner Kallweit
2021-08-05 11:51   ` Julian Wiedmann
2021-08-05 18:48     ` Heiner Kallweit
2021-08-01 16:25 ` [PATCH net-next 0/4] ethtool: runtime-resume netdev parent before ethtool ops Heiner Kallweit
2021-08-02 14:15   ` Jakub Kicinski
2021-08-02 16:42     ` Heiner Kallweit
2021-08-02 16:54       ` Jakub Kicinski
2021-08-02 19:00         ` Heiner Kallweit
2021-08-03 12:00 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).