linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock()
@ 2020-02-05  8:16 Kai-Heng Feng
  2020-02-05  8:16 ` [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show() Kai-Heng Feng
  2020-02-07  4:11 ` [Intel-wired-lan] [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Dan Carpenter
  0 siblings, 2 replies; 7+ messages in thread
From: Kai-Heng Feng @ 2020-02-05  8:16 UTC (permalink / raw)
  To: davem, mkubecek, jeffrey.t.kirsher
  Cc: Kai-Heng Feng, David S. Miller,
	moderated list:INTEL ETHERNET DRIVERS,
	open list:NETWORKING DRIVERS, open list

Commit 9474933caf21 ("igb: close/suspend race in netif_device_detach")
fixed race condition between close and power management ops by using
rtnl_lock().

However we can achieve the same by using device_lock() since all power
management ops are protected by device_lock().

This fix is a preparation for next patch, to prevent a dead lock under
rtnl_lock() when calling runtime resume routine.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v2:
 - No change.

 drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b46bff8fe056..3750e2b926b1 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -4026,8 +4026,13 @@ static int __igb_close(struct net_device *netdev, bool suspending)
 
 int igb_close(struct net_device *netdev)
 {
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	struct device *dev = &adapter->pdev->dev;
+
+	device_lock(dev);
 	if (netif_device_present(netdev) || netdev->dismantle)
 		return __igb_close(netdev, false);
+	device_unlock(dev);
 	return 0;
 }
 
@@ -8760,7 +8765,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	u32 wufc = runtime ? E1000_WUFC_LNKC : adapter->wol;
 	bool wake;
 
-	rtnl_lock();
 	netif_device_detach(netdev);
 
 	if (netif_running(netdev))
@@ -8769,7 +8773,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	igb_ptp_suspend(adapter);
 
 	igb_clear_interrupt_scheme(adapter);
-	rtnl_unlock();
 
 	status = rd32(E1000_STATUS);
 	if (status & E1000_STATUS_LU)
@@ -8897,13 +8900,11 @@ static int __maybe_unused igb_resume(struct device *dev)
 
 	wr32(E1000_WUS, ~0);
 
-	rtnl_lock();
 	if (!err && netif_running(netdev))
 		err = __igb_open(netdev, true);
 
 	if (!err)
 		netif_device_attach(netdev);
-	rtnl_unlock();
 
 	return err;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show()
  2020-02-05  8:16 [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Kai-Heng Feng
@ 2020-02-05  8:16 ` Kai-Heng Feng
  2020-02-05  9:06   ` Andy Shevchenko
  2020-02-07  4:11 ` [Intel-wired-lan] [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Dan Carpenter
  1 sibling, 1 reply; 7+ messages in thread
From: Kai-Heng Feng @ 2020-02-05  8:16 UTC (permalink / raw)
  To: davem, mkubecek, jeffrey.t.kirsher
  Cc: Kai-Heng Feng, David S. Miller, Jakub Kicinski, Jiri Pirko,
	Pablo Neira Ayuso, Maxime Chevallier, Heiner Kallweit,
	Andy Shevchenko, Florian Fainelli, Greg Kroah-Hartman,
	Ido Schimmel, Jouni Hogander, YueHaibing, Eric Dumazet, Wang Hai,
	Thomas Gleixner, Li RongQing, open list,
	open list:NETWORKING [GENERAL]

Device like igb gets runtime suspended when there's no link partner. We
can't get correct speed under that state:
$ cat /sys/class/net/enp3s0/speed
1000

In addition to that, an error can also be spotted in dmesg:
[  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost

It's because the igb device doesn't get runtime resumed before calling
get_link_ksettings().

So let's use a new helper to call begin() and complete() like what
dev_ethtool() does, to runtime resume/suspend or power up/down the
device properly.

Once this fix is in place, igb can show the speed correctly without link
partner:
$ cat /sys/class/net/enp3s0/speed
-1

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v2:
 - Add a new helper with begin/complete and use it in net-sysfs.

 include/linux/ethtool.h |  4 ++++
 net/core/net-sysfs.c    |  4 ++--
 net/ethtool/ioctl.c     | 33 ++++++++++++++++++++++++++++++++-
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 95991e4300bf..785ec1921417 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -160,6 +160,10 @@ extern int
 __ethtool_get_link_ksettings(struct net_device *dev,
 			     struct ethtool_link_ksettings *link_ksettings);
 
+extern int
+__ethtool_get_link_ksettings_full(struct net_device *dev,
+				  struct ethtool_link_ksettings *link_ksettings);
+
 /**
  * ethtool_intersect_link_masks - Given two link masks, AND them together
  * @dst: first mask and where result is stored
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c826b8bf9b1..a199e15a080f 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -201,7 +201,7 @@ static ssize_t speed_show(struct device *dev,
 	if (netif_running(netdev)) {
 		struct ethtool_link_ksettings cmd;
 
-		if (!__ethtool_get_link_ksettings(netdev, &cmd))
+		if (!__ethtool_get_link_ksettings_full(netdev, &cmd))
 			ret = sprintf(buf, fmt_dec, cmd.base.speed);
 	}
 	rtnl_unlock();
@@ -221,7 +221,7 @@ static ssize_t duplex_show(struct device *dev,
 	if (netif_running(netdev)) {
 		struct ethtool_link_ksettings cmd;
 
-		if (!__ethtool_get_link_ksettings(netdev, &cmd)) {
+		if (!__ethtool_get_link_ksettings_full(netdev, &cmd)) {
 			const char *duplex;
 
 			switch (cmd.base.duplex) {
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index b987052d91ef..faeba247c1fb 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -420,7 +420,9 @@ struct ethtool_link_usettings {
 	} link_modes;
 };
 
-/* Internal kernel helper to query a device ethtool_link_settings. */
+/* Internal kernel helper to query a device ethtool_link_settings. To be called
+ * inside begin/complete block.
+ */
 int __ethtool_get_link_ksettings(struct net_device *dev,
 				 struct ethtool_link_ksettings *link_ksettings)
 {
@@ -434,6 +436,35 @@ int __ethtool_get_link_ksettings(struct net_device *dev,
 }
 EXPORT_SYMBOL(__ethtool_get_link_ksettings);
 
+/* Internal kernel helper to query a device ethtool_link_settings. To be called
+ * outside of begin/complete block.
+ */
+int __ethtool_get_link_ksettings_full(struct net_device *dev,
+				      struct ethtool_link_ksettings *link_ksettings)
+{
+	int rc;
+
+	ASSERT_RTNL();
+
+	if (!dev->ethtool_ops->get_link_ksettings)
+		return -EOPNOTSUPP;
+
+	if (dev->ethtool_ops->begin) {
+		rc = dev->ethtool_ops->begin(dev);
+		if (rc  < 0)
+			return rc;
+	}
+
+	memset(link_ksettings, 0, sizeof(*link_ksettings));
+	rc = dev->ethtool_ops->get_link_ksettings(dev, link_ksettings);
+
+	if (dev->ethtool_ops->complete)
+		dev->ethtool_ops->complete(dev);
+
+	return rc;
+}
+EXPORT_SYMBOL(__ethtool_get_link_ksettings_full);
+
 /* convert ethtool_link_usettings in user space to a kernel internal
  * ethtool_link_ksettings. return 0 on success, errno on error.
  */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show()
  2020-02-05  8:16 ` [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show() Kai-Heng Feng
@ 2020-02-05  9:06   ` Andy Shevchenko
  2020-02-05  9:19     ` Stephen Hemminger
  2020-02-05  9:23     ` Michal Kubecek
  0 siblings, 2 replies; 7+ messages in thread
From: Andy Shevchenko @ 2020-02-05  9:06 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: davem, mkubecek, jeffrey.t.kirsher, David S. Miller,
	Jakub Kicinski, Jiri Pirko, Pablo Neira Ayuso, Maxime Chevallier,
	Heiner Kallweit, Florian Fainelli, Greg Kroah-Hartman,
	Ido Schimmel, Jouni Hogander, YueHaibing, Eric Dumazet, Wang Hai,
	Thomas Gleixner, Li RongQing, open list,
	open list:NETWORKING [GENERAL]

On Wed, Feb 05, 2020 at 04:16:16PM +0800, Kai-Heng Feng wrote:
> Device like igb gets runtime suspended when there's no link partner. We
> can't get correct speed under that state:
> $ cat /sys/class/net/enp3s0/speed
> 1000
> 
> In addition to that, an error can also be spotted in dmesg:
> [  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost
> 
> It's because the igb device doesn't get runtime resumed before calling
> get_link_ksettings().
> 
> So let's use a new helper to call begin() and complete() like what
> dev_ethtool() does, to runtime resume/suspend or power up/down the
> device properly.
> 
> Once this fix is in place, igb can show the speed correctly without link
> partner:
> $ cat /sys/class/net/enp3s0/speed
> -1

What is the meaning of -1? Does it tells us "Hey, something is bad in hardware
I can't tell you the speed" or does it imply anything else?

Wouldn't be better to report 0?

Where is the documentation part of this ABI change?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show()
  2020-02-05  9:06   ` Andy Shevchenko
@ 2020-02-05  9:19     ` Stephen Hemminger
  2020-02-05  9:23     ` Michal Kubecek
  1 sibling, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2020-02-05  9:19 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Kai-Heng Feng, davem, mkubecek, jeffrey.t.kirsher,
	David S. Miller, Jakub Kicinski, Jiri Pirko, Pablo Neira Ayuso,
	Maxime Chevallier, Heiner Kallweit, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, Jouni Hogander, YueHaibing,
	Eric Dumazet, Wang Hai, Thomas Gleixner, Li RongQing, open list,
	open list:NETWORKING [GENERAL]

On Wed, 5 Feb 2020 11:06:38 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Wed, Feb 05, 2020 at 04:16:16PM +0800, Kai-Heng Feng wrote:
> > Device like igb gets runtime suspended when there's no link partner. We
> > can't get correct speed under that state:
> > $ cat /sys/class/net/enp3s0/speed
> > 1000
> > 
> > In addition to that, an error can also be spotted in dmesg:
> > [  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost
> > 
> > It's because the igb device doesn't get runtime resumed before calling
> > get_link_ksettings().
> > 
> > So let's use a new helper to call begin() and complete() like what
> > dev_ethtool() does, to runtime resume/suspend or power up/down the
> > device properly.
> > 
> > Once this fix is in place, igb can show the speed correctly without link
> > partner:
> > $ cat /sys/class/net/enp3s0/speed
> > -1  
> 
> What is the meaning of -1? Does it tells us "Hey, something is bad in hardware
> I can't tell you the speed" or does it imply anything else?
> 
> Wouldn't be better to report 0?
> 
> Where is the documentation part of this ABI change?
> 

The speed value of -1 is defined in ethtool.h as:

#define SPEED_UNKNOWN		-1




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show()
  2020-02-05  9:06   ` Andy Shevchenko
  2020-02-05  9:19     ` Stephen Hemminger
@ 2020-02-05  9:23     ` Michal Kubecek
  2020-02-05  9:33       ` Andy Shevchenko
  1 sibling, 1 reply; 7+ messages in thread
From: Michal Kubecek @ 2020-02-05  9:23 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Kai-Heng Feng, davem, jeffrey.t.kirsher, David S. Miller,
	Jakub Kicinski, Jiri Pirko, Pablo Neira Ayuso, Maxime Chevallier,
	Heiner Kallweit, Florian Fainelli, Greg Kroah-Hartman,
	Ido Schimmel, Jouni Hogander, YueHaibing, Eric Dumazet, Wang Hai,
	Thomas Gleixner, Li RongQing, open list,
	open list:NETWORKING [GENERAL]

On Wed, Feb 05, 2020 at 11:06:38AM +0200, Andy Shevchenko wrote:
> On Wed, Feb 05, 2020 at 04:16:16PM +0800, Kai-Heng Feng wrote:
> > Device like igb gets runtime suspended when there's no link partner. We
> > can't get correct speed under that state:
> > $ cat /sys/class/net/enp3s0/speed
> > 1000
> > 
> > In addition to that, an error can also be spotted in dmesg:
> > [  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost
> > 
> > It's because the igb device doesn't get runtime resumed before calling
> > get_link_ksettings().
> > 
> > So let's use a new helper to call begin() and complete() like what
> > dev_ethtool() does, to runtime resume/suspend or power up/down the
> > device properly.
> > 
> > Once this fix is in place, igb can show the speed correctly without link
> > partner:
> > $ cat /sys/class/net/enp3s0/speed
> > -1
> 
> What is the meaning of -1? Does it tells us "Hey, something is bad in hardware
> I can't tell you the speed" or does it imply anything else?

It's SPEED_UNKNOWN constant printed with "%d" template.

> Wouldn't be better to report 0?
> 
> Where is the documentation part of this ABI change?

It's not an ABI change, /sys/class/net/*/speed already shows -1 when the
device reports SPEED_UNKNOWN. The only change is that after this patch,
igb driver reports SPEED_UNKNOWN rather than an outdated value if there
is no link.

Michal

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show()
  2020-02-05  9:23     ` Michal Kubecek
@ 2020-02-05  9:33       ` Andy Shevchenko
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Shevchenko @ 2020-02-05  9:33 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: Kai-Heng Feng, davem, jeffrey.t.kirsher, David S. Miller,
	Jakub Kicinski, Jiri Pirko, Pablo Neira Ayuso, Maxime Chevallier,
	Heiner Kallweit, Florian Fainelli, Greg Kroah-Hartman,
	Ido Schimmel, Jouni Hogander, YueHaibing, Eric Dumazet, Wang Hai,
	Thomas Gleixner, Li RongQing, open list,
	open list:NETWORKING [GENERAL]

On Wed, Feb 05, 2020 at 10:23:45AM +0100, Michal Kubecek wrote:
> On Wed, Feb 05, 2020 at 11:06:38AM +0200, Andy Shevchenko wrote:
> > On Wed, Feb 05, 2020 at 04:16:16PM +0800, Kai-Heng Feng wrote:
> > > Device like igb gets runtime suspended when there's no link partner. We
> > > can't get correct speed under that state:
> > > $ cat /sys/class/net/enp3s0/speed
> > > 1000
> > > 
> > > In addition to that, an error can also be spotted in dmesg:
> > > [  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost
> > > 
> > > It's because the igb device doesn't get runtime resumed before calling
> > > get_link_ksettings().
> > > 
> > > So let's use a new helper to call begin() and complete() like what
> > > dev_ethtool() does, to runtime resume/suspend or power up/down the
> > > device properly.
> > > 
> > > Once this fix is in place, igb can show the speed correctly without link
> > > partner:
> > > $ cat /sys/class/net/enp3s0/speed
> > > -1
> > 
> > What is the meaning of -1? Does it tells us "Hey, something is bad in hardware
> > I can't tell you the speed" or does it imply anything else?
> 
> It's SPEED_UNKNOWN constant printed with "%d" template.
> 
> > Wouldn't be better to report 0?
> > 
> > Where is the documentation part of this ABI change?
> 
> It's not an ABI change, /sys/class/net/*/speed already shows -1 when the
> device reports SPEED_UNKNOWN. The only change is that after this patch,
> igb driver reports SPEED_UNKNOWN rather than an outdated value if there
> is no link.

Thanks for elaboration.
Perhaps add a couple of words to the commit message that there is no ABI
changes.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock()
  2020-02-05  8:16 [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Kai-Heng Feng
  2020-02-05  8:16 ` [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show() Kai-Heng Feng
@ 2020-02-07  4:11 ` Dan Carpenter
  1 sibling, 0 replies; 7+ messages in thread
From: Dan Carpenter @ 2020-02-07  4:11 UTC (permalink / raw)
  To: kbuild, Kai-Heng Feng
  Cc: kbuild-all, davem, mkubecek, jeffrey.t.kirsher,
	open list:NETWORKING DRIVERS, Kai-Heng Feng,
	moderated list:INTEL ETHERNET DRIVERS, David S. Miller,
	open list

Hi Kai-Heng,

url:    https://github.com/0day-ci/linux/commits/Kai-Heng-Feng/igb-Use-device_lock-insead-of-rtnl_lock/20200205-174208
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git dev-queue

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
drivers/net/ethernet/intel/igb/igb_main.c:4036 igb_close() warn: inconsistent returns 'dev->mutex'.

# https://github.com/0day-ci/linux/commit/905877ae7d44efc1d9c5c1ae9f4489f56bcb13a6
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 905877ae7d44efc1d9c5c1ae9f4489f56bcb13a6
vim +4036 drivers/net/ethernet/intel/igb/igb_main.c

9d5c824399dea8 drivers/net/igb/igb_main.c                Auke Kok       2008-01-24  4026  
46eafa59e18d03 drivers/net/ethernet/intel/igb/igb_main.c Stefan Assmann 2016-02-03  4027  int igb_close(struct net_device *netdev)
749ab2cd127046 drivers/net/ethernet/intel/igb/igb_main.c Yan, Zheng     2012-01-04  4028  {
905877ae7d44ef drivers/net/ethernet/intel/igb/igb_main.c Kai-Heng Feng  2020-02-05  4029  	struct igb_adapter *adapter = netdev_priv(netdev);
905877ae7d44ef drivers/net/ethernet/intel/igb/igb_main.c Kai-Heng Feng  2020-02-05  4030  	struct device *dev = &adapter->pdev->dev;
905877ae7d44ef drivers/net/ethernet/intel/igb/igb_main.c Kai-Heng Feng  2020-02-05  4031  
905877ae7d44ef drivers/net/ethernet/intel/igb/igb_main.c Kai-Heng Feng  2020-02-05  4032  	device_lock(dev);
888f22931478a0 drivers/net/ethernet/intel/igb/igb_main.c Lyude Paul     2017-12-12  4033  	if (netif_device_present(netdev) || netdev->dismantle)
749ab2cd127046 drivers/net/ethernet/intel/igb/igb_main.c Yan, Zheng     2012-01-04  4034  		return __igb_close(netdev, false);
                                                                                                        ^^^^^^
Lock held for this return.

905877ae7d44ef drivers/net/ethernet/intel/igb/igb_main.c Kai-Heng Feng  2020-02-05  4035  	device_unlock(dev);
9474933caf21a4 drivers/net/ethernet/intel/igb/igb_main.c Todd Fujinaka  2016-11-15 @4036  	return 0;
749ab2cd127046 drivers/net/ethernet/intel/igb/igb_main.c Yan, Zheng     2012-01-04  4037  }
749ab2cd127046 drivers/net/ethernet/intel/igb/igb_main.c Yan, Zheng     2012-01-04  4038  

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-02-07  4:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-05  8:16 [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Kai-Heng Feng
2020-02-05  8:16 ` [PATCH v2 2/2] net-sysfs: Ensure begin/complete are called in speed_show() and duplex_show() Kai-Heng Feng
2020-02-05  9:06   ` Andy Shevchenko
2020-02-05  9:19     ` Stephen Hemminger
2020-02-05  9:23     ` Michal Kubecek
2020-02-05  9:33       ` Andy Shevchenko
2020-02-07  4:11 ` [Intel-wired-lan] [PATCH v2 1/2] igb: Use device_lock() insead of rtnl_lock() Dan Carpenter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).