netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
@ 2019-07-16  8:16 Benjamin Poirier
  2019-07-16 19:41 ` David Miller
  2019-07-17  4:23 ` Firo Yang
  0 siblings, 2 replies; 7+ messages in thread
From: Benjamin Poirier @ 2019-07-16  8:16 UTC (permalink / raw)
  To: David Miller
  Cc: Sathya Perla, Ajit Khaparde, Sriharsha Basavapatna,
	Somnath Kotur, Firo Yang, Saeed Mahameed, netdev

While changing the number of interrupt channels, be2net stops adapter
operation (including netif_tx_disable()) but it doesn't signal that it
cannot transmit. This may lead dev_watchdog() to falsely trigger during
that time.

Add the missing call to netif_carrier_off(), following the pattern used in
many other drivers. netif_carrier_on() is already taken care of in
be_open().

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 82015c8a5ed7..b7a246b33599 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4697,8 +4697,12 @@ int be_update_queues(struct be_adapter *adapter)
 	struct net_device *netdev = adapter->netdev;
 	int status;
 
-	if (netif_running(netdev))
+	if (netif_running(netdev)) {
+		/* device cannot transmit now, avoid dev_watchdog timeouts */
+		netif_carrier_off(netdev);
+
 		be_close(netdev);
+	}
 
 	be_cancel_worker(adapter);
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-16  8:16 [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration Benjamin Poirier
@ 2019-07-16 19:41 ` David Miller
  2019-07-17  4:23 ` Firo Yang
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2019-07-16 19:41 UTC (permalink / raw)
  To: bpoirier
  Cc: sathya.perla, ajit.khaparde, sriharsha.basavapatna,
	somnath.kotur, fyang, saeedm, netdev

From: Benjamin Poirier <bpoirier@suse.com>
Date: Tue, 16 Jul 2019 17:16:55 +0900

> While changing the number of interrupt channels, be2net stops adapter
> operation (including netif_tx_disable()) but it doesn't signal that it
> cannot transmit. This may lead dev_watchdog() to falsely trigger during
> that time.
> 
> Add the missing call to netif_carrier_off(), following the pattern used in
> many other drivers. netif_carrier_on() is already taken care of in
> be_open().
> 
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>

Applied.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-16  8:16 [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration Benjamin Poirier
  2019-07-16 19:41 ` David Miller
@ 2019-07-17  4:23 ` Firo Yang
  2019-07-17  8:23   ` Benjamin Poirier
  1 sibling, 1 reply; 7+ messages in thread
From: Firo Yang @ 2019-07-17  4:23 UTC (permalink / raw)
  To: Benjamin Poirier, David Miller
  Cc: Ajit Khaparde, Sathya Perla, Somnath Kotur,
	Sriharsha Basavapatna, Saeed Mahameed, netdev

I think there is a problem if dev_watchdog() is triggered before netif_carrier_off(). dev_watchdog() might call ->ndo_tx_timeout(), i.e. be_tx_timeout(), if txq timeout  happens. Thus be_tx_timeout() could still be able to access the memory which is being freed by be_update_queues().

Thanks,
Firo

________________________________________
From: Benjamin Poirier
Sent: Tuesday, July 16, 2019 4:16 PM
To: David Miller
Cc: Ajit Khaparde; Sathya Perla; Somnath Kotur; Sriharsha Basavapatna; Saeed Mahameed; Firo Yang; netdev@vger.kernel.org
Subject: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration

While changing the number of interrupt channels, be2net stops adapter
operation (including netif_tx_disable()) but it doesn't signal that it
cannot transmit. This may lead dev_watchdog() to falsely trigger during
that time.

Add the missing call to netif_carrier_off(), following the pattern used in
many other drivers. netif_carrier_on() is already taken care of in
be_open().

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 82015c8a5ed7..b7a246b33599 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4697,8 +4697,12 @@ int be_update_queues(struct be_adapter *adapter)
        struct net_device *netdev = adapter->netdev;
        int status;

-       if (netif_running(netdev))
+       if (netif_running(netdev)) {
+               /* device cannot transmit now, avoid dev_watchdog timeouts */
+               netif_carrier_off(netdev);
+
                be_close(netdev);
+       }

        be_cancel_worker(adapter);

--
2.22.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-17  4:23 ` Firo Yang
@ 2019-07-17  8:23   ` Benjamin Poirier
  2019-07-17  8:56     ` Firo Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Poirier @ 2019-07-17  8:23 UTC (permalink / raw)
  To: Firo Yang
  Cc: David Miller, Ajit Khaparde, Sathya Perla, Somnath Kotur,
	Sriharsha Basavapatna, Saeed Mahameed, netdev

On 2019/07/17 13:23, Firo Yang wrote:
> I think there is a problem if dev_watchdog() is triggered before netif_carrier_off(). dev_watchdog() might call ->ndo_tx_timeout(), i.e. be_tx_timeout(), if txq timeout  happens. Thus be_tx_timeout() could still be able to access the memory which is being freed by be_update_queues().

Good point. That's a separate problem which would occur in case of real
tx timeout. How about this followup change:

--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4698,8 +4698,13 @@ int be_update_queues(struct be_adapter *adapter)
 	int status;
 
 	if (netif_running(netdev)) {
+		/* be_tx_timeout() must not run concurrently with this
+		 * function, synchronize with an already-running dev_watchdog
+		 */
+		netif_tx_lock_bh(netdev);
 		/* device cannot transmit now, avoid dev_watchdog timeouts */
 		netif_carrier_off(netdev);
+		netif_tx_unlock_bh(netdev);
 
 		be_close(netdev);
 	}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-17  8:23   ` Benjamin Poirier
@ 2019-07-17  8:56     ` Firo Yang
  2019-07-17  9:32       ` Benjamin Poirier
  0 siblings, 1 reply; 7+ messages in thread
From: Firo Yang @ 2019-07-17  8:56 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: Ajit Khaparde, Sathya Perla, Somnath Kotur,
	Sriharsha Basavapatna, David Miller, Saeed Mahameed, netdev

I don't think this change could fix this problem because if SMP, dev_watchdog() could run on a different CPU.

Thanks,
Firo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-17  8:56     ` Firo Yang
@ 2019-07-17  9:32       ` Benjamin Poirier
  2019-07-17 10:25         ` Firo Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Poirier @ 2019-07-17  9:32 UTC (permalink / raw)
  To: Firo Yang
  Cc: Ajit Khaparde, Sathya Perla, Somnath Kotur,
	Sriharsha Basavapatna, David Miller, Saeed Mahameed, netdev

On 2019/07/17 17:56, Firo Yang wrote:
> I don't think this change could fix this problem because if SMP, dev_watchdog() could run on a different CPU.

hmm, SMP is clearly part of the picture here. The change I proposed
revolves around the synchronization offered by dev->tx_global_lock:

we have
\ dev_watchdog
	\ netif_tx_lock
		spin_lock(&dev->tx_global_lock);
	...
	\ netif_tx_unlock

and

\ be_update_queues
	\ netif_tx_lock_bh
		\ netif_tx_lock
			spin_lock(&dev->tx_global_lock);

Makes sense?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
  2019-07-17  9:32       ` Benjamin Poirier
@ 2019-07-17 10:25         ` Firo Yang
  0 siblings, 0 replies; 7+ messages in thread
From: Firo Yang @ 2019-07-17 10:25 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: Ajit Khaparde, Sathya Perla, Somnath Kotur,
	Sriharsha Basavapatna, David Miller, Saeed Mahameed, netdev

Crystal clear. Many thanks.

// Firo

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-07-17 10:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-16  8:16 [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration Benjamin Poirier
2019-07-16 19:41 ` David Miller
2019-07-17  4:23 ` Firo Yang
2019-07-17  8:23   ` Benjamin Poirier
2019-07-17  8:56     ` Firo Yang
2019-07-17  9:32       ` Benjamin Poirier
2019-07-17 10:25         ` Firo Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).