* [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20
@ 2016-07-20 22:23 Jeff Kirsher
  2016-07-20 22:23 ` [net-next 01/20] fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Jeff Kirsher
                   ` (20 more replies)
  0 siblings, 21 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene, guru.anbalagane

This series contains updates to fm10k only.

Ngai-Mint provides a fix to clear PCIE_GMBX bits to ensure the proper
functioning of the mailbox global interrupt after a data path reset.

Jake provides most of the patches in the series, starting with an early
return from fm10k_down() if we are already down, to prevent conflicts
with other threads.  Fixes an issue where fm10k_update_stats() could
cause a null pointer dereference, specifically if it is called while we
are going down and the rings have been removed.  Cleans up and fixes the
data path reset flow, the Tx hang routine and stop_hw().  Reworks
fm10k_reinit() to be more maintainable and fixes several inconsistencies
in the work flow.  Implements fm10k_prepare_suspend() and
fm10k_handle_resume(), which wrap the now-existing
fm10k_prepare_for_reset() and fm10k_handle_reset().  The new functions
also handle stopping the service task, which the original re-init flow
does not need.  Fixes an issue where, if an FLR occurs, VF devices are
knocked out of bus master mode and the driver is unable to recover from
the reset properly, by ensuring bus master is enabled after every reset.
Fixes an issue where a reset occurs for no apparent reason every few
minutes until the switch manager software is loaded; this is caused by
continuously requesting the lport map, so only make the request after
we have verified that the switch mailbox is tx_ready.

The following are changes since commit c0d661ca3701f38d9313ca3954e3741ad92e3aaa:
  Merge branch 'mlxsw-per-prio-tc-counters'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE

Jacob Keller (19):
  fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set
  fm10k: avoid possible null pointer dereference in fm10k_update_stats
  fm10k: prevent multiple threads updating statistics
  fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING
  fm10k: perform data path reset even when switch is not ready
  fm10k: use actual hardware registers when checking for pending Tx
  fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING
  fm10k: wait for queues to drain if stop_hw() fails once
  fm10k: split fm10k_reinit into two functions
  fm10k: implement prepare_suspend and handle_resume
  fm10k: use common reset flow when handling io errors from PCI stack
  fm10k: implement reset_notify handler for PCIe FLR events
  fm10k: use common flow for suspend and resume
  fm10k: enable bus master after every reset
  fm10k: check if PCIe link is restored
  fm10k: implement request_lport_map pointer
  fm10k: force link to remain down for at least a second on resume
    events
  fm10k: return proper error code when pci_enable_msix_range fails
  fm10k: bump version number

Ngai-Mint Kwan (1):
  fm10k: Reset mailbox global interrupts

 drivers/net/ethernet/intel/fm10k/fm10k.h         |   2 +
 drivers/net/ethernet/intel/fm10k/fm10k_common.c  |   6 +-
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |   2 +
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    |  14 +-
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h     |   2 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     | 322 +++++++++++++----------
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c      |  38 ++-
 drivers/net/ethernet/intel/fm10k/fm10k_type.h    |   2 +
 drivers/net/ethernet/intel/fm10k/fm10k_vf.c      |  12 +-
 9 files changed, 233 insertions(+), 167 deletions(-)

-- 
2.5.5

* [net-next 01/20] fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 02/20] fm10k: avoid possible null pointer dereference in fm10k_update_stats Jeff Kirsher
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Return early from fm10k_down() when we are already down, since that
means another thread has either already finished or has started going
down, and we shouldn't conflict with it.
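
As an illustration only (not part of the patch), the early-return idea can
be sketched in userspace C with C11 atomics. The names demo_intfc,
DEMO_DOWN_BIT, demo_test_and_set_bit() and demo_down() are hypothetical
stand-ins for the driver's interface struct, __FM10K_DOWN,
test_and_set_bit() and fm10k_down(); the sketch assumes the kernel helper
behaves like an atomic fetch-or that reports the previous bit value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __FM10K_DOWN bit index. */
enum { DEMO_DOWN_BIT = 0 };

struct demo_intfc {
	atomic_ulong state;	/* bit flags, like interface->state */
	int teardown_runs;	/* how many times teardown work really ran */
};

/* Atomically set bit nr and report whether it was already set. */
static bool demo_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Only the caller that flips the bit from 0 to 1 does the teardown. */
static void demo_down(struct demo_intfc *intfc)
{
	if (demo_test_and_set_bit(DEMO_DOWN_BIT, &intfc->state))
		return;

	/* single-threaded demo, so a plain counter is enough here */
	intfc->teardown_runs++;
}
```

Calling demo_down() twice on the same demo_intfc runs the teardown body
once; the second call sees the bit already set and returns.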

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index e05aca9..610c313 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1601,7 +1601,8 @@ void fm10k_down(struct fm10k_intfc *interface)
 	int err;
 
 	/* signal that we are down to the interrupt handler and service task */
-	set_bit(__FM10K_DOWN, &interface->state);
+	if (test_and_set_bit(__FM10K_DOWN, &interface->state))
+		return;
 
 	/* call carrier off first to avoid false dev_watchdog timeouts */
 	netif_carrier_off(netdev);
-- 
2.5.5

* [net-next 02/20] fm10k: avoid possible null pointer dereference in fm10k_update_stats
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
  2016-07-20 22:23 ` [net-next 01/20] fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 03/20] fm10k: prevent multiple threads updating statistics Jeff Kirsher
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

It's currently possible for fm10k_update_stats to be called during the
window when we go down and the rings are removed. This can result in
a null pointer dereference. In fm10k_get_stats64 we work around this by
using ACCESS_ONCE and a null pointer check inside the loop. Use this
same flow in the fm10k_update_stats to avoid the potential null pointer.
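
A minimal userspace sketch of the load-once-then-check pattern, for
illustration only. demo_ring, demo_ring_set and demo_sum_tx_packets() are
hypothetical names, and the volatile cast is a simplified stand-in for the
kernel's READ_ONCE(), not its actual definition. Note that this pattern
only guards against an empty slot; on its own it does not stop a ring from
being freed after the pointer has been loaded.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RINGS 4

struct demo_ring {
	uint64_t packets;
};

struct demo_ring_set {
	struct demo_ring *tx_ring[DEMO_MAX_RINGS];
	int num_tx_queues;
};

/* Sum per-ring packet counts, skipping slots whose ring is gone. */
static uint64_t demo_sum_tx_packets(struct demo_ring_set *set)
{
	uint64_t pkts = 0;
	int i;

	for (i = 0; i < set->num_tx_queues; i++) {
		/* load the pointer exactly once, then test that same value */
		struct demo_ring *ring =
			*(struct demo_ring * volatile *)&set->tx_ring[i];

		if (!ring)
			continue;

		pkts += ring->packets;
	}

	return pkts;
}
```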

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 610c313..be0b7de 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -377,7 +377,10 @@ void fm10k_update_stats(struct fm10k_intfc *interface)
 
 	/* gather some stats to the interface struct that are per queue */
 	for (bytes = 0, pkts = 0, i = 0; i < interface->num_tx_queues; i++) {
-		struct fm10k_ring *tx_ring = interface->tx_ring[i];
+		struct fm10k_ring *tx_ring = READ_ONCE(interface->tx_ring[i]);
+
+		if (!tx_ring)
+			continue;
 
 		restart_queue += tx_ring->tx_stats.restart_queue;
 		tx_busy += tx_ring->tx_stats.tx_busy;
@@ -396,7 +399,10 @@ void fm10k_update_stats(struct fm10k_intfc *interface)
 
 	/* gather some stats to the interface struct that are per queue */
 	for (bytes = 0, pkts = 0, i = 0; i < interface->num_rx_queues; i++) {
-		struct fm10k_ring *rx_ring = interface->rx_ring[i];
+		struct fm10k_ring *rx_ring = READ_ONCE(interface->rx_ring[i]);
+
+		if (!rx_ring)
+			continue;
 
 		bytes += rx_ring->stats.bytes;
 		pkts += rx_ring->stats.packets;
-- 
2.5.5

* [net-next 03/20] fm10k: prevent multiple threads updating statistics
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
  2016-07-20 22:23 ` [net-next 01/20] fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Jeff Kirsher
  2016-07-20 22:23 ` [net-next 02/20] fm10k: avoid possible null pointer dereference in fm10k_update_stats Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 04/20] fm10k: Reset mailbox global interrupts Jeff Kirsher
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Also prevent updating stats while the interface is down. If we're
already updating stats, just return doing nothing. When we take the
device down, block stat updates until we come back up. This ensures that
we avoid tearing down rings when we're updating statistics, and prevents
updating statistics until we're up.

We can't re-use the __FM10K_DOWN for this because it wouldn't prevent
multiple threads from accessing statistics. Neither does it prevent the
case where we start updating stats and then start going down in another
thread.

fm10k_get_stats64 is exempt from this, because it has a completely
different flow which does not suffer from the same issues that
fm10k_update_stats might.
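
The flag protocol described above can be sketched in userspace C as
follows. This is an illustration under assumptions, not the driver code:
demo_stats_intfc, DEMO_UPDATING_STATS and the demo_* functions are
hypothetical, and the busy-wait in demo_stats_down() stands in for the
usleep_range() loop the driver uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __FM10K_UPDATING_STATS bit index. */
enum { DEMO_UPDATING_STATS = 0 };

struct demo_stats_intfc {
	atomic_ulong state;
	unsigned long updates;	/* completed stat updates */
};

static bool stats_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

static void stats_clear_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Returns true if this call performed the update, false if it backed off. */
static bool demo_update_stats(struct demo_stats_intfc *intfc)
{
	if (stats_test_and_set_bit(DEMO_UPDATING_STATS, &intfc->state))
		return false;

	intfc->updates++;	/* only the bit owner touches the counters */
	stats_clear_bit(DEMO_UPDATING_STATS, &intfc->state);
	return true;
}

/* Going down: wait until we own the bit, then leave it set. */
static void demo_stats_down(struct demo_stats_intfc *intfc)
{
	while (stats_test_and_set_bit(DEMO_UPDATING_STATS, &intfc->state))
		;	/* the driver sleeps between attempts */
}

/* Coming up: release the bit so updates may run again. */
static void demo_stats_up(struct demo_stats_intfc *intfc)
{
	stats_clear_bit(DEMO_UPDATING_STATS, &intfc->state);
}
```

The same bit acts as a try-lock for updaters and as a held lock while the
interface is down, which is why a separate flag from the down bit is used.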

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h     |  1 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index e98b86b..c8d0817 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -362,6 +362,7 @@ enum fm10k_state_t {
 	__FM10K_SERVICE_DISABLE,
 	__FM10K_MBX_LOCK,
 	__FM10K_LINK_DOWN,
+	__FM10K_UPDATING_STATS,
 };
 
 static inline void fm10k_mbx_lock(struct fm10k_intfc *interface)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index be0b7de..4439376 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -372,6 +372,10 @@ void fm10k_update_stats(struct fm10k_intfc *interface)
 	u64 bytes, pkts;
 	int i;
 
+	/* ensure only one thread updates stats at a time */
+	if (test_and_set_bit(__FM10K_UPDATING_STATS, &interface->state))
+		return;
+
 	/* do not allow stats update via service task for next second */
 	interface->next_stats_update = jiffies + HZ;
 
@@ -449,6 +453,8 @@ void fm10k_update_stats(struct fm10k_intfc *interface)
 	/* Fill out the OS statistics structure */
 	net_stats->rx_errors = rx_errors;
 	net_stats->rx_dropped = interface->stats.nodesc_drop.count;
+
+	clear_bit(__FM10K_UPDATING_STATS, &interface->state);
 }
 
 /**
@@ -1572,6 +1578,9 @@ void fm10k_up(struct fm10k_intfc *interface)
 	/* configure interrupts */
 	hw->mac.ops.update_int_moderator(hw);
 
+	/* enable statistics capture again */
+	clear_bit(__FM10K_UPDATING_STATS, &interface->state);
+
 	/* clear down bit to indicate we are ready to go */
 	clear_bit(__FM10K_DOWN, &interface->state);
 
@@ -1629,6 +1638,10 @@ void fm10k_down(struct fm10k_intfc *interface)
 	/* capture stats one last time before stopping interface */
 	fm10k_update_stats(interface);
 
+	/* prevent updating statistics while we're down */
+	while (test_and_set_bit(__FM10K_UPDATING_STATS, &interface->state))
+		usleep_range(1000, 2000);
+
 	/* Disable DMA engine for Tx/Rx */
 	err = hw->mac.ops.stop_hw(hw);
 	if (err)
@@ -1757,6 +1770,7 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 
 	/* Start off interface as being down */
 	set_bit(__FM10K_DOWN, &interface->state);
+	set_bit(__FM10K_UPDATING_STATS, &interface->state);
 
 	return 0;
 }
-- 
2.5.5

* [net-next 04/20] fm10k: Reset mailbox global interrupts
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 03/20] fm10k: prevent multiple threads updating statistics Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 05/20] fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Ngai-Mint Kwan, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jacob Keller, Jeff Kirsher

From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>

When a data path reset is initiated, write control of PCIE_GMBX is
taken away from the switch manager. The switch manager writes to this
register to clear mailbox global interrupt bits as part of its mailbox
interrupt handling routine. If these bits are still set when the device
recovers from the data path reset, future mailbox global interrupts
will not be triggered. Upon confirming that the device has exited from
a data path reset, clear these bits to ensure the proper functioning of
the mailbox global interrupt.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h | 2 ++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c  | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
index b7dbc8a..35c1dba 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
@@ -41,6 +41,8 @@ struct fm10k_mbx_info;
 #define FM10K_MBX_ACK_INTERRUPT			0x00000010
 #define FM10K_MBX_INTERRUPT_ENABLE		0x00000020
 #define FM10K_MBX_INTERRUPT_DISABLE		0x00000040
+#define FM10K_MBX_GLOBAL_REQ_INTERRUPT		0x00000200
+#define FM10K_MBX_GLOBAL_ACK_INTERRUPT		0x00000400
 #define FM10K_MBICR(_n)		((_n) + 0x18840)
 #define FM10K_GMBX		0x18842
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index dc75507..69e2c82 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -77,6 +77,10 @@ static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
 	if (!(reg & FM10K_IP_NOTINRESET))
 		err = FM10K_ERR_RESET_FAILED;
 
+	/* Reset mailbox global interrupts */
+	reg = FM10K_MBX_GLOBAL_REQ_INTERRUPT | FM10K_MBX_GLOBAL_ACK_INTERRUPT;
+	fm10k_write_reg(hw, FM10K_GMBX, reg);
+
 out:
 	return err;
 }
-- 
2.5.5

* [net-next 05/20] fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 04/20] fm10k: Reset mailbox global interrupts Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 06/20] fm10k: perform data path reset even when switch is not ready Jeff Kirsher
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Don't report FM10K_ERR_REQUESTS_PENDING when we fail to disable queues
within the timeout. This can occur due to a hardware Tx hang, or when
the switch ethernet fabric is resetting while we are transmitting
traffic. It can sometimes take up to 500ms before the Tx DMA engine
gives up. Instead, just skip the DMA engine check and perform
a data-path reset anyway. Add a statistic counter to keep track of the
number of resets occurring while we have pending DMA on the rings.

In order to prevent having to re-assign err to 0, re-order the
last few items of the reset_hw_pf function so that we don't perform
"return err" at the end.
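
The resulting control flow can be sketched in userspace C as below. This
is an illustration only: the DEMO_* error codes, struct demo_hw and
demo_reset_hw() are hypothetical, and the simulated fields replace the
real register reads and writes.

```c
#include <stdint.h>

/* Hypothetical error codes standing in for the FM10K_ERR_* values. */
enum {
	DEMO_ERR_REQUESTS_PENDING = -1,
	DEMO_ERR_OTHER = -2,
	DEMO_ERR_RESET_FAILED = -3,
};

struct demo_hw {
	int disable_result;		/* simulated result of disabling queues */
	int stuck_in_reset;		/* nonzero: device never leaves reset */
	uint64_t reset_while_pending;	/* resets issued with DMA pending */
	int resets_issued;
};

static int demo_reset_hw(struct demo_hw *hw)
{
	int err = hw->disable_result;

	if (err == DEMO_ERR_REQUESTS_PENDING)
		hw->reset_while_pending++;	/* count it, but keep going */
	else if (err)
		return err;			/* any other failure aborts */

	/* issue the reset (simulated) */
	hw->resets_issued++;

	/* verify we made it out of reset; each exit returns explicitly */
	if (hw->stuck_in_reset)
		return DEMO_ERR_RESET_FAILED;

	return 0;
}
```

Because every exit returns its own value, a pending-requests result from
the first step never leaks out as the function's return code.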

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |  2 ++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c      | 24 ++++++++++++++----------
 drivers/net/ethernet/intel/fm10k/fm10k_type.h    |  1 +
 drivers/net/ethernet/intel/fm10k/fm10k_vf.c      | 12 +++++++-----
 4 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index 9b51954..c04cbe9 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -76,6 +76,8 @@ static const struct fm10k_stats fm10k_gstrings_global_stats[] = {
 	FM10K_STAT("mac_rules_used", hw.swapi.mac.used),
 	FM10K_STAT("mac_rules_avail", hw.swapi.mac.avail),
 
+	FM10K_STAT("reset_while_pending", hw.mac.reset_while_pending),
+
 	FM10K_STAT("tx_hang_count", tx_timeout_count),
 };
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 69e2c82..7fbd94b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -51,8 +51,12 @@ static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
 
 	/* shut down all rings */
 	err = fm10k_disable_queues_generic(hw, FM10K_MAX_QUEUES);
-	if (err)
+	if (err == FM10K_ERR_REQUESTS_PENDING) {
+		hw->mac.reset_while_pending++;
+		goto force_reset;
+	} else if (err) {
 		return err;
+	}
 
 	/* Verify that DMA is no longer active */
 	reg = fm10k_read_reg(hw, FM10K_DMA_CTRL);
@@ -62,27 +66,27 @@ static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
 	/* verify the switch is ready for reset */
 	reg = fm10k_read_reg(hw, FM10K_DMA_CTRL2);
 	if (!(reg & FM10K_DMA_CTRL2_SWITCH_READY))
-		goto out;
+		return FM10K_ERR_DMA_PENDING;
 
+force_reset:
 	/* Inititate data path reset */
-	reg |= FM10K_DMA_CTRL_DATAPATH_RESET;
+	reg = FM10K_DMA_CTRL_DATAPATH_RESET;
 	fm10k_write_reg(hw, FM10K_DMA_CTRL, reg);
 
 	/* Flush write and allow 100us for reset to complete */
 	fm10k_write_flush(hw);
 	udelay(FM10K_RESET_TIMEOUT);
 
-	/* Verify we made it out of reset */
-	reg = fm10k_read_reg(hw, FM10K_IP);
-	if (!(reg & FM10K_IP_NOTINRESET))
-		err = FM10K_ERR_RESET_FAILED;
-
 	/* Reset mailbox global interrupts */
 	reg = FM10K_MBX_GLOBAL_REQ_INTERRUPT | FM10K_MBX_GLOBAL_ACK_INTERRUPT;
 	fm10k_write_reg(hw, FM10K_GMBX, reg);
 
-out:
-	return err;
+	/* Verify we made it out of reset */
+	reg = fm10k_read_reg(hw, FM10K_IP);
+	if (!(reg & FM10K_IP_NOTINRESET))
+		return FM10K_ERR_RESET_FAILED;
+
+	return 0;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index b8bc061..1d65ad8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -562,6 +562,7 @@ struct fm10k_mac_info {
 	bool tx_ready;
 	u32 dglort_map;
 	u8 itr_scale;
+	u64 reset_while_pending;
 };
 
 struct fm10k_swapi_table_info {
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_vf.c b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
index 3b06685e..337ba65 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
@@ -34,7 +34,7 @@ static s32 fm10k_stop_hw_vf(struct fm10k_hw *hw)
 
 	/* we need to disable the queues before taking further steps */
 	err = fm10k_stop_hw_generic(hw);
-	if (err)
+	if (err && err != FM10K_ERR_REQUESTS_PENDING)
 		return err;
 
 	/* If permanent address is set then we need to restore it */
@@ -67,7 +67,7 @@ static s32 fm10k_stop_hw_vf(struct fm10k_hw *hw)
 		fm10k_write_reg(hw, FM10K_TDLEN(i), tdlen);
 	}
 
-	return 0;
+	return err;
 }
 
 /**
@@ -83,7 +83,9 @@ static s32 fm10k_reset_hw_vf(struct fm10k_hw *hw)
 
 	/* shut down queues we own and reset DMA configuration */
 	err = fm10k_stop_hw_vf(hw);
-	if (err)
+	if (err == FM10K_ERR_REQUESTS_PENDING)
+		hw->mac.reset_while_pending++;
+	else if (err)
 		return err;
 
 	/* Inititate VF reset */
@@ -96,9 +98,9 @@ static s32 fm10k_reset_hw_vf(struct fm10k_hw *hw)
 	/* Clear reset bit and verify it was cleared */
 	fm10k_write_reg(hw, FM10K_VFCTRL, 0);
 	if (fm10k_read_reg(hw, FM10K_VFCTRL) & FM10K_VFCTRL_RST)
-		err = FM10K_ERR_RESET_FAILED;
+		return FM10K_ERR_RESET_FAILED;
 
-	return err;
+	return 0;
 }
 
 /**
-- 
2.5.5

* [net-next 06/20] fm10k: perform data path reset even when switch is not ready
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 05/20] fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 07/20] fm10k: use actual hardware registers when checking for pending Tx Jeff Kirsher
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

A while ago, an additional check for the switch being ready was added to
reset_hw. A recent refactor accidentally made this check return an error
code on failure which caused fm10k_probe to fail when the switch wasn't
brought up first. The original reasoning for the check was to prevent
additional data path reset when the fabric wasn't ready yet. However,
there isn't a compelling reason to keep the check, as the data path
reset will restore hardware to a known good state. Remove the check and
perform the data path reset regardless of the switch manager state.

An alternative fix is to return FM10K_SUCCESS instead, and bypass the
actual data path reset. This should be fine as we will perform
a reset_hw once the switch is active. However, since data path reset
will reset many parts of the hardware it seems better to just perform
the reset regardless of switch state.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 7fbd94b..23f3566 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -63,11 +63,6 @@ static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
 	if (reg & (FM10K_DMA_CTRL_TX_ACTIVE | FM10K_DMA_CTRL_RX_ACTIVE))
 		return FM10K_ERR_DMA_PENDING;
 
-	/* verify the switch is ready for reset */
-	reg = fm10k_read_reg(hw, FM10K_DMA_CTRL2);
-	if (!(reg & FM10K_DMA_CTRL2_SWITCH_READY))
-		return FM10K_ERR_DMA_PENDING;
-
 force_reset:
 	/* Inititate data path reset */
 	reg = FM10K_DMA_CTRL_DATAPATH_RESET;
-- 
2.5.5

* [net-next 07/20] fm10k: use actual hardware registers when checking for pending Tx
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 06/20] fm10k: perform data path reset even when switch is not ready Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 08/20] fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index a9ccc1e..c6a4645 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1130,9 +1130,11 @@ static u64 fm10k_get_tx_completed(struct fm10k_ring *ring)
 
 static u64 fm10k_get_tx_pending(struct fm10k_ring *ring)
 {
-	/* use SW head and tail until we have real hardware */
-	u32 head = ring->next_to_clean;
-	u32 tail = ring->next_to_use;
+	struct fm10k_intfc *interface = ring->q_vector->interface;
+	struct fm10k_hw *hw = &interface->hw;
+
+	u32 head = fm10k_read_reg(hw, FM10K_TDH(ring->reg_idx));
+	u32 tail = fm10k_read_reg(hw, FM10K_TDT(ring->reg_idx));
 
 	return ((head <= tail) ? tail : tail + ring->count) - head;
 }
-- 
2.5.5

* [net-next 08/20] fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (6 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 07/20] fm10k: use actual hardware registers when checking for pending Tx Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 09/20] fm10k: wait for queues to drain if stop_hw() fails once Jeff Kirsher
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

When the stop_hw() routine fails with FM10K_ERR_REQUESTS_PENDING, this
indicates that the Tx or Rx queues did not shut down within the time
limit. Print a more suitable message at the dev_info level instead of
dev_err.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 4439376..4dfd128 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1644,7 +1644,10 @@ void fm10k_down(struct fm10k_intfc *interface)
 
 	/* Disable DMA engine for Tx/Rx */
 	err = hw->mac.ops.stop_hw(hw);
-	if (err)
+	if (err == FM10K_ERR_REQUESTS_PENDING)
+		dev_info(&interface->pdev->dev,
+			 "due to pending requests hw was not shut down gracefully\n");
+	else if (err)
 		dev_err(&interface->pdev->dev, "stop_hw failed: %d\n", err);
 
 	/* free any buffers still on the rings */
-- 
2.5.5

* [net-next 09/20] fm10k: wait for queues to drain if stop_hw() fails once
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (7 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 08/20] fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 10/20] fm10k: split fm10k_reinit into two functions Jeff Kirsher
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

It turns out that sometimes during a reset the Tx queues will be
temporarily stuck longer than .stop_hw() expects. Work around this issue
by attempting to .stop_hw() first. If it fails, poll for a bounded number
of attempts until the Tx queues appear to be drained. After this, attempt
stop_hw() again. This ensures that we avoid waiting if we don't need to,
such as during the first initialization of a VF, and give the proper
amount of time necessary to recover from most situations. It is possible
that the hardware is actually stuck. For PFs, this is usually fixed by
a datapath reset. Unfortunately the VF cannot request a similar reset
for itself.
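
The bounded drain wait can be sketched in userspace C as follows, for
illustration only. struct demo_queues, demo_tx_pending() and
demo_wait_for_drain() are hypothetical; the per-queue countdown simulates
hardware progress in place of reading the head and tail registers, and the
sleep between attempts is omitted.

```c
#include <stdbool.h>

#define DEMO_DRAIN_RETRIES 25
#define DEMO_MAX_QUEUES 4

struct demo_queues {
	int num_tx_queues;
	int pending[DEMO_MAX_QUEUES];	/* polls left until queue reads empty */
};

/* Simulated pending check: each poll lets the queue make some progress. */
static bool demo_tx_pending(struct demo_queues *q, int i)
{
	if (q->pending[i] > 0) {
		q->pending[i]--;
		return true;
	}
	return false;
}

/*
 * Returns the number of retries used. A result equal to
 * DEMO_DRAIN_RETRIES means the queues never drained.
 */
static int demo_wait_for_drain(struct demo_queues *q)
{
	int count, i = 0;

	for (count = 0; count < DEMO_DRAIN_RETRIES; count++) {
		/* the driver sleeps 10-20ms here before each check */

		/* resume at the last queue that still had pending Tx */
		for (; i < q->num_tx_queues; i++)
			if (demo_tx_pending(q, i))
				break;

		/* every queue drained, stop waiting */
		if (i == q->num_tx_queues)
			break;
	}

	return count;
}
```

Keeping i across retries means queues already seen empty are not polled
again, so each retry only rechecks from the first queue that was busy.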

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h      |  1 +
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |  2 +-
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c  | 44 +++++++++++++++++++++++----
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index c8d0817..c4cf08d 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -458,6 +458,7 @@ __be16 fm10k_tx_encap_offload(struct sk_buff *skb);
 netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
 				  struct fm10k_ring *tx_ring);
 void fm10k_tx_timeout_reset(struct fm10k_intfc *interface);
+u64 fm10k_get_tx_pending(struct fm10k_ring *ring);
 bool fm10k_check_tx_hang(struct fm10k_ring *tx_ring);
 void fm10k_alloc_rx_buffers(struct fm10k_ring *rx_ring, u16 cleaned_count);
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index c6a4645..c85fc9894 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1128,7 +1128,7 @@ static u64 fm10k_get_tx_completed(struct fm10k_ring *ring)
 	return ring->stats.packets;
 }
 
-static u64 fm10k_get_tx_pending(struct fm10k_ring *ring)
+u64 fm10k_get_tx_pending(struct fm10k_ring *ring)
 {
 	struct fm10k_intfc *interface = ring->q_vector->interface;
 	struct fm10k_hw *hw = &interface->hw;
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 4dfd128..7c9b20c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1613,7 +1613,7 @@ void fm10k_down(struct fm10k_intfc *interface)
 {
 	struct net_device *netdev = interface->netdev;
 	struct fm10k_hw *hw = &interface->hw;
-	int err;
+	int err, i = 0, count = 0;
 
 	/* signal that we are down to the interrupt handler and service task */
 	if (test_and_set_bit(__FM10K_DOWN, &interface->state))
@@ -1629,9 +1629,6 @@ void fm10k_down(struct fm10k_intfc *interface)
 	/* reset Rx filters */
 	fm10k_reset_rx_state(interface);
 
-	/* allow 10ms for device to quiesce */
-	usleep_range(10000, 20000);
-
 	/* disable polling routines */
 	fm10k_napi_disable_all(interface);
 
@@ -1642,11 +1639,46 @@ void fm10k_down(struct fm10k_intfc *interface)
 	while (test_and_set_bit(__FM10K_UPDATING_STATS, &interface->state))
 		usleep_range(1000, 2000);
 
+	/* skip waiting for TX DMA if we lost PCIe link */
+	if (FM10K_REMOVED(hw->hw_addr))
+		goto skip_tx_dma_drain;
+
+	/* In some rare circumstances it can take a while for Tx queues to
+	 * quiesce and be fully disabled. Attempt to .stop_hw() first, and
+	 * then if we get ERR_REQUESTS_PENDING, go ahead and wait in a loop
+	 * until the Tx queues have emptied, or until a number of retries. If
+	 * we fail to clear within the retry loop, we will issue a warning
+	 * indicating that Tx DMA is probably hung. Note this means we call
+	 * .stop_hw() twice but this shouldn't cause any problems.
+	 */
+	err = hw->mac.ops.stop_hw(hw);
+	if (err != FM10K_ERR_REQUESTS_PENDING)
+		goto skip_tx_dma_drain;
+
+#define TX_DMA_DRAIN_RETRIES 25
+	for (count = 0; count < TX_DMA_DRAIN_RETRIES; count++) {
+		usleep_range(10000, 20000);
+
+		/* start checking at the last ring to have pending Tx */
+		for (; i < interface->num_tx_queues; i++)
+			if (fm10k_get_tx_pending(interface->tx_ring[i]))
+				break;
+
+		/* if all the queues are drained, we can break now */
+		if (i == interface->num_tx_queues)
+			break;
+	}
+
+	if (count >= TX_DMA_DRAIN_RETRIES)
+		dev_err(&interface->pdev->dev,
+			"Tx queues failed to drain after %d tries. Tx DMA is probably hung.\n",
+			count);
+skip_tx_dma_drain:
 	/* Disable DMA engine for Tx/Rx */
 	err = hw->mac.ops.stop_hw(hw);
 	if (err == FM10K_ERR_REQUESTS_PENDING)
-		dev_info(&interface->pdev->dev,
-			 "due to pending requests hw was not shut down gracefully\n");
+		dev_err(&interface->pdev->dev,
+			"due to pending requests hw was not shut down gracefully\n");
 	else if (err)
 		dev_err(&interface->pdev->dev, "stop_hw failed: %d\n", err);
 
-- 
2.5.5
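
The Tx drain retry loop that the patch above adds to fm10k_down() can be
sketched in user space as follows. This is only an illustration, not the
driver code: drain_tx_queues(), pending_fn, struct mock_hw, mock_pending()
and mock_stuck() are hypothetical names, and the 10-20 ms sleep between
polls is left out. The point it shows is that the queue index persists
across polls, so queues already seen empty are not checked again.

```c
#include <stddef.h>

#define DRAIN_RETRIES 25	/* mirrors TX_DMA_DRAIN_RETRIES in the patch */

/* Hypothetical callback: number of descriptors still pending on queue q.
 * In the driver this role is played by fm10k_get_tx_pending().
 */
typedef unsigned long (*pending_fn)(void *ctx, size_t q);

/* Returns the number of polls that found pending work, or DRAIN_RETRIES
 * if the queues never drained (the case where the driver logs an error).
 */
static int drain_tx_queues(pending_fn pending, void *ctx, size_t num_queues)
{
	size_t i = 0;
	int count;

	for (count = 0; count < DRAIN_RETRIES; count++) {
		/* the driver sleeps here before each poll; omitted */

		/* resume at the last queue that still had pending Tx */
		for (; i < num_queues; i++)
			if (pending(ctx, i))
				break;

		/* every queue was seen empty, so stop polling */
		if (i == num_queues)
			break;
	}

	return count;
}

/* Mock hardware (assumption): a queue drains one descriptor per poll */
struct mock_hw {
	unsigned long pending[4];
};

static unsigned long mock_pending(void *ctx, size_t q)
{
	struct mock_hw *hw = ctx;
	unsigned long left = hw->pending[q];

	if (left)
		hw->pending[q]--;
	return left;
}

/* Mock of a hung Tx DMA engine: every queue always reports pending work */
static unsigned long mock_stuck(void *ctx, size_t q)
{
	(void)ctx;
	(void)q;
	return 1;
}
```

A caller would treat a return value equal to DRAIN_RETRIES as "Tx DMA is
probably hung", matching the count >= TX_DMA_DRAIN_RETRIES check above.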

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 10/20] fm10k: split fm10k_reinit into two functions
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (8 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 09/20] fm10k: wait for queues to drain if stop_hw() fails once Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 11/20] fm10k: implement prepare_suspend and handle_resume Jeff Kirsher
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

There are several flows in the driver which perform the similar function
of tearing down software and restoring software to recover from certain
errors or PCIe events, including:

  * fm10k_reinit
  * fm10k_suspend/resume
  * fm10k_io_error_detected/fm10k_io_resume

In addition, we want to implement a .reset_notify() handler, which will
perform a similar function.

Rework how the driver codes reset and resume flows by separating out the
reinit logic into two functions "fm10k_prepare_for_reset" and
"fm10k_handle_reset". This first step will allow us to re-use this
functionality in the similar blocks of code instead of re-coding the
same sequence of events slightly differently.

The end result should be more maintainable and correct, fixing several
inconsistencies with the work flow.

The new functions take the rtnl_lock() themselves, which has the
unfortunate side effect of making the reinit flow take, release, and
then retake the rtnl_lock. However, this minor downside is outweighed
by the benefits of code reduction and of reducing needless differences
between these flows.
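
The locking consequence described above can be illustrated with a small
user-space model. Everything here is hypothetical (struct mock_intfc,
mock_lock(), the -5 error value); it only shows that the composed reinit
flow now takes the lock twice and leaves it released on both the success
and the failure path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real fm10k types */
struct mock_intfc {
	int lock_depth;		/* models rtnl_lock() nesting, must end at 0 */
	int lock_cycles;	/* number of times the lock was taken */
	bool resetting;		/* models the "reset in progress" state */
	bool hw_ok;		/* whether the mocked hardware reset succeeds */
};

static void mock_lock(struct mock_intfc *i)
{
	i->lock_depth++;
	i->lock_cycles++;
}

static void mock_unlock(struct mock_intfc *i)
{
	i->lock_depth--;
}

/* First half: tear down software state under the lock */
static void mock_prepare_for_reset(struct mock_intfc *i)
{
	i->resetting = true;
	mock_lock(i);
	/* closing the netdev and freeing interrupts would happen here */
	mock_unlock(i);
}

/* Second half: bring hardware and software back, reporting any error */
static int mock_handle_reset(struct mock_intfc *i)
{
	int err = 0;

	mock_lock(i);
	if (!i->hw_ok)
		err = -5;	/* arbitrary error code for the sketch */
	mock_unlock(i);

	i->resetting = false;
	return err;
}

/* The old single function becomes a thin wrapper around both halves */
static int mock_reinit(struct mock_intfc *i)
{
	mock_prepare_for_reset(i);
	return mock_handle_reset(i);
}
```

Other flows (suspend/resume, PCIe error recovery) can then call the two
halves separately, with whatever work they need in between.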

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 33 +++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 7c9b20c..2963b41 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -136,11 +136,9 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface)
 	rtnl_unlock();
 }
 
-static void fm10k_reinit(struct fm10k_intfc *interface)
+static void fm10k_prepare_for_reset(struct fm10k_intfc *interface)
 {
 	struct net_device *netdev = interface->netdev;
-	struct fm10k_hw *hw = &interface->hw;
-	int err;
 
 	WARN_ON(in_interrupt());
 
@@ -165,6 +163,17 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 	/* delay any future reset requests */
 	interface->last_reset = jiffies + (10 * HZ);
 
+	rtnl_unlock();
+}
+
+static int fm10k_handle_reset(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	rtnl_lock();
+
 	/* reset and initialize the hardware so it is in a known state */
 	err = hw->mac.ops.reset_hw(hw);
 	if (err) {
@@ -185,7 +194,7 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 		goto reinit_err;
 	}
 
-	/* reassociate interrupts */
+	/* re-associate interrupts */
 	err = fm10k_mbx_request_irq(interface);
 	if (err)
 		goto err_mbx_irq;
@@ -219,7 +228,7 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 
 	clear_bit(__FM10K_RESETTING, &interface->state);
 
-	return;
+	return err;
 err_open:
 	fm10k_mbx_free_irq(interface);
 err_mbx_irq:
@@ -230,6 +239,20 @@ reinit_err:
 	rtnl_unlock();
 
 	clear_bit(__FM10K_RESETTING, &interface->state);
+
+	return err;
+}
+
+static void fm10k_reinit(struct fm10k_intfc *interface)
+{
+	int err;
+
+	fm10k_prepare_for_reset(interface);
+
+	err = fm10k_handle_reset(interface);
+	if (err)
+		dev_err(&interface->pdev->dev,
+			"fm10k_handle_reset failed: %d\n", err);
 }
 
 static void fm10k_reset_subtask(struct fm10k_intfc *interface)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 11/20] fm10k: implement prepare_suspend and handle_resume
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (9 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 10/20] fm10k: split fm10k_reinit into two functions Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 12/20] fm10k: use common reset flow when handling io errors from PCI stack Jeff Kirsher
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Implement fm10k_prepare_suspend and fm10k_handle_resume functions which
abstract around the now existing fm10k_prepare_for_reset and
fm10k_handle_reset. The new functions also handle stopping the service
task, which is something that the original re-init flow does not need.

Every other location that does a suspend/resume type flow is expected to
use these functions, because otherwise they may have conflicts with the
running watchdog routines. This also has the effect of preventing
possible surprise remove events during handling of FLR events and PCIe
errors.
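
A minimal user-space model of why the service task is stopped first. All
names are hypothetical (struct mock_intfc, mock_service_task(), and so
on); the model assumes, as the text above describes, that a watchdog
register read from a disabled device would be mistaken for a surprise
remove.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real fm10k types */
struct mock_intfc {
	bool service_disabled;	/* models __FM10K_SERVICE_DISABLE */
	bool pci_enabled;	/* whether register reads would succeed */
	bool detached;		/* set when a surprise remove is "detected" */
	int service_runs;	/* how often the watchdog actually ran */
};

/* Models the watchdog: it touches registers unless it has been disabled */
static void mock_service_task(struct mock_intfc *i)
{
	if (i->service_disabled)
		return;

	i->service_runs++;
	if (!i->pci_enabled)
		i->detached = true;	/* looks like a surprise remove */
}

/* Stop the watchdog before any teardown starts */
static void mock_prepare_suspend(struct mock_intfc *i)
{
	i->service_disabled = true;
	/* the common teardown step would run here */
}

/* Re-enable the watchdog only if the common bring-up step succeeded */
static int mock_handle_resume(struct mock_intfc *i, int reset_err)
{
	if (reset_err)
		return reset_err;	/* watchdog stays disabled */

	i->service_disabled = false;
	return 0;
}
```

Note that in this model, as in the patch, a failed resume leaves the
service task disabled.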

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 38 ++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 2963b41..a6ee046 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2112,6 +2112,44 @@ static void fm10k_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+static void fm10k_prepare_suspend(struct fm10k_intfc *interface)
+{
+	/* the watchdog task reads from registers, which might appear like
+	 * a surprise remove if the PCIe device is disabled while we're
+	 * stopped. We stop the watchdog task until after we resume software
+	 * activity.
+	 */
+	set_bit(__FM10K_SERVICE_DISABLE, &interface->state);
+	cancel_work_sync(&interface->service_task);
+
+	fm10k_prepare_for_reset(interface);
+}
+
+static int fm10k_handle_resume(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	/* reset statistics starting values */
+	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
+
+	err = fm10k_handle_reset(interface);
+	if (err)
+		return err;
+
+	/* assume host is not ready, to prevent race with watchdog in case we
+	 * actually don't have connection to the switch
+	 */
+	interface->host_ready = false;
+	fm10k_watchdog_host_not_ready(interface);
+
+	/* clear the service task disable bit to allow service task to start */
+	clear_bit(__FM10K_SERVICE_DISABLE, &interface->state);
+	fm10k_service_event_schedule(interface);
+
+	return err;
+}
+
 #ifdef CONFIG_PM
 /**
  * fm10k_resume - Restore device to pre-sleep state
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 12/20] fm10k: use common reset flow when handling io errors from PCI stack
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (10 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 11/20] fm10k: implement prepare_suspend and handle_resume Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 13/20] fm10k: implement reset_notify handler for PCIe FLR events Jeff Kirsher
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Now that we have extracted the necessary steps for a split
suspend/resume flow, re-use these functions instead of using the current
open coded flow. This ensures that we don't miss any steps. It also
ensures that we have the correct driver states set.

Since we'll be handling all of the reset flow ourselves, we no longer
need to request a reset in the io_slot_reset() function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 60 ++++------------------------
 1 file changed, 7 insertions(+), 53 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index a6ee046..716a5c8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2312,17 +2312,7 @@ static pci_ers_result_t fm10k_io_error_detected(struct pci_dev *pdev,
 	if (state == pci_channel_io_perm_failure)
 		return PCI_ERS_RESULT_DISCONNECT;
 
-	rtnl_lock();
-
-	if (netif_running(netdev))
-		fm10k_close(netdev);
-
-	fm10k_mbx_free_irq(interface);
-
-	/* free interrupts */
-	fm10k_clear_queueing_scheme(interface);
-
-	rtnl_unlock();
+	fm10k_prepare_suspend(interface);
 
 	/* Request a slot reset. */
 	return PCI_ERS_RESULT_NEED_RESET;
@@ -2336,7 +2326,6 @@ static pci_ers_result_t fm10k_io_error_detected(struct pci_dev *pdev,
  */
 static pci_ers_result_t fm10k_io_slot_reset(struct pci_dev *pdev)
 {
-	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
 	pci_ers_result_t result;
 
 	if (pci_enable_device_mem(pdev)) {
@@ -2354,12 +2343,6 @@ static pci_ers_result_t fm10k_io_slot_reset(struct pci_dev *pdev)
 
 		pci_wake_from_d3(pdev, false);
 
-		/* refresh hw_addr in case it was dropped */
-		interface->hw.hw_addr = interface->uc_addr;
-
-		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
-		fm10k_service_event_schedule(interface);
-
 		result = PCI_ERS_RESULT_RECOVERED;
 	}
 
@@ -2379,44 +2362,15 @@ static void fm10k_io_resume(struct pci_dev *pdev)
 {
 	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
 	struct net_device *netdev = interface->netdev;
-	struct fm10k_hw *hw = &interface->hw;
-	int err = 0;
-
-	/* reset hardware to known state */
-	err = hw->mac.ops.init_hw(&interface->hw);
-	if (err) {
-		dev_err(&pdev->dev, "init_hw failed: %d\n", err);
-		return;
-	}
-
-	/* reset statistics starting values */
-	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
-
-	rtnl_lock();
-
-	err = fm10k_init_queueing_scheme(interface);
-	if (err) {
-		dev_err(&interface->pdev->dev,
-			"init_queueing_scheme failed: %d\n", err);
-		goto unlock;
-	}
-
-	/* reassociate interrupts */
-	fm10k_mbx_request_irq(interface);
-
-	rtnl_lock();
-	if (netif_running(netdev))
-		err = fm10k_open(netdev);
-	rtnl_unlock();
+	int err;
 
-	/* final check of hardware state before registering the interface */
-	err = err ? : fm10k_hw_ready(interface);
+	err = fm10k_handle_resume(interface);
 
-	if (!err)
+	if (err)
+		dev_warn(&pdev->dev,
+			 "fm10k_io_resume failed: %d\n", err);
+	else
 		netif_device_attach(netdev);
-
-unlock:
-	rtnl_unlock();
 }
 
 static const struct pci_error_handlers fm10k_err_handler = {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 13/20] fm10k: implement reset_notify handler for PCIe FLR events
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (11 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 12/20] fm10k: use common reset flow when handling io errors from PCI stack Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 14/20] fm10k: use common flow for suspend and resume Jeff Kirsher
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

When a function level PCI reset is triggered using sysfs, it calls the
driver's .reset_notify error handler. Implement a handler based on the
now split fm10k_prepare_for_reset and fm10k_handle_reset functions, so
that we fully reset the driver when the PCI function level reset occurs.
This also ensures the reset is handled in a clean way by first disabling
all the driver bits and then restoring them after the function reset.
Previously the stack simply performed a blind function reset and our
driver didn't take any part in the process.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 33 ++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 716a5c8..c60c3b0 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2373,10 +2373,43 @@ static void fm10k_io_resume(struct pci_dev *pdev)
 		netif_device_attach(netdev);
 }
 
+/**
+ * fm10k_io_reset_notify - called when PCI function is reset
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the PCI function is reset such as from
+ * /sys/class/net/<enpX>/device/reset or similar. When prepare is true, it
+ * means we should prepare for a function reset. If prepare is false, it means
+ * the function reset just occurred.
+ */
+static void fm10k_io_reset_notify(struct pci_dev *pdev, bool prepare)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	int err = 0;
+
+	if (prepare) {
+		/* warn in case we have any active VF devices */
+		if (pci_num_vf(pdev))
+			dev_warn(&pdev->dev,
+				 "PCIe FLR may cause issues for any active VF devices\n");
+
+		fm10k_prepare_suspend(interface);
+	} else {
+		err = fm10k_handle_resume(interface);
+	}
+
+	if (err) {
+		dev_warn(&pdev->dev,
+			 "fm10k_io_reset_notify failed: %d\n", err);
+		netif_device_detach(interface->netdev);
+	}
+}
+
 static const struct pci_error_handlers fm10k_err_handler = {
 	.error_detected = fm10k_io_error_detected,
 	.slot_reset = fm10k_io_slot_reset,
 	.resume = fm10k_io_resume,
+	.reset_notify = fm10k_io_reset_notify,
 };
 
 static struct pci_driver fm10k_driver = {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 14/20] fm10k: use common flow for suspend and resume
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (12 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 13/20] fm10k: implement reset_notify handler for PCIe FLR events Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 15/20] fm10k: enable bus master after every reset Jeff Kirsher
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Continuing the effort to unify the similar suspend/resume flows, finish
up by using the new fm10k_prepare_suspend and fm10k_handle_resume
functions for the standard suspend/resume flow.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 73 ++--------------------------
 1 file changed, 3 insertions(+), 70 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index c60c3b0..b02361c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2186,60 +2186,13 @@ static int fm10k_resume(struct pci_dev *pdev)
 	/* refresh hw_addr in case it was dropped */
 	hw->hw_addr = interface->uc_addr;
 
-	/* reset hardware to known state */
-	err = hw->mac.ops.init_hw(&interface->hw);
-	if (err) {
-		dev_err(&pdev->dev, "init_hw failed: %d\n", err);
-		return err;
-	}
-
-	/* reset statistics starting values */
-	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
-
-	rtnl_lock();
-
-	err = fm10k_init_queueing_scheme(interface);
-	if (err)
-		goto err_queueing_scheme;
-
-	err = fm10k_mbx_request_irq(interface);
-	if (err)
-		goto err_mbx_irq;
-
-	err = fm10k_hw_ready(interface);
-	if (err)
-		goto err_open;
-
-	err = netif_running(netdev) ? fm10k_open(netdev) : 0;
+	err = fm10k_handle_resume(interface);
 	if (err)
-		goto err_open;
-
-	rtnl_unlock();
-
-	/* assume host is not ready, to prevent race with watchdog in case we
-	 * actually don't have connection to the switch
-	 */
-	interface->host_ready = false;
-	fm10k_watchdog_host_not_ready(interface);
-
-	/* clear the service task disable bit to allow service task to start */
-	clear_bit(__FM10K_SERVICE_DISABLE, &interface->state);
-	fm10k_service_event_schedule(interface);
-
-	/* restore SR-IOV interface */
-	fm10k_iov_resume(pdev);
+		return err;
 
 	netif_device_attach(netdev);
 
 	return 0;
-err_open:
-	fm10k_mbx_free_irq(interface);
-err_mbx_irq:
-	fm10k_clear_queueing_scheme(interface);
-err_queueing_scheme:
-	rtnl_unlock();
-
-	return err;
 }
 
 /**
@@ -2259,27 +2212,7 @@ static int fm10k_suspend(struct pci_dev *pdev,
 
 	netif_device_detach(netdev);
 
-	fm10k_iov_suspend(pdev);
-
-	/* the watchdog tasks may read registers, which will appear like a
-	 * surprise-remove event once the PCI device is disabled. This will
-	 * cause us to close the netdevice, so we don't retain the open/closed
-	 * state post-resume. Prevent this by disabling the service task while
-	 * suspended, until we actually resume.
-	 */
-	set_bit(__FM10K_SERVICE_DISABLE, &interface->state);
-	cancel_work_sync(&interface->service_task);
-
-	rtnl_lock();
-
-	if (netif_running(netdev))
-		fm10k_close(netdev);
-
-	fm10k_mbx_free_irq(interface);
-
-	fm10k_clear_queueing_scheme(interface);
-
-	rtnl_unlock();
+	fm10k_prepare_suspend(interface);
 
 	err = pci_save_state(pdev);
 	if (err)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 15/20] fm10k: enable bus master after every reset
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (13 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 14/20] fm10k: use common flow for suspend and resume Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 16/20] fm10k: check if PCIe link is restored Jeff Kirsher
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

If an FLR occurs, VF devices will be knocked out of bus master mode, and
the driver will be unable to recover from the reset properly, resulting
in malicious driver events and an infinite reset loop. In the normal
case, the bus master mode will already be enabled and this call will
essentially be a no-op. Since we're doing this on every reset, we could
possibly remove the other calls to pci_set_master(), but it seems
harmless to leave them in place.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index b02361c..5e40460 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -174,6 +174,8 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface)
 
 	rtnl_lock();
 
+	pci_set_master(interface->pdev);
+
 	/* reset and initialize the hardware so it is in a known state */
 	err = hw->mac.ops.reset_hw(hw);
 	if (err) {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 16/20] fm10k: check if PCIe link is restored
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (14 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 15/20] fm10k: enable bus master after every reset Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-21 10:51   ` Sergei Shtylyov
  2016-07-20 22:23 ` [net-next 17/20] fm10k: implement request_lport_map pointer Jeff Kirsher
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Sometimes a VF driver will lose access to its PCIe address space, for
example due to a PF FLR event. In fm10k_detach_subtask, poll to check
whether the PCIe register space is active again, and restore the device
when it is.
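
The liveness test relies on the usual convention that a read from an
inaccessible PCIe device returns all ones. A user-space sketch of the
check, with hypothetical names (reg_read_looks_alive(), struct mock_dev,
mock_try_reattach()) and a plain variable standing in for the readl():

```c
#include <stdbool.h>
#include <stdint.h>

/* Any value other than all ones means at least one bit read back as 0,
 * which this sketch takes as evidence that the register space responds.
 */
static bool reg_read_looks_alive(uint32_t value)
{
	return (uint32_t)~value != 0;
}

/* Hypothetical device model; reg0 stands in for the first register */
struct mock_dev {
	uint32_t reg0;
	bool attached;
	bool reset_requested;
};

/* Re-attach and ask for a reset once the register space answers again */
static bool mock_try_reattach(struct mock_dev *d)
{
	if (!reg_read_looks_alive(d->reg0))
		return false;

	d->attached = true;
	d->reset_requested = true;
	return true;
}
```

A register that legitimately holds 0xFFFFFFFF would be misread as "still
gone", so the check only works for a register known not to read all ones
in normal operation.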

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 5e40460..d4ccb2a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -123,11 +123,24 @@ static void fm10k_service_timer(unsigned long data)
 static void fm10k_detach_subtask(struct fm10k_intfc *interface)
 {
 	struct net_device *netdev = interface->netdev;
+	u32 __iomem *hw_addr;
+	u32 value;
 
 	/* do nothing if device is still present or hw_addr is set */
 	if (netif_device_present(netdev) || interface->hw.hw_addr)
 		return;
 
+	/* check the real address space to see if we've recovered */
+	hw_addr = READ_ONCE(interface->uc_addr);
+	value = readl(hw_addr);
+	if ((~value)) {
+		interface->hw.hw_addr = interface->uc_addr;
+		netif_device_attach(netdev);
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+		netdev_warn(netdev, "PCIe link restored, device now attached\n");
+		return;
+	}
+
 	rtnl_lock();
 
 	if (netif_running(netdev))
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 17/20] fm10k: implement request_lport_map pointer
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (15 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 16/20] fm10k: check if PCIe link is restored Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 18/20] fm10k: force link to remain down for at least a second on resume events Jeff Kirsher
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

If the fm10k interface is brought up, but the switch manager software is
not running, the driver will continuously request the lport map every
few seconds in the base driver watchdog routine. After several minutes
the switch mailbox Tx FIFO will fill up and the mailbox will time out,
resulting in a reset. This reset appears to happen for no reason, and
recurs every few minutes until the switch manager software is loaded.

Prevent this from happening by only requesting the lport map after we've
verified the switch mailbox is tx_ready. In order to simplify code logic
and reduce code duplication, implement this as a new function pointer
"mac.ops.request_lport_map" which the VF will not implement. Otherwise,
we have to duplicate the tx_ready check outside of
fm10k_get_host_state_generic, or re-implement most of
fm10k_get_host_state_generic in the pf version.

The resulting code is simpler and easier to understand, and prevents the
PF from continuously requesting lport map and filling the Tx fifo of
a switch mailbox that isn't ready.
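
The optional function pointer pattern can be sketched outside the driver
as follows. The mock_ names are hypothetical and the host state logic is
reduced to two flags; the sketch only shows that the request is sent
after the readiness check, and only when the operation is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_hw;

/* Operations table with one optional entry (NULL when unsupported) */
struct mock_mac_ops {
	int (*request_lport_map)(struct mock_hw *hw);
};

struct mock_hw {
	const struct mock_mac_ops *ops;
	bool tx_ready;		/* models "switch mailbox is tx_ready" */
	bool have_lport_map;	/* models dglort_map != NONE */
	int requests_sent;
};

static int mock_request_lport_map_pf(struct mock_hw *hw)
{
	hw->requests_sent++;
	return 0;
}

/* The PF implements the operation, the VF leaves it unset */
static const struct mock_mac_ops mock_ops_pf = {
	.request_lport_map = mock_request_lport_map_pf,
};

static const struct mock_mac_ops mock_ops_vf = {
	.request_lport_map = NULL,
};

/* Shared host state check used by both PF and VF in this sketch */
static int mock_get_host_state(struct mock_hw *hw, bool *host_ready)
{
	*host_ready = false;

	/* never queue a request into a mailbox that is not ready */
	if (!hw->tx_ready)
		return 0;

	if (!hw->have_lport_map) {
		if (hw->ops->request_lport_map)
			return hw->ops->request_lport_map(hw);
		return 0;
	}

	*host_ready = true;
	return 0;
}
```

Keeping the NULL check in the shared function is what lets the PF and VF
share one code path without duplicating the readiness test.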

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_common.c |  6 +++++-
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c     | 15 +++------------
 drivers/net/ethernet/intel/fm10k/fm10k_type.h   |  1 +
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.c b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
index 5bbf19c..d6baaea 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_common.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
@@ -519,8 +519,12 @@ s32 fm10k_get_host_state_generic(struct fm10k_hw *hw, bool *host_ready)
 		goto out;
 
 	/* interface cannot receive traffic without logical ports */
-	if (mac->dglort_map == FM10K_DGLORTMAP_NONE)
+	if (mac->dglort_map == FM10K_DGLORTMAP_NONE) {
+		if (hw->mac.ops.request_lport_map)
+			ret_val = hw->mac.ops.request_lport_map(hw);
+
 		goto out;
+	}
 
 	/* if we passed all the tests above then the switch is ready and we no
 	 * longer need to check for link
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 23f3566..682299d 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -1622,25 +1622,15 @@ static s32 fm10k_request_lport_map_pf(struct fm10k_hw *hw)
  **/
 static s32 fm10k_get_host_state_pf(struct fm10k_hw *hw, bool *switch_ready)
 {
-	s32 ret_val = 0;
 	u32 dma_ctrl2;
 
 	/* verify the switch is ready for interaction */
 	dma_ctrl2 = fm10k_read_reg(hw, FM10K_DMA_CTRL2);
 	if (!(dma_ctrl2 & FM10K_DMA_CTRL2_SWITCH_READY))
-		goto out;
+		return 0;
 
 	/* retrieve generic host state info */
-	ret_val = fm10k_get_host_state_generic(hw, switch_ready);
-	if (ret_val)
-		goto out;
-
-	/* interface cannot receive traffic without logical ports */
-	if (hw->mac.dglort_map == FM10K_DGLORTMAP_NONE)
-		ret_val = fm10k_request_lport_map_pf(hw);
-
-out:
-	return ret_val;
+	return fm10k_get_host_state_generic(hw, switch_ready);
 }
 
 /* This structure defines the attibutes to be parsed below */
@@ -1816,6 +1806,7 @@ static const struct fm10k_mac_ops mac_ops_pf = {
 	.set_dma_mask		= fm10k_set_dma_mask_pf,
 	.get_fault		= fm10k_get_fault_pf,
 	.get_host_state		= fm10k_get_host_state_pf,
+	.request_lport_map	= fm10k_request_lport_map_pf,
 };
 
 static const struct fm10k_iov_ops iov_ops_pf = {
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index 1d65ad8..f4e75c4 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -526,6 +526,7 @@ struct fm10k_mac_ops {
 	s32 (*stop_hw)(struct fm10k_hw *);
 	s32 (*get_bus_info)(struct fm10k_hw *);
 	s32 (*get_host_state)(struct fm10k_hw *, bool *);
+	s32 (*request_lport_map)(struct fm10k_hw *);
 	s32 (*update_vlan)(struct fm10k_hw *, u32, u8, bool);
 	s32 (*read_mac_addr)(struct fm10k_hw *);
 	s32 (*update_uc_addr)(struct fm10k_hw *, u16, const u8 *,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 18/20] fm10k: force link to remain down for at least a second on resume events
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (16 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 17/20] fm10k: implement request_lport_map pointer Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 19/20] fm10k: return proper error code when pci_enable_msix_range fails Jeff Kirsher
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

When we resume from an AER recovery with many active VFs, the PF sees
many spurious link up and link down events. Prevent this by forcing the
link to remain down for at least one second after the resume event.
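
A user-space sketch of such a hold-off, assuming a free-running 32-bit
tick counter. MOCK_HZ, tick_after(), struct mock_link and the two mock_
functions are hypothetical; tick_after() uses the same signed-difference
idea as the kernel's time_after() so that counter wraparound is handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_HZ 250u	/* assumed ticks per second for this sketch */

/* Wraparound-safe "a is later than b" for a free-running tick counter */
static bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical link state; not the real fm10k structure */
struct mock_link {
	bool forced_down;		/* models __FM10K_LINK_DOWN */
	uint32_t link_down_event;	/* tick at which the hold-off ends */
};

/* On resume, hold the link down until one second from now */
static void mock_on_resume(struct mock_link *l, uint32_t now)
{
	l->link_down_event = now + MOCK_HZ;
	l->forced_down = true;
}

/* Returns true once the link is allowed to be reported up again */
static bool mock_link_may_come_up(struct mock_link *l, uint32_t now)
{
	if (l->forced_down) {
		/* deadline still in the future: keep the link down */
		if (tick_after(l->link_down_event, now))
			return false;

		l->forced_down = false;
	}

	return true;
}
```

Link events that arrive during the hold-off are simply ignored, which is
what suppresses the flutter.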

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index d4ccb2a..b8245c7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2158,6 +2158,10 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
 	interface->host_ready = false;
 	fm10k_watchdog_host_not_ready(interface);
 
+	/* force link to stay down for a second to prevent link flutter */
+	interface->link_down_event = jiffies + (HZ);
+	set_bit(__FM10K_LINK_DOWN, &interface->state);
+
 	/* clear the service task disable bit to allow service task to start */
 	clear_bit(__FM10K_SERVICE_DISABLE, &interface->state);
 	fm10k_service_event_schedule(interface);
-- 
2.5.5
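
The hold-down in the patch above can be modelled in user space. The
following is only a sketch under stated assumptions: the patch shows
where link_down_event and __FM10K_LINK_DOWN are set, but not how the
watchdog consumes them, so link_may_come_up() is a guess at that side.
HZ is fixed at an illustrative value, and tick_after(), struct
link_state, hold_link_down() and link_may_come_up() are hypothetical
names, not driver symbols.

```c
#include <stdbool.h>

#define HZ 250UL	/* assumed tick rate, for illustration only */

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for the two fields the patch touches. */
struct link_state {
	unsigned long link_down_event;	/* tick until which link is held down */
	bool link_down;			/* models the __FM10K_LINK_DOWN bit */
};

/* Mirrors the resume path: hold the link down for one second. */
static void hold_link_down(struct link_state *s, unsigned long now)
{
	s->link_down_event = now + HZ;
	s->link_down = true;
}

/* Assumed watchdog side: refuse link up until the deadline has passed. */
static bool link_may_come_up(struct link_state *s, unsigned long now)
{
	if (s->link_down && !tick_after(now, s->link_down_event))
		return false;

	s->link_down = false;
	return true;
}
```

Comparing through tick_after() rather than with a plain "<" keeps the
check correct when the tick counter wraps, which is why the kernel
provides time_after() and related helpers for jiffies.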

* [net-next 19/20] fm10k: return proper error code when pci_enable_msix_range fails
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (17 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 18/20] fm10k: force link to remain down for at least a second on resume events Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-20 22:23 ` [net-next 20/20] fm10k: bump version number Jeff Kirsher
  2016-07-21  5:05 ` [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 David Miller
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

The pci_enable_msix_range() function returns the number of allocated
vectors, a positive value, if it succeeds. On failure it returns a
negative error code. Return this code instead of a hard-coded -ENOMEM so
that the error message printed by the driver shows the actual error.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index c85fc9894..f72d1ca 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1858,7 +1858,7 @@ static int fm10k_init_msix_capability(struct fm10k_intfc *interface)
 	if (v_budget < 0) {
 		kfree(interface->msix_entries);
 		interface->msix_entries = NULL;
-		return -ENOMEM;
+		return v_budget;
 	}
 
 	/* record the number of queues available for q_vectors */
-- 
2.5.5
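
The effect of the one-line change above can be illustrated outside the
kernel. This is a sketch, not driver code: fake_enable_msix_range() is a
hypothetical stand-in that only imitates the return convention described
in the commit message (vector count on success, negative errno on
failure), and init_msix() is an invented caller.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for pci_enable_msix_range(): returns the number
 * of vectors granted, or a negative errno if fewer than minvec exist.
 */
static int fake_enable_msix_range(int minvec, int maxvec, int available)
{
	if (available < minvec)
		return -ENOSPC;

	return available < maxvec ? available : maxvec;
}

/* Invented caller showing the pattern the patch adopts. */
static int init_msix(int minvec, int maxvec, int available, int *num_vectors)
{
	int v_budget = fake_enable_msix_range(minvec, maxvec, available);

	/* pass the real error up instead of masking it with -ENOMEM */
	if (v_budget < 0)
		return v_budget;

	*num_vectors = v_budget;
	return 0;
}
```

With the old behaviour the caller would see -ENOMEM in both failure
cases, so a log message could not distinguish an allocation failure from
a shortage of vectors.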

* [net-next 20/20] fm10k: bump version number
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (18 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 19/20] fm10k: return proper error code when pci_enable_msix_range fails Jeff Kirsher
@ 2016-07-20 22:23 ` Jeff Kirsher
  2016-07-21  5:05 ` [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 David Miller
  20 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-07-20 22:23 UTC (permalink / raw)
  To: davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene,
	guru.anbalagane, Jeff Kirsher

From: Jacob Keller <jacob.e.keller@intel.com>

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index f72d1ca..e9767b6 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -28,7 +28,7 @@
 
 #include "fm10k.h"
 
-#define DRV_VERSION	"0.19.3-k"
+#define DRV_VERSION	"0.21.2-k"
 #define DRV_SUMMARY	"Intel(R) Ethernet Switch Host Interface Driver"
 const char fm10k_driver_version[] = DRV_VERSION;
 char fm10k_driver_name[] = "fm10k";
-- 
2.5.5

* Re: [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20
  2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
                   ` (19 preceding siblings ...)
  2016-07-20 22:23 ` [net-next 20/20] fm10k: bump version number Jeff Kirsher
@ 2016-07-21  5:05 ` David Miller
  20 siblings, 0 replies; 24+ messages in thread
From: David Miller @ 2016-07-21  5:05 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene, guru.anbalagane

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 20 Jul 2016 15:23:38 -0700

> This series contains updates to fm10k only.

Pulled, thanks Jeff.

* Re: [net-next 16/20] fm10k: check if PCIe link is restored
  2016-07-20 22:23 ` [net-next 16/20] fm10k: check if PCIe link is restored Jeff Kirsher
@ 2016-07-21 10:51   ` Sergei Shtylyov
  2016-07-21 20:13     ` Keller, Jacob E
  0 siblings, 1 reply; 24+ messages in thread
From: Sergei Shtylyov @ 2016-07-21 10:51 UTC (permalink / raw)
  To: Jeff Kirsher, davem
  Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, guru.anbalagane

Hello.

On 7/21/2016 1:23 AM, Jeff Kirsher wrote:

> From: Jacob Keller <jacob.e.keller@intel.com>
>
> Sometimes, a VF driver will lose PCIe address access, such as due to
> a PF FLR event. In fm10k_detach_subtask, poll and check whether the
> PCIe register space is active again and restore the device when it has.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> index 5e40460..d4ccb2a 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> @@ -123,11 +123,24 @@ static void fm10k_service_timer(unsigned long data)
>  static void fm10k_detach_subtask(struct fm10k_intfc *interface)
>  {
>  	struct net_device *netdev = interface->netdev;
> +	u32 __iomem *hw_addr;
> +	u32 value;
>
>  	/* do nothing if device is still present or hw_addr is set */
>  	if (netif_device_present(netdev) || interface->hw.hw_addr)
>  		return;
>
> +	/* check the real address space to see if we've recovered */
> +	hw_addr = READ_ONCE(interface->uc_addr);
> +	value = readl(hw_addr);
> +	if ((~value)) {

    Why these double parens?

[...]

MBR, Sergei

* Re: [net-next 16/20] fm10k: check if PCIe link is restored
  2016-07-21 10:51   ` Sergei Shtylyov
@ 2016-07-21 20:13     ` Keller, Jacob E
  0 siblings, 0 replies; 24+ messages in thread
From: Keller, Jacob E @ 2016-07-21 20:13 UTC (permalink / raw)
  To: davem, Kirsher, Jeffrey T, sergei.shtylyov
  Cc: nhorman, netdev, guru.anbalagane, sassmann, jogreene

On Thu, 2016-07-21 at 13:51 +0300, Sergei Shtylyov wrote:
> Hello.
> 
> On 7/21/2016 1:23 AM, Jeff Kirsher wrote:
> 
> > From: Jacob Keller <jacob.e.keller@intel.com>
> > 
> > Sometimes, a VF driver will lose PCIe address access, such as due
> > to
> > a PF FLR event. In fm10k_detach_subtask, poll and check whether the
> > PCIe register space is active again and restore the device when it
> > has.
> > 
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> > ---
> >  drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> > b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> > index 5e40460..d4ccb2a 100644
> > --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> > +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> > @@ -123,11 +123,24 @@ static void fm10k_service_timer(unsigned long
> > data)
> >  static void fm10k_detach_subtask(struct fm10k_intfc *interface)
> >  {
> >  	struct net_device *netdev = interface->netdev;
> > +	u32 __iomem *hw_addr;
> > +	u32 value;
> > 
> >  	/* do nothing if device is still present or hw_addr is set
> > */
> >  	if (netif_device_present(netdev) || interface->hw.hw_addr)
> >  		return;
> > 
> > +	/* check the real address space to see if we've recovered
> > */
> > +	hw_addr = READ_ONCE(interface->uc_addr);
> > +	value = readl(hw_addr);
> > +	if ((~value)) {
> 
>     Why these double parens?
> 

You're right, it doesn't need them. I think at one point the check was
"!(~value)" in some other portion of the code, and the extra parentheses
likely got copied by mistake.

Thanks,
Jake

> [...]
> 
> MBR, Sergei
> 
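
For readers following the "(~value)" discussion: the test is true
whenever the value read back is anything other than all ones. Reads from
a PCIe function that is not responding commonly return 0xFFFFFFFF, which
is presumably what the patch relies on. The sketch below spells the same
check out in user space; register_space_alive() is a hypothetical helper
name, not something in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Equivalent to the patch's "if (~value)" for a 32-bit read: the
 * register space is treated as alive unless every bit reads back as 1.
 */
static bool register_space_alive(uint32_t value)
{
	return value != UINT32_MAX;
}
```

Writing the comparison against UINT32_MAX makes the intent explicit and
needs no parentheses at all, single or double.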

Thread overview: 24+ messages
2016-07-20 22:23 [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 Jeff Kirsher
2016-07-20 22:23 ` [net-next 01/20] fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Jeff Kirsher
2016-07-20 22:23 ` [net-next 02/20] fm10k: avoid possible null pointer dereference in fm10k_update_stats Jeff Kirsher
2016-07-20 22:23 ` [net-next 03/20] fm10k: prevent multiple threads updating statistics Jeff Kirsher
2016-07-20 22:23 ` [net-next 04/20] fm10k: Reset mailbox global interrupts Jeff Kirsher
2016-07-20 22:23 ` [net-next 05/20] fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
2016-07-20 22:23 ` [net-next 06/20] fm10k: perform data path reset even when switch is not ready Jeff Kirsher
2016-07-20 22:23 ` [net-next 07/20] fm10k: use actual hardware registers when checking for pending Tx Jeff Kirsher
2016-07-20 22:23 ` [net-next 08/20] fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING Jeff Kirsher
2016-07-20 22:23 ` [net-next 09/20] fm10k: wait for queues to drain if stop_hw() fails once Jeff Kirsher
2016-07-20 22:23 ` [net-next 10/20] fm10k: split fm10k_reinit into two functions Jeff Kirsher
2016-07-20 22:23 ` [net-next 11/20] fm10k: implement prepare_suspend and handle_resume Jeff Kirsher
2016-07-20 22:23 ` [net-next 12/20] fm10k: use common reset flow when handling io errors from PCI stack Jeff Kirsher
2016-07-20 22:23 ` [net-next 13/20] fm10k: implement reset_notify handler for PCIe FLR events Jeff Kirsher
2016-07-20 22:23 ` [net-next 14/20] fm10k: use common flow for suspend and resume Jeff Kirsher
2016-07-20 22:23 ` [net-next 15/20] fm10k: enable bus master after every reset Jeff Kirsher
2016-07-20 22:23 ` [net-next 16/20] fm10k: check if PCIe link is restored Jeff Kirsher
2016-07-21 10:51   ` Sergei Shtylyov
2016-07-21 20:13     ` Keller, Jacob E
2016-07-20 22:23 ` [net-next 17/20] fm10k: implement request_lport_map pointer Jeff Kirsher
2016-07-20 22:23 ` [net-next 18/20] fm10k: force link to remain down for at least a second on resume events Jeff Kirsher
2016-07-20 22:23 ` [net-next 19/20] fm10k: return proper error code when pci_enable_msix_range fails Jeff Kirsher
2016-07-20 22:23 ` [net-next 20/20] fm10k: bump version number Jeff Kirsher
2016-07-21  5:05 ` [net-next 00/20][pull request] 100GbE Intel Wired LAN Driver Updates 2016-07-20 David Miller
