[Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove


* [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove
@ 2017-06-01 22:40 Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 01/15] fm10k: stop spurious link down messages when Tx FIFO is full Jacob Keller
                   ` (14 more replies)
  0 siblings, 15 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

This series of patches is aimed primarily at resolving some issues seen
when the PCIe device is detached. Additionally some work was done to
allow better handling for a large number of VLAN messages with many VFs.
These changes help stabilize the driver under surprise remove events and
avoid kernel panics.

The first few patches include general driver cleanup, such as removing
unnecessary latency for the first reset, and fixing some new warnings
found with GCC 7.

Next, we have some patches related to the handling of VFLRE events. The
code here was pretty weird, and caused some problems if the PF device
was reset while handling the VFLRE events.

There are a few patches which fix up some kernel panics and hard lockups
which can occur if the device is removed without notice (for example if
you perform a PFLR event without using the .reset_notify() logic).

Finally we have some patches aimed at stabilizing the driver when a
large number of VLAN updates occur. For example, if you create many VFs
and add hundreds of VLANs per VF, previously the driver would enter a
permanent reset loop: it would send too many mailbox messages, which
caused the driver to hit a mailbox timeout, reset, and then during reset
attempt to re-add all those same VLANs.

To resolve this, we added a somewhat complicated MAC/VLAN message queue.
This queue is important because it allows us to delay sending mailbox
messages until we have space. We originally tried some patches to simply
increase the mailbox size, but ultimately this solution was not
tractable and resulted in a lot of wasted memory.

The queue solution could theoretically be expanded to handle all mailbox
messages, but we determined through testing that only MAC address and
VLAN updates generated enough mailbox messages to actually trigger
these timeouts.
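
For illustration only (this is not code from the series): a minimal
userspace sketch of the idea of deferring requests until the mailbox has
room. The names mbx_model, vlan_queue, vlan_queue_push() and
vlan_queue_drain() are hypothetical, and the real queue added in patch
14 may be structured quite differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8	/* hypothetical queue depth */

/* Hypothetical mailbox model: only the number of free slots matters. */
struct mbx_model {
	size_t free_slots;
	size_t sent;
};

/* Hypothetical pending-request queue (fixed ring buffer of VLAN IDs). */
struct vlan_queue {
	unsigned short vid[QUEUE_CAP];
	size_t head;
	size_t count;
};

/* Queue a request; returns false when the queue itself is full. */
bool vlan_queue_push(struct vlan_queue *q, unsigned short vid)
{
	if (q->count == QUEUE_CAP)
		return false;
	q->vid[(q->head + q->count) % QUEUE_CAP] = vid;
	q->count++;
	return true;
}

/* Send queued requests only while the mailbox has room; keep the rest. */
size_t vlan_queue_drain(struct vlan_queue *q, struct mbx_model *mbx)
{
	size_t n = 0;

	while (q->count && mbx->free_slots) {
		q->head = (q->head + 1) % QUEUE_CAP;
		q->count--;
		mbx->free_slots--;
		mbx->sent++;
		n++;
	}
	return n;
}
```

In this sketch a caller would push every request and call
vlan_queue_drain() periodically, so a burst of updates waits in the
queue rather than overflowing the mailbox.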

Jacob Keller (15):
  fm10k: stop spurious link down messages when Tx FIFO is full
  fm10k: fix typos on fall through comments
  fm10k: avoid possible truncation of q_vector->name
  fm10k: add missing fall through comment
  fm10k: avoid needless delay when loading driver
  fm10k: simplify reading PFVFLRE register
  fm10k: don't loop while resetting VFs due to VFLR event
  fm10k: avoid divide by zero in rare cases when device is resetting
  fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset
  fm10k: prepare_for_reset() when we lose PCIe Link
  fm10k: prevent race condition of __FM10K_SERVICE_SCHED
  fm10k: use spinlock to implement mailbox lock
  fm10k: use generic PM hooks instead of legacy PCIe power hooks
  fm10k: introduce a message queue for MAC/VLAN messages
  fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests

 drivers/net/ethernet/intel/fm10k/fm10k.h        |  53 +++-
 drivers/net/ethernet/intel/fm10k/fm10k_common.c |   6 +-
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c    | 160 ++++++++--
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |   1 +
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c    |   4 +-
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 207 ++++++++++---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    | 395 ++++++++++++++++++------
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c     |  12 +-
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h     |   3 +-
 9 files changed, 662 insertions(+), 179 deletions(-)

-- 
2.13.0.598.gf927b9495246


* [Intel-wired-lan] [PATCH 01/15] fm10k: stop spurious link down messages when Tx FIFO is full
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 02/15] fm10k: fix typos on fall through comments Jacob Keller
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

In fm10k_get_host_state_generic, we check the mailbox tx_ready() function
to ensure that the mailbox is still open. This function also checks to
make sure we have space to transmit another message. Unfortunately, if
we just recently sent a bunch of messages (such as enabling hundreds of
VLANs on a VF) this can result in a race where the watchdog task thinks
the link went down just because we haven't had time to process all these
messages yet.

Instead, let's just check whether the mailbox is still open. This ensures
that we don't race with the Tx FIFO, and we only link down once the
mailbox is not open.

This is safe, because if the FIFO fills up and we're unable to send
a message for too long, we'll end up triggering the timeout detection
which results in a reset. Additionally, since we still check to ensure
the mailbox state is OPEN, we'll transition to link down whenever the
mailbox closes as well.
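
For illustration only (not part of the patch): a small userspace model
of the difference between the two checks. The mbx_model struct, its
tx_free field and both helper functions are hypothetical stand-ins; the
real tx_ready() implementation is not reproduced here.

```c
#include <stdbool.h>

enum mbx_state { MBX_CLOSED, MBX_CONNECT, MBX_OPEN, MBX_DISCONNECT };

/* Hypothetical mailbox model: connection state plus free Tx FIFO space. */
struct mbx_model {
	enum mbx_state state;
	unsigned int tx_free;	/* free space in the Tx FIFO */
};

/* Old-style check: fails whenever the FIFO lacks room for one more message. */
bool host_ready_old(const struct mbx_model *mbx, unsigned int msg_len)
{
	return mbx->state == MBX_OPEN && mbx->tx_free >= msg_len;
}

/* New-style check: only the connection state matters. */
bool host_ready_new(const struct mbx_model *mbx)
{
	return mbx->state == MBX_OPEN;
}
```

With an open mailbox whose FIFO is temporarily full, the old-style check
reports "not ready" while the new-style check still reports "ready",
which is the spurious link down case described above.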

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.c b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
index 62a6ad9b3eed..736a9f087bc9 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_common.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
@@ -1,5 +1,5 @@
 /* Intel(R) Ethernet Switch Host Interface Driver
- * Copyright(c) 2013 - 2016 Intel Corporation.
+ * Copyright(c) 2013 - 2017 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -517,8 +517,8 @@ s32 fm10k_get_host_state_generic(struct fm10k_hw *hw, bool *host_ready)
 		goto out;
 	}

-	/* verify Mailbox is still valid */
-	if (!mbx->ops.tx_ready(mbx, FM10K_VFMBX_MSG_MTU))
+	/* verify Mailbox is still open */
+	if (mbx->state != FM10K_STATE_OPEN)
 		goto out;

 	/* interface cannot receive traffic without logical ports */
-- 
2.13.0.598.gf927b9495246


* [Intel-wired-lan] [PATCH 02/15] fm10k: fix typos on fall through comments
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 01/15] fm10k: stop spurious link down messages when Tx FIFO is full Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 03/15] fm10k: avoid possible truncation of q_vector->name Jacob Keller
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Newer versions of GCC since version 7 now warn when a case statement may
fall through without an explicit comment. "Fallthough" does not count as
it is misspelled. Fix the typos for these comments to appease the new
warnings.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c |  4 ++--
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c  | 10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
index 334088a101c3..244d3ad58ca7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
@@ -1,5 +1,5 @@
 /* Intel(R) Ethernet Switch Host Interface Driver
- * Copyright(c) 2013 - 2016 Intel Corporation.
+ * Copyright(c) 2013 - 2017 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -1586,7 +1586,7 @@ s32 fm10k_pfvf_mbx_init(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx,
 			mbx->mbmem_reg = FM10K_MBMEM_VF(id, 0);
 			break;
 		}
-		/* fallthough */
+		/* fall through */
 	default:
 		return FM10K_MBX_ERR_NO_MBX;
 	}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 40ee0242a80a..9e4fb3a44376 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -1,5 +1,5 @@
 /* Intel(R) Ethernet Switch Host Interface Driver
- * Copyright(c) 2013 - 2016 Intel Corporation.
+ * Copyright(c) 2013 - 2017 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -1334,19 +1334,19 @@ static u8 fm10k_iov_supported_xcast_mode_pf(struct fm10k_vf_info *vf_info,
 	case FM10K_XCAST_MODE_PROMISC:
 		if (vf_flags & FM10K_VF_FLAG_PROMISC_CAPABLE)
 			return FM10K_XCAST_MODE_PROMISC;
-		/* fallthough */
+		/* fall through */
 	case FM10K_XCAST_MODE_ALLMULTI:
 		if (vf_flags & FM10K_VF_FLAG_ALLMULTI_CAPABLE)
 			return FM10K_XCAST_MODE_ALLMULTI;
-		/* fallthough */
+		/* fall through */
 	case FM10K_XCAST_MODE_MULTI:
 		if (vf_flags & FM10K_VF_FLAG_MULTI_CAPABLE)
 			return FM10K_XCAST_MODE_MULTI;
-		/* fallthough */
+		/* fall through */
 	case FM10K_XCAST_MODE_NONE:
 		if (vf_flags & FM10K_VF_FLAG_NONE_CAPABLE)
 			return FM10K_XCAST_MODE_NONE;
-		/* fallthough */
+		/* fall through */
 	default:
 		break;
 	}
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 03/15] fm10k: avoid possible truncation of q_vector->name
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 01/15] fm10k: stop spurious link down messages when Tx FIFO is full Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 02/15] fm10k: fix typos on fall through comments Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 04/15] fm10k: add missing fall through comment Jacob Keller
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

New versions of GCC since version 7 began warning about possible
truncation in calls to snprintf. We can fix this and avoid false
positives. First, we should pass the full buffer size to snprintf,
because it always writes a terminating NUL character within the length
it is passed, so passing len-1 is simply wasting a byte of possible
storage.

Second, if we make the ri and ti variables unsigned, the compiler is
able to correctly reason that the value never gets larger than 256, so
it doesn't need to warn about the full space required to print a signed
integer.
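
For illustration only (not part of the patch): a userspace sketch of the
first point, showing that snprintf() given the full buffer size still
NUL-terminates on truncation. NAME_LEN and name_vector() are
hypothetical; the real q_vector->name buffer size is not shown here.

```c
#include <stdio.h>

#define NAME_LEN 16	/* hypothetical buffer size */

/* Format "<dev>-TxRx-<idx>" into name; returns what snprintf() returns. */
int name_vector(char name[NAME_LEN], const char *dev, unsigned int idx)
{
	/* snprintf() writes at most NAME_LEN bytes including the NUL,
	 * so passing the full size cannot overflow the buffer.
	 */
	return snprintf(name, NAME_LEN, "%s-TxRx-%u", dev, idx);
}
```

Calling name_vector(buf, "eth0", 3) stores "eth0-TxRx-3". With a longer
device name the output is cut to NAME_LEN - 1 characters and
buf[NAME_LEN - 1] is the NUL, while the return value reports the length
that would have been needed.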

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 3e26d27ad213..80b18f2479b4 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1544,7 +1544,7 @@ int fm10k_qv_request_irq(struct fm10k_intfc *interface)
 	struct net_device *dev = interface->netdev;
 	struct fm10k_hw *hw = &interface->hw;
 	struct msix_entry *entry;
-	int ri = 0, ti = 0;
+	unsigned int ri = 0, ti = 0;
 	int vector, err;
 
 	entry = &interface->msix_entries[NON_Q_VECTORS(hw)];
@@ -1554,15 +1554,15 @@ int fm10k_qv_request_irq(struct fm10k_intfc *interface)
 
 		/* name the vector */
 		if (q_vector->tx.count && q_vector->rx.count) {
-			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
-				 "%s-TxRx-%d", dev->name, ri++);
+			snprintf(q_vector->name, sizeof(q_vector->name),
+				 "%s-TxRx-%u", dev->name, ri++);
 			ti++;
 		} else if (q_vector->rx.count) {
-			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
-				 "%s-rx-%d", dev->name, ri++);
+			snprintf(q_vector->name, sizeof(q_vector->name),
+				 "%s-rx-%u", dev->name, ri++);
 		} else if (q_vector->tx.count) {
-			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
-				 "%s-tx-%d", dev->name, ti++);
+			snprintf(q_vector->name, sizeof(q_vector->name),
+				 "%s-tx-%u", dev->name, ti++);
 		} else {
 			/* skip this unused q_vector */
 			continue;
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 04/15] fm10k: add missing fall through comment
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (2 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 03/15] fm10k: avoid possible truncation of q_vector->name Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 05/15] fm10k: avoid needless delay when loading driver Jacob Keller
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Newer versions of GCC starting with 7 now additionally warn when a case
statement may fall through without an explicit comment mentioning it.
Add such a comment to silence the warning, as this is expected.

Unfortunately the comment must come directly before the next case
statement, so we put it outside the #ifdef. Otherwise, the compiler
cannot properly detect it and thus the warning is displayed regardless.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 9dffaba85ae6..189d52a8a605 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -876,6 +876,7 @@ static void fm10k_tx_csum(struct fm10k_ring *tx_ring,
 	case IPPROTO_GRE:
 		if (skb->encapsulation)
 			break;
+		/* fall through */
 	default:
 		if (unlikely(net_ratelimit())) {
 			dev_warn(tx_ring->dev,
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 05/15] fm10k: avoid needless delay when loading driver
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (3 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 04/15] fm10k: add missing fall through comment Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 06/15] fm10k: simplify reading PFVFLRE register Jacob Keller
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

When we load the driver, we set the last_reset to be in the future,
which delays the initial driver reset. Additionally, the service task
isn't scheduled to run automatically until the timer runs out. This
causes a needless delay of the first reset to begin talking to the
switch manager.

We can avoid this by simply not setting last_reset and immediately
scheduling the service task while in probe. This allows the device to
wake up faster, and avoids this delay.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 80b18f2479b4..32b5ace82cad 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1800,9 +1800,6 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->vlan_features |= NETIF_F_HIGHDMA;
 	}
 
-	/* delay any future reset requests */
-	interface->last_reset = jiffies + (10 * HZ);
-
 	/* reset and initialize the hardware so it is in a known state */
 	err = hw->mac.ops.reset_hw(hw);
 	if (err) {
@@ -2079,8 +2076,9 @@ static int fm10k_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* enable SR-IOV after registering netdev to enforce PF/VF ordering */
 	fm10k_iov_configure(pdev, 0);
 
-	/* clear the service task disable bit to allow service task to start */
+	/* clear the service task disable bit and kick off service task */
 	clear_bit(__FM10K_SERVICE_DISABLE, interface->state);
+	fm10k_service_event_schedule(interface);
 
 	return 0;
 
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 06/15] fm10k: simplify reading PFVFLRE register
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (4 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 05/15] fm10k: avoid needless delay when loading driver Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 07/15] fm10k: don't loop while resetting VFs due to VFLR event Jacob Keller
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

We're doing a really convoluted bitshift and read for the PFVFLRE
register. Just reading the PFVFLRE(1), shifting it by 32, then reading
PFVFLRE(0) should be sufficient.
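
For illustration only (not part of the patch): a userspace sketch of the
simplified read, with a hypothetical fake_regs[] array and read_reg()
helper standing in for fm10k_read_reg(). The comments about which VFs
each half covers are inferred from the shift in the diff below.

```c
#include <stdint.h>

/* Hypothetical backing store for the two 32-bit PFVFLRE registers. */
uint32_t fake_regs[2];

/* Hypothetical stand-in for a 32-bit register read. */
uint32_t read_reg(unsigned int idx)
{
	return fake_regs[idx];
}

/* Combine the two halves into one 64-bit value. */
uint64_t read_vflre(void)
{
	uint64_t vflre;

	vflre = read_reg(1);	/* upper half, presumably VFs 32..63 */
	vflre <<= 32;
	vflre |= read_reg(0);	/* lower half, presumably VFs 0..31 */

	return vflre;
}
```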

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index d8356c494f06..dfc88a463735 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -1,5 +1,5 @@
 /* Intel(R) Ethernet Switch Host Interface Driver
- * Copyright(c) 2013 - 2016 Intel Corporation.
+ * Copyright(c) 2013 - 2017 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -67,10 +67,8 @@ s32 fm10k_iov_event(struct fm10k_intfc *interface)
 
 	/* read VFLRE to determine if any VFs have been reset */
 	do {
-		vflre = fm10k_read_reg(hw, FM10K_PFVFLRE(0));
+		vflre = fm10k_read_reg(hw, FM10K_PFVFLRE(1));
 		vflre <<= 32;
-		vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(1));
-		vflre = (vflre << 32) | (vflre >> 32);
 		vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(0));
 
 		i = iov_data->num_vfs;
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 07/15] fm10k: don't loop while resetting VFs due to VFLR event
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (5 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 06/15] fm10k: simplify reading PFVFLRE register Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 08/15] fm10k: avoid divide by zero in rare cases when device is resetting Jacob Keller
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

We've always had a really weird looping construction for resetting VFs.
We read the VFLRE register and reset the VF if the corresponding bit is
set, which makes sense. However we loop continuously until we no longer
have any bits left set. At first this makes sense, as a sort of "keep
trying until we succeed" concept.

Unfortunately this causes a problem if we happen to surprise remove
while this code is executing, because in this case we'll always read all
1s for the VFLRE register. This results in a hard lockup on the CPU
because the loop will never terminate.

Because our own reset function always clears the VFLR event register
(except when we've lost PCIe link, obviously), there is no real reason
to loop. In practice, we'll loop over once and find that no VFs are
pending anymore.

Let's just check once. Since we clear the notification when we reset,
there's no benefit to the loop. Additionally, there shouldn't be a race
as future VFLRE events should trigger an interrupt. Finally, we didn't
warn or do anything in the looped case anyway.
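
For illustration only (not part of the patch): a userspace model of why
the old loop cannot terminate after surprise removal. The dev_model
struct, its link_lost flag and all helpers are hypothetical, and
MAX_PASSES exists only so the demo itself terminates.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PASSES 1000	/* safety cap for the demo only */

struct dev_model {
	bool link_lost;		/* register reads return all 1s when set */
	uint64_t vflre;		/* pending VFLR bits */
};

uint64_t read_vflre(const struct dev_model *dev)
{
	return dev->link_lost ? UINT64_MAX : dev->vflre;
}

/* Resetting a VF clears its bit, unless the device is gone. */
void reset_vf(struct dev_model *dev, unsigned int vf)
{
	if (!dev->link_lost)
		dev->vflre &= ~(UINT64_C(1) << vf);
}

/* One pass: handle every VF flagged in a single snapshot. */
unsigned int handle_vflr_once(struct dev_model *dev, unsigned int num_vfs)
{
	uint64_t snap = read_vflre(dev);
	unsigned int handled = 0;

	for (unsigned int vf = 0; vf < num_vfs && vf < 64; vf++) {
		if (snap & (UINT64_C(1) << vf)) {
			reset_vf(dev, vf);
			handled++;
		}
	}
	return handled;
}

/* Old shape: repeat until a pass finds nothing. Returns passes used. */
unsigned int handle_vflr_loop(struct dev_model *dev, unsigned int num_vfs)
{
	unsigned int passes = 0;

	while (passes < MAX_PASSES) {
		passes++;
		if (!handle_vflr_once(dev, num_vfs))
			break;
	}
	return passes;
}
```

With a healthy device the looped variant finishes after a second, empty
pass. With link_lost set, every pass sees all bits set, so only the demo
cap stops it, whereas handle_vflr_once() always returns after one pass.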

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index dfc88a463735..03897720bf0b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -66,23 +66,21 @@ s32 fm10k_iov_event(struct fm10k_intfc *interface)
 		goto read_unlock;

 	/* read VFLRE to determine if any VFs have been reset */
-	do {
-		vflre = fm10k_read_reg(hw, FM10K_PFVFLRE(1));
-		vflre <<= 32;
-		vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(0));
+	vflre = fm10k_read_reg(hw, FM10K_PFVFLRE(1));
+	vflre <<= 32;
+	vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(0));

-		i = iov_data->num_vfs;
+	i = iov_data->num_vfs;

-		for (vflre <<= 64 - i; vflre && i--; vflre += vflre) {
-			struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+	for (vflre <<= 64 - i; vflre && i--; vflre += vflre) {
+		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];

-			if (vflre >= 0)
-				continue;
+		if (vflre >= 0)
+			continue;

-			hw->iov.ops.reset_resources(hw, vf_info);
-			vf_info->mbx.ops.connect(hw, &vf_info->mbx);
-		}
-	} while (i != iov_data->num_vfs);
+		hw->iov.ops.reset_resources(hw, vf_info);
+		vf_info->mbx.ops.connect(hw, &vf_info->mbx);
+	}

 read_unlock:
 	rcu_read_unlock();
-- 
2.13.0.598.gf927b9495246


* [Intel-wired-lan] [PATCH 08/15] fm10k: avoid divide by zero in rare cases when device is resetting
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (6 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 07/15] fm10k: don't loop while resetting VFs due to VFLR event Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 09/15] fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset Jacob Keller
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

It is possible that under rare circumstances the device is undergoing
a reset, such as when a PFLR occurs, and the device may be transmitting
simultaneously. In this case, we might attempt to divide by zero when
finding the proper r_idx. Instead, let's read the num_tx_queues once,
and make sure it's non-zero.
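
For illustration only (not part of the patch): a userspace sketch of the
snapshot-then-check pattern. READ_ONCE_INT() is a hypothetical stand-in
for the kernel's READ_ONCE(), and intfc_model and pick_tx_ring() are
made up for this example.

```c
/* Userspace stand-in for the kernel's READ_ONCE(): force a single load. */
#define READ_ONCE_INT(x) (*(const volatile int *)&(x))

struct intfc_model {
	int num_tx_queues;	/* may be zeroed by a concurrent reset */
};

/* Returns the ring index to use, or -1 if no queues exist right now. */
int pick_tx_ring(const struct intfc_model *intfc, unsigned int r_idx)
{
	/* Take one snapshot so the check and the modulo see the same value. */
	int num_tx_queues = READ_ONCE_INT(intfc->num_tx_queues);

	if (num_tx_queues <= 0)
		return -1;

	if (r_idx >= (unsigned int)num_tx_queues)
		r_idx %= (unsigned int)num_tx_queues;

	return (int)r_idx;
}
```

The point of the local copy is that a reset clearing the field between
the comparison and the modulo can no longer produce a zero divisor.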

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 24f2f6f86f5a..f0d824558fbf 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -643,9 +643,13 @@ int fm10k_close(struct net_device *netdev)
 static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 {
 	struct fm10k_intfc *interface = netdev_priv(dev);
+	int num_tx_queues = READ_ONCE(interface->num_tx_queues);
 	unsigned int r_idx = skb->queue_mapping;
 	int err;
 
+	if (!num_tx_queues)
+		return NETDEV_TX_BUSY;
+
 	if ((skb->protocol == htons(ETH_P_8021Q)) &&
 	    !skb_vlan_tag_present(skb)) {
 		/* FM10K only supports hardware tagging, any tags in frame
@@ -698,8 +702,8 @@ static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 		__skb_put(skb, pad_len);
 	}
 
-	if (r_idx >= interface->num_tx_queues)
-		r_idx %= interface->num_tx_queues;
+	if (r_idx >= num_tx_queues)
+		r_idx %= num_tx_queues;
 
 	err = fm10k_xmit_frame_ring(skb, interface->tx_ring[r_idx]);
 
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 09/15] fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (7 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 08/15] fm10k: avoid divide by zero in rare cases when device is resetting Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link Jacob Keller
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

A future patch needs these functions defined earlier in the file. Move
them above where they will be called.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 58 ++++++++++++++--------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 32b5ace82cad..6a7b4c5429ae 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -132,35 +132,6 @@ static void fm10k_service_timer(unsigned long data)
 	fm10k_service_event_schedule(interface);
 }
 
-static void fm10k_detach_subtask(struct fm10k_intfc *interface)
-{
-	struct net_device *netdev = interface->netdev;
-	u32 __iomem *hw_addr;
-	u32 value;
-
-	/* do nothing if device is still present or hw_addr is set */
-	if (netif_device_present(netdev) || interface->hw.hw_addr)
-		return;
-
-	/* check the real address space to see if we've recovered */
-	hw_addr = READ_ONCE(interface->uc_addr);
-	value = readl(hw_addr);
-	if (~value) {
-		interface->hw.hw_addr = interface->uc_addr;
-		netif_device_attach(netdev);
-		set_bit(FM10K_FLAG_RESET_REQUESTED, interface->flags);
-		netdev_warn(netdev, "PCIe link restored, device now attached\n");
-		return;
-	}
-
-	rtnl_lock();
-
-	if (netif_running(netdev))
-		dev_close(netdev);
-
-	rtnl_unlock();
-}
-
 static void fm10k_prepare_for_reset(struct fm10k_intfc *interface)
 {
 	struct net_device *netdev = interface->netdev;
@@ -270,6 +241,35 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface)
 	return err;
 }
 
+static void fm10k_detach_subtask(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	u32 __iomem *hw_addr;
+	u32 value;
+
+	/* do nothing if device is still present or hw_addr is set */
+	if (netif_device_present(netdev) || interface->hw.hw_addr)
+		return;
+
+	/* check the real address space to see if we've recovered */
+	hw_addr = READ_ONCE(interface->uc_addr);
+	value = readl(hw_addr);
+	if (~value) {
+		interface->hw.hw_addr = interface->uc_addr;
+		netif_device_attach(netdev);
+		set_bit(FM10K_FLAG_RESET_REQUESTED, interface->flags);
+		netdev_warn(netdev, "PCIe link restored, device now attached\n");
+		return;
+	}
+
+	rtnl_lock();
+
+	if (netif_running(netdev))
+		dev_close(netdev);
+
+	rtnl_unlock();
+}
+
 static void fm10k_reinit(struct fm10k_intfc *interface)
 {
 	int err;
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (8 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 09/15] fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-07-10 20:02   ` Keller, Jacob E
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 11/15] fm10k: prevent race condition of __FM10K_SERVICE_SCHED Jacob Keller
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

If we lose PCIe link, such as when an unannounced PFLR event occurs, or
when a device is surprise removed, we currently detach the device and
close the netdev. This unfortunately leaves a lot of things still
active, such as the msix_mbx_pf IRQ, and Tx/Rx resources.

This can cause problems because the register reads will return
potentially invalid values which may result in unknown driver behavior.

Begin the process of resetting using fm10k_prepare_for_reset(), much in
the same way as the suspend and resume cycle does. This will attempt to
shut down as much as possible, in order to prevent possible issues.

Since the __FM10K_RESETTING state is long lived, we'll also stop waiting
for it when we get to the fm10k_reset_subtask. This is important since
otherwise it would deadlock with the fm10k_detach_subtask. Additionally,
stop attempting to manage the mailbox subtask if we're
detached/resetting, as there is nothing to do when we don't have a PCIe
address.

Overall this produces a much cleaner shutdown and recovery cycle for
a PCIe surprise remove event.
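
For illustration only (not part of the patch): a userspace sketch of
replacing "spin until the flag is free" with "return if someone else
already holds it", using C11 atomic_flag in place of the kernel's
test_and_set_bit(). The intfc_model struct and both functions are
hypothetical, and having handle_reset() release the flag is an
assumption of this sketch rather than something shown in the diff below.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct intfc_model {
	atomic_flag resetting;	/* stands in for __FM10K_RESETTING */
	unsigned int teardowns;	/* how many times teardown actually ran */
};

/* Returns true if this caller performed the teardown. */
bool prepare_for_reset(struct intfc_model *intfc)
{
	/* A reset is already in progress: nothing to do, and don't wait. */
	if (atomic_flag_test_and_set(&intfc->resetting))
		return false;

	intfc->teardowns++;
	return true;
}

/* Finishes the reset and releases the flag (assumed for this sketch). */
void handle_reset(struct intfc_model *intfc)
{
	atomic_flag_clear(&intfc->resetting);
}
```

Because the second caller returns immediately instead of sleeping on the
flag, a long-lived resetting state cannot block another subtask that
also wants to prepare for reset.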

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 39 +++++++++++++++++++---------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 6a7b4c5429ae..2d94a16f9613 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -141,8 +141,9 @@ static void fm10k_prepare_for_reset(struct fm10k_intfc *interface)
 	/* put off any impending NetWatchDogTimeout */
 	netif_trans_update(netdev);
 
-	while (test_and_set_bit(__FM10K_RESETTING, interface->state))
-		usleep_range(1000, 2000);
+	/* Nothing to do if a reset is already in progress */
+	if (test_and_set_bit(__FM10K_RESETTING, interface->state))
+		return;
 
 	rtnl_lock();
 
@@ -168,6 +169,8 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface)
 	struct fm10k_hw *hw = &interface->hw;
 	int err;
 
+	WARN_ON(!test_bit(__FM10K_RESETTING, interface->state));
+
 	rtnl_lock();
 
 	pci_set_master(interface->pdev);
@@ -247,27 +250,33 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface)
 	u32 __iomem *hw_addr;
 	u32 value;
 
-	/* do nothing if device is still present or hw_addr is set */
+	/* do nothing if netdev is still present or hw_addr is set */
 	if (netif_device_present(netdev) || interface->hw.hw_addr)
 		return;
 
+	/* We've lost the PCIe register space, and can no longer access the
+	 * device. Shut everything except the detach subtask down and prepare
+	 * to reset the device in case we recover.
+	 */
+	fm10k_prepare_for_reset(interface);
+
 	/* check the real address space to see if we've recovered */
 	hw_addr = READ_ONCE(interface->uc_addr);
 	value = readl(hw_addr);
 	if (~value) {
+		/* Restore the hardware address */
 		interface->hw.hw_addr = interface->uc_addr;
+
+		/* PCIe link has been restored, and the device is active
+		 * again. Restore everything and reset the device.
+		 */
+		fm10k_handle_reset(interface);
+
+		/* Re-attach the netdev */
 		netif_device_attach(netdev);
-		set_bit(FM10K_FLAG_RESET_REQUESTED, interface->flags);
 		netdev_warn(netdev, "PCIe link restored, device now attached\n");
 		return;
 	}
-
-	rtnl_lock();
-
-	if (netif_running(netdev))
-		dev_close(netdev);
-
-	rtnl_unlock();
 }
 
 static void fm10k_reinit(struct fm10k_intfc *interface)
@@ -360,6 +369,10 @@ static void fm10k_watchdog_update_host_state(struct fm10k_intfc *interface)
  **/
 static void fm10k_mbx_subtask(struct fm10k_intfc *interface)
 {
+	/* If we're resetting, bail out */
+	if (test_bit(__FM10K_RESETTING, interface->state))
+		return;
+
 	/* process upstream mailbox and update device state */
 	fm10k_watchdog_update_host_state(interface);
 
@@ -609,9 +622,11 @@ static void fm10k_service_task(struct work_struct *work)
 
 	interface = container_of(work, struct fm10k_intfc, service_task);
 
+	/* Check whether we're detached first */
+	fm10k_detach_subtask(interface);
+
 	/* tasks run even when interface is down */
 	fm10k_mbx_subtask(interface);
-	fm10k_detach_subtask(interface);
 	fm10k_reset_subtask(interface);
 
 	/* tasks only run when interface is up */
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 11/15] fm10k: prevent race condition of __FM10K_SERVICE_SCHED
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (9 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 12/15] fm10k: use spinlock to implement mailbox lock Jacob Keller
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Although very unlikely, it is possible that cancel_work_sync() may stop
the service_task before it has actually started. In this case, the
__FM10K_SERVICE_SCHED bit will never be cleared. This results in the
service task being unable to reschedule in the future. Add a helper
function which sets the service disable bit, waits for the service task
to stop and clears the schedule bit, thus avoiding the race condition.
We know the schedule bit is safe to clear because the cancel_work_sync()
guarantees the service task is not running.

Also add a helper function to restart the service task, for symmetry.
This is not strictly needed but helps the mental model of how to stop
and start the service task.
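
The stuck-bit race and the fix can be modelled in plain userspace C. This
is only a single-threaded sketch under stated assumptions, not driver
code: the model_* names, the SERVICE_* constants and the work_pending
flag are hypothetical stand-ins for the state bitmap, the workqueue and
cancel_work_sync(), and the real driver uses atomic bitops.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for __FM10K_SERVICE_SCHED / __FM10K_SERVICE_DISABLE */
enum { SERVICE_SCHED = 1 << 0, SERVICE_DISABLE = 1 << 1 };

struct model_intfc {
	unsigned int state;	/* models interface->state */
	bool work_pending;	/* work item queued but not yet run */
};

/* Queue the work only if not disabled and not already scheduled. */
bool model_schedule(struct model_intfc *i)
{
	if (i->state & SERVICE_DISABLE)
		return false;
	if (i->state & SERVICE_SCHED)
		return false;
	i->state |= SERVICE_SCHED;
	i->work_pending = true;
	return true;
}

/* Old behaviour: disable and cancel. If the work never ran, nothing
 * clears SERVICE_SCHED, so the bit stays stuck.
 */
void model_stop_old(struct model_intfc *i)
{
	i->state |= SERVICE_DISABLE;
	i->work_pending = false;	/* models cancel_work_sync() */
}

/* New behaviour: also clear the scheduled bit once the work is known
 * not to be running.
 */
void model_stop_new(struct model_intfc *i)
{
	model_stop_old(i);
	i->state &= ~(unsigned int)SERVICE_SCHED;
}

/* Symmetric restart helper: re-enable, then schedule. */
bool model_start(struct model_intfc *i)
{
	i->state &= ~(unsigned int)SERVICE_DISABLE;
	return model_schedule(i);
}
```

In this model, a schedule followed by model_stop_old() leaves
SERVICE_SCHED set, so model_start() cannot queue the work again; with
model_stop_new() the restart succeeds.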

This race could only happen in fm10k_suspend/fm10k_resume as this is the
only place where the service task is actually restarted. Thus,
suspend/resume testing would be ideal. However, note that the chance of
this happening is very slim as the service event is scheduled for
immediate execution, and you would have to trigger a suspend at almost
the exact same time as the service task was scheduled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 32 ++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 2d94a16f9613..206da6b7c46a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -118,6 +118,27 @@ static void fm10k_service_event_complete(struct fm10k_intfc *interface)
 		fm10k_service_event_schedule(interface);
 }
 
+static void fm10k_stop_service_event(struct fm10k_intfc *interface)
+{
+	set_bit(__FM10K_SERVICE_DISABLE, interface->state);
+	cancel_work_sync(&interface->service_task);
+
+	/* It's possible that cancel_work_sync stopped the service task from
+	 * running before it could actually start. In this case the
+	 * __FM10K_SERVICE_SCHED bit will never be cleared. Since we know that
+	 * the service task cannot be running at this point, we need to clear
+	 * the scheduled bit, as otherwise the service task may never be
+	 * restarted.
+	 */
+	clear_bit(__FM10K_SERVICE_SCHED, interface->state);
+}
+
+static void fm10k_start_service_event(struct fm10k_intfc *interface)
+{
+	clear_bit(__FM10K_SERVICE_DISABLE, interface->state);
+	fm10k_service_event_schedule(interface);
+}
+
 /**
  * fm10k_service_timer - Timer Call-back
  * @data: pointer to interface cast into an unsigned long
@@ -2131,8 +2152,7 @@ static void fm10k_remove(struct pci_dev *pdev)
 
 	del_timer_sync(&interface->service_timer);
 
-	set_bit(__FM10K_SERVICE_DISABLE, interface->state);
-	cancel_work_sync(&interface->service_task);
+	fm10k_stop_service_event(interface);
 
 	/* free netdev, this may bounce the interrupts due to setup_tc */
 	if (netdev->reg_state == NETREG_REGISTERED)
@@ -2170,8 +2190,7 @@ static void fm10k_prepare_suspend(struct fm10k_intfc *interface)
 	 * stopped. We stop the watchdog task until after we resume software
 	 * activity.
 	 */
-	set_bit(__FM10K_SERVICE_DISABLE, interface->state);
-	cancel_work_sync(&interface->service_task);
+	fm10k_stop_service_event(interface);
 
 	fm10k_prepare_for_reset(interface);
 }
@@ -2198,9 +2217,8 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
 	interface->link_down_event = jiffies + (HZ);
 	set_bit(__FM10K_LINK_DOWN, interface->state);
 
-	/* clear the service task disable bit to allow service task to start */
-	clear_bit(__FM10K_SERVICE_DISABLE, interface->state);
-	fm10k_service_event_schedule(interface);
+	/* restart the service task */
+	fm10k_start_service_event(interface);
 
 	return err;
 }
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 12/15] fm10k: use spinlock to implement mailbox lock
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (10 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 11/15] fm10k: prevent race condition of __FM10K_SERVICE_SCHED Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 13/15] fm10k: use generic PM hooks instead of legacy PCIe power hooks Jacob Keller
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Let's not reinvent the locking wheel. Remove our bitlock and use
a proper spinlock instead.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h     | 15 +++++----------
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c |  3 +++
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 689c413b7782..1bcba0665ac1 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -274,7 +274,6 @@ enum fm10k_state_t {
 	__FM10K_SERVICE_SCHED,
 	__FM10K_SERVICE_REQUEST,
 	__FM10K_SERVICE_DISABLE,
-	__FM10K_MBX_LOCK,
 	__FM10K_LINK_DOWN,
 	__FM10K_UPDATING_STATS,
 	/* This value must be last and determines the BITMAP size */
@@ -344,6 +343,8 @@ struct fm10k_intfc {
 
 	struct fm10k_hw_stats stats;
 	struct fm10k_hw hw;
+	/* Mailbox lock */
+	spinlock_t mbx_lock;
 	u32 __iomem *uc_addr;
 	u32 __iomem *sw_addr;
 	u16 msg_enable;
@@ -384,23 +385,17 @@ struct fm10k_intfc {
 
 static inline void fm10k_mbx_lock(struct fm10k_intfc *interface)
 {
-	/* busy loop if we cannot obtain the lock as some calls
-	 * such as ndo_set_rx_mode may be made in atomic context
-	 */
-	while (test_and_set_bit(__FM10K_MBX_LOCK, interface->state))
-		udelay(20);
+	spin_lock(&interface->mbx_lock);
 }
 
 static inline void fm10k_mbx_unlock(struct fm10k_intfc *interface)
 {
-	/* flush memory to make sure state is correct */
-	smp_mb__before_atomic();
-	clear_bit(__FM10K_MBX_LOCK, interface->state);
+	spin_unlock(&interface->mbx_lock);
 }
 
 static inline int fm10k_mbx_trylock(struct fm10k_intfc *interface)
 {
-	return !test_and_set_bit(__FM10K_MBX_LOCK, interface->state);
+	return spin_trylock(&interface->mbx_lock);
 }
 
 /* fm10k_test_staterr - test bits in Rx descriptor status and error fields */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 206da6b7c46a..17f3913e4bf7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1893,6 +1893,9 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 	netdev_rss_key_fill(rss_key, sizeof(rss_key));
 	memcpy(interface->rssrk, rss_key, sizeof(rss_key));
 
+	/* Initialize the mailbox lock */
+	spin_lock_init(&interface->mbx_lock);
+
 	/* Start off interface as being down */
 	set_bit(__FM10K_DOWN, interface->state);
 	set_bit(__FM10K_UPDATING_STATS, interface->state);
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 13/15] fm10k: use generic PM hooks instead of legacy PCIe power hooks
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (11 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 12/15] fm10k: use spinlock to implement mailbox lock Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages Jacob Keller
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 15/15] fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Jacob Keller
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Replace the PCI-specific legacy power management hooks with the generic
power management hooks, which work properly for both suspend and
hibernate. With the generic hooks, the stack takes care of the
lower-level PCIe power management rather than forcing the driver to
handle it.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 67 +++++++++-------------------
 1 file changed, 22 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 17f3913e4bf7..d14cfe76e58b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2228,36 +2228,19 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
 
 #ifdef CONFIG_PM
 /**
- * fm10k_resume - Restore device to pre-sleep state
- * @pdev: PCI device information struct
+ * fm10k_resume - Generic PM resume hook
+ * @dev: generic device structure
  *
- * fm10k_resume is called after the system has powered back up from a sleep
- * state and is ready to resume operation.  This function is meant to restore
- * the device back to its pre-sleep state.
+ * Generic PM hook used when waking the device from a low power state after
+ * suspend or hibernation. This function does not need to handle lower PCIe
+ * device state as the stack takes care of that for us.
  **/
-static int fm10k_resume(struct pci_dev *pdev)
+static int fm10k_resume(struct device *dev)
 {
-	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
 	struct net_device *netdev = interface->netdev;
 	struct fm10k_hw *hw = &interface->hw;
-	u32 err;
-
-	pci_set_power_state(pdev, PCI_D0);
-	pci_restore_state(pdev);
-
-	/* pci_restore_state clears dev->state_saved so call
-	 * pci_save_state to restore it.
-	 */
-	pci_save_state(pdev);
-
-	err = pci_enable_device_mem(pdev);
-	if (err) {
-		dev_err(&pdev->dev, "Cannot enable PCI device from suspend\n");
-		return err;
-	}
-	pci_set_master(pdev);
-
-	pci_wake_from_d3(pdev, false);
+	int err;
 
 	/* refresh hw_addr in case it was dropped */
 	hw->hw_addr = interface->uc_addr;
@@ -2272,36 +2255,27 @@ static int fm10k_resume(struct pci_dev *pdev)
 }
 
 /**
- * fm10k_suspend - Prepare the device for a system sleep state
- * @pdev: PCI device information struct
+ * fm10k_suspend - Generic PM suspend hook
+ * @dev: generic device structure
  *
- * fm10k_suspend is meant to shutdown the device prior to the system entering
- * a sleep state.  The fm10k hardware does not support wake on lan so the
- * driver simply needs to shut down the device so it is in a low power state.
+ * Generic PM hook used when setting the device into a low power state for
+ * system suspend or hibernation. This function does not need to handle lower
+ * PCIe device state as the stack takes care of that for us.
  **/
-static int fm10k_suspend(struct pci_dev *pdev,
-			 pm_message_t __always_unused state)
+static int fm10k_suspend(struct device *dev)
 {
-	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
 	struct net_device *netdev = interface->netdev;
-	int err = 0;
 
 	netif_device_detach(netdev);
 
 	fm10k_prepare_suspend(interface);
 
-	err = pci_save_state(pdev);
-	if (err)
-		return err;
-
-	pci_disable_device(pdev);
-	pci_wake_from_d3(pdev, false);
-	pci_set_power_state(pdev, PCI_D3hot);
-
 	return 0;
 }
 
 #endif /* CONFIG_PM */
+
 /**
  * fm10k_io_error_detected - called when PCI error is detected
  * @pdev: Pointer to PCI device
@@ -2421,15 +2395,18 @@ static const struct pci_error_handlers fm10k_err_handler = {
 	.reset_notify = fm10k_io_reset_notify,
 };
 
+static SIMPLE_DEV_PM_OPS(fm10k_pm_ops, fm10k_suspend, fm10k_resume);
+
 static struct pci_driver fm10k_driver = {
 	.name			= fm10k_driver_name,
 	.id_table		= fm10k_pci_tbl,
 	.probe			= fm10k_probe,
 	.remove			= fm10k_remove,
 #ifdef CONFIG_PM
-	.suspend		= fm10k_suspend,
-	.resume			= fm10k_resume,
-#endif
+	.driver = {
+		.pm		= &fm10k_pm_ops,
+	},
+#endif /* CONFIG_PM */
 	.sriov_configure	= fm10k_iov_configure,
 	.err_handler		= &fm10k_err_handler
 };
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (12 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 13/15] fm10k: use generic PM hooks instead of legacy PCIe power hooks Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  2017-06-16 22:45   ` Keller, Jacob E
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 15/15] fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Jacob Keller
  14 siblings, 1 reply; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Under some circumstances, when dealing with a large number of MAC
address or VLAN updates at once, the fm10k driver, particularly on VFs,
can overload the mailbox with too many messages.

This results in a mailbox timeout, which causes the driver to initiate
a reset. During the reset, we re-send all the same messages that
originally caused the timeout. This results in a cycle of resets each
triggering a future reset.

To avoid this, we introduce a workqueue item which monitors a queue of
MAC and VLAN requests. New requests are added to the end of the list,
and the work item periodically processes them in FIFO order.

Initially we only handle requests for the netdev, but we do handle
unicast MAC address, multicast MAC address, and VLAN update requests.

A future patch will add support to use this queue for handling MAC
update requests from the VF<->PF mailbox.

The MAC/VLAN work item will keep checking to make sure that each request
does not overflow the mailbox and cause a timeout. If it might, then the
work item will reschedule itself a short time later. This avoids any
reset cycle, since we never send the message if the mailbox is not
ready.
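
The "only send when the mailbox has room" idea can be sketched in
userspace C. This is a minimal model under stated assumptions, not the
driver's implementation: struct request, struct queue, struct mailbox
and its free_slots counter are all hypothetical, and locking is left
out. queue_drain() returns true when requests remain, which is where the
real work item would reschedule itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request: only an id, standing in for a MAC/VLAN update */
struct request {
	int id;
	struct request *next;
};

/* Singly linked FIFO; each entry is allocated on demand */
struct queue {
	struct request *head;
	struct request *tail;
};

/* Hypothetical mailbox model: number of messages it can still accept */
struct mailbox {
	size_t free_slots;
};

/* Append a request to the tail of the queue. Returns 0 or -1. */
int queue_push(struct queue *q, int id)
{
	struct request *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	r->id = id;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	return 0;
}

/* Send requests in FIFO order while the mailbox has room. Returns true
 * if requests are still queued and the caller should retry later.
 */
bool queue_drain(struct queue *q, struct mailbox *mbx, size_t *sent)
{
	while (q->head) {
		struct request *r = q->head;

		if (mbx->free_slots == 0)
			return true;	/* never overflow the mailbox */

		mbx->free_slots--;	/* models sending one message */
		(*sent)++;

		q->head = r->next;
		if (!q->head)
			q->tail = NULL;
		free(r);
	}
	return false;
}
```

Because a request is left on the queue whenever the mailbox is full, a
burst of updates is spread over several runs instead of timing out the
mailbox.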

As an alternative, we tried increasing the mailbox message FIFO, but
this just delays the problem and results in needless memory waste on the
system. Our new message queue is dynamically allocated so it only uses as
much memory as it needs. Additionally, it need not be contiguous like
the Tx and Rx FIFOs.

Note that this patch chose to only create a queue for MAC and VLAN
messages, since these are the only messages sent in a large enough
volume to cause the reset loop. Other messages are very unlikely to
overflow the mailbox Tx FIFO so easily.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |  38 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 199 +++++++++++++++++++-----
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    | 194 +++++++++++++++++++++++
 3 files changed, 389 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 1bcba0665ac1..e2029ed438f7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -248,6 +248,29 @@ struct fm10k_udp_port {
 	__be16			port;
 };
 
+enum fm10k_macvlan_request_type {
+	FM10K_UC_MAC_REQUEST,
+	FM10K_MC_MAC_REQUEST,
+	FM10K_VLAN_REQUEST
+};
+
+struct fm10k_macvlan_request {
+	enum fm10k_macvlan_request_type type;
+	struct list_head list;
+	union {
+		struct fm10k_mac_request {
+			u8 addr[ETH_ALEN];
+			u16 glort;
+			u16 vid;
+		} mac;
+		struct fm10k_vlan_request {
+			u32 vid;
+			u8 vsi;
+		} vlan;
+	};
+	bool set;
+};
+
 /* one work queue for entire driver */
 extern struct workqueue_struct *fm10k_workqueue;
 
@@ -274,6 +297,8 @@ enum fm10k_state_t {
 	__FM10K_SERVICE_SCHED,
 	__FM10K_SERVICE_REQUEST,
 	__FM10K_SERVICE_DISABLE,
+	__FM10K_MACVLAN_SCHED,
+	__FM10K_MACVLAN_REQUEST,
 	__FM10K_LINK_DOWN,
 	__FM10K_UPDATING_STATS,
 	/* This value must be last and determines the BITMAP size */
@@ -366,6 +391,12 @@ struct fm10k_intfc {
 	struct list_head vxlan_port;
 	struct list_head geneve_port;
 
+	/* MAC/VLAN update queue */
+	struct list_head macvlan_requests;
+	struct delayed_work macvlan_task;
+	/* MAC/VLAN update queue lock */
+	spinlock_t macvlan_lock;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *dbg_intfc;
 #endif /* CONFIG_DEBUG_FS */
@@ -485,6 +516,7 @@ void fm10k_up(struct fm10k_intfc *interface);
 void fm10k_down(struct fm10k_intfc *interface);
 void fm10k_update_stats(struct fm10k_intfc *interface);
 void fm10k_service_event_schedule(struct fm10k_intfc *interface);
+void fm10k_macvlan_schedule(struct fm10k_intfc *interface);
 void fm10k_update_rx_drop_en(struct fm10k_intfc *interface);
 #ifdef CONFIG_NET_POLL_CONTROLLER
 void fm10k_netpoll(struct net_device *netdev);
@@ -505,6 +537,12 @@ void fm10k_reset_rx_state(struct fm10k_intfc *);
 int fm10k_setup_tc(struct net_device *dev, u8 tc);
 int fm10k_open(struct net_device *netdev);
 int fm10k_close(struct net_device *netdev);
+int fm10k_queue_vlan_request(struct fm10k_intfc *interface, u32 vid,
+			     u8 vsi, bool set);
+int fm10k_queue_mac_request(struct fm10k_intfc *interface, u16 glort,
+			    const unsigned char *addr, u16 vid, bool set);
+void fm10k_clear_macvlan_queue(struct fm10k_intfc *interface,
+			       u16 glort, bool vlans);
 
 /* Ethtool */
 void fm10k_set_ethtool_ops(struct net_device *dev);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index f0d824558fbf..7f6e477ed74d 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -758,11 +758,132 @@ static bool fm10k_host_mbx_ready(struct fm10k_intfc *interface)
 	return (hw->mac.type == fm10k_mac_vf || interface->host_ready);
 }
 
+/**
+ * fm10k_queue_vlan_request - Queue a VLAN update request
+ * @interface: the fm10k interface structure
+ * @vid: the VLAN vid
+ * @vsi: VSI index number
+ * @set: whether to set or clear
+ *
+ * This function queues up a VLAN update. For VFs, this must be sent to the
+ * managing PF over the mailbox. For PFs, we'll use the same handling so that
+ * it's similar to the VF. This avoids storming the PF<->VF mailbox with too
+ * many VLAN updates during reset.
+ */
+int fm10k_queue_vlan_request(struct fm10k_intfc *interface,
+			     u32 vid, u8 vsi, bool set)
+{
+	struct fm10k_macvlan_request *request;
+	unsigned long flags;
+
+	/* This must be atomic since we may be called while the netdev
+	 * addr_list_lock is held
+	 */
+	request = kzalloc(sizeof(*request), GFP_ATOMIC);
+	if (!request)
+		return -ENOMEM;
+
+	request->type = FM10K_VLAN_REQUEST;
+	request->vlan.vid = vid;
+	request->vlan.vsi = vsi;
+	request->set = set;
+
+	spin_lock_irqsave(&interface->macvlan_lock, flags);
+	list_add_tail(&request->list, &interface->macvlan_requests);
+	spin_unlock_irqrestore(&interface->macvlan_lock, flags);
+
+	fm10k_macvlan_schedule(interface);
+
+	return 0;
+}
+
+/**
+ * fm10k_queue_mac_request - Queue a MAC update request
+ * @interface: the fm10k interface structure
+ * @glort: the target glort for this update
+ * @addr: the address to update
+ * @vid: the vid to update
+ * @set: whether to add or remove
+ *
+ * This function queues up a MAC request for sending to the switch manager.
+ * A separate thread monitors the queue and sends updates to the switch
+ * manager. Return 0 on success, and negative error code on failure.
+ **/
+int fm10k_queue_mac_request(struct fm10k_intfc *interface, u16 glort,
+			    const unsigned char *addr, u16 vid, bool set)
+{
+	struct fm10k_macvlan_request *request;
+	unsigned long flags;
+
+	/* This must be atomic since we may be called while the netdev
+	 * addr_list_lock is held
+	 */
+	request = kzalloc(sizeof(*request), GFP_ATOMIC);
+	if (!request)
+		return -ENOMEM;
+
+	if (is_multicast_ether_addr(addr))
+		request->type = FM10K_MC_MAC_REQUEST;
+	else
+		request->type = FM10K_UC_MAC_REQUEST;
+
+	ether_addr_copy(request->mac.addr, addr);
+	request->mac.glort = glort;
+	request->mac.vid = vid;
+	request->set = set;
+
+	spin_lock_irqsave(&interface->macvlan_lock, flags);
+	list_add_tail(&request->list, &interface->macvlan_requests);
+	spin_unlock_irqrestore(&interface->macvlan_lock, flags);
+
+	fm10k_macvlan_schedule(interface);
+
+	return 0;
+}
+
+/**
+ * fm10k_clear_macvlan_queue - Cancel pending updates for a given glort
+ * @interface: the fm10k interface structure
+ * @glort: the target glort to clear
+ * @vlans: true to clear VLAN messages, false to ignore them
+ *
+ * Cancel any outstanding MAC/VLAN requests for a given glort. This is
+ * expected to be called when a logical port goes down.
+ **/
+void fm10k_clear_macvlan_queue(struct fm10k_intfc *interface,
+			       u16 glort, bool vlans)
+
+{
+	struct fm10k_macvlan_request *r, *tmp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&interface->macvlan_lock, flags);
+
+	/* Free any outstanding MAC/VLAN requests for this interface */
+	list_for_each_entry_safe(r, tmp, &interface->macvlan_requests, list) {
+		switch (r->type) {
+		case FM10K_MC_MAC_REQUEST:
+		case FM10K_UC_MAC_REQUEST:
+			/* Don't free requests for other interfaces */
+			if (r->mac.glort != glort)
+				break;
+			/* fall through */
+		case FM10K_VLAN_REQUEST:
+			if (vlans) {
+				list_del(&r->list);
+				kfree(r);
+			}
+			break;
+		}
+	}
+
+	spin_unlock_irqrestore(&interface->macvlan_lock, flags);
+}
+
 static int fm10k_uc_vlan_unsync(struct net_device *netdev,
 				const unsigned char *uc_addr)
 {
 	struct fm10k_intfc *interface = netdev_priv(netdev);
-	struct fm10k_hw *hw = &interface->hw;
 	u16 glort = interface->glort;
 	u16 vid = interface->vid;
 	bool set = !!(vid / VLAN_N_VID);
@@ -771,10 +892,7 @@ static int fm10k_uc_vlan_unsync(struct net_device *netdev,
 	/* drop any leading bits on the VLAN ID */
 	vid &= VLAN_N_VID - 1;
 
-	if (fm10k_host_mbx_ready(interface))
-		err = hw->mac.ops.update_uc_addr(hw, glort, uc_addr,
-						 vid, set, 0);
-
+	err = fm10k_queue_mac_request(interface, glort, uc_addr, vid, set);
 	if (err)
 		return err;
 
@@ -786,7 +904,6 @@ static int fm10k_mc_vlan_unsync(struct net_device *netdev,
 				const unsigned char *mc_addr)
 {
 	struct fm10k_intfc *interface = netdev_priv(netdev);
-	struct fm10k_hw *hw = &interface->hw;
 	u16 glort = interface->glort;
 	u16 vid = interface->vid;
 	bool set = !!(vid / VLAN_N_VID);
@@ -795,9 +912,7 @@ static int fm10k_mc_vlan_unsync(struct net_device *netdev,
 	/* drop any leading bits on the VLAN ID */
 	vid &= VLAN_N_VID - 1;
 
-	if (fm10k_host_mbx_ready(interface))
-		err = hw->mac.ops.update_mc_addr(hw, glort, mc_addr, vid, set);
-
+	err = fm10k_queue_mac_request(interface, glort, mc_addr, vid, set);
 	if (err)
 		return err;
 
@@ -855,18 +970,14 @@ static int fm10k_update_vid(struct net_device *netdev, u16 vid, bool set)
 
 	/* only need to update the VLAN if not in promiscuous mode */
 	if (!(netdev->flags & IFF_PROMISC)) {
-		err = hw->mac.ops.update_vlan(hw, vid, 0, set);
+		err = fm10k_queue_vlan_request(interface, vid, 0, set);
 		if (err)
 			goto err_out;
 	}
 
-	/* update our base MAC address if host's mailbox is ready */
-	if (fm10k_host_mbx_ready(interface))
-		err = hw->mac.ops.update_uc_addr(hw, interface->glort,
-						 hw->mac.addr, vid, set, 0);
-	else
-		err = -EHOSTDOWN;
-
+	/* Update our base MAC address */
+	err = fm10k_queue_mac_request(interface, interface->glort,
+				      hw->mac.addr, vid, set);
 	if (err)
 		goto err_out;
 
@@ -910,7 +1021,6 @@ static u16 fm10k_find_next_vlan(struct fm10k_intfc *interface, u16 vid)
 
 static void fm10k_clear_unused_vlans(struct fm10k_intfc *interface)
 {
-	struct fm10k_hw *hw = &interface->hw;
 	u32 vid, prev_vid;
 
 	/* loop through and find any gaps in the table */
@@ -922,7 +1032,7 @@ static void fm10k_clear_unused_vlans(struct fm10k_intfc *interface)
 
 		/* send request to clear multiple bits at a time */
 		prev_vid += (vid - prev_vid - 1) << FM10K_VLAN_LENGTH_SHIFT;
-		hw->mac.ops.update_vlan(hw, prev_vid, 0, false);
+		fm10k_queue_vlan_request(interface, prev_vid, 0, false);
 	}
 }
 
@@ -937,15 +1047,11 @@ static int __fm10k_uc_sync(struct net_device *dev,
 	if (!is_valid_ether_addr(addr))
 		return -EADDRNOTAVAIL;
 
-	/* update table with current entries if host's mailbox is ready */
-	if (!fm10k_host_mbx_ready(interface))
-		return -EHOSTDOWN;
-
 	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 1;
 	     vid < VLAN_N_VID;
 	     vid = fm10k_find_next_vlan(interface, vid)) {
-		err = hw->mac.ops.update_uc_addr(hw, glort, addr,
-						 vid, sync, 0);
+		err = fm10k_queue_mac_request(interface, glort,
+					      addr, vid, sync);
 		if (err)
 			return err;
 	}
@@ -1002,15 +1108,18 @@ static int __fm10k_mc_sync(struct net_device *dev,
 	struct fm10k_intfc *interface = netdev_priv(dev);
 	struct fm10k_hw *hw = &interface->hw;
 	u16 vid, glort = interface->glort;
+	s32 err;
 
-	/* update table with current entries if host's mailbox is ready */
-	if (!fm10k_host_mbx_ready(interface))
-		return 0;
+	if (!is_multicast_ether_addr(addr))
+		return -EADDRNOTAVAIL;
 
 	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 1;
 	     vid < VLAN_N_VID;
 	     vid = fm10k_find_next_vlan(interface, vid)) {
-		hw->mac.ops.update_mc_addr(hw, glort, addr, vid, sync);
+		err = fm10k_queue_mac_request(interface, glort,
+					      addr, vid, sync);
+		if (err)
+			return err;
 	}
 
 	return 0;
@@ -1050,7 +1159,8 @@ static void fm10k_set_rx_mode(struct net_device *dev)
 	if (interface->xcast_mode != xcast_mode) {
 		/* update VLAN table */
 		if (xcast_mode == FM10K_XCAST_MODE_PROMISC)
-			hw->mac.ops.update_vlan(hw, FM10K_VLAN_ALL, 0, true);
+			fm10k_queue_vlan_request(interface, FM10K_VLAN_ALL,
+						 0, true);
 		if (interface->xcast_mode == FM10K_XCAST_MODE_PROMISC)
 			fm10k_clear_unused_vlans(interface);
 
@@ -1098,22 +1208,20 @@ void fm10k_restore_rx_state(struct fm10k_intfc *interface)
 					       interface->glort_count, true);
 
 	/* update VLAN table */
-	hw->mac.ops.update_vlan(hw, FM10K_VLAN_ALL, 0,
-				xcast_mode == FM10K_XCAST_MODE_PROMISC);
+	fm10k_queue_vlan_request(interface, FM10K_VLAN_ALL, 0,
+				 xcast_mode == FM10K_XCAST_MODE_PROMISC);
 
 	/* Add filter for VLAN 0 */
-	hw->mac.ops.update_vlan(hw, 0, 0, true);
+	fm10k_queue_vlan_request(interface, 0, 0, true);
 
 	/* update table with current entries */
 	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 1;
 	     vid < VLAN_N_VID;
 	     vid = fm10k_find_next_vlan(interface, vid)) {
-		hw->mac.ops.update_vlan(hw, vid, 0, true);
+		fm10k_queue_vlan_request(interface, vid, 0, true);
 
-		/* Update unicast entries if host's mailbox is ready */
-		if (fm10k_host_mbx_ready(interface))
-			hw->mac.ops.update_uc_addr(hw, glort, hw->mac.addr,
-						   vid, true, 0);
+		fm10k_queue_mac_request(interface, glort,
+					hw->mac.addr, vid, true);
 	}
 
 	/* update xcast mode before synchronizing addresses if host's mailbox
@@ -1140,6 +1248,13 @@ void fm10k_reset_rx_state(struct fm10k_intfc *interface)
 	struct net_device *netdev = interface->netdev;
 	struct fm10k_hw *hw = &interface->hw;
 
+	/* Wait for MAC/VLAN work to finish */
+	while (test_bit(__FM10K_MACVLAN_SCHED, interface->state))
+		usleep_range(1000, 2000);
+
+	/* Cancel pending MAC/VLAN requests */
+	fm10k_clear_macvlan_queue(interface, interface->glort, true);
+
 	fm10k_mbx_lock(interface);
 
 	/* clear the logical port state on lower device if host's mailbox is
@@ -1372,8 +1487,8 @@ static void *fm10k_dfwd_add_station(struct net_device *dev,
 	if (fm10k_host_mbx_ready(interface)) {
 		hw->mac.ops.update_xcast_mode(hw, glort,
 					      FM10K_XCAST_MODE_MULTI);
-		hw->mac.ops.update_uc_addr(hw, glort, sdev->dev_addr,
-					   0, true, 0);
+		fm10k_queue_mac_request(interface, glort, sdev->dev_addr,
+					0, true);
 	}
 
 	fm10k_mbx_unlock(interface);
@@ -1412,8 +1527,8 @@ static void fm10k_dfwd_del_station(struct net_device *dev, void *priv)
 	if (fm10k_host_mbx_ready(interface)) {
 		hw->mac.ops.update_xcast_mode(hw, glort,
 					      FM10K_XCAST_MODE_NONE);
-		hw->mac.ops.update_uc_addr(hw, glort, sdev->dev_addr,
-					   0, false, 0);
+		fm10k_queue_mac_request(interface, glort, sdev->dev_addr,
+					0, false);
 	}
 
 	fm10k_mbx_unlock(interface);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index d14cfe76e58b..a54c07eb374e 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -91,6 +91,65 @@ static int fm10k_hw_ready(struct fm10k_intfc *interface)
 	return FM10K_REMOVED(hw->hw_addr) ? -ENODEV : 0;
 }
 
+/**
+ * fm10k_macvlan_schedule - Schedule MAC/VLAN queue task
+ * @interface: fm10k private interface structure
+ *
+ * Schedule the MAC/VLAN queue monitor task. If the MAC/VLAN task cannot be
+ * started immediately, request that it be restarted when possible.
+ */
+void fm10k_macvlan_schedule(struct fm10k_intfc *interface)
+{
+	/* Avoid processing the MAC/VLAN queue when the service task is
+	 * disabled, or when we're resetting the device.
+	 */
+	if (!test_bit(__FM10K_SERVICE_DISABLE, interface->state) &&
+	    !test_bit(__FM10K_RESETTING, interface->state) &&
+	    !test_and_set_bit(__FM10K_MACVLAN_SCHED, interface->state)) {
+		clear_bit(__FM10K_MACVLAN_REQUEST, interface->state);
+		/* We delay the actual start of execution in order to allow
+		 * multiple MAC/VLAN updates to accumulate before handling
+		 * them, and to allow some time to let the mailbox drain
+		 * between runs.
+		 */
+		queue_delayed_work(fm10k_workqueue,
+				   &interface->macvlan_task, 10);
+	} else {
+		set_bit(__FM10K_MACVLAN_REQUEST, interface->state);
+	}
+}
+
+/**
+ * fm10k_stop_macvlan_task - Stop the MAC/VLAN queue monitor
+ * @interface: fm10k private interface structure
+ *
+ * Wait until the MAC/VLAN queue task has stopped, and cancel any future
+ * requests. Expects to be called with either the __FM10K_SERVICE_DISABLE or
+ * __FM10K_RESETTING status bits set, as otherwise the task may be rescheduled
+ * at any time.
+ *
+ * There is no fm10k_start_macvlan_task as there is more than one flow for
+ * re-enabling the task.
+ */
+static void fm10k_stop_macvlan_task(struct fm10k_intfc *interface)
+{
+	/* It is a bug to call this function except when we're resetting or
+	 * the service task is disabled.
+	 */
+	WARN_ON(!test_bit(__FM10K_SERVICE_DISABLE, interface->state) &&
+		!test_bit(__FM10K_RESETTING, interface->state));
+
+	cancel_delayed_work_sync(&interface->macvlan_task);
+
+	/* We set the __FM10K_MACVLAN_SCHED bit when we schedule the task.
+	 * However, it may not be unset if the MAC/VLAN task never actually
+	 * got a chance to run. Since we've canceled the task here, and it
+	 * cannot be rescheduled right now, we need to ensure the scheduled bit
+	 * gets unset.
+	 */
+	clear_bit(__FM10K_MACVLAN_SCHED, interface->state);
+}
+
 void fm10k_service_event_schedule(struct fm10k_intfc *interface)
 {
 	if (!test_bit(__FM10K_SERVICE_DISABLE, interface->state) &&
@@ -166,6 +225,12 @@ static void fm10k_prepare_for_reset(struct fm10k_intfc *interface)
 	if (test_and_set_bit(__FM10K_RESETTING, interface->state))
 		return;
 
+	/* As the MAC/VLAN task will be accessing registers it must not be
+	 * running while we reset. Although the task will not be scheduled
+	 * once we start resetting it may already be running
+	 */
+	fm10k_stop_macvlan_task(interface);
+
 	rtnl_lock();
 
 	fm10k_iov_suspend(interface->pdev);
@@ -250,6 +315,12 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface)
 
 	clear_bit(__FM10K_RESETTING, interface->state);
 
+	/* We might have received a MAC/VLAN request while resetting. If so,
+	 * kick off the queue now.
+	 */
+	if (test_bit(__FM10K_MACVLAN_REQUEST, interface->state))
+		fm10k_macvlan_schedule(interface);
+
 	return err;
 err_open:
 	fm10k_mbx_free_irq(interface);
@@ -658,6 +729,112 @@ static void fm10k_service_task(struct work_struct *work)
 	fm10k_service_event_complete(interface);
 }
 
+/**
+ * fm10k_macvlan_task - send queued MAC/VLAN requests to switch manager
+ * @work: pointer to work_struct containing our data
+ *
+ * This work item handles sending MAC/VLAN updates to the switch manager. When
+ * the interface is up, it will attempt to queue mailbox messages to the
+ * switch manager requesting updates for MAC/VLAN pairs. If the Tx fifo of the
+ * mailbox is full, it will reschedule itself to try again in a short while.
+ * This ensures that the driver does not overload the switch mailbox with too
+ * many simultaneous requests, causing an unnecessary reset.
+ **/
+static void fm10k_macvlan_task(struct work_struct *work)
+{
+	struct fm10k_macvlan_request *item;
+	struct fm10k_intfc *interface;
+	struct delayed_work *dwork;
+	struct list_head *requests;
+	struct fm10k_hw *hw;
+	unsigned long flags;
+
+	dwork = to_delayed_work(work);
+	interface = container_of(dwork, struct fm10k_intfc, macvlan_task);
+	hw = &interface->hw;
+	requests = &interface->macvlan_requests;
+
+	do {
+		/* Pop the first item off the list */
+		spin_lock_irqsave(&interface->macvlan_lock, flags);
+		item = list_first_entry_or_null(requests,
+						struct fm10k_macvlan_request,
+						list);
+		if (item)
+			list_del_init(&item->list);
+
+		spin_unlock_irqrestore(&interface->macvlan_lock, flags);
+
+		/* We have no more items to process */
+		if (!item)
+			goto done;
+
+		fm10k_mbx_lock(interface);
+
+		/* Check that we have plenty of space to send the message. We
+		 * want to ensure that the mailbox stays low enough to avoid a
+		 * change in the host state, otherwise we may see spurious
+		 * link up / link down notifications.
+		 */
+		if (!hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU + 5)) {
+			hw->mbx.ops.process(hw, &hw->mbx);
+			set_bit(__FM10K_MACVLAN_REQUEST, interface->state);
+			fm10k_mbx_unlock(interface);
+
+			/* Put the request back on the list */
+			spin_lock_irqsave(&interface->macvlan_lock, flags);
+			list_add(&item->list, requests);
+			spin_unlock_irqrestore(&interface->macvlan_lock, flags);
+			break;
+		}
+
+		switch (item->type) {
+		case FM10K_MC_MAC_REQUEST:
+			hw->mac.ops.update_mc_addr(hw,
+						   item->mac.glort,
+						   item->mac.addr,
+						   item->mac.vid,
+						   item->set);
+			break;
+		case FM10K_UC_MAC_REQUEST:
+			hw->mac.ops.update_uc_addr(hw,
+						   item->mac.glort,
+						   item->mac.addr,
+						   item->mac.vid,
+						   item->set,
+						   0);
+			break;
+		case FM10K_VLAN_REQUEST:
+			hw->mac.ops.update_vlan(hw,
+						item->vlan.vid,
+						item->vlan.vsi,
+						item->set);
+			break;
+		default:
+			break;
+		}
+
+		fm10k_mbx_unlock(interface);
+
+		/* Free the item now that we've sent the update */
+		kfree(item);
+	} while (true);
+
+done:
+	WARN_ON(!test_bit(__FM10K_MACVLAN_SCHED, interface->state));
+
+	/* flush memory to make sure state is correct */
+	smp_mb__before_atomic();
+	clear_bit(__FM10K_MACVLAN_SCHED, interface->state);
+
+	/* If a MAC/VLAN request was scheduled since we started, we should
+	 * re-schedule. However, there is no reason to re-schedule if there is
+	 * no work to do.
+	 */
+	if (test_bit(__FM10K_MACVLAN_REQUEST, interface->state))
+		fm10k_macvlan_schedule(interface);
+}
+
 /**
  * fm10k_configure_tx_ring - Configure Tx ring after Reset
  * @interface: board private structure
@@ -1890,11 +2067,15 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 	INIT_LIST_HEAD(&interface->vxlan_port);
 	INIT_LIST_HEAD(&interface->geneve_port);
 
+	/* Initialize the MAC/VLAN queue */
+	INIT_LIST_HEAD(&interface->macvlan_requests);
+
 	netdev_rss_key_fill(rss_key, sizeof(rss_key));
 	memcpy(interface->rssrk, rss_key, sizeof(rss_key));
 
 	/* Initialize the mailbox lock */
 	spin_lock_init(&interface->mbx_lock);
+	spin_lock_init(&interface->macvlan_lock);
 
 	/* Start off interface as being down */
 	set_bit(__FM10K_DOWN, interface->state);
@@ -2103,6 +2284,9 @@ static int fm10k_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		    (unsigned long)interface);
 	INIT_WORK(&interface->service_task, fm10k_service_task);
 
+	/* Setup the MAC/VLAN queue */
+	INIT_DELAYED_WORK(&interface->macvlan_task, fm10k_macvlan_task);
+
 	/* kick off service timer now, even when interface is down */
 	mod_timer(&interface->service_timer, (HZ * 2) + jiffies);
 
@@ -2156,6 +2340,10 @@ static void fm10k_remove(struct pci_dev *pdev)
 	del_timer_sync(&interface->service_timer);
 
 	fm10k_stop_service_event(interface);
+	fm10k_stop_macvlan_task(interface);
+
+	/* Remove all pending MAC/VLAN requests */
+	fm10k_clear_macvlan_queue(interface, interface->glort, true);
 
 	/* free netdev, this may bounce the interrupts due to setup_tc */
 	if (netdev->reg_state == NETREG_REGISTERED)
@@ -2192,6 +2380,9 @@ static void fm10k_prepare_suspend(struct fm10k_intfc *interface)
 	 * a surprise remove if the PCIe device is disabled while we're
 	 * stopped. We stop the watchdog task until after we resume software
 	 * activity.
+	 *
+	 * Note that the MAC/VLAN task will be stopped as part of preparing
+	 * for reset so we don't need to handle it here.
 	 */
 	fm10k_stop_service_event(interface);
 
@@ -2223,6 +2414,9 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
 	/* restart the service task */
 	fm10k_start_service_event(interface);
 
+	/* Restart the MAC/VLAN request queue in case of outstanding events */
+	fm10k_macvlan_schedule(interface);
+
 	return err;
 }
 
-- 
2.13.0.598.gf927b9495246
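
The SCHED/REQUEST handshake used by fm10k_macvlan_schedule() and the tail of
fm10k_macvlan_task() can be sketched in userspace C. This is only an
illustration under stated assumptions: the names (task_state, schedule,
task_done, ST_*) are hypothetical, C11 atomics stand in for the kernel bit
operations, and a counter stands in for queue_delayed_work().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits; they mirror the roles of the driver's
 * SERVICE_DISABLE/RESETTING, MACVLAN_SCHED and MACVLAN_REQUEST bits.
 */
#define ST_DISABLED (1u << 0) /* task must not be started right now */
#define ST_SCHED    (1u << 1) /* task is queued or running */
#define ST_REQUEST  (1u << 2) /* a run was requested but deferred */

struct task_state {
	atomic_uint bits;
	int queued; /* stand-in for the number of times work was queued */
};

/* Queue the task if allowed and not already scheduled; otherwise record
 * the request so that a later call can pick it up.
 */
static bool schedule(struct task_state *t)
{
	if (!(atomic_load(&t->bits) & ST_DISABLED)) {
		unsigned int old = atomic_fetch_or(&t->bits, ST_SCHED);

		if (!(old & ST_SCHED)) {
			atomic_fetch_and(&t->bits, ~ST_REQUEST);
			t->queued++;
			return true;
		}
	}
	atomic_fetch_or(&t->bits, ST_REQUEST);
	return false;
}

/* Called at the end of a task run: drop the scheduled bit, then re-arm
 * only if somebody asked for another run in the meantime.
 */
static void task_done(struct task_state *t)
{
	assert(atomic_load(&t->bits) & ST_SCHED);
	atomic_fetch_and(&t->bits, ~ST_SCHED);
	if (atomic_load(&t->bits) & ST_REQUEST)
		schedule(t);
}
```

A second schedule() call while the task is pending only sets the request
bit, so at most one instance is queued at a time and no request is lost.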



* [Intel-wired-lan] [PATCH 15/15] fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests
  2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
                   ` (13 preceding siblings ...)
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages Jacob Keller
@ 2017-06-01 22:40 ` Jacob Keller
  14 siblings, 0 replies; 18+ messages in thread
From: Jacob Keller @ 2017-06-01 22:40 UTC (permalink / raw)
  To: intel-wired-lan

Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages
from the netdev, replace the default handler for the VF<->PF messages.
This new handler is very similar to the default code, but uses the
MAC/VLAN queue instead of sending the message directly. Unfortunately we
can't easily re-use the default code, so we'll just replace the entire
function.

This ensures that a VF requesting a large number of VLANs or MAC
addresses does not start a reset cycle, as explained in the commit which
introduced the message queue.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kwan, Ngai-mint <ngai-mint.kwan@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 132 ++++++++++++++++++++++++++-
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c  |   2 +-
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h  |   3 +-
 3 files changed, 133 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index 03897720bf0b..7e53521667b8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -35,10 +35,133 @@ static s32 fm10k_iov_msg_error(struct fm10k_hw *hw, u32 **results,
 	return fm10k_tlv_msg_error(hw, results, mbx);
 }
 
+/**
+ *  fm10k_iov_msg_queue_mac_vlan - Message handler for MAC/VLAN request from VF
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to message, results[0] is pointer to message
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This function is a custom handler for MAC/VLAN requests from the VF. The
+ *  assumption is that it is acceptable to directly hand off the message from
+ *  the VF to the PF's switch manager. However, we use a MAC/VLAN message
+ *  queue to avoid overloading the mailbox when a large number of requests
+ *  come in.
+ **/
+static s32 fm10k_iov_msg_queue_mac_vlan(struct fm10k_hw *hw, u32 **results,
+					struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
+	struct fm10k_intfc *interface = hw->back;
+	u8 mac[ETH_ALEN];
+	u32 *result;
+	int err = 0;
+	bool set;
+	u16 vlan;
+	u32 vid;
+
+	/* we shouldn't be updating rules on a disabled interface */
+	if (!FM10K_VF_FLAG_ENABLED(vf_info))
+		err = FM10K_ERR_PARAM;
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_VLAN]) {
+		result = results[FM10K_MAC_VLAN_MSG_VLAN];
+
+		/* record VLAN id requested */
+		err = fm10k_tlv_attr_get_u32(result, &vid);
+		if (err)
+			return err;
+
+		set = !(vid & FM10K_VLAN_CLEAR);
+		vid &= ~FM10K_VLAN_CLEAR;
+
+		/* if the length field has been set, this is a multi-bit
+		 * update request. For multi-bit requests, simply disallow
+		 * them when the pf_vid has been set. In this case, the PF
+		 * should have already cleared the VLAN_TABLE, and if we
+		 * allowed them, it could allow a rogue VF to receive traffic
+		 * on a VLAN it was not assigned. In the single-bit case, we
+		 * need to modify requests for VLAN 0 to use the default PF or
+		 * SW vid when assigned.
+		 */
+
+		if (vid >> 16) {
+			/* prevent multi-bit requests when PF has
+			 * administratively set the VLAN for this VF
+			 */
+			if (vf_info->pf_vid)
+				return FM10K_ERR_PARAM;
+		} else {
+			err = fm10k_iov_select_vid(vf_info, (u16)vid);
+			if (err < 0)
+				return err;
+
+			vid = err;
+		}
+
+		/* update VSI info for VF in regards to VLAN table */
+		err = hw->mac.ops.update_vlan(hw, vid, vf_info->vsi, set);
+	}
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_MAC]) {
+		result = results[FM10K_MAC_VLAN_MSG_MAC];
+
+		/* record unicast MAC address requested */
+		err = fm10k_tlv_attr_get_mac_vlan(result, mac, &vlan);
+		if (err)
+			return err;
+
+		/* block attempts to set MAC for a locked device */
+		if (is_valid_ether_addr(vf_info->mac) &&
+		    !ether_addr_equal(mac, vf_info->mac))
+			return FM10K_ERR_PARAM;
+
+		set = !(vlan & FM10K_VLAN_CLEAR);
+		vlan &= ~FM10K_VLAN_CLEAR;
+
+		err = fm10k_iov_select_vid(vf_info, vlan);
+		if (err < 0)
+			return err;
+
+		vlan = (u16)err;
+
+		/* Add this request to the MAC/VLAN queue */
+		err = fm10k_queue_mac_request(interface, vf_info->glort,
+					      mac, vlan, set);
+	}
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_MULTICAST]) {
+		result = results[FM10K_MAC_VLAN_MSG_MULTICAST];
+
+		/* record multicast MAC address requested */
+		err = fm10k_tlv_attr_get_mac_vlan(result, mac, &vlan);
+		if (err)
+			return err;
+
+		/* verify that the VF is allowed to request multicast */
+		if (!(vf_info->vf_flags & FM10K_VF_FLAG_MULTI_ENABLED))
+			return FM10K_ERR_PARAM;
+
+		set = !(vlan & FM10K_VLAN_CLEAR);
+		vlan &= ~FM10K_VLAN_CLEAR;
+
+		err = fm10k_iov_select_vid(vf_info, vlan);
+		if (err < 0)
+			return err;
+
+		vlan = (u16)err;
+
+		/* Add this request to the MAC/VLAN queue */
+		err = fm10k_queue_mac_request(interface, vf_info->glort,
+					      mac, vlan, set);
+	}
+
+	return err;
+}
+
 static const struct fm10k_msg_data iov_mbx_data[] = {
 	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
 	FM10K_VF_MSG_MSIX_HANDLER(fm10k_iov_msg_msix_pf),
-	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_iov_msg_mac_vlan_pf),
+	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_iov_msg_queue_mac_vlan),
 	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_iov_msg_lport_state_pf),
 	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_iov_msg_error),
 };
@@ -126,8 +249,10 @@ s32 fm10k_iov_mbx(struct fm10k_intfc *interface)
 		hw->mbx.ops.process(hw, &hw->mbx);
 
 		/* verify port mapping is valid, if not reset port */
-		if (vf_info->vf_flags && !fm10k_glort_valid_pf(hw, glort))
+		if (vf_info->vf_flags && !fm10k_glort_valid_pf(hw, glort)) {
 			hw->iov.ops.reset_lport(hw, vf_info);
+			fm10k_clear_macvlan_queue(interface, glort, false);
+		}
 
 		/* reset VFs that have mailbox timed out */
 		if (!mbx->timeout) {
@@ -190,6 +315,7 @@ void fm10k_iov_suspend(struct pci_dev *pdev)
 
 		hw->iov.ops.reset_resources(hw, vf_info);
 		hw->iov.ops.reset_lport(hw, vf_info);
+		fm10k_clear_macvlan_queue(interface, vf_info->glort, false);
 	}
 }
 
@@ -414,6 +540,8 @@ static inline void fm10k_reset_vf_info(struct fm10k_intfc *interface,
 	/* disable LPORT for this VF which clears switch rules */
 	hw->iov.ops.reset_lport(hw, vf_info);
 
+	fm10k_clear_macvlan_queue(interface, vf_info->glort, false);
+
 	/* assign new MAC+VLAN for this VF */
 	hw->iov.ops.assign_default_mac_vlan(hw, vf_info);
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 9e4fb3a44376..425d814aed4d 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -1186,7 +1186,7 @@ s32 fm10k_iov_msg_msix_pf(struct fm10k_hw *hw, u32 **results,
  * Will report an error if the VLAN ID is out of range. For VID = 0, it will
  * return either the pf_vid or sw_vid depending on which one is set.
  */
-static s32 fm10k_iov_select_vid(struct fm10k_vf_info *vf_info, u16 vid)
+s32 fm10k_iov_select_vid(struct fm10k_vf_info *vf_info, u16 vid)
 {
 	if (!vid)
 		return vf_info->pf_vid ? vf_info->pf_vid : vf_info->sw_vid;
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
index 3336d3c10760..e04d41f1a532 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
@@ -1,5 +1,5 @@
 /* Intel(R) Ethernet Switch Host Interface Driver
- * Copyright(c) 2013 - 2016 Intel Corporation.
+ * Copyright(c) 2013 - 2017 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -114,6 +114,7 @@ extern const struct fm10k_tlv_attr fm10k_err_msg_attr[];
 #define FM10K_PF_MSG_ERR_HANDLER(msg, func) \
 	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_##msg, fm10k_err_msg_attr, func)
 
+s32 fm10k_iov_select_vid(struct fm10k_vf_info *vf_info, u16 vid);
 s32 fm10k_iov_msg_msix_pf(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
 s32 fm10k_iov_msg_mac_vlan_pf(struct fm10k_hw *, u32 **,
 			      struct fm10k_mbx_info *);
-- 
2.13.0.598.gf927b9495246



* [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages Jacob Keller
@ 2017-06-16 22:45   ` Keller, Jacob E
  0 siblings, 0 replies; 18+ messages in thread
From: Keller, Jacob E @ 2017-06-16 22:45 UTC (permalink / raw)
  To: intel-wired-lan



> -----Original Message-----
> From: Keller, Jacob E
> Sent: Thursday, June 01, 2017 3:41 PM
> To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>
> Subject: [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN
> messages
> 
> Under some circumstances, when dealing with a large number of MAC
> address or VLAN updates at once, the fm10k driver, particularly the VFs,
> can overload the mailbox with too many messages at once.
> 
> This results in a mailbox timeout, which causes the driver to initiate
> a reset. During the reset, we re-send all the same messages that
> originally caused the timeout. This results in a cycle of resets each
> triggering a future reset.
> 
> To fix or avoid this, we introduce a workqueue item which monitors
> a queue of MAC and VLAN requests. These requests are queued to the end
> of the list, and we periodically process them in FIFO order.
> 
> Initially we only handle requests for the netdev, but we do handle
> unicast MAC addresses, multicast MAC addresses, and update VLAN
> requests.
> 
> A future patch will add support to use this queue for handling MAC
> update requests from the VF<->PF mailbox.
> 
> The MAC/VLAN work item will keep checking to make sure that each request
> does not overflow the mailbox and cause a timeout. If it might, then the
> work item will reschedule itself a short time later. This avoids any
> reset cycle, since we never send the message if the mailbox is not
> ready.
> 
> As an alternative, we tried increasing the mailbox message FIFO, but
> this just delays the problem and results in needless memory waste on the
> system. Our new message queue is dynamically allocated so only uses as
> much memory as it needs. Additionally, it need not be contiguous like
> the Tx and Rx FIFOs.
> 
> Note that this patch chose to only create a queue for MAC and VLAN
> messages, since these are the only messages sent in a large enough
> volume to cause the reset loop. Other messages are very unlikely to
> overflow the mailbox Tx fifo so easily.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---

Validation found a bug in this patch. I'll be sending a v2 (of only this
patch) that should apply in place of the current version on your queue.

Regards,
Jake
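
The deferral idea described in the commit message above (queue at the tail,
send only while the mailbox has room, leave the rest for a later run) can be
sketched as follows. This is a userspace illustration, not the driver
implementation: struct request, struct queue, struct mailbox and the helper
names are all hypothetical, and "sending" just bumps a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the fm10k driver. */
struct request {
	int id;
	struct request *next;
};

struct queue {
	struct request *head;
	struct request *tail;
};

struct mailbox {
	size_t used;     /* slots currently occupied in the Tx FIFO */
	size_t capacity; /* total slots in the Tx FIFO */
};

/* Append a request at the tail so requests are handled in FIFO order. */
static bool queue_push(struct queue *q, int id)
{
	struct request *r = malloc(sizeof(*r));

	if (!r)
		return false;
	r->id = id;
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	return true;
}

/* True when the mailbox can accept a message of 'len' slots. */
static bool mbx_tx_ready(const struct mailbox *m, size_t len)
{
	return m->capacity - m->used >= len;
}

/* Send queued requests until the queue is empty or the mailbox lacks
 * room. Returns the number sent; the remainder stays queued.
 */
static size_t queue_drain(struct queue *q, struct mailbox *m, size_t msg_len)
{
	size_t sent = 0;

	assert(msg_len > 0);
	while (q->head && mbx_tx_ready(m, msg_len)) {
		struct request *r = q->head;

		q->head = r->next;
		if (!q->head)
			q->tail = NULL;
		m->used += msg_len; /* stand-in for sending the message */
		free(r);
		sent++;
	}
	return sent;
}
```

Because a request is never sent unless the space check passes, the mailbox
cannot be overrun by a burst; the burst simply waits in the list.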



* [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link
  2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link Jacob Keller
@ 2017-07-10 20:02   ` Keller, Jacob E
  0 siblings, 0 replies; 18+ messages in thread
From: Keller, Jacob E @ 2017-07-10 20:02 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Keller, Jacob E
> Sent: Thursday, June 01, 2017 3:41 PM
> To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>
> Subject: [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link
> 
> If we lose PCIe link, such as when an unannounced PFLR event occurs, or
> when a device is surprise removed, we currently detach the device and
> close the netdev. This unfortunately leaves a lot of things still
> active, such as the msix_mbx_pf IRQ, and Tx/Rx resources.
> 
> This can cause problems because the register reads will return
> potentially invalid values which may result in unknown driver behavior.
> 
> Begin the process of resetting using fm10k_prepare_for_reset(), much in
> the same way as the suspend and resume cycle does. This will attempt to
> shut down as much as possible, in order to prevent possible issues.
> 
> Since the __FM10K_RESETTING state is long lived, we'll also stop waiting
> for it when we get to the fm10k_reset_subtask. This is important since
> otherwise it would deadlock with the fm10k_detach_subtask. Additionally,
> stop attempting to manage the mailbox subtask if we're
> detached/resetting, as there is nothing to do when we don't have a PCIe
> address.
> 
> Overall this produces a much cleaner shutdown and recovery cycle for
> a PCIe surprise remove event.
> 

This patch suffers from a problem we discovered while testing. I have some
fixes proposed, and have marked the necessary patches as changes requested.
I'll be submitting a new version soon. It will be labeled v3 since one of
the patches already had a v2.

Thanks,
Jake

> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 39 +++++++++++++++++++---------
>  1 file changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> index 6a7b4c5429ae..2d94a16f9613 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> @@ -141,8 +141,9 @@ static void fm10k_prepare_for_reset(struct fm10k_intfc *interface)
>  	/* put off any impending NetWatchDogTimeout */
>  	netif_trans_update(netdev);
> 
> -	while (test_and_set_bit(__FM10K_RESETTING, interface->state))
> -		usleep_range(1000, 2000);
> +	/* Nothing to do if a reset is already in progress */
> +	if (test_and_set_bit(__FM10K_RESETTING, interface->state))
> +		return;
> 
>  	rtnl_lock();
> 
> @@ -168,6 +169,8 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface)
>  	struct fm10k_hw *hw = &interface->hw;
>  	int err;
> 
> +	WARN_ON(!test_bit(__FM10K_RESETTING, interface->state));
> +
>  	rtnl_lock();
> 
>  	pci_set_master(interface->pdev);
> @@ -247,27 +250,33 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface)
>  	u32 __iomem *hw_addr;
>  	u32 value;
> 
> -	/* do nothing if device is still present or hw_addr is set */
> +	/* do nothing if netdev is still present or hw_addr is set */
>  	if (netif_device_present(netdev) || interface->hw.hw_addr)
>  		return;
> 
> +	/* We've lost the PCIe register space, and can no longer access the
> +	 * device. Shut everything except the detach subtask down and prepare
> +	 * to reset the device in case we recover.
> +	 */
> +	fm10k_prepare_for_reset(interface);
> +
>  	/* check the real address space to see if we've recovered */
>  	hw_addr = READ_ONCE(interface->uc_addr);
>  	value = readl(hw_addr);
>  	if (~value) {
> +		/* Restore the hardware address */
>  		interface->hw.hw_addr = interface->uc_addr;
> +
> +		/* PCIe link has been restored, and the device is active
> +		 * again. Restore everything and reset the device.
> +		 */
> +		fm10k_handle_reset(interface);
> +
> +		/* Re-attach the netdev */
>  		netif_device_attach(netdev);
> -		set_bit(FM10K_FLAG_RESET_REQUESTED, interface->flags);
> 		netdev_warn(netdev, "PCIe link restored, device now attached\n");
>  		return;
>  	}
> -
> -	rtnl_lock();
> -
> -	if (netif_running(netdev))
> -		dev_close(netdev);
> -
> -	rtnl_unlock();
>  }
> 
>  static void fm10k_reinit(struct fm10k_intfc *interface)
> @@ -360,6 +369,10 @@ static void fm10k_watchdog_update_host_state(struct fm10k_intfc *interface)
>   **/
>  static void fm10k_mbx_subtask(struct fm10k_intfc *interface)
>  {
> +	/* If we're resetting, bail out */
> +	if (test_bit(__FM10K_RESETTING, interface->state))
> +		return;
> +
>  	/* process upstream mailbox and update device state */
>  	fm10k_watchdog_update_host_state(interface);
> 
> @@ -609,9 +622,11 @@ static void fm10k_service_task(struct work_struct *work)
> 
>  	interface = container_of(work, struct fm10k_intfc, service_task);
> 
> +	/* Check whether we're detached first */
> +	fm10k_detach_subtask(interface);
> +
>  	/* tasks run even when interface is down */
>  	fm10k_mbx_subtask(interface);
> -	fm10k_detach_subtask(interface);
>  	fm10k_reset_subtask(interface);
> 
>  	/* tasks only run when interface is up */
> --
> 2.13.0.598.gf927b9495246
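
Two ideas in the quoted patch lend themselves to a small sketch: treating an
all-ones register read as "device not there" (the ~value test), and making
the prepare step a no-op when a reset is already in progress instead of
spinning on the bit. The code below is a userspace approximation only; the
names dev_state, prepare_for_reset, finish_reset and register_space_present
are hypothetical, and C11 atomics replace the kernel bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define STATE_RESETTING (1u << 0) /* hypothetical "reset in progress" bit */

struct dev_state {
	atomic_uint bits;
	int teardown_count; /* stand-in for the real teardown work */
};

/* Tear down once; later callers return early rather than waiting. */
static void prepare_for_reset(struct dev_state *s)
{
	if (atomic_fetch_or(&s->bits, STATE_RESETTING) & STATE_RESETTING)
		return;
	s->teardown_count++;
}

/* Only valid after prepare_for_reset(); clears the in-progress bit. */
static void finish_reset(struct dev_state *s)
{
	assert(atomic_load(&s->bits) & STATE_RESETTING);
	atomic_fetch_and(&s->bits, ~STATE_RESETTING);
}

/* A read of all ones is taken to mean the register space is unreachable,
 * which is the same test the detach subtask applies with ~value.
 */
static bool register_space_present(uint32_t value)
{
	return ~value != 0;
}
```

Since the resetting bit can stay set for as long as the device is gone, an
early return avoids the deadlock that a wait loop would cause when the
detach path and the reset path both want to prepare.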



end of thread (last message: 2017-07-10 20:02 UTC)

Thread overview: 18+ messages
2017-06-01 22:40 [Intel-wired-lan] [PATCH 00/15] harden fm10k driver against PFLR and surprise remove Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 01/15] fm10k: stop spurious link down messages when Tx FIFO is full Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 02/15] fm10k: fix typos on fall through comments Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 03/15] fm10k: avoid possible truncation of q_vector->name Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 04/15] fm10k: add missing fall through comment Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 05/15] fm10k: avoid needless delay when loading driver Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 06/15] fm10k: simplify reading PFVFLRE register Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 07/15] fm10k: don't loop while resetting VFs due to VFLR event Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 08/15] fm10k: avoid divide by zero in rare cases when device is resetting Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 09/15] fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link Jacob Keller
2017-07-10 20:02   ` Keller, Jacob E
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 11/15] fm10k: prevent race condition of __FM10K_SERVICE_SCHED Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 12/15] fm10k: use spinlock to implement mailbox lock Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 13/15] fm10k: use generic PM hooks instead of legacy PCIe power hooks Jacob Keller
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 14/15] fm10k: introduce a message queue for MAC/VLAN messages Jacob Keller
2017-06-16 22:45   ` Keller, Jacob E
2017-06-01 22:40 ` [Intel-wired-lan] [PATCH 15/15] fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Jacob Keller
