[net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10


* [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10
@ 2017-10-10 17:21 Jeff Kirsher
  2017-10-10 17:21 ` [net-next 1/9] e1000e: Fix error path in link detection Jeff Kirsher
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to e1000e and igb.

Benjamin Poirier provides several fixes for e1000e, starting with a
correction to the link detection error path, which reported success even
when reading the PHY failed.  He also fixes a code comment to reflect the
actual code behavior, fixes a conditional so that it tests for the correct
return value, and fixes a potential race condition reported by Lennart
Sorensen, where the single flag get_link_status is used to signal two
different states.  Finally, he avoids bursts of receiver overrun
interrupts.

Sasha fixes a buffer overrun for i219 devices, where the chipset had
reduced the round-trip latency for the LAN controller DMA accesses
which in some high performance cases caused a buffer overrun while
processing the DMA transactions.

Willem de Bruijn changes e1000e to apply the burst mode settings only
as defaults, so that they no longer override a receive interrupt delay
(RxIntDelay) that the user specified explicitly.

Florian Fainelli updates the driver to differentiate between
e1000_put_txbuf() being called from normal reclamation and being called
after a DMA mapping failure, to make the driver more "drop monitor
friendly".

Christophe JAILLET fixes a potential NULL pointer dereference by
properly returning -ENOMEM on memory allocation failures.

The following are changes since commit 812b5ca7d376e7e008ac0c897d1ef94eb05ddc3b:
  Add a driver for Renesas uPD60620 and uPD60620A PHYs
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Benjamin Poirier (5):
  e1000e: Fix error path in link detection
  e1000e: Fix wrong comment related to link detection
  e1000e: Fix return value test
  e1000e: Separate signaling for link check/link up
  e1000e: Avoid receiver overrun interrupt bursts

Christophe JAILLET (1):
  igb: check memory allocation failure

Florian Fainelli (1):
  e1000e: Be drop monitor friendly

Sasha Neftin (1):
  e1000e: fix buffer overrun while the I219 is processing DMA
    transactions

Willem de Bruijn (1):
  e1000e: apply burst mode settings only on default

 drivers/net/ethernet/intel/e1000e/defines.h |  1 +
 drivers/net/ethernet/intel/e1000e/e1000.h   |  4 --
 drivers/net/ethernet/intel/e1000e/mac.c     | 11 +++--
 drivers/net/ethernet/intel/e1000e/netdev.c  | 75 +++++++++++++++++------------
 drivers/net/ethernet/intel/e1000e/param.c   | 16 +++++-
 drivers/net/ethernet/intel/e1000e/phy.c     |  7 +--
 drivers/net/ethernet/intel/igb/igb_main.c   |  2 +
 7 files changed, 75 insertions(+), 41 deletions(-)

-- 
2.14.2

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [net-next 1/9] e1000e: Fix error path in link detection
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 2/9] e1000e: Fix wrong comment related to " Jeff Kirsher
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Benjamin Poirier <bpoirier@suse.com>

In case of error from e1e_rphy(), the loop will exit early and "success"
will be set to true erroneously.
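
The failure mode can be reproduced in a small user-space sketch. This is
not the driver code: read_status_fn is a hypothetical stand-in for
e1e_rphy(), LINK_BIT is an illustrative stand-in for BMSR_LSTATUS, and the
delay between iterations is left out. With the old logic, a read error on
the first iteration leaves the loop with i < iterations and is therefore
reported as link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LINK_BIT 0x0004 /* illustrative stand-in for BMSR_LSTATUS */

/* Hypothetical PHY status read: 0 on success, negative on error. */
typedef int (*read_status_fn)(unsigned short *status);

/* Old logic: success is inferred from the loop counter. */
static int has_link_old(read_status_fn read_status, unsigned int iterations,
			bool *success)
{
	unsigned short status = 0;
	unsigned int i;
	int ret = 0;

	assert(success != NULL);
	for (i = 0; i < iterations; i++) {
		ret = read_status(&status);
		if (ret)
			break;	/* error path also leaves with i < iterations */
		if (status & LINK_BIT)
			break;
	}
	*success = (i < iterations);
	return ret;
}

/* Fixed logic: success is set only when the link bit was actually seen. */
static int has_link_fixed(read_status_fn read_status, unsigned int iterations,
			  bool *success)
{
	unsigned short status = 0;
	unsigned int i;
	int ret = 0;

	assert(success != NULL);
	*success = false;
	for (i = 0; i < iterations; i++) {
		ret = read_status(&status);
		if (ret)
			break;
		if (status & LINK_BIT) {
			*success = true;
			break;
		}
	}
	return ret;
}

/* Test doubles for the hypothetical read function. */
static int read_fails(unsigned short *status)
{
	(void)status;
	return -2;
}

static int read_link_up(unsigned short *status)
{
	*status = LINK_BIT;
	return 0;
}
```

Calling has_link_old(read_fails, 3, &ok) returns the error code but leaves
ok set to true, while has_link_fixed() returns the same error code with ok
set to false.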

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/phy.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/phy.c b/drivers/net/ethernet/intel/e1000e/phy.c
index d78d47b41a71..86ff0969efb6 100644
--- a/drivers/net/ethernet/intel/e1000e/phy.c
+++ b/drivers/net/ethernet/intel/e1000e/phy.c
@@ -1744,6 +1744,7 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 iterations,
 	s32 ret_val = 0;
 	u16 i, phy_status;
 
+	*success = false;
 	for (i = 0; i < iterations; i++) {
 		/* Some PHYs require the MII_BMSR register to be read
 		 * twice due to the link bit being sticky.  No harm doing
@@ -1763,16 +1764,16 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 iterations,
 		ret_val = e1e_rphy(hw, MII_BMSR, &phy_status);
 		if (ret_val)
 			break;
-		if (phy_status & BMSR_LSTATUS)
+		if (phy_status & BMSR_LSTATUS) {
+			*success = true;
 			break;
+		}
 		if (usec_interval >= 1000)
 			msleep(usec_interval / 1000);
 		else
 			udelay(usec_interval);
 	}
 
-	*success = (i < iterations);
-
 	return ret_val;
 }
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 2/9] e1000e: Fix wrong comment related to link detection
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
  2017-10-10 17:21 ` [net-next 1/9] e1000e: Fix error path in link detection Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 3/9] e1000e: Fix return value test Jeff Kirsher
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Benjamin Poirier <bpoirier@suse.com>

Reading e1000e_check_for_copper_link() shows that get_link_status is set to
false only after link has been detected. Therefore, it stays true until
then, which is the opposite of what the comment said.

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 8436c5f2c3e8..ead4c112580e 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5074,7 +5074,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
 
 	/* get_link_status is set on LSC (link status) interrupt or
 	 * Rx sequence error interrupt.  get_link_status will stay
-	 * false until the check_for_link establishes link
+	 * true until the check_for_link establishes link
 	 * for copper adapters ONLY
 	 */
 	switch (hw->phy.media_type) {
@@ -5092,7 +5092,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
 		break;
 	case e1000_media_type_internal_serdes:
 		ret_val = hw->mac.ops.check_for_link(hw);
-		link_active = adapter->hw.mac.serdes_has_link;
+		link_active = hw->mac.serdes_has_link;
 		break;
 	default:
 	case e1000_media_type_unknown:
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 3/9] e1000e: Fix return value test
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
  2017-10-10 17:21 ` [net-next 1/9] e1000e: Fix error path in link detection Jeff Kirsher
  2017-10-10 17:21 ` [net-next 2/9] e1000e: Fix wrong comment related to " Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 4/9] e1000e: Separate signaling for link check/link up Jeff Kirsher
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Benjamin Poirier <bpoirier@suse.com>

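All the helpers return -E1000_ERR_PHY, so the test against the positive
E1000_ERR_PHY never matches. Compare against the negative value instead.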
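
A minimal sketch of the sign mismatch, assuming the helpers return the
negated error code as stated above. ERR_PHY and failing_phy_helper() are
hypothetical stand-ins, and the numeric value is only illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define ERR_PHY 2 /* assumed positive magnitude, standing in for E1000_ERR_PHY */

/* Hypothetical helper that follows the "return the negated code" convention. */
static int failing_phy_helper(void)
{
	return -ERR_PHY;
}

/* Old test: compares against the positive constant, so it never matches. */
static bool is_phy_error_old(int ret_val)
{
	return ret_val == ERR_PHY;
}

/* Fixed test: compares against the negated constant the helpers return. */
static bool is_phy_error_fixed(int ret_val)
{
	return ret_val == -ERR_PHY;
}

/* Usage: the same return value is missed by the old test, caught by the new. */
static void example(void)
{
	int ret_val = failing_phy_helper();

	assert(!is_phy_error_old(ret_val));
	assert(is_phy_error_fixed(ret_val));
}
```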

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index ead4c112580e..a740de6a30b0 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5099,7 +5099,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
 		break;
 	}
 
-	if ((ret_val == E1000_ERR_PHY) && (hw->phy.type == e1000_phy_igp_3) &&
+	if ((ret_val == -E1000_ERR_PHY) && (hw->phy.type == e1000_phy_igp_3) &&
 	    (er32(CTRL) & E1000_PHY_CTRL_GBE_DISABLE)) {
 		/* See e1000_kmrn_lock_loss_workaround_ich8lan() */
 		e_info("Gigabit has been disabled, downgrading speed\n");
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 4/9] e1000e: Separate signaling for link check/link up
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 3/9] e1000e: Fix return value test Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 5/9] e1000e: Avoid receiver overrun interrupt bursts Jeff Kirsher
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Benjamin Poirier <bpoirier@suse.com>

Lennart reported the following race condition:

\ e1000_watchdog_task
    \ e1000e_has_link
        \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
            /* link is up */
            mac->get_link_status = false;

                            /* interrupt */
                            \ e1000_msix_other
                                hw->mac.get_link_status = true;

        link_active = !hw->mac.get_link_status
        /* link_active is false, wrongly */

This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.

Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
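
The difference between the two approaches can be sketched in user space.
This is a simplified model, not the driver: struct mac_state,
racing_interrupt() and the irq_races parameter are hypothetical, and the
interrupt is simulated by a plain function call placed where the race
window sits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mac_state {
	bool get_link_status;	/* true: link status needs checking */
};

/* Sketch of the new convention: <0 error, 0 link down, 1 link up. */
static int check_for_link(struct mac_state *mac, bool phy_reports_link)
{
	assert(mac != NULL);
	if (!mac->get_link_status)
		return 1;	/* no change reported since link came up */
	if (!phy_reports_link)
		return 0;
	mac->get_link_status = false;
	return 1;
}

/* Simulates the interrupt handler firing right after the check. */
static void racing_interrupt(struct mac_state *mac)
{
	mac->get_link_status = true;
}

/* Old: link state is re-read from the shared flag after the check. */
static bool has_link_old(struct mac_state *mac, bool phy_reports_link,
			 bool irq_races)
{
	check_for_link(mac, phy_reports_link);
	if (irq_races)
		racing_interrupt(mac);
	return !mac->get_link_status;
}

/* New: link state is taken from the return value of the check. */
static bool has_link_new(struct mac_state *mac, bool phy_reports_link,
			 bool irq_races)
{
	int ret_val = check_for_link(mac, phy_reports_link);

	if (irq_races)
		racing_interrupt(mac);
	return ret_val > 0;
}
```

With link up and the simulated interrupt racing, has_link_old() reports no
link while has_link_new() reports link, and the flag remains set so the
next pass re-checks the status.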

Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/mac.c    | 11 ++++++++---
 drivers/net/ethernet/intel/e1000e/netdev.c |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
index b322011ec282..f457c5703d0c 100644
--- a/drivers/net/ethernet/intel/e1000e/mac.c
+++ b/drivers/net/ethernet/intel/e1000e/mac.c
@@ -410,6 +410,9 @@ void e1000e_clear_hw_cntrs_base(struct e1000_hw *hw)
  *  Checks to see of the link status of the hardware has changed.  If a
  *  change in link status has been detected, then we read the PHY registers
  *  to get the current speed/duplex if link exists.
+ *
+ *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 (link
+ *  up).
  **/
 s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 {
@@ -423,7 +426,7 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 	 * Change or Rx Sequence Error interrupt.
 	 */
 	if (!mac->get_link_status)
-		return 0;
+		return 1;
 
 	/* First we want to see if the MII Status Register reports
 	 * link.  If so, then we want to get the current speed/duplex
@@ -461,10 +464,12 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 	 * different link partner.
 	 */
 	ret_val = e1000e_config_fc_after_link_up(hw);
-	if (ret_val)
+	if (ret_val) {
 		e_dbg("Error configuring flow control\n");
+		return ret_val;
+	}
 
-	return ret_val;
+	return 1;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index a740de6a30b0..0a5f95ab0d3c 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5081,7 +5081,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
 	case e1000_media_type_copper:
 		if (hw->mac.get_link_status) {
 			ret_val = hw->mac.ops.check_for_link(hw);
-			link_active = !hw->mac.get_link_status;
+			link_active = ret_val > 0;
 		} else {
 			link_active = true;
 		}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 5/9] e1000e: Avoid receiver overrun interrupt bursts
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 4/9] e1000e: Separate signaling for link check/link up Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions Jeff Kirsher
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Benjamin Poirier <bpoirier@suse.com>

When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.

This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.

Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59bc33
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
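
The decision logic described above can be sketched as a pure function over
the ICR value. The two bit values are taken from this patch; struct
other_irq_result and handle_other_irq() are hypothetical names, and the
register writes, timer and NAPI calls of the real handler are reduced to
flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ICR_LSC 0x00000004u /* Link Status Change, value from the patch */
#define ICR_RXO 0x00000040u /* Receiver Overrun, value from the patch */

struct other_irq_result {
	bool schedule_napi;	/* poll must run so it can re-enable Other */
	bool check_link;	/* link status needs re-checking */
	bool reenable_other;	/* unmask the Other interrupt right away */
};

static struct other_irq_result handle_other_irq(uint32_t icr, bool going_down)
{
	struct other_irq_result r = { false, false, true };

	if (icr & ICR_RXO) {
		/* Overrun: leave Other masked until polling has finished. */
		r.schedule_napi = true;
		r.reenable_other = false;
	}
	if (icr & ICR_LSC)
		r.check_link = true;
	if (going_down)
		r.reenable_other = false;
	return r;
}

/* Usage: an overrun defers the unmask, a link change does not. */
static void example(void)
{
	struct other_irq_result r = handle_other_irq(ICR_RXO, false);

	assert(r.schedule_napi && !r.check_link && !r.reenable_other);
	r = handle_other_irq(ICR_LSC, false);
	assert(!r.schedule_napi && r.check_link && r.reenable_other);
}
```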

Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/defines.h |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c  | 33 ++++++++++++++++++++++-------
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 0641c0098738..afb7ebe20b24 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -398,6 +398,7 @@
 #define E1000_ICR_LSC           0x00000004 /* Link Status Change */
 #define E1000_ICR_RXSEQ         0x00000008 /* Rx sequence error */
 #define E1000_ICR_RXDMT0        0x00000010 /* Rx desc min. threshold (0) */
+#define E1000_ICR_RXO           0x00000040 /* Receiver Overrun */
 #define E1000_ICR_RXT0          0x00000080 /* Rx timer intr (ring 0) */
 #define E1000_ICR_ECCER         0x00400000 /* Uncorrectable ECC Error */
 /* If this bit asserted, the driver should claim the interrupt */
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0a5f95ab0d3c..ee9de3500331 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1910,14 +1910,30 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
 	struct net_device *netdev = data;
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	u32 icr;
+	bool enable = true;
+
+	icr = er32(ICR);
+	if (icr & E1000_ICR_RXO) {
+		ew32(ICR, E1000_ICR_RXO);
+		enable = false;
+		/* napi poll will re-enable Other, make sure it runs */
+		if (napi_schedule_prep(&adapter->napi)) {
+			adapter->total_rx_bytes = 0;
+			adapter->total_rx_packets = 0;
+			__napi_schedule(&adapter->napi);
+		}
+	}
+	if (icr & E1000_ICR_LSC) {
+		ew32(ICR, E1000_ICR_LSC);
+		hw->mac.get_link_status = true;
+		/* guard against interrupt when we're going down */
+		if (!test_bit(__E1000_DOWN, &adapter->state))
+			mod_timer(&adapter->watchdog_timer, jiffies + 1);
+	}
 
-	hw->mac.get_link_status = true;
-
-	/* guard against interrupt when we're going down */
-	if (!test_bit(__E1000_DOWN, &adapter->state)) {
-		mod_timer(&adapter->watchdog_timer, jiffies + 1);
+	if (enable && !test_bit(__E1000_DOWN, &adapter->state))
 		ew32(IMS, E1000_IMS_OTHER);
-	}
 
 	return IRQ_HANDLED;
 }
@@ -2687,7 +2703,8 @@ static int e1000e_poll(struct napi_struct *napi, int weight)
 		napi_complete_done(napi, work_done);
 		if (!test_bit(__E1000_DOWN, &adapter->state)) {
 			if (adapter->msix_entries)
-				ew32(IMS, adapter->rx_ring->ims_val);
+				ew32(IMS, adapter->rx_ring->ims_val |
+				     E1000_IMS_OTHER);
 			else
 				e1000_irq_enable(adapter);
 		}
@@ -4204,7 +4221,7 @@ static void e1000e_trigger_lsc(struct e1000_adapter *adapter)
 	struct e1000_hw *hw = &adapter->hw;
 
 	if (adapter->msix_entries)
-		ew32(ICS, E1000_ICS_OTHER);
+		ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER);
 	else
 		ew32(ICS, E1000_ICS_LSC);
 }
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 5/9] e1000e: Avoid receiver overrun interrupt bursts Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-11  9:07   ` David Laight
  2017-10-10 17:21 ` [net-next 7/9] e1000e: apply burst mode settings only on default Jeff Kirsher
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, which in some high
performance cases causes a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into an unrecoverable Tx hang under very stressful UDP traffic
and multiple reconnections of the Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slow down DMA
access slightly by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates the performance loss for TCP
traffic without a noticeable impact on CPU performance.
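
The register update in this patch comes down to two bit operations on the
value read from TARC(0). A minimal sketch, with a local BIT() macro and a
hypothetical apply_tarc0_workaround() helper; the hardware meaning of bits
28 and 29 is described in the specification update, not here.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Clear bit 28 and set bit 29, leaving every other bit untouched. */
static uint32_t apply_tarc0_workaround(uint32_t reg_val)
{
	reg_val &= ~BIT(28);
	reg_val |= BIT(29);
	return reg_val;
}

/* Usage: the result is the same whether or not bit 28 was set before. */
static void example(void)
{
	assert(apply_tarc0_workaround(0x00000000u) == 0x20000000u);
	assert(apply_tarc0_workaround(0x30000000u) == 0x20000000u);
}
```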

Please refer to the I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index ee9de3500331..14b096f3d1da 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
 
 	hw->mac.ops.config_collision_dist(hw);
 
-	/* SPT and CNP Si errata workaround to avoid data corruption */
-	if (hw->mac.type >= e1000_pch_spt) {
+	/* SPT and KBL Si errata workaround to avoid data corruption */
+	if (hw->mac.type == e1000_pch_spt) {
 		u32 reg_val;
 
 		reg_val = er32(IOSFPC);
@@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
 		ew32(IOSFPC, reg_val);
 
 		reg_val = er32(TARC(0));
-		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
+		/* SPT and KBL Si errata workaround to avoid Tx hang */
+		reg_val &= ~BIT(28);
+		reg_val |= BIT(29);
 		ew32(TARC(0), reg_val);
 	}
 }
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 7/9] e1000e: apply burst mode settings only on default
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 8/9] e1000e: Be drop monitor friendly Jeff Kirsher
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Willem de Bruijn, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Willem de Bruijn <willemb@google.com>

Devices that support FLAG2_DMA_BURST have different default values
for RDTR and RADV. Apply burst mode default settings only when no
explicit value was passed at module load.

The RDTR default is zero. If the module is loaded for low latency
operation with RxIntDelay=0, do not override this value with a burst
default of 32.

Move the decision to apply burst values earlier, where explicitly
initialized module variables can be distinguished from defaults.
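
A simplified sketch of why the decision has to move. The two constants
match the patch; the function names and the user_set parameter are
hypothetical, and range validation is left out. Comparing the final value
against DEFAULT_RDTR cannot tell an explicit RxIntDelay=0 apart from an
unset parameter, whereas choosing the default before the user value is
applied can.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFAULT_RDTR 0
#define BURST_RDTR   0x20

/* Old: burst value applied whenever the result equals the plain default. */
static unsigned int rx_int_delay_old(bool has_dma_burst, bool user_set,
				     unsigned int user_value)
{
	unsigned int val = user_set ? user_value : DEFAULT_RDTR;

	if (has_dma_burst && val == DEFAULT_RDTR)
		val = BURST_RDTR;	/* also overrides an explicit 0 */
	return val;
}

/* New: pick the default first, then let an explicit value win. */
static unsigned int rx_int_delay_new(bool has_dma_burst, bool user_set,
				     unsigned int user_value)
{
	unsigned int def = has_dma_burst ? BURST_RDTR : DEFAULT_RDTR;

	return user_set ? user_value : def;
}

/* Usage: an explicit 0 on a burst-capable device. */
static void example(void)
{
	assert(rx_int_delay_old(true, true, 0) == BURST_RDTR);
	assert(rx_int_delay_new(true, true, 0) == 0);
}
```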

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/e1000.h  |  4 ----
 drivers/net/ethernet/intel/e1000e/netdev.c |  8 --------
 drivers/net/ethernet/intel/e1000e/param.c  | 16 +++++++++++++++-
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/e1000.h b/drivers/net/ethernet/intel/e1000e/e1000.h
index 98e68888abb1..2311b31bdcac 100644
--- a/drivers/net/ethernet/intel/e1000e/e1000.h
+++ b/drivers/net/ethernet/intel/e1000e/e1000.h
@@ -94,10 +94,6 @@ struct e1000_info;
  */
 #define E1000_CHECK_RESET_COUNT		25
 
-#define DEFAULT_RDTR			0
-#define DEFAULT_RADV			8
-#define BURST_RDTR			0x20
-#define BURST_RADV			0x20
 #define PCICFG_DESC_RING_STATUS		0xe4
 #define FLUSH_DESC_REQUIRED		0x100
 
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 14b096f3d1da..00f48d4cabec 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3242,14 +3242,6 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 		 */
 		ew32(RXDCTL(0), E1000_RXDCTL_DMA_BURST_ENABLE);
 		ew32(RXDCTL(1), E1000_RXDCTL_DMA_BURST_ENABLE);
-
-		/* override the delay timers for enabling bursting, only if
-		 * the value was not set by the user via module options
-		 */
-		if (adapter->rx_int_delay == DEFAULT_RDTR)
-			adapter->rx_int_delay = BURST_RDTR;
-		if (adapter->rx_abs_int_delay == DEFAULT_RADV)
-			adapter->rx_abs_int_delay = BURST_RADV;
 	}
 
 	/* set the Receive Delay Timer Register */
diff --git a/drivers/net/ethernet/intel/e1000e/param.c b/drivers/net/ethernet/intel/e1000e/param.c
index 6d8c39abee16..47da51864543 100644
--- a/drivers/net/ethernet/intel/e1000e/param.c
+++ b/drivers/net/ethernet/intel/e1000e/param.c
@@ -73,17 +73,25 @@ E1000_PARAM(TxAbsIntDelay, "Transmit Absolute Interrupt Delay");
 /* Receive Interrupt Delay in units of 1.024 microseconds
  * hardware will likely hang if you set this to anything but zero.
  *
+ * Burst variant is used as default if device has FLAG2_DMA_BURST.
+ *
  * Valid Range: 0-65535
  */
 E1000_PARAM(RxIntDelay, "Receive Interrupt Delay");
+#define DEFAULT_RDTR	0
+#define BURST_RDTR	0x20
 #define MAX_RXDELAY 0xFFFF
 #define MIN_RXDELAY 0
 
 /* Receive Absolute Interrupt Delay in units of 1.024 microseconds
+ *
+ * Burst variant is used as default if device has FLAG2_DMA_BURST.
  *
  * Valid Range: 0-65535
  */
 E1000_PARAM(RxAbsIntDelay, "Receive Absolute Interrupt Delay");
+#define DEFAULT_RADV	8
+#define BURST_RADV	0x20
 #define MAX_RXABSDELAY 0xFFFF
 #define MIN_RXABSDELAY 0
 
@@ -297,6 +305,9 @@ void e1000e_check_options(struct e1000_adapter *adapter)
 					 .max = MAX_RXDELAY } }
 		};
 
+		if (adapter->flags2 & FLAG2_DMA_BURST)
+			opt.def = BURST_RDTR;
+
 		if (num_RxIntDelay > bd) {
 			adapter->rx_int_delay = RxIntDelay[bd];
 			e1000_validate_option(&adapter->rx_int_delay, &opt,
@@ -307,7 +318,7 @@ void e1000e_check_options(struct e1000_adapter *adapter)
 	}
 	/* Receive Absolute Interrupt Delay */
 	{
-		static const struct e1000_option opt = {
+		static struct e1000_option opt = {
 			.type = range_option,
 			.name = "Receive Absolute Interrupt Delay",
 			.err  = "using default of "
@@ -317,6 +328,9 @@ void e1000e_check_options(struct e1000_adapter *adapter)
 					 .max = MAX_RXABSDELAY } }
 		};
 
+		if (adapter->flags2 & FLAG2_DMA_BURST)
+			opt.def = BURST_RADV;
+
 		if (num_RxAbsIntDelay > bd) {
 			adapter->rx_abs_int_delay = RxAbsIntDelay[bd];
 			e1000_validate_option(&adapter->rx_abs_int_delay, &opt,
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 8/9] e1000e: Be drop monitor friendly
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (6 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 7/9] e1000e: apply burst mode settings only on default Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 17:21 ` [net-next 9/9] igb: check memory allocation failure Jeff Kirsher
  2017-10-10 20:21 ` [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 David Miller
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem; +Cc: Florian Fainelli, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Florian Fainelli <f.fainelli@gmail.com>

e1000_put_txbuf() can be called from the normal reclamation path as well as
after a DMA mapping failure, so we need to differentiate these two cases
when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work()
and e1000_remove() are processing TX timestamped SKBs and those should
not be accounted as drops either.
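
The accounting effect can be modelled in user space. In this sketch the
two fake_* functions are hypothetical stand-ins for dev_kfree_skb_any(),
which reports the buffer as dropped, and dev_consume_skb_any(), which
reports it as consumed; struct free_stats only counts the calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct free_stats {
	unsigned int dropped;	/* what a drop monitor would see */
	unsigned int consumed;	/* normal completions, not drops */
};

/* Stand-in for dev_kfree_skb_any(). */
static void fake_kfree_skb(struct free_stats *s)
{
	s->dropped++;
}

/* Stand-in for dev_consume_skb_any(). */
static void fake_consume_skb(struct free_stats *s)
{
	s->consumed++;
}

/* Mirrors the new "drop" parameter: only real failures count as drops. */
static void put_txbuf(struct free_stats *s, bool drop)
{
	assert(s != NULL);
	if (drop)
		fake_kfree_skb(s);
	else
		fake_consume_skb(s);
}
```

Three reclaimed buffers followed by one mapping failure then show up as
three consumed and one dropped, instead of four drops.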

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 00f48d4cabec..bf8f38f76953 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1071,7 +1071,8 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 }
 
 static void e1000_put_txbuf(struct e1000_ring *tx_ring,
-			    struct e1000_buffer *buffer_info)
+			    struct e1000_buffer *buffer_info,
+			    bool drop)
 {
 	struct e1000_adapter *adapter = tx_ring->adapter;
 
@@ -1085,7 +1086,10 @@ static void e1000_put_txbuf(struct e1000_ring *tx_ring,
 		buffer_info->dma = 0;
 	}
 	if (buffer_info->skb) {
-		dev_kfree_skb_any(buffer_info->skb);
+		if (drop)
+			dev_kfree_skb_any(buffer_info->skb);
+		else
+			dev_consume_skb_any(buffer_info->skb);
 		buffer_info->skb = NULL;
 	}
 	buffer_info->time_stamp = 0;
@@ -1199,7 +1203,7 @@ static void e1000e_tx_hwtstamp_work(struct work_struct *work)
 		wmb(); /* force write prior to skb_tstamp_tx */
 
 		skb_tstamp_tx(skb, &shhwtstamps);
-		dev_kfree_skb_any(skb);
+		dev_consume_skb_any(skb);
 	} else if (time_after(jiffies, adapter->tx_hwtstamp_start
 			      + adapter->tx_timeout_factor * HZ)) {
 		dev_kfree_skb_any(adapter->tx_hwtstamp_skb);
@@ -1254,7 +1258,7 @@ static bool e1000_clean_tx_irq(struct e1000_ring *tx_ring)
 				}
 			}
 
-			e1000_put_txbuf(tx_ring, buffer_info);
+			e1000_put_txbuf(tx_ring, buffer_info, false);
 			tx_desc->upper.data = 0;
 
 			i++;
@@ -2437,7 +2441,7 @@ static void e1000_clean_tx_ring(struct e1000_ring *tx_ring)
 
 	for (i = 0; i < tx_ring->count; i++) {
 		buffer_info = &tx_ring->buffer_info[i];
-		e1000_put_txbuf(tx_ring, buffer_info);
+		e1000_put_txbuf(tx_ring, buffer_info, false);
 	}
 
 	netdev_reset_queue(adapter->netdev);
@@ -5625,7 +5629,7 @@ static int e1000_tx_map(struct e1000_ring *tx_ring, struct sk_buff *skb,
 			i += tx_ring->count;
 		i--;
 		buffer_info = &tx_ring->buffer_info[i];
-		e1000_put_txbuf(tx_ring, buffer_info);
+		e1000_put_txbuf(tx_ring, buffer_info, true);
 	}
 
 	return 0;
@@ -7419,7 +7423,7 @@ static void e1000_remove(struct pci_dev *pdev)
 	if (adapter->flags & FLAG_HAS_HW_TIMESTAMP) {
 		cancel_work_sync(&adapter->tx_hwtstamp_work);
 		if (adapter->tx_hwtstamp_skb) {
-			dev_kfree_skb_any(adapter->tx_hwtstamp_skb);
+			dev_consume_skb_any(adapter->tx_hwtstamp_skb);
 			adapter->tx_hwtstamp_skb = NULL;
 		}
 	}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 9/9] igb: check memory allocation failure
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (7 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 8/9] e1000e: Be drop monitor friendly Jeff Kirsher
@ 2017-10-10 17:21 ` Jeff Kirsher
  2017-10-10 20:21 ` [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 David Miller
  9 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2017-10-10 17:21 UTC (permalink / raw)
  To: davem
  Cc: Christophe JAILLET, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

Check memory allocation failures and return -ENOMEM in such cases, as
already done for other memory allocations in this function.

This avoids a NULL pointer dereference.
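
A user-space sketch of the pattern, assuming calloc() in place of
kcalloc(). struct adapter, sw_init() and the injectable alloc_fn are
hypothetical and exist only so that the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct adapter {
	uint32_t *shadow_vfta;
};

/* Allocator with the same shape as calloc(), so failures can be injected. */
typedef void *(*alloc_fn)(size_t n, size_t size);

static int sw_init(struct adapter *adapter, alloc_fn alloc, size_t entries)
{
	assert(adapter != NULL);
	adapter->shadow_vfta = alloc(entries, sizeof(uint32_t));
	if (!adapter->shadow_vfta)
		return -ENOMEM;	/* bail out before anything touches the table */

	adapter->shadow_vfta[0] = 0;	/* safe only after the check above */
	return 0;
}

/* Test double: always fails, like an allocation under memory pressure. */
static void *failing_alloc(size_t n, size_t size)
{
	(void)n;
	(void)size;
	return NULL;
}
```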

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index fd4a46b03cc8..837d9b46a390 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3162,6 +3162,8 @@ static int igb_sw_init(struct igb_adapter *adapter)
 	/* Setup and initialize a copy of the hw vlan table array */
 	adapter->shadow_vfta = kcalloc(E1000_VLAN_FILTER_TBL_SIZE, sizeof(u32),
 				       GFP_ATOMIC);
+	if (!adapter->shadow_vfta)
+		return -ENOMEM;
 
 	/* This call may decrease the number of queues */
 	if (igb_init_interrupt_scheme(adapter, true)) {
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10
  2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
                   ` (8 preceding siblings ...)
  2017-10-10 17:21 ` [net-next 9/9] igb: check memory allocation failure Jeff Kirsher
@ 2017-10-10 20:21 ` David Miller
  9 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2017-10-10 20:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 10 Oct 2017 10:21:30 -0700

> This series contains updates to e1000e and igb.

Pulled, thanks Jeff.


* RE: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-10 17:21 ` [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions Jeff Kirsher
@ 2017-10-11  9:07   ` David Laight
  2017-10-16 10:24     ` Neftin, Sasha
  2017-10-16 10:39     ` Neftin, Sasha
  0 siblings, 2 replies; 16+ messages in thread
From: David Laight @ 2017-10-11  9:07 UTC (permalink / raw)
  To: 'Jeff Kirsher', davem
  Cc: Sasha Neftin, netdev, nhorman, sassmann, jogreene

From: Jeff Kirsher
> Sent: 10 October 2017 18:22
> Intel 100/200 Series Chipset platforms reduced the round-trip
> latency for the LAN Controller DMA accesses, causing in some high
> performance cases a buffer overrun while the I219 LAN Connected
> Device is processing the DMA transactions. I219LM and I219V devices
> can fall into unrecovered Tx hang under very stressfully UDP traffic
> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
> Controller is only recovered if the system is rebooted. Slightly slow
> down DMA access by reducing the number of outstanding requests.
> This workaround could have an impact on TCP traffic performance
> on the platform. Disabling TSO eliminates performance loss for TCP
> traffic without a noticeable impact on CPU performance.
> 
> Please, refer to I218/I219 specification update:
> https://www.intel.com/content/www/us/en/embedded/products/networking/
> ethernet-connection-i218-family-documentation.html
> 
> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index ee9de3500331..14b096f3d1da 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
> 
>  	hw->mac.ops.config_collision_dist(hw);
> 
> -	/* SPT and CNP Si errata workaround to avoid data corruption */
> -	if (hw->mac.type >= e1000_pch_spt) {
> +	/* SPT and KBL Si errata workaround to avoid data corruption */
> +	if (hw->mac.type == e1000_pch_spt) {
>  		u32 reg_val;
> 
>  		reg_val = er32(IOSFPC);
> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>  		ew32(IOSFPC, reg_val);
> 
>  		reg_val = er32(TARC(0));
> -		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
> +		/* SPT and KBL Si errata workaround to avoid Tx hang */
> +		reg_val &= ~BIT(28);
> +		reg_val |= BIT(29);

Shouldn't some more of the commit message about what this is doing
be in the comment?
And shouldn't the 28 and 28 be named constants?

>  		ew32(TARC(0), reg_val);

	David



* Re: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-11  9:07   ` David Laight
@ 2017-10-16 10:24     ` Neftin, Sasha
  2017-10-16 16:11       ` Alexander Duyck
  2017-10-16 10:39     ` Neftin, Sasha
  1 sibling, 1 reply; 16+ messages in thread
From: Neftin, Sasha @ 2017-10-16 10:24 UTC (permalink / raw)
  To: David Laight, 'Jeff Kirsher', davem
  Cc: netdev, nhorman, sassmann, jogreene

On 10/11/2017 12:07, David Laight wrote:
> From: Jeff Kirsher
>> Sent: 10 October 2017 18:22
>> Intel 100/200 Series Chipset platforms reduced the round-trip
>> latency for the LAN Controller DMA accesses, causing in some high
>> performance cases a buffer overrun while the I219 LAN Connected
>> Device is processing the DMA transactions. I219LM and I219V devices
>> can fall into unrecovered Tx hang under very stressfully UDP traffic
>> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
>> Controller is only recovered if the system is rebooted. Slightly slow
>> down DMA access by reducing the number of outstanding requests.
>> This workaround could have an impact on TCP traffic performance
>> on the platform. Disabling TSO eliminates performance loss for TCP
>> traffic without a noticeable impact on CPU performance.
>>
>> Please, refer to I218/I219 specification update:
>> https://www.intel.com/content/www/us/en/embedded/products/networking/
>> ethernet-connection-i218-family-documentation.html
>>
>> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
>> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
>> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
>> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> ---
>>   drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index ee9de3500331..14b096f3d1da 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>
>>   	hw->mac.ops.config_collision_dist(hw);
>>
>> -	/* SPT and CNP Si errata workaround to avoid data corruption */
>> -	if (hw->mac.type >= e1000_pch_spt) {
>> +	/* SPT and KBL Si errata workaround to avoid data corruption */
>> +	if (hw->mac.type == e1000_pch_spt) {
>>   		u32 reg_val;
>>
>>   		reg_val = er32(IOSFPC);
>> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>   		ew32(IOSFPC, reg_val);
>>
>>   		reg_val = er32(TARC(0));
>> -		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
>> +		/* SPT and KBL Si errata workaround to avoid Tx hang */
>> +		reg_val &= ~BIT(28);
>> +		reg_val |= BIT(29);
> Shouldn't some more of the commit message about what this is doing
> be in the comment?
A link to the specification update is provided:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561
This is Intel's public edition.
> And shouldn't the 28 and 28 be named constants?
(28 and 29) You can easily see from the code that the value has been
changed from 3 to 2. I thought there was no point in adding flags here.
>
>>   		ew32(TARC(0), reg_val);
> 	David
>
Thanks,

Sasha


* Re: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-11  9:07   ` David Laight
  2017-10-16 10:24     ` Neftin, Sasha
@ 2017-10-16 10:39     ` Neftin, Sasha
  2017-10-16 13:27       ` David Laight
  1 sibling, 1 reply; 16+ messages in thread
From: Neftin, Sasha @ 2017-10-16 10:39 UTC (permalink / raw)
  To: David Laight, 'Jeff Kirsher', davem
  Cc: netdev, nhorman, sassmann, jogreene

On 10/11/2017 12:07, David Laight wrote:
> From: Jeff Kirsher
>> Sent: 10 October 2017 18:22
>> Intel 100/200 Series Chipset platforms reduced the round-trip
>> latency for the LAN Controller DMA accesses, causing in some high
>> performance cases a buffer overrun while the I219 LAN Connected
>> Device is processing the DMA transactions. I219LM and I219V devices
>> can fall into unrecovered Tx hang under very stressfully UDP traffic
>> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
>> Controller is only recovered if the system is rebooted. Slightly slow
>> down DMA access by reducing the number of outstanding requests.
>> This workaround could have an impact on TCP traffic performance
>> on the platform. Disabling TSO eliminates performance loss for TCP
>> traffic without a noticeable impact on CPU performance.
>>
>> Please, refer to I218/I219 specification update:
>> https://www.intel.com/content/www/us/en/embedded/products/networking/
>> ethernet-connection-i218-family-documentation.html
>>
>> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
>> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
>> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
>> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> ---
>>   drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index ee9de3500331..14b096f3d1da 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>
>>   	hw->mac.ops.config_collision_dist(hw);
>>
>> -	/* SPT and CNP Si errata workaround to avoid data corruption */
>> -	if (hw->mac.type >= e1000_pch_spt) {
>> +	/* SPT and KBL Si errata workaround to avoid data corruption */
>> +	if (hw->mac.type == e1000_pch_spt) {
>>   		u32 reg_val;
>>
>>   		reg_val = er32(IOSFPC);
>> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>   		ew32(IOSFPC, reg_val);
>>
>>   		reg_val = er32(TARC(0));
>> -		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
>> +		/* SPT and KBL Si errata workaround to avoid Tx hang */
>> +		reg_val &= ~BIT(28);
>> +		reg_val |= BIT(29);
> Shouldn't some more of the commit message about what this is doing
> be in the comment?
A link to the specification update is provided:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561
This is Intel's public release.
> And shouldn't the 28 and 28 be named constants?
(28 and 29) - you can easily see from the code that the same value has been
changed from 3 to 2. I thought there was no point in adding a flag here.
>
>>   		ew32(TARC(0), reg_val);
> 	David
>
Thanks,

Sasha


* RE: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-16 10:39     ` Neftin, Sasha
@ 2017-10-16 13:27       ` David Laight
  0 siblings, 0 replies; 16+ messages in thread
From: David Laight @ 2017-10-16 13:27 UTC (permalink / raw)
  To: 'Neftin, Sasha', 'Jeff Kirsher', davem
  Cc: netdev, nhorman, sassmann, jogreene

From: Neftin, Sasha
> Sent: 16 October 2017 11:40
> On 10/11/2017 12:07, David Laight wrote:
> > From: Jeff Kirsher
> >> Sent: 10 October 2017 18:22
> >> Intel 100/200 Series Chipset platforms reduced the round-trip
> >> latency for the LAN Controller DMA accesses, causing in some high
> >> performance cases a buffer overrun while the I219 LAN Connected
> >> Device is processing the DMA transactions. I219LM and I219V devices
> >> can fall into unrecovered Tx hang under very stressfully UDP traffic
> >> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
> >> Controller is only recovered if the system is rebooted. Slightly slow
> >> down DMA access by reducing the number of outstanding requests.
> >> This workaround could have an impact on TCP traffic performance
> >> on the platform. Disabling TSO eliminates performance loss for TCP
> >> traffic without a noticeable impact on CPU performance.
> >>
> >> Please, refer to I218/I219 specification update:
> >> https://www.intel.com/content/www/us/en/embedded/products/networking/
> >> ethernet-connection-i218-family-documentation.html
> >>
> >> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
> >> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
> >> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
> >> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> >> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> >> ---
> >>   drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> index ee9de3500331..14b096f3d1da 100644
> >> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> >> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
> >>
> >>   	hw->mac.ops.config_collision_dist(hw);
> >>
> >> -	/* SPT and CNP Si errata workaround to avoid data corruption */
> >> -	if (hw->mac.type >= e1000_pch_spt) {
> >> +	/* SPT and KBL Si errata workaround to avoid data corruption */
> >> +	if (hw->mac.type == e1000_pch_spt) {
> >>   		u32 reg_val;
> >>
> >>   		reg_val = er32(IOSFPC);
> >> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
> >>   		ew32(IOSFPC, reg_val);
> >>
> >>   		reg_val = er32(TARC(0));
> >> -		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
> >> +		/* SPT and KBL Si errata workaround to avoid Tx hang */
> >> +		reg_val &= ~BIT(28);
> >> +		reg_val |= BIT(29);

> > Shouldn't some more of the commit message about what this is doing
> > be in the comment?

> A link to the specification update is provided:
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561
> This is Intel's public release.

And sometime next week the marketing people will decide to reorganise the
web site and the link will become invalid.

> > And shouldn't the 28 and 28 be named constants?

> (28 and 29) - you can easily see from the code that the same value has been
> changed from 3 to 2. I thought there was no point in adding a flag here.

Oh, there is. The 'workaround' is:
  Slightly slow down DMA access by reducing the number of outstanding requests.
  This workaround could have an impact on TCP traffic performance and could
  reduce performance up to 5 to 15% (depending) on the platform.
  Disabling TSO eliminates performance loss for TCP traffic without a 
  noticeable impact on CPU performance.

I wonder what tests they did to show that TSO doesn't save cpu cycles!

So my guess is that you are changing the number of outstanding PCIe reads
(or reads for tx buffers, or ???) from 3 to 2.

Let's read between the lines a little further
(since you are at Intel you can probably check this):
Assuming that TSO is 'TCP Segmentation Offload' and that TSO packets
might be 64k, then reading 3 TSO packets might issue PCIe reads for 196k
bytes of data (under 4k for non-TSO).
If the internal buffer that this data is stored in isn't that big then
that internal buffer would overflow.
It might be that data is removed from this buffer as soon as the last
completion TLP arrives - but they can be interleaved with other
outstanding PCIe reads.
It all rather depends on the negotiated maximum TLP size and number
of tags.

Perhaps reducing the maximum TSO packet to 32k stops the overflow
as well...
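
The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the worst case described in the message, assuming each outstanding request fetches one full-size packet; demo_outstanding_bytes() is a hypothetical helper and says nothing about the actual size of the device's internal buffer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case number of bytes requested when every outstanding
 * request fetches a full-size packet.
 */
static uint32_t demo_outstanding_bytes(uint32_t requests,
				       uint32_t max_pkt_bytes)
{
	/* Guard against 32-bit overflow in the multiplication. */
	assert(max_pkt_bytes == 0 || requests <= UINT32_MAX / max_pkt_bytes);

	return requests * max_pkt_bytes;
}
```

With 64 KiB packets, three outstanding requests come to 196608 bytes (the "196k" above) and two come to 131072 bytes. With packets capped at 32 KiB, three outstanding requests come to 98304 bytes.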

	David



* Re: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions
  2017-10-16 10:24     ` Neftin, Sasha
@ 2017-10-16 16:11       ` Alexander Duyck
  0 siblings, 0 replies; 16+ messages in thread
From: Alexander Duyck @ 2017-10-16 16:11 UTC (permalink / raw)
  To: Neftin, Sasha
  Cc: David Laight, Jeff Kirsher, davem, netdev, nhorman, sassmann, jogreene

On Mon, Oct 16, 2017 at 3:24 AM, Neftin, Sasha <sasha.neftin@intel.com> wrote:
> On 10/11/2017 12:07, David Laight wrote:
>>
>> From: Jeff Kirsher
>>>
>>> Sent: 10 October 2017 18:22
>>> Intel 100/200 Series Chipset platforms reduced the round-trip
>>> latency for the LAN Controller DMA accesses, causing in some high
>>> performance cases a buffer overrun while the I219 LAN Connected
>>> Device is processing the DMA transactions. I219LM and I219V devices
>>> can fall into unrecovered Tx hang under very stressfully UDP traffic
>>> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
>>> Controller is only recovered if the system is rebooted. Slightly slow
>>> down DMA access by reducing the number of outstanding requests.
>>> This workaround could have an impact on TCP traffic performance
>>> on the platform. Disabling TSO eliminates performance loss for TCP
>>> traffic without a noticeable impact on CPU performance.
>>>
>>> Please, refer to I218/I219 specification update:
>>> https://www.intel.com/content/www/us/en/embedded/products/networking/
>>> ethernet-connection-i218-family-documentation.html
>>>
>>> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
>>> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
>>> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
>>> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
>>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>>> ---
>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> index ee9de3500331..14b096f3d1da 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>>
>>>         hw->mac.ops.config_collision_dist(hw);
>>>
>>> -       /* SPT and CNP Si errata workaround to avoid data corruption */
>>> -       if (hw->mac.type >= e1000_pch_spt) {
>>> +       /* SPT and KBL Si errata workaround to avoid data corruption */
>>> +       if (hw->mac.type == e1000_pch_spt) {
>>>                 u32 reg_val;
>>>
>>>                 reg_val = er32(IOSFPC);
>>> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
>>>                 ew32(IOSFPC, reg_val);
>>>
>>>                 reg_val = er32(TARC(0));
>>> -               reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
>>> +               /* SPT and KBL Si errata workaround to avoid Tx hang */
>>> +               reg_val &= ~BIT(28);
>>> +               reg_val |= BIT(29);
>>
>> Shouldn't some more of the commit message about what this is doing
>> be in the comment?
>
> A link to the specification update is provided:
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561
> This is Intel's public edition.
>>
>> And shouldn't the 28 and 28 be named constants?
>
> (28 and 29) You can easily see from the code that the value has been
> changed from 3 to 2. I thought there was no point in adding flags here.

I have to agree with David. This isn't clear, and the change moves
away from clarity toward being very murky.

You already had the E1000_TARC0_CB_MULTIQ_3_REQ define. It shouldn't
be hard to come up with a bitmask that defines the full width of the
field you are updating so that you can use that mask to clear out the
value, and then also define a value for "MULTIQ_2_REQ" to replace
the value you were using before. Assuming we still want to go with
this route.
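
A rough userspace sketch of that suggestion follows. It assumes the field is two bits wide at bits 29:28, which is inferred only from the BIT(28)/BIT(29) lines in the patch and the "3 to 2" remark. All DEMO_* names and demo_tarc0_set_multiq() are hypothetical and are not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: a 2-bit field assumed to sit at bits 29:28. */
#define DEMO_TARC0_CB_MULTIQ_SHIFT	28
#define DEMO_TARC0_CB_MULTIQ_MASK	(UINT32_C(3) << DEMO_TARC0_CB_MULTIQ_SHIFT)
#define DEMO_TARC0_CB_MULTIQ_3_REQ	(UINT32_C(3) << DEMO_TARC0_CB_MULTIQ_SHIFT)
#define DEMO_TARC0_CB_MULTIQ_2_REQ	(UINT32_C(2) << DEMO_TARC0_CB_MULTIQ_SHIFT)

/*
 * Clear the whole field with the mask, then write the new value,
 * leaving every other bit of the register value untouched.
 */
static uint32_t demo_tarc0_set_multiq(uint32_t reg_val, uint32_t field)
{
	/* The new value must fit entirely inside the field. */
	assert((field & ~DEMO_TARC0_CB_MULTIQ_MASK) == 0);

	reg_val &= ~DEMO_TARC0_CB_MULTIQ_MASK;
	reg_val |= field;
	return reg_val;
}
```

Under that assumption, demo_tarc0_set_multiq(reg_val, DEMO_TARC0_CB_MULTIQ_2_REQ) yields the same bits as clearing BIT(28) and setting BIT(29), while the macro names record what the value means.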

He also has a point about using netif_set_gso_max_size() to restrict
the GSO size. If that would work for something like this then that
might be the preferred way to go as you wouldn't be introducing the
same type of issues as you currently do in that you are requiring
disabling TSO in order to avoid "performance loss" which in this case
I assume you are only referring to throughput without taking CPU into
account.

- Alex

Thread overview: 16+ messages
2017-10-10 17:21 [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 Jeff Kirsher
2017-10-10 17:21 ` [net-next 1/9] e1000e: Fix error path in link detection Jeff Kirsher
2017-10-10 17:21 ` [net-next 2/9] e1000e: Fix wrong comment related to " Jeff Kirsher
2017-10-10 17:21 ` [net-next 3/9] e1000e: Fix return value test Jeff Kirsher
2017-10-10 17:21 ` [net-next 4/9] e1000e: Separate signaling for link check/link up Jeff Kirsher
2017-10-10 17:21 ` [net-next 5/9] e1000e: Avoid receiver overrun interrupt bursts Jeff Kirsher
2017-10-10 17:21 ` [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions Jeff Kirsher
2017-10-11  9:07   ` David Laight
2017-10-16 10:24     ` Neftin, Sasha
2017-10-16 16:11       ` Alexander Duyck
2017-10-16 10:39     ` Neftin, Sasha
2017-10-16 13:27       ` David Laight
2017-10-10 17:21 ` [net-next 7/9] e1000e: apply burst mode settings only on default Jeff Kirsher
2017-10-10 17:21 ` [net-next 8/9] e1000e: Be drop monitor friendly Jeff Kirsher
2017-10-10 17:21 ` [net-next 9/9] igb: check memory allocation failure Jeff Kirsher
2017-10-10 20:21 ` [net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2017-10-10 David Miller
