netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16
@ 2019-10-16 23:47 Jeff Kirsher
  2019-10-16 23:47 ` [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed Jeff Kirsher
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains updates to e1000e, igc and igb.

Lyude Paul fixes a warning that was occurring during a fatal read error
while the device is still present.

Robert Beckett adds receive drop enable support via ehttool private flag
for igb.

Sasha adds stream control transmission protocol (SCTP) CRC checksum
support for igc.  Also added S0ix support to the e1000e driver.  Then
added multicast support by adding the address list to the MTA table and
providing the option for IPv6 address for igc.  In addition, added
receive checksum support to igc as well.  Lastly, cleaned up some code
that was not fully implemented yet for the VLAN filter table array.

The following are changes since commit d9f45ab9e671166004b75427f10389e1f70cfc30:
  net: bcmgenet: Add a shutdown callback
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Lyude Paul (1):
  igb/igc: Don't warn on fatal read failures when the device is removed

Robert Beckett (1):
  igb: add rx drop enable attribute

Sasha Neftin (5):
  igc: Add SCTP CRC checksumming functionality
  e1000e: Add support for S0ix
  igc: Add set_rx_mode support
  igc: Add Rx checksum support
  igc: Clean up unused shadow_vfta pointer

 drivers/net/ethernet/intel/e1000e/netdev.c   | 186 +++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h     |   4 +
 drivers/net/ethernet/intel/igb/igb.h         |   1 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c |   8 +
 drivers/net/ethernet/intel/igb/igb_main.c    |  14 +-
 drivers/net/ethernet/intel/igc/igc.h         |   1 -
 drivers/net/ethernet/intel/igc/igc_defines.h |   8 +-
 drivers/net/ethernet/intel/igc/igc_hw.h      |   1 +
 drivers/net/ethernet/intel/igc/igc_mac.c     | 104 +++++++++
 drivers/net/ethernet/intel/igc/igc_mac.h     |   2 +
 drivers/net/ethernet/intel/igc/igc_main.c    | 229 ++++++++++++++++++-
 11 files changed, 551 insertions(+), 7 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-16 23:58   ` Jakub Kicinski
  2019-10-16 23:47 ` [net-next 2/7] igb: add rx drop enable attribute Jeff Kirsher
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem
  Cc: Lyude Paul, netdev, nhorman, sassmann, Sasha Neftin, Aaron Brown,
	Feng Tang, Jeff Kirsher

From: Lyude Paul <lyude@redhat.com>

Fatal read errors are worth warning about, unless of course the device
was just unplugged from the machine - something that's a rather normal
occurence when the igb/igc adapter is located on a Thunderbolt dock. So,
let's only WARN() if there's a fatal read error while the device is
still present.

This fixes the following WARN splat that's been appearing whenever I
unplug my Caldigit TS3 Thunderbolt dock from my laptop:

  igb 0000:09:00.0 enp9s0: PCIe link lost
  ------------[ cut here ]------------
  igb: Failed to read reg 0x18!
  WARNING: CPU: 7 PID: 516 at
  drivers/net/ethernet/intel/igb/igb_main.c:756 igb_rd32+0x57/0x6a [igb]
  Modules linked in: igb dca thunderbolt fuse vfat fat elan_i2c mei_wdt
  mei_hdcp i915 wmi_bmof intel_wmi_thunderbolt iTCO_wdt
  iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp joydev
  coretemp crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel
  intel_cstate drm_kms_helper intel_uncore syscopyarea sysfillrect
  sysimgblt fb_sys_fops intel_rapl_perf intel_xhci_usb_role_switch mei_me
  drm roles idma64 i2c_i801 ucsi_acpi typec_ucsi mei intel_lpss_pci
  processor_thermal_device typec intel_pch_thermal intel_soc_dts_iosf
  intel_lpss int3403_thermal thinkpad_acpi wmi int340x_thermal_zone
  ledtrig_audio int3400_thermal acpi_thermal_rel acpi_pad video
  pcc_cpufreq ip_tables serio_raw nvme nvme_core crc32c_intel uas
  usb_storage e1000e i2c_dev
  CPU: 7 PID: 516 Comm: kworker/u16:3 Not tainted 5.2.0-rc1Lyude-Test+ #14
  Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  RIP: 0010:igb_rd32+0x57/0x6a [igb]
  Code: 87 b8 fc ff ff 48 c7 47 08 00 00 00 00 48 c7 c6 33 42 9b c0 4c 89
  c7 e8 47 45 cd dc 89 ee 48 c7 c7 43 42 9b c0 e8 c1 94 71 dc <0f> 0b eb
  08 8b 00 ff c0 75 b0 eb c8 44 89 e0 5d 41 5c c3 0f 1f 44
  RSP: 0018:ffffba5801cf7c48 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff9e7956608840 RCX: 0000000000000007
  RDX: 0000000000000000 RSI: ffffba5801cf7b24 RDI: ffff9e795e3d6a00
  RBP: 0000000000000018 R08: 000000009dec4a01 R09: ffffffff9e61018f
  R10: 0000000000000000 R11: ffffba5801cf7ae5 R12: 00000000ffffffff
  R13: ffff9e7956608840 R14: ffff9e795a6f10b0 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff9e795e3c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000564317bc4088 CR3: 000000010e00a006 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   igb_release_hw_control+0x1a/0x30 [igb]
   igb_remove+0xc5/0x14b [igb]
   pci_device_remove+0x3b/0x93
   device_release_driver_internal+0xd7/0x17e
   pci_stop_bus_device+0x36/0x75
   pci_stop_bus_device+0x66/0x75
   pci_stop_bus_device+0x66/0x75
   pci_stop_and_remove_bus_device+0xf/0x19
   trim_stale_devices+0xc5/0x13a
   ? __pm_runtime_resume+0x6e/0x7b
   trim_stale_devices+0x103/0x13a
   ? __pm_runtime_resume+0x6e/0x7b
   trim_stale_devices+0x103/0x13a
   acpiphp_check_bridge+0xd8/0xf5
   acpiphp_hotplug_notify+0xf7/0x14b
   ? acpiphp_check_bridge+0xf5/0xf5
   acpi_device_hotplug+0x357/0x3b5
   acpi_hotplug_work_fn+0x1a/0x23
   process_one_work+0x1a7/0x296
   worker_thread+0x1a8/0x24c
   ? process_scheduled_works+0x2c/0x2c
   kthread+0xe9/0xee
   ? kthread_destroy_worker+0x41/0x41
   ret_from_fork+0x35/0x40
  ---[ end trace 252bf10352c63d22 ]---

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 47e16692b26b ("igb/igc: warn when fatal read failure happens")
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 3 ++-
 drivers/net/ethernet/intel/igc/igc_main.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 105b0624081a..31b9e02875cc 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -753,7 +753,8 @@ u32 igb_rd32(struct e1000_hw *hw, u32 reg)
 		struct net_device *netdev = igb->netdev;
 		hw->hw_addr = NULL;
 		netdev_err(netdev, "PCIe link lost\n");
-		WARN(1, "igb: Failed to read reg 0x%x!\n", reg);
+		WARN(pci_device_is_present(igb->pdev),
+		     "igb: Failed to read reg 0x%x!\n", reg);
 	}
 
 	return value;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 63b62d74f961..8e424dfab12e 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4047,7 +4047,8 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
 		hw->hw_addr = NULL;
 		netif_device_detach(netdev);
 		netdev_err(netdev, "PCIe link lost, device now detached\n");
-		WARN(1, "igc: Failed to read reg 0x%x!\n", reg);
+		WARN(pci_device_is_present(igc->pdev),
+		     "igc: Failed to read reg 0x%x!\n", reg);
 	}
 
 	return value;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 2/7] igb: add rx drop enable attribute
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
  2019-10-16 23:47 ` [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-16 23:55   ` Jakub Kicinski
  2019-10-16 23:47 ` [net-next 3/7] igc: Add SCTP CRC checksumming functionality Jeff Kirsher
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem
  Cc: Robert Beckett, netdev, nhorman, sassmann, Aaron Brown, Jeff Kirsher

From: Robert Beckett <bob.beckett@collabora.com>

To allow userland to enable or disable dropping packets when descriptor
ring is exhausted, add RX_DROP_EN private flag.

This can be used in conjunction with flow control to mitigate packet storms
(e.g. due to network loop or DoS) by forcing the network adapter to send
pause frames whenever the ring is close to exhaustion.

By default this will maintain previous behaviour of enabling dropping of
packets during ring buffer exhaustion.
Some use cases prefer to not drop packets upon exhaustion, but instead
use flow control to limit ingress rates and ensure no dropped packets.
This is useful when the host CPU cannot keep up with packet delivery,
but data delivery is more important than throughput via multiple queues.

Userland can set this flag to 0 via ethtool to disable packet dropping.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb.h         |  1 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c |  8 ++++++++
 drivers/net/ethernet/intel/igb/igb_main.c    | 11 +++++++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index ca54e268d157..ff4a61218e3e 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -617,6 +617,7 @@ struct igb_adapter {
 #define IGB_FLAG_VLAN_PROMISC		BIT(15)
 #define IGB_FLAG_RX_LEGACY		BIT(16)
 #define IGB_FLAG_FQTSS			BIT(17)
+#define IGB_FLAG_RX_DROP_EN		BIT(18)
 
 /* Media Auto Sense */
 #define IGB_MAS_ENABLE_0		0X0001
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 3182b059bf55..b584fa1f0e14 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -128,6 +128,8 @@ static const char igb_gstrings_test[][ETH_GSTRING_LEN] = {
 static const char igb_priv_flags_strings[][ETH_GSTRING_LEN] = {
 #define IGB_PRIV_FLAGS_LEGACY_RX	BIT(0)
 	"legacy-rx",
+#define IGB_PRIV_FLAGS_RX_DROP_EN	BIT(1)
+	"rx-drop-en",
 };
 
 #define IGB_PRIV_FLAGS_STR_LEN ARRAY_SIZE(igb_priv_flags_strings)
@@ -3444,6 +3446,8 @@ static u32 igb_get_priv_flags(struct net_device *netdev)
 
 	if (adapter->flags & IGB_FLAG_RX_LEGACY)
 		priv_flags |= IGB_PRIV_FLAGS_LEGACY_RX;
+	if (adapter->flags & IGB_FLAG_RX_DROP_EN)
+		priv_flags |= IGB_PRIV_FLAGS_RX_DROP_EN;
 
 	return priv_flags;
 }
@@ -3457,6 +3461,10 @@ static int igb_set_priv_flags(struct net_device *netdev, u32 priv_flags)
 	if (priv_flags & IGB_PRIV_FLAGS_LEGACY_RX)
 		flags |= IGB_FLAG_RX_LEGACY;
 
+	flags &= ~IGB_FLAG_RX_DROP_EN;
+	if (priv_flags & IGB_PRIV_FLAGS_RX_DROP_EN)
+		flags |= IGB_FLAG_RX_DROP_EN;
+
 	if (flags != adapter->flags) {
 		adapter->flags = flags;
 
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 31b9e02875cc..5dae45bb3578 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3237,6 +3237,9 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	igb_validate_mdi_setting(hw);
 
+	/* By default, support dropping packets due to ring exhaustion */
+	adapter->flags |= IGB_FLAG_RX_DROP_EN;
+
 	/* By default, support wake on port A */
 	if (hw->bus.func == 0)
 		adapter->flags |= IGB_FLAG_WOL_SUPPORTED;
@@ -4504,8 +4507,12 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
 	srrctl |= E1000_SRRCTL_DESCTYPE_ADV_ONEBUF;
 	if (hw->mac.type >= e1000_82580)
 		srrctl |= E1000_SRRCTL_TIMESTAMP;
-	/* Only set Drop Enable if we are supporting multiple queues */
-	if (adapter->vfs_allocated_count || adapter->num_rx_queues > 1)
+	/*
+	 * Only set Drop Enable if we are supporting multiple queues and
+	 * allowed by flags
+	 */
+	if ((adapter->flags & IGB_FLAG_RX_DROP_EN) &&
+		(adapter->vfs_allocated_count || adapter->num_rx_queues > 1))
 		srrctl |= E1000_SRRCTL_DROP_EN;
 
 	wr32(E1000_SRRCTL(reg_idx), srrctl);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 3/7] igc: Add SCTP CRC checksumming functionality
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
  2019-10-16 23:47 ` [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed Jeff Kirsher
  2019-10-16 23:47 ` [net-next 2/7] igb: add rx drop enable attribute Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-16 23:47 ` [net-next 4/7] e1000e: Add support for S0ix Jeff Kirsher
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

Add stream control transmission protocol CRC checksum.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 8e424dfab12e..12a030343ed7 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4212,6 +4212,7 @@ static int igc_probe(struct pci_dev *pdev,
 
 	/* Add supported features to the features list*/
 	netdev->features |= NETIF_F_HW_CSUM;
+	netdev->features |= NETIF_F_SCTP_CRC;
 
 	/* setup the private structure */
 	err = igc_sw_init(adapter);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 4/7] e1000e: Add support for S0ix
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2019-10-16 23:47 ` [net-next 3/7] igc: Add SCTP CRC checksumming functionality Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-17  0:02   ` Jakub Kicinski
  2019-10-16 23:47 ` [net-next 5/7] igc: Add set_rx_mode support Jeff Kirsher
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem
  Cc: Sasha Neftin, netdev, nhorman, sassmann, Rafael J. Wysocki,
	Vitaly Lifshits, Rajneesh Bhardwaj, Aaron Brown, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

Implement flow for S0ix support. Modern SoCs support S0ix low power
states during idle periods, which are sub-states of ACPI S0 that increase
power saving while supporting an instant-on experience for providing
lower latency that ACPI S0. The S0ix states shut off parts of the SoC
when they are not in use, while still maintaning optimal performance.
This patch add support for S0ix started from an Ice Lake platform.

Suggested-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 186 +++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h   |   4 +
 2 files changed, 190 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index d7d56e42a6aa..cc4363e67072 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6294,6 +6294,178 @@ static void e1000e_flush_lpic(struct pci_dev *pdev)
 	pm_runtime_put_sync(netdev->dev.parent);
 }
 
+/* S0ix implementation */
+static int e1000e_s0ix_entry_flow(struct e1000_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u32 mac_data;
+	u16 phy_data;
+
+	/* Disable the periodic inband message,
+	 * don't request PCIe clock in K1 page770_17[10:9] = 10b
+	 */
+	e1e_rphy(hw, HV_PM_CTRL, &phy_data);
+	phy_data &= ~HV_PM_CTRL_K1_CLK_REQ;
+	phy_data |= BIT(10);
+	e1e_wphy(hw, HV_PM_CTRL, phy_data);
+
+	/* Make sure we don't exit K1 every time a new packet arrives
+	 * 772_29[5] = 1 CS_Mode_Stay_In_K1
+	 */
+	e1e_rphy(hw, I217_CGFREG, &phy_data);
+	phy_data |= BIT(5);
+	e1e_wphy(hw, I217_CGFREG, phy_data);
+
+	/* Change the MAC/PHY interface to SMBus
+	 * Force the SMBus in PHY page769_23[0] = 1
+	 * Force the SMBus in MAC CTRL_EXT[11] = 1
+	 */
+	e1e_rphy(hw, CV_SMB_CTRL, &phy_data);
+	phy_data |= CV_SMB_CTRL_FORCE_SMBUS;
+	e1e_wphy(hw, CV_SMB_CTRL, phy_data);
+	mac_data = er32(CTRL_EXT);
+	mac_data |= E1000_CTRL_EXT_FORCE_SMBUS;
+	ew32(CTRL_EXT, mac_data);
+
+	/* DFT control: PHY bit: page769_20[0] = 1
+	 * Gate PPW via EXTCNF_CTRL - set 0x0F00[7] = 1
+	 */
+	e1e_rphy(hw, I82579_DFT_CTRL, &phy_data);
+	phy_data |= BIT(0);
+	e1e_wphy(hw, I82579_DFT_CTRL, phy_data);
+
+	mac_data = er32(EXTCNF_CTRL);
+	mac_data |= E1000_EXTCNF_CTRL_GATE_PHY_CFG;
+	ew32(EXTCNF_CTRL, mac_data);
+
+	/* Check MAC Tx/Rx packet buffer pointers.
+	 * Reset MAC Tx/Rx packet buffer pointers to suppress any
+	 * pending traffic indication that would prevent power gating.
+	 */
+	mac_data = er32(TDFH);
+	if (mac_data)
+		ew32(TDFH, 0);
+	mac_data = er32(TDFT);
+	if (mac_data)
+		ew32(TDFT, 0);
+	mac_data = er32(TDFHS);
+	if (mac_data)
+		ew32(TDFHS, 0);
+	mac_data = er32(TDFTS);
+	if (mac_data)
+		ew32(TDFTS, 0);
+	mac_data = er32(TDFPC);
+	if (mac_data)
+		ew32(TDFPC, 0);
+	mac_data = er32(RDFH);
+	if (mac_data)
+		ew32(RDFH, 0);
+	mac_data = er32(RDFT);
+	if (mac_data)
+		ew32(RDFT, 0);
+	mac_data = er32(RDFHS);
+	if (mac_data)
+		ew32(RDFHS, 0);
+	mac_data = er32(RDFTS);
+	if (mac_data)
+		ew32(RDFTS, 0);
+	mac_data = er32(RDFPC);
+	if (mac_data)
+		ew32(RDFPC, 0);
+
+	/* Enable the Dynamic Power Gating in the MAC */
+	mac_data = er32(FEXTNVM7);
+	mac_data |= BIT(22);
+	ew32(FEXTNVM7, mac_data);
+
+	/* Disable the time synchronization clock */
+	mac_data = er32(FEXTNVM7);
+	mac_data |= BIT(31);
+	mac_data &= ~BIT(0);
+	ew32(FEXTNVM7, mac_data);
+
+	/* Dynamic Power Gating Enable */
+	mac_data = er32(CTRL_EXT);
+	mac_data |= BIT(3);
+	ew32(CTRL_EXT, mac_data);
+
+	/* Enable the Dynamic Clock Gating in the DMA and MAC */
+	mac_data = er32(CTRL_EXT);
+	mac_data |= E1000_CTRL_EXT_DMA_DYN_CLK_EN;
+	ew32(CTRL_EXT, mac_data);
+
+	/* No MAC DPG gating SLP_S0 in modern standby
+	 * Switch the logic of the lanphypc to use PMC counter
+	 */
+	mac_data = er32(FEXTNVM5);
+	mac_data |= BIT(7);
+	ew32(FEXTNVM5, mac_data);
+
+	return 0;
+}
+
+static int e1000e_s0ix_exit_flow(struct e1000_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u32 mac_data;
+	u16 phy_data;
+
+	/* Disable the Dynamic Power Gating in the MAC */
+	mac_data = er32(FEXTNVM7);
+	mac_data &= 0xFFBFFFFF;
+	ew32(FEXTNVM7, mac_data);
+
+	/* Enable the time synchronization clock */
+	mac_data = er32(FEXTNVM7);
+	mac_data |= BIT(0);
+	ew32(FEXTNVM7, mac_data);
+
+	/* Disable Dynamic Power Gating */
+	mac_data = er32(CTRL_EXT);
+	mac_data &= 0xFFFFFFF7;
+	ew32(CTRL_EXT, mac_data);
+
+	/* Disable the Dynamic Clock Gating in the DMA and MAC */
+	mac_data = er32(CTRL_EXT);
+	mac_data &= 0xFFF7FFFF;
+	ew32(CTRL_EXT, mac_data);
+
+	/* Revert the lanphypc logic to use the internal Gbe counter
+	 * and not the PMC counter
+	 */
+	mac_data = er32(FEXTNVM5);
+	mac_data &= 0xFFFFFF7F;
+	ew32(FEXTNVM5, mac_data);
+
+	/* Enable the periodic inband message,
+	 * Request PCIe clock in K1 page770_17[10:9] =01b
+	 */
+	e1e_rphy(hw, HV_PM_CTRL, &phy_data);
+	phy_data &= 0xFBFF;
+	phy_data |= HV_PM_CTRL_K1_CLK_REQ;
+	e1e_wphy(hw, HV_PM_CTRL, phy_data);
+
+	/* Return back configuration
+	 * 772_29[5] = 0 CS_Mode_Stay_In_K1
+	 */
+	e1e_rphy(hw, I217_CGFREG, &phy_data);
+	phy_data &= 0xFFDF;
+	e1e_wphy(hw, I217_CGFREG, phy_data);
+
+	/* Change the MAC/PHY interface to Kumeran
+	 * Unforce the SMBus in PHY page769_23[0] = 0
+	 * Unforce the SMBus in MAC CTRL_EXT[11] = 0
+	 */
+	e1e_rphy(hw, CV_SMB_CTRL, &phy_data);
+	phy_data &= ~CV_SMB_CTRL_FORCE_SMBUS;
+	e1e_wphy(hw, CV_SMB_CTRL, phy_data);
+	mac_data = er32(CTRL_EXT);
+	mac_data &= ~E1000_CTRL_EXT_FORCE_SMBUS;
+	ew32(CTRL_EXT, mac_data);
+
+	return 0;
+}
+
 static int e1000e_pm_freeze(struct device *dev)
 {
 	struct net_device *netdev = dev_get_drvdata(dev);
@@ -6650,6 +6822,9 @@ static int e1000e_pm_thaw(struct device *dev)
 static int e1000e_pm_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	struct e1000_hw *hw = &adapter->hw;
 	int rc;
 
 	e1000e_flush_lpic(pdev);
@@ -6660,14 +6835,25 @@ static int e1000e_pm_suspend(struct device *dev)
 	if (rc)
 		e1000e_pm_thaw(dev);
 
+	/* Introduce S0ix implementation */
+	if (hw->mac.type >= e1000_pch_cnp)
+		e1000e_s0ix_entry_flow(adapter);
+
 	return rc;
 }
 
 static int e1000e_pm_resume(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	struct e1000_hw *hw = &adapter->hw;
 	int rc;
 
+	/* Introduce S0ix implementation */
+	if (hw->mac.type >= e1000_pch_cnp)
+		e1000e_s0ix_exit_flow(adapter);
+
 	rc = __e1000_resume(pdev);
 	if (rc)
 		return rc;
diff --git a/drivers/net/ethernet/intel/e1000e/regs.h b/drivers/net/ethernet/intel/e1000e/regs.h
index 47f5ca793970..df59fd1d660c 100644
--- a/drivers/net/ethernet/intel/e1000e/regs.h
+++ b/drivers/net/ethernet/intel/e1000e/regs.h
@@ -18,6 +18,7 @@
 #define E1000_FEXTNVM	0x00028	/* Future Extended NVM - RW */
 #define E1000_FEXTNVM3	0x0003C	/* Future Extended NVM 3 - RW */
 #define E1000_FEXTNVM4	0x00024	/* Future Extended NVM 4 - RW */
+#define E1000_FEXTNVM5	0x00014	/* Future Extended NVM 5 - RW */
 #define E1000_FEXTNVM6	0x00010	/* Future Extended NVM 6 - RW */
 #define E1000_FEXTNVM7	0x000E4	/* Future Extended NVM 7 - RW */
 #define E1000_FEXTNVM9	0x5BB4	/* Future Extended NVM 9 - RW */
@@ -234,4 +235,7 @@
 #define E1000_RXMTRL	0x0B634	/* Time sync Rx EtherType and Msg Type - RW */
 #define E1000_RXUDP	0x0B638	/* Time Sync Rx UDP Port - RW */
 
+/* PHY registers */
+#define I82579_DFT_CTRL	PHY_REG(769, 20)
+
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 5/7] igc: Add set_rx_mode support
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2019-10-16 23:47 ` [net-next 4/7] e1000e: Add support for S0ix Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-16 23:47 ` [net-next 6/7] igc: Add Rx checksum support Jeff Kirsher
  2019-10-16 23:47 ` [net-next 7/7] igc: Clean up unused shadow_vfta pointer Jeff Kirsher
  6 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

Add multicast addresses list to the MTA table.
Implement basic Rx mode support.
Add option for IPv6 address settings.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h |   3 +
 drivers/net/ethernet/intel/igc/igc_hw.h      |   1 +
 drivers/net/ethernet/intel/igc/igc_mac.c     | 104 +++++++++++
 drivers/net/ethernet/intel/igc/igc_mac.h     |   2 +
 drivers/net/ethernet/intel/igc/igc_main.c    | 181 +++++++++++++++++++
 5 files changed, 291 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index f3f2325fe567..03f1aca3f57f 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -402,4 +402,7 @@
 #define IGC_ADVTXD_TUCMD_L4T_TCP	0x00000800  /* L4 Packet Type of TCP */
 #define IGC_ADVTXD_TUCMD_L4T_SCTP	0x00001000 /* L4 packet TYPE of SCTP */
 
+/* Maximum size of the MTA register table in all supported adapters */
+#define MAX_MTA_REG			128
+
 #endif /* _IGC_DEFINES_H_ */
diff --git a/drivers/net/ethernet/intel/igc/igc_hw.h b/drivers/net/ethernet/intel/igc/igc_hw.h
index abb2d72911ff..20f710645746 100644
--- a/drivers/net/ethernet/intel/igc/igc_hw.h
+++ b/drivers/net/ethernet/intel/igc/igc_hw.h
@@ -91,6 +91,7 @@ struct igc_mac_info {
 	u16 mta_reg_count;
 	u16 uta_reg_count;
 
+	u32 mta_shadow[MAX_MTA_REG];
 	u16 rar_entry_count;
 
 	u8 forced_speed_duplex;
diff --git a/drivers/net/ethernet/intel/igc/igc_mac.c b/drivers/net/ethernet/intel/igc/igc_mac.c
index 5eeb4c8caf4a..12aa6b5fcb5d 100644
--- a/drivers/net/ethernet/intel/igc/igc_mac.c
+++ b/drivers/net/ethernet/intel/igc/igc_mac.c
@@ -784,3 +784,107 @@ bool igc_enable_mng_pass_thru(struct igc_hw *hw)
 out:
 	return ret_val;
 }
+
+/**
+ *  igc_hash_mc_addr - Generate a multicast hash value
+ *  @hw: pointer to the HW structure
+ *  @mc_addr: pointer to a multicast address
+ *
+ *  Generates a multicast address hash value which is used to determine
+ *  the multicast filter table array address and new table value.  See
+ *  igc_mta_set()
+ **/
+static u32 igc_hash_mc_addr(struct igc_hw *hw, u8 *mc_addr)
+{
+	u32 hash_value, hash_mask;
+	u8 bit_shift = 0;
+
+	/* Register count multiplied by bits per register */
+	hash_mask = (hw->mac.mta_reg_count * 32) - 1;
+
+	/* For a mc_filter_type of 0, bit_shift is the number of left-shifts
+	 * where 0xFF would still fall within the hash mask.
+	 */
+	while (hash_mask >> bit_shift != 0xFF)
+		bit_shift++;
+
+	/* The portion of the address that is used for the hash table
+	 * is determined by the mc_filter_type setting.
+	 * The algorithm is such that there is a total of 8 bits of shifting.
+	 * The bit_shift for a mc_filter_type of 0 represents the number of
+	 * left-shifts where the MSB of mc_addr[5] would still fall within
+	 * the hash_mask.  Case 0 does this exactly.  Since there are a total
+	 * of 8 bits of shifting, then mc_addr[4] will shift right the
+	 * remaining number of bits. Thus 8 - bit_shift.  The rest of the
+	 * cases are a variation of this algorithm...essentially raising the
+	 * number of bits to shift mc_addr[5] left, while still keeping the
+	 * 8-bit shifting total.
+	 *
+	 * For example, given the following Destination MAC Address and an
+	 * MTA register count of 128 (thus a 4096-bit vector and 0xFFF mask),
+	 * we can see that the bit_shift for case 0 is 4.  These are the hash
+	 * values resulting from each mc_filter_type...
+	 * [0] [1] [2] [3] [4] [5]
+	 * 01  AA  00  12  34  56
+	 * LSB                 MSB
+	 *
+	 * case 0: hash_value = ((0x34 >> 4) | (0x56 << 4)) & 0xFFF = 0x563
+	 * case 1: hash_value = ((0x34 >> 3) | (0x56 << 5)) & 0xFFF = 0xAC6
+	 * case 2: hash_value = ((0x34 >> 2) | (0x56 << 6)) & 0xFFF = 0x163
+	 * case 3: hash_value = ((0x34 >> 0) | (0x56 << 8)) & 0xFFF = 0x634
+	 */
+	switch (hw->mac.mc_filter_type) {
+	default:
+	case 0:
+		break;
+	case 1:
+		bit_shift += 1;
+		break;
+	case 2:
+		bit_shift += 2;
+		break;
+	case 3:
+		bit_shift += 4;
+		break;
+	}
+
+	hash_value = hash_mask & (((mc_addr[4] >> (8 - bit_shift)) |
+				  (((u16)mc_addr[5]) << bit_shift)));
+
+	return hash_value;
+}
+
+/**
+ *  igc_update_mc_addr_list - Update Multicast addresses
+ *  @hw: pointer to the HW structure
+ *  @mc_addr_list: array of multicast addresses to program
+ *  @mc_addr_count: number of multicast addresses to program
+ *
+ *  Updates entire Multicast Table Array.
+ *  The caller must have a packed mc_addr_list of multicast addresses.
+ **/
+void igc_update_mc_addr_list(struct igc_hw *hw,
+			     u8 *mc_addr_list, u32 mc_addr_count)
+{
+	u32 hash_value, hash_bit, hash_reg;
+	int i;
+
+	/* clear mta_shadow */
+	memset(&hw->mac.mta_shadow, 0, sizeof(hw->mac.mta_shadow));
+
+	/* update mta_shadow from mc_addr_list */
+	for (i = 0; (u32)i < mc_addr_count; i++) {
+		hash_value = igc_hash_mc_addr(hw, mc_addr_list);
+
+		hash_reg = (hash_value >> 5) & (hw->mac.mta_reg_count - 1);
+		hash_bit = hash_value & 0x1F;
+
+		hw->mac.mta_shadow[hash_reg] |= BIT(hash_bit);
+		mc_addr_list += ETH_ALEN;
+	}
+
+	/* replace the entire MTA table */
+	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
+		array_wr32(IGC_MTA, i, hw->mac.mta_shadow[i]);
+	wrfl();
+}
diff --git a/drivers/net/ethernet/intel/igc/igc_mac.h b/drivers/net/ethernet/intel/igc/igc_mac.h
index 782bc995badc..832cccec87cd 100644
--- a/drivers/net/ethernet/intel/igc/igc_mac.h
+++ b/drivers/net/ethernet/intel/igc/igc_mac.h
@@ -29,6 +29,8 @@ s32 igc_get_speed_and_duplex_copper(struct igc_hw *hw, u16 *speed,
 				    u16 *duplex);
 
 bool igc_enable_mng_pass_thru(struct igc_hw *hw);
+void igc_update_mc_addr_list(struct igc_hw *hw,
+			     u8 *mc_addr_list, u32 mc_addr_count);
 
 enum igc_mng_mode {
 	igc_mng_mode_none = 0,
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 12a030343ed7..a861fc038721 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -795,6 +795,44 @@ static int igc_set_mac(struct net_device *netdev, void *p)
 	return 0;
 }
 
+/**
+ *  igc_write_mc_addr_list - write multicast addresses to MTA
+ *  @netdev: network interface device structure
+ *
+ *  Writes multicast address list to the MTA hash table.
+ *  Returns: -ENOMEM on failure
+ *           0 on no addresses written
+ *           X on writing X addresses to MTA
+ **/
+static int igc_write_mc_addr_list(struct net_device *netdev)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	struct igc_hw *hw = &adapter->hw;
+	struct netdev_hw_addr *ha;
+	u8  *mta_list;
+	int i;
+
+	if (netdev_mc_empty(netdev)) {
+		/* nothing to program, so clear mc list */
+		igc_update_mc_addr_list(hw, NULL, 0);
+		return 0;
+	}
+
+	mta_list = kcalloc(netdev_mc_count(netdev), 6, GFP_ATOMIC);
+	if (!mta_list)
+		return -ENOMEM;
+
+	/* The shared function expects a packed array of only addresses. */
+	i = 0;
+	netdev_for_each_mc_addr(ha, netdev)
+		memcpy(mta_list + (i++ * ETH_ALEN), ha->addr, ETH_ALEN);
+
+	igc_update_mc_addr_list(hw, mta_list, i);
+	kfree(mta_list);
+
+	return netdev_mc_count(netdev);
+}
+
 static void igc_tx_ctxtdesc(struct igc_ring *tx_ring,
 			    struct igc_tx_buffer *first,
 			    u32 vlan_macip_lens, u32 type_tucmd,
@@ -2518,6 +2556,110 @@ int igc_del_mac_steering_filter(struct igc_adapter *adapter,
 					IGC_MAC_STATE_QUEUE_STEERING | flags);
 }
 
+/* Add a MAC filter for 'addr' directing matching traffic to 'queue',
+ * 'flags' is used to indicate what kind of match is made, match is by
+ * default for the destination address, if matching by source address
+ * is desired the flag IGC_MAC_STATE_SRC_ADDR can be used.
+ */
+static int igc_add_mac_filter(struct igc_adapter *adapter,
+			      const u8 *addr, const u8 queue)
+{
+	struct igc_hw *hw = &adapter->hw;
+	int rar_entries = hw->mac.rar_entry_count;
+	int i;
+
+	if (is_zero_ether_addr(addr))
+		return -EINVAL;
+
+	/* Search for the first empty entry in the MAC table.
+	 * Do not touch entries at the end of the table reserved for the VF MAC
+	 * addresses.
+	 */
+	for (i = 0; i < rar_entries; i++) {
+		if (!igc_mac_entry_can_be_used(&adapter->mac_table[i],
+					       addr, 0))
+			continue;
+
+		ether_addr_copy(adapter->mac_table[i].addr, addr);
+		adapter->mac_table[i].queue = queue;
+		adapter->mac_table[i].state |= IGC_MAC_STATE_IN_USE;
+
+		igc_rar_set_index(adapter, i);
+		return i;
+	}
+
+	return -ENOSPC;
+}
+
+/* Remove a MAC filter for 'addr' directing matching traffic to
+ * 'queue', 'flags' is used to indicate what kind of match need to be
+ * removed, match is by default for the destination address, if
+ * matching by source address is to be removed the flag
+ * IGC_MAC_STATE_SRC_ADDR can be used.
+ */
+static int igc_del_mac_filter(struct igc_adapter *adapter,
+			      const u8 *addr, const u8 queue)
+{
+	struct igc_hw *hw = &adapter->hw;
+	int rar_entries = hw->mac.rar_entry_count;
+	int i;
+
+	if (is_zero_ether_addr(addr))
+		return -EINVAL;
+
+	/* Search for matching entry in the MAC table based on given address
+	 * and queue. Do not touch entries at the end of the table reserved
+	 * for the VF MAC addresses.
+	 */
+	for (i = 0; i < rar_entries; i++) {
+		if (!(adapter->mac_table[i].state & IGC_MAC_STATE_IN_USE))
+			continue;
+		if (adapter->mac_table[i].state != 0)
+			continue;
+		if (adapter->mac_table[i].queue != queue)
+			continue;
+		if (!ether_addr_equal(adapter->mac_table[i].addr, addr))
+			continue;
+
+		/* When a filter for the default address is "deleted",
+		 * we return it to its initial configuration
+		 */
+		if (adapter->mac_table[i].state & IGC_MAC_STATE_DEFAULT) {
+			adapter->mac_table[i].state =
+				IGC_MAC_STATE_DEFAULT | IGC_MAC_STATE_IN_USE;
+			adapter->mac_table[i].queue = 0;
+		} else {
+			adapter->mac_table[i].state = 0;
+			adapter->mac_table[i].queue = 0;
+			memset(adapter->mac_table[i].addr, 0, ETH_ALEN);
+		}
+
+		igc_rar_set_index(adapter, i);
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static int igc_uc_sync(struct net_device *netdev, const unsigned char *addr)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	int ret;
+
+	ret = igc_add_mac_filter(adapter, addr, adapter->num_rx_queues);
+
+	return min_t(int, ret, 0);
+}
+
+static int igc_uc_unsync(struct net_device *netdev, const unsigned char *addr)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+
+	igc_del_mac_filter(adapter, addr, adapter->num_rx_queues);
+
+	return 0;
+}
+
 /**
  * igc_set_rx_mode - Secondary Unicast, Multicast and Promiscuous mode set
  * @netdev: network interface device structure
@@ -2529,6 +2671,44 @@ int igc_del_mac_steering_filter(struct igc_adapter *adapter,
  */
 static void igc_set_rx_mode(struct net_device *netdev)
 {
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	struct igc_hw *hw = &adapter->hw;
+	u32 rctl = 0, rlpml = MAX_JUMBO_FRAME_SIZE;
+	int count;
+
+	/* Check for Promiscuous and All Multicast modes */
+	if (netdev->flags & IFF_PROMISC) {
+		rctl |= IGC_RCTL_UPE | IGC_RCTL_MPE;
+	} else {
+		if (netdev->flags & IFF_ALLMULTI) {
+			rctl |= IGC_RCTL_MPE;
+		} else {
+			/* Write addresses to the MTA, if the attempt fails
+			 * then we should just turn on promiscuous mode so
+			 * that we can at least receive multicast traffic
+			 */
+			count = igc_write_mc_addr_list(netdev);
+			if (count < 0)
+				rctl |= IGC_RCTL_MPE;
+		}
+	}
+
+	/* Write addresses to available RAR registers, if there is not
+	 * sufficient space to store all the addresses then enable
+	 * unicast promiscuous mode
+	 */
+	if (__dev_uc_sync(netdev, igc_uc_sync, igc_uc_unsync))
+		rctl |= IGC_RCTL_UPE;
+
+	/* update state of unicast and multicast */
+	rctl |= rd32(IGC_RCTL) & ~(IGC_RCTL_UPE | IGC_RCTL_MPE);
+	wr32(IGC_RCTL, rctl);
+
+#if (PAGE_SIZE < 8192)
+	if (adapter->max_frame_size <= IGC_MAX_FRAME_BUILD_SKB)
+		rlpml = IGC_MAX_FRAME_BUILD_SKB;
+#endif
+	wr32(IGC_RLPML, rlpml);
 }
 
 /**
@@ -3982,6 +4162,7 @@ static const struct net_device_ops igc_netdev_ops = {
 	.ndo_open		= igc_open,
 	.ndo_stop		= igc_close,
 	.ndo_start_xmit		= igc_xmit_frame,
+	.ndo_set_rx_mode	= igc_set_rx_mode,
 	.ndo_set_mac_address	= igc_set_mac,
 	.ndo_change_mtu		= igc_change_mtu,
 	.ndo_get_stats		= igc_get_stats,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 6/7] igc: Add Rx checksum support
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2019-10-16 23:47 ` [net-next 5/7] igc: Add set_rx_mode support Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  2019-10-16 23:47 ` [net-next 7/7] igc: Clean up unused shadow_vfta pointer Jeff Kirsher
  6 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

Extend the socket buffer field process and add Rx checksum functionality
Minor: fix indentation with tab instead of spaces.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h |  5 ++-
 drivers/net/ethernet/intel/igc/igc_main.c    | 43 ++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 03f1aca3f57f..f3788f0b95b4 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -282,7 +282,10 @@
 #define IGC_RCTL_BAM		0x00008000 /* broadcast enable */
 
 /* Receive Descriptor bit definitions */
-#define IGC_RXD_STAT_EOP	0x02    /* End of Packet */
+#define IGC_RXD_STAT_EOP	0x02	/* End of Packet */
+#define IGC_RXD_STAT_IXSM	0x04	/* Ignore checksum */
+#define IGC_RXD_STAT_UDPCS	0x10	/* UDP xsum calculated */
+#define IGC_RXD_STAT_TCPCS	0x20	/* TCP xsum calculated */
 
 #define IGC_RXDEXT_STATERR_CE		0x01000000
 #define IGC_RXDEXT_STATERR_SE		0x02000000
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index a861fc038721..52521a0bbf48 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1201,6 +1201,46 @@ static netdev_tx_t igc_xmit_frame(struct sk_buff *skb,
 	return igc_xmit_frame_ring(skb, igc_tx_queue_mapping(adapter, skb));
 }
 
+static inline void igc_rx_checksum(struct igc_ring *ring,
+				   union igc_adv_rx_desc *rx_desc,
+				   struct sk_buff *skb)
+{
+	skb_checksum_none_assert(skb);
+
+	/* Ignore Checksum bit is set */
+	if (igc_test_staterr(rx_desc, IGC_RXD_STAT_IXSM))
+		return;
+
+	/* Rx checksum disabled via ethtool */
+	if (!(ring->netdev->features & NETIF_F_RXCSUM))
+		return;
+
+	/* TCP/UDP checksum error bit is set */
+	if (igc_test_staterr(rx_desc,
+			     IGC_RXDEXT_STATERR_TCPE |
+			     IGC_RXDEXT_STATERR_IPE)) {
+		/* work around errata with sctp packets where the TCPE aka
+		 * L4E bit is set incorrectly on 64 byte (60 byte w/o crc)
+		 * packets, (aka let the stack check the crc32c)
+		 */
+		if (!(skb->len == 60 &&
+		      test_bit(IGC_RING_FLAG_RX_SCTP_CSUM, &ring->flags))) {
+			u64_stats_update_begin(&ring->rx_syncp);
+			ring->rx_stats.csum_err++;
+			u64_stats_update_end(&ring->rx_syncp);
+		}
+		/* let the stack verify checksum errors */
+		return;
+	}
+	/* It must be a TCP or UDP packet with a valid checksum */
+	if (igc_test_staterr(rx_desc, IGC_RXD_STAT_TCPCS |
+				      IGC_RXD_STAT_UDPCS))
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+	dev_dbg(ring->dev, "cksum success: bits %08X\n",
+		le32_to_cpu(rx_desc->wb.upper.status_error));
+}
+
 static inline void igc_rx_hash(struct igc_ring *ring,
 			       union igc_adv_rx_desc *rx_desc,
 			       struct sk_buff *skb)
@@ -1227,6 +1267,8 @@ static void igc_process_skb_fields(struct igc_ring *rx_ring,
 {
 	igc_rx_hash(rx_ring, rx_desc, skb);
 
+	igc_rx_checksum(rx_ring, rx_desc, skb);
+
 	skb_record_rx_queue(skb, rx_ring->queue_index);
 
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
@@ -4392,6 +4434,7 @@ static int igc_probe(struct pci_dev *pdev,
 		goto err_sw_init;
 
 	/* Add supported features to the features list*/
+	netdev->features |= NETIF_F_RXCSUM;
 	netdev->features |= NETIF_F_HW_CSUM;
 	netdev->features |= NETIF_F_SCTP_CRC;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net-next 7/7] igc: Clean up unused shadow_vfta pointer
  2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2019-10-16 23:47 ` [net-next 6/7] igc: Add Rx checksum support Jeff Kirsher
@ 2019-10-16 23:47 ` Jeff Kirsher
  6 siblings, 0 replies; 16+ messages in thread
From: Jeff Kirsher @ 2019-10-16 23:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown, Jeff Kirsher

From: Sasha Neftin <sasha.neftin@intel.com>

VLAN filter table array not implemented yet and shadow_vfta pointer
not used. Clean up the code and remove the unused shadow_vfta pointer.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      | 1 -
 drivers/net/ethernet/intel/igc/igc_main.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 7e16345d836e..0868677d43ed 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -411,7 +411,6 @@ struct igc_adapter {
 	u32 tx_hwtstamp_timeouts;
 	u32 tx_hwtstamp_skipped;
 	u32 rx_hwtstamp_cleared;
-	u32 *shadow_vfta;
 
 	u32 rss_queues;
 	u32 rss_indir_tbl_init;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 52521a0bbf48..0a4bcd2d3b85 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4574,7 +4574,6 @@ static void igc_remove(struct pci_dev *pdev)
 	pci_release_mem_regions(pdev);
 
 	kfree(adapter->mac_table);
-	kfree(adapter->shadow_vfta);
 	free_netdev(netdev);
 
 	pci_disable_pcie_error_reporting(pdev);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [net-next 2/7] igb: add rx drop enable attribute
  2019-10-16 23:47 ` [net-next 2/7] igb: add rx drop enable attribute Jeff Kirsher
@ 2019-10-16 23:55   ` Jakub Kicinski
  2019-10-17 11:24     ` Robert Beckett
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2019-10-16 23:55 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, Robert Beckett, netdev, nhorman, sassmann, Aaron Brown

On Wed, 16 Oct 2019 16:47:06 -0700, Jeff Kirsher wrote:
> From: Robert Beckett <bob.beckett@collabora.com>
> 
> To allow userland to enable or disable dropping packets when descriptor
> ring is exhausted, add RX_DROP_EN private flag.
> 
> This can be used in conjunction with flow control to mitigate packet storms
> (e.g. due to network loop or DoS) by forcing the network adapter to send
> pause frames whenever the ring is close to exhaustion.
> 
> By default this will maintain previous behaviour of enabling dropping of
> packets during ring buffer exhaustion.
> Some use cases prefer to not drop packets upon exhaustion, but instead
> use flow control to limit ingress rates and ensure no dropped packets.
> This is useful when the host CPU cannot keep up with packet delivery,
> but data delivery is more important than throughput via multiple queues.
> 
> Userland can set this flag to 0 via ethtool to disable packet dropping.
> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

How is this different than enabling/disabling flow control..

ethtool -a/-A

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed
  2019-10-16 23:47 ` [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed Jeff Kirsher
@ 2019-10-16 23:58   ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2019-10-16 23:58 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, Lyude Paul, netdev, nhorman, sassmann, Sasha Neftin,
	Aaron Brown, Feng Tang

On Wed, 16 Oct 2019 16:47:05 -0700, Jeff Kirsher wrote:
> From: Lyude Paul <lyude@redhat.com>
> 
> Fatal read errors are worth warning about, unless of course the device
> was just unplugged from the machine - something that's a rather normal
> occurence when the igb/igc adapter is located on a Thunderbolt dock. So,
> let's only WARN() if there's a fatal read error while the device is
> still present.
> 
> This fixes the following WARN splat that's been appearing whenever I
> unplug my Caldigit TS3 Thunderbolt dock from my laptop:
> 
>  [...]
> 
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> Fixes: 47e16692b26b ("igb/igc: warn when fatal read failure happens")
> Acked-by: Sasha Neftin <sasha.neftin@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Acked-by: Feng Tang <feng.tang@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Any plans to start sending fixes to net? :(

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 4/7] e1000e: Add support for S0ix
  2019-10-16 23:47 ` [net-next 4/7] e1000e: Add support for S0ix Jeff Kirsher
@ 2019-10-17  0:02   ` Jakub Kicinski
  2019-10-17  5:09     ` Neftin, Sasha
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2019-10-17  0:02 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, Sasha Neftin, netdev, nhorman, sassmann,
	Rafael J. Wysocki, Vitaly Lifshits, Rajneesh Bhardwaj,
	Aaron Brown

On Wed, 16 Oct 2019 16:47:08 -0700, Jeff Kirsher wrote:
>  static int e1000e_pm_freeze(struct device *dev)
>  {
>  	struct net_device *netdev = dev_get_drvdata(dev);
> @@ -6650,6 +6822,9 @@ static int e1000e_pm_thaw(struct device *dev)
>  static int e1000e_pm_suspend(struct device *dev)
>  {
>  	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
> +	struct e1000_adapter *adapter = netdev_priv(netdev);
> +	struct e1000_hw *hw = &adapter->hw;
>  	int rc;

reverse xmas tree?

>  
>  	e1000e_flush_lpic(pdev);
> @@ -6660,14 +6835,25 @@ static int e1000e_pm_suspend(struct device *dev)
>  	if (rc)
>  		e1000e_pm_thaw(dev);
>  
> +	/* Introduce S0ix implementation */
> +	if (hw->mac.type >= e1000_pch_cnp)
> +		e1000e_s0ix_entry_flow(adapter);

the entry/exit functions never fail, you can make them return void

>  	return rc;
>  }
>  
>  static int e1000e_pm_resume(struct device *dev)
>  {
>  	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
> +	struct e1000_adapter *adapter = netdev_priv(netdev);
> +	struct e1000_hw *hw = &adapter->hw;
>  	int rc;
>  
> +	/* Introduce S0ix implementation */
> +	if (hw->mac.type >= e1000_pch_cnp)
> +		e1000e_s0ix_exit_flow(adapter);
> +
>  	rc = __e1000_resume(pdev);
>  	if (rc)
>  		return rc;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 4/7] e1000e: Add support for S0ix
  2019-10-17  0:02   ` Jakub Kicinski
@ 2019-10-17  5:09     ` Neftin, Sasha
  0 siblings, 0 replies; 16+ messages in thread
From: Neftin, Sasha @ 2019-10-17  5:09 UTC (permalink / raw)
  To: Jakub Kicinski, Jeff Kirsher
  Cc: davem, netdev, nhorman, sassmann, Rafael J. Wysocki,
	Vitaly Lifshits, Rajneesh Bhardwaj, Aaron Brown

On 10/17/2019 03:02, Jakub Kicinski wrote:
> On Wed, 16 Oct 2019 16:47:08 -0700, Jeff Kirsher wrote:
>>   static int e1000e_pm_freeze(struct device *dev)
>>   {
>>   	struct net_device *netdev = dev_get_drvdata(dev);
>> @@ -6650,6 +6822,9 @@ static int e1000e_pm_thaw(struct device *dev)
>>   static int e1000e_pm_suspend(struct device *dev)
>>   {
>>   	struct pci_dev *pdev = to_pci_dev(dev);
>> +	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
>> +	struct e1000_adapter *adapter = netdev_priv(netdev);
>> +	struct e1000_hw *hw = &adapter->hw;
>>   	int rc;
> 
> reverse xmas tree?
> 
What do you suggest, move the 'struct pci_dev *pdev' two lines below?
>>   
>>   	e1000e_flush_lpic(pdev);
>> @@ -6660,14 +6835,25 @@ static int e1000e_pm_suspend(struct device *dev)
>>   	if (rc)
>>   		e1000e_pm_thaw(dev);
>>   
>> +	/* Introduce S0ix implementation */
>> +	if (hw->mac.type >= e1000_pch_cnp)
>> +		e1000e_s0ix_entry_flow(adapter);
> 
> the entry/exit functions never fail, you can make them return void
> 
Thanks Jakub for yor comment.This is only initial flow. We continue 
development S0ix support for next product and I consider add error code 
if it will required.
>>   	return rc;
>>   }
>>   
>>   static int e1000e_pm_resume(struct device *dev)
>>   {
>>   	struct pci_dev *pdev = to_pci_dev(dev);
>> +	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
>> +	struct e1000_adapter *adapter = netdev_priv(netdev);
>> +	struct e1000_hw *hw = &adapter->hw;
>>   	int rc;
>>   
>> +	/* Introduce S0ix implementation */
>> +	if (hw->mac.type >= e1000_pch_cnp)
>> +		e1000e_s0ix_exit_flow(adapter);
>> +
>>   	rc = __e1000_resume(pdev);
>>   	if (rc)
>>   		return rc;


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 2/7] igb: add rx drop enable attribute
  2019-10-16 23:55   ` Jakub Kicinski
@ 2019-10-17 11:24     ` Robert Beckett
  2019-10-17 15:44       ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Beckett @ 2019-10-17 11:24 UTC (permalink / raw)
  To: Jakub Kicinski, Jeff Kirsher
  Cc: davem, netdev, nhorman, sassmann, Aaron Brown

On Wed, 2019-10-16 at 16:55 -0700, Jakub Kicinski wrote:
> On Wed, 16 Oct 2019 16:47:06 -0700, Jeff Kirsher wrote:
> > From: Robert Beckett <bob.beckett@collabora.com>
> > 
> > To allow userland to enable or disable dropping packets when
> > descriptor
> > ring is exhausted, add RX_DROP_EN private flag.
> > 
> > This can be used in conjunction with flow control to mitigate
> > packet storms
> > (e.g. due to network loop or DoS) by forcing the network adapter to
> > send
> > pause frames whenever the ring is close to exhaustion.
> > 
> > By default this will maintain previous behaviour of enabling
> > dropping of
> > packets during ring buffer exhaustion.
> > Some use cases prefer to not drop packets upon exhaustion, but
> > instead
> > use flow control to limit ingress rates and ensure no dropped
> > packets.
> > This is useful when the host CPU cannot keep up with packet
> > delivery,
> > but data delivery is more important than throughput via multiple
> > queues.
> > 
> > Userland can set this flag to 0 via ethtool to disable packet
> > dropping.
> > 
> > Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> > Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> 
> How is this different than enabling/disabling flow control..
> 
> ethtool -a/-A

Enabling flow control enables the advertisement of flow control
capabilites and allows negotiation with link partner. It does not
dictate under which circumstances those pause frames will be emitted.

This patch enables an igb specific feature that can cause flow control
to be used. The default behaviour is to drop packets if the rx ring
buffer fills. This flag tells the driver instead to emit pause frames
and not drop packets, which is useful when reliable data delivery is
more important than throughput.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 2/7] igb: add rx drop enable attribute
  2019-10-17 11:24     ` Robert Beckett
@ 2019-10-17 15:44       ` Jakub Kicinski
  2019-10-17 17:04         ` Robert Beckett
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2019-10-17 15:44 UTC (permalink / raw)
  To: Robert Beckett
  Cc: Jeff Kirsher, davem, netdev, nhorman, sassmann, Aaron Brown

On Thu, 17 Oct 2019 12:24:03 +0100, Robert Beckett wrote:
> On Wed, 2019-10-16 at 16:55 -0700, Jakub Kicinski wrote:
> > On Wed, 16 Oct 2019 16:47:06 -0700, Jeff Kirsher wrote:  
> > > From: Robert Beckett <bob.beckett@collabora.com>
> > > 
> > > To allow userland to enable or disable dropping packets when
> > > descriptor
> > > ring is exhausted, add RX_DROP_EN private flag.
> > > 
> > > This can be used in conjunction with flow control to mitigate
> > > packet storms
> > > (e.g. due to network loop or DoS) by forcing the network adapter to
> > > send
> > > pause frames whenever the ring is close to exhaustion.
> > > 
> > > By default this will maintain previous behaviour of enabling
> > > dropping of
> > > packets during ring buffer exhaustion.
> > > Some use cases prefer to not drop packets upon exhaustion, but
> > > instead
> > > use flow control to limit ingress rates and ensure no dropped
> > > packets.
> > > This is useful when the host CPU cannot keep up with packet
> > > delivery,
> > > but data delivery is more important than throughput via multiple
> > > queues.
> > > 
> > > Userland can set this flag to 0 via ethtool to disable packet
> > > dropping.
> > > 
> > > Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> > > Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> > > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>  
> > 
> > How is this different than enabling/disabling flow control..
> > 
> > ethtool -a/-A  
> 
> Enabling flow control enables the advertisement of flow control
> capabilites and allows negotiation with link partner.

More or less. If autoneg is on it controls advertised bits,
if autoneg is off it controls the enabled/disable directly.

> It does not dictate under which circumstances those pause frames will
> be emitted.

So you're saying even with pause frames on igb by default will not
backpressure all the way to the wire if host RX ring is full/fill ring
is empty?

> This patch enables an igb specific feature that can cause flow control
> to be used. The default behaviour is to drop packets if the rx ring
> buffer fills. This flag tells the driver instead to emit pause frames
> and not drop packets, which is useful when reliable data delivery is
> more important than throughput.

The feature looks like something easily understood with a standard NIC
model in mind. Therefore it should have a generic config knob not a
private flag.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 2/7] igb: add rx drop enable attribute
  2019-10-17 15:44       ` Jakub Kicinski
@ 2019-10-17 17:04         ` Robert Beckett
  2019-10-17 17:40           ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Beckett @ 2019-10-17 17:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jeff Kirsher, davem, netdev, nhorman, sassmann, Aaron Brown

On Thu, 2019-10-17 at 08:44 -0700, Jakub Kicinski wrote:
> On Thu, 17 Oct 2019 12:24:03 +0100, Robert Beckett wrote:
> > On Wed, 2019-10-16 at 16:55 -0700, Jakub Kicinski wrote:
> > > On Wed, 16 Oct 2019 16:47:06 -0700, Jeff Kirsher wrote:  
> > > > From: Robert Beckett <bob.beckett@collabora.com>
> > > > 
> > > > To allow userland to enable or disable dropping packets when
> > > > descriptor
> > > > ring is exhausted, add RX_DROP_EN private flag.
> > > > 
> > > > This can be used in conjunction with flow control to mitigate
> > > > packet storms
> > > > (e.g. due to network loop or DoS) by forcing the network
> > > > adapter to
> > > > send
> > > > pause frames whenever the ring is close to exhaustion.
> > > > 
> > > > By default this will maintain previous behaviour of enabling
> > > > dropping of
> > > > packets during ring buffer exhaustion.
> > > > Some use cases prefer to not drop packets upon exhaustion, but
> > > > instead
> > > > use flow control to limit ingress rates and ensure no dropped
> > > > packets.
> > > > This is useful when the host CPU cannot keep up with packet
> > > > delivery,
> > > > but data delivery is more important than throughput via
> > > > multiple
> > > > queues.
> > > > 
> > > > Userland can set this flag to 0 via ethtool to disable packet
> > > > dropping.
> > > > 
> > > > Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> > > > Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> > > > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>  
> > > 
> > > How is this different than enabling/disabling flow control..
> > > 
> > > ethtool -a/-A  
> > 
> > Enabling flow control enables the advertisement of flow control
> > capabilites and allows negotiation with link partner.
> 
> More or less. If autoneg is on it controls advertised bits,
> if autoneg is off it controls the enabled/disable directly.
> 
> > It does not dictate under which circumstances those pause frames
> > will
> > be emitted.
> 
> So you're saying even with pause frames on igb by default will not
> backpressure all the way to the wire if host RX ring is full/fill
> ring
> is empty?

Correct.
Honestly I personally considered it a bug when I first saw it.

see e6bdb6fefc590

Specifically it enables dropping of frames if multiple queues are in
use, ostensibly to prevent head of line blocking between the different
rx queues.

This patch says that that should be a user choice, defaulting to the
old behaviour.


> 
> > This patch enables an igb specific feature that can cause flow
> > control
> > to be used. The default behaviour is to drop packets if the rx ring
> > buffer fills. This flag tells the driver instead to emit pause
> > frames
> > and not drop packets, which is useful when reliable data delivery
> > is
> > more important than throughput.
> 
> The feature looks like something easily understood with a standard
> NIC
> model in mind. Therefore it should have a generic config knob not a
> private flag.

I have no particular opinion on whether this should be a generic flag
or a priv flag. It could be argued that it is a priv flag to handle a
particular quirk of the igb driver that others likely won't share
(though Ive not vetted other drivers to check).

It could also be argued that if flow control is successfully enabled
(either forced or via autoneg) that dropping packets should not be
enabled, however I'm not enough of a networking expert to say whether
the flow control being enabled dictates which circumstances should
generate the pause frames or whether it is implementation defined.

If it is decided that the kernel in general should support a new set of
flags for flow control causes, over the top of the the enable/disable,
then I could see that being a generic flag that each nic decides how to
implement. I would rather wait for more consensus before tackling that
though.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net-next 2/7] igb: add rx drop enable attribute
  2019-10-17 17:04         ` Robert Beckett
@ 2019-10-17 17:40           ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2019-10-17 17:40 UTC (permalink / raw)
  To: Robert Beckett
  Cc: Jeff Kirsher, davem, netdev, nhorman, sassmann, Aaron Brown

On Thu, 17 Oct 2019 18:04:22 +0100, Robert Beckett wrote:
> On Thu, 2019-10-17 at 08:44 -0700, Jakub Kicinski wrote:
> > On Thu, 17 Oct 2019 12:24:03 +0100, Robert Beckett wrote:  
> > > On Wed, 2019-10-16 at 16:55 -0700, Jakub Kicinski wrote:  
> > > > On Wed, 16 Oct 2019 16:47:06 -0700, Jeff Kirsher wrote:    
> > > > > From: Robert Beckett <bob.beckett@collabora.com>
> > > > > 
> > > > > To allow userland to enable or disable dropping packets when
> > > > > descriptor
> > > > > ring is exhausted, add RX_DROP_EN private flag.
> > > > > 
> > > > > This can be used in conjunction with flow control to mitigate
> > > > > packet storms
> > > > > (e.g. due to network loop or DoS) by forcing the network
> > > > > adapter to
> > > > > send
> > > > > pause frames whenever the ring is close to exhaustion.
> > > > > 
> > > > > By default this will maintain previous behaviour of enabling
> > > > > dropping of
> > > > > packets during ring buffer exhaustion.
> > > > > Some use cases prefer to not drop packets upon exhaustion, but
> > > > > instead
> > > > > use flow control to limit ingress rates and ensure no dropped
> > > > > packets.
> > > > > This is useful when the host CPU cannot keep up with packet
> > > > > delivery,
> > > > > but data delivery is more important than throughput via
> > > > > multiple
> > > > > queues.
> > > > > 
> > > > > Userland can set this flag to 0 via ethtool to disable packet
> > > > > dropping.
> > > > > 
> > > > > Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> > > > > Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> > > > > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>    
> > > > 
> > > > How is this different than enabling/disabling flow control..
> > > > 
> > > > ethtool -a/-A    
> > > 
> > > Enabling flow control enables the advertisement of flow control
> > > capabilites and allows negotiation with link partner.  
> > 
> > More or less. If autoneg is on it controls advertised bits,
> > if autoneg is off it controls the enabled/disable directly.
> >   
> > > It does not dictate under which circumstances those pause frames
> > > will
> > > be emitted.  
> > 
> > So you're saying even with pause frames on igb by default will not
> > backpressure all the way to the wire if host RX ring is full/fill
> > ring
> > is empty?  
> 
> Correct.
> Honestly I personally considered it a bug when I first saw it.
> 
> see e6bdb6fefc590
> 
> Specifically it enables dropping of frames if multiple queues are in
> use, ostensibly to prevent head of line blocking between the different
> rx queues.
> 
> This patch says that that should be a user choice, defaulting to the
> old behaviour.

I'd say just always enable it then. Honestly if it's statically enabled
with multi queue (which I presume is the default?) then there's no need
for us to ponder the uAPI questions. I can't think of anyone flipping
that flag to disabled...

Flow control is supposed to prevent _all_ drops it can, AFAIU.

If we get an actual user request to change it we can revisit.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-10-17 17:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-16 23:47 [net-next 0/7][pull request] 1GbE Intel Wired LAN Driver Updates 2019-10-16 Jeff Kirsher
2019-10-16 23:47 ` [net-next 1/7] igb/igc: Don't warn on fatal read failures when the device is removed Jeff Kirsher
2019-10-16 23:58   ` Jakub Kicinski
2019-10-16 23:47 ` [net-next 2/7] igb: add rx drop enable attribute Jeff Kirsher
2019-10-16 23:55   ` Jakub Kicinski
2019-10-17 11:24     ` Robert Beckett
2019-10-17 15:44       ` Jakub Kicinski
2019-10-17 17:04         ` Robert Beckett
2019-10-17 17:40           ` Jakub Kicinski
2019-10-16 23:47 ` [net-next 3/7] igc: Add SCTP CRC checksumming functionality Jeff Kirsher
2019-10-16 23:47 ` [net-next 4/7] e1000e: Add support for S0ix Jeff Kirsher
2019-10-17  0:02   ` Jakub Kicinski
2019-10-17  5:09     ` Neftin, Sasha
2019-10-16 23:47 ` [net-next 5/7] igc: Add set_rx_mode support Jeff Kirsher
2019-10-16 23:47 ` [net-next 6/7] igc: Add Rx checksum support Jeff Kirsher
2019-10-16 23:47 ` [net-next 7/7] igc: Clean up unused shadow_vfta pointer Jeff Kirsher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).