netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24
@ 2018-01-24 20:55 Jeff Kirsher
  2018-01-24 20:55 ` [net-next 1/8] igb: Allow to remove administratively set MAC on VFs Jeff Kirsher
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to igb and e1000e only.

Corinna Vinschen implements the ability to set the VF MAC to
00:00:00:00:00:00 via RTM_SETLINK on the PF, to prevent receiving
"invlaid argument" when libvirt attempts to restore the MAC address back
to its original state of 00:00:00:00:00:00.

Zhang Shengju adds a new function igb_get_max_rss_queues() to get the
maxium number of RSS queues and to reduce code duplication.

Matt fixes an issue with e1000e where when setting HTHREST, we should
only be setting bit 8.  Also added a dev_info() message to alert when
C-states have been disabled, to help in debugging.

Jesus adds code comments to clarify the idlescope configuration
constraints.

Lyude Paul fixes a kernel crash on hotplug of igb, by checking to ensure
that we are in the process of dismantling the netdev on hotplug events.

Markus Elfring removes a duplicate message for a memory allocation
failure.

Daniel Hua fixes an issue where transmit timestamps will stop being put
into the socket when link is repeatedly up/down due to TSYNCTXCTL's TXTT
bit (Transmit timestamp valid bit) not clearing in the timeout logic of
ptp_tx_work(), which in turn causes no new timestamp to be loaded to the
TXSTMP register.

The following are changes since commit 46410c2efa9cb5b2f40c9ce24a75d147f44aedeb:
  Merge branch 'pktgen-Behavior-flags-fixes'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Corinna Vinschen (1):
  igb: Allow to remove administratively set MAC on VFs

Daniel Hua (1):
  igb: Clear TXSTMP when ptp_tx_work() is timeout

Jesus Sanchez-Palencia (1):
  igb: Clarify idleslope config constraints

Lyude Paul (1):
  igb: Free IRQs when device is hotplugged

Markus Elfring (1):
  igb: Delete an error message for a failed memory allocation in
    igb_enable_sriov()

Matt Turner (2):
  e1000e: Set HTHRESH when PTHRESH is used
  e1000e: Alert the user that C-states will be disabled by enabling
    jumbo frames

Zhang Shengju (1):
  igb: add function to get maximum RSS queues

 drivers/net/ethernet/intel/e1000e/netdev.c   |  4 +-
 drivers/net/ethernet/intel/igb/igb.h         |  1 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 32 +------------
 drivers/net/ethernet/intel/igb/igb_main.c    | 72 +++++++++++++++++++++-------
 drivers/net/ethernet/intel/igb/igb_ptp.c     |  9 ++++
 5 files changed, 70 insertions(+), 48 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [net-next 1/8] igb: Allow to remove administratively set MAC on VFs
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 2/8] igb: add function to get maximum RSS queues Jeff Kirsher
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Corinna Vinschen, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Corinna Vinschen <vinschen@redhat.com>

Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
for use by a virtual machine (either using vfio device assignment or
macvtap passthru mode), it saves the current MAC address and vlan tag
so that it can reset them to their original value when the guest is
done.  Libvirt can't leave the VF MAC set to the value used by the
now-defunct guest since it may be started again later using a
different VF, but it certainly shouldn't just pick any random value,
either. So it saves the state of everything prior to using the VF, and
resets it to that.

The igb driver initializes the MAC addresses of all VFs to
00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
netlink message, also visible in the list of VFs in the output of "ip
link show"). But when libvirt attempts to restore the MAC address back
to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
responds with "Invalid argument".

Forbidding a reset back to the original value leaves the VF MAC at the
value set for the now-defunct virtual machine. Especially on a system
with NetworkManager enabled, this has very bad consequences, since
NetworkManager forces all interfacess to be IFF_UP all the time - if
the same virtual machine is restarted using a different VF (or even on
a different host), there will be multiple interfaces watching for
traffic with the same MAC address.

To allow libvirt to revert to the original state, we need a way to
remove the administrative set MAC on a VF, to allow normal host
operation again, and to reset/overwrite the VF MAC via VF netdev.

This patch implements the outlined scenario by allowing to set the
VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
so it's possible to reset the VF MAC back to the original value via
the VF netdev.

Note: Recent patches to libvirt allow for a workaround if the NIC
isn't capable of resetting the administrative MAC back to all 0, but
in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <arron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 42 +++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index c208753ff5b7..8504900bf7b8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8718,7 +8718,8 @@ static void igb_rar_set_index(struct igb_adapter *adapter, u32 index)
 
 	/* Indicate to hardware the Address is Valid. */
 	if (adapter->mac_table[index].state & IGB_MAC_STATE_IN_USE) {
-		rar_high |= E1000_RAH_AV;
+		if (is_valid_ether_addr(addr))
+			rar_high |= E1000_RAH_AV;
 
 		if (hw->mac.type == e1000_82575)
 			rar_high |= E1000_RAH_POOL_1 *
@@ -8756,17 +8757,36 @@ static int igb_set_vf_mac(struct igb_adapter *adapter,
 static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
-	if (!is_valid_ether_addr(mac) || (vf >= adapter->vfs_allocated_count))
+
+	if (vf >= adapter->vfs_allocated_count)
+		return -EINVAL;
+
+	/* Setting the VF MAC to 0 reverts the IGB_VF_FLAG_PF_SET_MAC
+	 * flag and allows to overwrite the MAC via VF netdev.  This
+	 * is necessary to allow libvirt a way to restore the original
+	 * MAC after unbinding vfio-pci and reloading igbvf after shutting
+	 * down a VM.
+	 */
+	if (is_zero_ether_addr(mac)) {
+		adapter->vf_data[vf].flags &= ~IGB_VF_FLAG_PF_SET_MAC;
+		dev_info(&adapter->pdev->dev,
+			 "remove administratively set MAC on VF %d\n",
+			 vf);
+	} else if (is_valid_ether_addr(mac)) {
+		adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
+		dev_info(&adapter->pdev->dev, "setting MAC %pM on VF %d\n",
+			 mac, vf);
+		dev_info(&adapter->pdev->dev,
+			 "Reload the VF driver to make this change effective.");
+		/* Generate additional warning if PF is down */
+		if (test_bit(__IGB_DOWN, &adapter->state)) {
+			dev_warn(&adapter->pdev->dev,
+				 "The VF MAC address has been set, but the PF device is not up.\n");
+			dev_warn(&adapter->pdev->dev,
+				 "Bring the PF device up before attempting to use the VF device.\n");
+		}
+	} else {
 		return -EINVAL;
-	adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
-	dev_info(&adapter->pdev->dev, "setting MAC %pM on VF %d\n", mac, vf);
-	dev_info(&adapter->pdev->dev,
-		 "Reload the VF driver to make this change effective.");
-	if (test_bit(__IGB_DOWN, &adapter->state)) {
-		dev_warn(&adapter->pdev->dev,
-			 "The VF MAC address has been set, but the PF device is not up.\n");
-		dev_warn(&adapter->pdev->dev,
-			 "Bring the PF device up before attempting to use the VF device.\n");
 	}
 	return igb_set_vf_mac(adapter, vf, mac);
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 2/8] igb: add function to get maximum RSS queues
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
  2018-01-24 20:55 ` [net-next 1/8] igb: Allow to remove administratively set MAC on VFs Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 3/8] e1000e: Set HTHRESH when PTHRESH is used Jeff Kirsher
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Zhang Shengju, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Zhang Shengju <zhangshengju@cmss.chinamobile.com>

This patch adds a new function igb_get_max_rss_queues() to get maximum
RSS queues, this will reduce duplicate code and facilitate future
maintenance.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb.h         |  1 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 32 +---------------------------
 drivers/net/ethernet/intel/igb/igb_main.c    | 12 +++++++++--
 3 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 92845692087a..1c6b8d9176a8 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -690,6 +690,7 @@ void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va,
 int igb_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
 int igb_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
 void igb_set_flag_queue_pairs(struct igb_adapter *, const u32);
+unsigned int igb_get_max_rss_queues(struct igb_adapter *);
 #ifdef CONFIG_IGB_HWMON
 void igb_sysfs_exit(struct igb_adapter *adapter);
 int igb_sysfs_init(struct igb_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index d06a8db514d4..606e6761758f 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -3338,37 +3338,7 @@ static int igb_set_rxfh(struct net_device *netdev, const u32 *indir,
 
 static unsigned int igb_max_channels(struct igb_adapter *adapter)
 {
-	struct e1000_hw *hw = &adapter->hw;
-	unsigned int max_combined = 0;
-
-	switch (hw->mac.type) {
-	case e1000_i211:
-		max_combined = IGB_MAX_RX_QUEUES_I211;
-		break;
-	case e1000_82575:
-	case e1000_i210:
-		max_combined = IGB_MAX_RX_QUEUES_82575;
-		break;
-	case e1000_i350:
-		if (!!adapter->vfs_allocated_count) {
-			max_combined = 1;
-			break;
-		}
-		/* fall through */
-	case e1000_82576:
-		if (!!adapter->vfs_allocated_count) {
-			max_combined = 2;
-			break;
-		}
-		/* fall through */
-	case e1000_82580:
-	case e1000_i354:
-	default:
-		max_combined = IGB_MAX_RX_QUEUES;
-		break;
-	}
-
-	return max_combined;
+	return igb_get_max_rss_queues(adapter);
 }
 
 static void igb_get_channels(struct net_device *netdev,
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 8504900bf7b8..21a34e9e2645 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3373,10 +3373,10 @@ static void igb_probe_vfs(struct igb_adapter *adapter)
 #endif /* CONFIG_PCI_IOV */
 }
 
-static void igb_init_queue_configuration(struct igb_adapter *adapter)
+unsigned int igb_get_max_rss_queues(struct igb_adapter *adapter)
 {
 	struct e1000_hw *hw = &adapter->hw;
-	u32 max_rss_queues;
+	unsigned int max_rss_queues;
 
 	/* Determine the maximum number of RSS queues supported. */
 	switch (hw->mac.type) {
@@ -3407,6 +3407,14 @@ static void igb_init_queue_configuration(struct igb_adapter *adapter)
 		break;
 	}
 
+	return max_rss_queues;
+}
+
+static void igb_init_queue_configuration(struct igb_adapter *adapter)
+{
+	u32 max_rss_queues;
+
+	max_rss_queues = igb_get_max_rss_queues(adapter);
 	adapter->rss_queues = min_t(u32, max_rss_queues, num_online_cpus());
 
 	igb_set_flag_queue_pairs(adapter, max_rss_queues);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 3/8] e1000e: Set HTHRESH when PTHRESH is used
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
  2018-01-24 20:55 ` [net-next 1/8] igb: Allow to remove administratively set MAC on VFs Jeff Kirsher
  2018-01-24 20:55 ` [net-next 2/8] igb: add function to get maximum RSS queues Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 4/8] igb: Clarify idleslope config constraints Jeff Kirsher
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Matt Turner, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Matt Turner <matt.turner@intel.com>

According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL"
of the Intel® 82579 Gigabit Ethernet PHY Datasheet v2.1:

    "HTHRESH should be given a non zero value when ever PTHRESH is
     used."

In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHREST lives at bits 13:8.
Set only bit 8 of HTHREST as is done in e1000_flush_rx_ring(). Found by
inspection.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9f18d39bdc8f..5af93ce5acc6 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3303,7 +3303,7 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 		if (adapter->flags & FLAG_IS_ICH) {
 			u32 rxdctl = er32(RXDCTL(0));
 
-			ew32(RXDCTL(0), rxdctl | 0x3);
+			ew32(RXDCTL(0), rxdctl | 0x3 | BIT(8));
 		}
 
 		pm_qos_update_request(&adapter->pm_qos_req, lat);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 4/8] igb: Clarify idleslope config constraints
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 3/8] e1000e: Set HTHRESH when PTHRESH is used Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 5/8] e1000e: Alert the user that C-states will be disabled by enabling jumbo frames Jeff Kirsher
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem
  Cc: Jesus Sanchez-Palencia, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher

From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>

By design, the idleslope increments are restricted to 16.384kbps steps.
Add a comment to igb_main.c making that explicit and add one example
that illustrates the impact of that.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 21a34e9e2645..78fe10819c3a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1744,6 +1744,20 @@ static void igb_configure_cbs(struct igb_adapter *adapter, int queue,
 		 *     value = idleSlope * 61034
 		 *             -----------------                          (E6)
 		 *                  1000000
+		 *
+		 * NOTE: For i210, given the above, we can see that idleslope
+		 *       is represented in 16.38431 kbps units by the value at
+		 *       the TQAVCC register (1Gbps / 61034), which reduces
+		 *       the granularity for idleslope increments.
+		 *       For instance, if you want to configure a 2576kbps
+		 *       idleslope, the value to be written on the register
+		 *       would have to be 157.23. If rounded down, you end
+		 *       up with less bandwidth available than originally
+		 *       required (~2572 kbps). If rounded up, you end up
+		 *       with a higher bandwidth (~2589 kbps). Below the
+		 *       approach we take is to always round up the
+		 *       calculated value, so the resulting bandwidth might
+		 *       be slightly higher for some configurations.
 		 */
 		value = DIV_ROUND_UP_ULL(idleslope * 61034ULL, 1000000);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 5/8] e1000e: Alert the user that C-states will be disabled by enabling jumbo frames
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 4/8] igb: Clarify idleslope config constraints Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 6/8] igb: Free IRQs when device is hotplugged Jeff Kirsher
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Matt Turner, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Matt Turner <matt.turner@intel.com>

I personally spent a long time trying to decypher why my CPU would not
reach deeper C-states. Let's just tell the next user what's going on.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 5af93ce5acc6..1298b69f990b 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3306,6 +3306,8 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 			ew32(RXDCTL(0), rxdctl | 0x3 | BIT(8));
 		}
 
+		dev_info(&adapter->pdev->dev,
+			 "Some CPU C-states have been disabled in order to enable jumbo frames\n");
 		pm_qos_update_request(&adapter->pm_qos_req, lat);
 	} else {
 		pm_qos_update_request(&adapter->pm_qos_req,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 6/8] igb: Free IRQs when device is hotplugged
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 5/8] e1000e: Alert the user that C-states will be disabled by enabling jumbo frames Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 7/8] igb: Delete an error message for a failed memory allocation in igb_enable_sriov() Jeff Kirsher
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem
  Cc: Lyude Paul, netdev, nhorman, sassmann, jogreene, Todd Fujinaka,
	Stephen Hemminger, stable, Jeff Kirsher

From: Lyude Paul <lyude@redhat.com>

Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon
hotplugging my kernel would immediately crash due to igb:

[  680.825801] kernel BUG at drivers/pci/msi.c:352!
[  680.828388] invalid opcode: 0000 [#1] SMP
[  680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb]
[  680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G           O     4.15.0-rc3Lyude-Test+ #6
[  680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017
[  680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0
[  680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286
[  680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c
[  680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178
[  680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00
[  680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298
[  680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000
[  680.836332] FS:  0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000
[  680.836817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0
[  680.837954] Call Trace:
[  680.838853]  pci_disable_msix+0xce/0xf0
[  680.839616]  igb_reset_interrupt_capability+0x5d/0x60 [igb]
[  680.840278]  igb_remove+0x9d/0x110 [igb]
[  680.840764]  pci_device_remove+0x36/0xb0
[  680.841279]  device_release_driver_internal+0x157/0x220
[  680.841739]  pci_stop_bus_device+0x7d/0xa0
[  680.842255]  pci_stop_bus_device+0x2b/0xa0
[  680.842722]  pci_stop_bus_device+0x3d/0xa0
[  680.843189]  pci_stop_and_remove_bus_device+0xe/0x20
[  680.843627]  trim_stale_devices+0xf3/0x140
[  680.844086]  trim_stale_devices+0x94/0x140
[  680.844532]  trim_stale_devices+0xa6/0x140
[  680.845031]  ? get_slot_status+0x90/0xc0
[  680.845536]  acpiphp_check_bridge.part.5+0xfe/0x140
[  680.846021]  acpiphp_hotplug_notify+0x175/0x200
[  680.846581]  ? free_bridge+0x100/0x100
[  680.847113]  acpi_device_hotplug+0x8a/0x490
[  680.847535]  acpi_hotplug_work_fn+0x1a/0x30
[  680.848076]  process_one_work+0x182/0x3a0
[  680.848543]  worker_thread+0x2e/0x380
[  680.848963]  ? process_one_work+0x3a0/0x3a0
[  680.849373]  kthread+0x111/0x130
[  680.849776]  ? kthread_create_worker_on_cpu+0x50/0x50
[  680.850188]  ret_from_fork+0x1f/0x30
[  680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b
[  680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0

As it turns out, normally the freeing of IRQs that would fix this is called
inside of the scope of __igb_close(). However, since the device is
already gone by the point we try to unregister the netdevice from the
driver due to a hotplug we end up seeing that the netif isn't present
and thus, forget to free any of the device IRQs.

So: make sure that if we're in the process of dismantling the netdev, we
always allow __igb_close() to be called so that IRQs may be freed
normally. Additionally, only allow igb_close() to be called from
__igb_close() if it hasn't already been called for the given adapter.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")
Cc: Todd Fujinaka <todd.fujinaka@intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: stable@vger.kernel.org
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 78fe10819c3a..ac83de48a4d3 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3698,7 +3698,7 @@ static int __igb_close(struct net_device *netdev, bool suspending)
 
 int igb_close(struct net_device *netdev)
 {
-	if (netif_device_present(netdev))
+	if (netif_device_present(netdev) || netdev->dismantle)
 		return __igb_close(netdev, false);
 	return 0;
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 7/8] igb: Delete an error message for a failed memory allocation in igb_enable_sriov()
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 6/8] igb: Free IRQs when device is hotplugged Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 20:55 ` [net-next 8/8] igb: Clear TXSTMP when ptp_tx_work() is timeout Jeff Kirsher
  2018-01-24 21:10 ` [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 David Miller
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Markus Elfring, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Markus Elfring <elfring@users.sourceforge.net>

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ac83de48a4d3..b88fae785369 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3214,8 +3214,6 @@ static int igb_enable_sriov(struct pci_dev *pdev, int num_vfs)
 	/* if allocation failed then we do not support SR-IOV */
 	if (!adapter->vf_data) {
 		adapter->vfs_allocated_count = 0;
-		dev_err(&pdev->dev,
-			"Unable to allocate memory for VF Data Storage\n");
 		err = -ENOMEM;
 		goto out;
 	}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [net-next 8/8] igb: Clear TXSTMP when ptp_tx_work() is timeout
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (6 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 7/8] igb: Delete an error message for a failed memory allocation in igb_enable_sriov() Jeff Kirsher
@ 2018-01-24 20:55 ` Jeff Kirsher
  2018-01-24 21:10 ` [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 David Miller
  8 siblings, 0 replies; 12+ messages in thread
From: Jeff Kirsher @ 2018-01-24 20:55 UTC (permalink / raw)
  To: davem; +Cc: Daniel Hua, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Daniel Hua <daniel.hua@ni.com>

Problem description:
After ethernet cable connect and disconnect for several iterations on a
device with i210, tx timestamp will stop being put into the socket.

Steps to reproduce:
1. Setup a device with i210 and wire it to a 802.1AS capable switch (
Extreme Networks Summit x440 is used in our case)
2. Have the gptp daemon running on the device and make sure it is synced
with the switch
3. Have the switch disable and enable the port, wait for the device gets
resynced with the switch
4. Iterates step 3 until the device is not albe to get resynced
5. Review the log in dmesg and you will see warning message "igb : clearing
Tx timestamp hang"

Root cause:
If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK
DOWN event will be processed before ptp_tx_work(), which may cause timeout
in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit
timestamp valid bit) is not cleared, causing no new timestamp loaded to
TXSTMP register. Consequently therefore, no new interrupt is triggerred by
TSICR.TXTS bit and no more Tx timestamp send to the socket.

Signed-off-by: Daniel Hua <daniel.hua@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_ptp.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 841c2a083349..0746b19ec6d3 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -643,6 +643,10 @@ static void igb_ptp_tx_work(struct work_struct *work)
 		adapter->ptp_tx_skb = NULL;
 		clear_bit_unlock(__IGB_PTP_TX_IN_PROGRESS, &adapter->state);
 		adapter->tx_hwtstamp_timeouts++;
+		/* Clear the tx valid bit in TSYNCTXCTL register to enable
+		 * interrupt
+		 */
+		rd32(E1000_TXSTMPH);
 		dev_warn(&adapter->pdev->dev, "clearing Tx timestamp hang\n");
 		return;
 	}
@@ -717,6 +721,7 @@ void igb_ptp_rx_hang(struct igb_adapter *adapter)
  */
 void igb_ptp_tx_hang(struct igb_adapter *adapter)
 {
+	struct e1000_hw *hw = &adapter->hw;
 	bool timeout = time_is_before_jiffies(adapter->ptp_tx_start +
 					      IGB_PTP_TX_TIMEOUT);
 
@@ -736,6 +741,10 @@ void igb_ptp_tx_hang(struct igb_adapter *adapter)
 		adapter->ptp_tx_skb = NULL;
 		clear_bit_unlock(__IGB_PTP_TX_IN_PROGRESS, &adapter->state);
 		adapter->tx_hwtstamp_timeouts++;
+		/* Clear the tx valid bit in TSYNCTXCTL register to enable
+		 * interrupt
+		 */
+		rd32(E1000_TXSTMPH);
 		dev_warn(&adapter->pdev->dev, "clearing Tx timestamp hang\n");
 	}
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24
  2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
                   ` (7 preceding siblings ...)
  2018-01-24 20:55 ` [net-next 8/8] igb: Clear TXSTMP when ptp_tx_work() is timeout Jeff Kirsher
@ 2018-01-24 21:10 ` David Miller
  2018-01-25  2:29   ` Alexander Duyck
  8 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2018-01-24 21:10 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 24 Jan 2018 12:55:12 -0800

> This series contains updates to igb and e1000e only.

Pulled, however:

> Corinna Vinschen implements the ability to set the VF MAC to
> 00:00:00:00:00:00 via RTM_SETLINK on the PF, to prevent receiving
> "invlaid argument" when libvirt attempts to restore the MAC address back
> to its original state of 00:00:00:00:00:00.

This is really a mess and the wrong way to go about this.

No interface, even a VF, should come up or ever have an invalid
MAC addres like all-zeros.  That's the fundamental problem and
once you fix that all of this other crazy logic and workarounds
no longer become necessary.

Whatever it takes, just do it.  We can even come up with a global
MAC address range that on a Linux system is reserved for VFs to
come up with.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24
  2018-01-24 21:10 ` [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 David Miller
@ 2018-01-25  2:29   ` Alexander Duyck
  2018-01-29 16:46     ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Duyck @ 2018-01-25  2:29 UTC (permalink / raw)
  To: David Miller; +Cc: Jeff Kirsher, Netdev, Neil Horman, sassmann, John Greene

On Wed, Jan 24, 2018 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Wed, 24 Jan 2018 12:55:12 -0800
>
>> This series contains updates to igb and e1000e only.
>
> Pulled, however:
>
>> Corinna Vinschen implements the ability to set the VF MAC to
>> 00:00:00:00:00:00 via RTM_SETLINK on the PF, to prevent receiving
>> "invlaid argument" when libvirt attempts to restore the MAC address back
>> to its original state of 00:00:00:00:00:00.
>
> This is really a mess and the wrong way to go about this.
>
> No interface, even a VF, should come up or ever have an invalid
> MAC addres like all-zeros.  That's the fundamental problem and
> once you fix that all of this other crazy logic and workarounds
> no longer become necessary.

In the case of igbvf the VFs never come up with 0s in their MAC
address. An all 0's MAC address basically leaves it open to VF's
choice for assigning themselves a MAC address, or at least that is the
way I recall coding it back in the day.

There are a few issues with making changes to this at this point. The
first being that this concept is pretty much baked into the VF driver
logic for most drivers supporting legacy SR-IOV, and as pointed out in
the patch comments the libvirt interface is writing 0's to disable the
VF MAC address when it is not in use. At this point we cannot change
this without breaking the libvirt userspace.

One of the motivations for clearing this is to avoid having the PF
misdirect traffic as having a MAC address mapped to a
disabled/unassigned VF could result in traffic being dropped when it
should be directed elsewhere such as a bridge on the PF, or out to
some other PF that is now running the VM there.

> Whatever it takes, just do it.  We can even come up with a global
> MAC address range that on a Linux system is reserved for VFs to
> come up with.

That is normally how the VFs handle this on their side. The code was
setup such that if the PF provided an all 0's MAC address then the VF
would assign itself a locally administered address so that it wouldn't
come up with an address of 0s. If you are saying the VFs shouldn't be
allowed to come up with an all 0's MAC address I believe that none of
them do. I believe they either fail to come up at all or report a
locally administered address for themselves. I can double check that
though (at least for Intel) to verify that it is in fact a consistent
behavior. In theory there isn't likely to be a VF bound to the
interface anyway, usually when the MAC address is invalidated it is
because a VM has been terminated and the VF driver is just in limbo
since it is usually assigned to a VFIO interface which doesn't
actually expose the network interface to the kernel.

I suppose we could look at pushing the LAA generation up into the PF,
but we would still want to maintain the all 0's address while the VF
is inactive since we need to clear the stale VF addresses from the MAC
address table in the event of a VM being relocated to a different
server and taking the MAC address with it.

The good news to all this is that this is going to be fading out and
going away anyway as SwitchDev takes over for SR-IOV.

> Thanks.

I'll double check our VF drivers and make sure none of them are
exposing a netdevice with an all 0's MAC address, and see what we can
do about relocating the locally administered address generation into
the PF.

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24
  2018-01-25  2:29   ` Alexander Duyck
@ 2018-01-29 16:46     ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2018-01-29 16:46 UTC (permalink / raw)
  To: alexander.duyck; +Cc: jeffrey.t.kirsher, netdev, nhorman, sassmann, jogreene

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Wed, 24 Jan 2018 18:29:42 -0800

> I'll double check our VF drivers and make sure none of them are
> exposing a netdevice with an all 0's MAC address, and see what we can
> do about relocating the locally administered address generation into
> the PF.

If such netdevs are never exposed with all 0's MACs, then I'm happy.

Thanks for the explanation.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-01-29 16:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-24 20:55 [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 Jeff Kirsher
2018-01-24 20:55 ` [net-next 1/8] igb: Allow to remove administratively set MAC on VFs Jeff Kirsher
2018-01-24 20:55 ` [net-next 2/8] igb: add function to get maximum RSS queues Jeff Kirsher
2018-01-24 20:55 ` [net-next 3/8] e1000e: Set HTHRESH when PTHRESH is used Jeff Kirsher
2018-01-24 20:55 ` [net-next 4/8] igb: Clarify idleslope config constraints Jeff Kirsher
2018-01-24 20:55 ` [net-next 5/8] e1000e: Alert the user that C-states will be disabled by enabling jumbo frames Jeff Kirsher
2018-01-24 20:55 ` [net-next 6/8] igb: Free IRQs when device is hotplugged Jeff Kirsher
2018-01-24 20:55 ` [net-next 7/8] igb: Delete an error message for a failed memory allocation in igb_enable_sriov() Jeff Kirsher
2018-01-24 20:55 ` [net-next 8/8] igb: Clear TXSTMP when ptp_tx_work() is timeout Jeff Kirsher
2018-01-24 21:10 ` [net-next 0/8][pull request] 1GbE Intel Wired LAN Driver Updates 2018-01-24 David Miller
2018-01-25  2:29   ` Alexander Duyck
2018-01-29 16:46     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).