All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
@ 2014-09-18 22:35 Alexander Duyck
  2014-09-18 22:35 ` [net-next PATCH 01/29] fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver Alexander Duyck
                   ` (31 more replies)
  0 siblings, 32 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:35 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch series adds support for the FM10000 Ethernet switch host
interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
driver provides support for the host interface portion of the switch, both
PF and VF.

As the host interfaces are directly connected to the switch this results in
some significant differences versus a standard network driver.  For example
there is no PHY or MII on the device.  Since packets are delivered directly
from the switch to the host interface these are unnecessary.  Otherwise most
of the functionality is very similar to our other network drivers such as
ixgbe or igb.  For example we support all the standard network offloads,
jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.

---

Alexander Duyck (29):
      fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver
      fm10k: Add register defines and basic structures
      fm10k: Add support for TLV message parsing and generation
      fm10k: Add support for basic interaction with hardware
      fm10k: Add support for mailbox
      fm10k: Implement PF <-> SM mailbox operations
      fm10k: Add support for PF
      fm10k: Add support for configuring PF interface
      fm10k: Add netdev
      fm10k: Add support for L2 filtering
      fm10k: Add support for ndo_open/stop
      fm10k: Add interrupt support
      fm10k: add support for Tx/Rx rings
      fm10k: Add service task to handle delayed events
      fm10k: Add Tx/Rx hardware ring bring-up/tear-down
      fm10k: Add transmit and receive fastpath and interrupt handlers
      fm10k: Add ethtool support
      fm10k: Add support for PCI power management and error handling
      fm10k: Add support for multiple queues
      fm10k: Add support for netdev offloads
      fm10k: Add support for MACVLAN acceleration
      fm10k: Add support for PF <-> VF mailbox
      fm10k: Add support for VF
      fm10k: Add support for SR-IOV to PF core files
      fm10k: Add support for SR-IOV to driver
      fm10k: Add support for IEEE DCBx
      fm10k: Add support for debugfs
      fm10k: Add support for ptp to hw specific files
      fm10k: Add support for PTP


 drivers/net/ethernet/intel/Kconfig               |   19 
 drivers/net/ethernet/intel/Makefile              |    1 
 drivers/net/ethernet/intel/fm10k/Makefile        |   33 
 drivers/net/ethernet/intel/fm10k/fm10k.h         |  532 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_common.c  |  534 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_common.h  |   65 +
 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c   |  174 ++
 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c |  259 +++
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 1069 +++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c     |  536 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    | 1978 ++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c     | 2125 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h     |  307 +++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 1424 ++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     | 2166 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c      | 1849 +++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h      |  135 +
 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c     |  535 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c     |  863 +++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h     |  186 ++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h    |  769 ++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_vf.c      |  552 ++++++
 drivers/net/ethernet/intel/fm10k/fm10k_vf.h      |   78 +
 23 files changed, 16189 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/fm10k/Makefile
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_iov.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_main.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pci.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_type.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.h

-- 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [net-next PATCH 01/29] fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
@ 2014-09-18 22:35 ` Alexander Duyck
  2014-09-18 22:35 ` [net-next PATCH 02/29] fm10k: Add register defines and basic structures Alexander Duyck
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:35 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds the beginning framework onto which I am going to add the
fm10k driver which supports the Intel(R) FM10000 Ethernet Switch Host
Interface.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/Kconfig            |   19 +++
 drivers/net/ethernet/intel/Makefile           |    1 
 drivers/net/ethernet/intel/fm10k/Makefile     |   30 +++++
 drivers/net/ethernet/intel/fm10k/fm10k.h      |   39 +++++++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |   67 ++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c  |  145 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h |   27 +++++
 7 files changed, 328 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/fm10k/Makefile
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k.h
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_main.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pci.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_type.h

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index bb9f0ba..6a6d5ee 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -300,4 +300,23 @@ config I40EVF
 	  will be called i40evf.  MSI-X interrupt support is required
 	  for this driver to work correctly.
 
+config FM10K
+	tristate "Intel(R) FM10000 Ethernet Switch Host Interface Support"
+	default n
+	depends on PCI_MSI
+	---help---
+	  This driver supports Intel(R) FM10000 Ethernet Switch Host
+	  Interface.  For more information on how to identify your adapter,
+	  go to the Adapter & Driver ID Guide at:
+
+	  <http://support.intel.com/support/network/sb/CS-008441.htm>
+
+	  For general information and support, go to the Intel support
+	  website at:
+
+	  <http://support.intel.com>
+
+	  To compile this driver as a module, choose M here. The module
+	  will be called fm10k.  MSI-X interrupt support is required
+
 endif # NET_VENDOR_INTEL
diff --git a/drivers/net/ethernet/intel/Makefile b/drivers/net/ethernet/intel/Makefile
index cdbbca8..5ea764d 100644
--- a/drivers/net/ethernet/intel/Makefile
+++ b/drivers/net/ethernet/intel/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_IXGBEVF) += ixgbevf/
 obj-$(CONFIG_I40E) += i40e/
 obj-$(CONFIG_IXGB) += ixgb/
 obj-$(CONFIG_I40EVF) += i40evf/
+obj-$(CONFIG_FM10K) += fm10k/
diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
new file mode 100644
index 0000000..ddc0eb7
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -0,0 +1,30 @@
+################################################################################
+#
+# Intel Ethernet Switch Host Interface Driver
+# Copyright(c) 2013 - 2014 Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+# Contact Information:
+# e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+# Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+#
+################################################################################
+
+#
+# Makefile for the Intel(R) FM10000 Ethernet Switch Host Interface driver
+#
+
+obj-$(CONFIG_FM10K) += fm10k.o
+
+fm10k-objs := fm10k_main.o fm10k_pci.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
new file mode 100644
index 0000000..2e27bd9
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -0,0 +1,39 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_H_
+#define _FM10K_H_
+
+#include <linux/types.h>
+#include <linux/etherdevice.h>
+#include <linux/rtnetlink.h>
+#include <linux/if_vlan.h>
+#include <linux/pci.h>
+
+#include "fm10k_type.h"
+
+/* main */
+extern char fm10k_driver_name[];
+extern const char fm10k_driver_version[];
+
+/* PCI */
+int fm10k_register_pci_driver(void);
+void fm10k_unregister_pci_driver(void);
+#endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
new file mode 100644
index 0000000..a7822b8
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -0,0 +1,67 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include <linux/types.h>
+#include <linux/module.h>
+#include <net/ipv6.h>
+#include <net/ip.h>
+#include <net/tcp.h>
+#include <linux/if_macvlan.h>
+
+#include "fm10k.h"
+
+#define DRV_VERSION	"0.12.2-k"
+const char fm10k_driver_version[] = DRV_VERSION;
+char fm10k_driver_name[] = "fm10k";
+static const char fm10k_driver_string[] =
+	"Intel(R) Ethernet Switch Host Interface Driver";
+static const char fm10k_copyright[] =
+	"Copyright (c) 2013 Intel Corporation.";
+
+MODULE_AUTHOR("Intel Corporation, <linux.nics@intel.com>");
+MODULE_DESCRIPTION("Intel(R) Ethernet Switch Host Interface Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_VERSION);
+
+/** fm10k_init_module - Driver Registration Routine
+ *
+ * fm10k_init_module is the first routine called when the driver is
+ * loaded.  All it does is register with the PCI subsystem.
+ **/
+static int __init fm10k_init_module(void)
+{
+	pr_info("%s - version %s\n", fm10k_driver_string, fm10k_driver_version);
+	pr_info("%s\n", fm10k_copyright);
+
+	return fm10k_register_pci_driver();
+}
+module_init(fm10k_init_module);
+
+/**
+ * fm10k_exit_module - Driver Exit Cleanup Routine
+ *
+ * fm10k_exit_module is called just before the driver is removed
+ * from memory.
+ **/
+static void __exit fm10k_exit_module(void)
+{
+	fm10k_unregister_pci_driver();
+}
+module_exit(fm10k_exit_module);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
new file mode 100644
index 0000000..12089ce
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -0,0 +1,145 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include <linux/module.h>
+
+#include "fm10k.h"
+
+/**
+ * fm10k_pci_tbl - PCI Device ID Table
+ *
+ * Wildcard entries (PCI_ANY_ID) should come last
+ * Last entry must be all 0s
+ *
+ * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
+ *   Class, Class Mask, private data (not used) }
+ */
+static const struct pci_device_id fm10k_pci_tbl[] = {
+	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF) },
+	/* required last entry */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, fm10k_pci_tbl);
+
+/**
+ * fm10k_probe - Device Initialization Routine
+ * @pdev: PCI device information struct
+ * @ent: entry in fm10k_pci_tbl
+ *
+ * Returns 0 on success, negative on failure
+ *
+ * fm10k_probe initializes an interface identified by a pci_dev structure.
+ * The OS initialization, configuring of the interface private structure,
+ * and a hardware reset occur.
+ **/
+static int fm10k_probe(struct pci_dev *pdev,
+		       const struct pci_device_id *ent)
+{
+	int err;
+	u64 dma_mask;
+
+	err = pci_enable_device_mem(pdev);
+	if (err)
+		return err;
+
+	/* By default fm10k only supports a 48 bit DMA mask */
+	dma_mask = DMA_BIT_MASK(48) | dma_get_required_mask(&pdev->dev);
+
+	if ((dma_mask <= DMA_BIT_MASK(32)) ||
+	    dma_set_mask_and_coherent(&pdev->dev, dma_mask)) {
+		dma_mask &= DMA_BIT_MASK(32);
+
+		err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+		err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
+		if (err) {
+			err = dma_set_coherent_mask(&pdev->dev,
+						    DMA_BIT_MASK(32));
+			if (err) {
+				dev_err(&pdev->dev,
+					"No usable DMA configuration, aborting\n");
+				goto err_dma;
+			}
+		}
+	}
+
+	err = pci_request_selected_regions(pdev,
+					   pci_select_bars(pdev,
+							   IORESOURCE_MEM),
+					   fm10k_driver_name);
+	if (err) {
+		dev_err(&pdev->dev,
+			"pci_request_selected_regions failed 0x%x\n", err);
+		goto err_pci_reg;
+	}
+
+	pci_set_master(pdev);
+	pci_save_state(pdev);
+
+	return 0;
+
+err_pci_reg:
+err_dma:
+	pci_disable_device(pdev);
+	return err;
+}
+
+/**
+ * fm10k_remove - Device Removal Routine
+ * @pdev: PCI device information struct
+ *
+ * fm10k_remove is called by the PCI subsystem to alert the driver
+ * that it should release a PCI device.  The could be caused by a
+ * Hot-Plug event, or because the driver is going to be removed from
+ * memory.
+ **/
+static void fm10k_remove(struct pci_dev *pdev)
+{
+	pci_release_selected_regions(pdev,
+				     pci_select_bars(pdev, IORESOURCE_MEM));
+
+	pci_disable_device(pdev);
+}
+
+static struct pci_driver fm10k_driver = {
+	.name			= fm10k_driver_name,
+	.id_table		= fm10k_pci_tbl,
+	.probe			= fm10k_probe,
+	.remove			= fm10k_remove,
+};
+
+/**
+ * fm10k_register_pci_driver - register driver interface
+ *
+ * This funciton is called on module load in order to register the driver.
+ **/
+int fm10k_register_pci_driver(void)
+{
+	return pci_register_driver(&fm10k_driver);
+}
+
+/**
+ * fm10k_unregister_pci_driver - unregister driver interface
+ *
+ * This funciton is called on module unload in order to remove the driver.
+ **/
+void fm10k_unregister_pci_driver(void)
+{
+	pci_unregister_driver(&fm10k_driver);
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
new file mode 100644
index 0000000..2c06f02
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -0,0 +1,27 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_TYPE_H_
+#define _FM10K_TYPE_H_
+
+#define FM10K_DEV_ID_PF			0x15A4
+#define FM10K_DEV_ID_VF			0x15A5
+
+#endif /* _FM10K_TYPE_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 02/29] fm10k: Add register defines and basic structures
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
  2014-09-18 22:35 ` [net-next PATCH 01/29] fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver Alexander Duyck
@ 2014-09-18 22:35 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 03/29] fm10k: Add support for TLV message parsing and generation Alexander Duyck
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:35 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds the basic defines and structures needed by the PF for
operation.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |    3 
 drivers/net/ethernet/intel/fm10k/fm10k_type.h |  659 +++++++++++++++++++++++++
 2 files changed, 661 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index a7822b8..6ca0614 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -40,7 +40,8 @@ MODULE_DESCRIPTION("Intel(R) Ethernet Switch Host Interface Driver");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(DRV_VERSION);
 
-/** fm10k_init_module - Driver Registration Routine
+/**
+ * fm10k_init_module - Driver Registration Routine
  *
  * fm10k_init_module is the first routine called when the driver is
  * loaded.  All it does is register with the PCI subsystem.
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index 2c06f02..1606805 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -21,7 +21,666 @@
 #ifndef _FM10K_TYPE_H_
 #define _FM10K_TYPE_H_
 
+/* forward declaration */
+struct fm10k_hw;
+
+#include <linux/types.h>
+#include <asm/byteorder.h>
+#include <linux/etherdevice.h>
+
 #define FM10K_DEV_ID_PF			0x15A4
 #define FM10K_DEV_ID_VF			0x15A5
 
+#define FM10K_MAX_QUEUES		256
+#define FM10K_MAX_QUEUES_PF		128
+#define FM10K_MAX_QUEUES_POOL		16
+
+#define FM10K_48_BIT_MASK		0x0000FFFFFFFFFFFFull
+#define FM10K_STAT_VALID		0x80000000
+
+/* PCI Bus Info */
+#define FM10K_PCIE_LINK_CAP		0x7C
+#define FM10K_PCIE_LINK_STATUS		0x82
+#define FM10K_PCIE_LINK_WIDTH		0x3F0
+#define FM10K_PCIE_LINK_WIDTH_1		0x10
+#define FM10K_PCIE_LINK_WIDTH_2		0x20
+#define FM10K_PCIE_LINK_WIDTH_4		0x40
+#define FM10K_PCIE_LINK_WIDTH_8		0x80
+#define FM10K_PCIE_LINK_SPEED		0xF
+#define FM10K_PCIE_LINK_SPEED_2500	0x1
+#define FM10K_PCIE_LINK_SPEED_5000	0x2
+#define FM10K_PCIE_LINK_SPEED_8000	0x3
+
+/* PCIe payload size */
+#define FM10K_PCIE_DEV_CAP			0x74
+#define FM10K_PCIE_DEV_CAP_PAYLOAD		0x07
+#define FM10K_PCIE_DEV_CAP_PAYLOAD_128		0x00
+#define FM10K_PCIE_DEV_CAP_PAYLOAD_256		0x01
+#define FM10K_PCIE_DEV_CAP_PAYLOAD_512		0x02
+#define FM10K_PCIE_DEV_CTRL			0x78
+#define FM10K_PCIE_DEV_CTRL_PAYLOAD		0xE0
+#define FM10K_PCIE_DEV_CTRL_PAYLOAD_128		0x00
+#define FM10K_PCIE_DEV_CTRL_PAYLOAD_256		0x20
+#define FM10K_PCIE_DEV_CTRL_PAYLOAD_512		0x40
+
+/* PCIe MSI-X Capability info */
+#define FM10K_PCI_MSIX_MSG_CTRL			0xB2
+#define FM10K_PCI_MSIX_MSG_CTRL_TBL_SZ_MASK	0x7FF
+#define FM10K_MAX_MSIX_VECTORS			256
+#define FM10K_MAX_VECTORS_PF			256
+#define FM10K_MAX_VECTORS_POOL			32
+
+/* PCIe SR-IOV Info */
+#define FM10K_PCIE_SRIOV_CTRL			0x190
+#define FM10K_PCIE_SRIOV_CTRL_VFARI		0x10
+
+#define FM10K_ERR_PARAM				-2
+#define FM10K_ERR_REQUESTS_PENDING		-4
+#define FM10K_ERR_RESET_REQUESTED		-5
+#define FM10K_ERR_DMA_PENDING			-6
+#define FM10K_ERR_RESET_FAILED			-7
+#define FM10K_ERR_INVALID_MAC_ADDR		-8
+#define FM10K_ERR_INVALID_VALUE			-9
+#define FM10K_NOT_IMPLEMENTED			0x7FFFFFFF
+
+/* Start of PF registers */
+#define FM10K_CTRL		0x0000
+#define FM10K_CTRL_BAR4_ALLOWED			0x00000004
+
+#define FM10K_CTRL_EXT		0x0001
+#define FM10K_GCR		0x0003
+#define FM10K_GCR_EXT		0x0005
+
+/* Interrupt control registers */
+#define FM10K_EICR		0x0006
+#define FM10K_EICR_FAULT_MASK			0x0000003F
+#define FM10K_EICR_MAILBOX			0x00000040
+#define FM10K_EICR_SWITCHREADY			0x00000080
+#define FM10K_EICR_SWITCHNOTREADY		0x00000100
+#define FM10K_EICR_SWITCHINTERRUPT		0x00000200
+#define FM10K_EICR_VFLR				0x00000800
+#define FM10K_EICR_MAXHOLDTIME			0x00001000
+#define FM10K_EIMR		0x0007
+#define FM10K_EIMR_PCA_FAULT			0x00000001
+#define FM10K_EIMR_THI_FAULT			0x00000010
+#define FM10K_EIMR_FUM_FAULT			0x00000400
+#define FM10K_EIMR_MAILBOX			0x00001000
+#define FM10K_EIMR_SWITCHREADY			0x00004000
+#define FM10K_EIMR_SWITCHNOTREADY		0x00010000
+#define FM10K_EIMR_SWITCHINTERRUPT		0x00040000
+#define FM10K_EIMR_SRAMERROR			0x00100000
+#define FM10K_EIMR_VFLR				0x00400000
+#define FM10K_EIMR_MAXHOLDTIME			0x01000000
+#define FM10K_EIMR_ALL				0x55555555
+#define FM10K_EIMR_DISABLE(NAME)		((FM10K_EIMR_ ## NAME) << 0)
+#define FM10K_EIMR_ENABLE(NAME)			((FM10K_EIMR_ ## NAME) << 1)
+#define FM10K_FAULT_ADDR_LO		0x0
+#define FM10K_FAULT_ADDR_HI		0x1
+#define FM10K_FAULT_SPECINFO		0x2
+#define FM10K_FAULT_FUNC		0x3
+#define FM10K_FAULT_SIZE		0x4
+#define FM10K_FAULT_FUNC_VALID			0x00008000
+#define FM10K_FAULT_FUNC_PF			0x00004000
+#define FM10K_FAULT_FUNC_VF_MASK		0x00003F00
+#define FM10K_FAULT_FUNC_VF_SHIFT		8
+#define FM10K_FAULT_FUNC_TYPE_MASK		0x000000FF
+
+#define FM10K_PCA_FAULT		0x0008
+#define FM10K_THI_FAULT		0x0010
+#define FM10K_FUM_FAULT		0x001C
+
+/* Rx queue timeout indicator */
+#define FM10K_MAXHOLDQ(_n)	((_n) + 0x0020)
+
+/* Switch Manager info */
+#define FM10K_SM_AREA(_n)	((_n) + 0x0028)
+
+/* GLORT mapping registers */
+#define FM10K_DGLORTMAP(_n)	((_n) + 0x0030)
+#define FM10K_DGLORT_COUNT			8
+#define FM10K_DGLORTMAP_MASK_SHIFT		16
+#define FM10K_DGLORTMAP_ANY			0x00000000
+#define FM10K_DGLORTMAP_NONE			0x0000FFFF
+#define FM10K_DGLORTMAP_ZERO			0xFFFF0000
+#define FM10K_DGLORTDEC(_n)	((_n) + 0x0038)
+#define FM10K_DGLORTDEC_VSILENGTH_SHIFT		4
+#define FM10K_DGLORTDEC_VSIBASE_SHIFT		7
+#define FM10K_DGLORTDEC_PCLENGTH_SHIFT		14
+#define FM10K_DGLORTDEC_QBASE_SHIFT		16
+#define FM10K_DGLORTDEC_RSSLENGTH_SHIFT		24
+#define FM10K_DGLORTDEC_INNERRSS_ENABLE		0x08000000
+#define FM10K_TUNNEL_CFG	0x0040
+#define FM10K_TUNNEL_CFG_NVGRE_SHIFT		16
+#define FM10K_SWPRI_MAP(_n)	((_n) + 0x0050)
+#define FM10K_SWPRI_MAX		16
+#define FM10K_RSSRK(_n, _m)	(((_n) * 0x10) + (_m) + 0x0800)
+#define FM10K_RSSRK_SIZE	10
+#define FM10K_RSSRK_ENTRIES_PER_REG		4
+#define FM10K_RETA(_n, _m)	(((_n) * 0x20) + (_m) + 0x1000)
+#define FM10K_RETA_SIZE		32
+#define FM10K_RETA_ENTRIES_PER_REG		4
+#define FM10K_MAX_RSS_INDICES	128
+
+/* Rate limiting registers */
+#define FM10K_TC_CREDIT(_n)	((_n) + 0x2000)
+#define FM10K_TC_CREDIT_CREDIT_MASK		0x001FFFFF
+#define FM10K_TC_MAXCREDIT(_n)	((_n) + 0x2040)
+#define FM10K_TC_MAXCREDIT_64K			0x00010000
+#define FM10K_TC_RATE(_n)	((_n) + 0x2080)
+#define FM10K_TC_RATE_QUANTA_MASK		0x0000FFFF
+#define FM10K_TC_RATE_INTERVAL_4US_GEN1		0x00020000
+#define FM10K_TC_RATE_INTERVAL_4US_GEN2		0x00040000
+#define FM10K_TC_RATE_INTERVAL_4US_GEN3		0x00080000
+
+/* DMA control registers */
+#define FM10K_DMA_CTRL		0x20C3
+#define FM10K_DMA_CTRL_TX_ENABLE		0x00000001
+#define FM10K_DMA_CTRL_TX_ACTIVE		0x00000008
+#define FM10K_DMA_CTRL_RX_ENABLE		0x00000010
+#define FM10K_DMA_CTRL_RX_ACTIVE		0x00000080
+#define FM10K_DMA_CTRL_RX_DESC_SIZE		0x00000100
+#define FM10K_DMA_CTRL_MINMSS_64		0x00008000
+#define FM10K_DMA_CTRL_MAX_HOLD_1US_GEN3	0x04800000
+#define FM10K_DMA_CTRL_MAX_HOLD_1US_GEN2	0x04000000
+#define FM10K_DMA_CTRL_MAX_HOLD_1US_GEN1	0x03800000
+#define FM10K_DMA_CTRL_DATAPATH_RESET		0x20000000
+#define FM10K_DMA_CTRL_32_DESC			0x00000000
+
+#define FM10K_DMA_CTRL2		0x20C4
+#define FM10K_DMA_CTRL2_SWITCH_READY		0x00002000
+
+/* TSO flags configuration
+ * First packet contains all flags except for fin and psh
+ * Middle packet contains only urg and ack
+ * Last packet contains urg, ack, fin, and psh
+ */
+#define FM10K_TSO_FLAGS_LOW		0x00300FF6
+#define FM10K_TSO_FLAGS_HI		0x00000039
+#define FM10K_DTXTCPFLGL	0x20C5
+#define FM10K_DTXTCPFLGH	0x20C6
+
+#define FM10K_TPH_CTRL		0x20C7
+#define FM10K_MRQC(_n)		((_n) + 0x2100)
+#define FM10K_MRQC_TCP_IPV4			0x00000001
+#define FM10K_MRQC_IPV4				0x00000002
+#define FM10K_MRQC_IPV6				0x00000010
+#define FM10K_MRQC_TCP_IPV6			0x00000020
+#define FM10K_MRQC_UDP_IPV4			0x00000040
+#define FM10K_MRQC_UDP_IPV6			0x00000080
+
+#define FM10K_TQMAP(_n)		((_n) + 0x2800)
+#define FM10K_TQMAP_TABLE_SIZE			2048
+#define FM10K_RQMAP(_n)		((_n) + 0x3000)
+
+/* Hardware Statistics */
+#define FM10K_STATS_TIMEOUT		0x3800
+#define FM10K_STATS_UR			0x3801
+#define FM10K_STATS_CA			0x3802
+#define FM10K_STATS_UM			0x3803
+#define FM10K_STATS_XEC			0x3804
+#define FM10K_STATS_VLAN_DROP		0x3805
+#define FM10K_STATS_LOOPBACK_DROP	0x3806
+#define FM10K_STATS_NODESC_DROP		0x3807
+
+/* PCIe state registers */
+#define FM10K_PHYADDR		0x381C
+
+/* Rx ring registers */
+#define FM10K_RDBAL(_n)		((0x40 * (_n)) + 0x4000)
+#define FM10K_RDBAH(_n)		((0x40 * (_n)) + 0x4001)
+#define FM10K_RDLEN(_n)		((0x40 * (_n)) + 0x4002)
+#define FM10K_TPH_RXCTRL(_n)	((0x40 * (_n)) + 0x4003)
+#define FM10K_TPH_RXCTRL_DESC_TPHEN		0x00000020
+#define FM10K_TPH_RXCTRL_DESC_RROEN		0x00000200
+#define FM10K_TPH_RXCTRL_DATA_WROEN		0x00002000
+#define FM10K_TPH_RXCTRL_HDR_WROEN		0x00008000
+#define FM10K_RDH(_n)		((0x40 * (_n)) + 0x4004)
+#define FM10K_RDT(_n)		((0x40 * (_n)) + 0x4005)
+#define FM10K_RXQCTL(_n)	((0x40 * (_n)) + 0x4006)
+#define FM10K_RXQCTL_ENABLE			0x00000001
+#define FM10K_RXQCTL_PF				0x000000FC
+#define FM10K_RXQCTL_VF_SHIFT			2
+#define FM10K_RXQCTL_VF				0x00000100
+#define FM10K_RXQCTL_ID_MASK	(FM10K_RXQCTL_PF | FM10K_RXQCTL_VF)
+#define FM10K_RXDCTL(_n)	((0x40 * (_n)) + 0x4007)
+#define FM10K_RXDCTL_WRITE_BACK_MIN_DELAY	0x00000001
+#define FM10K_RXDCTL_DROP_ON_EMPTY		0x00000200
+#define FM10K_RXINT(_n)		((0x40 * (_n)) + 0x4008)
+#define FM10K_SRRCTL(_n)	((0x40 * (_n)) + 0x4009)
+#define FM10K_SRRCTL_BSIZEPKT_SHIFT		8 /* shift _right_ */
+#define FM10K_SRRCTL_LOOPBACK_SUPPRESS		0x40000000
+#define FM10K_SRRCTL_BUFFER_CHAINING_EN		0x80000000
+
+/* Rx Statistics */
+#define FM10K_QPRC(_n)		((0x40 * (_n)) + 0x400A)
+#define FM10K_QPRDC(_n)		((0x40 * (_n)) + 0x400B)
+#define FM10K_QBRC_L(_n)	((0x40 * (_n)) + 0x400C)
+#define FM10K_QBRC_H(_n)	((0x40 * (_n)) + 0x400D)
+
+/* Rx GLORT register */
+#define FM10K_RX_SGLORT(_n)		((0x40 * (_n)) + 0x400E)
+
+/* Tx ring registers */
+#define FM10K_TDBAL(_n)		((0x40 * (_n)) + 0x8000)
+#define FM10K_TDBAH(_n)		((0x40 * (_n)) + 0x8001)
+#define FM10K_TDLEN(_n)		((0x40 * (_n)) + 0x8002)
+#define FM10K_TPH_TXCTRL(_n)	((0x40 * (_n)) + 0x8003)
+#define FM10K_TPH_TXCTRL_DESC_TPHEN		0x00000020
+#define FM10K_TPH_TXCTRL_DESC_RROEN		0x00000200
+#define FM10K_TPH_TXCTRL_DESC_WROEN		0x00000800
+#define FM10K_TPH_TXCTRL_DATA_RROEN		0x00002000
+#define FM10K_TDH(_n)		((0x40 * (_n)) + 0x8004)
+#define FM10K_TDT(_n)		((0x40 * (_n)) + 0x8005)
+#define FM10K_TXDCTL(_n)	((0x40 * (_n)) + 0x8006)
+#define FM10K_TXDCTL_ENABLE			0x00004000
+#define FM10K_TXDCTL_MAX_TIME_SHIFT		16
+#define FM10K_TXQCTL(_n)	((0x40 * (_n)) + 0x8007)
+#define FM10K_TXQCTL_PF				0x0000003F
+#define FM10K_TXQCTL_VF				0x00000040
+#define FM10K_TXQCTL_ID_MASK	(FM10K_TXQCTL_PF | FM10K_TXQCTL_VF)
+#define FM10K_TXQCTL_PC_SHIFT			7
+#define FM10K_TXQCTL_PC_MASK			0x00000380
+#define FM10K_TXQCTL_TC_SHIFT			10
+#define FM10K_TXQCTL_VID_SHIFT			16
+#define FM10K_TXQCTL_VID_MASK			0x0FFF0000
+#define FM10K_TXQCTL_UNLIMITED_BW		0x10000000
+#define FM10K_TXINT(_n)		((0x40 * (_n)) + 0x8008)
+
+/* Tx Statistics */
+#define FM10K_QPTC(_n)		((0x40 * (_n)) + 0x8009)
+#define FM10K_QBTC_L(_n)	((0x40 * (_n)) + 0x800A)
+#define FM10K_QBTC_H(_n)	((0x40 * (_n)) + 0x800B)
+
+/* Tx Push registers */
+#define FM10K_TQDLOC(_n)	((0x40 * (_n)) + 0x800C)
+#define FM10K_TQDLOC_BASE_32_DESC		0x08
+#define FM10K_TQDLOC_SIZE_32_DESC		0x00050000
+
+/* Tx GLORT registers */
+#define FM10K_TX_SGLORT(_n)	((0x40 * (_n)) + 0x800D)
+#define FM10K_PFVTCTL(_n)	((0x40 * (_n)) + 0x800E)
+#define FM10K_PFVTCTL_FTAG_DESC_ENABLE		0x00000001
+
+/* Interrupt moderation and control registers */
+#define FM10K_INT_MAP(_n)	((_n) + 0x10080)
+#define FM10K_INT_MAP_TIMER0			0x00000000
+#define FM10K_INT_MAP_TIMER1			0x00000100
+#define FM10K_INT_MAP_IMMEDIATE			0x00000200
+#define FM10K_INT_MAP_DISABLE			0x00000300
+#define FM10K_MSIX_VECTOR_MASK(_n)	((0x4 * (_n)) + 0x11003)
+#define FM10K_INT_CTRL		0x12000
+#define FM10K_INT_CTRL_ENABLEMODERATOR		0x00000400
+#define FM10K_ITR(_n)		((_n) + 0x12400)
+#define FM10K_ITR_INTERVAL1_SHIFT		12
+#define FM10K_ITR_PENDING2			0x10000000
+#define FM10K_ITR_AUTOMASK			0x20000000
+#define FM10K_ITR_MASK_SET			0x40000000
+#define FM10K_ITR_MASK_CLEAR			0x80000000
+#define FM10K_ITR2(_n)		((0x2 * (_n)) + 0x12800)
+#define FM10K_ITR_REG_COUNT			768
+#define FM10K_ITR_REG_COUNT_PF			256
+
+/* Switch manager interrupt registers */
+#define FM10K_IP		0x13000
+#define FM10K_IP_NOTINRESET			0x00000100
+
+/* VLAN registers */
+#define FM10K_VLAN_TABLE(_n, _m)	((0x80 * (_n)) + (_m) + 0x14000)
+#define FM10K_VLAN_TABLE_SIZE			128
+
+/* VLAN specific message offsets */
+#define FM10K_VLAN_TABLE_VID_MAX		4096
+#define FM10K_VLAN_TABLE_VSI_MAX		64
+#define FM10K_VLAN_LENGTH_SHIFT			16
+#define FM10K_VLAN_CLEAR			(1 << 15)
+#define FM10K_VLAN_ALL \
+	((FM10K_VLAN_TABLE_VID_MAX - 1) << FM10K_VLAN_LENGTH_SHIFT)
+
+/* VF FLR event notification registers */
+#define FM10K_PFVFLRE(_n)	((0x1 * (_n)) + 0x18844)
+#define FM10K_PFVFLREC(_n)	((0x1 * (_n)) + 0x18846)
+
+/* Defines for size of uncacheable memories */
+#define FM10K_UC_ADDR_START	0x000000	/* start of standard regs */
+#define FM10K_UC_ADDR_END	0x100000	/* end of standard regs */
+#define FM10K_UC_ADDR_SIZE	(FM10K_UC_ADDR_END - FM10K_UC_ADDR_START)
+
+/* Define timeouts for resets and disables */
+#define FM10K_QUEUE_DISABLE_TIMEOUT		100
+#define FM10K_RESET_TIMEOUT			100
+
+enum fm10k_int_source {
+	fm10k_int_Mailbox	= 0,
+	fm10k_int_PCIeFault	= 1,
+	fm10k_int_SwitchUpDown	= 2,
+	fm10k_int_SwitchEvent	= 3,
+	fm10k_int_SRAM		= 4,
+	fm10k_int_VFLR		= 5,
+	fm10k_int_MaxHoldTime	= 6,
+	fm10k_int_sources_max_pf
+};
+
+/* PCIe bus speeds */
+enum fm10k_bus_speed {
+	fm10k_bus_speed_unknown	= 0,
+	fm10k_bus_speed_2500	= 2500,
+	fm10k_bus_speed_5000	= 5000,
+	fm10k_bus_speed_8000	= 8000,
+	fm10k_bus_speed_reserved
+};
+
+/* PCIe bus widths */
+enum fm10k_bus_width {
+	fm10k_bus_width_unknown	= 0,
+	fm10k_bus_width_pcie_x1	= 1,
+	fm10k_bus_width_pcie_x2	= 2,
+	fm10k_bus_width_pcie_x4	= 4,
+	fm10k_bus_width_pcie_x8	= 8,
+	fm10k_bus_width_reserved
+};
+
+/* PCIe payload sizes */
+enum fm10k_bus_payload {
+	fm10k_bus_payload_unknown = 0,
+	fm10k_bus_payload_128	  = 1,
+	fm10k_bus_payload_256	  = 2,
+	fm10k_bus_payload_512	  = 3,
+	fm10k_bus_payload_reserved
+};
+
+/* Bus parameters */
+struct fm10k_bus_info {
+	enum fm10k_bus_speed speed;
+	enum fm10k_bus_width width;
+	enum fm10k_bus_payload payload;
+};
+
+/* Statistics related declarations */
+struct fm10k_hw_stat {
+	u64 count;
+	u32 base_l;
+	u32 base_h;
+};
+
+struct fm10k_hw_stats_q {
+	struct fm10k_hw_stat tx_bytes;
+	struct fm10k_hw_stat tx_packets;
+#define tx_stats_idx	tx_packets.base_h
+	struct fm10k_hw_stat rx_bytes;
+	struct fm10k_hw_stat rx_packets;
+#define rx_stats_idx	rx_packets.base_h
+	struct fm10k_hw_stat rx_drops;
+};
+
+struct fm10k_hw_stats {
+	struct fm10k_hw_stat	timeout;
+#define stats_idx	timeout.base_h
+	struct fm10k_hw_stat	ur;
+	struct fm10k_hw_stat	ca;
+	struct fm10k_hw_stat	um;
+	struct fm10k_hw_stat	xec;
+	struct fm10k_hw_stat	vlan_drop;
+	struct fm10k_hw_stat	loopback_drop;
+	struct fm10k_hw_stat	nodesc_drop;
+	struct fm10k_hw_stats_q q[FM10K_MAX_QUEUES_PF];
+};
+
+/* Establish DGLORT feature priority */
+enum fm10k_dglortdec_idx {
+	fm10k_dglort_default	= 0,
+	fm10k_dglort_vf_rsvd0	= 1,
+	fm10k_dglort_vf_rss	= 2,
+	fm10k_dglort_pf_rsvd0	= 3,
+	fm10k_dglort_pf_queue	= 4,
+	fm10k_dglort_pf_vsi	= 5,
+	fm10k_dglort_pf_rsvd1	= 6,
+	fm10k_dglort_pf_rss	= 7
+};
+
+struct fm10k_dglort_cfg {
+	u16 glort;	/* GLORT base */
+	u16 queue_b;	/* Base value for queue */
+	u8  vsi_b;	/* Base value for VSI */
+	u8  idx;	/* index of DGLORTDEC entry */
+	u8  rss_l;	/* RSS indices */
+	u8  pc_l;	/* Priority Class indices */
+	u8  vsi_l;	/* Number of bits from GLORT used to determine VSI */
+	u8  queue_l;	/* Number of bits from GLORT used to determine queue */
+	u8  shared_l;	/* Ignored bits from GLORT resulting in shared VSI */
+	u8  inner_rss;	/* Boolean value if inner header is used for RSS */
+};
+
+enum fm10k_pca_fault {
+	PCA_NO_FAULT,
+	PCA_UNMAPPED_ADDR,
+	PCA_BAD_QACCESS_PF,
+	PCA_BAD_QACCESS_VF,
+	PCA_MALICIOUS_REQ,
+	PCA_POISONED_TLP,
+	PCA_TLP_ABORT,
+	__PCA_MAX
+};
+
+enum fm10k_thi_fault {
+	THI_NO_FAULT,
+	THI_MAL_DIS_Q_FAULT,
+	__THI_MAX
+};
+
+enum fm10k_fum_fault {
+	FUM_NO_FAULT,
+	FUM_UNMAPPED_ADDR,
+	FUM_POISONED_TLP,
+	FUM_BAD_VF_QACCESS,
+	FUM_ADD_DECODE_ERR,
+	FUM_RO_ERROR,
+	FUM_QPRC_CRC_ERROR,
+	FUM_CSR_TIMEOUT,
+	FUM_INVALID_TYPE,
+	FUM_INVALID_LENGTH,
+	FUM_INVALID_BE,
+	FUM_INVALID_ALIGN,
+	__FUM_MAX
+};
+
+struct fm10k_fault {
+	u64 address;	/* Address at the time fault was detected */
+	u32 specinfo;	/* Extra info on this fault (fault dependent) */
+	u8 type;	/* Fault value dependent on subunit */
+	u8 func;	/* Function number of the fault */
+};
+
+struct fm10k_mac_ops {
+	/* basic bring-up and tear-down */
+	s32 (*reset_hw)(struct fm10k_hw *);
+	s32 (*init_hw)(struct fm10k_hw *);
+	s32 (*start_hw)(struct fm10k_hw *);
+	s32 (*stop_hw)(struct fm10k_hw *);
+	s32 (*get_bus_info)(struct fm10k_hw *);
+	s32 (*get_host_state)(struct fm10k_hw *, bool *);
+	bool (*is_slot_appropriate)(struct fm10k_hw *);
+	s32 (*update_vlan)(struct fm10k_hw *, u32, u8, bool);
+	s32 (*read_mac_addr)(struct fm10k_hw *);
+	s32 (*update_uc_addr)(struct fm10k_hw *, u16, const u8 *,
+			      u16, bool, u8);
+	s32 (*update_mc_addr)(struct fm10k_hw *, u16, const u8 *, u16, bool);
+	s32 (*update_xcast_mode)(struct fm10k_hw *, u16, u8);
+	void (*update_int_moderator)(struct fm10k_hw *);
+	s32  (*update_lport_state)(struct fm10k_hw *, u16, u16, bool);
+	void (*update_hw_stats)(struct fm10k_hw *, struct fm10k_hw_stats *);
+	void (*rebind_hw_stats)(struct fm10k_hw *, struct fm10k_hw_stats *);
+	s32 (*configure_dglort_map)(struct fm10k_hw *,
+				    struct fm10k_dglort_cfg *);
+	void (*set_dma_mask)(struct fm10k_hw *, u64);
+	s32 (*get_fault)(struct fm10k_hw *, int, struct fm10k_fault *);
+	void (*request_lport_map)(struct fm10k_hw *);
+	s32 (*adjust_systime)(struct fm10k_hw *, s32 ppb);
+};
+
+enum fm10k_mac_type {
+	fm10k_mac_unknown = 0,
+	fm10k_mac_pf,
+	fm10k_num_macs
+};
+
+struct fm10k_mac_info {
+	struct fm10k_mac_ops ops;
+	enum fm10k_mac_type type;
+	u8 addr[ETH_ALEN];
+	u8 perm_addr[ETH_ALEN];
+	u16 default_vid;
+	u16 max_msix_vectors;
+	u16 max_queues;
+	bool vlan_override;
+	bool get_host_state;
+	bool tx_ready;
+	u32 dglort_map;
+};
+
+struct fm10k_swapi_table_info {
+	u32 used;
+	u32 avail;
+};
+
+struct fm10k_swapi_info {
+	u32 status;
+	struct fm10k_swapi_table_info mac;
+	struct fm10k_swapi_table_info nexthop;
+	struct fm10k_swapi_table_info ffu;
+};
+
+enum fm10k_xcast_modes {
+	FM10K_XCAST_MODE_ALLMULTI	= 0,
+	FM10K_XCAST_MODE_MULTI		= 1,
+	FM10K_XCAST_MODE_PROMISC	= 2,
+	FM10K_XCAST_MODE_NONE		= 3,
+	FM10K_XCAST_MODE_DISABLE	= 4
+};
+
+enum fm10k_devices {
+	fm10k_device_pf,
+};
+
+struct fm10k_info {
+	enum fm10k_mac_type	mac;
+	s32			(*get_invariants)(struct fm10k_hw *);
+	struct fm10k_mac_ops	*mac_ops;
+};
+
+struct fm10k_hw {
+	u32 __iomem *hw_addr;
+	void *back;
+	struct fm10k_mac_info mac;
+	struct fm10k_bus_info bus;
+	struct fm10k_bus_info bus_caps;
+	struct fm10k_swapi_info swapi;
+	u16 device_id;
+	u16 vendor_id;
+	u16 subsystem_device_id;
+	u16 subsystem_vendor_id;
+	u8 revision_id;
+};
+
+/* Number of Transmit and Receive Descriptors must be a multiple of 8 */
+#define FM10K_REQ_TX_DESCRIPTOR_MULTIPLE	8
+#define FM10K_REQ_RX_DESCRIPTOR_MULTIPLE	8
+
+/* Transmit Descriptor */
+struct fm10k_tx_desc {
+	__le64 buffer_addr;	/* Address of the descriptor's data buffer */
+	__le16 buflen;		/* Length of data to be DMAed */
+	__le16 vlan;		/* VLAN_ID and VPRI to be inserted in FTAG */
+	__le16 mss;		/* MSS for segmentation offload */
+	u8 hdrlen;		/* Header size for segmentation offload */
+	u8 flags;		/* Status and offload request flags */
+};
+
+/* Transmit Descriptor Cache Structure */
+struct fm10k_tx_desc_cache {
+	struct fm10k_tx_desc tx_desc[256];
+};
+
+#define FM10K_TXD_FLAG_INT	0x01
+#define FM10K_TXD_FLAG_TIME	0x02
+#define FM10K_TXD_FLAG_CSUM	0x04
+#define FM10K_TXD_FLAG_FTAG	0x10
+#define FM10K_TXD_FLAG_RS	0x20
+#define FM10K_TXD_FLAG_LAST	0x40
+#define FM10K_TXD_FLAG_DONE	0x80
+
+/* These macros are meant to enable optimal placement of the RS and INT
+ * bits.  It will point us to the last descriptor in the cache for either the
+ * start of the packet, or the end of the packet.  If the index is actually
+ * at the start of the FIFO it will point to the offset for the last index
+ * in the FIFO to prevent an unnecessary write.
+ */
+#define FM10K_TXD_WB_FIFO_SIZE	4
+
+/* Receive Descriptor - 32B */
+union fm10k_rx_desc {
+	struct {
+		__le64 pkt_addr; /* Packet buffer address */
+		__le64 hdr_addr; /* Header buffer address */
+		__le64 reserved; /* Empty space, RSS hash */
+		__le64 timestamp;
+	} q; /* Read, Writeback, 64b quad-words */
+	struct {
+		__le32 data; /* RSS and header data */
+		__le32 rss;  /* RSS Hash */
+		__le32 staterr;
+		__le32 vlan_len;
+		__le32 glort; /* sglort/dglort */
+	} d; /* Writeback, 32b double-words */
+	struct {
+		__le16 pkt_info; /* RSS, Pkt type */
+		__le16 hdr_info; /* Splithdr, hdrlen, xC */
+		__le16 rss_lower;
+		__le16 rss_upper;
+		__le16 status; /* status/error */
+		__le16 csum_err; /* checksum or extended error value */
+		__le16 length; /* Packet length */
+		__le16 vlan; /* VLAN tag */
+		__le16 dglort;
+		__le16 sglort;
+	} w; /* Writeback, 16b words */
+};
+
+#define FM10K_RXD_RSSTYPE_MASK		0x000F
+enum fm10k_rdesc_rss_type {
+	FM10K_RSSTYPE_NONE	= 0x0,
+	FM10K_RSSTYPE_IPV4_TCP	= 0x1,
+	FM10K_RSSTYPE_IPV4	= 0x2,
+	FM10K_RSSTYPE_IPV6_TCP	= 0x3,
+	/* Reserved 0x4 */
+	FM10K_RSSTYPE_IPV6	= 0x5,
+	/* Reserved 0x6 */
+	FM10K_RSSTYPE_IPV4_UDP	= 0x7,
+	FM10K_RSSTYPE_IPV6_UDP	= 0x8
+	/* Reserved 0x9 - 0xF */
+};
+
+#define FM10K_RXD_HDR_INFO_XC_MASK	0x0006
+enum fm10k_rxdesc_xc {
+	FM10K_XC_UNICAST	= 0x0,
+	FM10K_XC_MULTICAST	= 0x4,
+	FM10K_XC_BROADCAST	= 0x6
+};
+
+#define FM10K_RXD_STATUS_DD		0x0001 /* Descriptor done */
+#define FM10K_RXD_STATUS_EOP		0x0002 /* End of packet */
+#define FM10K_RXD_STATUS_L4CS		0x0010 /* Indicates an L4 csum */
+#define FM10K_RXD_STATUS_L4CS2		0x0040 /* Inner header L4 csum */
+#define FM10K_RXD_STATUS_L4E2		0x0800 /* Inner header L4 csum err */
+#define FM10K_RXD_STATUS_IPE2		0x1000 /* Inner header IPv4 csum err */
+#define FM10K_RXD_STATUS_RXE		0x2000 /* Generic Rx error */
+#define FM10K_RXD_STATUS_L4E		0x4000 /* L4 csum error */
+#define FM10K_RXD_STATUS_IPE		0x8000 /* IPv4 csum error */
+
+struct fm10k_ftag {
+	__be16 swpri_type_user;
+	__be16 vlan;
+	__be16 sglort;
+	__be16 dglort;
+};
+
 #endif /* _FM10K_TYPE_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 03/29] fm10k: Add support for TLV message parsing and generation
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
  2014-09-18 22:35 ` [net-next PATCH 01/29] fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver Alexander Duyck
  2014-09-18 22:35 ` [net-next PATCH 02/29] fm10k: Add register defines and basic structures Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 04/29] fm10k: Add support for basic interaction with hardware Alexander Duyck
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for the TVL message formats supported by the PF,
VF, and Switch Management entity.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile    |    3 
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c |  543 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h |  141 +++++++
 3 files changed, 686 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index ddc0eb7..1052808 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -27,4 +27,5 @@
 
 obj-$(CONFIG_FM10K) += fm10k.o
 
-fm10k-objs := fm10k_main.o fm10k_pci.o
+fm10k-objs := fm10k_main.o fm10k_pci.o \
+	      fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
new file mode 100644
index 0000000..f1fc709
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
@@ -0,0 +1,543 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k_tlv.h"
+
+/**
+ *  fm10k_tlv_msg_init - Initialize message block for TLV data storage
+ *  @msg: Pointer to message block
+ *  @msg_id: Message ID indicating message type
+ *
+ *  This function return success if provided with a valid message pointer
+ **/
+s32 fm10k_tlv_msg_init(u32 *msg, u16 msg_id)
+{
+	/* verify pointer is not NULL */
+	if (!msg)
+		return FM10K_ERR_PARAM;
+
+	*msg = (FM10K_TLV_FLAGS_MSG << FM10K_TLV_FLAGS_SHIFT) | msg_id;
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_put_null_string - Place null terminated string on message
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *  @string: Pointer to string to be stored in attribute
+ *
+ *  This function will reorder a string to be CPU endian and store it in
+ *  the attribute buffer.  It will return success if provided with a valid
+ *  pointers.
+ **/
+s32 fm10k_tlv_attr_put_null_string(u32 *msg, u16 attr_id,
+				   const unsigned char *string)
+{
+	u32 attr_data = 0, len = 0;
+	u32 *attr;
+
+	/* verify pointers are not NULL */
+	if (!string || !msg)
+		return FM10K_ERR_PARAM;
+
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+
+	/* copy string into local variable and then write to msg */
+	do {
+		/* write data to message */
+		if (len && !(len % 4)) {
+			attr[len / 4] = attr_data;
+			attr_data = 0;
+		}
+
+		/* record character to offset location */
+		attr_data |= (u32)(*string) << (8 * (len % 4));
+		len++;
+
+		/* test for NULL and then increment */
+	} while (*(string++));
+
+	/* write last piece of data to message */
+	attr[(len + 3) / 4] = attr_data;
+
+	/* record attribute header, update message length */
+	len <<= FM10K_TLV_LEN_SHIFT;
+	attr[0] = len | attr_id;
+
+	/* add header length to length */
+	len += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+	*msg += FM10K_TLV_LEN_ALIGN(len);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_get_null_string - Get null terminated string from attribute
+ *  @attr: Pointer to attribute
+ *  @string: Pointer to location of destination string
+ *
+ *  This function pulls the string back out of the attribute and will place
+ *  it in the array pointed by by string.  It will return success if provided
+ *  with a valid pointers.
+ **/
+s32 fm10k_tlv_attr_get_null_string(u32 *attr, unsigned char *string)
+{
+	u32 len;
+
+	/* verify pointers are not NULL */
+	if (!string || !attr)
+		return FM10K_ERR_PARAM;
+
+	len = *attr >> FM10K_TLV_LEN_SHIFT;
+	attr++;
+
+	while (len--)
+		string[len] = (u8)(attr[len / 4] >> (8 * (len % 4)));
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_put_mac_vlan - Store MAC/VLAN attribute in message
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *  @mac_addr: MAC address to be stored
+ *
+ *  This function will reorder a MAC address to be CPU endian and store it
+ *  in the attribute buffer.  It will return success if provided with a
+ *  valid pointers.
+ **/
+s32 fm10k_tlv_attr_put_mac_vlan(u32 *msg, u16 attr_id,
+				const u8 *mac_addr, u16 vlan)
+{
+	u32 len = ETH_ALEN << FM10K_TLV_LEN_SHIFT;
+	u32 *attr;
+
+	/* verify pointers are not NULL */
+	if (!msg || !mac_addr)
+		return FM10K_ERR_PARAM;
+
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+
+	/* record attribute header, update message length */
+	attr[0] = len | attr_id;
+
+	/* copy value into local variable and then write to msg */
+	attr[1] = le32_to_cpu(*(const __le32 *)&mac_addr[0]);
+	attr[2] = le16_to_cpu(*(const __le16 *)&mac_addr[4]);
+	attr[2] |= (u32)vlan << 16;
+
+	/* add header length to length */
+	len += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+	*msg += FM10K_TLV_LEN_ALIGN(len);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_get_mac_vlan - Get MAC/VLAN stored in attribute
+ *  @attr: Pointer to attribute
+ *  @attr_id: Attribute ID
+ *  @mac_addr: location of buffer to store MAC address
+ *
+ *  This function pulls the MAC address back out of the attribute and will
+ *  place it in the array pointed by by mac_addr.  It will return success
+ *  if provided with a valid pointers.
+ **/
+s32 fm10k_tlv_attr_get_mac_vlan(u32 *attr, u8 *mac_addr, u16 *vlan)
+{
+	/* verify pointers are not NULL */
+	if (!mac_addr || !attr)
+		return FM10K_ERR_PARAM;
+
+	*(__le32 *)&mac_addr[0] = cpu_to_le32(attr[1]);
+	*(__le16 *)&mac_addr[4] = cpu_to_le16((u16)(attr[2]));
+	*vlan = (u16)(attr[2] >> 16);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_put_bool - Add header indicating value "true"
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *
+ *  This function will simply add an attribute header, the fact
+ *  that the header is here means the attribute value is true, else
+ *  it is false.  The function will return success if provided with a
+ *  valid pointers.
+ **/
+s32 fm10k_tlv_attr_put_bool(u32 *msg, u16 attr_id)
+{
+	/* verify pointers are not NULL */
+	if (!msg)
+		return FM10K_ERR_PARAM;
+
+	/* record attribute header */
+	msg[FM10K_TLV_DWORD_LEN(*msg)] = attr_id;
+
+	/* add header length to length */
+	*msg += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_put_value - Store integer value attribute in message
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *  @value: Value to be written
+ *  @len: Size of value
+ *
+ *  This function will place an integer value of up to 8 bytes in size
+ *  in a message attribute.  The function will return success provided
+ *  that msg is a valid pointer, and len is 1, 2, 4, or 8.
+ **/
+s32 fm10k_tlv_attr_put_value(u32 *msg, u16 attr_id, s64 value, u32 len)
+{
+	u32 *attr;
+
+	/* verify non-null msg and len is 1, 2, 4, or 8 */
+	if (!msg || !len || len > 8 || (len & (len - 1)))
+		return FM10K_ERR_PARAM;
+
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+
+	if (len < 4) {
+		attr[1] = (u32)value & ((0x1ul << (8 * len)) - 1);
+	} else {
+		attr[1] = (u32)value;
+		if (len > 4)
+			attr[2] = (u32)(value >> 32);
+	}
+
+	/* record attribute header, update message length */
+	len <<= FM10K_TLV_LEN_SHIFT;
+	attr[0] = len | attr_id;
+
+	/* add header length to length */
+	len += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+	*msg += FM10K_TLV_LEN_ALIGN(len);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_get_value - Get integer value stored in attribute
+ *  @attr: Pointer to attribute
+ *  @value: Pointer to destination buffer
+ *  @len: Size of value
+ *
+ *  This function will place an integer value of up to 8 bytes in size
+ *  in the offset pointed to by value.  The function will return success
+ *  provided that pointers are valid and the len value matches the
+ *  attribute length.
+ **/
+s32 fm10k_tlv_attr_get_value(u32 *attr, void *value, u32 len)
+{
+	/* verify pointers are not NULL */
+	if (!attr || !value)
+		return FM10K_ERR_PARAM;
+
+	if ((*attr >> FM10K_TLV_LEN_SHIFT) != len)
+		return FM10K_ERR_PARAM;
+
+	if (len == 8)
+		*(u64 *)value = ((u64)attr[2] << 32) | attr[1];
+	else if (len == 4)
+		*(u32 *)value = attr[1];
+	else if (len == 2)
+		*(u16 *)value = (u16)attr[1];
+	else
+		*(u8 *)value = (u8)attr[1];
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_put_le_struct - Store little endian structure in message
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *  @le_struct: Pointer to structure to be written
+ *  @len: Size of le_struct
+ *
+ *  This function will place a little endian structure value in a message
+ *  attribute.  The function will return success provided that all pointers
+ *  are valid and length is a non-zero multiple of 4.
+ **/
+s32 fm10k_tlv_attr_put_le_struct(u32 *msg, u16 attr_id,
+				 const void *le_struct, u32 len)
+{
+	const __le32 *le32_ptr = (const __le32 *)le_struct;
+	u32 *attr;
+	u32 i;
+
+	/* verify non-null msg and len is in 32 bit words */
+	if (!msg || !len || (len % 4))
+		return FM10K_ERR_PARAM;
+
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+
+	/* copy le32 structure into host byte order at 32b boundaries */
+	for (i = 0; i < (len / 4); i++)
+		attr[i + 1] = le32_to_cpu(le32_ptr[i]);
+
+	/* record attribute header, update message length */
+	len <<= FM10K_TLV_LEN_SHIFT;
+	attr[0] = len | attr_id;
+
+	/* add header length to length */
+	len += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+	*msg += FM10K_TLV_LEN_ALIGN(len);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_get_le_struct - Get little endian struct form attribute
+ *  @attr: Pointer to attribute
+ *  @le_struct: Pointer to structure to be written
+ *  @len: Size of structure
+ *
+ *  This function will place a little endian structure in the buffer
+ *  pointed to by le_struct.  The function will return success
+ *  provided that pointers are valid and the len value matches the
+ *  attribute length.
+ **/
+s32 fm10k_tlv_attr_get_le_struct(u32 *attr, void *le_struct, u32 len)
+{
+	__le32 *le32_ptr = (__le32 *)le_struct;
+	u32 i;
+
+	/* verify pointers are not NULL */
+	if (!le_struct || !attr)
+		return FM10K_ERR_PARAM;
+
+	if ((*attr >> FM10K_TLV_LEN_SHIFT) != len)
+		return FM10K_ERR_PARAM;
+
+	attr++;
+
+	for (i = 0; len; i++, len -= 4)
+		le32_ptr[i] = cpu_to_le32(attr[i]);
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_nest_start - Start a set of nested attributes
+ *  @msg: Pointer to message block
+ *  @attr_id: Attribute ID
+ *
+ *  This function will mark off a new nested region for encapsulating
+ *  a given set of attributes.  The idea is if you wish to place a secondary
+ *  structure within the message this mechanism allows for that.  The
+ *  function will return NULL on failure, and a pointer to the start
+ *  of the nested attributes on success.
+ **/
+u32 *fm10k_tlv_attr_nest_start(u32 *msg, u16 attr_id)
+{
+	u32 *attr;
+
+	/* verify pointer is not NULL */
+	if (!msg)
+		return NULL;
+
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+
+	attr[0] = attr_id;
+
+	/* return pointer to nest header */
+	return attr;
+}
+
+/**
+ *  fm10k_tlv_attr_nest_start - Start a set of nested attributes
+ *  @msg: Pointer to message block
+ *
+ *  This function closes off an existing set of nested attributes.  The
+ *  message pointer should be pointing to the parent of the nest.  So in
+ *  the case of a nest within the nest this would be the outer nest pointer.
+ *  This function will return success provided all pointers are valid.
+ **/
+s32 fm10k_tlv_attr_nest_stop(u32 *msg)
+{
+	u32 *attr;
+	u32 len;
+
+	/* verify pointer is not NULL */
+	if (!msg)
+		return FM10K_ERR_PARAM;
+
+	/* locate the nested header and retrieve its length */
+	attr = &msg[FM10K_TLV_DWORD_LEN(*msg)];
+	len = (attr[0] >> FM10K_TLV_LEN_SHIFT) << FM10K_TLV_LEN_SHIFT;
+
+	/* only include nest if data was added to it */
+	if (len) {
+		len += FM10K_TLV_HDR_LEN << FM10K_TLV_LEN_SHIFT;
+		*msg += len;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_validate - Validate attribute metadata
+ *  @attr: Pointer to attribute
+ *  @tlv_attr: Type and length info for attribute
+ *
+ *  This function does some basic validation of the input TLV.  It
+ *  verifies the length, and in the case of null terminated strings
+ *  it verifies that the last byte is null.  The function will
+ *  return FM10K_ERR_PARAM if any attribute is malformed, otherwise
+ *  it returns 0.
+ **/
+static s32 fm10k_tlv_attr_validate(u32 *attr,
+				   const struct fm10k_tlv_attr *tlv_attr)
+{
+	u32 attr_id = *attr & FM10K_TLV_ID_MASK;
+	u16 len = *attr >> FM10K_TLV_LEN_SHIFT;
+
+	/* verify this is an attribute and not a message */
+	if (*attr & (FM10K_TLV_FLAGS_MSG << FM10K_TLV_FLAGS_SHIFT))
+		return FM10K_ERR_PARAM;
+
+	/* search through the list of attributes to find a matching ID */
+	while (tlv_attr->id < attr_id)
+		tlv_attr++;
+
+	/* if didn't find a match then we should exit */
+	if (tlv_attr->id != attr_id)
+		return FM10K_NOT_IMPLEMENTED;
+
+	/* move to start of attribute data */
+	attr++;
+
+	switch (tlv_attr->type) {
+	case FM10K_TLV_NULL_STRING:
+		if (!len ||
+		    (attr[(len - 1) / 4] & (0xFF << (8 * ((len - 1) % 4)))))
+			return FM10K_ERR_PARAM;
+		if (len > tlv_attr->len)
+			return FM10K_ERR_PARAM;
+		break;
+	case FM10K_TLV_MAC_ADDR:
+		if (len != ETH_ALEN)
+			return FM10K_ERR_PARAM;
+		break;
+	case FM10K_TLV_BOOL:
+		if (len)
+			return FM10K_ERR_PARAM;
+		break;
+	case FM10K_TLV_UNSIGNED:
+	case FM10K_TLV_SIGNED:
+		if (len != tlv_attr->len)
+			return FM10K_ERR_PARAM;
+		break;
+	case FM10K_TLV_LE_STRUCT:
+		/* struct must be 4 byte aligned */
+		if ((len % 4) || len != tlv_attr->len)
+			return FM10K_ERR_PARAM;
+		break;
+	case FM10K_TLV_NESTED:
+		/* nested attributes must be 4 byte aligned */
+		if (len % 4)
+			return FM10K_ERR_PARAM;
+		break;
+	default:
+		/* attribute id is mapped to bad value */
+		return FM10K_ERR_PARAM;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_tlv_attr_parse - Parses stream of attribute data
+ *  @attr: Pointer to attribute list
+ *  @results: Pointer array to store pointers to attributes
+ *  @tlv_attr: Type and length info for attributes
+ *
+ *  This function validates a stream of attributes and parses them
+ *  up into an array of pointers stored in results.  The function will
+ *  return FM10K_ERR_PARAM on any input or message error,
+ *  FM10K_NOT_IMPLEMENTED for any attribute that is outside of the array
+ *  and 0 on success.
+ **/
+s32 fm10k_tlv_attr_parse(u32 *attr, u32 **results,
+			 const struct fm10k_tlv_attr *tlv_attr)
+{
+	u32 i, attr_id, offset = 0;
+	s32 err = 0;
+	u16 len;
+
+	/* verify pointers are not NULL */
+	if (!attr || !results)
+		return FM10K_ERR_PARAM;
+
+	/* initialize results to NULL */
+	for (i = 0; i < FM10K_TLV_RESULTS_MAX; i++)
+		results[i] = NULL;
+
+	/* pull length from the message header */
+	len = *attr >> FM10K_TLV_LEN_SHIFT;
+
+	/* no attributes to parse if there is no length */
+	if (!len)
+		return 0;
+
+	/* no attributes to parse, just raw data, message becomes attribute */
+	if (!tlv_attr) {
+		results[0] = attr;
+		return 0;
+	}
+
+	/* move to start of attribute data */
+	attr++;
+
+	/* run through list parsing all attributes */
+	while (offset < len) {
+		attr_id = *attr & FM10K_TLV_ID_MASK;
+
+		if (attr_id < FM10K_TLV_RESULTS_MAX)
+			err = fm10k_tlv_attr_validate(attr, tlv_attr);
+		else
+			err = FM10K_NOT_IMPLEMENTED;
+
+		if (err < 0)
+			return err;
+		if (!err)
+			results[attr_id] = attr;
+
+		/* update offset */
+		offset += FM10K_TLV_DWORD_LEN(*attr) * 4;
+
+		/* move to next attribute */
+		attr = &attr[FM10K_TLV_DWORD_LEN(*attr)];
+	}
+
+	/* we should find ourselves at the end of the list */
+	if (offset != len)
+		return FM10K_ERR_PARAM;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
new file mode 100644
index 0000000..6d22db6
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
@@ -0,0 +1,141 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_TLV_H_
+#define _FM10K_TLV_H_
+
+#include "fm10k_type.h"
+
+/* Message / Argument header format
+ *    3			  2		      1			  0
+ *  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |	     Length	   | Flags |	      Type / ID		   |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ * The message header format described here is used for messages that are
+ * passed between the PF and the VF.  To allow for messages larger then
+ * mailbox size we will provide a message with the above header and it
+ * will be segmented and transported to the mailbox to the other side where
+ * it is reassembled.  It contains the following fields:
+ * Len: Length of the message in bytes excluding the message header
+ * Flags: TBD
+ * Rule: These will be the message/argument types we pass
+ */
+/* message data header */
+#define FM10K_TLV_ID_SHIFT		0
+#define FM10K_TLV_ID_SIZE		16
+#define FM10K_TLV_ID_MASK		((1u << FM10K_TLV_ID_SIZE) - 1)
+#define FM10K_TLV_FLAGS_SHIFT		16
+#define FM10K_TLV_FLAGS_MSG		0x1
+#define FM10K_TLV_FLAGS_SIZE		4
+#define FM10K_TLV_LEN_SHIFT		20
+#define FM10K_TLV_LEN_SIZE		12
+
+#define FM10K_TLV_HDR_LEN		4ul
+#define FM10K_TLV_LEN_ALIGN_MASK \
+	((FM10K_TLV_HDR_LEN - 1) << FM10K_TLV_LEN_SHIFT)
+#define FM10K_TLV_LEN_ALIGN(tlv) \
+	(((tlv) + FM10K_TLV_LEN_ALIGN_MASK) & ~FM10K_TLV_LEN_ALIGN_MASK)
+#define FM10K_TLV_DWORD_LEN(tlv) \
+	((u16)((FM10K_TLV_LEN_ALIGN(tlv)) >> (FM10K_TLV_LEN_SHIFT + 2)) + 1)
+
+#define FM10K_TLV_RESULTS_MAX		32
+
+enum fm10k_tlv_type {
+	FM10K_TLV_NULL_STRING,
+	FM10K_TLV_MAC_ADDR,
+	FM10K_TLV_BOOL,
+	FM10K_TLV_UNSIGNED,
+	FM10K_TLV_SIGNED,
+	FM10K_TLV_LE_STRUCT,
+	FM10K_TLV_NESTED,
+	FM10K_TLV_MAX_TYPE
+};
+
+#define FM10K_TLV_ERROR (~0u)
+
+struct fm10k_tlv_attr {
+	unsigned int		id;
+	enum fm10k_tlv_type	type;
+	u16			len;
+};
+
+#define FM10K_TLV_ATTR_NULL_STRING(id, len) { id, FM10K_TLV_NULL_STRING, len }
+#define FM10K_TLV_ATTR_MAC_ADDR(id)	    { id, FM10K_TLV_MAC_ADDR, 6 }
+#define FM10K_TLV_ATTR_BOOL(id)		    { id, FM10K_TLV_BOOL, 0 }
+#define FM10K_TLV_ATTR_U8(id)		    { id, FM10K_TLV_UNSIGNED, 1 }
+#define FM10K_TLV_ATTR_U16(id)		    { id, FM10K_TLV_UNSIGNED, 2 }
+#define FM10K_TLV_ATTR_U32(id)		    { id, FM10K_TLV_UNSIGNED, 4 }
+#define FM10K_TLV_ATTR_U64(id)		    { id, FM10K_TLV_UNSIGNED, 8 }
+#define FM10K_TLV_ATTR_S8(id)		    { id, FM10K_TLV_SIGNED, 1 }
+#define FM10K_TLV_ATTR_S16(id)		    { id, FM10K_TLV_SIGNED, 2 }
+#define FM10K_TLV_ATTR_S32(id)		    { id, FM10K_TLV_SIGNED, 4 }
+#define FM10K_TLV_ATTR_S64(id)		    { id, FM10K_TLV_SIGNED, 8 }
+#define FM10K_TLV_ATTR_LE_STRUCT(id, len)   { id, FM10K_TLV_LE_STRUCT, len }
+#define FM10K_TLV_ATTR_NESTED(id)	    { id, FM10K_TLV_NESTED }
+#define FM10K_TLV_ATTR_LAST		    { FM10K_TLV_ERROR }
+
+s32 fm10k_tlv_msg_init(u32 *, u16);
+s32 fm10k_tlv_attr_put_null_string(u32 *, u16, const unsigned char *);
+s32 fm10k_tlv_attr_get_null_string(u32 *, unsigned char *);
+s32 fm10k_tlv_attr_put_mac_vlan(u32 *, u16, const u8 *, u16);
+s32 fm10k_tlv_attr_get_mac_vlan(u32 *, u8 *, u16 *);
+s32 fm10k_tlv_attr_put_bool(u32 *, u16);
+s32 fm10k_tlv_attr_put_value(u32 *, u16, s64, u32);
+#define fm10k_tlv_attr_put_u8(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 1)
+#define fm10k_tlv_attr_put_u16(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 2)
+#define fm10k_tlv_attr_put_u32(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 4)
+#define fm10k_tlv_attr_put_u64(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 8)
+#define fm10k_tlv_attr_put_s8(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 1)
+#define fm10k_tlv_attr_put_s16(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 2)
+#define fm10k_tlv_attr_put_s32(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 4)
+#define fm10k_tlv_attr_put_s64(msg, attr_id, val) \
+		fm10k_tlv_attr_put_value(msg, attr_id, val, 8)
+s32 fm10k_tlv_attr_get_value(u32 *, void *, u32);
+#define fm10k_tlv_attr_get_u8(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(u8))
+#define fm10k_tlv_attr_get_u16(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(u16))
+#define fm10k_tlv_attr_get_u32(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(u32))
+#define fm10k_tlv_attr_get_u64(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(u64))
+#define fm10k_tlv_attr_get_s8(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(s8))
+#define fm10k_tlv_attr_get_s16(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(s16))
+#define fm10k_tlv_attr_get_s32(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(s32))
+#define fm10k_tlv_attr_get_s64(attr, ptr) \
+		fm10k_tlv_attr_get_value(attr, ptr, sizeof(s64))
+s32 fm10k_tlv_attr_put_le_struct(u32 *, u16, const void *, u32);
+s32 fm10k_tlv_attr_get_le_struct(u32 *, void *, u32);
+u32 *fm10k_tlv_attr_nest_start(u32 *, u16);
+s32 fm10k_tlv_attr_nest_stop(u32 *);
+s32 fm10k_tlv_attr_parse(u32 *, u32 **, const struct fm10k_tlv_attr *);
+#endif /* _FM10K_MSG_H_ */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 04/29] fm10k: Add support for basic interaction with hardware
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (2 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 03/29] fm10k: Add support for TLV message parsing and generation Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 05/29] fm10k: Add support for mailbox Alexander Duyck
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds the basic read/write operations for accessing the hardware.

In addition to read read functionality the read functions also provide
surprise remove detection in the event that the device either loses power
or is removed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |    9 ++++-
 drivers/net/ethernet/intel/fm10k/fm10k_common.h |   44 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |   30 ++++++++++++++++
 3 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.h

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 2e27bd9..3da3a99 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -27,7 +27,14 @@
 #include <linux/if_vlan.h>
 #include <linux/pci.h>
 
-#include "fm10k_type.h"
+#include "fm10k_common.h"
+
+struct fm10k_intfc {
+	struct pci_dev *pdev;
+
+	struct fm10k_hw hw;
+	u32 __iomem *uc_addr;
+};
 
 /* main */
 extern char fm10k_driver_name[];
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.h b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
new file mode 100644
index 0000000..2b214a6
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
@@ -0,0 +1,44 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_COMMON_H_
+#define _FM10K_COMMON_H_
+
+#include "fm10k_type.h"
+
+#define FM10K_REMOVED(hw_addr) unlikely(!(hw_addr))
+
+/* PCI configuration read */
+u16 fm10k_read_pci_cfg_word(struct fm10k_hw *hw, u32 reg);
+
+/* read operations, indexed using DWORDS */
+u32 fm10k_read_reg(struct fm10k_hw *hw, int reg);
+
+/* write operations, indexed using DWORDS */
+#define fm10k_write_reg(hw, reg, val) \
+do { \
+	u32 __iomem *hw_addr = ACCESS_ONCE((hw)->hw_addr); \
+	if (!FM10K_REMOVED(hw_addr)) \
+		writel((val), &hw_addr[(reg)]); \
+} while (0)
+
+/* read ctrl register which has no clear on read fields as PCIe flush */
+#define fm10k_write_flush(hw) fm10k_read_reg((hw), FM10K_CTRL)
+#endif /* _FM10K_COMMON_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 12089ce..6d5e33c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -38,6 +38,36 @@ static const struct pci_device_id fm10k_pci_tbl[] = {
 };
 MODULE_DEVICE_TABLE(pci, fm10k_pci_tbl);
 
+u16 fm10k_read_pci_cfg_word(struct fm10k_hw *hw, u32 reg)
+{
+	struct fm10k_intfc *interface = hw->back;
+	u16 value = 0;
+
+	if (FM10K_REMOVED(hw->hw_addr))
+		return ~value;
+
+	pci_read_config_word(interface->pdev, reg, &value);
+	if (value == 0xFFFF)
+		fm10k_write_flush(hw);
+
+	return value;
+}
+
+u32 fm10k_read_reg(struct fm10k_hw *hw, int reg)
+{
+	u32 __iomem *hw_addr = ACCESS_ONCE(hw->hw_addr);
+	u32 value = 0;
+
+	if (FM10K_REMOVED(hw_addr))
+		return ~value;
+
+	value = readl(&hw_addr[reg]);
+	if (!(~value) && (!reg || !(~readl(hw_addr))))
+		hw->hw_addr = NULL;
+
+	return value;
+}
+
 /**
  * fm10k_probe - Device Initialization Routine
  * @pdev: PCI device information struct

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 05/29] fm10k: Add support for mailbox
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (3 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 04/29] fm10k: Add support for basic interaction with hardware Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 06/29] fm10k: Implement PF <-> SM mailbox operations Alexander Duyck
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds generic mailbox support.  The general idea of the mailboxes
is to use a pair of ring buffers, one for request, one for response to send
data between the local driver and some remote entity be it the PF of the
Switch Manager.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h  |  241 +++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c  |  320 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h  |   45 ++++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h |    3 
 4 files changed, 609 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
new file mode 100644
index 0000000..ee2f712
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
@@ -0,0 +1,241 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_MBX_H_
+#define _FM10K_MBX_H_
+
+/* forward declaration */
+struct fm10k_mbx_info;
+
+#include "fm10k_type.h"
+#include "fm10k_tlv.h"
+
+/* PF Mailbox Registers */
+#define FM10K_MBMEM(_n)		((_n) + 0x18000)
+#define FM10K_MBMEM_VF(_n, _m)	(((_n) * 0x10) + (_m) + 0x18000)
+#define FM10K_MBMEM_SM(_n)	((_n) + 0x18400)
+#define FM10K_MBMEM_PF(_n)	((_n) + 0x18600)
+/* XOR provides means of switching from Tx to Rx FIFO */
+#define FM10K_MBMEM_PF_XOR	(FM10K_MBMEM_SM(0) ^ FM10K_MBMEM_PF(0))
+#define FM10K_MBX(_n)		((_n) + 0x18800)
+#define FM10K_MBX_REQ				0x00000002
+#define FM10K_MBX_ACK				0x00000004
+#define FM10K_MBX_REQ_INTERRUPT			0x00000008
+#define FM10K_MBX_ACK_INTERRUPT			0x00000010
+#define FM10K_MBX_INTERRUPT_ENABLE		0x00000020
+#define FM10K_MBX_INTERRUPT_DISABLE		0x00000040
+#define FM10K_MBICR(_n)		((_n) + 0x18840)
+#define FM10K_GMBX		0x18842
+
+/* VF Mailbox Registers */
+#define FM10K_VFMBX		0x00010
+#define FM10K_VFMBMEM(_n)	((_n) + 0x00020)
+#define FM10K_VFMBMEM_LEN	16
+#define FM10K_VFMBMEM_VF_XOR	(FM10K_VFMBMEM_LEN / 2)
+
+/* Delays/timeouts */
+#define FM10K_MBX_DISCONNECT_TIMEOUT		500
+#define FM10K_MBX_POLL_DELAY			19
+#define FM10K_MBX_INT_DELAY			20
+
+/* PF/VF Mailbox state machine
+ *
+ * +----------+	    connect()	+----------+
+ * |  CLOSED  | --------------> |  CONNECT |
+ * +----------+			+----------+
+ *   ^				  ^	 |
+ *   | rcv:	      rcv:	  |	 | rcv:
+ *   |  Connect	       Disconnect |	 |  Connect
+ *   |  Disconnect     Error	  |	 |  Data
+ *   |				  |	 |
+ *   |				  |	 V
+ * +----------+   disconnect()	+----------+
+ * |DISCONNECT| <-------------- |   OPEN   |
+ * +----------+			+----------+
+ *
+ * The diagram above describes the PF/VF mailbox state machine.  There
+ * are four main states to this machine.
+ * Closed: This state represents a mailbox that is in a standby state
+ *	   with interrupts disabled.  In this state the mailbox should not
+ *	   read the mailbox or write any data.  The only means of exiting
+ *	   this state is for the system to make the connect() call for the
+ *	   mailbox, it will then transition to the connect state.
+ * Connect: In this state the mailbox is seeking a connection.  It will
+ *	    post a connect message with no specified destination and will
+ *	    wait for a reply from the other side of the mailbox.  This state
+ *	    is exited when either a connect with the local mailbox as the
+ *	    destination is received or when a data message is received with
+ *	    a valid sequence number.
+ * Open: In this state the mailbox is able to transfer data between the local
+ *       entity and the remote.  It will fall back to connect in the event of
+ *       receiving either an error message, or a disconnect message.  It will
+ *       transition to disconnect on a call to disconnect();
+ * Disconnect: In this state the mailbox is attempting to gracefully terminate
+ *	       the connection.  It will do so at the first point where it knows
+ *	       that the remote endpoint is either done sending, or when the
+ *	       remote endpoint has fallen back into connect.
+ */
+enum fm10k_mbx_state {
+	FM10K_STATE_CLOSED,
+	FM10K_STATE_CONNECT,
+	FM10K_STATE_OPEN,
+	FM10K_STATE_DISCONNECT,
+};
+
+/* macros for retriving and setting header values */
+#define FM10K_MSG_HDR_MASK(name) \
+	((0x1u << FM10K_MSG_##name##_SIZE) - 1)
+#define FM10K_MSG_HDR_FIELD_SET(value, name) \
+	(((u32)(value) & FM10K_MSG_HDR_MASK(name)) << FM10K_MSG_##name##_SHIFT)
+#define FM10K_MSG_HDR_FIELD_GET(value, name) \
+	((u16)((value) >> FM10K_MSG_##name##_SHIFT) & FM10K_MSG_HDR_MASK(name))
+
+/* HNI/SM Mailbox FIFO format
+ *    3                   2                   1                   0
+ *  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ * +-------+-----------------------+-------+-----------------------+
+ * | Error |      Remote Head      |Version|      Local Tail       |
+ * +-------+-----------------------+-------+-----------------------+
+ * |                                                               |
+ * .                        Local FIFO Data                        .
+ * .                                                               .
+ * +-------+-----------------------+-------+-----------------------+
+ *
+ * The layout above describes the format for the FIFOs used by the host
+ * network interface and the switch manager to communicate messages back
+ * and forth.  Both the HNI and the switch maintain one such FIFO.  The
+ * layout in memory has the switch manager FIFO followed immediately by
+ * the HNI FIFO.  For this reason I am using just the pointer to the
+ * HNI FIFO in the mailbox ops as the offset between the two is fixed.
+ *
+ * The header for the FIFO is broken out into the following fields:
+ * Local Tail:  Offset into FIFO region for next DWORD to write.
+ * Version:  Version info for mailbox, only values of 0/1 are supported.
+ * Remote Head:  Offset into remote FIFO to indicate how much we have read.
+ * Error: Error indication, values TBD.
+ */
+
+/* version number for switch manager mailboxes */
+#define FM10K_SM_MBX_VERSION		1
+#define FM10K_SM_MBX_FIFO_LEN		(FM10K_MBMEM_PF_XOR - 1)
+#define FM10K_SM_MBX_FIFO_HDR_LEN	1
+
+/* offsets shared between all SM FIFO headers */
+#define FM10K_MSG_SM_TAIL_SHIFT			0
+#define FM10K_MSG_SM_TAIL_SIZE			12
+#define FM10K_MSG_SM_VER_SHIFT			12
+#define FM10K_MSG_SM_VER_SIZE			4
+#define FM10K_MSG_SM_HEAD_SHIFT			16
+#define FM10K_MSG_SM_HEAD_SIZE			12
+#define FM10K_MSG_SM_ERR_SHIFT			28
+#define FM10K_MSG_SM_ERR_SIZE			4
+
+/* All error messages returned by mailbox functions
+ * The value -511 is 0xFE01 in hex.  The idea is to order the errors
+ * from 0xFE01 - 0xFEFF so error codes are easily visible in the mailbox
+ * messages.  This also helps to avoid error number collisions as Linux
+ * doesn't appear to use error numbers 256 - 511.
+ */
+#define FM10K_MBX_ERR(_n) ((_n) - 512)
+#define FM10K_MBX_ERR_NO_MBX		FM10K_MBX_ERR(0x01)
+#define FM10K_MBX_ERR_NO_SPACE		FM10K_MBX_ERR(0x03)
+#define FM10K_MBX_ERR_TAIL		FM10K_MBX_ERR(0x05)
+#define FM10K_MBX_ERR_HEAD		FM10K_MBX_ERR(0x06)
+#define FM10K_MBX_ERR_SRC		FM10K_MBX_ERR(0x08)
+#define FM10K_MBX_ERR_TYPE		FM10K_MBX_ERR(0x09)
+#define FM10K_MBX_ERR_SIZE		FM10K_MBX_ERR(0x0B)
+#define FM10K_MBX_ERR_BUSY		FM10K_MBX_ERR(0x0C)
+#define FM10K_MBX_ERR_RSVD0		FM10K_MBX_ERR(0x0E)
+#define FM10K_MBX_ERR_CRC		FM10K_MBX_ERR(0x0F)
+
+#define FM10K_MBX_CRC_SEED		0xFFFF
+
+struct fm10k_mbx_ops {
+	s32 (*connect)(struct fm10k_hw *, struct fm10k_mbx_info *);
+	void (*disconnect)(struct fm10k_hw *, struct fm10k_mbx_info *);
+	bool (*rx_ready)(struct fm10k_mbx_info *);
+	bool (*tx_ready)(struct fm10k_mbx_info *, u16);
+	bool (*tx_complete)(struct fm10k_mbx_info *);
+	s32 (*enqueue_tx)(struct fm10k_hw *, struct fm10k_mbx_info *,
+			  const u32 *);
+	s32 (*process)(struct fm10k_hw *, struct fm10k_mbx_info *);
+	s32 (*register_handlers)(struct fm10k_mbx_info *,
+				 const struct fm10k_msg_data *);
+};
+
+struct fm10k_mbx_fifo {
+	u32 *buffer;
+	u16 head;
+	u16 tail;
+	u16 size;
+};
+
+/* size of buffer to be stored in mailbox for FIFOs */
+#define FM10K_MBX_TX_BUFFER_SIZE	512
+#define FM10K_MBX_RX_BUFFER_SIZE	128
+#define FM10K_MBX_BUFFER_SIZE \
+	(FM10K_MBX_TX_BUFFER_SIZE + FM10K_MBX_RX_BUFFER_SIZE)
+
+/* minimum and maximum message size in dwords */
+#define FM10K_MBX_MSG_MAX_SIZE \
+	((FM10K_MBX_TX_BUFFER_SIZE - 1) & (FM10K_MBX_RX_BUFFER_SIZE - 1))
+#define FM10K_VFMBX_MSG_MTU	((FM10K_VFMBMEM_LEN / 2) - 1)
+
+#define FM10K_MBX_INIT_TIMEOUT	2000 /* number of retries on mailbox */
+#define FM10K_MBX_INIT_DELAY	500  /* microseconds between retries */
+
+struct fm10k_mbx_info {
+	/* function pointers for mailbox operations */
+	struct fm10k_mbx_ops ops;
+	const struct fm10k_msg_data *msg_data;
+
+	/* message FIFOs */
+	struct fm10k_mbx_fifo rx;
+	struct fm10k_mbx_fifo tx;
+
+	/* delay for handling timeouts */
+	u32 timeout;
+	u32 udelay;
+
+	/* mailbox state info */
+	u32 mbx_reg, mbmem_reg, mbx_lock, mbx_hdr;
+	u16 max_size, mbmem_len;
+	u16 tail, tail_len, pulled;
+	u16 head, head_len, pushed;
+	u16 local, remote;
+	enum fm10k_mbx_state state;
+
+	/* result of last mailbox test */
+	s32 test_result;
+
+	/* statistics */
+	u64 tx_busy;
+	u64 tx_dropped;
+	u64 tx_messages;
+	u64 tx_dwords;
+	u64 rx_messages;
+	u64 rx_dwords;
+	u64 rx_parse_err;
+
+	/* Buffer to store messages */
+	u32 buffer[FM10K_MBX_BUFFER_SIZE];
+};
+
+#endif /* _FM10K_MBX_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
index f1fc709..fd0a05f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
@@ -541,3 +541,323 @@ s32 fm10k_tlv_attr_parse(u32 *attr, u32 **results,
 
 	return 0;
 }
+
+/**
+ *  fm10k_tlv_msg_parse - Parses message header and calls function handler
+ *  @hw: Pointer to hardware structure
+ *  @msg: Pointer to message
+ *  @mbx: Pointer to mailbox information structure
+ *  @func: Function array containing list of message handling functions
+ *
+ *  This function should be the first function called upon receiving a
+ *  message.  The handler will identify the message type and call the correct
+ *  handler for the given message.  It will return the value from the function
+ *  call on a recognized message type, otherwise it will return
+ *  FM10K_NOT_IMPLEMENTED on an unrecognized type.
+ **/
+s32 fm10k_tlv_msg_parse(struct fm10k_hw *hw, u32 *msg,
+			struct fm10k_mbx_info *mbx,
+			const struct fm10k_msg_data *data)
+{
+	u32 *results[FM10K_TLV_RESULTS_MAX];
+	u32 msg_id;
+	s32 err;
+
+	/* verify pointer is not NULL */
+	if (!msg || !data)
+		return FM10K_ERR_PARAM;
+
+	/* verify this is a message and not an attribute */
+	if (!(*msg & (FM10K_TLV_FLAGS_MSG << FM10K_TLV_FLAGS_SHIFT)))
+		return FM10K_ERR_PARAM;
+
+	/* grab message ID */
+	msg_id = *msg & FM10K_TLV_ID_MASK;
+
+	while (data->id < msg_id)
+		data++;
+
+	/* if we didn't find it then pass it up as an error */
+	if (data->id != msg_id) {
+		while (data->id != FM10K_TLV_ERROR)
+			data++;
+	}
+
+	/* parse the attributes into the results list */
+	err = fm10k_tlv_attr_parse(msg, results, data->attr);
+	if (err < 0)
+		return err;
+
+	return data->func(hw, results, mbx);
+}
+
+/**
+ *  fm10k_tlv_msg_error - Default handler for unrecognized TLV message IDs
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to message, results[0] is pointer to message
+ *  @mbx: Unused mailbox pointer
+ *
+ *  This function is a default handler for unrecognized messages.  At a
+ *  a minimum it just indicates that the message requested was
+ *  unimplemented.
+ **/
+s32 fm10k_tlv_msg_error(struct fm10k_hw *hw, u32 **results,
+			struct fm10k_mbx_info *mbx)
+{
+	return FM10K_NOT_IMPLEMENTED;
+}
+
+static const unsigned char test_str[] =	"fm10k";
+static const unsigned char test_mac[ETH_ALEN] = { 0x12, 0x34, 0x56,
+						  0x78, 0x9a, 0xbc };
+static const u16 test_vlan = 0x0FED;
+static const u64 test_u64 = 0xfedcba9876543210ull;
+static const u32 test_u32 = 0x87654321;
+static const u16 test_u16 = 0x8765;
+static const u8  test_u8  = 0x87;
+static const s64 test_s64 = -0x123456789abcdef0ll;
+static const s32 test_s32 = -0x1235678;
+static const s16 test_s16 = -0x1234;
+static const s8  test_s8  = -0x12;
+static const __le32 test_le[2] = { cpu_to_le32(0x12345678),
+				   cpu_to_le32(0x9abcdef0)};
+
+/* The message below is meant to be used as a test message to demonstrate
+ * how to use the TLV interface and to test the types.  Normally this code
+ * be compiled out by stripping the code wrapped in FM10K_TLV_TEST_MSG
+ */
+const struct fm10k_tlv_attr fm10k_tlv_msg_test_attr[] = {
+	FM10K_TLV_ATTR_NULL_STRING(FM10K_TEST_MSG_STRING, 80),
+	FM10K_TLV_ATTR_MAC_ADDR(FM10K_TEST_MSG_MAC_ADDR),
+	FM10K_TLV_ATTR_U8(FM10K_TEST_MSG_U8),
+	FM10K_TLV_ATTR_U16(FM10K_TEST_MSG_U16),
+	FM10K_TLV_ATTR_U32(FM10K_TEST_MSG_U32),
+	FM10K_TLV_ATTR_U64(FM10K_TEST_MSG_U64),
+	FM10K_TLV_ATTR_S8(FM10K_TEST_MSG_S8),
+	FM10K_TLV_ATTR_S16(FM10K_TEST_MSG_S16),
+	FM10K_TLV_ATTR_S32(FM10K_TEST_MSG_S32),
+	FM10K_TLV_ATTR_S64(FM10K_TEST_MSG_S64),
+	FM10K_TLV_ATTR_LE_STRUCT(FM10K_TEST_MSG_LE_STRUCT, 8),
+	FM10K_TLV_ATTR_NESTED(FM10K_TEST_MSG_NESTED),
+	FM10K_TLV_ATTR_S32(FM10K_TEST_MSG_RESULT),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_tlv_msg_test_generate_data - Stuff message with data
+ *  @msg: Pointer to message
+ *  @attr_flags: List of flags indicating what attributes to add
+ *
+ *  This function is meant to load a message buffer with attribute data
+ **/
+static void fm10k_tlv_msg_test_generate_data(u32 *msg, u32 attr_flags)
+{
+	if (attr_flags & (1 << FM10K_TEST_MSG_STRING))
+		fm10k_tlv_attr_put_null_string(msg, FM10K_TEST_MSG_STRING,
+					       test_str);
+	if (attr_flags & (1 << FM10K_TEST_MSG_MAC_ADDR))
+		fm10k_tlv_attr_put_mac_vlan(msg, FM10K_TEST_MSG_MAC_ADDR,
+					    test_mac, test_vlan);
+	if (attr_flags & (1 << FM10K_TEST_MSG_U8))
+		fm10k_tlv_attr_put_u8(msg, FM10K_TEST_MSG_U8,  test_u8);
+	if (attr_flags & (1 << FM10K_TEST_MSG_U16))
+		fm10k_tlv_attr_put_u16(msg, FM10K_TEST_MSG_U16, test_u16);
+	if (attr_flags & (1 << FM10K_TEST_MSG_U32))
+		fm10k_tlv_attr_put_u32(msg, FM10K_TEST_MSG_U32, test_u32);
+	if (attr_flags & (1 << FM10K_TEST_MSG_U64))
+		fm10k_tlv_attr_put_u64(msg, FM10K_TEST_MSG_U64, test_u64);
+	if (attr_flags & (1 << FM10K_TEST_MSG_S8))
+		fm10k_tlv_attr_put_s8(msg, FM10K_TEST_MSG_S8,  test_s8);
+	if (attr_flags & (1 << FM10K_TEST_MSG_S16))
+		fm10k_tlv_attr_put_s16(msg, FM10K_TEST_MSG_S16, test_s16);
+	if (attr_flags & (1 << FM10K_TEST_MSG_S32))
+		fm10k_tlv_attr_put_s32(msg, FM10K_TEST_MSG_S32, test_s32);
+	if (attr_flags & (1 << FM10K_TEST_MSG_S64))
+		fm10k_tlv_attr_put_s64(msg, FM10K_TEST_MSG_S64, test_s64);
+	if (attr_flags & (1 << FM10K_TEST_MSG_LE_STRUCT))
+		fm10k_tlv_attr_put_le_struct(msg, FM10K_TEST_MSG_LE_STRUCT,
+					     test_le, 8);
+}
+
+/**
+ *  fm10k_tlv_msg_test_create - Create a test message testing all attributes
+ *  @msg: Pointer to message
+ *  @attr_flags: List of flags indicating what attributes to add
+ *
+ *  This function is meant to load a message buffer with all attribute types
+ *  including a nested attribute.
+ **/
+void fm10k_tlv_msg_test_create(u32 *msg, u32 attr_flags)
+{
+	u32 *nest = NULL;
+
+	fm10k_tlv_msg_init(msg, FM10K_TLV_MSG_ID_TEST);
+
+	fm10k_tlv_msg_test_generate_data(msg, attr_flags);
+
+	/* check for nested attributes */
+	attr_flags >>= FM10K_TEST_MSG_NESTED;
+
+	if (attr_flags) {
+		nest = fm10k_tlv_attr_nest_start(msg, FM10K_TEST_MSG_NESTED);
+
+		fm10k_tlv_msg_test_generate_data(nest, attr_flags);
+
+		fm10k_tlv_attr_nest_stop(msg);
+	}
+}
+
+/**
+ *  fm10k_tlv_msg_test - Validate all results on test message receive
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to attributes in the mesage
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This function does a check to verify all attributes match what the test
+ *  message placed in the message buffer.  It is the default handler
+ *  for TLV test messages.
+ **/
+s32 fm10k_tlv_msg_test(struct fm10k_hw *hw, u32 **results,
+		       struct fm10k_mbx_info *mbx)
+{
+	u32 *nest_results[FM10K_TLV_RESULTS_MAX];
+	unsigned char result_str[80];
+	unsigned char result_mac[ETH_ALEN];
+	s32 err = 0;
+	__le32 result_le[2];
+	u16 result_vlan;
+	u64 result_u64;
+	u32 result_u32;
+	u16 result_u16;
+	u8  result_u8;
+	s64 result_s64;
+	s32 result_s32;
+	s16 result_s16;
+	s8  result_s8;
+	u32 reply[3];
+
+	/* retrieve results of a previous test */
+	if (!!results[FM10K_TEST_MSG_RESULT])
+		return fm10k_tlv_attr_get_s32(results[FM10K_TEST_MSG_RESULT],
+					      &mbx->test_result);
+
+parse_nested:
+	if (!!results[FM10K_TEST_MSG_STRING]) {
+		err = fm10k_tlv_attr_get_null_string(
+					results[FM10K_TEST_MSG_STRING],
+					result_str);
+		if (!err && memcmp(test_str, result_str, sizeof(test_str)))
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_MAC_ADDR]) {
+		err = fm10k_tlv_attr_get_mac_vlan(
+					results[FM10K_TEST_MSG_MAC_ADDR],
+					result_mac, &result_vlan);
+		if (!err && memcmp(test_mac, result_mac, ETH_ALEN))
+			err = FM10K_ERR_INVALID_VALUE;
+		if (!err && test_vlan != result_vlan)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_U8]) {
+		err = fm10k_tlv_attr_get_u8(results[FM10K_TEST_MSG_U8],
+					    &result_u8);
+		if (!err && test_u8 != result_u8)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_U16]) {
+		err = fm10k_tlv_attr_get_u16(results[FM10K_TEST_MSG_U16],
+					     &result_u16);
+		if (!err && test_u16 != result_u16)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_U32]) {
+		err = fm10k_tlv_attr_get_u32(results[FM10K_TEST_MSG_U32],
+					     &result_u32);
+		if (!err && test_u32 != result_u32)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_U64]) {
+		err = fm10k_tlv_attr_get_u64(results[FM10K_TEST_MSG_U64],
+					     &result_u64);
+		if (!err && test_u64 != result_u64)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_S8]) {
+		err = fm10k_tlv_attr_get_s8(results[FM10K_TEST_MSG_S8],
+					    &result_s8);
+		if (!err && test_s8 != result_s8)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_S16]) {
+		err = fm10k_tlv_attr_get_s16(results[FM10K_TEST_MSG_S16],
+					     &result_s16);
+		if (!err && test_s16 != result_s16)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_S32]) {
+		err = fm10k_tlv_attr_get_s32(results[FM10K_TEST_MSG_S32],
+					     &result_s32);
+		if (!err && test_s32 != result_s32)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_S64]) {
+		err = fm10k_tlv_attr_get_s64(results[FM10K_TEST_MSG_S64],
+					     &result_s64);
+		if (!err && test_s64 != result_s64)
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+	if (!!results[FM10K_TEST_MSG_LE_STRUCT]) {
+		err = fm10k_tlv_attr_get_le_struct(
+					results[FM10K_TEST_MSG_LE_STRUCT],
+					result_le,
+					sizeof(result_le));
+		if (!err && memcmp(test_le, result_le, sizeof(test_le)))
+			err = FM10K_ERR_INVALID_VALUE;
+		if (err)
+			goto report_result;
+	}
+
+	if (!!results[FM10K_TEST_MSG_NESTED]) {
+		/* clear any pointers */
+		memset(nest_results, 0, sizeof(nest_results));
+
+		/* parse the nested attributes into the nest results list */
+		err = fm10k_tlv_attr_parse(results[FM10K_TEST_MSG_NESTED],
+					   nest_results,
+					   fm10k_tlv_msg_test_attr);
+		if (err)
+			goto report_result;
+
+		/* loop back through to the start */
+		results = nest_results;
+		goto parse_nested;
+	}
+
+report_result:
+	/* generate reply with test result */
+	fm10k_tlv_msg_init(reply, FM10K_TLV_MSG_ID_TEST);
+	fm10k_tlv_attr_put_s32(reply, FM10K_TEST_MSG_RESULT, err);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, reply);
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
index 6d22db6..7e045e8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
@@ -21,6 +21,9 @@
 #ifndef _FM10K_TLV_H_
 #define _FM10K_TLV_H_
 
+/* forward declaration */
+struct fm10k_msg_data;
+
 #include "fm10k_type.h"
 
 /* Message / Argument header format
@@ -93,6 +96,15 @@ struct fm10k_tlv_attr {
 #define FM10K_TLV_ATTR_NESTED(id)	    { id, FM10K_TLV_NESTED }
 #define FM10K_TLV_ATTR_LAST		    { FM10K_TLV_ERROR }
 
+struct fm10k_msg_data {
+	unsigned int		    id;
+	const struct fm10k_tlv_attr *attr;
+	s32			    (*func)(struct fm10k_hw *, u32 **,
+					    struct fm10k_mbx_info *);
+};
+
+#define FM10K_MSG_HANDLER(id, attr, func) { id, attr, func }
+
 s32 fm10k_tlv_msg_init(u32 *, u16);
 s32 fm10k_tlv_attr_put_null_string(u32 *, u16, const unsigned char *);
 s32 fm10k_tlv_attr_get_null_string(u32 *, unsigned char *);
@@ -138,4 +150,37 @@ s32 fm10k_tlv_attr_get_le_struct(u32 *, void *, u32);
 u32 *fm10k_tlv_attr_nest_start(u32 *, u16);
 s32 fm10k_tlv_attr_nest_stop(u32 *);
 s32 fm10k_tlv_attr_parse(u32 *, u32 **, const struct fm10k_tlv_attr *);
+s32 fm10k_tlv_msg_parse(struct fm10k_hw *, u32 *, struct fm10k_mbx_info *,
+			const struct fm10k_msg_data *);
+s32 fm10k_tlv_msg_error(struct fm10k_hw *hw, u32 **results,
+			struct fm10k_mbx_info *);
+
+#define FM10K_TLV_MSG_ID_TEST	0
+
+enum fm10k_tlv_test_attr_id {
+	FM10K_TEST_MSG_UNSET,
+	FM10K_TEST_MSG_STRING,
+	FM10K_TEST_MSG_MAC_ADDR,
+	FM10K_TEST_MSG_U8,
+	FM10K_TEST_MSG_U16,
+	FM10K_TEST_MSG_U32,
+	FM10K_TEST_MSG_U64,
+	FM10K_TEST_MSG_S8,
+	FM10K_TEST_MSG_S16,
+	FM10K_TEST_MSG_S32,
+	FM10K_TEST_MSG_S64,
+	FM10K_TEST_MSG_LE_STRUCT,
+	FM10K_TEST_MSG_NESTED,
+	FM10K_TEST_MSG_RESULT,
+	FM10K_TEST_MSG_MAX
+};
+
+extern const struct fm10k_tlv_attr fm10k_tlv_msg_test_attr[];
+void fm10k_tlv_msg_test_create(u32 *, u32);
+s32 fm10k_tlv_msg_test(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
+
+#define FM10K_TLV_MSG_TEST_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_TLV_MSG_ID_TEST, fm10k_tlv_msg_test_attr, func)
+#define FM10K_TLV_MSG_ERROR_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_TLV_ERROR, NULL, func)
 #endif /* _FM10K_MSG_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index 1606805..eda0c7c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -28,6 +28,8 @@ struct fm10k_hw;
 #include <asm/byteorder.h>
 #include <linux/etherdevice.h>
 
+#include "fm10k_mbx.h"
+
 #define FM10K_DEV_ID_PF			0x15A4
 #define FM10K_DEV_ID_VF			0x15A5
 
@@ -573,6 +575,7 @@ struct fm10k_hw {
 	struct fm10k_mac_info mac;
 	struct fm10k_bus_info bus;
 	struct fm10k_bus_info bus_caps;
+	struct fm10k_mbx_info mbx;
 	struct fm10k_swapi_info swapi;
 	u16 device_id;
 	u16 vendor_id;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 06/29] fm10k: Implement PF <-> SM mailbox operations
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (4 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 05/29] fm10k: Add support for mailbox Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 07/29] fm10k: Add support for PF Alexander Duyck
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for the mailbox that connects the PF to the Switch
Management entity.  This mailbox will pass TLV formatted messages between
the two entities by using a pair of shared ring buffers.

The primary use of the mailbox is to configure L2 forwarding addresses,
VLANs, and general resource allocation from the switch.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile    |    2 
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c | 1351 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h |    4 
 3 files changed, 1355 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index 1052808..6642611 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -28,4 +28,4 @@
 obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_pci.o \
-	      fm10k_tlv.o
+	      fm10k_mbx.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
new file mode 100644
index 0000000..2609847
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
@@ -0,0 +1,1351 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k_common.h"
+
+/**
+ *  fm10k_fifo_init - Initialize a message FIFO
+ *  @fifo: pointer to FIFO
+ *  @buffer: pointer to memory to be used to store FIFO
+ *  @size: maximum message size to store in FIFO, must be 2^n - 1
+ **/
+static void fm10k_fifo_init(struct fm10k_mbx_fifo *fifo, u32 *buffer, u16 size)
+{
+	fifo->buffer = buffer;
+	fifo->size = size;
+	fifo->head = 0;
+	fifo->tail = 0;
+}
+
+/**
+ *  fm10k_fifo_used - Retrieve used space in FIFO
+ *  @fifo: pointer to FIFO
+ *
+ *  This function returns the number of DWORDs used in the FIFO
+ **/
+static u16 fm10k_fifo_used(struct fm10k_mbx_fifo *fifo)
+{
+	return fifo->tail - fifo->head;
+}
+
+/**
+ *  fm10k_fifo_unused - Retrieve unused space in FIFO
+ *  @fifo: pointer to FIFO
+ *
+ *  This function returns the number of unused DWORDs in the FIFO
+ **/
+static u16 fm10k_fifo_unused(struct fm10k_mbx_fifo *fifo)
+{
+	return fifo->size + fifo->head - fifo->tail;
+}
+
+/**
+ *  fm10k_fifo_empty - Test to verify if fifo is empty
+ *  @fifo: pointer to FIFO
+ *
+ *  This function returns true if the FIFO is empty, else false
+ **/
+static bool fm10k_fifo_empty(struct fm10k_mbx_fifo *fifo)
+{
+	return fifo->head == fifo->tail;
+}
+
+/**
+ *  fm10k_fifo_head_offset - returns indices of head with given offset
+ *  @fifo: pointer to FIFO
+ *  @offset: offset to add to head
+ *
+ *  This function returns the indicies into the fifo based on head + offset
+ **/
+static u16 fm10k_fifo_head_offset(struct fm10k_mbx_fifo *fifo, u16 offset)
+{
+	return (fifo->head + offset) & (fifo->size - 1);
+}
+
+/**
+ *  fm10k_fifo_tail_offset - returns indices of tail with given offset
+ *  @fifo: pointer to FIFO
+ *  @offset: offset to add to tail
+ *
+ *  This function returns the indicies into the fifo based on tail + offset
+ **/
+static u16 fm10k_fifo_tail_offset(struct fm10k_mbx_fifo *fifo, u16 offset)
+{
+	return (fifo->tail + offset) & (fifo->size - 1);
+}
+
+/**
+ *  fm10k_fifo_head_len - Retrieve length of first message in FIFO
+ *  @fifo: pointer to FIFO
+ *
+ *  This function returns the size of the first message in the FIFO
+ **/
+static u16 fm10k_fifo_head_len(struct fm10k_mbx_fifo *fifo)
+{
+	u32 *head = fifo->buffer + fm10k_fifo_head_offset(fifo, 0);
+
+	/* verify there is at least 1 DWORD in the fifo so *head is valid */
+	if (fm10k_fifo_empty(fifo))
+		return 0;
+
+	/* retieve the message length */
+	return FM10K_TLV_DWORD_LEN(*head);
+}
+
+/**
+ *  fm10k_fifo_head_drop - Drop the first message in FIFO
+ *  @fifo: pointer to FIFO
+ *
+ *  This function returns the size of the message dropped from the FIFO
+ **/
+static u16 fm10k_fifo_head_drop(struct fm10k_mbx_fifo *fifo)
+{
+	u16 len = fm10k_fifo_head_len(fifo);
+
+	/* update head so it is at the start of next frame */
+	fifo->head += len;
+
+	return len;
+}
+
+/**
+ *  fm10k_mbx_index_len - Convert a head/tail index into a length value
+ *  @mbx: pointer to mailbox
+ *  @head: head index
+ *  @tail: head index
+ *
+ *  This function takes the head and tail index and determines the length
+ *  of the data indicated by this pair.
+ **/
+static u16 fm10k_mbx_index_len(struct fm10k_mbx_info *mbx, u16 head, u16 tail)
+{
+	u16 len = tail - head;
+
+	/* we wrapped so subtract 2, one for index 0, one for all 1s index */
+	if (len > tail)
+		len -= 2;
+
+	return len & ((mbx->mbmem_len << 1) - 1);
+}
+
+/**
+ *  fm10k_mbx_tail_add - Determine new tail value with added offset
+ *  @mbx: pointer to mailbox
+ *  @offset: length to add to head offset
+ *
+ *  This function takes the local tail index and recomputes it for
+ *  a given length added as an offset.
+ **/
+static u16 fm10k_mbx_tail_add(struct fm10k_mbx_info *mbx, u16 offset)
+{
+	u16 tail = (mbx->tail + offset + 1) & ((mbx->mbmem_len << 1) - 1);
+
+	/* add/sub 1 because we cannot have offset 0 or all 1s */
+	return (tail > mbx->tail) ? --tail : ++tail;
+}
+
+/**
+ *  fm10k_mbx_tail_sub - Determine new tail value with subtracted offset
+ *  @mbx: pointer to mailbox
+ *  @offset: length to add to head offset
+ *
+ *  This function takes the local tail index and recomputes it for
+ *  a given length added as an offset.
+ **/
+static u16 fm10k_mbx_tail_sub(struct fm10k_mbx_info *mbx, u16 offset)
+{
+	u16 tail = (mbx->tail - offset - 1) & ((mbx->mbmem_len << 1) - 1);
+
+	/* sub/add 1 because we cannot have offset 0 or all 1s */
+	return (tail < mbx->tail) ? ++tail : --tail;
+}
+
+/**
+ *  fm10k_mbx_head_add - Determine new head value with added offset
+ *  @mbx: pointer to mailbox
+ *  @offset: length to add to head offset
+ *
+ *  This function takes the local head index and recomputes it for
+ *  a given length added as an offset.
+ **/
+static u16 fm10k_mbx_head_add(struct fm10k_mbx_info *mbx, u16 offset)
+{
+	u16 head = (mbx->head + offset + 1) & ((mbx->mbmem_len << 1) - 1);
+
+	/* add/sub 1 because we cannot have offset 0 or all 1s */
+	return (head > mbx->head) ? --head : ++head;
+}
+
+/**
+ *  fm10k_mbx_head_sub - Determine new head value with subtracted offset
+ *  @mbx: pointer to mailbox
+ *  @offset: length to add to head offset
+ *
+ *  This function takes the local head index and recomputes it for
+ *  a given length added as an offset.
+ **/
+static u16 fm10k_mbx_head_sub(struct fm10k_mbx_info *mbx, u16 offset)
+{
+	u16 head = (mbx->head - offset - 1) & ((mbx->mbmem_len << 1) - 1);
+
+	/* sub/add 1 because we cannot have offset 0 or all 1s */
+	return (head < mbx->head) ? ++head : --head;
+}
+
+/**
+ *  fm10k_mbx_pushed_tail_len - Retrieve the length of message being pushed
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will return the length of the message currently being
+ *  pushed onto the tail of the Rx queue.
+ **/
+static u16 fm10k_mbx_pushed_tail_len(struct fm10k_mbx_info *mbx)
+{
+	u32 *tail = mbx->rx.buffer + fm10k_fifo_tail_offset(&mbx->rx, 0);
+
+	/* pushed tail is only valid if pushed is set */
+	if (!mbx->pushed)
+		return 0;
+
+	return FM10K_TLV_DWORD_LEN(*tail);
+}
+
+/**
+ *  fm10k_fifo_write_copy - pulls data off of msg and places it in fifo
+ *  @fifo: pointer to FIFO
+ *  @msg: message array to populate
+ *  @tail_offset: additional offset to add to tail pointer
+ *  @len: length of FIFO to copy into message header
+ *
+ *  This function will take a message and copy it into a section of the
+ *  FIFO.  In order to get something into a location other than just
+ *  the tail you can use tail_offset to adjust the pointer.
+ **/
+static void fm10k_fifo_write_copy(struct fm10k_mbx_fifo *fifo,
+				  const u32 *msg, u16 tail_offset, u16 len)
+{
+	u16 end = fm10k_fifo_tail_offset(fifo, tail_offset);
+	u32 *tail = fifo->buffer + end;
+
+	/* track when we should cross the end of the FIFO */
+	end = fifo->size - end;
+
+	/* copy end of message before start of message */
+	if (end < len)
+		memcpy(fifo->buffer, msg + end, (len - end) << 2);
+	else
+		end = len;
+
+	/* Copy remaining message into Tx FIFO */
+	memcpy(tail, msg, end << 2);
+}
+
+/**
+ *  fm10k_fifo_enqueue - Enqueues the message to the tail of the FIFO
+ *  @fifo: pointer to FIFO
+ *  @msg: message array to read
+ *
+ *  This function enqueues a message up to the size specified by the length
+ *  contained in the first DWORD of the message and will place at the tail
+ *  of the FIFO.  It will return 0 on success, or a negative value on error.
+ **/
+static s32 fm10k_fifo_enqueue(struct fm10k_mbx_fifo *fifo, const u32 *msg)
+{
+	u16 len = FM10K_TLV_DWORD_LEN(*msg);
+
+	/* verify parameters */
+	if (len > fifo->size)
+		return FM10K_MBX_ERR_SIZE;
+
+	/* verify there is room for the message */
+	if (len > fm10k_fifo_unused(fifo))
+		return FM10K_MBX_ERR_NO_SPACE;
+
+	/* Copy message into FIFO */
+	fm10k_fifo_write_copy(fifo, msg, 0, len);
+
+	/* memory barrier to guarantee FIFO is written before tail update */
+	wmb();
+
+	/* Update Tx FIFO tail */
+	fifo->tail += len;
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_validate_msg_size - Validate incoming message based on size
+ *  @mbx: pointer to mailbox
+ *  @len: length of data pushed onto buffer
+ *
+ *  This function analyzes the frame and will return a non-zero value when
+ *  the start of a message larger than the mailbox is detected.
+ **/
+static u16 fm10k_mbx_validate_msg_size(struct fm10k_mbx_info *mbx, u16 len)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->rx;
+	u16 total_len = 0, msg_len;
+	u32 *msg;
+
+	/* length should include previous amounts pushed */
+	len += mbx->pushed;
+
+	/* offset in message is based off of current message size */
+	do {
+		msg = fifo->buffer + fm10k_fifo_tail_offset(fifo, total_len);
+		msg_len = FM10K_TLV_DWORD_LEN(*msg);
+		total_len += msg_len;
+	} while (total_len < len);
+
+	/* message extends out of pushed section, but fits in FIFO */
+	if ((len < total_len) && (msg_len <= mbx->rx.size))
+		return 0;
+
+	/* return length of invalid section */
+	return (len < total_len) ? len : (len - total_len);
+}
+
+/**
+ *  fm10k_mbx_write_copy - pulls data off of Tx FIFO and places it in mbmem
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will take a seciton of the Rx FIFO and copy it into the
+		mbx->tail--;
+ *  mailbox memory.  The offset in mbmem is based on the lower bits of the
+ *  tail and len determines the length to copy.
+ **/
+static void fm10k_mbx_write_copy(struct fm10k_hw *hw,
+				 struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->tx;
+	u32 mbmem = mbx->mbmem_reg;
+	u32 *head = fifo->buffer;
+	u16 end, len, tail, mask;
+
+	if (!mbx->tail_len)
+		return;
+
+	/* determine data length and mbmem tail index */
+	mask = mbx->mbmem_len - 1;
+	len = mbx->tail_len;
+	tail = fm10k_mbx_tail_sub(mbx, len);
+	if (tail > mask)
+		tail++;
+
+	/* determine offset in the ring */
+	end = fm10k_fifo_head_offset(fifo, mbx->pulled);
+	head += end;
+
+	/* memory barrier to guarantee data is ready to be read */
+	rmb();
+
+	/* Copy message from Tx FIFO */
+	for (end = fifo->size - end; len; head = fifo->buffer) {
+		do {
+			/* adjust tail to match offset for FIFO */
+			tail &= mask;
+			if (!tail)
+				tail++;
+
+			/* write message to hardware FIFO */
+			fm10k_write_reg(hw, mbmem + tail++, *(head++));
+		} while (--len && --end);
+	}
+}
+
+/**
+ *  fm10k_mbx_pull_head - Pulls data off of head of Tx FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *  @head: acknowledgement number last received
+ *
+ *  This function will push the tail index forward based on the remote
+ *  head index.  It will then pull up to mbmem_len DWORDs off of the
+ *  head of the FIFO and will place it in the MBMEM registers
+ *  associated with the mailbox.
+ **/
+static void fm10k_mbx_pull_head(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx, u16 head)
+{
+	u16 mbmem_len, len, ack = fm10k_mbx_index_len(mbx, head, mbx->tail);
+	struct fm10k_mbx_fifo *fifo = &mbx->tx;
+
+	/* update number of bytes pulled and update bytes in transit */
+	mbx->pulled += mbx->tail_len - ack;
+
+	/* determine length of data to pull, reserve space for mbmem header */
+	mbmem_len = mbx->mbmem_len - 1;
+	len = fm10k_fifo_used(fifo) - mbx->pulled;
+	if (len > mbmem_len)
+		len = mbmem_len;
+
+	/* update tail and record number of bytes in transit */
+	mbx->tail = fm10k_mbx_tail_add(mbx, len - ack);
+	mbx->tail_len = len;
+
+	/* drop pulled messages from the FIFO */
+	for (len = fm10k_fifo_head_len(fifo);
+	     len && (mbx->pulled >= len);
+	     len = fm10k_fifo_head_len(fifo)) {
+		mbx->pulled -= fm10k_fifo_head_drop(fifo);
+		mbx->tx_messages++;
+		mbx->tx_dwords += len;
+	}
+
+	/* Copy message out from the Tx FIFO */
+	fm10k_mbx_write_copy(hw, mbx);
+}
+
+/**
+ *  fm10k_mbx_read_copy - pulls data off of mbmem and places it in Rx FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will take a seciton of the mailbox memory and copy it
+ *  into the Rx FIFO.  The offset is based on the lower bits of the
+ *  head and len determines the length to copy.
+ **/
+static void fm10k_mbx_read_copy(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->rx;
+	u32 mbmem = mbx->mbmem_reg ^ mbx->mbmem_len;
+	u32 *tail = fifo->buffer;
+	u16 end, len, head;
+
+	/* determine data length and mbmem head index */
+	len = mbx->head_len;
+	head = fm10k_mbx_head_sub(mbx, len);
+	if (head >= mbx->mbmem_len)
+		head++;
+
+	/* determine offset in the ring */
+	end = fm10k_fifo_tail_offset(fifo, mbx->pushed);
+	tail += end;
+
+	/* Copy message into Rx FIFO */
+	for (end = fifo->size - end; len; tail = fifo->buffer) {
+		do {
+			/* adjust head to match offset for FIFO */
+			head &= mbx->mbmem_len - 1;
+			if (!head)
+				head++;
+
+			/* read message from hardware FIFO */
+			*(tail++) = fm10k_read_reg(hw, mbmem + head++);
+		} while (--len && --end);
+	}
+
+	/* memory barrier to guarantee FIFO is written before tail update */
+	wmb();
+}
+
+/**
+ *  fm10k_mbx_push_tail - Pushes up to 15 DWORDs on to tail of FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *  @tail: tail index of message
+ *
+ *  This function will first validate the tail index and size for the
+ *  incoming message.  It then updates the acknowlegment number and
+ *  copies the data into the FIFO.  It will return the number of messages
+ *  dequeued on success and a negative value on error.
+ **/
+static s32 fm10k_mbx_push_tail(struct fm10k_hw *hw,
+			       struct fm10k_mbx_info *mbx,
+			       u16 tail)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->rx;
+	u16 len, seq = fm10k_mbx_index_len(mbx, mbx->head, tail);
+
+	/* determine length of data to push */
+	len = fm10k_fifo_unused(fifo) - mbx->pushed;
+	if (len > seq)
+		len = seq;
+
+	/* update head and record bytes received */
+	mbx->head = fm10k_mbx_head_add(mbx, len);
+	mbx->head_len = len;
+
+	/* nothing to do if there is no data */
+	if (!len)
+		return 0;
+
+	/* Copy msg into Rx FIFO */
+	fm10k_mbx_read_copy(hw, mbx);
+
+	/* determine if there are any invalid lengths in message */
+	if (fm10k_mbx_validate_msg_size(mbx, len))
+		return FM10K_MBX_ERR_SIZE;
+
+	/* Update pushed */
+	mbx->pushed += len;
+
+	/* flush any completed messages */
+	for (len = fm10k_mbx_pushed_tail_len(mbx);
+	     len && (mbx->pushed >= len);
+	     len = fm10k_mbx_pushed_tail_len(mbx)) {
+		fifo->tail += len;
+		mbx->pushed -= len;
+		mbx->rx_messages++;
+		mbx->rx_dwords += len;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_rx_ready - Indicates that a message is ready in the Rx FIFO
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns true if there is a message in the Rx FIFO to dequeue.
+ **/
+static bool fm10k_mbx_rx_ready(struct fm10k_mbx_info *mbx)
+{
+	u16 msg_size = fm10k_fifo_head_len(&mbx->rx);
+
+	return msg_size && (fm10k_fifo_used(&mbx->rx) >= msg_size);
+}
+
+/**
+ *  fm10k_mbx_tx_ready - Indicates that the mailbox is in state ready for Tx
+ *  @mbx: pointer to mailbox
+ *  @len: verify free space is >= this value
+ *
+ *  This function returns true if the mailbox is in a state ready to transmit.
+ **/
+static bool fm10k_mbx_tx_ready(struct fm10k_mbx_info *mbx, u16 len)
+{
+	u16 fifo_unused = fm10k_fifo_unused(&mbx->tx);
+
+	return (mbx->state == FM10K_STATE_OPEN) && (fifo_unused >= len);
+}
+
+/**
+ *  fm10k_mbx_tx_complete - Indicates that the Tx FIFO has been emptied
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns true if the Tx FIFO is empty.
+ **/
+static bool fm10k_mbx_tx_complete(struct fm10k_mbx_info *mbx)
+{
+	return fm10k_fifo_empty(&mbx->tx);
+}
+
+/**
+ *  fm10k_mbx_deqeueue_rx - Dequeues the message from the head in the Rx FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function dequeues messages and hands them off to the tlv parser.
+ *  It will return the number of messages processed when called.
+ **/
+static u16 fm10k_mbx_dequeue_rx(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->rx;
+	s32 err;
+	u16 cnt;
+
+	/* parse Rx messages out of the Rx FIFO to empty it */
+	for (cnt = 0; !fm10k_fifo_empty(fifo); cnt++) {
+		err = fm10k_tlv_msg_parse(hw, fifo->buffer + fifo->head,
+					  mbx, mbx->msg_data);
+		if (err < 0)
+			mbx->rx_parse_err++;
+
+		fm10k_fifo_head_drop(fifo);
+	}
+
+	/* shift remaining bytes back to start of FIFO */
+	memmove(fifo->buffer, fifo->buffer + fifo->tail, mbx->pushed << 2);
+
+	/* shift head and tail based on the memory we moved */
+	fifo->tail -= fifo->head;
+	fifo->head = 0;
+
+	return cnt;
+}
+
+/**
+ *  fm10k_mbx_enqueue_tx - Enqueues the message to the tail of the Tx FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *  @msg: message array to read
+ *
+ *  This function enqueues a message up to the size specified by the length
+ *  contained in the first DWORD of the message and will place at the tail
+ *  of the FIFO.  It will return 0 on success, or a negative value on error.
+ **/
+static s32 fm10k_mbx_enqueue_tx(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx, const u32 *msg)
+{
+	u32 countdown = mbx->timeout;
+	s32 err;
+
+	switch (mbx->state) {
+	case FM10K_STATE_CLOSED:
+	case FM10K_STATE_DISCONNECT:
+		return FM10K_MBX_ERR_NO_MBX;
+	default:
+		break;
+	}
+
+	/* enqueue the message on the Tx FIFO */
+	err = fm10k_fifo_enqueue(&mbx->tx, msg);
+
+	/* if it failed give the FIFO a chance to drain */
+	while (err && countdown) {
+		countdown--;
+		udelay(mbx->udelay);
+		mbx->ops.process(hw, mbx);
+		err = fm10k_fifo_enqueue(&mbx->tx, msg);
+	}
+
+	/* if we failed trhead the error */
+	if (err) {
+		mbx->timeout = 0;
+		mbx->tx_busy++;
+	}
+
+	/* begin processing message, ignore errors as this is just meant
+	 * to start the mailbox flow so we are not concerned if there
+	 * is a bad error, or the mailbox is already busy with a request
+	 */
+	if (!mbx->tail_len)
+		mbx->ops.process(hw, mbx);
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_read - Copies the mbmem to local message buffer
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function copies the message from the mbmem to the message array
+ **/
+static s32 fm10k_mbx_read(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx)
+{
+	/* only allow one reader in here at a time */
+	if (mbx->mbx_hdr)
+		return FM10K_MBX_ERR_BUSY;
+
+	/* read to capture initial interrupt bits */
+	if (fm10k_read_reg(hw, mbx->mbx_reg) & FM10K_MBX_REQ_INTERRUPT)
+		mbx->mbx_lock = FM10K_MBX_ACK;
+
+	/* write back interrupt bits to clear */
+	fm10k_write_reg(hw, mbx->mbx_reg,
+			FM10K_MBX_REQ_INTERRUPT | FM10K_MBX_ACK_INTERRUPT);
+
+	/* read remote header */
+	mbx->mbx_hdr = fm10k_read_reg(hw, mbx->mbmem_reg ^ mbx->mbmem_len);
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_write - Copies the local message buffer to mbmem
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function copies the message from the the message array to mbmem
+ **/
+static void fm10k_mbx_write(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx)
+{
+	u32 mbmem = mbx->mbmem_reg;
+
+	/* write new msg header to notify recepient of change */
+	fm10k_write_reg(hw, mbmem, mbx->mbx_hdr);
+
+	/* write mailbox to sent interrupt */
+	if (mbx->mbx_lock)
+		fm10k_write_reg(hw, mbx->mbx_reg, mbx->mbx_lock);
+
+	/* we no longer are using the header so free it */
+	mbx->mbx_hdr = 0;
+	mbx->mbx_lock = 0;
+}
+
+/**
+ *  fm10k_mbx_reset_work- Reset internal pointers for any pending work
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will reset all internal pointers so any work in progress
+ *  is dropped.  This call should occur every time we transition from the
+ *  open state to the connect state.
+ **/
+static void fm10k_mbx_reset_work(struct fm10k_mbx_info *mbx)
+{
+	/* reset our outgoing max size back to Rx limits */
+	mbx->max_size = mbx->rx.size - 1;
+
+	/* just do a quick resysnc to start of message */
+	mbx->pushed = 0;
+	mbx->pulled = 0;
+	mbx->tail_len = 0;
+	mbx->head_len = 0;
+	mbx->rx.tail = 0;
+	mbx->rx.head = 0;
+}
+
+/**
+ *  fm10k_mbx_update_max_size - Update the max_size and drop any large messages
+ *  @mbx: pointer to mailbox
+ *  @size: new value for max_size
+ *
+ *  This function will update the max_size value and drop any outgoing messages
+ *  from the head of the Tx FIFO that are larger than max_size.
+ **/
+static void fm10k_mbx_update_max_size(struct fm10k_mbx_info *mbx, u16 size)
+{
+	u16 len;
+
+	mbx->max_size = size;
+
+	/* flush any oversized messages from the queue */
+	for (len = fm10k_fifo_head_len(&mbx->tx);
+	     len > size;
+	     len = fm10k_fifo_head_len(&mbx->tx)) {
+		fm10k_fifo_head_drop(&mbx->tx);
+		mbx->tx_dropped++;
+	}
+}
+
+/**
+ *  fm10k_mbx_validate_handlers - Validate layout of message parsing data
+ *  @msg_data: handlers for mailbox events
+ *
+ *  This function validates the layout of the message parsing data.  This
+ *  should be mostly static, but it is important to catch any errors that
+ *  are made when constructing the parsers.
+ **/
+static s32 fm10k_mbx_validate_handlers(const struct fm10k_msg_data *msg_data)
+{
+	const struct fm10k_tlv_attr *attr;
+	unsigned int id;
+
+	/* Allow NULL mailboxes that transmit but don't receive */
+	if (!msg_data)
+		return 0;
+
+	while (msg_data->id != FM10K_TLV_ERROR) {
+		/* all messages should have a function handler */
+		if (!msg_data->func)
+			return FM10K_ERR_PARAM;
+
+		/* parser is optional */
+		attr = msg_data->attr;
+		if (attr) {
+			while (attr->id != FM10K_TLV_ERROR) {
+				id = attr->id;
+				attr++;
+				/* ID should always be increasing */
+				if (id >= attr->id)
+					return FM10K_ERR_PARAM;
+				/* ID should fit in results array */
+				if (id >= FM10K_TLV_RESULTS_MAX)
+					return FM10K_ERR_PARAM;
+			}
+
+			/* verify terminator is in the list */
+			if (attr->id != FM10K_TLV_ERROR)
+				return FM10K_ERR_PARAM;
+		}
+
+		id = msg_data->id;
+		msg_data++;
+		/* ID should always be increasing */
+		if (id >= msg_data->id)
+			return FM10K_ERR_PARAM;
+	}
+
+	/* verify terminator is in the list */
+	if ((msg_data->id != FM10K_TLV_ERROR) || !msg_data->func)
+		return FM10K_ERR_PARAM;
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_register_handlers - Register a set of handler ops for mailbox
+ *  @mbx: pointer to mailbox
+ *  @msg_data: handlers for mailbox events
+ *
+ *  This function associates a set of message handling ops with a mailbox.
+ **/
+static s32 fm10k_mbx_register_handlers(struct fm10k_mbx_info *mbx,
+				       const struct fm10k_msg_data *msg_data)
+{
+	/* validate layout of handlers before assigning them */
+	if (fm10k_mbx_validate_handlers(msg_data))
+		return FM10K_ERR_PARAM;
+
+	/* initialize the message handlers */
+	mbx->msg_data = msg_data;
+
+	return 0;
+}
+
+/**
+ *  fm10k_sm_mbx_create_data_hdr - Generate a mailbox header for local FIFO
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns a connection mailbox header
+ **/
+static void fm10k_sm_mbx_create_data_hdr(struct fm10k_mbx_info *mbx)
+{
+	if (mbx->tail_len)
+		mbx->mbx_lock |= FM10K_MBX_REQ;
+
+	mbx->mbx_hdr = FM10K_MSG_HDR_FIELD_SET(mbx->tail, SM_TAIL) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->remote, SM_VER) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->head, SM_HEAD);
+}
+
+/**
+ *  fm10k_sm_mbx_create_connect_hdr - Generate a mailbox header for local FIFO
+ *  @mbx: pointer to mailbox
+ *  @err: error flags to report if any
+ *
+ *  This function returns a connection mailbox header
+ **/
+static void fm10k_sm_mbx_create_connect_hdr(struct fm10k_mbx_info *mbx, u8 err)
+{
+	if (mbx->local)
+		mbx->mbx_lock |= FM10K_MBX_REQ;
+
+	mbx->mbx_hdr = FM10K_MSG_HDR_FIELD_SET(mbx->tail, SM_TAIL) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->remote, SM_VER) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->head, SM_HEAD) |
+		       FM10K_MSG_HDR_FIELD_SET(err, SM_ERR);
+}
+
+/**
+ *  fm10k_sm_mbx_connect_reset - Reset following request for reset
+ *  @mbx: pointer to mailbox
+ *
+ *  This function resets the mailbox to a just connected state
+ **/
+static void fm10k_sm_mbx_connect_reset(struct fm10k_mbx_info *mbx)
+{
+	/* flush any uncompleted work */
+	fm10k_mbx_reset_work(mbx);
+
+	/* set local version to max and remote version to 0 */
+	mbx->local = FM10K_SM_MBX_VERSION;
+	mbx->remote = 0;
+
+	/* initalize tail and head */
+	mbx->tail = 1;
+	mbx->head = 1;
+
+	/* reset state back to connect */
+	mbx->state = FM10K_STATE_CONNECT;
+}
+
+/**
+ *  fm10k_sm_mbx_connect - Start switch manager mailbox connection
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will initiate a mailbox connection with the switch
+ *  manager.  To do this it will first disconnect the mailbox, and then
+ *  reconnect it in order to complete a reset of the mailbox.
+ *
+ *  This function will return an error if the mailbox has not been initiated
+ *  or is currently in use.
+ **/
+static s32 fm10k_sm_mbx_connect(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx)
+{
+	/* we cannot connect an uninitialized mailbox */
+	if (!mbx->rx.buffer)
+		return FM10K_MBX_ERR_NO_SPACE;
+
+	/* we cannot connect an already connected mailbox */
+	if (mbx->state != FM10K_STATE_CLOSED)
+		return FM10K_MBX_ERR_BUSY;
+
+	/* mailbox timeout can now become active */
+	mbx->timeout = FM10K_MBX_INIT_TIMEOUT;
+
+	/* Place mbx in ready to connect state */
+	mbx->state = FM10K_STATE_CONNECT;
+	mbx->max_size = FM10K_MBX_MSG_MAX_SIZE;
+
+	/* reset interface back to connect */
+	fm10k_sm_mbx_connect_reset(mbx);
+
+	/* enable interrupt and notify other party of new message */
+	mbx->mbx_lock = FM10K_MBX_REQ_INTERRUPT | FM10K_MBX_ACK_INTERRUPT |
+			FM10K_MBX_INTERRUPT_ENABLE;
+
+	/* generate and load connect header into mailbox */
+	fm10k_sm_mbx_create_connect_hdr(mbx, 0);
+	fm10k_mbx_write(hw, mbx);
+
+	/* enable interrupt and notify other party of new message */
+
+	return 0;
+}
+
+/**
+ *  fm10k_sm_mbx_disconnect - Shutdown mailbox connection
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will shut down the mailbox.  It places the mailbox first
+ *  in the disconnect state, it then allows up to a predefined timeout for
+ *  the mailbox to transition to close on its own.  If this does not occur
+ *  then the mailbox will be forced into the closed state.
+ *
+ *  Any mailbox transactions not completed before calling this function
+ *  are not guaranteed to complete and may be dropped.
+ **/
+static void fm10k_sm_mbx_disconnect(struct fm10k_hw *hw,
+				    struct fm10k_mbx_info *mbx)
+{
+	int timeout = mbx->timeout ? FM10K_MBX_DISCONNECT_TIMEOUT : 0;
+
+	/* Place mbx in ready to disconnect state */
+	mbx->state = FM10K_STATE_DISCONNECT;
+
+	/* trigger interrupt to start shutdown process */
+	fm10k_write_reg(hw, mbx->mbx_reg, FM10K_MBX_REQ |
+					  FM10K_MBX_INTERRUPT_DISABLE);
+	do {
+		udelay(FM10K_MBX_POLL_DELAY);
+		mbx->ops.process(hw, mbx);
+		timeout -= FM10K_MBX_POLL_DELAY;
+	} while ((timeout > 0) && (mbx->state != FM10K_STATE_CLOSED));
+
+	/* in case we didn't close just force the mailbox into shutdown */
+	mbx->state = FM10K_STATE_CLOSED;
+	mbx->remote = 0;
+	fm10k_mbx_reset_work(mbx);
+	fm10k_mbx_update_max_size(mbx, 0);
+
+	fm10k_write_reg(hw, mbx->mbmem_reg, 0);
+}
+
+/**
+ *  fm10k_mbx_validate_fifo_hdr - Validate fields in the remote FIFO header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will parse up the fields in the mailbox header and return
+ *  an error if the header contains any of a number of invalid configurations
+ *  including unrecognized offsets or version numbers.
+ **/
+static s32 fm10k_sm_mbx_validate_fifo_hdr(struct fm10k_mbx_info *mbx)
+{
+	const u32 *hdr = &mbx->mbx_hdr;
+	u16 tail, head, ver;
+
+	tail = FM10K_MSG_HDR_FIELD_GET(*hdr, SM_TAIL);
+	ver = FM10K_MSG_HDR_FIELD_GET(*hdr, SM_VER);
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, SM_HEAD);
+
+	switch (ver) {
+	case 0:
+		break;
+	case FM10K_SM_MBX_VERSION:
+		if (!head || head > FM10K_SM_MBX_FIFO_LEN)
+			return FM10K_MBX_ERR_HEAD;
+		if (!tail || tail > FM10K_SM_MBX_FIFO_LEN)
+			return FM10K_MBX_ERR_TAIL;
+		if (mbx->tail < head)
+			head += mbx->mbmem_len - 1;
+		if (tail < mbx->head)
+			tail += mbx->mbmem_len - 1;
+		if (fm10k_mbx_index_len(mbx, head, mbx->tail) > mbx->tail_len)
+			return FM10K_MBX_ERR_HEAD;
+		if (fm10k_mbx_index_len(mbx, mbx->head, tail) < mbx->mbmem_len)
+			break;
+		return FM10K_MBX_ERR_TAIL;
+	default:
+		return FM10K_MBX_ERR_SRC;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_sm_mbx_process_error - Process header with error flag set
+ *  @mbx: pointer to mailbox
+ *
+ *  This function is meant to respond to a request where the error flag
+ *  is set.  As a result we will terminate a connection if one is present
+ *  and fall back into the reset state with a connection header of version
+ *  0 (RESET).
+ **/
+static void fm10k_sm_mbx_process_error(struct fm10k_mbx_info *mbx)
+{
+	const enum fm10k_mbx_state state = mbx->state;
+
+	switch (state) {
+	case FM10K_STATE_DISCONNECT:
+		/* if there is an error just disconnect */
+		mbx->remote = 0;
+		break;
+	case FM10K_STATE_OPEN:
+		/* flush any uncompleted work */
+		fm10k_sm_mbx_connect_reset(mbx);
+		break;
+	case FM10K_STATE_CONNECT:
+		/* try connnecting at lower version */
+		if (mbx->remote) {
+			while (mbx->local > 1)
+				mbx->local--;
+			mbx->remote = 0;
+		}
+		break;
+	default:
+		break;
+	}
+
+	fm10k_sm_mbx_create_connect_hdr(mbx, 0);
+}
+
+/**
+ *  fm10k_sm_mbx_create_error_message - Process an error in FIFO hdr
+ *  @mbx: pointer to mailbox
+ *  @err: local error encountered
+ *
+ *  This function will interpret the error provided by err, and based on
+ *  that it may set the error bit in the local message header
+ **/
+static void fm10k_sm_mbx_create_error_msg(struct fm10k_mbx_info *mbx, s32 err)
+{
+	/* only generate an error message for these types */
+	switch (err) {
+	case FM10K_MBX_ERR_TAIL:
+	case FM10K_MBX_ERR_HEAD:
+	case FM10K_MBX_ERR_SRC:
+	case FM10K_MBX_ERR_SIZE:
+	case FM10K_MBX_ERR_RSVD0:
+		break;
+	default:
+		return;
+	}
+
+	/* process it as though we received an error, and send error reply */
+	fm10k_sm_mbx_process_error(mbx);
+	fm10k_sm_mbx_create_connect_hdr(mbx, 1);
+}
+
+/**
+ *  fm10k_sm_mbx_receive - Take message from Rx mailbox FIFO and put it in Rx
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will dequeue one message from the Rx switch manager mailbox
+ *  FIFO and place it in the Rx mailbox FIFO for processing by software.
+ **/
+static s32 fm10k_sm_mbx_receive(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx,
+				u16 tail)
+{
+	/* reduce length by 1 to convert to a mask */
+	u16 mbmem_len = mbx->mbmem_len - 1;
+	s32 err;
+
+	/* push tail in front of head */
+	if (tail < mbx->head)
+		tail += mbmem_len;
+
+	/* copy data to the Rx FIFO */
+	err = fm10k_mbx_push_tail(hw, mbx, tail);
+	if (err < 0)
+		return err;
+
+	/* process messages if we have received any */
+	fm10k_mbx_dequeue_rx(hw, mbx);
+
+	/* guarantee head aligns with the end of the last message */
+	mbx->head = fm10k_mbx_head_sub(mbx, mbx->pushed);
+	mbx->pushed = 0;
+
+	/* clear any extra bits left over since index adds 1 extra bit */
+	if (mbx->head > mbmem_len)
+		mbx->head -= mbmem_len;
+
+	return err;
+}
+
+/**
+ *  fm10k_sm_mbx_transmit - Take message from Tx and put it in Tx mailbox FIFO
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will dequeue one message from the Tx mailbox FIFO and place
+ *  it in the Tx switch manager mailbox FIFO for processing by hardware.
+ **/
+static void fm10k_sm_mbx_transmit(struct fm10k_hw *hw,
+				  struct fm10k_mbx_info *mbx, u16 head)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->tx;
+	/* reduce length by 1 to convert to a mask */
+	u16 mbmem_len = mbx->mbmem_len - 1;
+	u16 tail_len, len = 0;
+	u32 *msg;
+
+	/* push head behind tail */
+	if (mbx->tail < head)
+		head += mbmem_len;
+
+	fm10k_mbx_pull_head(hw, mbx, head);
+
+	/* determine msg aligned offset for end of buffer */
+	do {
+		msg = fifo->buffer + fm10k_fifo_head_offset(fifo, len);
+		tail_len = len;
+		len += FM10K_TLV_DWORD_LEN(*msg);
+	} while ((len <= mbx->tail_len) && (len < mbmem_len));
+
+	/* guarantee we stop on a message boundary */
+	if (mbx->tail_len > tail_len) {
+		mbx->tail = fm10k_mbx_tail_sub(mbx, mbx->tail_len - tail_len);
+		mbx->tail_len = tail_len;
+	}
+
+	/* clear any extra bits left over since index adds 1 extra bit */
+	if (mbx->tail > mbmem_len)
+		mbx->tail -= mbmem_len;
+}
+
+/**
+ *  fm10k_sm_mbx_create_reply - Generate reply based on state and remote head
+ *  @mbx: pointer to mailbox
+ *  @head: acknowledgement number
+ *
+ *  This function will generate an outgoing message based on the current
+ *  mailbox state and the remote fifo head.  It will return the length
+ *  of the outgoing message excluding header on success, and a negative value
+ *  on error.
+ **/
+static void fm10k_sm_mbx_create_reply(struct fm10k_hw *hw,
+				      struct fm10k_mbx_info *mbx, u16 head)
+{
+	switch (mbx->state) {
+	case FM10K_STATE_OPEN:
+	case FM10K_STATE_DISCONNECT:
+		/* flush out Tx data */
+		fm10k_sm_mbx_transmit(hw, mbx, head);
+
+		/* generate new header based on data */
+		if (mbx->tail_len || (mbx->state == FM10K_STATE_OPEN)) {
+			fm10k_sm_mbx_create_data_hdr(mbx);
+		} else {
+			mbx->remote = 0;
+			fm10k_sm_mbx_create_connect_hdr(mbx, 0);
+		}
+		break;
+	case FM10K_STATE_CONNECT:
+	case FM10K_STATE_CLOSED:
+		fm10k_sm_mbx_create_connect_hdr(mbx, 0);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ *  fm10k_sm_mbx_process_reset - Process header with version == 0 (RESET)
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function is meant to respond to a request where the version data
+ *  is set to 0.  As such we will either terminate the connection or go
+ *  into the connect state in order to re-establish the connection.  This
+ *  function can also be used to respond to an error as the connection
+ *  resetting would also be a means of dealing with errors.
+ **/
+static void fm10k_sm_mbx_process_reset(struct fm10k_hw *hw,
+				       struct fm10k_mbx_info *mbx)
+{
+	const enum fm10k_mbx_state state = mbx->state;
+
+	switch (state) {
+	case FM10K_STATE_DISCONNECT:
+		/* drop remote connections and disconnect */
+		mbx->state = FM10K_STATE_CLOSED;
+		mbx->remote = 0;
+		mbx->local = 0;
+		break;
+	case FM10K_STATE_OPEN:
+		/* flush any incomplete work */
+		fm10k_sm_mbx_connect_reset(mbx);
+		break;
+	case FM10K_STATE_CONNECT:
+		/* Update remote value to match local value */
+		mbx->remote = mbx->local;
+	default:
+		break;
+	}
+
+	fm10k_sm_mbx_create_reply(hw, mbx, mbx->tail);
+}
+
+/**
+ *  fm10k_sm_mbx_process_version_1 - Process header with version == 1
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function is meant to process messages received when the remote
+ *  mailbox is active.
+ **/
+static s32 fm10k_sm_mbx_process_version_1(struct fm10k_hw *hw,
+					  struct fm10k_mbx_info *mbx)
+{
+	const u32 *hdr = &mbx->mbx_hdr;
+	u16 head, tail;
+	s32 len;
+
+	/* pull all fields needed for verification */
+	tail = FM10K_MSG_HDR_FIELD_GET(*hdr, SM_TAIL);
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, SM_HEAD);
+
+	/* if we are in connect and wanting version 1 then start up and go */
+	if (mbx->state == FM10K_STATE_CONNECT) {
+		if (!mbx->remote)
+			goto send_reply;
+		if (mbx->remote != 1)
+			return FM10K_MBX_ERR_SRC;
+
+		mbx->state = FM10K_STATE_OPEN;
+	}
+
+	do {
+		/* abort on message size errors */
+		len = fm10k_sm_mbx_receive(hw, mbx, tail);
+		if (len < 0)
+			return len;
+
+		/* continue until we have flushed the Rx FIFO */
+	} while (len);
+
+send_reply:
+	fm10k_sm_mbx_create_reply(hw, mbx, head);
+
+	return 0;
+}
+
+/**
+ *  fm10k_sm_mbx_process - Process mailbox switch mailbox interrupt
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will process incoming mailbox events and generate mailbox
+ *  replies.  It will return a value indicating the number of DWORDs
+ *  transmitted excluding header on success or a negative value on error.
+ **/
+static s32 fm10k_sm_mbx_process(struct fm10k_hw *hw,
+				struct fm10k_mbx_info *mbx)
+{
+	s32 err;
+
+	/* we do not read mailbox if closed */
+	if (mbx->state == FM10K_STATE_CLOSED)
+		return 0;
+
+	/* retrieve data from switch manager */
+	err = fm10k_mbx_read(hw, mbx);
+	if (err)
+		return err;
+
+	err = fm10k_sm_mbx_validate_fifo_hdr(mbx);
+	if (err < 0)
+		goto fifo_err;
+
+	if (FM10K_MSG_HDR_FIELD_GET(mbx->mbx_hdr, SM_ERR)) {
+		fm10k_sm_mbx_process_error(mbx);
+		goto fifo_err;
+	}
+
+	switch (FM10K_MSG_HDR_FIELD_GET(mbx->mbx_hdr, SM_VER)) {
+	case 0:
+		fm10k_sm_mbx_process_reset(hw, mbx);
+		break;
+	case FM10K_SM_MBX_VERSION:
+		err = fm10k_sm_mbx_process_version_1(hw, mbx);
+		break;
+	}
+
+fifo_err:
+	if (err < 0)
+		fm10k_sm_mbx_create_error_msg(mbx, err);
+
+	/* report data to switch manager */
+	fm10k_mbx_write(hw, mbx);
+
+	return err;
+}
+
+/**
+ *  fm10k_sm_mbx_init - Initialize mailbox memory for PF/SM mailbox
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *  @msg_data: handlers for mailbox events
+ *
+ *  This function for now is used to stub out the PF/SM mailbox
+ **/
+s32 fm10k_sm_mbx_init(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx,
+		      const struct fm10k_msg_data *msg_data)
+{
+	mbx->mbx_reg = FM10K_GMBX;
+	mbx->mbmem_reg = FM10K_MBMEM_PF(0);
+	/* start out in closed state */
+	mbx->state = FM10K_STATE_CLOSED;
+
+	/* validate layout of handlers before assigning them */
+	if (fm10k_mbx_validate_handlers(msg_data))
+		return FM10K_ERR_PARAM;
+
+	/* initialize the message handlers */
+	mbx->msg_data = msg_data;
+
+	/* start mailbox as timed out and let the reset_hw call
+	 * set the timeout value to begin communications
+	 */
+	mbx->timeout = 0;
+	mbx->udelay = FM10K_MBX_INIT_DELAY;
+
+	/* Split buffer for use by Tx/Rx FIFOs */
+	mbx->max_size = FM10K_MBX_MSG_MAX_SIZE;
+	mbx->mbmem_len = FM10K_MBMEM_PF_XOR;
+
+	/* initialize the FIFOs, sizes are in 4 byte increments */
+	fm10k_fifo_init(&mbx->tx, mbx->buffer, FM10K_MBX_TX_BUFFER_SIZE);
+	fm10k_fifo_init(&mbx->rx, &mbx->buffer[FM10K_MBX_TX_BUFFER_SIZE],
+			FM10K_MBX_RX_BUFFER_SIZE);
+
+	/* initialize function pointers */
+	mbx->ops.connect = fm10k_sm_mbx_connect;
+	mbx->ops.disconnect = fm10k_sm_mbx_disconnect;
+	mbx->ops.rx_ready = fm10k_mbx_rx_ready;
+	mbx->ops.tx_ready = fm10k_mbx_tx_ready;
+	mbx->ops.tx_complete = fm10k_mbx_tx_complete;
+	mbx->ops.enqueue_tx = fm10k_mbx_enqueue_tx;
+	mbx->ops.process = fm10k_sm_mbx_process;
+	mbx->ops.register_handlers = fm10k_mbx_register_handlers;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
index ee2f712..f5ba86f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
@@ -135,7 +135,6 @@ enum fm10k_mbx_state {
 /* version number for switch manager mailboxes */
 #define FM10K_SM_MBX_VERSION		1
 #define FM10K_SM_MBX_FIFO_LEN		(FM10K_MBMEM_PF_XOR - 1)
-#define FM10K_SM_MBX_FIFO_HDR_LEN	1
 
 /* offsets shared between all SM FIFO headers */
 #define FM10K_MSG_SM_TAIL_SHIFT			0
@@ -238,4 +237,7 @@ struct fm10k_mbx_info {
 	u32 buffer[FM10K_MBX_BUFFER_SIZE];
 };
 
+s32 fm10k_sm_mbx_init(struct fm10k_hw *, struct fm10k_mbx_info *,
+		      const struct fm10k_msg_data *);
+
 #endif /* _FM10K_MBX_H_ */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 07/29] fm10k: Add support for PF
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (5 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 06/29] fm10k: Implement PF <-> SM mailbox operations Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 08/29] fm10k: Add support for configuring PF interface Alexander Duyck
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds basic support for the PF.  With this it is possible to
bring up the interface, but without being able to configure any of the
filters on the interface itself.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile       |    3 
 drivers/net/ethernet/intel/fm10k/fm10k_common.c |  534 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_common.h |   13 +
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c     |  392 +++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h     |   28 +
 5 files changed, 969 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.h

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index 6642611..0904bcb 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -27,5 +27,6 @@
 
 obj-$(CONFIG_FM10K) += fm10k.o
 
-fm10k-objs := fm10k_main.o fm10k_pci.o \
+fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
+	      fm10k_pf.o \
 	      fm10k_mbx.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.c b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
new file mode 100644
index 0000000..bf19dcc
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.c
@@ -0,0 +1,534 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k_common.h"
+
+/**
+ *  fm10k_get_bus_info_generic - Generic set PCI bus info
+ *  @hw: pointer to hardware structure
+ *
+ *  Gets the PCI bus info (speed, width, type) then calls helper function to
+ *  store this data within the fm10k_hw structure.
+ **/
+s32 fm10k_get_bus_info_generic(struct fm10k_hw *hw)
+{
+	u16 link_cap, link_status, device_cap, device_control;
+
+	/* Get the maximum link width and speed from PCIe config space */
+	link_cap = fm10k_read_pci_cfg_word(hw, FM10K_PCIE_LINK_CAP);
+
+	switch (link_cap & FM10K_PCIE_LINK_WIDTH) {
+	case FM10K_PCIE_LINK_WIDTH_1:
+		hw->bus_caps.width = fm10k_bus_width_pcie_x1;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_2:
+		hw->bus_caps.width = fm10k_bus_width_pcie_x2;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_4:
+		hw->bus_caps.width = fm10k_bus_width_pcie_x4;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_8:
+		hw->bus_caps.width = fm10k_bus_width_pcie_x8;
+		break;
+	default:
+		hw->bus_caps.width = fm10k_bus_width_unknown;
+		break;
+	}
+
+	switch (link_cap & FM10K_PCIE_LINK_SPEED) {
+	case FM10K_PCIE_LINK_SPEED_2500:
+		hw->bus_caps.speed = fm10k_bus_speed_2500;
+		break;
+	case FM10K_PCIE_LINK_SPEED_5000:
+		hw->bus_caps.speed = fm10k_bus_speed_5000;
+		break;
+	case FM10K_PCIE_LINK_SPEED_8000:
+		hw->bus_caps.speed = fm10k_bus_speed_8000;
+		break;
+	default:
+		hw->bus_caps.speed = fm10k_bus_speed_unknown;
+		break;
+	}
+
+	/* Get the PCIe maximum payload size for the PCIe function */
+	device_cap = fm10k_read_pci_cfg_word(hw, FM10K_PCIE_DEV_CAP);
+
+	switch (device_cap & FM10K_PCIE_DEV_CAP_PAYLOAD) {
+	case FM10K_PCIE_DEV_CAP_PAYLOAD_128:
+		hw->bus_caps.payload = fm10k_bus_payload_128;
+		break;
+	case FM10K_PCIE_DEV_CAP_PAYLOAD_256:
+		hw->bus_caps.payload = fm10k_bus_payload_256;
+		break;
+	case FM10K_PCIE_DEV_CAP_PAYLOAD_512:
+		hw->bus_caps.payload = fm10k_bus_payload_512;
+		break;
+	default:
+		hw->bus_caps.payload = fm10k_bus_payload_unknown;
+		break;
+	}
+
+	/* Get the negotiated link width and speed from PCIe config space */
+	link_status = fm10k_read_pci_cfg_word(hw, FM10K_PCIE_LINK_STATUS);
+
+	switch (link_status & FM10K_PCIE_LINK_WIDTH) {
+	case FM10K_PCIE_LINK_WIDTH_1:
+		hw->bus.width = fm10k_bus_width_pcie_x1;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_2:
+		hw->bus.width = fm10k_bus_width_pcie_x2;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_4:
+		hw->bus.width = fm10k_bus_width_pcie_x4;
+		break;
+	case FM10K_PCIE_LINK_WIDTH_8:
+		hw->bus.width = fm10k_bus_width_pcie_x8;
+		break;
+	default:
+		hw->bus.width = fm10k_bus_width_unknown;
+		break;
+	}
+
+	switch (link_status & FM10K_PCIE_LINK_SPEED) {
+	case FM10K_PCIE_LINK_SPEED_2500:
+		hw->bus.speed = fm10k_bus_speed_2500;
+		break;
+	case FM10K_PCIE_LINK_SPEED_5000:
+		hw->bus.speed = fm10k_bus_speed_5000;
+		break;
+	case FM10K_PCIE_LINK_SPEED_8000:
+		hw->bus.speed = fm10k_bus_speed_8000;
+		break;
+	default:
+		hw->bus.speed = fm10k_bus_speed_unknown;
+		break;
+	}
+
+	/* Get the negotiated PCIe maximum payload size for the PCIe function */
+	device_control = fm10k_read_pci_cfg_word(hw, FM10K_PCIE_DEV_CTRL);
+
+	switch (device_control & FM10K_PCIE_DEV_CTRL_PAYLOAD) {
+	case FM10K_PCIE_DEV_CTRL_PAYLOAD_128:
+		hw->bus.payload = fm10k_bus_payload_128;
+		break;
+	case FM10K_PCIE_DEV_CTRL_PAYLOAD_256:
+		hw->bus.payload = fm10k_bus_payload_256;
+		break;
+	case FM10K_PCIE_DEV_CTRL_PAYLOAD_512:
+		hw->bus.payload = fm10k_bus_payload_512;
+		break;
+	default:
+		hw->bus.payload = fm10k_bus_payload_unknown;
+		break;
+	}
+
+	return 0;
+}
+
+static u16 fm10k_get_pcie_msix_count_generic(struct fm10k_hw *hw)
+{
+	u16 msix_count;
+
+	/* read in value from MSI-X capability register */
+	msix_count = fm10k_read_pci_cfg_word(hw, FM10K_PCI_MSIX_MSG_CTRL);
+	msix_count &= FM10K_PCI_MSIX_MSG_CTRL_TBL_SZ_MASK;
+
+	/* MSI-X count is zero-based in HW */
+	msix_count++;
+
+	if (msix_count > FM10K_MAX_MSIX_VECTORS)
+		msix_count = FM10K_MAX_MSIX_VECTORS;
+
+	return msix_count;
+}
+
+/**
+ *  fm10k_get_invariants_generic - Inits constant values
+ *  @hw: pointer to the hardware structure
+ *
+ *  Initialize the common invariants for the device.
+ **/
+s32 fm10k_get_invariants_generic(struct fm10k_hw *hw)
+{
+	struct fm10k_mac_info *mac = &hw->mac;
+
+	/* initialize GLORT state to avoid any false hits */
+	mac->dglort_map = FM10K_DGLORTMAP_NONE;
+
+	/* record maximum number of MSI-X vectors */
+	mac->max_msix_vectors = fm10k_get_pcie_msix_count_generic(hw);
+
+	return 0;
+}
+
+/**
+ *  fm10k_start_hw_generic - Prepare hardware for Tx/Rx
+ *  @hw: pointer to hardware structure
+ *
+ *  This function sets the Tx ready flag to indicate that the Tx path has
+ *  been initialized.
+ **/
+s32 fm10k_start_hw_generic(struct fm10k_hw *hw)
+{
+	/* set flag indicating we are beginning Tx */
+	hw->mac.tx_ready = true;
+
+	return 0;
+}
+
+/**
+ *  fm10k_disable_queues_generic - Stop Tx/Rx queues
+ *  @hw: pointer to hardware structure
+ *  @q_cnt: number of queues to be disabled
+ *
+ **/
+s32 fm10k_disable_queues_generic(struct fm10k_hw *hw, u16 q_cnt)
+{
+	u32 reg;
+	u16 i, time;
+
+	/* clear tx_ready to prevent any false hits for reset */
+	hw->mac.tx_ready = false;
+
+	/* clear the enable bit for all rings */
+	for (i = 0; i < q_cnt; i++) {
+		reg = fm10k_read_reg(hw, FM10K_TXDCTL(i));
+		fm10k_write_reg(hw, FM10K_TXDCTL(i),
+				reg & ~FM10K_TXDCTL_ENABLE);
+		reg = fm10k_read_reg(hw, FM10K_RXQCTL(i));
+		fm10k_write_reg(hw, FM10K_RXQCTL(i),
+				reg & ~FM10K_RXQCTL_ENABLE);
+	}
+
+	fm10k_write_flush(hw);
+	udelay(1);
+
+	/* loop through all queues to verify that they are all disabled */
+	for (i = 0, time = FM10K_QUEUE_DISABLE_TIMEOUT; time;) {
+		/* if we are at end of rings all rings are disabled */
+		if (i == q_cnt)
+			return 0;
+
+		/* if queue enables cleared, then move to next ring pair */
+		reg = fm10k_read_reg(hw, FM10K_TXDCTL(i));
+		if (!~reg || !(reg & FM10K_TXDCTL_ENABLE)) {
+			reg = fm10k_read_reg(hw, FM10K_RXQCTL(i));
+			if (!~reg || !(reg & FM10K_RXQCTL_ENABLE)) {
+				i++;
+				continue;
+			}
+		}
+
+		/* decrement time and wait 1 usec */
+		time--;
+		if (time)
+			udelay(1);
+	}
+
+	return FM10K_ERR_REQUESTS_PENDING;
+}
+
+/**
+ *  fm10k_stop_hw_generic - Stop Tx/Rx units
+ *  @hw: pointer to hardware structure
+ *
+ **/
+s32 fm10k_stop_hw_generic(struct fm10k_hw *hw)
+{
+	return fm10k_disable_queues_generic(hw, hw->mac.max_queues);
+}
+
+/**
+ *  fm10k_read_hw_stats_32b - Reads value of 32-bit registers
+ *  @hw: pointer to the hardware structure
+ *  @addr: address of register containing a 32-bit value
+ *
+ *  Function reads the content of the register and returns the delta
+ *  between the base and the current value.
+ *  **/
+u32 fm10k_read_hw_stats_32b(struct fm10k_hw *hw, u32 addr,
+			    struct fm10k_hw_stat *stat)
+{
+	u32 delta = fm10k_read_reg(hw, addr) - stat->base_l;
+
+	if (FM10K_REMOVED(hw->hw_addr))
+		stat->base_h = 0;
+
+	return delta;
+}
+
+/**
+ *  fm10k_read_hw_stats_48b - Reads value of 48-bit registers
+ *  @hw: pointer to the hardware structure
+ *  @addr: address of register containing the lower 32-bit value
+ *
+ *  Function reads the content of 2 registers, combined to represent a 48-bit
+ *  statistical value. Extra processing is required to handle overflowing.
+ *  Finally, a delta value is returned representing the difference between the
+ *  values stored in registers and values stored in the statistic counters.
+ *  **/
+static u64 fm10k_read_hw_stats_48b(struct fm10k_hw *hw, u32 addr,
+				   struct fm10k_hw_stat *stat)
+{
+	u32 count_l;
+	u32 count_h;
+	u32 count_tmp;
+	u64 delta;
+
+	count_h = fm10k_read_reg(hw, addr + 1);
+
+	/* Check for overflow */
+	do {
+		count_tmp = count_h;
+		count_l = fm10k_read_reg(hw, addr);
+		count_h = fm10k_read_reg(hw, addr + 1);
+	} while (count_h != count_tmp);
+
+	delta = ((u64)(count_h - stat->base_h) << 32) + count_l;
+	delta -= stat->base_l;
+
+	return delta & FM10K_48_BIT_MASK;
+}
+
+/**
+ *  fm10k_update_hw_base_48b - Updates 48-bit statistic base value
+ *  @stat: pointer to the hardware statistic structure
+ *  @delta: value to be updated into the hardware statistic structure
+ *
+ *  Function receives a value and determines if an update is required based on
+ *  a delta calculation. Only the base value will be updated.
+ **/
+static void fm10k_update_hw_base_48b(struct fm10k_hw_stat *stat, u64 delta)
+{
+	if (!delta)
+		return;
+
+	/* update lower 32 bits */
+	delta += stat->base_l;
+	stat->base_l = (u32)delta;
+
+	/* update upper 32 bits */
+	stat->base_h += (u32)(delta >> 32);
+}
+
+/**
+ *  fm10k_update_hw_stats_tx_q - Updates TX queue statistics counters
+ *  @hw: pointer to the hardware structure
+ *  @q: pointer to the ring of hardware statistics queue
+ *  @idx: index pointing to the start of the ring iteration
+ *
+ *  Function updates the TX queue statistics counters that are related to the
+ *  hardware.
+ **/
+static void fm10k_update_hw_stats_tx_q(struct fm10k_hw *hw,
+				       struct fm10k_hw_stats_q *q,
+				       u32 idx)
+{
+	u32 id_tx, id_tx_prev, tx_packets;
+	u64 tx_bytes = 0;
+
+	/* Retrieve TX Owner Data */
+	id_tx = fm10k_read_reg(hw, FM10K_TXQCTL(idx));
+
+	/* Process TX Ring */
+	do {
+		tx_packets = fm10k_read_hw_stats_32b(hw, FM10K_QPTC(idx),
+						     &q->tx_packets);
+
+		if (tx_packets)
+			tx_bytes = fm10k_read_hw_stats_48b(hw,
+							   FM10K_QBTC_L(idx),
+							   &q->tx_bytes);
+
+		/* Re-Check Owner Data */
+		id_tx_prev = id_tx;
+		id_tx = fm10k_read_reg(hw, FM10K_TXQCTL(idx));
+	} while ((id_tx ^ id_tx_prev) & FM10K_TXQCTL_ID_MASK);
+
+	/* drop non-ID bits and set VALID ID bit */
+	id_tx &= FM10K_TXQCTL_ID_MASK;
+	id_tx |= FM10K_STAT_VALID;
+
+	/* update packet counts */
+	if (q->tx_stats_idx == id_tx) {
+		q->tx_packets.count += tx_packets;
+		q->tx_bytes.count += tx_bytes;
+	}
+
+	/* update bases and record ID */
+	fm10k_update_hw_base_32b(&q->tx_packets, tx_packets);
+	fm10k_update_hw_base_48b(&q->tx_bytes, tx_bytes);
+
+	q->tx_stats_idx = id_tx;
+}
+
+/**
+ *  fm10k_update_hw_stats_rx_q - Updates RX queue statistics counters
+ *  @hw: pointer to the hardware structure
+ *  @q: pointer to the ring of hardware statistics queue
+ *  @idx: index pointing to the start of the ring iteration
+ *
+ *  Function updates the RX queue statistics counters that are related to the
+ *  hardware.
+ **/
+static void fm10k_update_hw_stats_rx_q(struct fm10k_hw *hw,
+				       struct fm10k_hw_stats_q *q,
+				       u32 idx)
+{
+	u32 id_rx, id_rx_prev, rx_packets, rx_drops;
+	u64 rx_bytes = 0;
+
+	/* Retrieve RX Owner Data */
+	id_rx = fm10k_read_reg(hw, FM10K_RXQCTL(idx));
+
+	/* Process RX Ring*/
+	do {
+		rx_drops = fm10k_read_hw_stats_32b(hw, FM10K_QPRDC(idx),
+						   &q->rx_drops);
+
+		rx_packets = fm10k_read_hw_stats_32b(hw, FM10K_QPRC(idx),
+						     &q->rx_packets);
+
+		if (rx_packets)
+			rx_bytes = fm10k_read_hw_stats_48b(hw,
+							   FM10K_QBRC_L(idx),
+							   &q->rx_bytes);
+
+		/* Re-Check Owner Data */
+		id_rx_prev = id_rx;
+		id_rx = fm10k_read_reg(hw, FM10K_RXQCTL(idx));
+	} while ((id_rx ^ id_rx_prev) & FM10K_RXQCTL_ID_MASK);
+
+	/* drop non-ID bits and set VALID ID bit */
+	id_rx &= FM10K_RXQCTL_ID_MASK;
+	id_rx |= FM10K_STAT_VALID;
+
+	/* update packet counts */
+	if (q->rx_stats_idx == id_rx) {
+		q->rx_drops.count += rx_drops;
+		q->rx_packets.count += rx_packets;
+		q->rx_bytes.count += rx_bytes;
+	}
+
+	/* update bases and record ID */
+	fm10k_update_hw_base_32b(&q->rx_drops, rx_drops);
+	fm10k_update_hw_base_32b(&q->rx_packets, rx_packets);
+	fm10k_update_hw_base_48b(&q->rx_bytes, rx_bytes);
+
+	q->rx_stats_idx = id_rx;
+}
+
+/**
+ *  fm10k_update_hw_stats_q - Updates queue statistics counters
+ *  @hw: pointer to the hardware structure
+ *  @q: pointer to the ring of hardware statistics queue
+ *  @idx: index pointing to the start of the ring iteration
+ *  @count: number of queues to iterate over
+ *
+ *  Function updates the queue statistics counters that are related to the
+ *  hardware.
+ **/
+void fm10k_update_hw_stats_q(struct fm10k_hw *hw, struct fm10k_hw_stats_q *q,
+			     u32 idx, u32 count)
+{
+	u32 i;
+
+	for (i = 0; i < count; i++, idx++, q++) {
+		fm10k_update_hw_stats_tx_q(hw, q, idx);
+		fm10k_update_hw_stats_rx_q(hw, q, idx);
+	}
+}
+
+/**
+ *  fm10k_unbind_hw_stats_q - Unbind the queue counters from their queues
+ *  @hw: pointer to the hardware structure
+ *  @q: pointer to the ring of hardware statistics queue
+ *  @idx: index pointing to the start of the ring iteration
+ *  @count: number of queues to iterate over
+ *
+ *  Function invalidates the index values for the queues so any updates that
+ *  may have happened are ignored and the base for the queue stats is reset.
+ **/
+
+void fm10k_unbind_hw_stats_q(struct fm10k_hw_stats_q *q, u32 idx, u32 count)
+{
+	u32 i;
+
+	for (i = 0; i < count; i++, idx++, q++) {
+		q->rx_stats_idx = 0;
+		q->tx_stats_idx = 0;
+	}
+}
+
+/**
+ *  fm10k_get_host_state_generic - Returns the state of the host
+ *  @hw: pointer to hardware structure
+ *  @host_ready: pointer to boolean value that will record host state
+ *
+ *  This function will check the health of the mailbox and Tx queue 0
+ *  in order to determine if we should report that the link is up or not.
+ **/
+s32 fm10k_get_host_state_generic(struct fm10k_hw *hw, bool *host_ready)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	struct fm10k_mac_info *mac = &hw->mac;
+	s32 ret_val = 0;
+	u32 txdctl = fm10k_read_reg(hw, FM10K_TXDCTL(0));
+
+	/* process upstream mailbox in case interrupts were disabled */
+	mbx->ops.process(hw, mbx);
+
+	/* If Tx is no longer enabled link should come down */
+	if (!(~txdctl) || !(txdctl & FM10K_TXDCTL_ENABLE))
+		mac->get_host_state = true;
+
+	/* exit if not checking for link, or link cannot be changed */
+	if (!mac->get_host_state || !(~txdctl))
+		goto out;
+
+	/* if we somehow dropped the Tx enable we should reset */
+	if (hw->mac.tx_ready && !(txdctl & FM10K_TXDCTL_ENABLE)) {
+		ret_val = FM10K_ERR_RESET_REQUESTED;
+		goto out;
+	}
+
+	/* if Mailbox timed out we should request reset */
+	if (!mbx->timeout) {
+		ret_val = FM10K_ERR_RESET_REQUESTED;
+		goto out;
+	}
+
+	/* verify Mailbox is still valid */
+	if (!mbx->ops.tx_ready(mbx, FM10K_VFMBX_MSG_MTU))
+		goto out;
+
+	/* interface cannot receive traffic without logical ports */
+	if (mac->dglort_map == FM10K_DGLORTMAP_NONE)
+		goto out;
+
+	/* if we passed all the tests above then the switch is ready and we no
+	 * longer need to check for link
+	 */
+	mac->get_host_state = false;
+
+out:
+	*host_ready = !mac->get_host_state;
+	return ret_val;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.h b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
index 2b214a6..8250e14 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_common.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
@@ -41,4 +41,17 @@ do { \
 
 /* read ctrl register which has no clear on read fields as PCIe flush */
 #define fm10k_write_flush(hw) fm10k_read_reg((hw), FM10K_CTRL)
+s32 fm10k_get_bus_info_generic(struct fm10k_hw *hw);
+s32 fm10k_get_invariants_generic(struct fm10k_hw *hw);
+s32 fm10k_disable_queues_generic(struct fm10k_hw *hw, u16 q_cnt);
+s32 fm10k_start_hw_generic(struct fm10k_hw *hw);
+s32 fm10k_stop_hw_generic(struct fm10k_hw *hw);
+u32 fm10k_read_hw_stats_32b(struct fm10k_hw *hw, u32 addr,
+			    struct fm10k_hw_stat *stat);
+#define fm10k_update_hw_base_32b(stat, delta) ((stat)->base_l += (delta))
+void fm10k_update_hw_stats_q(struct fm10k_hw *hw, struct fm10k_hw_stats_q *q,
+			     u32 idx, u32 count);
+#define fm10k_unbind_hw_stats_32b(s) ((s)->base_h = 0)
+void fm10k_unbind_hw_stats_q(struct fm10k_hw_stats_q *q, u32 idx, u32 count);
+s32 fm10k_get_host_state_generic(struct fm10k_hw *hw, bool *host_ready);
 #endif /* _FM10K_COMMON_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
new file mode 100644
index 0000000..b798cd5
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -0,0 +1,392 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k_pf.h"
+
+/**
+ *  fm10k_reset_hw_pf - PF hardware reset
+ *  @hw: pointer to hardware structure
+ *
+ *  This function should return the hardware to a state similar to the
+ *  one it is in after being powered on.
+ **/
+static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
+{
+	s32 err;
+	u32 reg;
+	u16 i;
+
+	/* Disable interrupts */
+	fm10k_write_reg(hw, FM10K_EIMR, FM10K_EIMR_DISABLE(ALL));
+
+	/* Lock ITR2 reg 0 into itself and disable interrupt moderation */
+	fm10k_write_reg(hw, FM10K_ITR2(0), 0);
+	fm10k_write_reg(hw, FM10K_INT_CTRL, 0);
+
+	/* We assume here Tx and Rx queue 0 are owned by the PF */
+
+	/* Shut off VF access to their queues forcing them to queue 0 */
+	for (i = 0; i < FM10K_TQMAP_TABLE_SIZE; i++) {
+		fm10k_write_reg(hw, FM10K_TQMAP(i), 0);
+		fm10k_write_reg(hw, FM10K_RQMAP(i), 0);
+	}
+
+	/* shut down all rings */
+	err = fm10k_disable_queues_generic(hw, FM10K_MAX_QUEUES);
+	if (err)
+		return err;
+
+	/* Verify that DMA is no longer active */
+	reg = fm10k_read_reg(hw, FM10K_DMA_CTRL);
+	if (reg & (FM10K_DMA_CTRL_TX_ACTIVE | FM10K_DMA_CTRL_RX_ACTIVE))
+		return FM10K_ERR_DMA_PENDING;
+
+	/* Inititate data path reset */
+	reg |= FM10K_DMA_CTRL_DATAPATH_RESET;
+	fm10k_write_reg(hw, FM10K_DMA_CTRL, reg);
+
+	/* Flush write and allow 100us for reset to complete */
+	fm10k_write_flush(hw);
+	udelay(FM10K_RESET_TIMEOUT);
+
+	/* Verify we made it out of reset */
+	reg = fm10k_read_reg(hw, FM10K_IP);
+	if (!(reg & FM10K_IP_NOTINRESET))
+		err = FM10K_ERR_RESET_FAILED;
+
+	return err;
+}
+
+/**
+ *  fm10k_init_hw_pf - PF hardware initialization
+ *  @hw: pointer to hardware structure
+ *
+ **/
+static s32 fm10k_init_hw_pf(struct fm10k_hw *hw)
+{
+	u32 dma_ctrl, txqctl;
+	u16 i;
+
+	/* Establish default VSI as valid */
+	fm10k_write_reg(hw, FM10K_DGLORTDEC(fm10k_dglort_default), 0);
+	fm10k_write_reg(hw, FM10K_DGLORTMAP(fm10k_dglort_default),
+			FM10K_DGLORTMAP_ANY);
+
+	/* Invalidate all other GLORT entries */
+	for (i = 1; i < FM10K_DGLORT_COUNT; i++)
+		fm10k_write_reg(hw, FM10K_DGLORTMAP(i), FM10K_DGLORTMAP_NONE);
+
+	/* reset ITR2(0) to point to itself */
+	fm10k_write_reg(hw, FM10K_ITR2(0), 0);
+
+	/* reset VF ITR2(0) to point to 0 avoid PF registers */
+	fm10k_write_reg(hw, FM10K_ITR2(FM10K_ITR_REG_COUNT_PF), 0);
+
+	/* loop through all PF ITR2 registers pointing them to the previous */
+	for (i = 1; i < FM10K_ITR_REG_COUNT_PF; i++)
+		fm10k_write_reg(hw, FM10K_ITR2(i), i - 1);
+
+	/* Enable interrupt moderator if not already enabled */
+	fm10k_write_reg(hw, FM10K_INT_CTRL, FM10K_INT_CTRL_ENABLEMODERATOR);
+
+	/* compute the default txqctl configuration */
+	txqctl = FM10K_TXQCTL_PF | FM10K_TXQCTL_UNLIMITED_BW |
+		 (hw->mac.default_vid << FM10K_TXQCTL_VID_SHIFT);
+
+	for (i = 0; i < FM10K_MAX_QUEUES; i++) {
+		/* configure rings for 256 Queue / 32 Descriptor cache mode */
+		fm10k_write_reg(hw, FM10K_TQDLOC(i),
+				(i * FM10K_TQDLOC_BASE_32_DESC) |
+				FM10K_TQDLOC_SIZE_32_DESC);
+		fm10k_write_reg(hw, FM10K_TXQCTL(i), txqctl);
+
+		/* configure rings to provide TPH processing hints */
+		fm10k_write_reg(hw, FM10K_TPH_TXCTRL(i),
+				FM10K_TPH_TXCTRL_DESC_TPHEN |
+				FM10K_TPH_TXCTRL_DESC_RROEN |
+				FM10K_TPH_TXCTRL_DESC_WROEN |
+				FM10K_TPH_TXCTRL_DATA_RROEN);
+		fm10k_write_reg(hw, FM10K_TPH_RXCTRL(i),
+				FM10K_TPH_RXCTRL_DESC_TPHEN |
+				FM10K_TPH_RXCTRL_DESC_RROEN |
+				FM10K_TPH_RXCTRL_DATA_WROEN |
+				FM10K_TPH_RXCTRL_HDR_WROEN);
+	}
+
+	/* set max hold interval to align with 1.024 usec in all modes */
+	switch (hw->bus.speed) {
+	case fm10k_bus_speed_2500:
+		dma_ctrl = FM10K_DMA_CTRL_MAX_HOLD_1US_GEN1;
+		break;
+	case fm10k_bus_speed_5000:
+		dma_ctrl = FM10K_DMA_CTRL_MAX_HOLD_1US_GEN2;
+		break;
+	case fm10k_bus_speed_8000:
+		dma_ctrl = FM10K_DMA_CTRL_MAX_HOLD_1US_GEN3;
+		break;
+	default:
+		dma_ctrl = 0;
+		break;
+	}
+
+	/* Configure TSO flags */
+	fm10k_write_reg(hw, FM10K_DTXTCPFLGL, FM10K_TSO_FLAGS_LOW);
+	fm10k_write_reg(hw, FM10K_DTXTCPFLGH, FM10K_TSO_FLAGS_HI);
+
+	/* Enable DMA engine
+	 * Set Rx Descriptor size to 32
+	 * Set Minimum MSS to 64
+	 * Set Maximum number of Rx queues to 256 / 32 Descriptor
+	 */
+	dma_ctrl |= FM10K_DMA_CTRL_TX_ENABLE | FM10K_DMA_CTRL_RX_ENABLE |
+		    FM10K_DMA_CTRL_RX_DESC_SIZE | FM10K_DMA_CTRL_MINMSS_64 |
+		    FM10K_DMA_CTRL_32_DESC;
+
+	fm10k_write_reg(hw, FM10K_DMA_CTRL, dma_ctrl);
+
+	/* record maximum queue count, we limit ourselves to 128 */
+	hw->mac.max_queues = FM10K_MAX_QUEUES_PF;
+
+	return 0;
+}
+
+/**
+ *  fm10k_is_slot_appropriate_pf - Indicate appropriate slot for this SKU
+ *  @hw: pointer to hardware structure
+ *
+ *  Looks at the PCIe bus info to confirm whether or not this slot can support
+ *  the necessary bandwidth for this device.
+ **/
+static bool fm10k_is_slot_appropriate_pf(struct fm10k_hw *hw)
+{
+	return (hw->bus.speed == hw->bus_caps.speed) &&
+	       (hw->bus.width == hw->bus_caps.width);
+}
+
+/**
+ *  fm10k_read_mac_addr_pf - Read device MAC address
+ *  @hw: pointer to the HW structure
+ *
+ *  Reads the device MAC address from the SM_AREA and stores the value.
+ **/
+static s32 fm10k_read_mac_addr_pf(struct fm10k_hw *hw)
+{
+	u8 perm_addr[ETH_ALEN];
+	u32 serial_num;
+	int i;
+
+	serial_num = fm10k_read_reg(hw, FM10K_SM_AREA(1));
+
+	/* last byte should be all 1's */
+	if ((~serial_num) << 24)
+		return  FM10K_ERR_INVALID_MAC_ADDR;
+
+	perm_addr[0] = (u8)(serial_num >> 24);
+	perm_addr[1] = (u8)(serial_num >> 16);
+	perm_addr[2] = (u8)(serial_num >> 8);
+
+	serial_num = fm10k_read_reg(hw, FM10K_SM_AREA(0));
+
+	/* first byte should be all 1's */
+	if ((~serial_num) >> 24)
+		return  FM10K_ERR_INVALID_MAC_ADDR;
+
+	perm_addr[3] = (u8)(serial_num >> 16);
+	perm_addr[4] = (u8)(serial_num >> 8);
+	perm_addr[5] = (u8)(serial_num);
+
+	for (i = 0; i < ETH_ALEN; i++) {
+		hw->mac.perm_addr[i] = perm_addr[i];
+		hw->mac.addr[i] = perm_addr[i];
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_update_stats_hw_pf - Updates hardware related statistics of PF
+ *  @hw: pointer to hardware structure
+ *  @stats: pointer to the stats structure to update
+ *
+ *  This function collects and aggregates global and per queue hardware
+ *  statistics.
+ **/
+static void fm10k_update_hw_stats_pf(struct fm10k_hw *hw,
+				     struct fm10k_hw_stats *stats)
+{
+	u32 timeout, ur, ca, um, xec, vlan_drop, loopback_drop, nodesc_drop;
+	u32 id, id_prev;
+
+	/* Use Tx queue 0 as a canary to detect a reset */
+	id = fm10k_read_reg(hw, FM10K_TXQCTL(0));
+
+	/* Read Global Statistics */
+	do {
+		timeout = fm10k_read_hw_stats_32b(hw, FM10K_STATS_TIMEOUT,
+						  &stats->timeout);
+		ur = fm10k_read_hw_stats_32b(hw, FM10K_STATS_UR, &stats->ur);
+		ca = fm10k_read_hw_stats_32b(hw, FM10K_STATS_CA, &stats->ca);
+		um = fm10k_read_hw_stats_32b(hw, FM10K_STATS_UM, &stats->um);
+		xec = fm10k_read_hw_stats_32b(hw, FM10K_STATS_XEC, &stats->xec);
+		vlan_drop = fm10k_read_hw_stats_32b(hw, FM10K_STATS_VLAN_DROP,
+						    &stats->vlan_drop);
+		loopback_drop = fm10k_read_hw_stats_32b(hw,
+							FM10K_STATS_LOOPBACK_DROP,
+						     &stats->loopback_drop);
+		nodesc_drop = fm10k_read_hw_stats_32b(hw,
+						      FM10K_STATS_NODESC_DROP,
+						      &stats->nodesc_drop);
+
+		/* if value has not changed then we have consistent data */
+		id_prev = id;
+		id = fm10k_read_reg(hw, FM10K_TXQCTL(0));
+	} while ((id ^ id_prev) & FM10K_TXQCTL_ID_MASK);
+
+	/* drop non-ID bits and set VALID ID bit */
+	id &= FM10K_TXQCTL_ID_MASK;
+	id |= FM10K_STAT_VALID;
+
+	/* Update Global Statistics */
+	if (stats->stats_idx == id) {
+		stats->timeout.count += timeout;
+		stats->ur.count += ur;
+		stats->ca.count += ca;
+		stats->um.count += um;
+		stats->xec.count += xec;
+		stats->vlan_drop.count += vlan_drop;
+		stats->loopback_drop.count += loopback_drop;
+		stats->nodesc_drop.count += nodesc_drop;
+	}
+
+	/* Update bases and record current PF id */
+	fm10k_update_hw_base_32b(&stats->timeout, timeout);
+	fm10k_update_hw_base_32b(&stats->ur, ur);
+	fm10k_update_hw_base_32b(&stats->ca, ca);
+	fm10k_update_hw_base_32b(&stats->um, um);
+	fm10k_update_hw_base_32b(&stats->xec, xec);
+	fm10k_update_hw_base_32b(&stats->vlan_drop, vlan_drop);
+	fm10k_update_hw_base_32b(&stats->loopback_drop, loopback_drop);
+	fm10k_update_hw_base_32b(&stats->nodesc_drop, nodesc_drop);
+	stats->stats_idx = id;
+
+	/* Update Queue Statistics */
+	fm10k_update_hw_stats_q(hw, stats->q, 0, hw->mac.max_queues);
+}
+
+/**
+ *  fm10k_rebind_hw_stats_pf - Resets base for hardware statistics of PF
+ *  @hw: pointer to hardware structure
+ *  @stats: pointer to the stats structure to update
+ *
+ *  This function resets the base for global and per queue hardware
+ *  statistics.
+ **/
+static void fm10k_rebind_hw_stats_pf(struct fm10k_hw *hw,
+				     struct fm10k_hw_stats *stats)
+{
+	/* Unbind Global Statistics */
+	fm10k_unbind_hw_stats_32b(&stats->timeout);
+	fm10k_unbind_hw_stats_32b(&stats->ur);
+	fm10k_unbind_hw_stats_32b(&stats->ca);
+	fm10k_unbind_hw_stats_32b(&stats->um);
+	fm10k_unbind_hw_stats_32b(&stats->xec);
+	fm10k_unbind_hw_stats_32b(&stats->vlan_drop);
+	fm10k_unbind_hw_stats_32b(&stats->loopback_drop);
+	fm10k_unbind_hw_stats_32b(&stats->nodesc_drop);
+
+	/* Unbind Queue Statistics */
+	fm10k_unbind_hw_stats_q(stats->q, 0, hw->mac.max_queues);
+
+	/* Reinitialize bases for all stats */
+	fm10k_update_hw_stats_pf(hw, stats);
+}
+
+/**
+ *  fm10k_get_fault_pf - Record a fault in one of the interface units
+ *  @hw: pointer to hardware structure
+ *  @type: pointer to fault type register offset
+ *  @fault: pointer to memory location to record the fault
+ *
+ *  Record the fault register contents to the fault data structure and
+ *  clear the entry from the register.
+ *
+ *  Returns ERR_PARAM if invalid register is specified or no error is present.
+ **/
+static s32 fm10k_get_fault_pf(struct fm10k_hw *hw, int type,
+			      struct fm10k_fault *fault)
+{
+	u32 func;
+
+	/* verify the fault register is in range and is aligned */
+	switch (type) {
+	case FM10K_PCA_FAULT:
+	case FM10K_THI_FAULT:
+	case FM10K_FUM_FAULT:
+		break;
+	default:
+		return FM10K_ERR_PARAM;
+	}
+
+	/* only service faults that are valid */
+	func = fm10k_read_reg(hw, type + FM10K_FAULT_FUNC);
+	if (!(func & FM10K_FAULT_FUNC_VALID))
+		return FM10K_ERR_PARAM;
+
+	/* read remaining fields */
+	fault->address = fm10k_read_reg(hw, type + FM10K_FAULT_ADDR_HI);
+	fault->address <<= 32;
+	fault->address = fm10k_read_reg(hw, type + FM10K_FAULT_ADDR_LO);
+	fault->specinfo = fm10k_read_reg(hw, type + FM10K_FAULT_SPECINFO);
+
+	/* clear valid bit to allow for next error */
+	fm10k_write_reg(hw, type + FM10K_FAULT_FUNC, FM10K_FAULT_FUNC_VALID);
+
+	/* Record which function triggered the error */
+	if (func & FM10K_FAULT_FUNC_PF)
+		fault->func = 0;
+	else
+		fault->func = 1 + ((func & FM10K_FAULT_FUNC_VF_MASK) >>
+				   FM10K_FAULT_FUNC_VF_SHIFT);
+
+	/* record fault type */
+	fault->type = func & FM10K_FAULT_FUNC_TYPE_MASK;
+
+	return 0;
+}
+
+static struct fm10k_mac_ops mac_ops_pf = {
+	.get_bus_info		= &fm10k_get_bus_info_generic,
+	.reset_hw		= &fm10k_reset_hw_pf,
+	.init_hw		= &fm10k_init_hw_pf,
+	.start_hw		= &fm10k_start_hw_generic,
+	.stop_hw		= &fm10k_stop_hw_generic,
+	.is_slot_appropriate	= &fm10k_is_slot_appropriate_pf,
+	.read_mac_addr		= &fm10k_read_mac_addr_pf,
+	.update_hw_stats	= &fm10k_update_hw_stats_pf,
+	.rebind_hw_stats	= &fm10k_rebind_hw_stats_pf,
+	.get_fault		= &fm10k_get_fault_pf,
+	.get_host_state		= &fm10k_get_host_state_generic,
+};
+
+struct fm10k_info fm10k_pf_info = {
+	.mac		= fm10k_mac_pf,
+	.get_invariants	= &fm10k_get_invariants_generic,
+	.mac_ops	= &mac_ops_pf,
+};
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
new file mode 100644
index 0000000..054392f
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
@@ -0,0 +1,28 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_PF_H_
+#define _FM10K_PF_H_
+
+#include "fm10k_type.h"
+#include "fm10k_common.h"
+
+extern struct fm10k_info fm10k_pf_info;
+#endif /* _FM10K_PF_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 08/29] fm10k: Add support for configuring PF interface
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (6 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 07/29] fm10k: Add support for PF Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:36 ` [net-next PATCH 09/29] fm10k: Add netdev Alexander Duyck
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for the operations which will configure filters on
the interface.  In addition with these patches we begin to introduce the PF
messages that will be sent to or received from the Switch Management
entity.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c |  587 +++++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h |   98 +++++
 2 files changed, 683 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index b798cd5..8da382c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -181,6 +181,66 @@ static bool fm10k_is_slot_appropriate_pf(struct fm10k_hw *hw)
 }
 
 /**
+ *  fm10k_update_vlan_pf - Update status of VLAN ID in VLAN filter table
+ *  @hw: pointer to hardware structure
+ *  @vid: VLAN ID to add to table
+ *  @vsi: Index indicating VF ID or PF ID in table
+ *  @set: Indicates if this is a set or clear operation
+ *
+ *  This function adds or removes the corresponding VLAN ID from the VLAN
+ *  filter table for the corresponding function.  In addition to the
+ *  standard set/clear that supports one bit a multi-bit write is
+ *  supported to set 64 bits at a time.
+ **/
+static s32 fm10k_update_vlan_pf(struct fm10k_hw *hw, u32 vid, u8 vsi, bool set)
+{
+	u32 vlan_table, reg, mask, bit, len;
+
+	/* verify the VSI index is valid */
+	if (vsi > FM10K_VLAN_TABLE_VSI_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* VLAN multi-bit write:
+	 * The multi-bit write has several parts to it.
+	 *    3			  2		      1			  0
+	 *  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+	 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+	 * | RSVD0 |         Length        |C|RSVD0|        VLAN ID        |
+	 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+	 *
+	 * VLAN ID: Vlan Starting value
+	 * RSVD0: Reserved section, must be 0
+	 * C: Flag field, 0 is set, 1 is clear (Used in VF VLAN message)
+	 * Length: Number of times to repeat the bit being set
+	 */
+	len = vid >> 16;
+	vid = (vid << 17) >> 17;
+
+	/* verify the reserved 0 fields are 0 */
+	if (len >= FM10K_VLAN_TABLE_VID_MAX ||
+	    vid >= FM10K_VLAN_TABLE_VID_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* Loop through the table updating all required VLANs */
+	for (reg = FM10K_VLAN_TABLE(vsi, vid / 32), bit = vid % 32;
+	     len < FM10K_VLAN_TABLE_VID_MAX;
+	     len -= 32 - bit, reg++, bit = 0) {
+		/* record the initial state of the register */
+		vlan_table = fm10k_read_reg(hw, reg);
+
+		/* truncate mask if we are at the start or end of the run */
+		mask = (~(u32)0 >> ((len < 31) ? 31 - len : 0)) << bit;
+
+		/* make necessary modifications to the register */
+		mask &= set ? ~vlan_table : vlan_table;
+		if (mask)
+			fm10k_write_reg(hw, reg, vlan_table ^ mask);
+	}
+
+	return 0;
+}
+
+/**
  *  fm10k_read_mac_addr_pf - Read device MAC address
  *  @hw: pointer to the HW structure
  *
@@ -221,6 +281,297 @@ static s32 fm10k_read_mac_addr_pf(struct fm10k_hw *hw)
 }
 
 /**
+ *  fm10k_glort_valid_pf - Validate that the provided glort is valid
+ *  @hw: pointer to the HW structure
+ *  @glort: base glort to be validated
+ *
+ *  This function will return an error if the provided glort is invalid
+ **/
+bool fm10k_glort_valid_pf(struct fm10k_hw *hw, u16 glort)
+{
+	glort &= hw->mac.dglort_map >> FM10K_DGLORTMAP_MASK_SHIFT;
+
+	return glort == (hw->mac.dglort_map & FM10K_DGLORTMAP_NONE);
+}
+
+/**
+ *  fm10k_update_uc_addr_pf - Update device unicast addresss
+ *  @hw: pointer to the HW structure
+ *  @glort: base resource tag for this request
+ *  @mac: MAC address to add/remove from table
+ *  @vid: VLAN ID to add/remove from table
+ *  @add: Indicates if this is an add or remove operation
+ *  @flags: flags field to indicate add and secure
+ *
+ *  This function generates a message to the Switch API requesting
+ *  that the given logical port add/remove the given L2 MAC/VLAN address.
+ **/
+static s32 fm10k_update_xc_addr_pf(struct fm10k_hw *hw, u16 glort,
+				   const u8 *mac, u16 vid, bool add, u8 flags)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	struct fm10k_mac_update mac_update;
+	u32 msg[5];
+
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	/* drop upper 4 bits of VLAN ID */
+	vid = (vid << 4) >> 4;
+
+	/* record fields */
+	mac_update.mac_lower = cpu_to_le32(((u32)mac[2] << 24) |
+						 ((u32)mac[3] << 16) |
+						 ((u32)mac[4] << 8) |
+						 ((u32)mac[5]));
+	mac_update.mac_upper = cpu_to_le16(((u32)mac[0] << 8) |
+						 ((u32)mac[1]));
+	mac_update.vlan = cpu_to_le16(vid);
+	mac_update.glort = cpu_to_le16(glort);
+	mac_update.action = add ? 0 : 1;
+	mac_update.flags = flags;
+
+	/* populate mac_update fields */
+	fm10k_tlv_msg_init(msg, FM10K_PF_MSG_ID_UPDATE_MAC_FWD_RULE);
+	fm10k_tlv_attr_put_le_struct(msg, FM10K_PF_ATTR_ID_MAC_UPDATE,
+				     &mac_update, sizeof(mac_update));
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_uc_addr_pf - Update device unicast addresss
+ *  @hw: pointer to the HW structure
+ *  @glort: base resource tag for this request
+ *  @mac: MAC address to add/remove from table
+ *  @vid: VLAN ID to add/remove from table
+ *  @add: Indicates if this is an add or remove operation
+ *  @flags: flags field to indicate add and secure
+ *
+ *  This function is used to add or remove unicast addresses for
+ *  the PF.
+ **/
+static s32 fm10k_update_uc_addr_pf(struct fm10k_hw *hw, u16 glort,
+				   const u8 *mac, u16 vid, bool add, u8 flags)
+{
+	/* verify MAC address is valid */
+	if (!is_valid_ether_addr(mac))
+		return FM10K_ERR_PARAM;
+
+	return fm10k_update_xc_addr_pf(hw, glort, mac, vid, add, flags);
+}
+
+/**
+ *  fm10k_update_mc_addr_pf - Update device multicast addresses
+ *  @hw: pointer to the HW structure
+ *  @glort: base resource tag for this request
+ *  @mac: MAC address to add/remove from table
+ *  @vid: VLAN ID to add/remove from table
+ *  @add: Indicates if this is an add or remove operation
+ *
+ *  This function is used to add or remove multicast MAC addresses for
+ *  the PF.
+ **/
+static s32 fm10k_update_mc_addr_pf(struct fm10k_hw *hw, u16 glort,
+				   const u8 *mac, u16 vid, bool add)
+{
+	/* verify multicast address is valid */
+	if (!is_multicast_ether_addr(mac))
+		return FM10K_ERR_PARAM;
+
+	return fm10k_update_xc_addr_pf(hw, glort, mac, vid, add, 0);
+}
+
+/**
+ *  fm10k_update_xcast_mode_pf - Request update of multicast mode
+ *  @hw: pointer to hardware structure
+ *  @glort: base resource tag for this request
+ *  @mode: integer value indicating mode being requested
+ *
+ *  This function will attempt to request a higher mode for the port
+ *  so that it can enable either multicast, multicast promiscuous, or
+ *  promiscuous mode of operation.
+ **/
+static s32 fm10k_update_xcast_mode_pf(struct fm10k_hw *hw, u16 glort, u8 mode)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[3], xcast_mode;
+
+	if (mode > FM10K_XCAST_MODE_NONE)
+		return FM10K_ERR_PARAM;
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	/* write xcast mode as a single u32 value,
+	 * lower 16 bits: glort
+	 * upper 16 bits: mode
+	 */
+	xcast_mode = ((u32)mode << 16) | glort;
+
+	/* generate message requesting to change xcast mode */
+	fm10k_tlv_msg_init(msg, FM10K_PF_MSG_ID_XCAST_MODES);
+	fm10k_tlv_attr_put_u32(msg, FM10K_PF_ATTR_ID_XCAST_MODE, xcast_mode);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_int_moderator_pf - Update interrupt moderator linked list
+ *  @hw: pointer to hardware structure
+ *
+ *  This function walks through the MSI-X vector table to determine the
+ *  number of active interrupts and based on that information updates the
+ *  interrupt moderator linked list.
+ **/
+static void fm10k_update_int_moderator_pf(struct fm10k_hw *hw)
+{
+	u32 i;
+
+	/* Disable interrupt moderator */
+	fm10k_write_reg(hw, FM10K_INT_CTRL, 0);
+
+	/* loop through PF from last to first looking enabled vectors */
+	for (i = FM10K_ITR_REG_COUNT_PF - 1; i; i--) {
+		if (!fm10k_read_reg(hw, FM10K_MSIX_VECTOR_MASK(i)))
+			break;
+	}
+
+	/* always reset VFITR2[0] to point to last enabled PF vector*/
+	fm10k_write_reg(hw, FM10K_ITR2(FM10K_ITR_REG_COUNT_PF), i);
+
+	/* reset ITR2[0] to point to last enabled PF vector */
+	fm10k_write_reg(hw, FM10K_ITR2(0), i);
+
+	/* Enable interrupt moderator */
+	fm10k_write_reg(hw, FM10K_INT_CTRL, FM10K_INT_CTRL_ENABLEMODERATOR);
+}
+
+/**
+ *  fm10k_update_lport_state_pf - Notify the switch of a change in port state
+ *  @hw: pointer to the HW structure
+ *  @glort: base resource tag for this request
+ *  @count: number of logical ports being updated
+ *  @enable: boolean value indicating enable or disable
+ *
+ *  This function is used to add/remove a logical port from the switch.
+ **/
+static s32 fm10k_update_lport_state_pf(struct fm10k_hw *hw, u16 glort,
+				       u16 count, bool enable)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[3], lport_msg;
+
+	/* do nothing if we are being asked to create or destroy 0 ports */
+	if (!count)
+		return 0;
+
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	/* construct the lport message from the 2 pieces of data we have */
+	lport_msg = ((u32)count << 16) | glort;
+
+	/* generate lport create/delete message */
+	fm10k_tlv_msg_init(msg, enable ? FM10K_PF_MSG_ID_LPORT_CREATE :
+					 FM10K_PF_MSG_ID_LPORT_DELETE);
+	fm10k_tlv_attr_put_u32(msg, FM10K_PF_ATTR_ID_PORT, lport_msg);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_configure_dglort_map_pf - Configures GLORT entry and queues
+ *  @hw: pointer to hardware structure
+ *  @dglort: pointer to dglort configuration structure
+ *
+ *  Reads the configuration structure contained in dglort_cfg and uses
+ *  that information to then populate a DGLORTMAP/DEC entry and the queues
+ *  to which it has been assigned.
+ **/
+static s32 fm10k_configure_dglort_map_pf(struct fm10k_hw *hw,
+					 struct fm10k_dglort_cfg *dglort)
+{
+	u16 glort, queue_count, vsi_count, pc_count;
+	u16 vsi, queue, pc, q_idx;
+	u32 txqctl, dglortdec, dglortmap;
+
+	/* verify the dglort pointer */
+	if (!dglort)
+		return FM10K_ERR_PARAM;
+
+	/* verify the dglort values */
+	if ((dglort->idx > 7) || (dglort->rss_l > 7) || (dglort->pc_l > 3) ||
+	    (dglort->vsi_l > 6) || (dglort->vsi_b > 64) ||
+	    (dglort->queue_l > 8) || (dglort->queue_b >= 256))
+		return FM10K_ERR_PARAM;
+
+	/* determine count of VSIs and queues */
+	queue_count = 1 << (dglort->rss_l + dglort->pc_l);
+	vsi_count = 1 << (dglort->vsi_l + dglort->queue_l);
+	glort = dglort->glort;
+	q_idx = dglort->queue_b;
+
+	/* configure SGLORT for queues */
+	for (vsi = 0; vsi < vsi_count; vsi++, glort++) {
+		for (queue = 0; queue < queue_count; queue++, q_idx++) {
+			if (q_idx >= FM10K_MAX_QUEUES)
+				break;
+
+			fm10k_write_reg(hw, FM10K_TX_SGLORT(q_idx), glort);
+			fm10k_write_reg(hw, FM10K_RX_SGLORT(q_idx), glort);
+		}
+	}
+
+	/* determine count of PCs and queues */
+	queue_count = 1 << (dglort->queue_l + dglort->rss_l + dglort->vsi_l);
+	pc_count = 1 << dglort->pc_l;
+
+	/* configure PC for Tx queues */
+	for (pc = 0; pc < pc_count; pc++) {
+		q_idx = pc + dglort->queue_b;
+		for (queue = 0; queue < queue_count; queue++) {
+			if (q_idx >= FM10K_MAX_QUEUES)
+				break;
+
+			txqctl = fm10k_read_reg(hw, FM10K_TXQCTL(q_idx));
+			txqctl &= ~FM10K_TXQCTL_PC_MASK;
+			txqctl |= pc << FM10K_TXQCTL_PC_SHIFT;
+			fm10k_write_reg(hw, FM10K_TXQCTL(q_idx), txqctl);
+
+			q_idx += pc_count;
+		}
+	}
+
+	/* configure DGLORTDEC */
+	dglortdec = ((u32)(dglort->rss_l) << FM10K_DGLORTDEC_RSSLENGTH_SHIFT) |
+		    ((u32)(dglort->queue_b) << FM10K_DGLORTDEC_QBASE_SHIFT) |
+		    ((u32)(dglort->pc_l) << FM10K_DGLORTDEC_PCLENGTH_SHIFT) |
+		    ((u32)(dglort->vsi_b) << FM10K_DGLORTDEC_VSIBASE_SHIFT) |
+		    ((u32)(dglort->vsi_l) << FM10K_DGLORTDEC_VSILENGTH_SHIFT) |
+		    ((u32)(dglort->queue_l));
+	if (dglort->inner_rss)
+		dglortdec |=  FM10K_DGLORTDEC_INNERRSS_ENABLE;
+
+	/* configure DGLORTMAP */
+	dglortmap = (dglort->idx == fm10k_dglort_default) ?
+			FM10K_DGLORTMAP_ANY : FM10K_DGLORTMAP_ZERO;
+	dglortmap <<= dglort->vsi_l + dglort->queue_l + dglort->shared_l;
+	dglortmap |= dglort->glort;
+
+	/* write values to hardware */
+	fm10k_write_reg(hw, FM10K_DGLORTDEC(dglort->idx), dglortdec);
+	fm10k_write_reg(hw, FM10K_DGLORTMAP(dglort->idx), dglortmap);
+
+	return 0;
+}
+
+/**
  *  fm10k_update_stats_hw_pf - Updates hardware related statistics of PF
  *  @hw: pointer to hardware structure
  *  @stats: pointer to the stats structure to update
@@ -319,6 +670,22 @@ static void fm10k_rebind_hw_stats_pf(struct fm10k_hw *hw,
 }
 
 /**
+ *  fm10k_set_dma_mask_pf - Configures PhyAddrSpace to limit DMA to system
+ *  @hw: pointer to hardware structure
+ *  @dma_mask: 64 bit DMA mask required for platform
+ *
+ *  This function sets the PHYADDR.PhyAddrSpace bits for the endpoint in order
+ *  to limit the access to memory beyond what is physically in the system.
+ **/
+static void fm10k_set_dma_mask_pf(struct fm10k_hw *hw, u64 dma_mask)
+{
+	/* we need to write the upper 32 bits of DMA mask to PhyAddrSpace */
+	u32 phyaddr = (u32)(dma_mask >> 32);
+
+	fm10k_write_reg(hw, FM10K_PHYADDR, phyaddr);
+}
+
+/**
  *  fm10k_get_fault_pf - Record a fault in one of the interface units
  *  @hw: pointer to hardware structure
  *  @type: pointer to fault type register offset
@@ -371,6 +738,207 @@ static s32 fm10k_get_fault_pf(struct fm10k_hw *hw, int type,
 	return 0;
 }
 
+/**
+ *  fm10k_request_lport_map_pf - Request LPORT map from the switch API
+ *  @hw: pointer to hardware structure
+ *
+ **/
+static s32 fm10k_request_lport_map_pf(struct fm10k_hw *hw)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[1];
+
+	/* issue request asking for LPORT map */
+	fm10k_tlv_msg_init(msg, FM10K_PF_MSG_ID_LPORT_MAP);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_get_host_state_pf - Returns the state of the switch and mailbox
+ *  @hw: pointer to hardware structure
+ *  @switch_ready: pointer to boolean value that will record switch state
+ *
+ *  This funciton will check the DMA_CTRL2 register and mailbox in order
+ *  to determine if the switch is ready for the PF to begin requesting
+ *  addresses and mapping traffic to the local interface.
+ **/
+static s32 fm10k_get_host_state_pf(struct fm10k_hw *hw, bool *switch_ready)
+{
+	s32 ret_val = 0;
+	u32 dma_ctrl2;
+
+	/* verify the switch is ready for interraction */
+	dma_ctrl2 = fm10k_read_reg(hw, FM10K_DMA_CTRL2);
+	if (!(dma_ctrl2 & FM10K_DMA_CTRL2_SWITCH_READY))
+		goto out;
+
+	/* retrieve generic host state info */
+	ret_val = fm10k_get_host_state_generic(hw, switch_ready);
+	if (ret_val)
+		goto out;
+
+	/* interface cannot receive traffic without logical ports */
+	if (hw->mac.dglort_map == FM10K_DGLORTMAP_NONE)
+		ret_val = fm10k_request_lport_map_pf(hw);
+
+out:
+	return ret_val;
+}
+
+/* This structure defines the attibutes to be parsed below */
+const struct fm10k_tlv_attr fm10k_lport_map_msg_attr[] = {
+	FM10K_TLV_ATTR_U32(FM10K_PF_ATTR_ID_LPORT_MAP),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_msg_lport_map_pf - Message handler for lport_map message from SM
+ *  @hw: Pointer to hardware structure
+ *  @results: pointer array containing parsed data
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This handler configures the lport mapping based on the reply from the
+ *  switch API.
+ **/
+s32 fm10k_msg_lport_map_pf(struct fm10k_hw *hw, u32 **results,
+			   struct fm10k_mbx_info *mbx)
+{
+	u16 glort, mask;
+	u32 dglort_map;
+	s32 err;
+
+	err = fm10k_tlv_attr_get_u32(results[FM10K_PF_ATTR_ID_LPORT_MAP],
+				     &dglort_map);
+	if (err)
+		return err;
+
+	/* extract values out of the header */
+	glort = FM10K_MSG_HDR_FIELD_GET(dglort_map, LPORT_MAP_GLORT);
+	mask = FM10K_MSG_HDR_FIELD_GET(dglort_map, LPORT_MAP_MASK);
+
+	/* verify mask is set and none of the masked bits in glort are set */
+	if (!mask || (glort & ~mask))
+		return FM10K_ERR_PARAM;
+
+	/* verify the mask is contiguous, and that it is 1's followed by 0's */
+	if (((~(mask - 1) & mask) + mask) & FM10K_DGLORTMAP_NONE)
+		return FM10K_ERR_PARAM;
+
+	/* record the glort, mask, and port count */
+	hw->mac.dglort_map = dglort_map;
+
+	return 0;
+}
+
+const struct fm10k_tlv_attr fm10k_update_pvid_msg_attr[] = {
+	FM10K_TLV_ATTR_U32(FM10K_PF_ATTR_ID_UPDATE_PVID),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_msg_update_pvid_pf - Message handler for port VLAN message from SM
+ *  @hw: Pointer to hardware structure
+ *  @results: pointer array containing parsed data
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This handler configures the default VLAN for the PF
+ **/
+s32 fm10k_msg_update_pvid_pf(struct fm10k_hw *hw, u32 **results,
+			     struct fm10k_mbx_info *mbx)
+{
+	u16 glort, pvid;
+	u32 pvid_update;
+	s32 err;
+
+	err = fm10k_tlv_attr_get_u32(results[FM10K_PF_ATTR_ID_UPDATE_PVID],
+				     &pvid_update);
+	if (err)
+		return err;
+
+	/* extract values from the pvid update */
+	glort = FM10K_MSG_HDR_FIELD_GET(pvid_update, UPDATE_PVID_GLORT);
+	pvid = FM10K_MSG_HDR_FIELD_GET(pvid_update, UPDATE_PVID_PVID);
+
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	/* verify VID is valid */
+	if (pvid >= FM10K_VLAN_TABLE_VID_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* record the port VLAN ID value */
+	hw->mac.default_vid = pvid;
+
+	return 0;
+}
+
+/**
+ *  fm10k_record_global_table_data - Move global table data to swapi table info
+ *  @from: pointer to source table data structure
+ *  @to: pointer to destination table info structure
+ *
+ *  This function is will copy table_data to the table_info contained in
+ *  the hw struct.
+ **/
+static void fm10k_record_global_table_data(struct fm10k_global_table_data *from,
+					   struct fm10k_swapi_table_info *to)
+{
+	/* convert from le32 struct to CPU byte ordered values */
+	to->used = le32_to_cpu(from->used);
+	to->avail = le32_to_cpu(from->avail);
+}
+
+const struct fm10k_tlv_attr fm10k_err_msg_attr[] = {
+	FM10K_TLV_ATTR_LE_STRUCT(FM10K_PF_ATTR_ID_ERR,
+				 sizeof(struct fm10k_swapi_error)),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_msg_err_pf - Message handler for error reply
+ *  @hw: Pointer to hardware structure
+ *  @results: pointer array containing parsed data
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This handler will capture the data for any error replies to previous
+ *  messages that the PF has sent.
+ **/
+s32 fm10k_msg_err_pf(struct fm10k_hw *hw, u32 **results,
+		     struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_swapi_error err_msg;
+	s32 err;
+
+	/* extract structure from message */
+	err = fm10k_tlv_attr_get_le_struct(results[FM10K_PF_ATTR_ID_ERR],
+					   &err_msg, sizeof(err_msg));
+	if (err)
+		return err;
+
+	/* record table status */
+	fm10k_record_global_table_data(&err_msg.mac, &hw->swapi.mac);
+	fm10k_record_global_table_data(&err_msg.nexthop, &hw->swapi.nexthop);
+	fm10k_record_global_table_data(&err_msg.ffu, &hw->swapi.ffu);
+
+	/* record SW API status value */
+	hw->swapi.status = le32_to_cpu(err_msg.status);
+
+	return 0;
+}
+
+static const struct fm10k_msg_data fm10k_msg_data_pf[] = {
+	FM10K_PF_MSG_ERR_HANDLER(XCAST_MODES, fm10k_msg_err_pf),
+	FM10K_PF_MSG_ERR_HANDLER(UPDATE_MAC_FWD_RULE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_LPORT_MAP_HANDLER(fm10k_msg_lport_map_pf),
+	FM10K_PF_MSG_ERR_HANDLER(LPORT_CREATE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_ERR_HANDLER(LPORT_DELETE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_UPDATE_PVID_HANDLER(fm10k_msg_update_pvid_pf),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_tlv_msg_error),
+};
+
 static struct fm10k_mac_ops mac_ops_pf = {
 	.get_bus_info		= &fm10k_get_bus_info_generic,
 	.reset_hw		= &fm10k_reset_hw_pf,
@@ -378,15 +946,30 @@ static struct fm10k_mac_ops mac_ops_pf = {
 	.start_hw		= &fm10k_start_hw_generic,
 	.stop_hw		= &fm10k_stop_hw_generic,
 	.is_slot_appropriate	= &fm10k_is_slot_appropriate_pf,
+	.update_vlan		= &fm10k_update_vlan_pf,
 	.read_mac_addr		= &fm10k_read_mac_addr_pf,
+	.update_uc_addr		= &fm10k_update_uc_addr_pf,
+	.update_mc_addr		= &fm10k_update_mc_addr_pf,
+	.update_xcast_mode	= &fm10k_update_xcast_mode_pf,
+	.update_int_moderator	= &fm10k_update_int_moderator_pf,
+	.update_lport_state	= &fm10k_update_lport_state_pf,
 	.update_hw_stats	= &fm10k_update_hw_stats_pf,
 	.rebind_hw_stats	= &fm10k_rebind_hw_stats_pf,
+	.configure_dglort_map	= &fm10k_configure_dglort_map_pf,
+	.set_dma_mask		= &fm10k_set_dma_mask_pf,
 	.get_fault		= &fm10k_get_fault_pf,
-	.get_host_state		= &fm10k_get_host_state_generic,
+	.get_host_state		= &fm10k_get_host_state_pf,
 };
 
+static s32 fm10k_get_invariants_pf(struct fm10k_hw *hw)
+{
+	fm10k_get_invariants_generic(hw);
+
+	return fm10k_sm_mbx_init(hw, &hw->mbx, fm10k_msg_data_pf);
+}
+
 struct fm10k_info fm10k_pf_info = {
 	.mac		= fm10k_mac_pf,
-	.get_invariants	= &fm10k_get_invariants_generic,
+	.get_invariants	= &fm10k_get_invariants_pf,
 	.mac_ops	= &mac_ops_pf,
 };
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
index 054392f..108a7a7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
@@ -24,5 +24,103 @@
 #include "fm10k_type.h"
 #include "fm10k_common.h"
 
+bool fm10k_glort_valid_pf(struct fm10k_hw *hw, u16 glort);
+
+enum fm10k_pf_tlv_msg_id_v1 {
+	FM10K_PF_MSG_ID_TEST			= 0x000, /* msg ID reserved */
+	FM10K_PF_MSG_ID_XCAST_MODES		= 0x001,
+	FM10K_PF_MSG_ID_UPDATE_MAC_FWD_RULE	= 0x002,
+	FM10K_PF_MSG_ID_LPORT_MAP		= 0x100,
+	FM10K_PF_MSG_ID_LPORT_CREATE		= 0x200,
+	FM10K_PF_MSG_ID_LPORT_DELETE		= 0x201,
+	FM10K_PF_MSG_ID_CONFIG			= 0x300,
+	FM10K_PF_MSG_ID_UPDATE_PVID		= 0x400,
+	FM10K_PF_MSG_ID_CREATE_FLOW_TABLE	= 0x501,
+	FM10K_PF_MSG_ID_DELETE_FLOW_TABLE	= 0x502,
+	FM10K_PF_MSG_ID_UPDATE_FLOW		= 0x503,
+	FM10K_PF_MSG_ID_DELETE_FLOW		= 0x504,
+	FM10K_PF_MSG_ID_SET_FLOW_STATE		= 0x505,
+	FM10K_PF_MSG_ID_GET_1588_INFO		= 0x506,
+	FM10K_PF_MSG_ID_1588_TIMESTAMP		= 0x701,
+};
+
+enum fm10k_pf_tlv_attr_id_v1 {
+	FM10K_PF_ATTR_ID_ERR			= 0x00,
+	FM10K_PF_ATTR_ID_LPORT_MAP		= 0x01,
+	FM10K_PF_ATTR_ID_XCAST_MODE		= 0x02,
+	FM10K_PF_ATTR_ID_MAC_UPDATE		= 0x03,
+	FM10K_PF_ATTR_ID_VLAN_UPDATE		= 0x04,
+	FM10K_PF_ATTR_ID_CONFIG			= 0x05,
+	FM10K_PF_ATTR_ID_CREATE_FLOW_TABLE	= 0x06,
+	FM10K_PF_ATTR_ID_DELETE_FLOW_TABLE	= 0x07,
+	FM10K_PF_ATTR_ID_UPDATE_FLOW		= 0x08,
+	FM10K_PF_ATTR_ID_FLOW_STATE		= 0x09,
+	FM10K_PF_ATTR_ID_FLOW_HANDLE		= 0x0A,
+	FM10K_PF_ATTR_ID_DELETE_FLOW		= 0x0B,
+	FM10K_PF_ATTR_ID_PORT			= 0x0C,
+	FM10K_PF_ATTR_ID_UPDATE_PVID		= 0x0D,
+	FM10K_PF_ATTR_ID_1588_TIMESTAMP		= 0x10,
+};
+
+#define FM10K_MSG_LPORT_MAP_GLORT_SHIFT	0
+#define FM10K_MSG_LPORT_MAP_GLORT_SIZE	16
+#define FM10K_MSG_LPORT_MAP_MASK_SHIFT	16
+#define FM10K_MSG_LPORT_MAP_MASK_SIZE	16
+
+#define FM10K_MSG_UPDATE_PVID_GLORT_SHIFT	0
+#define FM10K_MSG_UPDATE_PVID_GLORT_SIZE	16
+#define FM10K_MSG_UPDATE_PVID_PVID_SHIFT	16
+#define FM10K_MSG_UPDATE_PVID_PVID_SIZE		16
+
+struct fm10k_mac_update {
+	__le32	mac_lower;
+	__le16	mac_upper;
+	__le16	vlan;
+	__le16	glort;
+	u8	flags;
+	u8	action;
+};
+
+struct fm10k_global_table_data {
+	__le32	used;
+	__le32	avail;
+};
+
+struct fm10k_swapi_error {
+	__le32				status;
+	struct fm10k_global_table_data	mac;
+	struct fm10k_global_table_data	nexthop;
+	struct fm10k_global_table_data	ffu;
+};
+
+struct fm10k_swapi_1588_timestamp {
+	__le64 egress;
+	__le64 ingress;
+	__le16 dglort;
+	__le16 sglort;
+};
+
+s32 fm10k_msg_lport_map_pf(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
+extern const struct fm10k_tlv_attr fm10k_lport_map_msg_attr[];
+#define FM10K_PF_MSG_LPORT_MAP_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_LPORT_MAP, \
+			  fm10k_lport_map_msg_attr, func)
+s32 fm10k_msg_update_pvid_pf(struct fm10k_hw *, u32 **,
+			     struct fm10k_mbx_info *);
+extern const struct fm10k_tlv_attr fm10k_update_pvid_msg_attr[];
+#define FM10K_PF_MSG_UPDATE_PVID_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_UPDATE_PVID, \
+			  fm10k_update_pvid_msg_attr, func)
+
+s32 fm10k_msg_err_pf(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
+extern const struct fm10k_tlv_attr fm10k_err_msg_attr[];
+#define FM10K_PF_MSG_ERR_HANDLER(msg, func) \
+	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_##msg, fm10k_err_msg_attr, func)
+
+extern const struct fm10k_tlv_attr fm10k_1588_timestamp_msg_attr[];
+#define FM10K_PF_MSG_1588_TIMESTAMP_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_1588_TIMESTAMP, \
+			  fm10k_1588_timestamp_msg_attr, func)
+
 extern struct fm10k_info fm10k_pf_info;
 #endif /* _FM10K_PF_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 09/29] fm10k: Add netdev
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (7 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 08/29] fm10k: Add support for configuring PF interface Alexander Duyck
@ 2014-09-18 22:36 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 10/29] fm10k: Add support for L2 filtering Alexander Duyck
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:36 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

Now that we have the ability to configure the basic settings on the device
we can start allocating and configuring a netdev for the interface.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile       |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h        |   92 +++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   98 ++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |  230 +++++++++++++++++++++++
 4 files changed, 418 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index 0904bcb..ee4e415 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -28,5 +28,5 @@
 obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
-	      fm10k_pf.o \
+	      fm10k_netdev.o fm10k_pf.o \
 	      fm10k_mbx.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 3da3a99..fa3e5e1 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -27,15 +27,102 @@
 #include <linux/if_vlan.h>
 #include <linux/pci.h>
 
-#include "fm10k_common.h"
+#include "fm10k_pf.h"
+
+#define FM10K_MAX_JUMBO_FRAME_SIZE	15358	/* Maximum supported size 15K */
+
+enum fm10k_ring_f_enum {
+	RING_F_RSS,
+	RING_F_QOS,
+	RING_F_ARRAY_SIZE  /* must be last in enum set */
+};
+
+struct fm10k_ring_feature {
+	u16 limit;	/* upper limit on feature indices */
+	u16 indices;	/* current value of indices */
+	u16 mask;	/* Mask used for feature to ring mapping */
+	u16 offset;	/* offset to start of feature */
+};
+
+#define fm10k_vxlan_port_for_each(vp, intfc) \
+	list_for_each_entry(vp, &(intfc)->vxlan_port, list)
+struct fm10k_vxlan_port {
+	struct list_head	list;
+	sa_family_t		sa_family;
+	__be16			port;
+};
 
 struct fm10k_intfc {
+	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+	struct net_device *netdev;
 	struct pci_dev *pdev;
+	unsigned long state;
+
+	u32 flags;
+#define FM10K_FLAG_RESET_REQUESTED		(u32)(1 << 0)
+#define FM10K_FLAG_RSS_FIELD_IPV4_UDP		(u32)(1 << 1)
+#define FM10K_FLAG_RSS_FIELD_IPV6_UDP		(u32)(1 << 2)
+#define FM10K_FLAG_RX_TS_ENABLED		(u32)(1 << 3)
+#define FM10K_FLAG_SWPRI_CONFIG			(u32)(1 << 4)
+	int xcast_mode;
+
+	u64 rx_overrun_pf;
+	u64 rx_overrun_vf;
 
+	struct fm10k_ring_feature ring_feature[RING_F_ARRAY_SIZE];
+
+	struct fm10k_hw_stats stats;
 	struct fm10k_hw hw;
 	u32 __iomem *uc_addr;
+	u16 msg_enable;
+
+	u32 reta[FM10K_RETA_SIZE];
+	u32 rssrk[FM10K_RSSRK_SIZE];
+
+	/* VXLAN port tracking information */
+	struct list_head vxlan_port;
+
+#if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
+	u8 pfc_en;
+#endif
+	u8 rx_pause;
+
+	/* GLORT resources in use by PF */
+	u16 glort;
+	u16 glort_count;
+
+	/* VLAN ID for updating multicast/unicast lists */
+	u16 vid;
+};
+
+enum fm10k_state_t {
+	__FM10K_RESETTING,
+	__FM10K_DOWN,
+	__FM10K_MBX_LOCK,
+	__FM10K_LINK_DOWN,
 };
 
+static inline void fm10k_mbx_lock(struct fm10k_intfc *interface)
+{
+	/* busy loop if we cannot obtain the lock as some calls
+	 * such as ndo_set_rx_mode may be made in atomic context
+	 */
+	while (test_and_set_bit(__FM10K_MBX_LOCK, &interface->state))
+		udelay(20);
+}
+
+static inline void fm10k_mbx_unlock(struct fm10k_intfc *interface)
+{
+	/* flush memory to make sure state is correct */
+	smp_mb__before_clear_bit();
+	clear_bit(__FM10K_MBX_LOCK, &interface->state);
+}
+
+static inline int fm10k_mbx_trylock(struct fm10k_intfc *interface)
+{
+	return !test_and_set_bit(__FM10K_MBX_LOCK, &interface->state);
+}
+
 /* main */
 extern char fm10k_driver_name[];
 extern const char fm10k_driver_version[];
@@ -43,4 +130,7 @@ extern const char fm10k_driver_version[];
 /* PCI */
 int fm10k_register_pci_driver(void);
 void fm10k_unregister_pci_driver(void);
+
+/* Netdev */
+struct net_device *fm10k_alloc_netdev(void);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
new file mode 100644
index 0000000..9a4f3a6
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -0,0 +1,98 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k.h"
+
+static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
+{
+	dev_kfree_skb_any(skb);
+	return NETDEV_TX_OK;
+}
+
+static int fm10k_change_mtu(struct net_device *dev, int new_mtu)
+{
+	if (new_mtu < 68 || new_mtu > FM10K_MAX_JUMBO_FRAME_SIZE)
+		return -EINVAL;
+
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+static int fm10k_set_mac(struct net_device *dev, void *p)
+{
+	struct sockaddr *addr = p;
+	s32 err = 0;
+
+	if (!is_valid_ether_addr(addr->sa_data))
+		return -EADDRNOTAVAIL;
+
+	if (!err) {
+		ether_addr_copy(dev->dev_addr, addr->sa_data);
+		dev->addr_assign_type &= ~NET_ADDR_RANDOM;
+	}
+
+	return err;
+}
+
+static void fm10k_set_rx_mode(struct net_device *dev)
+{
+}
+
+static const struct net_device_ops fm10k_netdev_ops = {
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_start_xmit		= fm10k_xmit_frame,
+	.ndo_set_mac_address	= fm10k_set_mac,
+	.ndo_change_mtu		= fm10k_change_mtu,
+	.ndo_set_rx_mode	= fm10k_set_rx_mode,
+};
+
+#define DEFAULT_DEBUG_LEVEL_SHIFT 3
+
+struct net_device *fm10k_alloc_netdev(void)
+{
+	struct fm10k_intfc *interface;
+	struct net_device *dev;
+
+	dev = alloc_etherdev(sizeof(struct fm10k_intfc));
+	if (!dev)
+		return NULL;
+
+	/* set net device and ethtool ops */
+	dev->netdev_ops = &fm10k_netdev_ops;
+
+	/* configure default debug level */
+	interface = netdev_priv(dev);
+	interface->msg_enable = (1 << DEFAULT_DEBUG_LEVEL_SHIFT) - 1;
+
+	/* configure default features */
+	dev->features |= NETIF_F_SG;
+
+	/* all features defined to this point should be changeable */
+	dev->hw_features |= dev->features;
+
+	/* configure VLAN features */
+	dev->vlan_features |= dev->features;
+
+	/* configure tunnel offloads */
+	dev->hw_enc_features = NETIF_F_SG;
+
+	return dev;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 6d5e33c..5fcbd1d 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -22,6 +22,10 @@
 
 #include "fm10k.h"
 
+static const struct fm10k_info *fm10k_info_tbl[] = {
+	[fm10k_device_pf] = &fm10k_pf_info,
+};
+
 /**
  * fm10k_pci_tbl - PCI Device ID Table
  *
@@ -32,7 +36,7 @@
  *   Class, Class Mask, private data (not used) }
  */
 static const struct pci_device_id fm10k_pci_tbl[] = {
-	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF) },
+	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF), fm10k_device_pf },
 	/* required last entry */
 	{ 0, }
 };
@@ -62,12 +66,153 @@ u32 fm10k_read_reg(struct fm10k_hw *hw, int reg)
 		return ~value;
 
 	value = readl(&hw_addr[reg]);
-	if (!(~value) && (!reg || !(~readl(hw_addr))))
+	if (!(~value) && (!reg || !(~readl(hw_addr)))) {
+		struct fm10k_intfc *interface = hw->back;
+		struct net_device *netdev = interface->netdev;
+
 		hw->hw_addr = NULL;
+		netif_device_detach(netdev);
+		netdev_err(netdev, "PCIe link lost, device now detached\n");
+	}
 
 	return value;
 }
 
+static int fm10k_hw_ready(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+
+	fm10k_write_flush(hw);
+
+	return FM10K_REMOVED(hw->hw_addr) ? -ENODEV : 0;
+}
+
+/**
+ * fm10k_sw_init - Initialize general software structures
+ * @interface: host interface private structure to initialize
+ *
+ * fm10k_sw_init initializes the interface private data structure.
+ * Fields are initialized based on PCI device information and
+ * OS network device settings (MTU size).
+ **/
+static int fm10k_sw_init(struct fm10k_intfc *interface,
+			 const struct pci_device_id *ent)
+{
+	static const u32 seed[FM10K_RSSRK_SIZE] = { 0xda565a6d, 0xc20e5b25,
+						    0x3d256741, 0xb08fa343,
+						    0xcb2bcad0, 0xb4307bae,
+						    0xa32dcb77, 0x0cf23080,
+						    0x3bb7426a, 0xfa01acbe };
+	const struct fm10k_info *fi = fm10k_info_tbl[ent->driver_data];
+	struct fm10k_hw *hw = &interface->hw;
+	struct pci_dev *pdev = interface->pdev;
+	struct net_device *netdev = interface->netdev;
+	unsigned int rss;
+	int err;
+
+	/* initialize back pointer */
+	hw->back = interface;
+	hw->hw_addr = interface->uc_addr;
+
+	/* PCI config space info */
+	hw->vendor_id = pdev->vendor;
+	hw->device_id = pdev->device;
+	hw->revision_id = pdev->revision;
+	hw->subsystem_vendor_id = pdev->subsystem_vendor;
+	hw->subsystem_device_id = pdev->subsystem_device;
+
+	/* Setup hw api */
+	memcpy(&hw->mac.ops, fi->mac_ops, sizeof(hw->mac.ops));
+	hw->mac.type = fi->mac;
+
+	/* Set common capability flags and settings */
+	rss = min_t(int, FM10K_MAX_RSS_INDICES, num_online_cpus());
+	interface->ring_feature[RING_F_RSS].limit = rss;
+	fi->get_invariants(hw);
+
+	/* pick up the PCIe bus settings for reporting later */
+	if (hw->mac.ops.get_bus_info)
+		hw->mac.ops.get_bus_info(hw);
+
+	/* limit the usable DMA range */
+	if (hw->mac.ops.set_dma_mask)
+		hw->mac.ops.set_dma_mask(hw, dma_get_mask(&pdev->dev));
+
+	/* update netdev with DMA restrictions */
+	if (dma_get_mask(&pdev->dev) > DMA_BIT_MASK(32)) {
+		netdev->features |= NETIF_F_HIGHDMA;
+		netdev->vlan_features |= NETIF_F_HIGHDMA;
+	}
+
+	/* reset and initialize the hardware so it is in a known state */
+	err = hw->mac.ops.reset_hw(hw) ? : hw->mac.ops.init_hw(hw);
+	if (err) {
+		dev_err(&pdev->dev, "init_hw failed: %d\n", err);
+		return err;
+	}
+
+	/* initialize hardware statistics */
+	hw->mac.ops.update_hw_stats(hw, &interface->stats);
+
+	/* Start with random Ethernet address */
+	eth_random_addr(hw->mac.addr);
+
+	/* Initialize MAC address from hardware */
+	err = hw->mac.ops.read_mac_addr(hw);
+	if (err) {
+		dev_warn(&pdev->dev,
+			 "Failed to obtain MAC address defaulting to random\n");
+		/* tag address assignment as random */
+		netdev->addr_assign_type |= NET_ADDR_RANDOM;
+	}
+
+	memcpy(netdev->dev_addr, hw->mac.addr, netdev->addr_len);
+	memcpy(netdev->perm_addr, hw->mac.addr, netdev->addr_len);
+
+	if (!is_valid_ether_addr(netdev->perm_addr)) {
+		dev_err(&pdev->dev, "Invalid MAC Address\n");
+		return -EIO;
+	}
+
+	/* Only the PF can support VXLAN and NVGRE offloads */
+	if (hw->mac.type != fm10k_mac_pf) {
+		netdev->hw_enc_features = 0;
+		netdev->features &= ~NETIF_F_GSO_UDP_TUNNEL;
+		netdev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
+	}
+
+	/* initialize vxlan_port list */
+	INIT_LIST_HEAD(&interface->vxlan_port);
+
+	/* initialize RSS key */
+	memcpy(interface->rssrk, seed, sizeof(seed));
+
+	/* Start off interface as being down */
+	set_bit(__FM10K_DOWN, &interface->state);
+
+	return 0;
+}
+
+static void fm10k_slot_warn(struct fm10k_intfc *interface)
+{
+	struct device *dev = &interface->pdev->dev;
+	struct fm10k_hw *hw = &interface->hw;
+
+	if (hw->mac.ops.is_slot_appropriate(hw))
+		return;
+
+	dev_warn(dev,
+		 "For optimal performance, a %s %s slot is recommended.\n",
+		 (hw->bus_caps.width == fm10k_bus_width_pcie_x1 ? "x1" :
+		  hw->bus_caps.width == fm10k_bus_width_pcie_x4 ? "x4" :
+		  "x8"),
+		 (hw->bus_caps.speed == fm10k_bus_speed_2500 ? "2.5GT/s" :
+		  hw->bus_caps.speed == fm10k_bus_speed_5000 ? "5.0GT/s" :
+		  "8.0GT/s"));
+	dev_warn(dev,
+		 "A slot with more lanes and/or higher speed is suggested.\n");
+}
+
 /**
  * fm10k_probe - Device Initialization Routine
  * @pdev: PCI device information struct
@@ -82,6 +227,9 @@ u32 fm10k_read_reg(struct fm10k_hw *hw, int reg)
 static int fm10k_probe(struct pci_dev *pdev,
 		       const struct pci_device_id *ent)
 {
+	struct net_device *netdev;
+	struct fm10k_intfc *interface;
+	struct fm10k_hw *hw;
 	int err;
 	u64 dma_mask;
 
@@ -122,8 +270,75 @@ static int fm10k_probe(struct pci_dev *pdev,
 	pci_set_master(pdev);
 	pci_save_state(pdev);
 
+	netdev = fm10k_alloc_netdev();
+	if (!netdev) {
+		err = -ENOMEM;
+		goto err_alloc_netdev;
+	}
+
+	SET_NETDEV_DEV(netdev, &pdev->dev);
+
+	interface = netdev_priv(netdev);
+	pci_set_drvdata(pdev, interface);
+
+	interface->netdev = netdev;
+	interface->pdev = pdev;
+	hw = &interface->hw;
+
+	interface->uc_addr = ioremap(pci_resource_start(pdev, 0),
+				     FM10K_UC_ADDR_SIZE);
+	if (!interface->uc_addr) {
+		err = -EIO;
+		goto err_ioremap;
+	}
+
+	err = fm10k_sw_init(interface, ent);
+	if (err)
+		goto err_sw_init;
+
+	/* final check of hardware state before registering the interface */
+	err = fm10k_hw_ready(interface);
+	if (err)
+		goto err_register;
+
+	err = register_netdev(netdev);
+	if (err)
+		goto err_register;
+
+	/* carrier off reporting is important to ethtool even BEFORE open */
+	netif_carrier_off(netdev);
+
+	/* stop all the transmit queues from transmitting until link is up */
+	netif_tx_stop_all_queues(netdev);
+
+	/* print bus type/speed/width info */
+	dev_info(&pdev->dev, "(PCI Express:%s Width: %s Payload: %s)\n",
+		 (hw->bus.speed == fm10k_bus_speed_8000 ? "8.0GT/s" :
+		  hw->bus.speed == fm10k_bus_speed_5000 ? "5.0GT/s" :
+		  hw->bus.speed == fm10k_bus_speed_2500 ? "2.5GT/s" :
+		  "Unknown"),
+		 (hw->bus.width == fm10k_bus_width_pcie_x8 ? "x8" :
+		  hw->bus.width == fm10k_bus_width_pcie_x4 ? "x4" :
+		  hw->bus.width == fm10k_bus_width_pcie_x1 ? "x1" :
+		  "Unknown"),
+		 (hw->bus.payload == fm10k_bus_payload_128 ? "128B" :
+		  hw->bus.payload == fm10k_bus_payload_256 ? "256B" :
+		  hw->bus.payload == fm10k_bus_payload_512 ? "512B" :
+		  "Unknown"));
+
+	/* print warning for non-optimal configurations */
+	fm10k_slot_warn(interface);
+
 	return 0;
 
+err_register:
+err_sw_init:
+	iounmap(interface->uc_addr);
+err_ioremap:
+	free_netdev(netdev);
+err_alloc_netdev:
+	pci_release_selected_regions(pdev,
+				     pci_select_bars(pdev, IORESOURCE_MEM));
 err_pci_reg:
 err_dma:
 	pci_disable_device(pdev);
@@ -141,6 +356,17 @@ err_dma:
  **/
 static void fm10k_remove(struct pci_dev *pdev)
 {
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct net_device *netdev = interface->netdev;
+
+	/* free netdev, this may bounce the interrupts due to setup_tc */
+	if (netdev->reg_state == NETREG_REGISTERED)
+		unregister_netdev(netdev);
+
+	iounmap(interface->uc_addr);
+
+	free_netdev(netdev);
+
 	pci_release_selected_regions(pdev,
 				     pci_select_bars(pdev, IORESOURCE_MEM));
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 10/29] fm10k: Add support for L2 filtering
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (8 preceding siblings ...)
  2014-09-18 22:36 ` [net-next PATCH 09/29] fm10k: Add netdev Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 11/29] fm10k: Add support for ndo_open/stop Alexander Duyck
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for L2 filtering.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |    2 
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |  355 +++++++++++++++++++++++
 2 files changed, 356 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index fa3e5e1..33256fa 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -133,4 +133,6 @@ void fm10k_unregister_pci_driver(void);
 
 /* Netdev */
 struct net_device *fm10k_alloc_netdev(void);
+void fm10k_restore_rx_state(struct fm10k_intfc *);
+void fm10k_reset_rx_state(struct fm10k_intfc *);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 9a4f3a6..cf7b4f3 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -36,24 +36,365 @@ static int fm10k_change_mtu(struct net_device *dev, int new_mtu)
 	return 0;
 }
 
+static int fm10k_uc_vlan_unsync(struct net_device *netdev,
+				const unsigned char *uc_addr)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	u16 glort = interface->glort;
+	u16 vid = interface->vid;
+	bool set = !!(vid / VLAN_N_VID);
+	int err;
+
+	/* drop any leading bits on the VLAN ID */
+	vid &= VLAN_N_VID - 1;
+
+	err = hw->mac.ops.update_uc_addr(hw, glort, uc_addr, vid, set, 0);
+	if (err)
+		return err;
+
+	/* return non-zero value as we are only doing a partial sync/unsync */
+	return 1;
+}
+
+static int fm10k_mc_vlan_unsync(struct net_device *netdev,
+				const unsigned char *mc_addr)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	u16 glort = interface->glort;
+	u16 vid = interface->vid;
+	bool set = !!(vid / VLAN_N_VID);
+	int err;
+
+	/* drop any leading bits on the VLAN ID */
+	vid &= VLAN_N_VID - 1;
+
+	err = hw->mac.ops.update_mc_addr(hw, glort, mc_addr, vid, set);
+	if (err)
+		return err;
+
+	/* return non-zero value as we are only doing a partial sync/unsync */
+	return 1;
+}
+
+static int fm10k_update_vid(struct net_device *netdev, u16 vid, bool set)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	s32 err;
+
+	/* updates do not apply to VLAN 0 */
+	if (!vid)
+		return 0;
+
+	if (vid >= VLAN_N_VID)
+		return -EINVAL;
+
+	/* Verify we have permission to add VLANs */
+	if (hw->mac.vlan_override)
+		return -EACCES;
+
+	/* if default VLAN is already present do nothing */
+	if (vid == hw->mac.default_vid)
+		return -EBUSY;
+
+	/* update active_vlans bitmask */
+	set_bit(vid, interface->active_vlans);
+	if (!set)
+		clear_bit(vid, interface->active_vlans);
+
+	fm10k_mbx_lock(interface);
+
+	/* only need to update the VLAN if not in promiscous mode */
+	if (!(netdev->flags & IFF_PROMISC)) {
+		err = hw->mac.ops.update_vlan(hw, vid, 0, set);
+		if (err)
+			return err;
+	}
+
+	/* update our base MAC address */
+	err = hw->mac.ops.update_uc_addr(hw, interface->glort, hw->mac.addr,
+					 vid, set, 0);
+	if (err)
+		return err;
+
+	/* set vid prior to syncing/unsyncing the VLAN */
+	interface->vid = vid + (set ? VLAN_N_VID : 0);
+
+	/* Update the unicast and multicast address list to add/drop VLAN */
+	__dev_uc_unsync(netdev, fm10k_uc_vlan_unsync);
+	__dev_mc_unsync(netdev, fm10k_mc_vlan_unsync);
+
+	fm10k_mbx_unlock(interface);
+
+	return 0;
+}
+
+static int fm10k_vlan_rx_add_vid(struct net_device *netdev,
+				 __always_unused __be16 proto, u16 vid)
+{
+	/* update VLAN and address table based on changes */
+	return fm10k_update_vid(netdev, vid, true);
+}
+
+static int fm10k_vlan_rx_kill_vid(struct net_device *netdev,
+				  __always_unused __be16 proto, u16 vid)
+{
+	/* update VLAN and address table based on changes */
+	return fm10k_update_vid(netdev, vid, false);
+}
+
+static u16 fm10k_find_next_vlan(struct fm10k_intfc *interface, u16 vid)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	u16 default_vid = hw->mac.default_vid;
+	u16 vid_limit = vid < default_vid ? default_vid : VLAN_N_VID;
+
+	vid = find_next_bit(interface->active_vlans, vid_limit, ++vid);
+
+	return vid;
+}
+
+static void fm10k_clear_unused_vlans(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	u32 vid, prev_vid;
+
+	/* loop through and find any gaps in the table */
+	for (vid = 0, prev_vid = 0;
+	     prev_vid < VLAN_N_VID;
+	     prev_vid = vid + 1, vid = fm10k_find_next_vlan(interface, vid)) {
+		if (prev_vid == vid)
+			continue;
+
+		/* send request to clear multiple bits at a time */
+		prev_vid += (vid - prev_vid - 1) << FM10K_VLAN_LENGTH_SHIFT;
+		hw->mac.ops.update_vlan(hw, prev_vid, 0, false);
+	}
+}
+
+static int __fm10k_uc_sync(struct net_device *dev,
+			   const unsigned char *addr, bool sync)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+	u16 vid, glort = interface->glort;
+	s32 err;
+
+	if (!is_valid_ether_addr(addr))
+		return -EADDRNOTAVAIL;
+
+	/* update table with current entries */
+	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 0;
+	     vid < VLAN_N_VID;
+	     vid = fm10k_find_next_vlan(interface, vid)) {
+		err = hw->mac.ops.update_uc_addr(hw, glort, addr,
+						  vid, sync, 0);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int fm10k_uc_sync(struct net_device *dev,
+			 const unsigned char *addr)
+{
+	return __fm10k_uc_sync(dev, addr, true);
+}
+
+static int fm10k_uc_unsync(struct net_device *dev,
+			   const unsigned char *addr)
+{
+	return __fm10k_uc_sync(dev, addr, false);
+}
+
 static int fm10k_set_mac(struct net_device *dev, void *p)
 {
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
 	struct sockaddr *addr = p;
 	s32 err = 0;
 
 	if (!is_valid_ether_addr(addr->sa_data))
 		return -EADDRNOTAVAIL;
 
+	if (dev->flags & IFF_UP) {
+		/* setting MAC address requires mailbox */
+		fm10k_mbx_lock(interface);
+
+		err = fm10k_uc_sync(dev, addr->sa_data);
+		if (!err)
+			fm10k_uc_unsync(dev, hw->mac.addr);
+
+		fm10k_mbx_unlock(interface);
+	}
+
 	if (!err) {
 		ether_addr_copy(dev->dev_addr, addr->sa_data);
+		ether_addr_copy(hw->mac.addr, addr->sa_data);
 		dev->addr_assign_type &= ~NET_ADDR_RANDOM;
 	}
 
-	return err;
+	/* if we had a mailbox error suggest trying again */
+	return err ? -EAGAIN : 0;
+}
+
+static int __fm10k_mc_sync(struct net_device *dev,
+			   const unsigned char *addr, bool sync)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+	u16 vid, glort = interface->glort;
+	s32 err;
+
+	if (!is_multicast_ether_addr(addr))
+		return -EADDRNOTAVAIL;
+
+	/* update table with current entries */
+	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 0;
+	     vid < VLAN_N_VID;
+	     vid = fm10k_find_next_vlan(interface, vid)) {
+		err = hw->mac.ops.update_mc_addr(hw, glort, addr, vid, sync);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int fm10k_mc_sync(struct net_device *dev,
+			 const unsigned char *addr)
+{
+	return __fm10k_mc_sync(dev, addr, true);
+}
+
+static int fm10k_mc_unsync(struct net_device *dev,
+			   const unsigned char *addr)
+{
+	return __fm10k_mc_sync(dev, addr, false);
 }
 
 static void fm10k_set_rx_mode(struct net_device *dev)
 {
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+	int xcast_mode;
+
+	/* no need to update the harwdare if we are not running */
+	if (!(dev->flags & IFF_UP))
+		return;
+
+	/* determine new mode based on flags */
+	xcast_mode = (dev->flags & IFF_PROMISC) ? FM10K_XCAST_MODE_PROMISC :
+		     (dev->flags & IFF_ALLMULTI) ? FM10K_XCAST_MODE_ALLMULTI :
+		     (dev->flags & (IFF_BROADCAST | IFF_MULTICAST)) ?
+		     FM10K_XCAST_MODE_MULTI : FM10K_XCAST_MODE_NONE;
+
+	fm10k_mbx_lock(interface);
+
+	/* syncronize all of the addresses */
+	if (xcast_mode != FM10K_XCAST_MODE_PROMISC) {
+		__dev_uc_sync(dev, fm10k_uc_sync, fm10k_uc_unsync);
+		if (xcast_mode != FM10K_XCAST_MODE_ALLMULTI)
+			__dev_mc_sync(dev, fm10k_mc_sync, fm10k_mc_unsync);
+	}
+
+	/* if we aren't changing modes there is nothing to do */
+	if (interface->xcast_mode != xcast_mode) {
+		/* update VLAN table */
+		if (xcast_mode == FM10K_XCAST_MODE_PROMISC)
+			hw->mac.ops.update_vlan(hw, FM10K_VLAN_ALL, 0, true);
+		if (interface->xcast_mode == FM10K_XCAST_MODE_PROMISC)
+			fm10k_clear_unused_vlans(interface);
+
+		/* update xcast mode */
+		hw->mac.ops.update_xcast_mode(hw, interface->glort, xcast_mode);
+
+		/* record updated xcast mode state */
+		interface->xcast_mode = xcast_mode;
+	}
+
+	fm10k_mbx_unlock(interface);
+}
+
+void fm10k_restore_rx_state(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int xcast_mode;
+	u16 vid, glort;
+
+	/* record glort for this interface */
+	glort = interface->glort;
+
+	/* convert interface flags to xcast mode */
+	if (netdev->flags & IFF_PROMISC)
+		xcast_mode = FM10K_XCAST_MODE_PROMISC;
+	else if (netdev->flags & IFF_ALLMULTI)
+		xcast_mode = FM10K_XCAST_MODE_ALLMULTI;
+	else if (netdev->flags & (IFF_BROADCAST | IFF_MULTICAST))
+		xcast_mode = FM10K_XCAST_MODE_MULTI;
+	else
+		xcast_mode = FM10K_XCAST_MODE_NONE;
+
+	fm10k_mbx_lock(interface);
+
+	/* Enable logical port */
+	hw->mac.ops.update_lport_state(hw, glort, interface->glort_count, true);
+
+	/* update VLAN table */
+	hw->mac.ops.update_vlan(hw, FM10K_VLAN_ALL, 0,
+				xcast_mode == FM10K_XCAST_MODE_PROMISC);
+
+	/* Add filter for VLAN 0 */
+	hw->mac.ops.update_vlan(hw, 0, 0, true);
+
+	/* update table with current entries */
+	for (vid = hw->mac.default_vid ? fm10k_find_next_vlan(interface, 0) : 0;
+	     vid < VLAN_N_VID;
+	     vid = fm10k_find_next_vlan(interface, vid)) {
+		hw->mac.ops.update_vlan(hw, vid, 0, true);
+		hw->mac.ops.update_uc_addr(hw, glort, hw->mac.addr,
+					   vid, true, 0);
+	}
+
+	/* syncronize all of the addresses */
+	if (xcast_mode != FM10K_XCAST_MODE_PROMISC) {
+		__dev_uc_sync(netdev, fm10k_uc_sync, fm10k_uc_unsync);
+		if (xcast_mode != FM10K_XCAST_MODE_ALLMULTI)
+			__dev_mc_sync(netdev, fm10k_mc_sync, fm10k_mc_unsync);
+	}
+
+	/* update xcast mode */
+	hw->mac.ops.update_xcast_mode(hw, glort, xcast_mode);
+
+	fm10k_mbx_unlock(interface);
+
+	/* record updated xcast mode state */
+	interface->xcast_mode = xcast_mode;
+}
+
+void fm10k_reset_rx_state(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+
+	fm10k_mbx_lock(interface);
+
+	/* clear the logical port state on lower device */
+	hw->mac.ops.update_lport_state(hw, interface->glort,
+				       interface->glort_count, false);
+
+	fm10k_mbx_unlock(interface);
+
+	/* reset flags to default state */
+	interface->xcast_mode = FM10K_XCAST_MODE_NONE;
+
+	/* clear the sync flag since the lport has been dropped */
+	__dev_uc_unsync(netdev, NULL);
+	__dev_mc_unsync(netdev, NULL);
 }
 
 static const struct net_device_ops fm10k_netdev_ops = {
@@ -61,6 +402,8 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_start_xmit		= fm10k_xmit_frame,
 	.ndo_set_mac_address	= fm10k_set_mac,
 	.ndo_change_mtu		= fm10k_change_mtu,
+	.ndo_vlan_rx_add_vid	= fm10k_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid	= fm10k_vlan_rx_kill_vid,
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
 };
 
@@ -94,5 +437,15 @@ struct net_device *fm10k_alloc_netdev(void)
 	/* configure tunnel offloads */
 	dev->hw_enc_features = NETIF_F_SG;
 
+	/* we want to leave these both on as we cannot disable VLAN tag
+	 * insertion or stripping on the hardware since it is contained
+	 * in the FTAG and not in the frame itself.
+	 */
+	dev->features |= NETIF_F_HW_VLAN_CTAG_TX |
+			 NETIF_F_HW_VLAN_CTAG_RX |
+			 NETIF_F_HW_VLAN_CTAG_FILTER;
+
+	dev->priv_flags |= IFF_UNICAST_FLT;
+
 	return dev;
 }

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 11/29] fm10k: Add support for ndo_open/stop
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (9 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 10/29] fm10k: Add support for L2 filtering Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 12/29] fm10k: Add interrupt support Alexander Duyck
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

Add support for brining the interface up/down.  This is still primitive yet
as we have not yet added support for the descriptor queues.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |    4 +
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   68 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |   45 +++++++++++++++
 3 files changed, 117 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 33256fa..c26dc75 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -130,9 +130,13 @@ extern const char fm10k_driver_version[];
 /* PCI */
 int fm10k_register_pci_driver(void);
 void fm10k_unregister_pci_driver(void);
+void fm10k_up(struct fm10k_intfc *interface);
+void fm10k_down(struct fm10k_intfc *interface);
 
 /* Netdev */
 struct net_device *fm10k_alloc_netdev(void);
 void fm10k_restore_rx_state(struct fm10k_intfc *);
 void fm10k_reset_rx_state(struct fm10k_intfc *);
+int fm10k_open(struct net_device *netdev);
+int fm10k_close(struct net_device *netdev);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index cf7b4f3..ca84898 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -20,6 +20,72 @@
 
 #include "fm10k.h"
 
+/**
+ * fm10k_request_glort_range - Request GLORTs for use in configuring rules
+ * @interface: board private structure
+ *
+ * This function allocates a range of glorts for this inteface to use.
+ **/
+static void fm10k_request_glort_range(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	u16 mask = (~hw->mac.dglort_map) >> FM10K_DGLORTMAP_MASK_SHIFT;
+
+	/* establish GLORT base */
+	interface->glort = hw->mac.dglort_map & FM10K_DGLORTMAP_NONE;
+	interface->glort_count = 0;
+
+	/* nothing we can do until mask is allocated */
+	if (hw->mac.dglort_map == FM10K_DGLORTMAP_NONE)
+		return;
+
+	interface->glort_count = mask + 1;
+}
+
+/**
+ * fm10k_open - Called when a network interface is made active
+ * @netdev: network interface device structure
+ *
+ * Returns 0 on success, negative value on failure
+ *
+ * The open entry point is called when a network interface is made
+ * active by the system (IFF_UP).  At this point all resources needed
+ * for transmit and receive operations are allocated, the interrupt
+ * handler is registered with the OS, the watchdog timer is started,
+ * and the stack is notified that the interface is ready.
+ **/
+int fm10k_open(struct net_device *netdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+
+	/* setup GLORT assignment for this port */
+	fm10k_request_glort_range(interface);
+
+	fm10k_up(interface);
+
+	return 0;
+}
+
+/**
+ * fm10k_close - Disables a network interface
+ * @netdev: network interface device structure
+ *
+ * Returns 0, this is not allowed to fail
+ *
+ * The close entry point is called when an interface is de-activated
+ * by the OS.  The hardware is still under the drivers control, but
+ * needs to be disabled.  A global MAC reset is issued to stop the
+ * hardware, and all transmit and receive resources are freed.
+ **/
+int fm10k_close(struct net_device *netdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+
+	fm10k_down(interface);
+
+	return 0;
+}
+
 static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 {
 	dev_kfree_skb_any(skb);
@@ -398,6 +464,8 @@ void fm10k_reset_rx_state(struct fm10k_intfc *interface)
 }
 
 static const struct net_device_ops fm10k_netdev_ops = {
+	.ndo_open		= fm10k_open,
+	.ndo_stop		= fm10k_close,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_start_xmit		= fm10k_xmit_frame,
 	.ndo_set_mac_address	= fm10k_set_mac,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 5fcbd1d..b6d5e72 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -87,6 +87,51 @@ static int fm10k_hw_ready(struct fm10k_intfc *interface)
 	return FM10K_REMOVED(hw->hw_addr) ? -ENODEV : 0;
 }
 
+void fm10k_up(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* Enable Tx/Rx DMA */
+	hw->mac.ops.start_hw(hw);
+
+	/* configure interrupts */
+	hw->mac.ops.update_int_moderator(hw);
+
+	/* clear down bit to indicate we are ready to go */
+	clear_bit(__FM10K_DOWN, &interface->state);
+
+	/* re-establish Rx filters */
+	fm10k_restore_rx_state(interface);
+
+	/* enable transmits */
+	netif_tx_start_all_queues(interface->netdev);
+}
+
+void fm10k_down(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* signal that we are down to the interrupt handler and service task */
+	set_bit(__FM10K_DOWN, &interface->state);
+
+	/* call carrier off first to avoid false dev_watchdog timeouts */
+	netif_carrier_off(netdev);
+
+	/* disable transmits */
+	netif_tx_stop_all_queues(netdev);
+	netif_tx_disable(netdev);
+
+	/* reset Rx filters */
+	fm10k_reset_rx_state(interface);
+
+	/* allow 10ms for device to quiesce */
+	usleep_range(10000, 20000);
+
+	/* Disable DMA engine for Tx/Rx */
+	hw->mac.ops.stop_hw(hw);
+}
+
 /**
  * fm10k_sw_init - Initialize general software structures
  * @interface: host interface private structure to initialize

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 12/29] fm10k: Add interrupt support
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (10 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 11/29] fm10k: Add support for ndo_open/stop Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 13/29] fm10k: add support for Tx/Rx rings Alexander Duyck
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch set adds interrupt support for the fm10k interfaces.  The
interfaces themselves only support MSI-X, so neither MSI or legacy
interrupts are used.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |   59 +++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |  401 ++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   11 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |  501 +++++++++++++++++++++++
 4 files changed, 972 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index c26dc75..3f83eeb 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -31,6 +31,45 @@
 
 #define FM10K_MAX_JUMBO_FRAME_SIZE	15358	/* Maximum supported size 15K */
 
+struct fm10k_ring_container {
+	unsigned int total_bytes;	/* total bytes processed this int */
+	unsigned int total_packets;	/* total packets processed this int */
+	u16 work_limit;			/* total work allowed per interrupt */
+	u16 itr;			/* interrupt throttle rate value */
+	u8 count;			/* total number of rings in vector */
+};
+
+#define FM10K_ITR_MAX		0x0FFF	/* maximum value for ITR */
+#define FM10K_ITR_10K		100	/* 100us */
+#define FM10K_ITR_20K		50	/* 50us */
+#define FM10K_ITR_ADAPTIVE	0x8000	/* adaptive interrupt moderation flag */
+
+#define FM10K_ITR_ENABLE	(FM10K_ITR_AUTOMASK | FM10K_ITR_MASK_CLEAR)
+
+#define MAX_Q_VECTORS 256
+#define MIN_Q_VECTORS	1
+enum fm10k_non_q_vectors {
+	FM10K_MBX_VECTOR,
+	NON_Q_VECTORS_PF
+};
+
+#define NON_Q_VECTORS(hw)	(((hw)->mac.type == fm10k_mac_pf) ? \
+						NON_Q_VECTORS_PF : \
+						0)
+#define MIN_MSIX_COUNT(hw)	(MIN_Q_VECTORS + NON_Q_VECTORS(hw))
+
+struct fm10k_q_vector {
+	struct fm10k_intfc *interface;
+	u32 __iomem *itr;	/* pointer to ITR register for this vector */
+	u16 v_idx;		/* index of q_vector within interface array */
+	struct fm10k_ring_container rx, tx;
+
+	struct napi_struct napi;
+	char name[IFNAMSIZ + 9];
+
+	struct rcu_head rcu;	/* to avoid race with update stats on free */
+};
+
 enum fm10k_ring_f_enum {
 	RING_F_RSS,
 	RING_F_QOS,
@@ -66,15 +105,29 @@ struct fm10k_intfc {
 #define FM10K_FLAG_SWPRI_CONFIG			(u32)(1 << 4)
 	int xcast_mode;
 
+	/* Tx fast path data */
+	int num_tx_queues;
+	u16 tx_itr;
+
+	/* Rx fast path data */
+	int num_rx_queues;
+	u16 rx_itr;
+
 	u64 rx_overrun_pf;
 	u64 rx_overrun_vf;
 
+	/* Queueing vectors */
+	struct fm10k_q_vector *q_vector[MAX_Q_VECTORS];
+	struct msix_entry *msix_entries;
+	int num_q_vectors;	/* current number of q_vectors for device */
 	struct fm10k_ring_feature ring_feature[RING_F_ARRAY_SIZE];
 
 	struct fm10k_hw_stats stats;
 	struct fm10k_hw hw;
 	u32 __iomem *uc_addr;
 	u16 msg_enable;
+	u16 tx_ring_count;
+	u16 rx_ring_count;
 
 	u32 reta[FM10K_RETA_SIZE];
 	u32 rssrk[FM10K_RSSRK_SIZE];
@@ -126,8 +179,14 @@ static inline int fm10k_mbx_trylock(struct fm10k_intfc *interface)
 /* main */
 extern char fm10k_driver_name[];
 extern const char fm10k_driver_version[];
+int fm10k_init_queueing_scheme(struct fm10k_intfc *interface);
+void fm10k_clear_queueing_scheme(struct fm10k_intfc *interface);
 
 /* PCI */
+void fm10k_mbx_free_irq(struct fm10k_intfc *);
+int fm10k_mbx_request_irq(struct fm10k_intfc *);
+void fm10k_qv_free_irq(struct fm10k_intfc *interface);
+int fm10k_qv_request_irq(struct fm10k_intfc *interface);
 int fm10k_register_pci_driver(void);
 void fm10k_unregister_pci_driver(void);
 void fm10k_up(struct fm10k_intfc *interface);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 6ca0614..b0a2ba1 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -66,3 +66,404 @@ static void __exit fm10k_exit_module(void)
 	fm10k_unregister_pci_driver();
 }
 module_exit(fm10k_exit_module);
+
+/**
+ * fm10k_update_itr - update the dynamic ITR value based on packet size
+ *
+ *      Stores a new ITR value based on strictly on packet size.  The
+ *      divisors and thresholds used by this function were determined based
+ *      on theoretical maximum wire speed and testing data, in order to
+ *      minimize response time while increasing bulk throughput.
+ *
+ * @ring_container: Container for rings to have ITR updated
+ **/
+static void fm10k_update_itr(struct fm10k_ring_container *ring_container)
+{
+	unsigned int avg_wire_size, packets;
+
+	/* Only update ITR if we are using adaptive setting */
+	if (!(ring_container->itr & FM10K_ITR_ADAPTIVE))
+		goto clear_counts;
+
+	packets = ring_container->total_packets;
+	if (!packets)
+		goto clear_counts;
+
+	avg_wire_size = ring_container->total_bytes / packets;
+
+	/* Add 24 bytes to size to account for CRC, preamble, and gap */
+	avg_wire_size += 24;
+
+	/* Don't starve jumbo frames */
+	if (avg_wire_size > 3000)
+		avg_wire_size = 3000;
+
+	/* Give a little boost to mid-size frames */
+	if ((avg_wire_size > 300) && (avg_wire_size < 1200))
+		avg_wire_size /= 3;
+	else
+		avg_wire_size /= 2;
+
+	/* write back value and retain adaptive flag */
+	ring_container->itr = avg_wire_size | FM10K_ITR_ADAPTIVE;
+
+clear_counts:
+	ring_container->total_bytes = 0;
+	ring_container->total_packets = 0;
+}
+
+static void fm10k_qv_enable(struct fm10k_q_vector *q_vector)
+{
+	/* Enable auto-mask and clear the current mask */
+	u32 itr = FM10K_ITR_ENABLE;
+
+	/* Update Tx ITR */
+	fm10k_update_itr(&q_vector->tx);
+
+	/* Update Rx ITR */
+	fm10k_update_itr(&q_vector->rx);
+
+	/* Store Tx itr in timer slot 0 */
+	itr |= (q_vector->tx.itr & FM10K_ITR_MAX);
+
+	/* Shift Rx itr to timer slot 1 */
+	itr |= (q_vector->rx.itr & FM10K_ITR_MAX) << FM10K_ITR_INTERVAL1_SHIFT;
+
+	/* Write the final value to the ITR register */
+	writel(itr, q_vector->itr);
+}
+
+static int fm10k_poll(struct napi_struct *napi, int budget)
+{
+	struct fm10k_q_vector *q_vector =
+			       container_of(napi, struct fm10k_q_vector, napi);
+
+	/* all work done, exit the polling mode */
+	napi_complete(napi);
+
+	/* re-enable the q_vector */
+	fm10k_qv_enable(q_vector);
+
+	return 0;
+}
+
+/**
+ * fm10k_set_num_queues: Allocate queues for device, feature dependent
+ * @interface: board private structure to initialize
+ *
+ * This is the top level queue allocation routine.  The order here is very
+ * important, starting with the "most" number of features turned on at once,
+ * and ending with the smallest set of features.  This way large combinations
+ * can be allocated if they're turned on, and smaller combinations are the
+ * fallthrough conditions.
+ *
+ **/
+static void fm10k_set_num_queues(struct fm10k_intfc *interface)
+{
+	/* Start with base case */
+	interface->num_rx_queues = 1;
+	interface->num_tx_queues = 1;
+}
+
+/**
+ * fm10k_alloc_q_vector - Allocate memory for a single interrupt vector
+ * @interface: board private structure to initialize
+ * @v_count: q_vectors allocated on interface, used for ring interleaving
+ * @v_idx: index of vector in interface struct
+ * @txr_count: total number of Tx rings to allocate
+ * @txr_idx: index of first Tx ring to allocate
+ * @rxr_count: total number of Rx rings to allocate
+ * @rxr_idx: index of first Rx ring to allocate
+ *
+ * We allocate one q_vector.  If allocation fails we return -ENOMEM.
+ **/
+static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
+				unsigned int v_count, unsigned int v_idx,
+				unsigned int txr_count, unsigned int txr_idx,
+				unsigned int rxr_count, unsigned int rxr_idx)
+{
+	struct fm10k_q_vector *q_vector;
+	int ring_count, size;
+
+	ring_count = txr_count + rxr_count;
+	size = sizeof(struct fm10k_q_vector);
+
+	/* allocate q_vector and rings */
+	q_vector = kzalloc(size, GFP_KERNEL);
+	if (!q_vector)
+		return -ENOMEM;
+
+	/* initialize NAPI */
+	netif_napi_add(interface->netdev, &q_vector->napi,
+		       fm10k_poll, NAPI_POLL_WEIGHT);
+
+	/* tie q_vector and interface together */
+	interface->q_vector[v_idx] = q_vector;
+	q_vector->interface = interface;
+	q_vector->v_idx = v_idx;
+
+	/* save Tx ring container info */
+	q_vector->tx.itr = interface->tx_itr;
+	q_vector->tx.count = txr_count;
+
+	/* save Rx ring container info */
+	q_vector->rx.itr = interface->rx_itr;
+	q_vector->rx.count = rxr_count;
+
+	return 0;
+}
+
+/**
+ * fm10k_free_q_vector - Free memory allocated for specific interrupt vector
+ * @interface: board private structure to initialize
+ * @v_idx: Index of vector to be freed
+ *
+ * This function frees the memory allocated to the q_vector.  In addition if
+ * NAPI is enabled it will delete any references to the NAPI struct prior
+ * to freeing the q_vector.
+ **/
+static void fm10k_free_q_vector(struct fm10k_intfc *interface, int v_idx)
+{
+	struct fm10k_q_vector *q_vector = interface->q_vector[v_idx];
+
+	interface->q_vector[v_idx] = NULL;
+	netif_napi_del(&q_vector->napi);
+	kfree_rcu(q_vector, rcu);
+}
+
+/**
+ * fm10k_alloc_q_vectors - Allocate memory for interrupt vectors
+ * @interface: board private structure to initialize
+ *
+ * We allocate one q_vector per queue interrupt.  If allocation fails we
+ * return -ENOMEM.
+ **/
+static int fm10k_alloc_q_vectors(struct fm10k_intfc *interface)
+{
+	unsigned int q_vectors = interface->num_q_vectors;
+	unsigned int rxr_remaining = interface->num_rx_queues;
+	unsigned int txr_remaining = interface->num_tx_queues;
+	unsigned int rxr_idx = 0, txr_idx = 0, v_idx = 0;
+	int err;
+
+	if (q_vectors >= (rxr_remaining + txr_remaining)) {
+		for (; rxr_remaining; v_idx++) {
+			err = fm10k_alloc_q_vector(interface, q_vectors, v_idx,
+						   0, 0, 1, rxr_idx);
+			if (err)
+				goto err_out;
+
+			/* update counts and index */
+			rxr_remaining--;
+			rxr_idx++;
+		}
+	}
+
+	for (; v_idx < q_vectors; v_idx++) {
+		int rqpv = DIV_ROUND_UP(rxr_remaining, q_vectors - v_idx);
+		int tqpv = DIV_ROUND_UP(txr_remaining, q_vectors - v_idx);
+
+		err = fm10k_alloc_q_vector(interface, q_vectors, v_idx,
+					   tqpv, txr_idx,
+					   rqpv, rxr_idx);
+
+		if (err)
+			goto err_out;
+
+		/* update counts and index */
+		rxr_remaining -= rqpv;
+		txr_remaining -= tqpv;
+		rxr_idx++;
+		txr_idx++;
+	}
+
+	return 0;
+
+err_out:
+	interface->num_tx_queues = 0;
+	interface->num_rx_queues = 0;
+	interface->num_q_vectors = 0;
+
+	while (v_idx--)
+		fm10k_free_q_vector(interface, v_idx);
+
+	return -ENOMEM;
+}
+
+/**
+ * fm10k_free_q_vectors - Free memory allocated for interrupt vectors
+ * @interface: board private structure to initialize
+ *
+ * This function frees the memory allocated to the q_vectors.  In addition if
+ * NAPI is enabled it will delete any references to the NAPI struct prior
+ * to freeing the q_vector.
+ **/
+static void fm10k_free_q_vectors(struct fm10k_intfc *interface)
+{
+	int v_idx = interface->num_q_vectors;
+
+	interface->num_tx_queues = 0;
+	interface->num_rx_queues = 0;
+	interface->num_q_vectors = 0;
+
+	while (v_idx--)
+		fm10k_free_q_vector(interface, v_idx);
+}
+
+/**
+ * f10k_reset_msix_capability - reset MSI-X capability
+ * @interface: board private structure to initialize
+ *
+ * Reset the MSI-X capability back to its starting state
+ **/
+static void fm10k_reset_msix_capability(struct fm10k_intfc *interface)
+{
+	pci_disable_msix(interface->pdev);
+	kfree(interface->msix_entries);
+	interface->msix_entries = NULL;
+}
+
+/**
+ * f10k_init_msix_capability - configure MSI-X capability
+ * @interface: board private structure to initialize
+ *
+ * Attempt to configure the interrupts using the best available
+ * capabilities of the hardware and the kernel.
+ **/
+static int fm10k_init_msix_capability(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	int v_budget, vector;
+
+	/* It's easy to be greedy for MSI-X vectors, but it really
+	 * doesn't do us much good if we have a lot more vectors
+	 * than CPU's.  So let's be conservative and only ask for
+	 * (roughly) the same number of vectors as there are CPU's.
+	 * the default is to use pairs of vectors
+	 */
+	v_budget = max(interface->num_rx_queues, interface->num_tx_queues);
+	v_budget = min_t(u16, v_budget, num_online_cpus());
+
+	/* account for vectors not related to queues */
+	v_budget += NON_Q_VECTORS(hw);
+
+	/* At the same time, hardware can only support a maximum of
+	 * hw.mac->max_msix_vectors vectors.  With features
+	 * such as RSS and VMDq, we can easily surpass the number of Rx and Tx
+	 * descriptor queues supported by our device.  Thus, we cap it off in
+	 * those rare cases where the cpu count also exceeds our vector limit.
+	 */
+	v_budget = min_t(int, v_budget, hw->mac.max_msix_vectors);
+
+	/* A failure in MSI-X entry allocation is fatal. */
+	interface->msix_entries = kcalloc(v_budget, sizeof(struct msix_entry),
+					  GFP_KERNEL);
+	if (!interface->msix_entries)
+		return -ENOMEM;
+
+	/* populate entry values */
+	for (vector = 0; vector < v_budget; vector++)
+		interface->msix_entries[vector].entry = vector;
+
+	/* Attempt to enable MSI-X with requested value */
+	v_budget = pci_enable_msix_range(interface->pdev,
+					 interface->msix_entries,
+					 MIN_MSIX_COUNT(hw),
+					 v_budget);
+	if (v_budget < 0) {
+		kfree(interface->msix_entries);
+		interface->msix_entries = NULL;
+		return -ENOMEM;
+	}
+
+	/* record the number of queues available for q_vectors */
+	interface->num_q_vectors = v_budget - NON_Q_VECTORS(hw);
+
+	return 0;
+}
+
+static void fm10k_init_reta(struct fm10k_intfc *interface)
+{
+	u16 i, rss_i = interface->ring_feature[RING_F_RSS].indices;
+	u32 reta, base;
+
+	/* If the netdev is initialized we have to maintain table if possible */
+	if (interface->netdev->reg_state) {
+		for (i = FM10K_RETA_SIZE; i--;) {
+			reta = interface->reta[i];
+			if ((((reta << 24) >> 24) < rss_i) &&
+			    (((reta << 16) >> 24) < rss_i) &&
+			    (((reta <<  8) >> 24) < rss_i) &&
+			    (((reta)       >> 24) < rss_i))
+				continue;
+			goto repopulate_reta;
+		}
+
+		/* do nothing if all of the elements are in bounds */
+		return;
+	}
+
+repopulate_reta:
+	/* Populate the redirection table 4 entries at a time.  To do this
+	 * we are generating the results for n and n+2 and then interleaving
+	 * those with the results with n+1 and n+3.
+	 */
+	for (i = FM10K_RETA_SIZE; i--;) {
+		/* first pass generates n and n+2 */
+		base = ((i * 0x00040004) + 0x00020000) * rss_i;
+		reta = (base & 0x3F803F80) >> 7;
+
+		/* second pass generates n+1 and n+3 */
+		base += 0x00010001 * rss_i;
+		reta |= (base & 0x3F803F80) << 1;
+
+		interface->reta[i] = reta;
+	}
+}
+
+/**
+ * fm10k_init_queueing_scheme - Determine proper queueing scheme
+ * @interface: board private structure to initialize
+ *
+ * We determine which queueing scheme to use based on...
+ * - Hardware queue count (num_*_queues)
+ *   - defined by miscellaneous hardware support/features (RSS, etc.)
+ **/
+int fm10k_init_queueing_scheme(struct fm10k_intfc *interface)
+{
+	int err;
+
+	/* Number of supported queues */
+	fm10k_set_num_queues(interface);
+
+	/* Configure MSI-X capability */
+	err = fm10k_init_msix_capability(interface);
+	if (err) {
+		dev_err(&interface->pdev->dev,
+			"Unable to initialize MSI-X capability\n");
+		return err;
+	}
+
+	/* Allocate memory for queues */
+	err = fm10k_alloc_q_vectors(interface);
+	if (err)
+		return err;
+
+	/* Initialize RSS redirection table */
+	fm10k_init_reta(interface);
+
+	return 0;
+}
+
+/**
+ * fm10k_clear_queueing_scheme - Clear the current queueing scheme settings
+ * @interface: board private structure to clear queueing scheme on
+ *
+ * We go through and clear queueing specific resources and reset the structure
+ * to pre-load conditions
+ **/
+void fm10k_clear_queueing_scheme(struct fm10k_intfc *interface)
+{
+	fm10k_free_q_vectors(interface);
+	fm10k_reset_msix_capability(interface);
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index ca84898..487efcb 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -57,6 +57,12 @@ static void fm10k_request_glort_range(struct fm10k_intfc *interface)
 int fm10k_open(struct net_device *netdev)
 {
 	struct fm10k_intfc *interface = netdev_priv(netdev);
+	int err;
+
+	/* allocate interrupt resources */
+	err = fm10k_qv_request_irq(interface);
+	if (err)
+		goto err_req_irq;
 
 	/* setup GLORT assignment for this port */
 	fm10k_request_glort_range(interface);
@@ -64,6 +70,9 @@ int fm10k_open(struct net_device *netdev)
 	fm10k_up(interface);
 
 	return 0;
+
+err_req_irq:
+	return err;
 }
 
 /**
@@ -83,6 +92,8 @@ int fm10k_close(struct net_device *netdev)
 
 	fm10k_down(interface);
 
+	fm10k_qv_free_irq(interface);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index b6d5e72..fdef40a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -87,6 +87,469 @@ static int fm10k_hw_ready(struct fm10k_intfc *interface)
 	return FM10K_REMOVED(hw->hw_addr) ? -ENODEV : 0;
 }
 
+static void fm10k_napi_enable_all(struct fm10k_intfc *interface)
+{
+	struct fm10k_q_vector *q_vector;
+	int q_idx;
+
+	for (q_idx = 0; q_idx < interface->num_q_vectors; q_idx++) {
+		q_vector = interface->q_vector[q_idx];
+		napi_enable(&q_vector->napi);
+	}
+}
+
+static irqreturn_t fm10k_msix_clean_rings(int irq, void *data)
+{
+	struct fm10k_q_vector *q_vector = data;
+
+	if (q_vector->rx.count || q_vector->tx.count)
+		napi_schedule(&q_vector->napi);
+
+	return IRQ_HANDLED;
+}
+
+#define FM10K_ERR_MSG(type) case (type): error = #type; break
+static void fm10k_print_fault(struct fm10k_intfc *interface, int type,
+			      struct fm10k_fault *fault)
+{
+	struct pci_dev *pdev = interface->pdev;
+	char *error;
+
+	switch (type) {
+	case FM10K_PCA_FAULT:
+		switch (fault->type) {
+		default:
+			error = "Unknown PCA error";
+			break;
+		FM10K_ERR_MSG(PCA_NO_FAULT);
+		FM10K_ERR_MSG(PCA_UNMAPPED_ADDR);
+		FM10K_ERR_MSG(PCA_BAD_QACCESS_PF);
+		FM10K_ERR_MSG(PCA_BAD_QACCESS_VF);
+		FM10K_ERR_MSG(PCA_MALICIOUS_REQ);
+		FM10K_ERR_MSG(PCA_POISONED_TLP);
+		FM10K_ERR_MSG(PCA_TLP_ABORT);
+		}
+		break;
+	case FM10K_THI_FAULT:
+		switch (fault->type) {
+		default:
+			error = "Unknown THI error";
+			break;
+		FM10K_ERR_MSG(THI_NO_FAULT);
+		FM10K_ERR_MSG(THI_MAL_DIS_Q_FAULT);
+		}
+		break;
+	case FM10K_FUM_FAULT:
+		switch (fault->type) {
+		default:
+			error = "Unknown FUM error";
+			break;
+		FM10K_ERR_MSG(FUM_NO_FAULT);
+		FM10K_ERR_MSG(FUM_UNMAPPED_ADDR);
+		FM10K_ERR_MSG(FUM_BAD_VF_QACCESS);
+		FM10K_ERR_MSG(FUM_ADD_DECODE_ERR);
+		FM10K_ERR_MSG(FUM_RO_ERROR);
+		FM10K_ERR_MSG(FUM_QPRC_CRC_ERROR);
+		FM10K_ERR_MSG(FUM_CSR_TIMEOUT);
+		FM10K_ERR_MSG(FUM_INVALID_TYPE);
+		FM10K_ERR_MSG(FUM_INVALID_LENGTH);
+		FM10K_ERR_MSG(FUM_INVALID_BE);
+		FM10K_ERR_MSG(FUM_INVALID_ALIGN);
+		}
+		break;
+	default:
+		error = "Undocumented fault";
+		break;
+	}
+
+	dev_warn(&pdev->dev,
+		 "%s Address: 0x%llx SpecInfo: 0x%x Func: %02x.%0x\n",
+		 error, fault->address, fault->specinfo,
+		 PCI_SLOT(fault->func), PCI_FUNC(fault->func));
+}
+
+static void fm10k_report_fault(struct fm10k_intfc *interface, u32 eicr)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_fault fault = { 0 };
+	int type, err;
+
+	for (eicr &= FM10K_EICR_FAULT_MASK, type = FM10K_PCA_FAULT;
+	     eicr;
+	     eicr >>= 1, type += FM10K_FAULT_SIZE) {
+		/* only check if there is an error reported */
+		if (!(eicr & 0x1))
+			continue;
+
+		/* retrieve fault info */
+		err = hw->mac.ops.get_fault(hw, type, &fault);
+		if (err) {
+			dev_err(&interface->pdev->dev,
+				"error reading fault\n");
+			continue;
+		}
+
+		fm10k_print_fault(interface, type, &fault);
+	}
+}
+
+static void fm10k_reset_drop_on_empty(struct fm10k_intfc *interface, u32 eicr)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	const u32 rxdctl = FM10K_RXDCTL_WRITE_BACK_MIN_DELAY;
+	u32 maxholdq;
+	int q;
+
+	if (!(eicr & FM10K_EICR_MAXHOLDTIME))
+		return;
+
+	maxholdq = fm10k_read_reg(hw, FM10K_MAXHOLDQ(7));
+	if (maxholdq)
+		fm10k_write_reg(hw, FM10K_MAXHOLDQ(7), maxholdq);
+	for (q = 255;;) {
+		if (maxholdq & (1 << 31)) {
+			if (q < FM10K_MAX_QUEUES_PF) {
+				interface->rx_overrun_pf++;
+				fm10k_write_reg(hw, FM10K_RXDCTL(q), rxdctl);
+			} else {
+				interface->rx_overrun_vf++;
+			}
+		}
+
+		maxholdq *= 2;
+		if (!maxholdq)
+			q &= ~(32 - 1);
+
+		if (!q)
+			break;
+
+		if (q-- % 32)
+			continue;
+
+		maxholdq = fm10k_read_reg(hw, FM10K_MAXHOLDQ(q / 32));
+		if (maxholdq)
+			fm10k_write_reg(hw, FM10K_MAXHOLDQ(q / 32), maxholdq);
+	}
+}
+
+static irqreturn_t fm10k_msix_mbx_pf(int irq, void *data)
+{
+	struct fm10k_intfc *interface = data;
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 eicr;
+
+	/* unmask any set bits related to this interrupt */
+	eicr = fm10k_read_reg(hw, FM10K_EICR);
+	fm10k_write_reg(hw, FM10K_EICR, eicr & (FM10K_EICR_MAILBOX |
+						FM10K_EICR_SWITCHREADY |
+						FM10K_EICR_SWITCHNOTREADY));
+
+	/* report any faults found to the message log */
+	fm10k_report_fault(interface, eicr);
+
+	/* reset any queues disabled due to receiver overrun */
+	fm10k_reset_drop_on_empty(interface, eicr);
+
+	/* service mailboxes */
+	if (fm10k_mbx_trylock(interface)) {
+		mbx->ops.process(hw, mbx);
+		fm10k_mbx_unlock(interface);
+	}
+
+	/* re-enable mailbox interrupt and indicate 20us delay */
+	fm10k_write_reg(hw, FM10K_ITR(FM10K_MBX_VECTOR),
+			FM10K_ITR_ENABLE | FM10K_MBX_INT_DELAY);
+
+	return IRQ_HANDLED;
+}
+
+void fm10k_mbx_free_irq(struct fm10k_intfc *interface)
+{
+	struct msix_entry *entry = &interface->msix_entries[FM10K_MBX_VECTOR];
+	struct fm10k_hw *hw = &interface->hw;
+	int itr_reg;
+
+	/* disconnect the mailbox */
+	hw->mbx.ops.disconnect(hw, &hw->mbx);
+
+	/* disable Mailbox cause */
+	if (hw->mac.type == fm10k_mac_pf) {
+		fm10k_write_reg(hw, FM10K_EIMR,
+				FM10K_EIMR_DISABLE(PCA_FAULT) |
+				FM10K_EIMR_DISABLE(FUM_FAULT) |
+				FM10K_EIMR_DISABLE(MAILBOX) |
+				FM10K_EIMR_DISABLE(SWITCHREADY) |
+				FM10K_EIMR_DISABLE(SWITCHNOTREADY) |
+				FM10K_EIMR_DISABLE(SRAMERROR) |
+				FM10K_EIMR_DISABLE(VFLR) |
+				FM10K_EIMR_DISABLE(MAXHOLDTIME));
+		itr_reg = FM10K_ITR(FM10K_MBX_VECTOR);
+	}
+
+	fm10k_write_reg(hw, itr_reg, FM10K_ITR_MASK_SET);
+
+	free_irq(entry->vector, interface);
+}
+
+/* generic error handler for mailbox issues */
+static s32 fm10k_mbx_error(struct fm10k_hw *hw, u32 **results,
+			   struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	struct pci_dev *pdev = interface->pdev;
+
+	dev_err(&pdev->dev, "Unknown message ID %u\n",
+		**results & FM10K_TLV_ID_MASK);
+
+	return 0;
+}
+
+static s32 fm10k_lport_map(struct fm10k_hw *hw, u32 **results,
+			   struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	u32 dglort_map = hw->mac.dglort_map;
+	s32 err;
+
+	err = fm10k_msg_lport_map_pf(hw, results, mbx);
+	if (err)
+		return err;
+
+	/* we need to reset if port count was just updated */
+	if (dglort_map != hw->mac.dglort_map)
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+
+	return 0;
+}
+
+static s32 fm10k_update_pvid(struct fm10k_hw *hw, u32 **results,
+			     struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	u16 glort, pvid;
+	u32 pvid_update;
+	s32 err;
+
+	err = fm10k_tlv_attr_get_u32(results[FM10K_PF_ATTR_ID_UPDATE_PVID],
+				     &pvid_update);
+	if (err)
+		return err;
+
+	/* extract values from the pvid update */
+	glort = FM10K_MSG_HDR_FIELD_GET(pvid_update, UPDATE_PVID_GLORT);
+	pvid = FM10K_MSG_HDR_FIELD_GET(pvid_update, UPDATE_PVID_PVID);
+
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	/* verify VID is valid */
+	if (pvid >= FM10K_VLAN_TABLE_VID_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* we need to reset if default VLAN was just updated */
+	if (pvid != hw->mac.default_vid)
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+
+	hw->mac.default_vid = pvid;
+
+	return 0;
+}
+
+static const struct fm10k_msg_data pf_mbx_data[] = {
+	FM10K_PF_MSG_ERR_HANDLER(XCAST_MODES, fm10k_msg_err_pf),
+	FM10K_PF_MSG_ERR_HANDLER(UPDATE_MAC_FWD_RULE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_LPORT_MAP_HANDLER(fm10k_lport_map),
+	FM10K_PF_MSG_ERR_HANDLER(LPORT_CREATE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_ERR_HANDLER(LPORT_DELETE, fm10k_msg_err_pf),
+	FM10K_PF_MSG_UPDATE_PVID_HANDLER(fm10k_update_pvid),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_mbx_error),
+};
+
+static int fm10k_mbx_request_irq_pf(struct fm10k_intfc *interface)
+{
+	struct msix_entry *entry = &interface->msix_entries[FM10K_MBX_VECTOR];
+	struct net_device *dev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	/* Use timer0 for interrupt moderation on the mailbox */
+	u32 mbx_itr = FM10K_INT_MAP_TIMER0 | entry->entry;
+	u32 other_itr = FM10K_INT_MAP_IMMEDIATE | entry->entry;
+
+	/* register mailbox handlers */
+	err = hw->mbx.ops.register_handlers(&hw->mbx, pf_mbx_data);
+	if (err)
+		return err;
+
+	/* request the IRQ */
+	err = request_irq(entry->vector, fm10k_msix_mbx_pf, 0,
+			  dev->name, interface);
+	if (err) {
+		netif_err(interface, probe, dev,
+			  "request_irq for msix_mbx failed: %d\n", err);
+		return err;
+	}
+
+	/* Enable interrupts w/ no moderation for "other" interrupts */
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_PCIeFault), other_itr);
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_SwitchUpDown), other_itr);
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_SRAM), other_itr);
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_MaxHoldTime), other_itr);
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_VFLR), other_itr);
+
+	/* Enable interrupts w/ moderation for mailbox */
+	fm10k_write_reg(hw, FM10K_INT_MAP(fm10k_int_Mailbox), mbx_itr);
+
+	/* Enable individual interrupt causes */
+	fm10k_write_reg(hw, FM10K_EIMR, FM10K_EIMR_ENABLE(PCA_FAULT) |
+					FM10K_EIMR_ENABLE(FUM_FAULT) |
+					FM10K_EIMR_ENABLE(MAILBOX) |
+					FM10K_EIMR_ENABLE(SWITCHREADY) |
+					FM10K_EIMR_ENABLE(SWITCHNOTREADY) |
+					FM10K_EIMR_ENABLE(SRAMERROR) |
+					FM10K_EIMR_ENABLE(VFLR) |
+					FM10K_EIMR_ENABLE(MAXHOLDTIME));
+
+	/* enable interrupt */
+	fm10k_write_reg(hw, FM10K_ITR(entry->entry), FM10K_ITR_ENABLE);
+
+	return 0;
+}
+
+int fm10k_mbx_request_irq(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	/* enable Mailbox cause */
+	err = fm10k_mbx_request_irq_pf(interface);
+
+	/* connect mailbox */
+	if (!err)
+		err = hw->mbx.ops.connect(hw, &hw->mbx);
+
+	return err;
+}
+
+/**
+ * fm10k_qv_free_irq - release interrupts associated with queue vectors
+ * @interface: board private structure
+ *
+ * Release all interrupts associated with this interface
+ **/
+void fm10k_qv_free_irq(struct fm10k_intfc *interface)
+{
+	int vector = interface->num_q_vectors;
+	struct fm10k_hw *hw = &interface->hw;
+	struct msix_entry *entry;
+
+	entry = &interface->msix_entries[NON_Q_VECTORS(hw) + vector];
+
+	while (vector) {
+		struct fm10k_q_vector *q_vector;
+
+		vector--;
+		entry--;
+		q_vector = interface->q_vector[vector];
+
+		if (!q_vector->tx.count && !q_vector->rx.count)
+			continue;
+
+		/* disable interrupts */
+
+		writel(FM10K_ITR_MASK_SET, q_vector->itr);
+
+		free_irq(entry->vector, q_vector);
+	}
+}
+
+/**
+ * fm10k_qv_request_irq - initialize interrupts for queue vectors
+ * @interface: board private structure
+ *
+ * Attempts to configure interrupts using the best available
+ * capabilities of the hardware and kernel.
+ **/
+int fm10k_qv_request_irq(struct fm10k_intfc *interface)
+{
+	struct net_device *dev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	struct msix_entry *entry;
+	int ri = 0, ti = 0;
+	int vector, err;
+
+	entry = &interface->msix_entries[NON_Q_VECTORS(hw)];
+
+	for (vector = 0; vector < interface->num_q_vectors; vector++) {
+		struct fm10k_q_vector *q_vector = interface->q_vector[vector];
+
+		/* name the vector */
+		if (q_vector->tx.count && q_vector->rx.count) {
+			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+				 "%s-TxRx-%d", dev->name, ri++);
+			ti++;
+		} else if (q_vector->rx.count) {
+			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+				 "%s-rx-%d", dev->name, ri++);
+		} else if (q_vector->tx.count) {
+			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+				 "%s-tx-%d", dev->name, ti++);
+		} else {
+			/* skip this unused q_vector */
+			continue;
+		}
+
+		/* Assign ITR register to q_vector */
+		q_vector->itr = &interface->uc_addr[FM10K_ITR(entry->entry)];
+
+		/* request the IRQ */
+		err = request_irq(entry->vector, &fm10k_msix_clean_rings, 0,
+				  q_vector->name, q_vector);
+		if (err) {
+			netif_err(interface, probe, dev,
+				  "request_irq failed for MSIX interrupt Error: %d\n",
+				  err);
+			goto err_out;
+		}
+
+		/* Enable q_vector */
+		writel(FM10K_ITR_ENABLE, q_vector->itr);
+
+		entry++;
+	}
+
+	return 0;
+
+err_out:
+	/* wind through the ring freeing all entries and vectors */
+	while (vector) {
+		struct fm10k_q_vector *q_vector;
+
+		entry--;
+		vector--;
+		q_vector = interface->q_vector[vector];
+
+		if (!q_vector->tx.count && !q_vector->rx.count)
+			continue;
+
+		/* disable interrupts */
+
+		writel(FM10K_ITR_MASK_SET, q_vector->itr);
+
+		free_irq(entry->vector, q_vector);
+	}
+
+	return err;
+}
+
 void fm10k_up(struct fm10k_intfc *interface)
 {
 	struct fm10k_hw *hw = &interface->hw;
@@ -100,6 +563,9 @@ void fm10k_up(struct fm10k_intfc *interface)
 	/* clear down bit to indicate we are ready to go */
 	clear_bit(__FM10K_DOWN, &interface->state);
 
+	/* enable polling cleanups */
+	fm10k_napi_enable_all(interface);
+
 	/* re-establish Rx filters */
 	fm10k_restore_rx_state(interface);
 
@@ -107,6 +573,17 @@ void fm10k_up(struct fm10k_intfc *interface)
 	netif_tx_start_all_queues(interface->netdev);
 }
 
+static void fm10k_napi_disable_all(struct fm10k_intfc *interface)
+{
+	struct fm10k_q_vector *q_vector;
+	int q_idx;
+
+	for (q_idx = 0; q_idx < interface->num_q_vectors; q_idx++) {
+		q_vector = interface->q_vector[q_idx];
+		napi_disable(&q_vector->napi);
+	}
+}
+
 void fm10k_down(struct fm10k_intfc *interface)
 {
 	struct net_device *netdev = interface->netdev;
@@ -128,6 +605,9 @@ void fm10k_down(struct fm10k_intfc *interface)
 	/* allow 10ms for device to quiesce */
 	usleep_range(10000, 20000);
 
+	/* disable polling routines */
+	fm10k_napi_disable_all(interface);
+
 	/* Disable DMA engine for Tx/Rx */
 	hw->mac.ops.stop_hw(hw);
 }
@@ -226,6 +706,10 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
 	}
 
+	/* set default interrupt moderation */
+	interface->tx_itr = FM10K_ITR_10K;
+	interface->rx_itr = FM10K_ITR_ADAPTIVE | FM10K_ITR_20K;
+
 	/* initialize vxlan_port list */
 	INIT_LIST_HEAD(&interface->vxlan_port);
 
@@ -341,6 +825,14 @@ static int fm10k_probe(struct pci_dev *pdev,
 	if (err)
 		goto err_sw_init;
 
+	err = fm10k_init_queueing_scheme(interface);
+	if (err)
+		goto err_sw_init;
+
+	err = fm10k_mbx_request_irq(interface);
+	if (err)
+		goto err_mbx_interrupt;
+
 	/* final check of hardware state before registering the interface */
 	err = fm10k_hw_ready(interface);
 	if (err)
@@ -377,6 +869,9 @@ static int fm10k_probe(struct pci_dev *pdev,
 	return 0;
 
 err_register:
+	fm10k_mbx_free_irq(interface);
+err_mbx_interrupt:
+	fm10k_clear_queueing_scheme(interface);
 err_sw_init:
 	iounmap(interface->uc_addr);
 err_ioremap:
@@ -408,6 +903,12 @@ static void fm10k_remove(struct pci_dev *pdev)
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);
 
+	/* disable mailbox interrupt */
+	fm10k_mbx_free_irq(interface);
+
+	/* free interrupts */
+	fm10k_clear_queueing_scheme(interface);
+
 	iounmap(interface->uc_addr);
 
 	free_netdev(netdev);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 13/29] fm10k: add support for Tx/Rx rings
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (11 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 12/29] fm10k: Add interrupt support Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 14/29] fm10k: Add service task to handle delayed events Alexander Duyck
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This change adds the defines and structures necessary to support both Tx
and Rx descriptor rings.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |  188 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |   63 ++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   70 ++++++++-
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |    4 
 4 files changed, 323 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 3f83eeb..ae48d72 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -31,7 +31,118 @@
 
 #define FM10K_MAX_JUMBO_FRAME_SIZE	15358	/* Maximum supported size 15K */
 
+#define MAX_QUEUES	FM10K_MAX_QUEUES_PF
+
+#define FM10K_MIN_RXD		 128
+#define FM10K_MAX_RXD		4096
+#define FM10K_DEFAULT_RXD	 256
+
+#define FM10K_MIN_TXD		 128
+#define FM10K_MAX_TXD		4096
+#define FM10K_DEFAULT_TXD	 256
+#define FM10K_DEFAULT_TX_WORK	 256
+
+#define FM10K_RXBUFFER_256	  256
+#define FM10K_RXBUFFER_16384	16384
+#define FM10K_RX_HDR_LEN	FM10K_RXBUFFER_256
+#if PAGE_SIZE <= FM10K_RXBUFFER_16384
+#define FM10K_RX_BUFSZ		(PAGE_SIZE / 2)
+#else
+#define FM10K_RX_BUFSZ		FM10K_RXBUFFER_16384
+#endif
+
+/* How many Rx Buffers do we bundle into one write to the hardware ? */
+#define FM10K_RX_BUFFER_WRITE	16	/* Must be power of 2 */
+
+enum fm10k_ring_state_t {
+	__FM10K_TX_DETECT_HANG,
+	__FM10K_HANG_CHECK_ARMED,
+};
+
+#define check_for_tx_hang(ring) \
+	test_bit(__FM10K_TX_DETECT_HANG, &(ring)->state)
+#define set_check_for_tx_hang(ring) \
+	set_bit(__FM10K_TX_DETECT_HANG, &(ring)->state)
+#define clear_check_for_tx_hang(ring) \
+	clear_bit(__FM10K_TX_DETECT_HANG, &(ring)->state)
+
+struct fm10k_tx_buffer {
+	struct fm10k_tx_desc *next_to_watch;
+	struct sk_buff *skb;
+	unsigned int bytecount;
+	u16 gso_segs;
+	u16 tx_flags;
+	DEFINE_DMA_UNMAP_ADDR(dma);
+	DEFINE_DMA_UNMAP_LEN(len);
+};
+
+struct fm10k_rx_buffer {
+	dma_addr_t dma;
+	struct page *page;
+	u32 page_offset;
+};
+
+struct fm10k_queue_stats {
+	u64 packets;
+	u64 bytes;
+};
+
+struct fm10k_tx_queue_stats {
+	u64 restart_queue;
+	u64 csum_err;
+	u64 tx_busy;
+	u64 tx_done_old;
+};
+
+struct fm10k_rx_queue_stats {
+	u64 alloc_failed;
+	u64 csum_err;
+	u64 errors;
+};
+
+struct fm10k_ring {
+	struct fm10k_q_vector *q_vector;/* backpointer to host q_vector */
+	struct net_device *netdev;	/* netdev ring belongs to */
+	struct device *dev;		/* device for DMA mapping */
+	void *desc;			/* descriptor ring memory */
+	union {
+		struct fm10k_tx_buffer *tx_buffer;
+		struct fm10k_rx_buffer *rx_buffer;
+	};
+	u32 __iomem *tail;
+	unsigned long state;
+	dma_addr_t dma;			/* phys. address of descriptor ring */
+	unsigned int size;		/* length in bytes */
+
+	u8 queue_index;			/* needed for queue management */
+	u8 reg_idx;			/* holds the special value that gets
+					 * the hardware register offset
+					 * associated with this ring, which is
+					 * different for DCB and RSS modes
+					 */
+	u8 qos_pc;			/* priority class of queue */
+	u16 vid;			/* default vlan ID of queue */
+	u16 count;			/* amount of descriptors */
+
+	u16 next_to_alloc;
+	u16 next_to_use;
+	u16 next_to_clean;
+
+	struct fm10k_queue_stats stats;
+	struct u64_stats_sync syncp;
+	union {
+		/* Tx */
+		struct fm10k_tx_queue_stats tx_stats;
+		/* Rx */
+		struct {
+			struct fm10k_rx_queue_stats rx_stats;
+			struct sk_buff *skb;
+		};
+	};
+} ____cacheline_internodealigned_in_smp;
+
 struct fm10k_ring_container {
+	struct fm10k_ring *ring;	/* pointer to linked list of rings */
 	unsigned int total_bytes;	/* total bytes processed this int */
 	unsigned int total_packets;	/* total packets processed this int */
 	u16 work_limit;			/* total work allowed per interrupt */
@@ -46,6 +157,15 @@ struct fm10k_ring_container {
 
 #define FM10K_ITR_ENABLE	(FM10K_ITR_AUTOMASK | FM10K_ITR_MASK_CLEAR)
 
+static inline struct netdev_queue *txring_txq(const struct fm10k_ring *ring)
+{
+	return &ring->netdev->_tx[ring->queue_index];
+}
+
+/* iterator for handling rings in ring container */
+#define fm10k_for_each_ring(pos, head) \
+	for (pos = &(head).ring[(head).count]; (--pos) >= (head).ring;)
+
 #define MAX_Q_VECTORS 256
 #define MIN_Q_VECTORS	1
 enum fm10k_non_q_vectors {
@@ -68,6 +188,9 @@ struct fm10k_q_vector {
 	char name[IFNAMSIZ + 9];
 
 	struct rcu_head rcu;	/* to avoid race with update stats on free */
+
+	/* for dynamic allocation of rings associated with this q_vector */
+	struct fm10k_ring ring[0] ____cacheline_internodealigned_in_smp;
 };
 
 enum fm10k_ring_f_enum {
@@ -113,9 +236,15 @@ struct fm10k_intfc {
 	int num_rx_queues;
 	u16 rx_itr;
 
+	/* TX */
+	struct fm10k_ring *tx_ring[MAX_QUEUES] ____cacheline_aligned_in_smp;
+
 	u64 rx_overrun_pf;
 	u64 rx_overrun_vf;
 
+	/* RX */
+	struct fm10k_ring *rx_ring[MAX_QUEUES];
+
 	/* Queueing vectors */
 	struct fm10k_q_vector *q_vector[MAX_Q_VECTORS];
 	struct msix_entry *msix_entries;
@@ -176,6 +305,65 @@ static inline int fm10k_mbx_trylock(struct fm10k_intfc *interface)
 	return !test_and_set_bit(__FM10K_MBX_LOCK, &interface->state);
 }
 
+/* fm10k_test_staterr - test bits in Rx descriptor status and error fields */
+static inline __le32 fm10k_test_staterr(union fm10k_rx_desc *rx_desc,
+					const u32 stat_err_bits)
+{
+	return rx_desc->d.staterr & cpu_to_le32(stat_err_bits);
+}
+
+/* fm10k_desc_unused - calculate if we have unused descriptors */
+static inline u16 fm10k_desc_unused(struct fm10k_ring *ring)
+{
+	s16 unused = ring->next_to_clean - ring->next_to_use - 1;
+
+	return likely(unused < 0) ? unused + ring->count : unused;
+}
+
+#define FM10K_TX_DESC(R, i)	\
+	(&(((struct fm10k_tx_desc *)((R)->desc))[i]))
+#define FM10K_RX_DESC(R, i)	\
+	 (&(((union fm10k_rx_desc *)((R)->desc))[i]))
+
+#define FM10K_MAX_TXD_PWR	14
+#define FM10K_MAX_DATA_PER_TXD	(1 << FM10K_MAX_TXD_PWR)
+
+/* Tx Descriptors needed, worst case */
+#define TXD_USE_COUNT(S)	DIV_ROUND_UP((S), FM10K_MAX_DATA_PER_TXD)
+#define DESC_NEEDED	(MAX_SKB_FRAGS + 4)
+
+enum fm10k_tx_flags {
+	/* Tx offload flags */
+	FM10K_TX_FLAGS_CSUM	= 0x01,
+};
+
+/* This structure is stored as little endian values as that is the native
+ * format of the Rx descriptor.  The ordering of these fields is reversed
+ * from the actual ftag header to allow for a single bswap to take care
+ * of placing all of the values in network order
+ */
+union fm10k_ftag_info {
+	__le64 ftag;
+	struct {
+		/* dglort and sglort combined into a single 32bit desc read */
+		__le32 glort;
+		/* upper 16 bits of vlan are reserved 0 for swpri_type_user */
+		__le32 vlan;
+	} d;
+	struct {
+		__le16 dglort;
+		__le16 sglort;
+		__le16 vlan;
+		__le16 swpri_type_user;
+	} w;
+};
+
+struct fm10k_cb {
+	union fm10k_ftag_info fi;
+};
+
+#define FM10K_CB(skb) ((struct fm10k_cb *)(skb)->cb)
+
 /* main */
 extern char fm10k_driver_name[];
 extern const char fm10k_driver_version[];
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index b0a2ba1..bf84c26 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -183,10 +183,12 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
 				unsigned int rxr_count, unsigned int rxr_idx)
 {
 	struct fm10k_q_vector *q_vector;
+	struct fm10k_ring *ring;
 	int ring_count, size;
 
 	ring_count = txr_count + rxr_count;
-	size = sizeof(struct fm10k_q_vector);
+	size = sizeof(struct fm10k_q_vector) +
+	       (sizeof(struct fm10k_ring) * ring_count);
 
 	/* allocate q_vector and rings */
 	q_vector = kzalloc(size, GFP_KERNEL);
@@ -202,14 +204,66 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
 	q_vector->interface = interface;
 	q_vector->v_idx = v_idx;
 
+	/* initialize pointer to rings */
+	ring = q_vector->ring;
+
 	/* save Tx ring container info */
+	q_vector->tx.ring = ring;
+	q_vector->tx.work_limit = FM10K_DEFAULT_TX_WORK;
 	q_vector->tx.itr = interface->tx_itr;
 	q_vector->tx.count = txr_count;
 
+	while (txr_count) {
+		/* assign generic ring traits */
+		ring->dev = &interface->pdev->dev;
+		ring->netdev = interface->netdev;
+
+		/* configure backlink on ring */
+		ring->q_vector = q_vector;
+
+		/* apply Tx specific ring traits */
+		ring->count = interface->tx_ring_count;
+		ring->queue_index = txr_idx;
+
+		/* assign ring to interface */
+		interface->tx_ring[txr_idx] = ring;
+
+		/* update count and index */
+		txr_count--;
+		txr_idx += v_count;
+
+		/* push pointer to next ring */
+		ring++;
+	}
+
 	/* save Rx ring container info */
+	q_vector->rx.ring = ring;
 	q_vector->rx.itr = interface->rx_itr;
 	q_vector->rx.count = rxr_count;
 
+	while (rxr_count) {
+		/* assign generic ring traits */
+		ring->dev = &interface->pdev->dev;
+		ring->netdev = interface->netdev;
+
+		/* configure backlink on ring */
+		ring->q_vector = q_vector;
+
+		/* apply Rx specific ring traits */
+		ring->count = interface->rx_ring_count;
+		ring->queue_index = rxr_idx;
+
+		/* assign ring to interface */
+		interface->rx_ring[rxr_idx] = ring;
+
+		/* update count and index */
+		rxr_count--;
+		rxr_idx += v_count;
+
+		/* push pointer to next ring */
+		ring++;
+	}
+
 	return 0;
 }
 
@@ -225,6 +279,13 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
 static void fm10k_free_q_vector(struct fm10k_intfc *interface, int v_idx)
 {
 	struct fm10k_q_vector *q_vector = interface->q_vector[v_idx];
+	struct fm10k_ring *ring;
+
+	fm10k_for_each_ring(ring, q_vector->tx)
+		interface->tx_ring[ring->queue_index] = NULL;
+
+	fm10k_for_each_ring(ring, q_vector->rx)
+		interface->rx_ring[ring->queue_index] = NULL;
 
 	interface->q_vector[v_idx] = NULL;
 	netif_napi_del(&q_vector->napi);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 487efcb..b987bb6 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -67,10 +67,19 @@ int fm10k_open(struct net_device *netdev)
 	/* setup GLORT assignment for this port */
 	fm10k_request_glort_range(interface);
 
+	/* Notify the stack of the actual queue counts */
+
+	err = netif_set_real_num_rx_queues(netdev,
+					   interface->num_rx_queues);
+	if (err)
+		goto err_set_queues;
+
 	fm10k_up(interface);
 
 	return 0;
 
+err_set_queues:
+	fm10k_qv_free_irq(interface);
 err_req_irq:
 	return err;
 }
@@ -474,6 +483,64 @@ void fm10k_reset_rx_state(struct fm10k_intfc *interface)
 	__dev_mc_unsync(netdev, NULL);
 }
 
+/**
+ * fm10k_get_stats64 - Get System Network Statistics
+ * @netdev: network interface device structure
+ * @stats: storage space for 64bit statistics
+ *
+ * Returns 64bit statistics, for use in the ndo_get_stats64 callback. This
+ * function replaces fm10k_get_stats for kernels which support it.
+ */
+static struct rtnl_link_stats64 *fm10k_get_stats64(struct net_device *netdev,
+						   struct rtnl_link_stats64 *stats)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_ring *ring;
+	unsigned int start, i;
+	u64 bytes, packets;
+
+	rcu_read_lock();
+
+	for (i = 0; i < interface->num_rx_queues; i++) {
+		ring = ACCESS_ONCE(interface->rx_ring[i]);
+
+		if (!ring)
+			continue;
+
+		do {
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
+			packets = ring->stats.packets;
+			bytes   = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+		stats->rx_packets += packets;
+		stats->rx_bytes   += bytes;
+	}
+
+	for (i = 0; i < interface->num_tx_queues; i++) {
+		ring = ACCESS_ONCE(interface->rx_ring[i]);
+
+		if (!ring)
+			continue;
+
+		do {
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
+			packets = ring->stats.packets;
+			bytes   = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+		stats->tx_packets += packets;
+		stats->tx_bytes   += bytes;
+	}
+
+	rcu_read_unlock();
+
+	/* following stats updated by fm10k_service_task() */
+	stats->rx_missed_errors	= netdev->stats.rx_missed_errors;
+
+	return stats;
+}
+
 static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_open		= fm10k_open,
 	.ndo_stop		= fm10k_close,
@@ -484,6 +551,7 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_vlan_rx_add_vid	= fm10k_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= fm10k_vlan_rx_kill_vid,
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
+	.ndo_get_stats64	= fm10k_get_stats64,
 };
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3
@@ -493,7 +561,7 @@ struct net_device *fm10k_alloc_netdev(void)
 	struct fm10k_intfc *interface;
 	struct net_device *dev;
 
-	dev = alloc_etherdev(sizeof(struct fm10k_intfc));
+	dev = alloc_etherdev_mq(sizeof(struct fm10k_intfc), MAX_QUEUES);
 	if (!dev)
 		return NULL;
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index fdef40a..9464fd6 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -706,6 +706,10 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
 	}
 
+	/* set default ring sizes */
+	interface->tx_ring_count = FM10K_DEFAULT_TXD;
+	interface->rx_ring_count = FM10K_DEFAULT_RXD;
+
 	/* set default interrupt moderation */
 	interface->tx_itr = FM10K_ITR_10K;
 	interface->rx_itr = FM10K_ITR_ADAPTIVE | FM10K_ITR_20K;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 14/29] fm10k: Add service task to handle delayed events
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (12 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 13/29] fm10k: add support for Tx/Rx rings Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:37 ` [net-next PATCH 15/29] fm10k: Add Tx/Rx hardware ring bring-up/tear-down Alexander Duyck
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for the service task.  The service task takes care
of all processes that cannot be done in interrupt context such as resets,
stats updates, TC prio updates, and checking for hung or detached devices.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h     |   25 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c |  435 ++++++++++++++++++++++++++
 2 files changed, 460 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index ae48d72..b88a52c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -239,8 +239,21 @@ struct fm10k_intfc {
 	/* TX */
 	struct fm10k_ring *tx_ring[MAX_QUEUES] ____cacheline_aligned_in_smp;
 
+	u64 restart_queue;
+	u64 tx_busy;
+	u64 tx_csum_errors;
+	u64 alloc_failed;
+	u64 rx_csum_errors;
+	u64 rx_errors;
+
+	u64 tx_bytes_nic;
+	u64 tx_packets_nic;
+	u64 rx_bytes_nic;
+	u64 rx_packets_nic;
+	u64 rx_drops_nic;
 	u64 rx_overrun_pf;
 	u64 rx_overrun_vf;
+	u32 tx_timeout_count;
 
 	/* RX */
 	struct fm10k_ring *rx_ring[MAX_QUEUES];
@@ -257,6 +270,13 @@ struct fm10k_intfc {
 	u16 msg_enable;
 	u16 tx_ring_count;
 	u16 rx_ring_count;
+	struct timer_list service_timer;
+	struct work_struct service_task;
+	unsigned long next_stats_update;
+	unsigned long next_tx_hang_check;
+	unsigned long last_reset;
+	unsigned long link_down_event;
+	bool host_ready;
 
 	u32 reta[FM10K_RETA_SIZE];
 	u32 rssrk[FM10K_RSSRK_SIZE];
@@ -280,6 +300,8 @@ struct fm10k_intfc {
 enum fm10k_state_t {
 	__FM10K_RESETTING,
 	__FM10K_DOWN,
+	__FM10K_SERVICE_SCHED,
+	__FM10K_SERVICE_DISABLE,
 	__FM10K_MBX_LOCK,
 	__FM10K_LINK_DOWN,
 };
@@ -379,6 +401,9 @@ int fm10k_register_pci_driver(void);
 void fm10k_unregister_pci_driver(void);
 void fm10k_up(struct fm10k_intfc *interface);
 void fm10k_down(struct fm10k_intfc *interface);
+void fm10k_update_stats(struct fm10k_intfc *interface);
+void fm10k_service_event_schedule(struct fm10k_intfc *interface);
+void fm10k_update_rx_drop_en(struct fm10k_intfc *interface);
 
 /* Netdev */
 struct net_device *fm10k_alloc_netdev(void);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 9464fd6..879371f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -87,6 +87,404 @@ static int fm10k_hw_ready(struct fm10k_intfc *interface)
 	return FM10K_REMOVED(hw->hw_addr) ? -ENODEV : 0;
 }
 
+void fm10k_service_event_schedule(struct fm10k_intfc *interface)
+{
+	if (!test_bit(__FM10K_SERVICE_DISABLE, &interface->state) &&
+	    !test_and_set_bit(__FM10K_SERVICE_SCHED, &interface->state))
+		schedule_work(&interface->service_task);
+}
+
+static void fm10k_service_event_complete(struct fm10k_intfc *interface)
+{
+	BUG_ON(!test_bit(__FM10K_SERVICE_SCHED, &interface->state));
+
+	/* flush memory to make sure state is correct before next watchog */
+	smp_mb__before_clear_bit();
+	clear_bit(__FM10K_SERVICE_SCHED, &interface->state);
+}
+
+/**
+ * fm10k_service_timer - Timer Call-back
+ * @data: pointer to interface cast into an unsigned long
+ **/
+static void fm10k_service_timer(unsigned long data)
+{
+	struct fm10k_intfc *interface = (struct fm10k_intfc *)data;
+
+	/* Reset the timer */
+	mod_timer(&interface->service_timer, (HZ * 2) + jiffies);
+
+	fm10k_service_event_schedule(interface);
+}
+
+static void fm10k_detach_subtask(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+
+	/* do nothing if device is still present or hw_addr is set */
+	if (netif_device_present(netdev) || interface->hw.hw_addr)
+		return;
+
+	rtnl_lock();
+
+	if (netif_running(netdev))
+		dev_close(netdev);
+
+	rtnl_unlock();
+}
+
+static void fm10k_reinit(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	WARN_ON(in_interrupt());
+
+	/* put off any impending NetWatchDogTimeout */
+	netdev->trans_start = jiffies;
+
+	while (test_and_set_bit(__FM10K_RESETTING, &interface->state))
+		usleep_range(1000, 2000);
+
+	rtnl_lock();
+
+	if (netif_running(netdev))
+		fm10k_close(netdev);
+
+	fm10k_mbx_free_irq(interface);
+
+	/* delay any future reset requests */
+	interface->last_reset = jiffies + (10 * HZ);
+
+	/* reset and initialize the hardware so it is in a known state */
+	err = hw->mac.ops.reset_hw(hw) ? : hw->mac.ops.init_hw(hw);
+	if (err)
+		dev_err(&interface->pdev->dev, "init_hw failed: %d\n", err);
+
+	/* reassociate interrupts */
+	fm10k_mbx_request_irq(interface);
+
+	if (netif_running(netdev))
+		fm10k_open(netdev);
+
+	rtnl_unlock();
+
+	clear_bit(__FM10K_RESETTING, &interface->state);
+}
+
+static void fm10k_reset_subtask(struct fm10k_intfc *interface)
+{
+	if (!(interface->flags & FM10K_FLAG_RESET_REQUESTED))
+		return;
+
+	interface->flags &= ~FM10K_FLAG_RESET_REQUESTED;
+
+	netdev_err(interface->netdev, "Reset interface\n");
+	interface->tx_timeout_count++;
+
+	fm10k_reinit(interface);
+}
+
+/**
+ * fm10k_configure_swpri_map - Configure Receive SWPRI to PC mapping
+ * @interface: board private structure
+ *
+ * Configure the SWPRI to PC mapping for the port.
+ **/
+static void fm10k_configure_swpri_map(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int i;
+
+	/* clear flag indicating update is needed */
+	interface->flags &= ~FM10K_FLAG_SWPRI_CONFIG;
+
+	/* these registers are only available on the PF */
+	if (hw->mac.type != fm10k_mac_pf)
+		return;
+
+	/* configure SWPRI to PC map */
+	for (i = 0; i < FM10K_SWPRI_MAX; i++)
+		fm10k_write_reg(hw, FM10K_SWPRI_MAP(i),
+				netdev_get_prio_tc_map(netdev, i));
+}
+
+/**
+ * fm10k_watchdog_update_host_state - Update the link status based on host.
+ * @interface: board private structure
+ **/
+static void fm10k_watchdog_update_host_state(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	s32 err;
+
+	if (test_bit(__FM10K_LINK_DOWN, &interface->state)) {
+		interface->host_ready = false;
+		if (time_is_after_jiffies(interface->link_down_event))
+			return;
+		clear_bit(__FM10K_LINK_DOWN, &interface->state);
+	}
+
+	if (interface->flags & FM10K_FLAG_SWPRI_CONFIG) {
+		if (rtnl_trylock()) {
+			fm10k_configure_swpri_map(interface);
+			rtnl_unlock();
+		}
+	}
+
+	/* lock the mailbox for transmit and receive */
+	fm10k_mbx_lock(interface);
+
+	err = hw->mac.ops.get_host_state(hw, &interface->host_ready);
+	if (err && time_is_before_jiffies(interface->last_reset))
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+
+	/* free the lock */
+	fm10k_mbx_unlock(interface);
+}
+
+/**
+ * fm10k_mbx_subtask - Process upstream and downstream mailboxes
+ * @interface: board private structure
+ *
+ * This function will process both the upstream and downstream mailboxes.
+ * It is necessary for us to hold the rtnl_lock while doing this as the
+ * mailbox accesses are protected by this lock.
+ **/
+static void fm10k_mbx_subtask(struct fm10k_intfc *interface)
+{
+	/* process upstream mailbox and update device state */
+	fm10k_watchdog_update_host_state(interface);
+}
+
+/**
+ * fm10k_watchdog_host_is_ready - Update netdev status based on host ready
+ * @interface: board private structure
+ **/
+static void fm10k_watchdog_host_is_ready(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+
+	/* only continue if link state is currently down */
+	if (netif_carrier_ok(netdev))
+		return;
+
+	netif_info(interface, drv, netdev, "NIC Link is up\n");
+
+	netif_carrier_on(netdev);
+	netif_tx_wake_all_queues(netdev);
+}
+
+/**
+ * fm10k_watchdog_host_not_ready - Update netdev status based on host not ready
+ * @interface: board private structure
+ **/
+static void fm10k_watchdog_host_not_ready(struct fm10k_intfc *interface)
+{
+	struct net_device *netdev = interface->netdev;
+
+	/* only continue if link state is currently up */
+	if (!netif_carrier_ok(netdev))
+		return;
+
+	netif_info(interface, drv, netdev, "NIC Link is down\n");
+
+	netif_carrier_off(netdev);
+	netif_tx_stop_all_queues(netdev);
+}
+
+/**
+ * fm10k_update_stats - Update the board statistics counters.
+ * @interface: board private structure
+ **/
+void fm10k_update_stats(struct fm10k_intfc *interface)
+{
+	struct net_device_stats *net_stats = &interface->netdev->stats;
+	struct fm10k_hw *hw = &interface->hw;
+	u64 rx_errors = 0, rx_csum_errors = 0, tx_csum_errors = 0;
+	u64 restart_queue = 0, tx_busy = 0, alloc_failed = 0;
+	u64 rx_bytes_nic = 0, rx_pkts_nic = 0, rx_drops_nic = 0;
+	u64 tx_bytes_nic = 0, tx_pkts_nic = 0;
+	u64 bytes, pkts;
+	int i;
+
+	/* do not allow stats update via service task for next second */
+	interface->next_stats_update = jiffies + HZ;
+
+	/* gather some stats to the interface struct that are per queue */
+	for (bytes = 0, pkts = 0, i = 0; i < interface->num_tx_queues; i++) {
+		struct fm10k_ring *tx_ring = interface->tx_ring[i];
+
+		restart_queue += tx_ring->tx_stats.restart_queue;
+		tx_busy += tx_ring->tx_stats.tx_busy;
+		tx_csum_errors += tx_ring->tx_stats.csum_err;
+		bytes += tx_ring->stats.bytes;
+		pkts += tx_ring->stats.packets;
+	}
+
+	interface->restart_queue = restart_queue;
+	interface->tx_busy = tx_busy;
+	net_stats->tx_bytes = bytes;
+	net_stats->tx_packets = pkts;
+	interface->tx_csum_errors = tx_csum_errors;
+	/* gather some stats to the interface struct that are per queue */
+	for (bytes = 0, pkts = 0, i = 0; i < interface->num_rx_queues; i++) {
+		struct fm10k_ring *rx_ring = interface->rx_ring[i];
+
+		bytes += rx_ring->stats.bytes;
+		pkts += rx_ring->stats.packets;
+		alloc_failed += rx_ring->rx_stats.alloc_failed;
+		rx_csum_errors += rx_ring->rx_stats.csum_err;
+		rx_errors += rx_ring->rx_stats.errors;
+	}
+
+	net_stats->rx_bytes = bytes;
+	net_stats->rx_packets = pkts;
+	interface->alloc_failed = alloc_failed;
+	interface->rx_csum_errors = rx_csum_errors;
+	interface->rx_errors = rx_errors;
+
+	hw->mac.ops.update_hw_stats(hw, &interface->stats);
+
+	for (i = 0; i < FM10K_MAX_QUEUES_PF; i++) {
+		struct fm10k_hw_stats_q *q = &interface->stats.q[i];
+
+		tx_bytes_nic += q->tx_bytes.count;
+		tx_pkts_nic += q->tx_packets.count;
+		rx_bytes_nic += q->rx_bytes.count;
+		rx_pkts_nic += q->rx_packets.count;
+		rx_drops_nic += q->rx_drops.count;
+	}
+
+	interface->tx_bytes_nic = tx_bytes_nic;
+	interface->tx_packets_nic = tx_pkts_nic;
+	interface->rx_bytes_nic = rx_bytes_nic;
+	interface->rx_packets_nic = rx_pkts_nic;
+	interface->rx_drops_nic = rx_drops_nic;
+
+	/* Fill out the OS statistics structure */
+	net_stats->rx_errors = interface->stats.xec.count;
+	net_stats->rx_dropped = interface->stats.nodesc_drop.count;
+}
+
+/**
+ * fm10k_watchdog_flush_tx - flush queues on host not ready
+ * @interface - pointer to the device interface structure
+ **/
+static void fm10k_watchdog_flush_tx(struct fm10k_intfc *interface)
+{
+	int some_tx_pending = 0;
+	int i;
+
+	/* nothing to do if carrier is up */
+	if (netif_carrier_ok(interface->netdev))
+		return;
+
+	for (i = 0; i < interface->num_tx_queues; i++) {
+		struct fm10k_ring *tx_ring = interface->tx_ring[i];
+
+		if (tx_ring->next_to_use != tx_ring->next_to_clean) {
+			some_tx_pending = 1;
+			break;
+		}
+	}
+
+	/* We've lost link, so the controller stops DMA, but we've got
+	 * queued Tx work that's never going to get done, so reset
+	 * controller to flush Tx.
+	 */
+	if (some_tx_pending)
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+}
+
+/**
+ * fm10k_watchdog_subtask - check and bring link up
+ * @interface - pointer to the device interface structure
+ **/
+static void fm10k_watchdog_subtask(struct fm10k_intfc *interface)
+{
+	/* if interface is down do nothing */
+	if (test_bit(__FM10K_DOWN, &interface->state) ||
+	    test_bit(__FM10K_RESETTING, &interface->state))
+		return;
+
+	if (interface->host_ready)
+		fm10k_watchdog_host_is_ready(interface);
+	else
+		fm10k_watchdog_host_not_ready(interface);
+
+	/* update stats only once every second */
+	if (time_is_before_jiffies(interface->next_stats_update))
+		fm10k_update_stats(interface);
+
+	/* flush any uncompleted work */
+	fm10k_watchdog_flush_tx(interface);
+}
+
+/**
+ * fm10k_check_hang_subtask - check for hung queues and dropped interrupts
+ * @interface - pointer to the device interface structure
+ *
+ * This function serves two purposes.  First it strobes the interrupt lines
+ * in order to make certain interrupts are occurring.  Secondly it sets the
+ * bits needed to check for TX hangs.  As a result we should immediately
+ * determine if a hang has occurred.
+ */
+static void fm10k_check_hang_subtask(struct fm10k_intfc *interface)
+{
+	int i;
+
+	/* If we're down or resetting, just bail */
+	if (test_bit(__FM10K_DOWN, &interface->state) ||
+	    test_bit(__FM10K_RESETTING, &interface->state))
+		return;
+
+	/* rate limit tx hang checks to only once every 2 seconds */
+	if (time_is_after_eq_jiffies(interface->next_tx_hang_check))
+		return;
+	interface->next_tx_hang_check = jiffies + (2 * HZ);
+
+	if (netif_carrier_ok(interface->netdev)) {
+		/* Force detection of hung controller */
+		for (i = 0; i < interface->num_tx_queues; i++)
+			set_check_for_tx_hang(interface->tx_ring[i]);
+
+		/* Rearm all in-use q_vectors for immediate firing */
+		for (i = 0; i < interface->num_q_vectors; i++) {
+			struct fm10k_q_vector *qv = interface->q_vector[i];
+
+			if (!qv->tx.count && !qv->rx.count)
+				continue;
+			writel(FM10K_ITR_ENABLE | FM10K_ITR_PENDING2, qv->itr);
+		}
+	}
+}
+
+/**
+ * fm10k_service_task - manages and runs subtasks
+ * @work: pointer to work_struct containing our data
+ **/
+static void fm10k_service_task(struct work_struct *work)
+{
+	struct fm10k_intfc *interface = container_of(work,
+						     struct fm10k_intfc,
+						     service_task);
+
+	/* tasks always capable of running, but must be rtnl protected */
+	fm10k_mbx_subtask(interface);
+	fm10k_detach_subtask(interface);
+	fm10k_reset_subtask(interface);
+
+	/* tasks only run when interface is up */
+	fm10k_watchdog_subtask(interface);
+	fm10k_check_hang_subtask(interface);
+
+	/* release lock on service events to allow scheduling next event */
+	fm10k_service_event_complete(interface);
+}
+
 static void fm10k_napi_enable_all(struct fm10k_intfc *interface)
 {
 	struct fm10k_q_vector *q_vector;
@@ -257,6 +655,20 @@ static irqreturn_t fm10k_msix_mbx_pf(int irq, void *data)
 		fm10k_mbx_unlock(interface);
 	}
 
+	/* if switch toggled state we should reset GLORTs */
+	if (eicr & FM10K_EICR_SWITCHNOTREADY) {
+		/* force link down for at least 4 seconds */
+		interface->link_down_event = jiffies + (4 * HZ);
+		set_bit(__FM10K_LINK_DOWN, &interface->state);
+
+		/* reset dglort_map back to no config */
+		hw->mac.dglort_map = FM10K_DGLORTMAP_NONE;
+	}
+
+	/* we should validate host state after interrupt event */
+	hw->mac.get_host_state = 1;
+	fm10k_service_event_schedule(interface);
+
 	/* re-enable mailbox interrupt and indicate 20us delay */
 	fm10k_write_reg(hw, FM10K_ITR(FM10K_MBX_VECTOR),
 			FM10K_ITR_ENABLE | FM10K_MBX_INT_DELAY);
@@ -571,6 +983,9 @@ void fm10k_up(struct fm10k_intfc *interface)
 
 	/* enable transmits */
 	netif_tx_start_all_queues(interface->netdev);
+
+	/* kick off the service timer */
+	mod_timer(&interface->service_timer, jiffies);
 }
 
 static void fm10k_napi_disable_all(struct fm10k_intfc *interface)
@@ -608,6 +1023,11 @@ void fm10k_down(struct fm10k_intfc *interface)
 	/* disable polling routines */
 	fm10k_napi_disable_all(interface);
 
+	del_timer_sync(&interface->service_timer);
+
+	/* capture stats one last time before stopping interface */
+	fm10k_update_stats(interface);
+
 	/* Disable DMA engine for Tx/Rx */
 	hw->mac.ops.stop_hw(hw);
 }
@@ -669,6 +1089,9 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->vlan_features |= NETIF_F_HIGHDMA;
 	}
 
+	/* delay any future reset requests */
+	interface->last_reset = jiffies + (10 * HZ);
+
 	/* reset and initialize the hardware so it is in a known state */
 	err = hw->mac.ops.reset_hw(hw) ? : hw->mac.ops.init_hw(hw);
 	if (err) {
@@ -706,6 +1129,12 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
 	}
 
+	/* Initialize service timer and service task */
+	set_bit(__FM10K_SERVICE_DISABLE, &interface->state);
+	setup_timer(&interface->service_timer, &fm10k_service_timer,
+		    (unsigned long)interface);
+	INIT_WORK(&interface->service_task, fm10k_service_task);
+
 	/* set default ring sizes */
 	interface->tx_ring_count = FM10K_DEFAULT_TXD;
 	interface->rx_ring_count = FM10K_DEFAULT_RXD;
@@ -870,6 +1299,9 @@ static int fm10k_probe(struct pci_dev *pdev,
 	/* print warning for non-optimal configurations */
 	fm10k_slot_warn(interface);
 
+	/* clear the service task disable bit to allow service task to start */
+	clear_bit(__FM10K_SERVICE_DISABLE, &interface->state);
+
 	return 0;
 
 err_register:
@@ -903,6 +1335,9 @@ static void fm10k_remove(struct pci_dev *pdev)
 	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
 	struct net_device *netdev = interface->netdev;
 
+	set_bit(__FM10K_SERVICE_DISABLE, &interface->state);
+	cancel_work_sync(&interface->service_task);
+
 	/* free netdev, this may bounce the interrupts due to setup_tc */
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 15/29] fm10k: Add Tx/Rx hardware ring bring-up/tear-down
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (13 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 14/29] fm10k: Add service task to handle delayed events Alexander Duyck
@ 2014-09-18 22:37 ` Alexander Duyck
  2014-09-18 22:38 ` [net-next PATCH 16/29] fm10k: Add transmit and receive fastpath and interrupt handlers Alexander Duyck
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:37 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for allocating, configuring, and freeing Tx/Rx ring
resources.  With these changes in place the descriptor queues are in a
state where they are ready to transmit or receive if provided buffers.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |    8 +
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |  339 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |  302 ++++++++++++++++++++
 3 files changed, 649 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index b88a52c..c0a169a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -407,6 +407,14 @@ void fm10k_update_rx_drop_en(struct fm10k_intfc *interface);
 
 /* Netdev */
 struct net_device *fm10k_alloc_netdev(void);
+int fm10k_setup_rx_resources(struct fm10k_ring *);
+int fm10k_setup_tx_resources(struct fm10k_ring *);
+void fm10k_free_rx_resources(struct fm10k_ring *);
+void fm10k_free_tx_resources(struct fm10k_ring *);
+void fm10k_clean_all_rx_rings(struct fm10k_intfc *);
+void fm10k_clean_all_tx_rings(struct fm10k_intfc *);
+void fm10k_unmap_and_free_tx_resource(struct fm10k_ring *,
+				      struct fm10k_tx_buffer *);
 void fm10k_restore_rx_state(struct fm10k_intfc *);
 void fm10k_reset_rx_state(struct fm10k_intfc *);
 int fm10k_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index b987bb6..24a3e8c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -21,6 +21,328 @@
 #include "fm10k.h"
 
 /**
+ * fm10k_setup_tx_resources - allocate Tx resources (Descriptors)
+ * @tx_ring:    tx descriptor ring (for a specific queue) to setup
+ *
+ * Return 0 on success, negative on failure
+ **/
+int fm10k_setup_tx_resources(struct fm10k_ring *tx_ring)
+{
+	struct device *dev = tx_ring->dev;
+	int size;
+
+	size = sizeof(struct fm10k_tx_buffer) * tx_ring->count;
+
+	tx_ring->tx_buffer = vzalloc(size);
+	if (!tx_ring->tx_buffer)
+		goto err;
+
+	/* round up to nearest 4K */
+	tx_ring->size = tx_ring->count * sizeof(struct fm10k_tx_desc);
+	tx_ring->size = ALIGN(tx_ring->size, 4096);
+
+	tx_ring->desc = dma_alloc_coherent(dev, tx_ring->size,
+					   &tx_ring->dma, GFP_KERNEL);
+	if (!tx_ring->desc)
+		goto err;
+
+	return 0;
+
+err:
+	vfree(tx_ring->tx_buffer);
+	tx_ring->tx_buffer = NULL;
+	return -ENOMEM;
+}
+
+/**
+ * fm10k_setup_all_tx_resources - allocate all queues Tx resources
+ * @interface: board private structure
+ *
+ * If this function returns with an error, then it's possible one or
+ * more of the rings is populated (while the rest are not).  It is the
+ * callers duty to clean those orphaned rings.
+ *
+ * Return 0 on success, negative on failure
+ **/
+static int fm10k_setup_all_tx_resources(struct fm10k_intfc *interface)
+{
+	int i, err = 0;
+
+	for (i = 0; i < interface->num_tx_queues; i++) {
+		err = fm10k_setup_tx_resources(interface->tx_ring[i]);
+		if (!err)
+			continue;
+
+		netif_err(interface, probe, interface->netdev,
+			  "Allocation for Tx Queue %u failed\n", i);
+		goto err_setup_tx;
+	}
+
+	return 0;
+err_setup_tx:
+	/* rewind the index freeing the rings as we go */
+	while (i--)
+		fm10k_free_tx_resources(interface->tx_ring[i]);
+	return err;
+}
+
+/**
+ * fm10k_setup_rx_resources - allocate Rx resources (Descriptors)
+ * @rx_ring:    rx descriptor ring (for a specific queue) to setup
+ *
+ * Returns 0 on success, negative on failure
+ **/
+int fm10k_setup_rx_resources(struct fm10k_ring *rx_ring)
+{
+	struct device *dev = rx_ring->dev;
+	int size;
+
+	size = sizeof(struct fm10k_rx_buffer) * rx_ring->count;
+
+	rx_ring->rx_buffer = vzalloc(size);
+	if (!rx_ring->rx_buffer)
+		goto err;
+
+	rx_ring->size = rx_ring->count * sizeof(union fm10k_rx_desc);
+
+	/* Round up to nearest 4K */
+	rx_ring->size = ALIGN(rx_ring->size, 4096);
+
+	rx_ring->desc = dma_alloc_coherent(dev, rx_ring->size,
+					   &rx_ring->dma, GFP_KERNEL);
+	if (!rx_ring->desc)
+		goto err;
+
+	return 0;
+err:
+	vfree(rx_ring->rx_buffer);
+	rx_ring->rx_buffer = NULL;
+	return -ENOMEM;
+}
+
+/**
+ * fm10k_setup_all_rx_resources - allocate all queues Rx resources
+ * @interface: board private structure
+ *
+ * If this function returns with an error, then it's possible one or
+ * more of the rings is populated (while the rest are not).  It is the
+ * callers duty to clean those orphaned rings.
+ *
+ * Return 0 on success, negative on failure
+ **/
+static int fm10k_setup_all_rx_resources(struct fm10k_intfc *interface)
+{
+	int i, err = 0;
+
+	for (i = 0; i < interface->num_rx_queues; i++) {
+		err = fm10k_setup_rx_resources(interface->rx_ring[i]);
+		if (!err)
+			continue;
+
+		netif_err(interface, probe, interface->netdev,
+			  "Allocation for Rx Queue %u failed\n", i);
+		goto err_setup_rx;
+	}
+
+	return 0;
+err_setup_rx:
+	/* rewind the index freeing the rings as we go */
+	while (i--)
+		fm10k_free_rx_resources(interface->rx_ring[i]);
+	return err;
+}
+
+void fm10k_unmap_and_free_tx_resource(struct fm10k_ring *ring,
+				      struct fm10k_tx_buffer *tx_buffer)
+{
+	if (tx_buffer->skb) {
+		dev_kfree_skb_any(tx_buffer->skb);
+		if (dma_unmap_len(tx_buffer, len))
+			dma_unmap_single(ring->dev,
+					 dma_unmap_addr(tx_buffer, dma),
+					 dma_unmap_len(tx_buffer, len),
+					 DMA_TO_DEVICE);
+	} else if (dma_unmap_len(tx_buffer, len)) {
+		dma_unmap_page(ring->dev,
+			       dma_unmap_addr(tx_buffer, dma),
+			       dma_unmap_len(tx_buffer, len),
+			       DMA_TO_DEVICE);
+	}
+	tx_buffer->next_to_watch = NULL;
+	tx_buffer->skb = NULL;
+	dma_unmap_len_set(tx_buffer, len, 0);
+	/* tx_buffer must be completely set up in the transmit path */
+}
+
+/**
+ * fm10k_clean_tx_ring - Free Tx Buffers
+ * @tx_ring: ring to be cleaned
+ **/
+static void fm10k_clean_tx_ring(struct fm10k_ring *tx_ring)
+{
+	struct fm10k_tx_buffer *tx_buffer;
+	unsigned long size;
+	u16 i;
+
+	/* ring already cleared, nothing to do */
+	if (!tx_ring->tx_buffer)
+		return;
+
+	/* Free all the Tx ring sk_buffs */
+	for (i = 0; i < tx_ring->count; i++) {
+		tx_buffer = &tx_ring->tx_buffer[i];
+		fm10k_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+	}
+
+	/* reset BQL values */
+	netdev_tx_reset_queue(txring_txq(tx_ring));
+
+	size = sizeof(struct fm10k_tx_buffer) * tx_ring->count;
+	memset(tx_ring->tx_buffer, 0, size);
+
+	/* Zero out the descriptor ring */
+	memset(tx_ring->desc, 0, tx_ring->size);
+}
+
+/**
+ * fm10k_free_tx_resources - Free Tx Resources per Queue
+ * @tx_ring: Tx descriptor ring for a specific queue
+ *
+ * Free all transmit software resources
+ **/
+void fm10k_free_tx_resources(struct fm10k_ring *tx_ring)
+{
+	fm10k_clean_tx_ring(tx_ring);
+
+	vfree(tx_ring->tx_buffer);
+	tx_ring->tx_buffer = NULL;
+
+	/* if not set, then don't free */
+	if (!tx_ring->desc)
+		return;
+
+	dma_free_coherent(tx_ring->dev, tx_ring->size,
+			  tx_ring->desc, tx_ring->dma);
+	tx_ring->desc = NULL;
+}
+
+/**
+ * fm10k_clean_all_tx_rings - Free Tx Buffers for all queues
+ * @interface: board private structure
+ **/
+void fm10k_clean_all_tx_rings(struct fm10k_intfc *interface)
+{
+	int i;
+
+	for (i = 0; i < interface->num_tx_queues; i++)
+		fm10k_clean_tx_ring(interface->tx_ring[i]);
+}
+
+/**
+ * fm10k_free_all_tx_resources - Free Tx Resources for All Queues
+ * @interface: board private structure
+ *
+ * Free all transmit software resources
+ **/
+static void fm10k_free_all_tx_resources(struct fm10k_intfc *interface)
+{
+	int i = interface->num_tx_queues;
+
+	while (i--)
+		fm10k_free_tx_resources(interface->tx_ring[i]);
+}
+
+/**
+ * fm10k_clean_rx_ring - Free Rx Buffers per Queue
+ * @rx_ring: ring to free buffers from
+ **/
+static void fm10k_clean_rx_ring(struct fm10k_ring *rx_ring)
+{
+	unsigned long size;
+	u16 i;
+
+	if (!rx_ring->rx_buffer)
+		return;
+
+	if (rx_ring->skb)
+		dev_kfree_skb(rx_ring->skb);
+	rx_ring->skb = NULL;
+
+	/* Free all the Rx ring sk_buffs */
+	for (i = 0; i < rx_ring->count; i++) {
+		struct fm10k_rx_buffer *buffer = &rx_ring->rx_buffer[i];
+		/* clean-up will only set page pointer to NULL */
+		if (!buffer->page)
+			continue;
+
+		dma_unmap_page(rx_ring->dev, buffer->dma,
+			       PAGE_SIZE, DMA_FROM_DEVICE);
+		__free_page(buffer->page);
+
+		buffer->page = NULL;
+	}
+
+	size = sizeof(struct fm10k_rx_buffer) * rx_ring->count;
+	memset(rx_ring->rx_buffer, 0, size);
+
+	/* Zero out the descriptor ring */
+	memset(rx_ring->desc, 0, rx_ring->size);
+
+	rx_ring->next_to_alloc = 0;
+	rx_ring->next_to_clean = 0;
+	rx_ring->next_to_use = 0;
+}
+
+/**
+ * fm10k_free_rx_resources - Free Rx Resources
+ * @rx_ring: ring to clean the resources from
+ *
+ * Free all receive software resources
+ **/
+void fm10k_free_rx_resources(struct fm10k_ring *rx_ring)
+{
+	fm10k_clean_rx_ring(rx_ring);
+
+	vfree(rx_ring->rx_buffer);
+	rx_ring->rx_buffer = NULL;
+
+	/* if not set, then don't free */
+	if (!rx_ring->desc)
+		return;
+
+	dma_free_coherent(rx_ring->dev, rx_ring->size,
+			  rx_ring->desc, rx_ring->dma);
+
+	rx_ring->desc = NULL;
+}
+
+/**
+ * fm10k_clean_all_rx_rings - Free Rx Buffers for all queues
+ * @interface: board private structure
+ **/
+void fm10k_clean_all_rx_rings(struct fm10k_intfc *interface)
+{
+	int i;
+
+	for (i = 0; i < interface->num_rx_queues; i++)
+		fm10k_clean_rx_ring(interface->rx_ring[i]);
+}
+
+/**
+ * fm10k_free_all_rx_resources - Free Rx Resources for All Queues
+ * @interface: board private structure
+ *
+ * Free all receive software resources
+ **/
+static void fm10k_free_all_rx_resources(struct fm10k_intfc *interface)
+{
+	int i = interface->num_rx_queues;
+
+	while (i--)
+		fm10k_free_rx_resources(interface->rx_ring[i]);
+}
+
+/**
  * fm10k_request_glort_range - Request GLORTs for use in configuring rules
  * @interface: board private structure
  *
@@ -59,6 +381,16 @@ int fm10k_open(struct net_device *netdev)
 	struct fm10k_intfc *interface = netdev_priv(netdev);
 	int err;
 
+	/* allocate transmit descriptors */
+	err = fm10k_setup_all_tx_resources(interface);
+	if (err)
+		goto err_setup_tx;
+
+	/* allocate receive descriptors */
+	err = fm10k_setup_all_rx_resources(interface);
+	if (err)
+		goto err_setup_rx;
+
 	/* allocate interrupt resources */
 	err = fm10k_qv_request_irq(interface);
 	if (err)
@@ -81,6 +413,10 @@ int fm10k_open(struct net_device *netdev)
 err_set_queues:
 	fm10k_qv_free_irq(interface);
 err_req_irq:
+	fm10k_free_all_rx_resources(interface);
+err_setup_rx:
+	fm10k_free_all_tx_resources(interface);
+err_setup_tx:
 	return err;
 }
 
@@ -103,6 +439,9 @@ int fm10k_close(struct net_device *netdev)
 
 	fm10k_qv_free_irq(interface);
 
+	fm10k_free_all_tx_resources(interface);
+	fm10k_free_all_rx_resources(interface);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 879371f..e8ed1a4 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -485,6 +485,299 @@ static void fm10k_service_task(struct work_struct *work)
 	fm10k_service_event_complete(interface);
 }
 
+/**
+ * fm10k_configure_tx_ring - Configure Tx ring after Reset
+ * @interface: board private structure
+ * @ring: structure containing ring specific data
+ *
+ * Configure the Tx descriptor ring after a reset.
+ **/
+static void fm10k_configure_tx_ring(struct fm10k_intfc *interface,
+				    struct fm10k_ring *ring)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	u64 tdba = ring->dma;
+	u32 size = ring->count * sizeof(struct fm10k_tx_desc);
+	u32 txint = FM10K_INT_MAP_DISABLE;
+	u32 txdctl = FM10K_TXDCTL_ENABLE | (1 << FM10K_TXDCTL_MAX_TIME_SHIFT);
+	u8 reg_idx = ring->reg_idx;
+
+	/* disable queue to avoid issues while updating state */
+	fm10k_write_reg(hw, FM10K_TXDCTL(reg_idx), 0);
+	fm10k_write_flush(hw);
+
+	/* possible poll here to verify ring resources have been cleaned */
+
+	/* set location and size for descriptor ring */
+	fm10k_write_reg(hw, FM10K_TDBAL(reg_idx), tdba & DMA_BIT_MASK(32));
+	fm10k_write_reg(hw, FM10K_TDBAH(reg_idx), tdba >> 32);
+	fm10k_write_reg(hw, FM10K_TDLEN(reg_idx), size);
+
+	/* reset head and tail pointers */
+	fm10k_write_reg(hw, FM10K_TDH(reg_idx), 0);
+	fm10k_write_reg(hw, FM10K_TDT(reg_idx), 0);
+
+	/* store tail pointer */
+	ring->tail = &interface->uc_addr[FM10K_TDT(reg_idx)];
+
+	/* reset ntu and ntc to place SW in sync with hardwdare */
+	ring->next_to_clean = 0;
+	ring->next_to_use = 0;
+
+	/* Map interrupt */
+	if (ring->q_vector) {
+		txint = ring->q_vector->v_idx + NON_Q_VECTORS(hw);
+		txint |= FM10K_INT_MAP_TIMER0;
+	}
+
+	fm10k_write_reg(hw, FM10K_TXINT(reg_idx), txint);
+
+	/* enable use of FTAG bit in Tx descriptor, register is RO for VF */
+	fm10k_write_reg(hw, FM10K_PFVTCTL(reg_idx),
+			FM10K_PFVTCTL_FTAG_DESC_ENABLE);
+
+	/* enable queue */
+	fm10k_write_reg(hw, FM10K_TXDCTL(reg_idx), txdctl);
+}
+
+/**
+ * fm10k_enable_tx_ring - Verify Tx ring is enabled after configuration
+ * @interface: board private structure
+ * @ring: structure containing ring specific data
+ *
+ * Verify the Tx descriptor ring is ready for transmit.
+ **/
+static void fm10k_enable_tx_ring(struct fm10k_intfc *interface,
+				 struct fm10k_ring *ring)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	int wait_loop = 10;
+	u32 txdctl;
+	u8 reg_idx = ring->reg_idx;
+
+	/* if we are already enabled just exit */
+	if (fm10k_read_reg(hw, FM10K_TXDCTL(reg_idx)) & FM10K_TXDCTL_ENABLE)
+		return;
+
+	/* poll to verify queue is enabled */
+	do {
+		usleep_range(1000, 2000);
+		txdctl = fm10k_read_reg(hw, FM10K_TXDCTL(reg_idx));
+	} while (!(txdctl & FM10K_TXDCTL_ENABLE) && --wait_loop);
+	if (!wait_loop)
+		netif_err(interface, drv, interface->netdev,
+			  "Could not enable Tx Queue %d\n", reg_idx);
+}
+
+/**
+ * fm10k_configure_tx - Configure Transmit Unit after Reset
+ * @interface: board private structure
+ *
+ * Configure the Tx unit of the MAC after a reset.
+ **/
+static void fm10k_configure_tx(struct fm10k_intfc *interface)
+{
+	int i;
+
+	/* Setup the HW Tx Head and Tail descriptor pointers */
+	for (i = 0; i < interface->num_tx_queues; i++)
+		fm10k_configure_tx_ring(interface, interface->tx_ring[i]);
+
+	/* poll here to verify that Tx rings are now enabled */
+	for (i = 0; i < interface->num_tx_queues; i++)
+		fm10k_enable_tx_ring(interface, interface->tx_ring[i]);
+}
+
+/**
+ * fm10k_configure_rx_ring - Configure Rx ring after Reset
+ * @interface: board private structure
+ * @ring: structure containing ring specific data
+ *
+ * Configure the Rx descriptor ring after a reset.
+ **/
+static void fm10k_configure_rx_ring(struct fm10k_intfc *interface,
+				    struct fm10k_ring *ring)
+{
+	u64 rdba = ring->dma;
+	struct fm10k_hw *hw = &interface->hw;
+	u32 size = ring->count * sizeof(union fm10k_rx_desc);
+	u32 rxqctl = FM10K_RXQCTL_ENABLE | FM10K_RXQCTL_PF;
+	u32 rxdctl = FM10K_RXDCTL_WRITE_BACK_MIN_DELAY;
+	u32 srrctl = FM10K_SRRCTL_BUFFER_CHAINING_EN;
+	u32 rxint = FM10K_INT_MAP_DISABLE;
+	u8 rx_pause = interface->rx_pause;
+	u8 reg_idx = ring->reg_idx;
+
+	/* disable queue to avoid issues while updating state */
+	fm10k_write_reg(hw, FM10K_RXQCTL(reg_idx), 0);
+	fm10k_write_flush(hw);
+
+	/* possible poll here to verify ring resources have been cleaned */
+
+	/* set location and size for descriptor ring */
+	fm10k_write_reg(hw, FM10K_RDBAL(reg_idx), rdba & DMA_BIT_MASK(32));
+	fm10k_write_reg(hw, FM10K_RDBAH(reg_idx), rdba >> 32);
+	fm10k_write_reg(hw, FM10K_RDLEN(reg_idx), size);
+
+	/* reset head and tail pointers */
+	fm10k_write_reg(hw, FM10K_RDH(reg_idx), 0);
+	fm10k_write_reg(hw, FM10K_RDT(reg_idx), 0);
+
+	/* store tail pointer */
+	ring->tail = &interface->uc_addr[FM10K_RDT(reg_idx)];
+
+	/* reset ntu and ntc to place SW in sync with hardwdare */
+	ring->next_to_clean = 0;
+	ring->next_to_use = 0;
+	ring->next_to_alloc = 0;
+
+	/* Configure the Rx buffer size for one buff without split */
+	srrctl |= FM10K_RX_BUFSZ >> FM10K_SRRCTL_BSIZEPKT_SHIFT;
+
+	/* Configure the Rx ring to supress loopback packets */
+	srrctl |= FM10K_SRRCTL_LOOPBACK_SUPPRESS;
+	fm10k_write_reg(hw, FM10K_SRRCTL(reg_idx), srrctl);
+
+	/* Enable drop on empty */
+#if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
+	if (interface->pfc_en)
+		rx_pause = interface->pfc_en;
+#endif
+	if (!(rx_pause & (1 << ring->qos_pc)))
+		rxdctl |= FM10K_RXDCTL_DROP_ON_EMPTY;
+
+	fm10k_write_reg(hw, FM10K_RXDCTL(reg_idx), rxdctl);
+
+	/* assign default VLAN to queue */
+	ring->vid = hw->mac.default_vid;
+
+	/* Map interrupt */
+	if (ring->q_vector) {
+		rxint = ring->q_vector->v_idx + NON_Q_VECTORS(hw);
+		rxint |= FM10K_INT_MAP_TIMER1;
+	}
+
+	fm10k_write_reg(hw, FM10K_RXINT(reg_idx), rxint);
+
+	/* enable queue */
+	fm10k_write_reg(hw, FM10K_RXQCTL(reg_idx), rxqctl);
+}
+
+/**
+ * fm10k_update_rx_drop_en - Configures the drop enable bits for Rx rings
+ * @interface: board private structure
+ *
+ * Configure the drop enable bits for the Rx rings.
+ **/
+void fm10k_update_rx_drop_en(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	u8 rx_pause = interface->rx_pause;
+	int i;
+
+#if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
+	if (interface->pfc_en)
+		rx_pause = interface->pfc_en;
+
+#endif
+	for (i = 0; i < interface->num_rx_queues; i++) {
+		struct fm10k_ring *ring = interface->rx_ring[i];
+		u32 rxdctl = FM10K_RXDCTL_WRITE_BACK_MIN_DELAY;
+		u8 reg_idx = ring->reg_idx;
+
+		if (!(rx_pause & (1 << ring->qos_pc)))
+			rxdctl |= FM10K_RXDCTL_DROP_ON_EMPTY;
+
+		fm10k_write_reg(hw, FM10K_RXDCTL(reg_idx), rxdctl);
+	}
+}
+
+/**
+ * fm10k_configure_dglort - Configure Receive DGLORT after reset
+ * @interface: board private structure
+ *
+ * Configure the DGLORT description and RSS tables.
+ **/
+static void fm10k_configure_dglort(struct fm10k_intfc *interface)
+{
+	struct fm10k_dglort_cfg dglort = { 0 };
+	struct fm10k_hw *hw = &interface->hw;
+	int i;
+	u32 mrqc;
+
+	/* Fill out hash function seeds */
+	for (i = 0; i < FM10K_RSSRK_SIZE; i++)
+		fm10k_write_reg(hw, FM10K_RSSRK(0, i), interface->rssrk[i]);
+
+	/* Write RETA table to hardware */
+	for (i = 0; i < FM10K_RETA_SIZE; i++)
+		fm10k_write_reg(hw, FM10K_RETA(0, i), interface->reta[i]);
+
+	/* Generate RSS hash based on packet types, TCP/UDP
+	 * port numbers and/or IPv4/v6 src and dst addresses
+	 */
+	mrqc = FM10K_MRQC_IPV4 |
+	       FM10K_MRQC_TCP_IPV4 |
+	       FM10K_MRQC_IPV6 |
+	       FM10K_MRQC_TCP_IPV6;
+
+	if (interface->flags & FM10K_FLAG_RSS_FIELD_IPV4_UDP)
+		mrqc |= FM10K_MRQC_UDP_IPV4;
+	if (interface->flags & FM10K_FLAG_RSS_FIELD_IPV6_UDP)
+		mrqc |= FM10K_MRQC_UDP_IPV6;
+
+	fm10k_write_reg(hw, FM10K_MRQC(0), mrqc);
+
+	/* configure default DGLORT mapping for RSS/DCB */
+	dglort.inner_rss = 1;
+	dglort.rss_l = fls(interface->ring_feature[RING_F_RSS].mask);
+	dglort.pc_l = fls(interface->ring_feature[RING_F_QOS].mask);
+	hw->mac.ops.configure_dglort_map(hw, &dglort);
+
+	/* assign GLORT per queue for queue mapped testing */
+	if (interface->glort_count > 64) {
+		memset(&dglort, 0, sizeof(dglort));
+		dglort.inner_rss = 1;
+		dglort.glort = interface->glort + 64;
+		dglort.idx = fm10k_dglort_pf_queue;
+		dglort.queue_l = fls(interface->num_rx_queues - 1);
+		hw->mac.ops.configure_dglort_map(hw, &dglort);
+	}
+
+	/* assign glort value for RSS/DCB specific to this interface */
+	memset(&dglort, 0, sizeof(dglort));
+	dglort.inner_rss = 1;
+	dglort.glort = interface->glort;
+	dglort.rss_l = fls(interface->ring_feature[RING_F_RSS].mask);
+	dglort.pc_l = fls(interface->ring_feature[RING_F_QOS].mask);
+	/* configure DGLORT mapping for RSS/DCB */
+	dglort.idx = fm10k_dglort_pf_rss;
+	hw->mac.ops.configure_dglort_map(hw, &dglort);
+}
+
+/**
+ * fm10k_configure_rx - Configure Receive Unit after Reset
+ * @interface: board private structure
+ *
+ * Configure the Rx unit of the MAC after a reset.
+ **/
+static void fm10k_configure_rx(struct fm10k_intfc *interface)
+{
+	int i;
+
+	/* Configure SWPRI to PC map */
+	fm10k_configure_swpri_map(interface);
+
+	/* Configure RSS and DGLORT map */
+	fm10k_configure_dglort(interface);
+
+	/* Setup the HW Rx Head and Tail descriptor pointers */
+	for (i = 0; i < interface->num_rx_queues; i++)
+		fm10k_configure_rx_ring(interface, interface->rx_ring[i]);
+
+	/* possible poll here to verify that Rx rings are now enabled */
+}
+
 static void fm10k_napi_enable_all(struct fm10k_intfc *interface)
 {
 	struct fm10k_q_vector *q_vector;
@@ -969,6 +1262,12 @@ void fm10k_up(struct fm10k_intfc *interface)
 	/* Enable Tx/Rx DMA */
 	hw->mac.ops.start_hw(hw);
 
+	/* configure Tx descriptor rings */
+	fm10k_configure_tx(interface);
+
+	/* configure Rx descriptor rings */
+	fm10k_configure_rx(interface);
+
 	/* configure interrupts */
 	hw->mac.ops.update_int_moderator(hw);
 
@@ -1030,6 +1329,9 @@ void fm10k_down(struct fm10k_intfc *interface)
 
 	/* Disable DMA engine for Tx/Rx */
 	hw->mac.ops.stop_hw(hw);
+
+	/* free any buffers still on the rings */
+	fm10k_clean_all_tx_rings(interface);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 16/29] fm10k: Add transmit and receive fastpath and interrupt handlers
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (14 preceding siblings ...)
  2014-09-18 22:37 ` [net-next PATCH 15/29] fm10k: Add Tx/Rx hardware ring bring-up/tear-down Alexander Duyck
@ 2014-09-18 22:38 ` Alexander Duyck
  2014-09-18 22:38 ` [net-next PATCH 17/29] fm10k: Add ethtool support Alexander Duyck
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:38 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This change adds the transmit and receive fastpath and interrupt handlers.
With this code in place the network device is now able to send and receive
frames over the network interface using a single queue.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |    5 
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |  937 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   94 ++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |    3 
 4 files changed, 1037 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index c0a169a..98cbcd8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -391,6 +391,11 @@ extern char fm10k_driver_name[];
 extern const char fm10k_driver_version[];
 int fm10k_init_queueing_scheme(struct fm10k_intfc *interface);
 void fm10k_clear_queueing_scheme(struct fm10k_intfc *interface);
+netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
+				  struct fm10k_ring *tx_ring);
+void fm10k_tx_timeout_reset(struct fm10k_intfc *interface);
+bool fm10k_check_tx_hang(struct fm10k_ring *tx_ring);
+void fm10k_alloc_rx_buffers(struct fm10k_ring *rx_ring, u16 cleaned_count);
 
 /* PCI */
 void fm10k_mbx_free_irq(struct fm10k_intfc *);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index bf84c26..0d209c5 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -67,6 +67,921 @@ static void __exit fm10k_exit_module(void)
 }
 module_exit(fm10k_exit_module);
 
+static bool fm10k_alloc_mapped_page(struct fm10k_ring *rx_ring,
+				    struct fm10k_rx_buffer *bi)
+{
+	struct page *page = bi->page;
+	dma_addr_t dma;
+
+	/* Only page will be NULL if buffer was consumed */
+	if (likely(page))
+		return true;
+
+	/* alloc new page for storage */
+	page = alloc_page(GFP_ATOMIC | __GFP_COLD);
+	if (unlikely(!page)) {
+		rx_ring->rx_stats.alloc_failed++;
+		return false;
+	}
+
+	/* map page for use */
+	dma = dma_map_page(rx_ring->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+
+	/* if mapping failed free memory back to system since
+	 * there isn't much point in holding memory we can't use
+	 */
+	if (dma_mapping_error(rx_ring->dev, dma)) {
+		__free_page(page);
+		bi->page = NULL;
+
+		rx_ring->rx_stats.alloc_failed++;
+		return false;
+	}
+
+	bi->dma = dma;
+	bi->page = page;
+	bi->page_offset = 0;
+
+	return true;
+}
+
+/**
+ * fm10k_alloc_rx_buffers - Replace used receive buffers
+ * @rx_ring: ring to place buffers on
+ * @cleaned_count: number of buffers to replace
+ **/
+void fm10k_alloc_rx_buffers(struct fm10k_ring *rx_ring, u16 cleaned_count)
+{
+	union fm10k_rx_desc *rx_desc;
+	struct fm10k_rx_buffer *bi;
+	u16 i = rx_ring->next_to_use;
+
+	/* nothing to do */
+	if (!cleaned_count)
+		return;
+
+	rx_desc = FM10K_RX_DESC(rx_ring, i);
+	bi = &rx_ring->rx_buffer[i];
+	i -= rx_ring->count;
+
+	do {
+		if (!fm10k_alloc_mapped_page(rx_ring, bi))
+			break;
+
+		/* Refresh the desc even if buffer_addrs didn't change
+		 * because each write-back erases this info.
+		 */
+		rx_desc->q.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
+
+		rx_desc++;
+		bi++;
+		i++;
+		if (unlikely(!i)) {
+			rx_desc = FM10K_RX_DESC(rx_ring, 0);
+			bi = rx_ring->rx_buffer;
+			i -= rx_ring->count;
+		}
+
+		/* clear the hdr_addr for the next_to_use descriptor */
+		rx_desc->q.hdr_addr = 0;
+
+		cleaned_count--;
+	} while (cleaned_count);
+
+	i += rx_ring->count;
+
+	if (rx_ring->next_to_use != i) {
+		/* record the next descriptor to use */
+		rx_ring->next_to_use = i;
+
+		/* update next to alloc since we have filled the ring */
+		rx_ring->next_to_alloc = i;
+
+		/* Force memory writes to complete before letting h/w
+		 * know there are new descriptors to fetch.  (Only
+		 * applicable for weak-ordered memory model archs,
+		 * such as IA-64).
+		 */
+		wmb();
+
+		/* notify hardware of new descriptors */
+		writel(i, rx_ring->tail);
+	}
+}
+
+/**
+ * fm10k_reuse_rx_page - page flip buffer and store it back on the ring
+ * @rx_ring: rx descriptor ring to store buffers on
+ * @old_buff: donor buffer to have page reused
+ *
+ * Synchronizes page for reuse by the interface
+ **/
+static void fm10k_reuse_rx_page(struct fm10k_ring *rx_ring,
+				struct fm10k_rx_buffer *old_buff)
+{
+	struct fm10k_rx_buffer *new_buff;
+	u16 nta = rx_ring->next_to_alloc;
+
+	new_buff = &rx_ring->rx_buffer[nta];
+
+	/* update, and store next to alloc */
+	nta++;
+	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
+
+	/* transfer page from old buffer to new buffer */
+	memcpy(new_buff, old_buff, sizeof(struct fm10k_rx_buffer));
+
+	/* sync the buffer for use by the device */
+	dma_sync_single_range_for_device(rx_ring->dev, old_buff->dma,
+					 old_buff->page_offset,
+					 FM10K_RX_BUFSZ,
+					 DMA_FROM_DEVICE);
+}
+
+static bool fm10k_can_reuse_rx_page(struct fm10k_rx_buffer *rx_buffer,
+				    struct page *page,
+				    unsigned int truesize)
+{
+	/* avoid re-using remote pages */
+	if (unlikely(page_to_nid(page) != numa_mem_id()))
+		return false;
+
+#if (PAGE_SIZE < 8192)
+	/* if we are only owner of page we can reuse it */
+	if (unlikely(page_count(page) != 1))
+		return false;
+
+	/* flip page offset to other buffer */
+	rx_buffer->page_offset ^= FM10K_RX_BUFSZ;
+
+	/* since we are the only owner of the page and we need to
+	 * increment it, just set the value to 2 in order to avoid
+	 * an unnecessary locked operation
+	 */
+	atomic_set(&page->_count, 2);
+#else
+	/* move offset up to the next cache line */
+	rx_buffer->page_offset += truesize;
+
+	if (rx_buffer->page_offset > (PAGE_SIZE - FM10K_RX_BUFSZ))
+		return false;
+
+	/* bump ref count on page before it is given to the stack */
+	get_page(page);
+#endif
+
+	return true;
+}
+
+/**
+ * fm10k_add_rx_frag - Add contents of Rx buffer to sk_buff
+ * @rx_ring: rx descriptor ring to transact packets on
+ * @rx_buffer: buffer containing page to add
+ * @rx_desc: descriptor containing length of buffer written by hardware
+ * @skb: sk_buff to place the data into
+ *
+ * This function will add the data contained in rx_buffer->page to the skb.
+ * This is done either through a direct copy if the data in the buffer is
+ * less than the skb header size, otherwise it will just attach the page as
+ * a frag to the skb.
+ *
+ * The function will then update the page offset if necessary and return
+ * true if the buffer can be reused by the interface.
+ **/
+static bool fm10k_add_rx_frag(struct fm10k_ring *rx_ring,
+			      struct fm10k_rx_buffer *rx_buffer,
+			      union fm10k_rx_desc *rx_desc,
+			      struct sk_buff *skb)
+{
+	struct page *page = rx_buffer->page;
+	unsigned int size = le16_to_cpu(rx_desc->w.length);
+#if (PAGE_SIZE < 8192)
+	unsigned int truesize = FM10K_RX_BUFSZ;
+#else
+	unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
+#endif
+
+	if ((size <= FM10K_RX_HDR_LEN) && !skb_is_nonlinear(skb)) {
+		unsigned char *va = page_address(page) + rx_buffer->page_offset;
+
+		memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
+
+		/* we can reuse buffer as-is, just make sure it is local */
+		if (likely(page_to_nid(page) == numa_mem_id()))
+			return true;
+
+		/* this page cannot be reused so discard it */
+		put_page(page);
+		return false;
+	}
+
+	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
+			rx_buffer->page_offset, size, truesize);
+
+	return fm10k_can_reuse_rx_page(rx_buffer, page, truesize);
+}
+
+static struct sk_buff *fm10k_fetch_rx_buffer(struct fm10k_ring *rx_ring,
+					     union fm10k_rx_desc *rx_desc,
+					     struct sk_buff *skb)
+{
+	struct fm10k_rx_buffer *rx_buffer;
+	struct page *page;
+
+	rx_buffer = &rx_ring->rx_buffer[rx_ring->next_to_clean];
+
+	page = rx_buffer->page;
+	prefetchw(page);
+
+	if (likely(!skb)) {
+		void *page_addr = page_address(page) +
+				  rx_buffer->page_offset;
+
+		/* prefetch first cache line of first page */
+		prefetch(page_addr);
+#if L1_CACHE_BYTES < 128
+		prefetch(page_addr + L1_CACHE_BYTES);
+#endif
+
+		/* allocate a skb to store the frags */
+		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
+						FM10K_RX_HDR_LEN);
+		if (unlikely(!skb)) {
+			rx_ring->rx_stats.alloc_failed++;
+			return NULL;
+		}
+
+		/* we will be copying header into skb->data in
+		 * pskb_may_pull so it is in our interest to prefetch
+		 * it now to avoid a possible cache miss
+		 */
+		prefetchw(skb->data);
+	}
+
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(rx_ring->dev,
+				      rx_buffer->dma,
+				      rx_buffer->page_offset,
+				      FM10K_RX_BUFSZ,
+				      DMA_FROM_DEVICE);
+
+	/* pull page into skb */
+	if (fm10k_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+		/* hand second half of page back to the ring */
+		fm10k_reuse_rx_page(rx_ring, rx_buffer);
+	} else {
+		/* we are not reusing the buffer so unmap it */
+		dma_unmap_page(rx_ring->dev, rx_buffer->dma,
+			       PAGE_SIZE, DMA_FROM_DEVICE);
+	}
+
+	/* clear contents of rx_buffer */
+	rx_buffer->page = NULL;
+
+	return skb;
+}
+
+/**
+ * fm10k_process_skb_fields - Populate skb header fields from Rx descriptor
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being populated
+ *
+ * This function checks the ring, descriptor, and packet information in
+ * order to populate the hash, checksum, VLAN, timestamp, protocol, and
+ * other fields within the skb.
+ **/
+static unsigned int fm10k_process_skb_fields(struct fm10k_ring *rx_ring,
+					     union fm10k_rx_desc *rx_desc,
+					     struct sk_buff *skb)
+{
+	unsigned int len = skb->len;
+
+	FM10K_CB(skb)->fi.w.vlan = rx_desc->w.vlan;
+
+	skb_record_rx_queue(skb, rx_ring->queue_index);
+
+	FM10K_CB(skb)->fi.d.glort = rx_desc->d.glort;
+
+	if (rx_desc->w.vlan) {
+		u16 vid = le16_to_cpu(rx_desc->w.vlan);
+
+		if (vid != rx_ring->vid)
+			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
+	}
+
+	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+
+	return len;
+}
+
+/**
+ * fm10k_is_non_eop - process handling of non-EOP buffers
+ * @rx_ring: Rx ring being processed
+ * @rx_desc: Rx descriptor for current buffer
+ *
+ * This function updates next to clean.  If the buffer is an EOP buffer
+ * this function exits returning false, otherwise it will place the
+ * sk_buff in the next buffer to be chained and return true indicating
+ * that this is in fact a non-EOP buffer.
+ **/
+static bool fm10k_is_non_eop(struct fm10k_ring *rx_ring,
+			     union fm10k_rx_desc *rx_desc)
+{
+	u32 ntc = rx_ring->next_to_clean + 1;
+
+	/* fetch, update, and store next to clean */
+	ntc = (ntc < rx_ring->count) ? ntc : 0;
+	rx_ring->next_to_clean = ntc;
+
+	prefetch(FM10K_RX_DESC(rx_ring, ntc));
+
+	if (likely(fm10k_test_staterr(rx_desc, FM10K_RXD_STATUS_EOP)))
+		return false;
+
+	return true;
+}
+
+/**
+ * fm10k_pull_tail - fm10k specific version of skb_pull_tail
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being adjusted
+ *
+ * This function is an fm10k specific version of __pskb_pull_tail.  The
+ * main difference between this version and the original function is that
+ * this function can make several assumptions about the state of things
+ * that allow for significant optimizations versus the standard function.
+ * As a result we can do things like drop a frag and maintain an accurate
+ * truesize for the skb.
+ */
+static void fm10k_pull_tail(struct fm10k_ring *rx_ring,
+			    union fm10k_rx_desc *rx_desc,
+			    struct sk_buff *skb)
+{
+	struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[0];
+	unsigned char *va;
+	unsigned int pull_len;
+
+	/* it is valid to use page_address instead of kmap since we are
+	 * working with pages allocated out of the lomem pool per
+	 * alloc_page(GFP_ATOMIC)
+	 */
+	va = skb_frag_address(frag);
+
+	/* we need the header to contain the greater of either ETH_HLEN or
+	 * 60 bytes if the skb->len is less than 60 for skb_pad.
+	 */
+	pull_len = eth_get_headlen(va, FM10K_RX_HDR_LEN);
+
+	/* align pull length to size of long to optimize memcpy performance */
+	skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
+
+	/* update all of the pointers */
+	skb_frag_size_sub(frag, pull_len);
+	frag->page_offset += pull_len;
+	skb->data_len -= pull_len;
+	skb->tail += pull_len;
+}
+
+/**
+ * fm10k_cleanup_headers - Correct corrupted or empty headers
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being fixed
+ *
+ * Address the case where we are pulling data in on pages only
+ * and as such no data is present in the skb header.
+ *
+ * In addition if skb is not at least 60 bytes we need to pad it so that
+ * it is large enough to qualify as a valid Ethernet frame.
+ *
+ * Returns true if an error was encountered and skb was freed.
+ **/
+static bool fm10k_cleanup_headers(struct fm10k_ring *rx_ring,
+				  union fm10k_rx_desc *rx_desc,
+				  struct sk_buff *skb)
+{
+	if (unlikely((fm10k_test_staterr(rx_desc,
+					 FM10K_RXD_STATUS_RXE)))) {
+		dev_kfree_skb_any(skb);
+		rx_ring->rx_stats.errors++;
+		return true;
+	}
+
+	/* place header in linear portion of buffer */
+	if (skb_is_nonlinear(skb))
+		fm10k_pull_tail(rx_ring, rx_desc, skb);
+
+	/* if skb_pad returns an error the skb was freed */
+	if (unlikely(skb->len < 60)) {
+		int pad_len = 60 - skb->len;
+
+		if (skb_pad(skb, pad_len))
+			return true;
+		__skb_put(skb, pad_len);
+	}
+
+	return false;
+}
+
+/**
+ * fm10k_receive_skb - helper function to handle rx indications
+ * @q_vector: structure containing interrupt and ring information
+ * @skb: packet to send up
+ **/
+static void fm10k_receive_skb(struct fm10k_q_vector *q_vector,
+			      struct sk_buff *skb)
+{
+	napi_gro_receive(&q_vector->napi, skb);
+}
+
+static bool fm10k_clean_rx_irq(struct fm10k_q_vector *q_vector,
+			       struct fm10k_ring *rx_ring,
+			       int budget)
+{
+	struct sk_buff *skb = rx_ring->skb;
+	unsigned int total_bytes = 0, total_packets = 0;
+	u16 cleaned_count = fm10k_desc_unused(rx_ring);
+
+	do {
+		union fm10k_rx_desc *rx_desc;
+
+		/* return some buffers to hardware, one at a time is too slow */
+		if (cleaned_count >= FM10K_RX_BUFFER_WRITE) {
+			fm10k_alloc_rx_buffers(rx_ring, cleaned_count);
+			cleaned_count = 0;
+		}
+
+		rx_desc = FM10K_RX_DESC(rx_ring, rx_ring->next_to_clean);
+
+		if (!fm10k_test_staterr(rx_desc, FM10K_RXD_STATUS_DD))
+			break;
+
+		/* This memory barrier is needed to keep us from reading
+		 * any other fields out of the rx_desc until we know the
+		 * RXD_STATUS_DD bit is set
+		 */
+		rmb();
+
+		/* retrieve a buffer from the ring */
+		skb = fm10k_fetch_rx_buffer(rx_ring, rx_desc, skb);
+
+		/* exit if we failed to retrieve a buffer */
+		if (!skb)
+			break;
+
+		cleaned_count++;
+
+		/* fetch next buffer in frame if non-eop */
+		if (fm10k_is_non_eop(rx_ring, rx_desc))
+			continue;
+
+		/* verify the packet layout is correct */
+		if (fm10k_cleanup_headers(rx_ring, rx_desc, skb)) {
+			skb = NULL;
+			continue;
+		}
+
+		/* populate checksum, timestamp, VLAN, and protocol */
+		total_bytes += fm10k_process_skb_fields(rx_ring, rx_desc, skb);
+
+		fm10k_receive_skb(q_vector, skb);
+
+		/* reset skb pointer */
+		skb = NULL;
+
+		/* update budget accounting */
+		total_packets++;
+	} while (likely(total_packets < budget));
+
+	/* place incomplete frames back on ring for completion */
+	rx_ring->skb = skb;
+
+	u64_stats_update_begin(&rx_ring->syncp);
+	rx_ring->stats.packets += total_packets;
+	rx_ring->stats.bytes += total_bytes;
+	u64_stats_update_end(&rx_ring->syncp);
+	q_vector->rx.total_packets += total_packets;
+	q_vector->rx.total_bytes += total_bytes;
+
+	return total_packets < budget;
+}
+
+static bool fm10k_tx_desc_push(struct fm10k_ring *tx_ring,
+			       struct fm10k_tx_desc *tx_desc, u16 i,
+			       dma_addr_t dma, unsigned int size, u8 desc_flags)
+{
+	/* set RS and INT for last frame in a cache line */
+	if ((++i & (FM10K_TXD_WB_FIFO_SIZE - 1)) == 0)
+		desc_flags |= FM10K_TXD_FLAG_RS | FM10K_TXD_FLAG_INT;
+
+	/* record values to descriptor */
+	tx_desc->buffer_addr = cpu_to_le64(dma);
+	tx_desc->flags = desc_flags;
+	tx_desc->buflen = cpu_to_le16(size);
+
+	/* return true if we just wrapped the ring */
+	return i == tx_ring->count;
+}
+
+static void fm10k_tx_map(struct fm10k_ring *tx_ring,
+			 struct fm10k_tx_buffer *first)
+{
+	struct sk_buff *skb = first->skb;
+	struct fm10k_tx_buffer *tx_buffer;
+	struct fm10k_tx_desc *tx_desc;
+	struct skb_frag_struct *frag;
+	unsigned char *data;
+	dma_addr_t dma;
+	unsigned int data_len, size;
+	u16 i = tx_ring->next_to_use;
+	u8 flags = 0;
+
+	tx_desc = FM10K_TX_DESC(tx_ring, i);
+
+	/* add HW VLAN tag */
+	if (vlan_tx_tag_present(skb))
+		tx_desc->vlan = cpu_to_le16(vlan_tx_tag_get(skb));
+	else
+		tx_desc->vlan = 0;
+
+	size = skb_headlen(skb);
+	data = skb->data;
+
+	dma = dma_map_single(tx_ring->dev, data, size, DMA_TO_DEVICE);
+
+	data_len = skb->data_len;
+	tx_buffer = first;
+
+	for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
+		if (dma_mapping_error(tx_ring->dev, dma))
+			goto dma_error;
+
+		/* record length, and DMA address */
+		dma_unmap_len_set(tx_buffer, len, size);
+		dma_unmap_addr_set(tx_buffer, dma, dma);
+
+		while (unlikely(size > FM10K_MAX_DATA_PER_TXD)) {
+			if (fm10k_tx_desc_push(tx_ring, tx_desc++, i++, dma,
+					       FM10K_MAX_DATA_PER_TXD, flags)) {
+				tx_desc = FM10K_TX_DESC(tx_ring, 0);
+				i = 0;
+			}
+
+			dma += FM10K_MAX_DATA_PER_TXD;
+			size -= FM10K_MAX_DATA_PER_TXD;
+		}
+
+		if (likely(!data_len))
+			break;
+
+		if (fm10k_tx_desc_push(tx_ring, tx_desc++, i++,
+				       dma, size, flags)) {
+			tx_desc = FM10K_TX_DESC(tx_ring, 0);
+			i = 0;
+		}
+
+		size = skb_frag_size(frag);
+		data_len -= size;
+
+		dma = skb_frag_dma_map(tx_ring->dev, frag, 0, size,
+				       DMA_TO_DEVICE);
+
+		tx_buffer = &tx_ring->tx_buffer[i];
+	}
+
+	/* write last descriptor with LAST bit set */
+	flags |= FM10K_TXD_FLAG_LAST;
+
+	if (fm10k_tx_desc_push(tx_ring, tx_desc, i++, dma, size, flags))
+		i = 0;
+
+	/* record bytecount for BQL */
+	netdev_tx_sent_queue(txring_txq(tx_ring), first->bytecount);
+
+	/* record SW timestamp if HW timestamp is not available */
+	skb_tx_timestamp(first->skb);
+
+	/* Force memory writes to complete before letting h/w know there
+	 * are new descriptors to fetch.  (Only applicable for weak-ordered
+	 * memory model archs, such as IA-64).
+	 *
+	 * We also need this memory barrier to make certain all of the
+	 * status bits have been updated before next_to_watch is written.
+	 */
+	wmb();
+
+	/* set next_to_watch value indicating a packet is present */
+	first->next_to_watch = tx_desc;
+
+	tx_ring->next_to_use = i;
+
+	/* notify HW of packet */
+	writel(i, tx_ring->tail);
+
+	/* we need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
+
+	return;
+dma_error:
+	dev_err(tx_ring->dev, "TX DMA map failed\n");
+
+	/* clear dma mappings for failed tx_buffer map */
+	for (;;) {
+		tx_buffer = &tx_ring->tx_buffer[i];
+		fm10k_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+		if (tx_buffer == first)
+			break;
+		if (i == 0)
+			i = tx_ring->count;
+		i--;
+	}
+
+	tx_ring->next_to_use = i;
+}
+
+static int __fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size)
+{
+	netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
+
+	smp_mb();
+
+	/* We need to check again in a case another CPU has just
+	 * made room available. */
+	if (likely(fm10k_desc_unused(tx_ring) < size))
+		return -EBUSY;
+
+	/* A reprieve! - use start_queue because it doesn't call schedule */
+	netif_start_subqueue(tx_ring->netdev, tx_ring->queue_index);
+	++tx_ring->tx_stats.restart_queue;
+	return 0;
+}
+
+static inline int fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size)
+{
+	if (likely(fm10k_desc_unused(tx_ring) >= size))
+		return 0;
+	return __fm10k_maybe_stop_tx(tx_ring, size);
+}
+
+netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
+				  struct fm10k_ring *tx_ring)
+{
+	struct fm10k_tx_buffer *first;
+	u32 tx_flags = 0;
+#if PAGE_SIZE > FM10K_MAX_DATA_PER_TXD
+	unsigned short f;
+#endif
+	u16 count = TXD_USE_COUNT(skb_headlen(skb));
+
+	/* need: 1 descriptor per page * PAGE_SIZE/FM10K_MAX_DATA_PER_TXD,
+	 *       + 1 desc for skb_headlen/FM10K_MAX_DATA_PER_TXD,
+	 *       + 2 desc gap to keep tail from touching head
+	 * otherwise try next time
+	 */
+#if PAGE_SIZE > FM10K_MAX_DATA_PER_TXD
+	for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
+		count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
+#else
+	count += skb_shinfo(skb)->nr_frags;
+#endif
+	if (fm10k_maybe_stop_tx(tx_ring, count + 3)) {
+		tx_ring->tx_stats.tx_busy++;
+		return NETDEV_TX_BUSY;
+	}
+
+	/* record the location of the first descriptor for this packet */
+	first = &tx_ring->tx_buffer[tx_ring->next_to_use];
+	first->skb = skb;
+	first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN);
+	first->gso_segs = 1;
+
+	/* record initial flags and protocol */
+	first->tx_flags = tx_flags;
+
+	fm10k_tx_map(tx_ring, first);
+
+	fm10k_maybe_stop_tx(tx_ring, DESC_NEEDED);
+
+	return NETDEV_TX_OK;
+}
+
+static u64 fm10k_get_tx_completed(struct fm10k_ring *ring)
+{
+	return ring->stats.packets;
+}
+
+static u64 fm10k_get_tx_pending(struct fm10k_ring *ring)
+{
+	/* use SW head and tail until we have real hardware */
+	u32 head = ring->next_to_clean;
+	u32 tail = ring->next_to_use;
+
+	return ((head <= tail) ? tail : tail + ring->count) - head;
+}
+
+bool fm10k_check_tx_hang(struct fm10k_ring *tx_ring)
+{
+	u32 tx_done = fm10k_get_tx_completed(tx_ring);
+	u32 tx_done_old = tx_ring->tx_stats.tx_done_old;
+	u32 tx_pending = fm10k_get_tx_pending(tx_ring);
+
+	clear_check_for_tx_hang(tx_ring);
+
+	/* Check for a hung queue, but be thorough. This verifies
+	 * that a transmit has been completed since the previous
+	 * check AND there is at least one packet pending. By
+	 * requiring this to fail twice we avoid races with
+	 * clearing the ARMED bit and conditions where we
+	 * run the check_tx_hang logic with a transmit completion
+	 * pending but without time to complete it yet.
+	 */
+	if (!tx_pending || (tx_done_old != tx_done)) {
+		/* update completed stats and continue */
+		tx_ring->tx_stats.tx_done_old = tx_done;
+		/* reset the countdown */
+		clear_bit(__FM10K_HANG_CHECK_ARMED, &tx_ring->state);
+
+		return false;
+	}
+
+	/* make sure it is true for two checks in a row */
+	return test_and_set_bit(__FM10K_HANG_CHECK_ARMED, &tx_ring->state);
+}
+
+/**
+ * fm10k_tx_timeout_reset - initiate reset due to Tx timeout
+ * @interface: driver private struct
+ **/
+void fm10k_tx_timeout_reset(struct fm10k_intfc *interface)
+{
+	/* Do the reset outside of interrupt context */
+	if (!test_bit(__FM10K_DOWN, &interface->state)) {
+		netdev_err(interface->netdev, "Reset interface\n");
+		interface->tx_timeout_count++;
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+		fm10k_service_event_schedule(interface);
+	}
+}
+
+/**
+ * fm10k_clean_tx_irq - Reclaim resources after transmit completes
+ * @q_vector: structure containing interrupt and ring information
+ * @tx_ring: tx ring to clean
+ **/
+static bool fm10k_clean_tx_irq(struct fm10k_q_vector *q_vector,
+			       struct fm10k_ring *tx_ring)
+{
+	struct fm10k_intfc *interface = q_vector->interface;
+	struct fm10k_tx_buffer *tx_buffer;
+	struct fm10k_tx_desc *tx_desc;
+	unsigned int total_bytes = 0, total_packets = 0;
+	unsigned int budget = q_vector->tx.work_limit;
+	unsigned int i = tx_ring->next_to_clean;
+
+	if (test_bit(__FM10K_DOWN, &interface->state))
+		return true;
+
+	tx_buffer = &tx_ring->tx_buffer[i];
+	tx_desc = FM10K_TX_DESC(tx_ring, i);
+	i -= tx_ring->count;
+
+	do {
+		struct fm10k_tx_desc *eop_desc = tx_buffer->next_to_watch;
+
+		/* if next_to_watch is not set then there is no work pending */
+		if (!eop_desc)
+			break;
+
+		/* prevent any other reads prior to eop_desc */
+		read_barrier_depends();
+
+		/* if DD is not set pending work has not been completed */
+		if (!(eop_desc->flags & FM10K_TXD_FLAG_DONE))
+			break;
+
+		/* clear next_to_watch to prevent false hangs */
+		tx_buffer->next_to_watch = NULL;
+
+		/* update the statistics for this packet */
+		total_bytes += tx_buffer->bytecount;
+		total_packets += tx_buffer->gso_segs;
+
+		/* free the skb */
+		dev_kfree_skb_any(tx_buffer->skb);
+
+		/* unmap skb header data */
+		dma_unmap_single(tx_ring->dev,
+				 dma_unmap_addr(tx_buffer, dma),
+				 dma_unmap_len(tx_buffer, len),
+				 DMA_TO_DEVICE);
+
+		/* clear tx_buffer data */
+		tx_buffer->skb = NULL;
+		dma_unmap_len_set(tx_buffer, len, 0);
+
+		/* unmap remaining buffers */
+		while (tx_desc != eop_desc) {
+			tx_buffer++;
+			tx_desc++;
+			i++;
+			if (unlikely(!i)) {
+				i -= tx_ring->count;
+				tx_buffer = tx_ring->tx_buffer;
+				tx_desc = FM10K_TX_DESC(tx_ring, 0);
+			}
+
+			/* unmap any remaining paged data */
+			if (dma_unmap_len(tx_buffer, len)) {
+				dma_unmap_page(tx_ring->dev,
+					       dma_unmap_addr(tx_buffer, dma),
+					       dma_unmap_len(tx_buffer, len),
+					       DMA_TO_DEVICE);
+				dma_unmap_len_set(tx_buffer, len, 0);
+			}
+		}
+
+		/* move us one more past the eop_desc for start of next pkt */
+		tx_buffer++;
+		tx_desc++;
+		i++;
+		if (unlikely(!i)) {
+			i -= tx_ring->count;
+			tx_buffer = tx_ring->tx_buffer;
+			tx_desc = FM10K_TX_DESC(tx_ring, 0);
+		}
+
+		/* issue prefetch for next Tx descriptor */
+		prefetch(tx_desc);
+
+		/* update budget accounting */
+		budget--;
+	} while (likely(budget));
+
+	i += tx_ring->count;
+	tx_ring->next_to_clean = i;
+	u64_stats_update_begin(&tx_ring->syncp);
+	tx_ring->stats.bytes += total_bytes;
+	tx_ring->stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->syncp);
+	q_vector->tx.total_bytes += total_bytes;
+	q_vector->tx.total_packets += total_packets;
+
+	if (check_for_tx_hang(tx_ring) && fm10k_check_tx_hang(tx_ring)) {
+		/* schedule immediate reset if we believe we hung */
+		struct fm10k_hw *hw = &interface->hw;
+
+		netif_err(interface, drv, tx_ring->netdev,
+			  "Detected Tx Unit Hang\n"
+			  "  Tx Queue             <%d>\n"
+			  "  TDH, TDT             <%x>, <%x>\n"
+			  "  next_to_use          <%x>\n"
+			  "  next_to_clean        <%x>\n",
+			  tx_ring->queue_index,
+			  fm10k_read_reg(hw, FM10K_TDH(tx_ring->reg_idx)),
+			  fm10k_read_reg(hw, FM10K_TDT(tx_ring->reg_idx)),
+			  tx_ring->next_to_use, i);
+
+		netif_stop_subqueue(tx_ring->netdev,
+				    tx_ring->queue_index);
+
+		netif_info(interface, probe, tx_ring->netdev,
+			   "tx hang %d detected on queue %d, resetting interface\n",
+			   interface->tx_timeout_count + 1,
+			   tx_ring->queue_index);
+
+		fm10k_tx_timeout_reset(interface);
+
+		/* the netdev is about to reset, no point in enabling stuff */
+		return true;
+	}
+
+	/* notify netdev of completed buffers */
+	netdev_tx_completed_queue(txring_txq(tx_ring),
+				  total_packets, total_bytes);
+
+#define TX_WAKE_THRESHOLD min_t(u16, FM10K_MIN_TXD - 1, DESC_NEEDED * 2)
+	if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
+		     (fm10k_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
+		/* Make sure that anybody stopping the queue after this
+		 * sees the new next_to_clean.
+		 */
+		smp_mb();
+		if (__netif_subqueue_stopped(tx_ring->netdev,
+					     tx_ring->queue_index) &&
+		    !test_bit(__FM10K_DOWN, &interface->state)) {
+			netif_wake_subqueue(tx_ring->netdev,
+					    tx_ring->queue_index);
+			++tx_ring->tx_stats.restart_queue;
+		}
+	}
+
+	return !!budget;
+}
+
 /**
  * fm10k_update_itr - update the dynamic ITR value based on packet size
  *
@@ -137,6 +1052,28 @@ static int fm10k_poll(struct napi_struct *napi, int budget)
 {
 	struct fm10k_q_vector *q_vector =
 			       container_of(napi, struct fm10k_q_vector, napi);
+	struct fm10k_ring *ring;
+	int per_ring_budget;
+	bool clean_complete = true;
+
+	fm10k_for_each_ring(ring, q_vector->tx)
+		clean_complete &= fm10k_clean_tx_irq(q_vector, ring);
+
+	/* attempt to distribute budget to each queue fairly, but don't
+	 * allow the budget to go below 1 because we'll exit polling
+	 */
+	if (q_vector->rx.count > 1)
+		per_ring_budget = max(budget/q_vector->rx.count, 1);
+	else
+		per_ring_budget = budget;
+
+	fm10k_for_each_ring(ring, q_vector->rx)
+		clean_complete &= fm10k_clean_rx_irq(q_vector, ring,
+						     per_ring_budget);
+
+	/* If all work not completed, return budget and keep polling */
+	if (!clean_complete)
+		return budget;
 
 	/* all work done, exit the polling mode */
 	napi_complete(napi);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 24a3e8c..1ddc0d5 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -447,8 +447,66 @@ int fm10k_close(struct net_device *netdev)
 
 static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 {
-	dev_kfree_skb_any(skb);
-	return NETDEV_TX_OK;
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	unsigned int r_idx = 0;
+	int err;
+
+	if ((skb->protocol ==  htons(ETH_P_8021Q)) &&
+	    !vlan_tx_tag_present(skb)) {
+		/* FM10K only supports hardware tagging, any tags in frame
+		 * are considered 2nd level or "outer" tags
+		 */
+		struct vlan_hdr *vhdr;
+		__be16 proto;
+
+		/* make sure skb is not shared */
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			return NETDEV_TX_OK;
+
+		/* make sure there is enough room to move the ethernet header */
+		if (unlikely(!pskb_may_pull(skb, VLAN_ETH_HLEN)))
+			return NETDEV_TX_OK;
+
+		/* verify the skb head is not shared */
+		err = skb_cow_head(skb, 0);
+		if (err)
+			return NETDEV_TX_OK;
+
+		/* locate vlan header */
+		vhdr = (struct vlan_hdr *)(skb->data + ETH_HLEN);
+
+		/* pull the 2 key pieces of data out of it */
+		__vlan_hwaccel_put_tag(skb,
+				       htons(ETH_P_8021Q),
+				       ntohs(vhdr->h_vlan_TCI));
+		proto = vhdr->h_vlan_encapsulated_proto;
+		skb->protocol = (ntohs(proto) >= 1536) ? proto :
+							 htons(ETH_P_802_2);
+
+		/* squash it by moving the ethernet addresses up 4 bytes */
+		memmove(skb->data + VLAN_HLEN, skb->data, 12);
+		__skb_pull(skb, VLAN_HLEN);
+		skb_reset_mac_header(skb);
+	}
+
+	/* The minimum packet size for a single buffer is 17B so pad the skb
+	 * in order to meet this minimum size requirement.
+	 */
+	if (unlikely(skb->len < 17)) {
+		int pad_len = 17 - skb->len;
+
+		if (skb_pad(skb, pad_len))
+			return NETDEV_TX_OK;
+		__skb_put(skb, pad_len);
+	}
+
+	if (r_idx >= interface->num_tx_queues)
+		r_idx %= interface->num_tx_queues;
+
+	err = fm10k_xmit_frame_ring(skb, interface->tx_ring[r_idx]);
+
+	return err;
 }
 
 static int fm10k_change_mtu(struct net_device *dev, int new_mtu)
@@ -461,6 +519,37 @@ static int fm10k_change_mtu(struct net_device *dev, int new_mtu)
 	return 0;
 }
 
+/**
+ * fm10k_tx_timeout - Respond to a Tx Hang
+ * @netdev: network interface device structure
+ **/
+static void fm10k_tx_timeout(struct net_device *netdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	bool real_tx_hang = false;
+	int i;
+
+#define TX_TIMEO_LIMIT 16000
+	for (i = 0; i < interface->num_tx_queues; i++) {
+		struct fm10k_ring *tx_ring = interface->tx_ring[i];
+
+		if (check_for_tx_hang(tx_ring) && fm10k_check_tx_hang(tx_ring))
+			real_tx_hang = true;
+	}
+
+	if (real_tx_hang) {
+		fm10k_tx_timeout_reset(interface);
+	} else {
+		netif_info(interface, drv, netdev,
+			   "Fake Tx hang detected with timeout of %d seconds\n",
+			   netdev->watchdog_timeo/HZ);
+
+		/* fake Tx hang - increase the kernel timeout */
+		if (netdev->watchdog_timeo < TX_TIMEO_LIMIT)
+			netdev->watchdog_timeo *= 2;
+	}
+}
+
 static int fm10k_uc_vlan_unsync(struct net_device *netdev,
 				const unsigned char *uc_addr)
 {
@@ -887,6 +976,7 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_start_xmit		= fm10k_xmit_frame,
 	.ndo_set_mac_address	= fm10k_set_mac,
 	.ndo_change_mtu		= fm10k_change_mtu,
+	.ndo_tx_timeout		= fm10k_tx_timeout,
 	.ndo_vlan_rx_add_vid	= fm10k_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= fm10k_vlan_rx_kill_vid,
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index e8ed1a4..4e21ddd 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -661,6 +661,9 @@ static void fm10k_configure_rx_ring(struct fm10k_intfc *interface,
 
 	/* enable queue */
 	fm10k_write_reg(hw, FM10K_RXQCTL(reg_idx), rxqctl);
+
+	/* place buffers on ring for receive data */
+	fm10k_alloc_rx_buffers(ring, fm10k_desc_unused(ring));
 }
 
 /**

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 17/29] fm10k: Add ethtool support
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (15 preceding siblings ...)
  2014-09-18 22:38 ` [net-next PATCH 16/29] fm10k: Add transmit and receive fastpath and interrupt handlers Alexander Duyck
@ 2014-09-18 22:38 ` Alexander Duyck
  2014-09-18 22:38 ` [net-next PATCH 18/29] fm10k: Add support for PCI power management and error handling Alexander Duyck
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:38 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds basic ethtool support to the device to allow for configuration.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile        |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h         |    3 
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |  868 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  |    1 
 4 files changed, 873 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index ee4e415..b72815e 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -28,5 +28,5 @@
 obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
-	      fm10k_netdev.o fm10k_pf.o \
+	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o \
 	      fm10k_mbx.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 98cbcd8..aea79b7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -424,4 +424,7 @@ void fm10k_restore_rx_state(struct fm10k_intfc *);
 void fm10k_reset_rx_state(struct fm10k_intfc *);
 int fm10k_open(struct net_device *netdev);
 int fm10k_close(struct net_device *netdev);
+
+/* Ethtool */
+void fm10k_set_ethtool_ops(struct net_device *dev);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
new file mode 100644
index 0000000..a88c75c
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -0,0 +1,868 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k.h"
+
+struct fm10k_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	int sizeof_stat;
+	int stat_offset;
+};
+
+#define FM10K_NETDEV_STAT(_net_stat) { \
+	.stat_string = #_net_stat, \
+	.sizeof_stat = FIELD_SIZEOF(struct net_device_stats, _net_stat), \
+	.stat_offset = offsetof(struct net_device_stats, _net_stat) \
+}
+
+static const struct fm10k_stats fm10k_gstrings_net_stats[] = {
+	FM10K_NETDEV_STAT(tx_packets),
+	FM10K_NETDEV_STAT(tx_bytes),
+	FM10K_NETDEV_STAT(tx_errors),
+	FM10K_NETDEV_STAT(rx_packets),
+	FM10K_NETDEV_STAT(rx_bytes),
+	FM10K_NETDEV_STAT(rx_errors),
+	FM10K_NETDEV_STAT(rx_dropped),
+
+	/* detailed Rx errors */
+	FM10K_NETDEV_STAT(rx_length_errors),
+	FM10K_NETDEV_STAT(rx_crc_errors),
+	FM10K_NETDEV_STAT(rx_fifo_errors),
+};
+
+#define FM10K_NETDEV_STATS_LEN	ARRAY_SIZE(fm10k_gstrings_net_stats)
+
+#define FM10K_STAT(_name, _stat) { \
+	.stat_string = _name, \
+	.sizeof_stat = FIELD_SIZEOF(struct fm10k_intfc, _stat), \
+	.stat_offset = offsetof(struct fm10k_intfc, _stat) \
+}
+
+static const struct fm10k_stats fm10k_gstrings_stats[] = {
+	FM10K_STAT("tx_restart_queue", restart_queue),
+	FM10K_STAT("tx_busy", tx_busy),
+	FM10K_STAT("tx_csum_errors", tx_csum_errors),
+	FM10K_STAT("rx_alloc_failed", alloc_failed),
+	FM10K_STAT("rx_csum_errors", rx_csum_errors),
+	FM10K_STAT("rx_errors", rx_errors),
+
+	FM10K_STAT("tx_packets_nic", tx_packets_nic),
+	FM10K_STAT("tx_bytes_nic", tx_bytes_nic),
+	FM10K_STAT("rx_packets_nic", rx_packets_nic),
+	FM10K_STAT("rx_bytes_nic", rx_bytes_nic),
+	FM10K_STAT("rx_drops_nic", rx_drops_nic),
+	FM10K_STAT("rx_overrun_pf", rx_overrun_pf),
+	FM10K_STAT("rx_overrun_vf", rx_overrun_vf),
+
+	FM10K_STAT("timeout", stats.timeout.count),
+	FM10K_STAT("ur", stats.ur.count),
+	FM10K_STAT("ca", stats.ca.count),
+	FM10K_STAT("um", stats.um.count),
+	FM10K_STAT("xec", stats.xec.count),
+	FM10K_STAT("vlan_drop", stats.vlan_drop.count),
+	FM10K_STAT("loopback_drop", stats.loopback_drop.count),
+	FM10K_STAT("nodesc_drop", stats.nodesc_drop.count),
+
+	FM10K_STAT("swapi_status", hw.swapi.status),
+	FM10K_STAT("mac_rules_used", hw.swapi.mac.used),
+	FM10K_STAT("mac_rules_avail", hw.swapi.mac.avail),
+
+	FM10K_STAT("mbx_tx_busy", hw.mbx.tx_busy),
+	FM10K_STAT("mbx_tx_dropped", hw.mbx.tx_dropped),
+	FM10K_STAT("mbx_tx_messages", hw.mbx.tx_messages),
+	FM10K_STAT("mbx_tx_dwords", hw.mbx.tx_dwords),
+	FM10K_STAT("mbx_rx_messages", hw.mbx.rx_messages),
+	FM10K_STAT("mbx_rx_dwords", hw.mbx.rx_dwords),
+	FM10K_STAT("mbx_rx_parse_err", hw.mbx.rx_parse_err),
+};
+
+#define FM10K_GLOBAL_STATS_LEN ARRAY_SIZE(fm10k_gstrings_stats)
+
+#define FM10K_QUEUE_STATS_LEN \
+	  (MAX_QUEUES * 2 * (sizeof(struct fm10k_queue_stats) / sizeof(u64)))
+
+#define FM10K_STATS_LEN (FM10K_GLOBAL_STATS_LEN + \
+			 FM10K_NETDEV_STATS_LEN + \
+			 FM10K_QUEUE_STATS_LEN)
+
+static void fm10k_get_strings(struct net_device *dev, u32 stringset,
+			      u8 *data)
+{
+	char *p = (char *)data;
+	int i;
+
+	switch (stringset) {
+	case ETH_SS_STATS:
+		for (i = 0; i < FM10K_NETDEV_STATS_LEN; i++) {
+			memcpy(p, fm10k_gstrings_net_stats[i].stat_string,
+			       ETH_GSTRING_LEN);
+			p += ETH_GSTRING_LEN;
+		}
+		for (i = 0; i < FM10K_GLOBAL_STATS_LEN; i++) {
+			memcpy(p, fm10k_gstrings_stats[i].stat_string,
+			       ETH_GSTRING_LEN);
+			p += ETH_GSTRING_LEN;
+		}
+
+		for (i = 0; i < MAX_QUEUES; i++) {
+			sprintf(p, "tx_queue_%u_packets", i);
+			p += ETH_GSTRING_LEN;
+			sprintf(p, "tx_queue_%u_bytes", i);
+			p += ETH_GSTRING_LEN;
+			sprintf(p, "rx_queue_%u_packets", i);
+			p += ETH_GSTRING_LEN;
+			sprintf(p, "rx_queue_%u_bytes", i);
+			p += ETH_GSTRING_LEN;
+		}
+		break;
+	}
+}
+
+static int fm10k_get_sset_count(struct net_device *dev, int sset)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		return FM10K_STATS_LEN;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static void fm10k_get_ethtool_stats(struct net_device *netdev,
+				    struct ethtool_stats *stats, u64 *data)
+{
+	const int stat_count = sizeof(struct fm10k_queue_stats) / sizeof(u64);
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct net_device_stats *net_stats = &netdev->stats;
+	char *p;
+	int i, j;
+
+	fm10k_update_stats(interface);
+
+	for (i = 0; i < FM10K_NETDEV_STATS_LEN; i++) {
+		p = (char *)net_stats + fm10k_gstrings_net_stats[i].stat_offset;
+		*(data++) = (fm10k_gstrings_net_stats[i].sizeof_stat ==
+			sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+	}
+
+	for (i = 0; i < FM10K_GLOBAL_STATS_LEN; i++) {
+		p = (char *)interface + fm10k_gstrings_stats[i].stat_offset;
+		*(data++) = (fm10k_gstrings_stats[i].sizeof_stat ==
+			sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+	}
+
+	for (i = 0; i < MAX_QUEUES; i++) {
+		struct fm10k_ring *ring;
+		u64 *queue_stat;
+
+		ring = interface->tx_ring[i];
+		if (ring)
+			queue_stat = (u64 *)&ring->stats;
+		for (j = 0; j < stat_count; j++)
+			*(data++) = ring ? queue_stat[j] : 0;
+
+		ring = interface->rx_ring[i];
+		if (ring)
+			queue_stat = (u64 *)&ring->stats;
+		for (j = 0; j < stat_count; j++)
+			*(data++) = ring ? queue_stat[j] : 0;
+	}
+}
+
+/* If function below adds more registers this define needs to be updated */
+#define FM10K_REGS_LEN_Q 29
+
+static void fm10k_get_reg_q(struct fm10k_hw *hw, u32 *buff, int i)
+{
+	int idx = 0;
+
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RDBAL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RDBAH(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RDLEN(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TPH_RXCTRL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RDH(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RDT(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RXQCTL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RXDCTL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_RXINT(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_SRRCTL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QPRC(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QPRDC(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QBRC_L(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QBRC_H(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TDBAL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TDBAH(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TDLEN(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TPH_TXCTRL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TDH(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TDT(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TXDCTL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TXQCTL(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TXINT(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QPTC(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QBTC_L(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_QBTC_H(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TQDLOC(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_TX_SGLORT(i));
+	buff[idx++] = fm10k_read_reg(hw, FM10K_PFVTCTL(i));
+
+	BUG_ON(idx != FM10K_REGS_LEN_Q);
+}
+
+/* If function above adds more registers this define needs to be updated */
+#define FM10K_REGS_LEN_VSI 43
+
+static void fm10k_get_reg_vsi(struct fm10k_hw *hw, u32 *buff, int i)
+{
+	int idx = 0, j;
+
+	buff[idx++] = fm10k_read_reg(hw, FM10K_MRQC(i));
+	for (j = 0; j < 10; j++)
+		buff[idx++] = fm10k_read_reg(hw, FM10K_RSSRK(i, j));
+	for (j = 0; j < 32; j++)
+		buff[idx++] = fm10k_read_reg(hw, FM10K_RETA(i, j));
+
+	BUG_ON(idx != FM10K_REGS_LEN_VSI);
+}
+
+static void fm10k_get_regs(struct net_device *netdev,
+			   struct ethtool_regs *regs, void *p)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	u32 *buff = p;
+	u16 i;
+
+	regs->version = (1 << 24) | (hw->revision_id << 16) | hw->device_id;
+
+	switch (hw->mac.type) {
+	case fm10k_mac_pf:
+		/* General PF Registers */
+		*(buff++) = fm10k_read_reg(hw, FM10K_CTRL);
+		*(buff++) = fm10k_read_reg(hw, FM10K_CTRL_EXT);
+		*(buff++) = fm10k_read_reg(hw, FM10K_GCR);
+		*(buff++) = fm10k_read_reg(hw, FM10K_GCR_EXT);
+
+		for (i = 0; i < 8; i++) {
+			*(buff++) = fm10k_read_reg(hw, FM10K_DGLORTMAP(i));
+			*(buff++) = fm10k_read_reg(hw, FM10K_DGLORTDEC(i));
+		}
+
+		for (i = 0; i < 65; i++) {
+			fm10k_get_reg_vsi(hw, buff, i);
+			buff += FM10K_REGS_LEN_VSI;
+		}
+
+		*(buff++) = fm10k_read_reg(hw, FM10K_DMA_CTRL);
+		*(buff++) = fm10k_read_reg(hw, FM10K_DMA_CTRL2);
+
+		for (i = 0; i < FM10K_MAX_QUEUES_PF; i++) {
+			fm10k_get_reg_q(hw, buff, i);
+			buff += FM10K_REGS_LEN_Q;
+		}
+
+		*(buff++) = fm10k_read_reg(hw, FM10K_TPH_CTRL);
+
+		for (i = 0; i < 8; i++)
+			*(buff++) = fm10k_read_reg(hw, FM10K_INT_MAP(i));
+
+		/* Interrupt Throttling Registers */
+		for (i = 0; i < 130; i++)
+			*(buff++) = fm10k_read_reg(hw, FM10K_ITR(i));
+
+		break;
+	default:
+		return;
+	}
+}
+
+/* If function above adds more registers these define need to be updated */
+#define FM10K_REGS_LEN_PF \
+(162 + (65 * FM10K_REGS_LEN_VSI) + (FM10K_MAX_QUEUES_PF * FM10K_REGS_LEN_Q))
+
+static int fm10k_get_regs_len(struct net_device *netdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+
+	switch (hw->mac.type) {
+	case fm10k_mac_pf:
+		return FM10K_REGS_LEN_PF * sizeof(u32);
+	default:
+		return 0;
+	}
+}
+
+static void fm10k_get_drvinfo(struct net_device *dev,
+			      struct ethtool_drvinfo *info)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	strncpy(info->driver, fm10k_driver_name,
+		sizeof(info->driver) - 1);
+	strncpy(info->version, fm10k_driver_version,
+		sizeof(info->version) - 1);
+	strncpy(info->bus_info, pci_name(interface->pdev),
+		sizeof(info->bus_info) - 1);
+
+	info->n_stats = FM10K_STATS_LEN;
+
+	info->regdump_len = fm10k_get_regs_len(dev);
+}
+
+static void fm10k_get_pauseparam(struct net_device *dev,
+				 struct ethtool_pauseparam *pause)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	/* record fixed values for autoneg and tx pause */
+	pause->autoneg = 0;
+	pause->tx_pause = 1;
+
+	pause->rx_pause = interface->rx_pause ? 1 : 0;
+}
+
+static int fm10k_set_pauseparam(struct net_device *dev,
+				struct ethtool_pauseparam *pause)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+
+	if (pause->autoneg || !pause->tx_pause)
+		return -EINVAL;
+
+	/* we can only support pause on the PF to avoid head-of-line blocking */
+	if (hw->mac.type == fm10k_mac_pf)
+		interface->rx_pause = pause->rx_pause ? ~0 : 0;
+	else if (pause->rx_pause)
+		return -EINVAL;
+
+	if (netif_running(dev))
+		fm10k_update_rx_drop_en(interface);
+
+	return 0;
+}
+
+static u32 fm10k_get_msglevel(struct net_device *netdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+
+	return interface->msg_enable;
+}
+
+static void fm10k_set_msglevel(struct net_device *netdev, u32 data)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+
+	interface->msg_enable = data;
+}
+
+static void fm10k_get_ringparam(struct net_device *netdev,
+				struct ethtool_ringparam *ring)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+
+	ring->rx_max_pending = FM10K_MAX_RXD;
+	ring->tx_max_pending = FM10K_MAX_TXD;
+	ring->rx_mini_max_pending = 0;
+	ring->rx_jumbo_max_pending = 0;
+	ring->rx_pending = interface->rx_ring_count;
+	ring->tx_pending = interface->tx_ring_count;
+	ring->rx_mini_pending = 0;
+	ring->rx_jumbo_pending = 0;
+}
+
+static int fm10k_set_ringparam(struct net_device *netdev,
+			       struct ethtool_ringparam *ring)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_ring *temp_ring;
+	int i, err = 0;
+	u32 new_rx_count, new_tx_count;
+
+	if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
+		return -EINVAL;
+
+	new_tx_count = clamp_t(u32, ring->tx_pending,
+			       FM10K_MIN_TXD, FM10K_MAX_TXD);
+	new_tx_count = ALIGN(new_tx_count, FM10K_REQ_TX_DESCRIPTOR_MULTIPLE);
+
+	new_rx_count = clamp_t(u32, ring->rx_pending,
+			       FM10K_MIN_RXD, FM10K_MAX_RXD);
+	new_rx_count = ALIGN(new_rx_count, FM10K_REQ_RX_DESCRIPTOR_MULTIPLE);
+
+	if ((new_tx_count == interface->tx_ring_count) &&
+	    (new_rx_count == interface->rx_ring_count)) {
+		/* nothing to do */
+		return 0;
+	}
+
+	while (test_and_set_bit(__FM10K_RESETTING, &interface->state))
+		usleep_range(1000, 2000);
+
+	if (!netif_running(interface->netdev)) {
+		for (i = 0; i < interface->num_tx_queues; i++)
+			interface->tx_ring[i]->count = new_tx_count;
+		for (i = 0; i < interface->num_rx_queues; i++)
+			interface->rx_ring[i]->count = new_rx_count;
+		interface->tx_ring_count = new_tx_count;
+		interface->rx_ring_count = new_rx_count;
+		goto clear_reset;
+	}
+
+	/* allocate temporary buffer to store rings in */
+	i = max_t(int, interface->num_tx_queues, interface->num_rx_queues);
+	temp_ring = vmalloc(i * sizeof(struct fm10k_ring));
+
+	if (!temp_ring) {
+		err = -ENOMEM;
+		goto clear_reset;
+	}
+
+	fm10k_down(interface);
+
+	/* Setup new Tx resources and free the old Tx resources in that order.
+	 * We can then assign the new resources to the rings via a memcpy.
+	 * The advantage to this approach is that we are guaranteed to still
+	 * have resources even in the case of an allocation failure.
+	 */
+	if (new_tx_count != interface->tx_ring_count) {
+		for (i = 0; i < interface->num_tx_queues; i++) {
+			memcpy(&temp_ring[i], interface->tx_ring[i],
+			       sizeof(struct fm10k_ring));
+
+			temp_ring[i].count = new_tx_count;
+			err = fm10k_setup_tx_resources(&temp_ring[i]);
+			if (err) {
+				while (i) {
+					i--;
+					fm10k_free_tx_resources(&temp_ring[i]);
+				}
+				goto err_setup;
+			}
+		}
+
+		for (i = 0; i < interface->num_tx_queues; i++) {
+			fm10k_free_tx_resources(interface->tx_ring[i]);
+
+			memcpy(interface->tx_ring[i], &temp_ring[i],
+			       sizeof(struct fm10k_ring));
+		}
+
+		interface->tx_ring_count = new_tx_count;
+	}
+
+	/* Repeat the process for the Rx rings if needed */
+	if (new_rx_count != interface->rx_ring_count) {
+		for (i = 0; i < interface->num_rx_queues; i++) {
+			memcpy(&temp_ring[i], interface->rx_ring[i],
+			       sizeof(struct fm10k_ring));
+
+			temp_ring[i].count = new_rx_count;
+			err = fm10k_setup_rx_resources(&temp_ring[i]);
+			if (err) {
+				while (i) {
+					i--;
+					fm10k_free_rx_resources(&temp_ring[i]);
+				}
+				goto err_setup;
+			}
+		}
+
+		for (i = 0; i < interface->num_rx_queues; i++) {
+			fm10k_free_rx_resources(interface->rx_ring[i]);
+
+			memcpy(interface->rx_ring[i], &temp_ring[i],
+			       sizeof(struct fm10k_ring));
+		}
+
+		interface->rx_ring_count = new_rx_count;
+	}
+
+err_setup:
+	fm10k_up(interface);
+	vfree(temp_ring);
+clear_reset:
+	clear_bit(__FM10K_RESETTING, &interface->state);
+	return err;
+}
+
+static int fm10k_get_coalesce(struct net_device *dev,
+			      struct ethtool_coalesce *ec)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	ec->use_adaptive_tx_coalesce =
+		!!(interface->tx_itr & FM10K_ITR_ADAPTIVE);
+	ec->tx_coalesce_usecs = interface->tx_itr & ~FM10K_ITR_ADAPTIVE;
+
+	ec->use_adaptive_rx_coalesce =
+		!!(interface->rx_itr & FM10K_ITR_ADAPTIVE);
+	ec->rx_coalesce_usecs = interface->rx_itr & ~FM10K_ITR_ADAPTIVE;
+
+	return 0;
+}
+
+static int fm10k_set_coalesce(struct net_device *dev,
+			      struct ethtool_coalesce *ec)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_q_vector *qv;
+	u16 tx_itr, rx_itr;
+	int i;
+
+	/* verify limits */
+	if ((ec->rx_coalesce_usecs > FM10K_ITR_MAX) ||
+	    (ec->tx_coalesce_usecs > FM10K_ITR_MAX))
+		return -EINVAL;
+
+	/* record settings */
+	tx_itr = ec->tx_coalesce_usecs;
+	rx_itr = ec->rx_coalesce_usecs;
+
+	/* set initial values for adaptive ITR */
+	if (ec->use_adaptive_tx_coalesce)
+		tx_itr = FM10K_ITR_ADAPTIVE | FM10K_ITR_10K;
+
+	if (ec->use_adaptive_rx_coalesce)
+		rx_itr = FM10K_ITR_ADAPTIVE | FM10K_ITR_20K;
+
+	/* update interface */
+	interface->tx_itr = tx_itr;
+	interface->rx_itr = rx_itr;
+
+	/* update q_vectors */
+	for (i = 0; i < interface->num_q_vectors; i++) {
+		qv = interface->q_vector[i];
+		qv->tx.itr = tx_itr;
+		qv->rx.itr = rx_itr;
+	}
+
+	return 0;
+}
+
+static int fm10k_get_rss_hash_opts(struct fm10k_intfc *interface,
+				   struct ethtool_rxnfc *cmd)
+{
+	cmd->data = 0;
+
+	/* Report default options for RSS on fm10k */
+	switch (cmd->flow_type) {
+	case TCP_V4_FLOW:
+	case TCP_V6_FLOW:
+		cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+		/* fall through */
+	case UDP_V4_FLOW:
+		if (interface->flags & FM10K_FLAG_RSS_FIELD_IPV4_UDP)
+			cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+		/* fall through */
+	case SCTP_V4_FLOW:
+	case SCTP_V6_FLOW:
+	case AH_ESP_V4_FLOW:
+	case AH_ESP_V6_FLOW:
+	case AH_V4_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V4_FLOW:
+	case ESP_V6_FLOW:
+	case IPV4_FLOW:
+	case IPV6_FLOW:
+		cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+		break;
+	case UDP_V6_FLOW:
+		if (interface->flags & FM10K_FLAG_RSS_FIELD_IPV6_UDP)
+			cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+		cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int fm10k_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
+			   u32 *rule_locs)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_GRXRINGS:
+		cmd->data = interface->num_rx_queues;
+		ret = 0;
+		break;
+	case ETHTOOL_GRXFH:
+		ret = fm10k_get_rss_hash_opts(interface, cmd);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+#define UDP_RSS_FLAGS (FM10K_FLAG_RSS_FIELD_IPV4_UDP | \
+		       FM10K_FLAG_RSS_FIELD_IPV6_UDP)
+static int fm10k_set_rss_hash_opt(struct fm10k_intfc *interface,
+				  struct ethtool_rxnfc *nfc)
+{
+	u32 flags = interface->flags;
+
+	/* RSS does not support anything other than hashing
+	 * to queues on src and dst IPs and ports
+	 */
+	if (nfc->data & ~(RXH_IP_SRC | RXH_IP_DST |
+			  RXH_L4_B_0_1 | RXH_L4_B_2_3))
+		return -EINVAL;
+
+	switch (nfc->flow_type) {
+	case TCP_V4_FLOW:
+	case TCP_V6_FLOW:
+		if (!(nfc->data & RXH_IP_SRC) ||
+		    !(nfc->data & RXH_IP_DST) ||
+		    !(nfc->data & RXH_L4_B_0_1) ||
+		    !(nfc->data & RXH_L4_B_2_3))
+			return -EINVAL;
+		break;
+	case UDP_V4_FLOW:
+		if (!(nfc->data & RXH_IP_SRC) ||
+		    !(nfc->data & RXH_IP_DST))
+			return -EINVAL;
+		switch (nfc->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+		case 0:
+			flags &= ~FM10K_FLAG_RSS_FIELD_IPV4_UDP;
+			break;
+		case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+			flags |= FM10K_FLAG_RSS_FIELD_IPV4_UDP;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+	case UDP_V6_FLOW:
+		if (!(nfc->data & RXH_IP_SRC) ||
+		    !(nfc->data & RXH_IP_DST))
+			return -EINVAL;
+		switch (nfc->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+		case 0:
+			flags &= ~FM10K_FLAG_RSS_FIELD_IPV6_UDP;
+			break;
+		case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+			flags |= FM10K_FLAG_RSS_FIELD_IPV6_UDP;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+	case AH_ESP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_ESP_V6_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+	case SCTP_V6_FLOW:
+		if (!(nfc->data & RXH_IP_SRC) ||
+		    !(nfc->data & RXH_IP_DST) ||
+		    (nfc->data & RXH_L4_B_0_1) ||
+		    (nfc->data & RXH_L4_B_2_3))
+			return -EINVAL;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* if we changed something we need to update flags */
+	if (flags != interface->flags) {
+		struct fm10k_hw *hw = &interface->hw;
+		u32 mrqc;
+
+		if ((flags & UDP_RSS_FLAGS) &&
+		    !(interface->flags & UDP_RSS_FLAGS))
+			netif_warn(interface, drv, interface->netdev,
+				   "enabling UDP RSS: fragmented packets may arrive out of order to the stack above\n");
+
+		interface->flags = flags;
+
+		/* Perform hash on these packet types */
+		mrqc = FM10K_MRQC_IPV4 |
+		       FM10K_MRQC_TCP_IPV4 |
+		       FM10K_MRQC_IPV6 |
+		       FM10K_MRQC_TCP_IPV6;
+
+		if (flags & FM10K_FLAG_RSS_FIELD_IPV4_UDP)
+			mrqc |= FM10K_MRQC_UDP_IPV4;
+		if (flags & FM10K_FLAG_RSS_FIELD_IPV6_UDP)
+			mrqc |= FM10K_MRQC_UDP_IPV6;
+
+		fm10k_write_reg(hw, FM10K_MRQC(0), mrqc);
+	}
+
+	return 0;
+}
+
+static int fm10k_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_SRXFH:
+		ret = fm10k_set_rss_hash_opt(interface, cmd);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+static u32 fm10k_get_reta_size(struct net_device *netdev)
+{
+	return FM10K_RETA_SIZE * FM10K_RETA_ENTRIES_PER_REG;
+}
+
+static int fm10k_get_reta(struct net_device *netdev, u32 *indir)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	int i;
+
+	if (!indir)
+		return 0;
+
+	for (i = 0; i < FM10K_RETA_SIZE; i++, indir += 4) {
+		u32 reta = interface->reta[i];
+
+		indir[0] = (reta << 24) >> 24;
+		indir[1] = (reta << 16) >> 24;
+		indir[2] = (reta <<  8) >> 24;
+		indir[3] = (reta) >> 24;
+	}
+
+	return 0;
+}
+
+static int fm10k_set_reta(struct net_device *netdev, const u32 *indir)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	int i;
+	u16 rss_i;
+
+	if (!indir)
+		return 0;
+
+	/* Verify user input. */
+	rss_i = interface->ring_feature[RING_F_RSS].indices;
+	for (i = fm10k_get_reta_size(netdev); i--;) {
+		if (indir[i] < rss_i)
+			continue;
+		return -EINVAL;
+	}
+
+	/* record entries to reta table */
+	for (i = 0; i < FM10K_RETA_SIZE; i++, indir += 4) {
+		u32 reta = indir[0] |
+			   (indir[1] << 8) |
+			   (indir[2] << 16) |
+			   (indir[3] << 24);
+
+		if (interface->reta[i] == reta)
+			continue;
+
+		interface->reta[i] = reta;
+		fm10k_write_reg(hw, FM10K_RETA(0, i), reta);
+	}
+
+	return 0;
+}
+
+static u32 fm10k_get_rssrk_size(struct net_device *netdev)
+{
+	return FM10K_RSSRK_SIZE * FM10K_RSSRK_ENTRIES_PER_REG;
+}
+
+static int fm10k_get_rssh(struct net_device *netdev, u32 *indir, u8 *key)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	int i, err;
+
+	err = fm10k_get_reta(netdev, indir);
+	if (err || !key)
+		return err;
+
+	for (i = 0; i < FM10K_RSSRK_SIZE; i++, key += 4)
+		*(__le32 *)key = cpu_to_le32(interface->rssrk[i]);
+
+	return 0;
+}
+
+static int fm10k_set_rssh(struct net_device *netdev, const u32 *indir,
+			  const u8 *key)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_hw *hw = &interface->hw;
+	int i, err;
+
+	err = fm10k_set_reta(netdev, indir);
+	if (err || !key)
+		return err;
+
+	for (i = 0; i < FM10K_RSSRK_SIZE; i++, key += 4) {
+		u32 rssrk = le32_to_cpu(*(__le32 *)key);
+
+		if (interface->rssrk[i] == rssrk)
+			continue;
+
+		interface->rssrk[i] = rssrk;
+		fm10k_write_reg(hw, FM10K_RSSRK(0, i), rssrk);
+	}
+
+	return 0;
+}
+
+static const struct ethtool_ops fm10k_ethtool_ops = {
+	.get_strings		= fm10k_get_strings,
+	.get_sset_count		= fm10k_get_sset_count,
+	.get_ethtool_stats      = fm10k_get_ethtool_stats,
+	.get_drvinfo		= fm10k_get_drvinfo,
+	.get_link		= ethtool_op_get_link,
+	.get_pauseparam		= fm10k_get_pauseparam,
+	.set_pauseparam		= fm10k_set_pauseparam,
+	.get_msglevel		= fm10k_get_msglevel,
+	.set_msglevel		= fm10k_set_msglevel,
+	.get_ringparam		= fm10k_get_ringparam,
+	.set_ringparam		= fm10k_set_ringparam,
+	.get_coalesce		= fm10k_get_coalesce,
+	.set_coalesce		= fm10k_set_coalesce,
+	.get_rxnfc		= fm10k_get_rxnfc,
+	.set_rxnfc		= fm10k_set_rxnfc,
+	.get_regs               = fm10k_get_regs,
+	.get_regs_len           = fm10k_get_regs_len,
+	.get_rxfh_indir_size	= fm10k_get_reta_size,
+	.get_rxfh_key_size	= fm10k_get_rssrk_size,
+	.get_rxfh		= fm10k_get_rssh,
+	.set_rxfh		= fm10k_set_rssh,
+};
+
+void fm10k_set_ethtool_ops(struct net_device *dev)
+{
+	dev->ethtool_ops = &fm10k_ethtool_ops;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 1ddc0d5..7437244 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -996,6 +996,7 @@ struct net_device *fm10k_alloc_netdev(void)
 
 	/* set net device and ethtool ops */
 	dev->netdev_ops = &fm10k_netdev_ops;
+	fm10k_set_ethtool_ops(dev);
 
 	/* configure default debug level */
 	interface = netdev_priv(dev);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 18/29] fm10k: Add support for PCI power management and error handling
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (16 preceding siblings ...)
  2014-09-18 22:38 ` [net-next PATCH 17/29] fm10k: Add ethtool support Alexander Duyck
@ 2014-09-18 22:38 ` Alexander Duyck
  2014-09-18 22:38 ` [net-next PATCH 19/29] fm10k: Add support for multiple queues Alexander Duyck
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:38 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

Add PCI power management and error handling to allow the device to support
suspend/resume and recovery of any PCIe errors.  The fm10k devices do not
support wake on LAN, and there is no plan to add this as a feature.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c |  221 ++++++++++++++++++++++++++
 1 file changed, 221 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 4e21ddd..9b97da7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -19,6 +19,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/aer.h>
 
 #include "fm10k.h"
 
@@ -1534,6 +1535,8 @@ static int fm10k_probe(struct pci_dev *pdev,
 		goto err_pci_reg;
 	}
 
+	pci_enable_pcie_error_reporting(pdev);
+
 	pci_set_master(pdev);
 	pci_save_state(pdev);
 
@@ -1660,14 +1663,232 @@ static void fm10k_remove(struct pci_dev *pdev)
 	pci_release_selected_regions(pdev,
 				     pci_select_bars(pdev, IORESOURCE_MEM));
 
+	pci_disable_pcie_error_reporting(pdev);
+
 	pci_disable_device(pdev);
 }
 
+#ifdef CONFIG_PM
+/**
+ * fm10k_resume - Restore device to pre-sleep state
+ * @pdev: PCI device information struct
+ *
+ * fm10k_resume is called after the system has powered back up from a sleep
+ * state and is ready to resume operation.  This function is meant to restore
+ * the device back to its pre-sleep state.
+ **/
+static int fm10k_resume(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	u32 err;
+
+	pci_set_power_state(pdev, PCI_D0);
+	pci_restore_state(pdev);
+
+	/* pci_restore_state clears dev->state_saved so call
+	 * pci_save_state to restore it.
+	 */
+	pci_save_state(pdev);
+
+	err = pci_enable_device_mem(pdev);
+	if (err) {
+		dev_err(&pdev->dev, "Cannot enable PCI device from suspend\n");
+		return err;
+	}
+	pci_set_master(pdev);
+
+	pci_wake_from_d3(pdev, false);
+
+	/* refresh hw_addr in case it was dropped */
+	hw->hw_addr = interface->uc_addr;
+
+	/* reset hardware to known state */
+	err = hw->mac.ops.init_hw(&interface->hw);
+	if (err)
+		return err;
+
+	/* reset statistics starting values */
+	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
+
+	rtnl_lock();
+
+	err = fm10k_init_queueing_scheme(interface);
+	if (!err) {
+		fm10k_mbx_request_irq(interface);
+		if (netif_running(netdev))
+			err = fm10k_open(netdev);
+	}
+
+	rtnl_unlock();
+
+	if (err)
+		return err;
+
+	netif_device_attach(netdev);
+
+	return 0;
+}
+
+/**
+ * fm10k_suspend - Prepare the device for a system sleep state
+ * @pdev: PCI device information struct
+ *
+ * fm10k_suspend is meant to shutdown the device prior to the system entering
+ * a sleep state.  The fm10k hardware does not support wake on lan so the
+ * driver simply needs to shut down the device so it is in a low power state.
+ **/
+static int fm10k_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct net_device *netdev = interface->netdev;
+	int err = 0;
+
+	netif_device_detach(netdev);
+
+	rtnl_lock();
+
+	if (netif_running(netdev))
+		fm10k_close(netdev);
+
+	fm10k_mbx_free_irq(interface);
+
+	fm10k_clear_queueing_scheme(interface);
+
+	rtnl_unlock();
+
+	err = pci_save_state(pdev);
+	if (err)
+		return err;
+
+	pci_disable_device(pdev);
+	pci_wake_from_d3(pdev, false);
+	pci_set_power_state(pdev, PCI_D3hot);
+
+	return 0;
+}
+
+#endif /* CONFIG_PM */
+/**
+ * fm10k_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t fm10k_io_error_detected(struct pci_dev *pdev,
+						pci_channel_state_t state)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct net_device *netdev = interface->netdev;
+
+	netif_device_detach(netdev);
+
+	if (state == pci_channel_io_perm_failure)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	if (netif_running(netdev))
+		fm10k_close(netdev);
+
+	fm10k_mbx_free_irq(interface);
+
+	pci_disable_device(pdev);
+
+	/* Request a slot reset. */
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * fm10k_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Restart the card from scratch, as if from a cold-boot.
+ */
+static pci_ers_result_t fm10k_io_slot_reset(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	pci_ers_result_t result;
+
+	if (pci_enable_device_mem(pdev)) {
+		dev_err(&pdev->dev,
+			"Cannot re-enable PCI device after reset.\n");
+		result = PCI_ERS_RESULT_DISCONNECT;
+	} else {
+		pci_set_master(pdev);
+		pci_restore_state(pdev);
+
+		/* After second error pci->state_saved is false, this
+		 * resets it so EEH doesn't break.
+		 */
+		pci_save_state(pdev);
+
+		pci_wake_from_d3(pdev, false);
+
+		/* refresh hw_addr in case it was dropped */
+		interface->hw.hw_addr = interface->uc_addr;
+
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+		fm10k_service_event_schedule(interface);
+
+		result = PCI_ERS_RESULT_RECOVERED;
+	}
+
+	pci_cleanup_aer_uncorrect_error_status(pdev);
+
+	return result;
+}
+
+/**
+ * fm10k_io_resume - called when traffic can start flowing again.
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the error recovery driver tells us that
+ * its OK to resume normal operation.
+ */
+static void fm10k_io_resume(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct net_device *netdev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int err = 0;
+
+	/* reset hardware to known state */
+	hw->mac.ops.init_hw(&interface->hw);
+
+	/* reset statistics starting values */
+	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
+
+	/* reassociate interrupts */
+	fm10k_mbx_request_irq(interface);
+
+	if (netif_running(netdev))
+		err = fm10k_open(netdev);
+
+	/* final check of hardware state before registering the interface */
+	err = err ? : fm10k_hw_ready(interface);
+
+	if (!err)
+		netif_device_attach(netdev);
+}
+
+static const struct pci_error_handlers fm10k_err_handler = {
+	.error_detected = fm10k_io_error_detected,
+	.slot_reset = fm10k_io_slot_reset,
+	.resume = fm10k_io_resume,
+};
+
 static struct pci_driver fm10k_driver = {
 	.name			= fm10k_driver_name,
 	.id_table		= fm10k_pci_tbl,
 	.probe			= fm10k_probe,
 	.remove			= fm10k_remove,
+#ifdef CONFIG_PM
+	.suspend		= fm10k_suspend,
+	.resume			= fm10k_resume,
+#endif
+	.err_handler		= &fm10k_err_handler
 };
 
 /**

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 19/29] fm10k: Add support for multiple queues
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (17 preceding siblings ...)
  2014-09-18 22:38 ` [net-next PATCH 18/29] fm10k: Add support for PCI power management and error handling Alexander Duyck
@ 2014-09-18 22:38 ` Alexander Duyck
  2014-09-18 22:38 ` [net-next PATCH 20/29] fm10k: Add support for netdev offloads Alexander Duyck
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:38 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch takes the driver from supporting a single queue to supporting
multiple queues.  The upper queue limit for the PF is 128 queues and the
upper limit for the VF is (128 / num_vfs) rounded down to nearest power of 2.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h         |    1 
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |   57 ++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    |  148 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  |   41 ++++++
 4 files changed, 247 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index aea79b7..19c3f8c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -422,6 +422,7 @@ void fm10k_unmap_and_free_tx_resource(struct fm10k_ring *,
 				      struct fm10k_tx_buffer *);
 void fm10k_restore_rx_state(struct fm10k_intfc *);
 void fm10k_reset_rx_state(struct fm10k_intfc *);
+int fm10k_setup_tc(struct net_device *dev, u8 tc);
 int fm10k_open(struct net_device *netdev);
 int fm10k_close(struct net_device *netdev);
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index a88c75c..54e8ebd 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -838,6 +838,61 @@ static int fm10k_set_rssh(struct net_device *netdev, const u32 *indir,
 	return 0;
 }
 
+static unsigned int fm10k_max_channels(struct net_device *dev)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	unsigned int max_combined = interface->hw.mac.max_queues;
+	u8 tcs = netdev_get_num_tc(dev);
+
+	/* For QoS report channels per traffic class */
+	if (tcs > 1)
+		max_combined = 1 << (fls(max_combined / tcs) - 1);
+
+	return max_combined;
+}
+
+static void fm10k_get_channels(struct net_device *dev,
+			       struct ethtool_channels *ch)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* report maximum channels */
+	ch->max_combined = fm10k_max_channels(dev);
+
+	/* report info for other vector */
+	ch->max_other = NON_Q_VECTORS(hw);
+	ch->other_count = ch->max_other;
+
+	/* record RSS queues */
+	ch->combined_count = interface->ring_feature[RING_F_RSS].indices;
+}
+
+static int fm10k_set_channels(struct net_device *dev,
+			      struct ethtool_channels *ch)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	unsigned int count = ch->combined_count;
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* verify they are not requesting separate vectors */
+	if (!count || ch->rx_count || ch->tx_count)
+		return -EINVAL;
+
+	/* verify other_count has not changed */
+	if (ch->other_count != NON_Q_VECTORS(hw))
+		return -EINVAL;
+
+	/* verify the number of channels does not exceed hardware limits */
+	if (count > fm10k_max_channels(dev))
+		return -EINVAL;
+
+	interface->ring_feature[RING_F_RSS].limit = count;
+
+	/* use setup TC to update any traffic class queue mapping */
+	return fm10k_setup_tc(dev, netdev_get_num_tc(dev));
+}
+
 static const struct ethtool_ops fm10k_ethtool_ops = {
 	.get_strings		= fm10k_get_strings,
 	.get_sset_count		= fm10k_get_sset_count,
@@ -860,6 +915,8 @@ static const struct ethtool_ops fm10k_ethtool_ops = {
 	.get_rxfh_key_size	= fm10k_get_rssrk_size,
 	.get_rxfh		= fm10k_get_rssh,
 	.set_rxfh		= fm10k_set_rssh,
+	.get_channels		= fm10k_get_channels,
+	.set_channels		= fm10k_set_channels,
 };
 
 void fm10k_set_ethtool_ops(struct net_device *dev)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 0d209c5..1c2dfa1 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1085,6 +1085,81 @@ static int fm10k_poll(struct napi_struct *napi, int budget)
 }
 
 /**
+ * fm10k_set_qos_queues: Allocate queues for a QOS-enabled device
+ * @interface: board private structure to initialize
+ *
+ * When QoS (Quality of Service) is enabled, allocate queues for
+ * each traffic class.  If multiqueue isn't available,then abort QoS
+ * initialization.
+ *
+ * This function handles all combinations of Qos and RSS.
+ *
+ **/
+static bool fm10k_set_qos_queues(struct fm10k_intfc *interface)
+{
+	struct net_device *dev = interface->netdev;
+	struct fm10k_ring_feature *f;
+	int rss_i, i;
+	int pcs;
+
+	/* Map queue offset and counts onto allocated tx queues */
+	pcs = netdev_get_num_tc(dev);
+
+	if (pcs <= 1)
+		return false;
+
+	/* set QoS mask and indices */
+	f = &interface->ring_feature[RING_F_QOS];
+	f->indices = pcs;
+	f->mask = (1 << fls(pcs - 1)) - 1;
+
+	/* determine the upper limit for our current DCB mode */
+	rss_i = interface->hw.mac.max_queues / pcs;
+	rss_i = 1 << (fls(rss_i) - 1);
+
+	/* set RSS mask and indices */
+	f = &interface->ring_feature[RING_F_RSS];
+	rss_i = min_t(u16, rss_i, f->limit);
+	f->indices = rss_i;
+	f->mask = (1 << fls(rss_i - 1)) - 1;
+
+	/* configure pause class to queue mapping */
+	for (i = 0; i < pcs; i++)
+		netdev_set_tc_queue(dev, i, rss_i, rss_i * i);
+
+	interface->num_rx_queues = rss_i * pcs;
+	interface->num_tx_queues = rss_i * pcs;
+
+	return true;
+}
+
+/**
+ * fm10k_set_rss_queues: Allocate queues for RSS
+ * @interface: board private structure to initialize
+ *
+ * This is our "base" multiqueue mode.  RSS (Receive Side Scaling) will try
+ * to allocate one Rx queue per CPU, and if available, one Tx queue per CPU.
+ *
+ **/
+static bool fm10k_set_rss_queues(struct fm10k_intfc *interface)
+{
+	struct fm10k_ring_feature *f;
+	u16 rss_i;
+
+	f = &interface->ring_feature[RING_F_RSS];
+	rss_i = min_t(u16, interface->hw.mac.max_queues, f->limit);
+
+	/* record indices and power of 2 mask for RSS */
+	f->indices = rss_i;
+	f->mask = (1 << fls(rss_i - 1)) - 1;
+
+	interface->num_rx_queues = rss_i;
+	interface->num_tx_queues = rss_i;
+
+	return true;
+}
+
+/**
  * fm10k_set_num_queues: Allocate queues for device, feature dependent
  * @interface: board private structure to initialize
  *
@@ -1100,6 +1175,11 @@ static void fm10k_set_num_queues(struct fm10k_intfc *interface)
 	/* Start with base case */
 	interface->num_rx_queues = 1;
 	interface->num_tx_queues = 1;
+
+	if (fm10k_set_qos_queues(interface))
+		return;
+
+	fm10k_set_rss_queues(interface);
 }
 
 /**
@@ -1380,6 +1460,71 @@ static int fm10k_init_msix_capability(struct fm10k_intfc *interface)
 	return 0;
 }
 
+/**
+ * fm10k_cache_ring_qos - Descriptor ring to register mapping for QoS
+ * @interface: Interface structure continaining rings and devices
+ *
+ * Cache the descriptor ring offsets for Qos
+ **/
+static bool fm10k_cache_ring_qos(struct fm10k_intfc *interface)
+{
+	struct net_device *dev = interface->netdev;
+	int pc, offset, rss_i, i, q_idx;
+	u16 pc_stride = interface->ring_feature[RING_F_QOS].mask + 1;
+	u8 num_pcs = netdev_get_num_tc(dev);
+
+	if (num_pcs <= 1)
+		return false;
+
+	rss_i = interface->ring_feature[RING_F_RSS].indices;
+
+	for (pc = 0, offset = 0; pc < num_pcs; pc++, offset += rss_i) {
+		q_idx = pc;
+		for (i = 0; i < rss_i; i++) {
+			interface->tx_ring[offset + i]->reg_idx = q_idx;
+			interface->tx_ring[offset + i]->qos_pc = pc;
+			interface->rx_ring[offset + i]->reg_idx = q_idx;
+			interface->rx_ring[offset + i]->qos_pc = pc;
+			q_idx += pc_stride;
+		}
+	}
+
+	return true;
+}
+
+/**
+ * fm10k_cache_ring_rss - Descriptor ring to register mapping for RSS
+ * @interface: Interface structure continaining rings and devices
+ *
+ * Cache the descriptor ring offsets for RSS
+ **/
+static void fm10k_cache_ring_rss(struct fm10k_intfc *interface)
+{
+	int i;
+
+	for (i = 0; i < interface->num_rx_queues; i++)
+		interface->rx_ring[i]->reg_idx = i;
+
+	for (i = 0; i < interface->num_tx_queues; i++)
+		interface->tx_ring[i]->reg_idx = i;
+}
+
+/**
+ * fm10k_assign_rings - Map rings to network devices
+ * @interface: Interface structure containing rings and devices
+ *
+ * This function is meant to go though and configure both the network
+ * devices so that they contain rings, and configure the rings so that
+ * they function with their network devices.
+ **/
+static void fm10k_assign_rings(struct fm10k_intfc *interface)
+{
+	if (fm10k_cache_ring_qos(interface))
+		return;
+
+	fm10k_cache_ring_rss(interface);
+}
+
 static void fm10k_init_reta(struct fm10k_intfc *interface)
 {
 	u16 i, rss_i = interface->ring_feature[RING_F_RSS].indices;
@@ -1447,6 +1592,9 @@ int fm10k_init_queueing_scheme(struct fm10k_intfc *interface)
 	if (err)
 		return err;
 
+	/* Map rings to devices, and map devices to physical queues */
+	fm10k_assign_rings(interface);
+
 	/* Initialize RSS redirection table */
 	fm10k_init_reta(interface);
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 7437244..2433a14 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -969,6 +969,46 @@ static struct rtnl_link_stats64 *fm10k_get_stats64(struct net_device *netdev,
 	return stats;
 }
 
+int fm10k_setup_tc(struct net_device *dev, u8 tc)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	/* Currently only the PF supports priority classes */
+	if (tc && (interface->hw.mac.type != fm10k_mac_pf))
+		return -EINVAL;
+
+	/* Hardware supports up to 8 traffic classes */
+	if (tc > 8)
+		return -EINVAL;
+
+	/* Hardware has to reinitialize queues to match packet
+	 * buffer alignment. Unfortunately, the hardware is not
+	 * flexible enough to do this dynamically.
+	 */
+	if (netif_running(dev))
+		fm10k_close(dev);
+
+	fm10k_mbx_free_irq(interface);
+
+	fm10k_clear_queueing_scheme(interface);
+
+	/* we expect the prio_tc map to be repopulated later */
+	netdev_reset_tc(dev);
+	netdev_set_num_tc(dev, tc);
+
+	fm10k_init_queueing_scheme(interface);
+
+	fm10k_mbx_request_irq(interface);
+
+	if (netif_running(dev))
+		fm10k_open(dev);
+
+	/* flag to indicate SWPRI has yet to be updated */
+	interface->flags |= FM10K_FLAG_SWPRI_CONFIG;
+
+	return 0;
+}
+
 static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_open		= fm10k_open,
 	.ndo_stop		= fm10k_close,
@@ -981,6 +1021,7 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_vlan_rx_kill_vid	= fm10k_vlan_rx_kill_vid,
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
 	.ndo_get_stats64	= fm10k_get_stats64,
+	.ndo_setup_tc		= fm10k_setup_tc,
 };
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 20/29] fm10k: Add support for netdev offloads
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (18 preceding siblings ...)
  2014-09-18 22:38 ` [net-next PATCH 19/29] fm10k: Add support for multiple queues Alexander Duyck
@ 2014-09-18 22:38 ` Alexander Duyck
  2014-09-18 22:39 ` [net-next PATCH 21/29] fm10k: Add support for MACVLAN acceleration Alexander Duyck
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:38 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for basic offloads including TSO, Tx checksum, Rx
checksum, Rx hash, and the same features applied to VXLAN/NVGRE tunnels.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |  307 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |  155 +++++++++++-
 2 files changed, 459 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 1c2dfa1..baab163 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -341,6 +341,59 @@ static struct sk_buff *fm10k_fetch_rx_buffer(struct fm10k_ring *rx_ring,
 	return skb;
 }
 
+static inline void fm10k_rx_checksum(struct fm10k_ring *ring,
+				     union fm10k_rx_desc *rx_desc,
+				     struct sk_buff *skb)
+{
+	skb_checksum_none_assert(skb);
+
+	/* Rx checksum disabled via ethtool */
+	if (!(ring->netdev->features & NETIF_F_RXCSUM))
+		return;
+
+	/* TCP/UDP checksum error bit is set */
+	if (fm10k_test_staterr(rx_desc,
+			       FM10K_RXD_STATUS_L4E |
+			       FM10K_RXD_STATUS_L4E2 |
+			       FM10K_RXD_STATUS_IPE |
+			       FM10K_RXD_STATUS_IPE2)) {
+		ring->rx_stats.csum_err++;
+		return;
+	}
+
+	/* It must be a TCP or UDP packet with a valid checksum */
+	if (fm10k_test_staterr(rx_desc, FM10K_RXD_STATUS_L4CS2))
+		skb->encapsulation = true;
+	else if (!fm10k_test_staterr(rx_desc, FM10K_RXD_STATUS_L4CS))
+		return;
+
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
+}
+
+#define FM10K_RSS_L4_TYPES_MASK \
+	((1ul << FM10K_RSSTYPE_IPV4_TCP) | \
+	 (1ul << FM10K_RSSTYPE_IPV4_UDP) | \
+	 (1ul << FM10K_RSSTYPE_IPV6_TCP) | \
+	 (1ul << FM10K_RSSTYPE_IPV6_UDP))
+
+static inline void fm10k_rx_hash(struct fm10k_ring *ring,
+				 union fm10k_rx_desc *rx_desc,
+				 struct sk_buff *skb)
+{
+	u16 rss_type;
+
+	if (!(ring->netdev->features & NETIF_F_RXHASH))
+		return;
+
+	rss_type = le16_to_cpu(rx_desc->w.pkt_info) & FM10K_RXD_RSSTYPE_MASK;
+	if (!rss_type)
+		return;
+
+	skb_set_hash(skb, le32_to_cpu(rx_desc->d.rss),
+		     (FM10K_RSS_L4_TYPES_MASK & (1ul << rss_type)) ?
+		     PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
+}
+
 /**
  * fm10k_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: rx descriptor ring packet is being transacted on
@@ -357,6 +410,10 @@ static unsigned int fm10k_process_skb_fields(struct fm10k_ring *rx_ring,
 {
 	unsigned int len = skb->len;
 
+	fm10k_rx_hash(rx_ring, rx_desc, skb);
+
+	fm10k_rx_checksum(rx_ring, rx_desc, skb);
+
 	FM10K_CB(skb)->fi.w.vlan = rx_desc->w.vlan;
 
 	skb_record_rx_queue(skb, rx_ring->queue_index);
@@ -568,6 +625,240 @@ static bool fm10k_clean_rx_irq(struct fm10k_q_vector *q_vector,
 	return total_packets < budget;
 }
 
+#define VXLAN_HLEN (sizeof(struct udphdr) + 8)
+static struct ethhdr *fm10k_port_is_vxlan(struct sk_buff *skb)
+{
+	struct fm10k_intfc *interface = netdev_priv(skb->dev);
+	struct fm10k_vxlan_port *vxlan_port;
+
+	/* we can only offload a vxlan if we recognize it as such */
+	vxlan_port = list_first_entry_or_null(&interface->vxlan_port,
+					      struct fm10k_vxlan_port, list);
+
+	if (!vxlan_port)
+		return NULL;
+	if (vxlan_port->port != udp_hdr(skb)->dest)
+		return NULL;
+
+	/* return offset of udp_hdr plus 8 bytes for VXLAN header */
+	return (struct ethhdr *)(skb_transport_header(skb) + VXLAN_HLEN);
+}
+
+#define FM10K_NVGRE_RESERVED0_FLAGS htons(0x9FFF)
+#define NVGRE_TNI htons(0x2000)
+struct fm10k_nvgre_hdr {
+	__be16 flags;
+	__be16 proto;
+	__be32 tni;
+};
+
+static struct ethhdr *fm10k_gre_is_nvgre(struct sk_buff *skb)
+{
+	struct fm10k_nvgre_hdr *nvgre_hdr;
+	int hlen = ip_hdrlen(skb);
+
+	/* currently only IPv4 is supported due to hlen above */
+	if (vlan_get_protocol(skb) != htons(ETH_P_IP))
+		return NULL;
+
+	/* our transport header should be NVGRE */
+	nvgre_hdr = (struct fm10k_nvgre_hdr *)(skb_network_header(skb) + hlen);
+
+	/* verify all reserved flags are 0 */
+	if (nvgre_hdr->flags & FM10K_NVGRE_RESERVED0_FLAGS)
+		return NULL;
+
+	/* verify protocol is transparent Ethernet bridging */
+	if (nvgre_hdr->proto != htons(ETH_P_TEB))
+		return NULL;
+
+	/* report start of ethernet header */
+	if (nvgre_hdr->flags & NVGRE_TNI)
+		return (struct ethhdr *)(nvgre_hdr + 1);
+
+	return (struct ethhdr *)(&nvgre_hdr->tni);
+}
+
+static __be16 fm10k_tx_encap_offload(struct sk_buff *skb)
+{
+	struct ethhdr *eth_hdr;
+	u8 l4_hdr = 0;
+
+	switch (vlan_get_protocol(skb)) {
+	case htons(ETH_P_IP):
+		l4_hdr = ip_hdr(skb)->protocol;
+		break;
+	case htons(ETH_P_IPV6):
+		l4_hdr = ipv6_hdr(skb)->nexthdr;
+		break;
+	default:
+		return 0;
+	}
+
+	switch (l4_hdr) {
+	case IPPROTO_UDP:
+		eth_hdr = fm10k_port_is_vxlan(skb);
+		break;
+	case IPPROTO_GRE:
+		eth_hdr = fm10k_gre_is_nvgre(skb);
+		break;
+	default:
+		return 0;
+	}
+
+	if (!eth_hdr)
+		return 0;
+
+	switch (eth_hdr->h_proto) {
+	case htons(ETH_P_IP):
+	case htons(ETH_P_IPV6):
+		break;
+	default:
+		return 0;
+	}
+
+	return eth_hdr->h_proto;
+}
+
+static int fm10k_tso(struct fm10k_ring *tx_ring,
+		     struct fm10k_tx_buffer *first)
+{
+	struct sk_buff *skb = first->skb;
+	struct fm10k_tx_desc *tx_desc;
+	unsigned char *th;
+	u8 hdrlen;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (!skb_is_gso(skb))
+		return 0;
+
+	/* compute header lengths */
+	if (skb->encapsulation) {
+		if (!fm10k_tx_encap_offload(skb))
+			goto err_vxlan;
+		th = skb_inner_transport_header(skb);
+	} else {
+		th = skb_transport_header(skb);
+	}
+
+	/* compute offset from SOF to transport header and add header len */
+	hdrlen = (th - skb->data) + (((struct tcphdr *)th)->doff << 2);
+
+	first->tx_flags |= FM10K_TX_FLAGS_CSUM;
+
+	/* update gso size and bytecount with header size */
+	first->gso_segs = skb_shinfo(skb)->gso_segs;
+	first->bytecount += (first->gso_segs - 1) * hdrlen;
+
+	/* populate Tx descriptor header size and mss */
+	tx_desc = FM10K_TX_DESC(tx_ring, tx_ring->next_to_use);
+	tx_desc->hdrlen = hdrlen;
+	tx_desc->mss = cpu_to_le16(skb_shinfo(skb)->gso_size);
+
+	return 1;
+err_vxlan:
+	tx_ring->netdev->features &= ~NETIF_F_GSO_UDP_TUNNEL;
+	if (!net_ratelimit())
+		netdev_err(tx_ring->netdev,
+			   "TSO requested for unsupported tunnel, disabling offload\n");
+	return -1;
+}
+
+static void fm10k_tx_csum(struct fm10k_ring *tx_ring,
+			  struct fm10k_tx_buffer *first)
+{
+	struct sk_buff *skb = first->skb;
+	struct fm10k_tx_desc *tx_desc;
+	union {
+		struct iphdr *ipv4;
+		struct ipv6hdr *ipv6;
+		u8 *raw;
+	} network_hdr;
+	__be16 protocol;
+	u8 l4_hdr = 0;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		goto no_csum;
+
+	if (skb->encapsulation) {
+		protocol = fm10k_tx_encap_offload(skb);
+		if (!protocol) {
+			if (skb_checksum_help(skb)) {
+				dev_warn(tx_ring->dev,
+					 "failed to offload encap csum!\n");
+				tx_ring->tx_stats.csum_err++;
+			}
+			goto no_csum;
+		}
+		network_hdr.raw = skb_inner_network_header(skb);
+	} else {
+		protocol = vlan_get_protocol(skb);
+		network_hdr.raw = skb_network_header(skb);
+	}
+
+	switch (protocol) {
+	case htons(ETH_P_IP):
+		l4_hdr = network_hdr.ipv4->protocol;
+		break;
+	case htons(ETH_P_IPV6):
+		l4_hdr = network_hdr.ipv6->nexthdr;
+		break;
+	default:
+		if (unlikely(net_ratelimit())) {
+			dev_warn(tx_ring->dev,
+				 "partial checksum but ip version=%x!\n",
+				 protocol);
+		}
+		tx_ring->tx_stats.csum_err++;
+		goto no_csum;
+	}
+
+	switch (l4_hdr) {
+	case IPPROTO_TCP:
+	case IPPROTO_UDP:
+		break;
+	case IPPROTO_GRE:
+		if (skb->encapsulation)
+			break;
+	default:
+		if (unlikely(net_ratelimit())) {
+			dev_warn(tx_ring->dev,
+				 "partial checksum but l4 proto=%x!\n",
+				 l4_hdr);
+		}
+		tx_ring->tx_stats.csum_err++;
+		goto no_csum;
+	}
+
+	/* update TX checksum flag */
+	first->tx_flags |= FM10K_TX_FLAGS_CSUM;
+
+no_csum:
+	/* populate Tx descriptor header size and mss */
+	tx_desc = FM10K_TX_DESC(tx_ring, tx_ring->next_to_use);
+	tx_desc->hdrlen = 0;
+	tx_desc->mss = 0;
+}
+
+#define FM10K_SET_FLAG(_input, _flag, _result) \
+	((_flag <= _result) ? \
+	 ((u32)(_input & _flag) * (_result / _flag)) : \
+	 ((u32)(_input & _flag) / (_flag / _result)))
+
+static u8 fm10k_tx_desc_flags(struct sk_buff *skb, u32 tx_flags)
+{
+	/* set type for advanced descriptor with frame checksum insertion */
+	u32 desc_flags = 0;
+
+	/* set checksum offload bits */
+	desc_flags |= FM10K_SET_FLAG(tx_flags, FM10K_TX_FLAGS_CSUM,
+				     FM10K_TXD_FLAG_CSUM);
+
+	return desc_flags;
+}
+
 static bool fm10k_tx_desc_push(struct fm10k_ring *tx_ring,
 			       struct fm10k_tx_desc *tx_desc, u16 i,
 			       dma_addr_t dma, unsigned int size, u8 desc_flags)
@@ -595,8 +886,9 @@ static void fm10k_tx_map(struct fm10k_ring *tx_ring,
 	unsigned char *data;
 	dma_addr_t dma;
 	unsigned int data_len, size;
+	u32 tx_flags = first->tx_flags;
 	u16 i = tx_ring->next_to_use;
-	u8 flags = 0;
+	u8 flags = fm10k_tx_desc_flags(skb, tx_flags);
 
 	tx_desc = FM10K_TX_DESC(tx_ring, i);
 
@@ -731,6 +1023,7 @@ netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
 				  struct fm10k_ring *tx_ring)
 {
 	struct fm10k_tx_buffer *first;
+	int tso;
 	u32 tx_flags = 0;
 #if PAGE_SIZE > FM10K_MAX_DATA_PER_TXD
 	unsigned short f;
@@ -762,11 +1055,23 @@ netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
 	/* record initial flags and protocol */
 	first->tx_flags = tx_flags;
 
+	tso = fm10k_tso(tx_ring, first);
+	if (tso < 0)
+		goto out_drop;
+	else if (!tso)
+		fm10k_tx_csum(tx_ring, first);
+
 	fm10k_tx_map(tx_ring, first);
 
 	fm10k_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	return NETDEV_TX_OK;
+
+out_drop:
+	dev_kfree_skb_any(first->skb);
+	first->skb = NULL;
+
+	return NETDEV_TX_OK;
 }
 
 static u64 fm10k_get_tx_completed(struct fm10k_ring *ring)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 2433a14..6383db2 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -19,6 +19,9 @@
  */
 
 #include "fm10k.h"
+#if IS_ENABLED(CONFIG_VXLAN)
+#include <net/vxlan.h>
+#endif /* CONFIG_VXLAN */
 
 /**
  * fm10k_setup_tx_resources - allocate Tx resources (Descriptors)
@@ -365,6 +368,128 @@ static void fm10k_request_glort_range(struct fm10k_intfc *interface)
 }
 
 /**
+ * fm10k_del_vxlan_port_all
+ * @interface: board private structure
+ *
+ * This function frees the entire vxlan_port list
+ **/
+static void fm10k_del_vxlan_port_all(struct fm10k_intfc *interface)
+{
+	struct fm10k_vxlan_port *vxlan_port;
+
+	/* flush all entries from list */
+	vxlan_port = list_first_entry_or_null(&interface->vxlan_port,
+					      struct fm10k_vxlan_port, list);
+	while (vxlan_port) {
+		list_del(&vxlan_port->list);
+		kfree(vxlan_port);
+		vxlan_port = list_first_entry_or_null(&interface->vxlan_port,
+						      struct fm10k_vxlan_port,
+						      list);
+	}
+}
+
+/**
+ * fm10k_restore_vxlan_port
+ * @interface: board private structure
+ *
+ * This function restores the value in the tunnel_cfg register after reset
+ **/
+static void fm10k_restore_vxlan_port(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_vxlan_port *vxlan_port;
+
+	/* only the PF supports configuring tunnels */
+	if (hw->mac.type != fm10k_mac_pf)
+		return;
+
+	vxlan_port = list_first_entry_or_null(&interface->vxlan_port,
+					      struct fm10k_vxlan_port, list);
+
+	/* restore tunnel configuration register */
+	fm10k_write_reg(hw, FM10K_TUNNEL_CFG,
+			(vxlan_port ? ntohs(vxlan_port->port) : 0) |
+			(ETH_P_TEB << FM10K_TUNNEL_CFG_NVGRE_SHIFT));
+}
+
+/**
+ * fm10k_add_vxlan_port
+ * @netdev: network interface device structure
+ * @sa_family: Address family of new port
+ * @port: port number used for VXLAN
+ *
+ * This funciton is called when a new VXLAN interface has added a new port
+ * number to the range that is currently in use for VXLAN.  The new port
+ * number is always added to the tail so that the port number list should
+ * match the order in which the ports were allocated.  The head of the list
+ * is always used as the VXLAN port number for offloads.
+ **/
+static void fm10k_add_vxlan_port(struct net_device *dev,
+				 sa_family_t sa_family, __be16 port) {
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_vxlan_port *vxlan_port;
+
+	/* only the PF supports configuring tunnels */
+	if (interface->hw.mac.type != fm10k_mac_pf)
+		return;
+
+	/* existing ports are pulled out so our new entry is always last */
+	fm10k_vxlan_port_for_each(vxlan_port, interface) {
+		if ((vxlan_port->port == port) &&
+		    (vxlan_port->sa_family == sa_family)) {
+			list_del(&vxlan_port->list);
+			goto insert_tail;
+		}
+	}
+
+	/* allocate memory to track ports */
+	vxlan_port = kmalloc(sizeof(*vxlan_port), GFP_ATOMIC);
+	if (!vxlan_port)
+		return;
+	vxlan_port->port = port;
+	vxlan_port->sa_family = sa_family;
+
+insert_tail:
+	/* add new port value to list */
+	list_add_tail(&vxlan_port->list, &interface->vxlan_port);
+
+	fm10k_restore_vxlan_port(interface);
+}
+
+/**
+ * fm10k_del_vxlan_port
+ * @netdev: network interface device structure
+ * @sa_family: Address family of freed port
+ * @port: port number used for VXLAN
+ *
+ * This funciton is called when a new VXLAN interface has freed a port
+ * number from the range that is currently in use for VXLAN.  The freed
+ * port is removed from the list and the new head is used to determine
+ * the port number for offloads.
+ **/
+static void fm10k_del_vxlan_port(struct net_device *dev,
+				 sa_family_t sa_family, __be16 port) {
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_vxlan_port *vxlan_port;
+
+	if (interface->hw.mac.type != fm10k_mac_pf)
+		return;
+
+	/* find the port in the list and free it */
+	fm10k_vxlan_port_for_each(vxlan_port, interface) {
+		if ((vxlan_port->port == port) &&
+		    (vxlan_port->sa_family == sa_family)) {
+			list_del(&vxlan_port->list);
+			kfree(vxlan_port);
+			break;
+		}
+	}
+
+	fm10k_restore_vxlan_port(interface);
+}
+
+/**
  * fm10k_open - Called when a network interface is made active
  * @netdev: network interface device structure
  *
@@ -406,6 +531,11 @@ int fm10k_open(struct net_device *netdev)
 	if (err)
 		goto err_set_queues;
 
+#if IS_ENABLED(CONFIG_VXLAN)
+	/* update VXLAN port configuration */
+	vxlan_get_rx_port(netdev);
+
+#endif
 	fm10k_up(interface);
 
 	return 0;
@@ -439,6 +569,8 @@ int fm10k_close(struct net_device *netdev)
 
 	fm10k_qv_free_irq(interface);
 
+	fm10k_del_vxlan_port_all(interface);
+
 	fm10k_free_all_tx_resources(interface);
 	fm10k_free_all_rx_resources(interface);
 
@@ -888,6 +1020,9 @@ void fm10k_restore_rx_state(struct fm10k_intfc *interface)
 
 	/* record updated xcast mode state */
 	interface->xcast_mode = xcast_mode;
+
+	/* Restore tunnel configuration */
+	fm10k_restore_vxlan_port(interface);
 }
 
 void fm10k_reset_rx_state(struct fm10k_intfc *interface)
@@ -1022,6 +1157,8 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
 	.ndo_get_stats64	= fm10k_get_stats64,
 	.ndo_setup_tc		= fm10k_setup_tc,
+	.ndo_add_vxlan_port	= fm10k_add_vxlan_port,
+	.ndo_del_vxlan_port	= fm10k_del_vxlan_port,
 };
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3
@@ -1044,7 +1181,15 @@ struct net_device *fm10k_alloc_netdev(void)
 	interface->msg_enable = (1 << DEFAULT_DEBUG_LEVEL_SHIFT) - 1;
 
 	/* configure default features */
-	dev->features |= NETIF_F_SG;
+	dev->features |= NETIF_F_IP_CSUM |
+			 NETIF_F_IPV6_CSUM |
+			 NETIF_F_SG |
+			 NETIF_F_TSO |
+			 NETIF_F_TSO6 |
+			 NETIF_F_TSO_ECN |
+			 NETIF_F_GSO_UDP_TUNNEL |
+			 NETIF_F_RXHASH |
+			 NETIF_F_RXCSUM;
 
 	/* all features defined to this point should be changeable */
 	dev->hw_features |= dev->features;
@@ -1053,7 +1198,13 @@ struct net_device *fm10k_alloc_netdev(void)
 	dev->vlan_features |= dev->features;
 
 	/* configure tunnel offloads */
-	dev->hw_enc_features = NETIF_F_SG;
+	dev->hw_enc_features = NETIF_F_IP_CSUM |
+			       NETIF_F_TSO |
+			       NETIF_F_TSO6 |
+			       NETIF_F_TSO_ECN |
+			       NETIF_F_GSO_UDP_TUNNEL |
+			       NETIF_F_IPV6_CSUM |
+			       NETIF_F_SG;
 
 	/* we want to leave these both on as we cannot disable VLAN tag
 	 * insertion or stripping on the hardware since it is contained

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 21/29] fm10k: Add support for MACVLAN acceleration
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (19 preceding siblings ...)
  2014-09-18 22:38 ` [net-next PATCH 20/29] fm10k: Add support for netdev offloads Alexander Duyck
@ 2014-09-18 22:39 ` Alexander Duyck
  2014-09-18 22:39 ` [net-next PATCH 22/29] fm10k: Add support for PF <-> VF mailbox Alexander Duyck
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:39 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for L2 MACVLAN by making use of the fact that the
RRC provides a unique tag per filter called a Global Resource Tag, or GLORT.
In the case of this offload what I have done is assigned a linear block of
these so that each GLORT represents one of the MACVLAN netdevs.  By doing
this I can share the Rx queues and Tx queues for all of the MACVLAN netdevs
while allowing them to be demuxed in the Rx cleanup path.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h        |   11 ++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |   32 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |  154 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |    2 
 4 files changed, 198 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 19c3f8c..82efaed 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -54,6 +54,15 @@
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define FM10K_RX_BUFFER_WRITE	16	/* Must be power of 2 */
 
+#define FM10K_MAX_STATIONS	63
+struct fm10k_l2_accel {
+	int size;
+	u16 count;
+	u16 dglort;
+	struct rcu_head rcu;
+	struct net_device *macvlan[0];
+};
+
 enum fm10k_ring_state_t {
 	__FM10K_TX_DETECT_HANG,
 	__FM10K_HANG_CHECK_ARMED,
@@ -104,6 +113,7 @@ struct fm10k_ring {
 	struct fm10k_q_vector *q_vector;/* backpointer to host q_vector */
 	struct net_device *netdev;	/* netdev ring belongs to */
 	struct device *dev;		/* device for DMA mapping */
+	struct fm10k_l2_accel __rcu *l2_accel;	/* L2 acceleration list */
 	void *desc;			/* descriptor ring memory */
 	union {
 		struct fm10k_tx_buffer *tx_buffer;
@@ -217,6 +227,7 @@ struct fm10k_vxlan_port {
 struct fm10k_intfc {
 	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
 	struct net_device *netdev;
+	struct fm10k_l2_accel *l2_accel; /* pointer to L2 acceleration list */
 	struct pci_dev *pdev;
 	unsigned long state;
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index baab163..1e4fcba 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -394,6 +394,35 @@ static inline void fm10k_rx_hash(struct fm10k_ring *ring,
 		     PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 }
 
+static void fm10k_type_trans(struct fm10k_ring *rx_ring,
+			     union fm10k_rx_desc *rx_desc,
+			     struct sk_buff *skb)
+{
+	struct net_device *dev = rx_ring->netdev;
+	struct fm10k_l2_accel *l2_accel = ACCESS_ONCE(rx_ring->l2_accel);
+
+	/* check to see if DGLORT belongs to a MACVLAN */
+	if (l2_accel) {
+		u16 idx = le16_to_cpu(FM10K_CB(skb)->fi.w.dglort) - 1;
+
+		idx -= l2_accel->dglort;
+		if (idx < l2_accel->size && l2_accel->macvlan[idx])
+			dev = l2_accel->macvlan[idx];
+		else
+			l2_accel = NULL;
+	}
+
+	skb->protocol = eth_type_trans(skb, dev);
+
+	if (!l2_accel)
+		return;
+
+	/* update MACVLAN statistics */
+	macvlan_count_rx(netdev_priv(dev), skb->len + ETH_HLEN, 1,
+			 !!(rx_desc->w.hdr_info &
+			    cpu_to_le16(FM10K_RXD_HDR_INFO_XC_MASK)));
+}
+
 /**
  * fm10k_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: rx descriptor ring packet is being transacted on
@@ -427,7 +456,7 @@ static unsigned int fm10k_process_skb_fields(struct fm10k_ring *rx_ring,
 			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
 	}
 
-	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+	fm10k_type_trans(rx_ring, rx_desc, skb);
 
 	return len;
 }
@@ -1567,6 +1596,7 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
 		/* assign generic ring traits */
 		ring->dev = &interface->pdev->dev;
 		ring->netdev = interface->netdev;
+		ring->l2_accel = interface->l2_accel;
 
 		/* configure backlink on ring */
 		ring->q_vector = q_vector;
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 6383db2..af30e25 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -1144,6 +1144,155 @@ int fm10k_setup_tc(struct net_device *dev, u8 tc)
 	return 0;
 }
 
+static void fm10k_assign_l2_accel(struct fm10k_intfc *interface,
+				  struct fm10k_l2_accel *l2_accel)
+{
+	struct fm10k_ring *ring;
+	int i;
+
+	for (i = 0; i < interface->num_rx_queues; i++) {
+		ring = interface->rx_ring[i];
+		rcu_assign_pointer(ring->l2_accel, l2_accel);
+	}
+
+	rcu_assign_pointer(interface->l2_accel, l2_accel);
+}
+
+static void *fm10k_dfwd_add_station(struct net_device *dev,
+				    struct net_device *sdev)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_l2_accel *l2_accel = ACCESS_ONCE(interface->l2_accel);
+	struct fm10k_l2_accel *old_l2_accel = NULL;
+	struct fm10k_dglort_cfg dglort = { 0 };
+	struct fm10k_hw *hw = &interface->hw;
+	int size = 0, i;
+	u16 glort;
+
+	/* allocate l2 accel structure if it is not available */
+	if (!l2_accel) {
+		/* verify there is enough free GLORTs to support l2_accel */
+		if (interface->glort_count < 7)
+			return ERR_PTR(-EBUSY);
+
+		size = offsetof(struct fm10k_l2_accel, macvlan[7]);
+		l2_accel = kzalloc(size, GFP_KERNEL);
+		if (!l2_accel)
+			return ERR_PTR(-ENOMEM);
+
+		l2_accel->size = 7;
+		l2_accel->dglort = interface->glort;
+
+		/* update pointers */
+		fm10k_assign_l2_accel(interface, l2_accel);
+	/* do not expand if we are at our limit */
+	} else if ((l2_accel->count == FM10K_MAX_STATIONS) ||
+		   (l2_accel->count == (interface->glort_count - 1))) {
+		return ERR_PTR(-EBUSY);
+	/* expand if we have hit the size limit */
+	} else if (l2_accel->count == l2_accel->size) {
+		old_l2_accel = l2_accel;
+		size = offsetof(struct fm10k_l2_accel,
+				macvlan[(l2_accel->size * 2) + 1]);
+		l2_accel = kzalloc(size, GFP_KERNEL);
+		if (!l2_accel)
+			return ERR_PTR(-ENOMEM);
+
+		memcpy(l2_accel, old_l2_accel,
+		       offsetof(struct fm10k_l2_accel,
+				macvlan[old_l2_accel->size]));
+
+		l2_accel->size = (old_l2_accel->size * 2) + 1;
+
+		/* update pointers */
+		fm10k_assign_l2_accel(interface, l2_accel);
+		kfree_rcu(old_l2_accel, rcu);
+	}
+
+	/* add macvlan to accel table, and record GLORT for position */
+	for (i = 0; i < l2_accel->size; i++) {
+		if (!l2_accel->macvlan[i])
+			break;
+	}
+
+	/* record station */
+	l2_accel->macvlan[i] = sdev;
+	l2_accel->count++;
+
+	/* configure default DGLORT mapping for RSS/DCB */
+	dglort.idx = fm10k_dglort_pf_rss;
+	dglort.inner_rss = 1;
+	dglort.rss_l = fls(interface->ring_feature[RING_F_RSS].mask);
+	dglort.pc_l = fls(interface->ring_feature[RING_F_QOS].mask);
+	dglort.glort = interface->glort;
+	dglort.shared_l = fls(l2_accel->size);
+	hw->mac.ops.configure_dglort_map(hw, &dglort);
+
+	/* Add rules for this specific dglort to the switch */
+	fm10k_mbx_lock(interface);
+
+	glort = l2_accel->dglort + 1 + i;
+	hw->mac.ops.update_xcast_mode(hw, glort, FM10K_XCAST_MODE_MULTI);
+	hw->mac.ops.update_uc_addr(hw, glort, sdev->dev_addr, 0, true, 0);
+
+	fm10k_mbx_unlock(interface);
+
+	return sdev;
+}
+
+static void fm10k_dfwd_del_station(struct net_device *dev, void *priv)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_l2_accel *l2_accel = ACCESS_ONCE(interface->l2_accel);
+	struct fm10k_dglort_cfg dglort = { 0 };
+	struct fm10k_hw *hw = &interface->hw;
+	struct net_device *sdev = priv;
+	int i;
+	u16 glort;
+
+	if (!l2_accel)
+		return;
+
+	/* search table for matching interface */
+	for (i = 0; i < l2_accel->size; i++) {
+		if (l2_accel->macvlan[i] == sdev)
+			break;
+	}
+
+	/* exit if macvlan not found */
+	if (i == l2_accel->size)
+		return;
+
+	/* Remove any rules specific to this dglort */
+	fm10k_mbx_lock(interface);
+
+	glort = l2_accel->dglort + 1 + i;
+	hw->mac.ops.update_xcast_mode(hw, glort, FM10K_XCAST_MODE_NONE);
+	hw->mac.ops.update_uc_addr(hw, glort, sdev->dev_addr, 0, false, 0);
+
+	fm10k_mbx_unlock(interface);
+
+	/* record removal */
+	l2_accel->macvlan[i] = NULL;
+	l2_accel->count--;
+
+	/* configure default DGLORT mapping for RSS/DCB */
+	dglort.idx = fm10k_dglort_pf_rss;
+	dglort.inner_rss = 1;
+	dglort.rss_l = fls(interface->ring_feature[RING_F_RSS].mask);
+	dglort.pc_l = fls(interface->ring_feature[RING_F_QOS].mask);
+	dglort.glort = interface->glort;
+	if (l2_accel)
+		dglort.shared_l = fls(l2_accel->size);
+	hw->mac.ops.configure_dglort_map(hw, &dglort);
+
+	/* If table is empty remove it */
+	if (l2_accel->count == 0) {
+		fm10k_assign_l2_accel(interface, NULL);
+		kfree_rcu(l2_accel, rcu);
+	}
+}
+
 static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_open		= fm10k_open,
 	.ndo_stop		= fm10k_close,
@@ -1159,6 +1308,8 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_setup_tc		= fm10k_setup_tc,
 	.ndo_add_vxlan_port	= fm10k_add_vxlan_port,
 	.ndo_del_vxlan_port	= fm10k_del_vxlan_port,
+	.ndo_dfwd_add_station	= fm10k_dfwd_add_station,
+	.ndo_dfwd_del_station	= fm10k_dfwd_del_station,
 };
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3
@@ -1194,6 +1345,9 @@ struct net_device *fm10k_alloc_netdev(void)
 	/* all features defined to this point should be changeable */
 	dev->hw_features |= dev->features;
 
+	/* allow user to enable L2 forwarding acceleration */
+	dev->hw_features |= NETIF_F_HW_L2FW_DOFFLOAD;
+
 	/* configure VLAN features */
 	dev->vlan_features |= dev->features;
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 9b97da7..c61241a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -756,6 +756,8 @@ static void fm10k_configure_dglort(struct fm10k_intfc *interface)
 	dglort.pc_l = fls(interface->ring_feature[RING_F_QOS].mask);
 	/* configure DGLORT mapping for RSS/DCB */
 	dglort.idx = fm10k_dglort_pf_rss;
+	if (interface->l2_accel)
+		dglort.shared_l = fls(interface->l2_accel->size);
 	hw->mac.ops.configure_dglort_map(hw, &dglort);
 }
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 22/29] fm10k: Add support for PF <-> VF mailbox
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (20 preceding siblings ...)
  2014-09-18 22:39 ` [net-next PATCH 21/29] fm10k: Add support for MACVLAN acceleration Alexander Duyck
@ 2014-09-18 22:39 ` Alexander Duyck
  2014-09-18 22:39 ` [net-next PATCH 23/29] fm10k: Add support for VF Alexander Duyck
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:39 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for the PF <-> VF mailbox.  It functions similar to
the PF <-> SM mailbox however there are several modifications made to
improve the reliability of the mailbox itself.  In addition the PF/VF
mailbox is much smaller an only supports a total size of 16 DWORDs vs the
1024 DWORDS provided for the PF/SM mailbox.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c |  770 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h |   64 ++
 2 files changed, 834 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
index 2609847..a6a66fd 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
@@ -511,6 +511,148 @@ static s32 fm10k_mbx_push_tail(struct fm10k_hw *hw,
 	return 0;
 }
 
+/* pre-generated data for generating the CRC based on the poly 0xAC9A. */
+static const u16 fm10k_crc_16b_table[256] = {
+	0x0000, 0x7956, 0xF2AC, 0x8BFA, 0xBC6D, 0xC53B, 0x4EC1, 0x3797,
+	0x21EF, 0x58B9, 0xD343, 0xAA15, 0x9D82, 0xE4D4, 0x6F2E, 0x1678,
+	0x43DE, 0x3A88, 0xB172, 0xC824, 0xFFB3, 0x86E5, 0x0D1F, 0x7449,
+	0x6231, 0x1B67, 0x909D, 0xE9CB, 0xDE5C, 0xA70A, 0x2CF0, 0x55A6,
+	0x87BC, 0xFEEA, 0x7510, 0x0C46, 0x3BD1, 0x4287, 0xC97D, 0xB02B,
+	0xA653, 0xDF05, 0x54FF, 0x2DA9, 0x1A3E, 0x6368, 0xE892, 0x91C4,
+	0xC462, 0xBD34, 0x36CE, 0x4F98, 0x780F, 0x0159, 0x8AA3, 0xF3F5,
+	0xE58D, 0x9CDB, 0x1721, 0x6E77, 0x59E0, 0x20B6, 0xAB4C, 0xD21A,
+	0x564D, 0x2F1B, 0xA4E1, 0xDDB7, 0xEA20, 0x9376, 0x188C, 0x61DA,
+	0x77A2, 0x0EF4, 0x850E, 0xFC58, 0xCBCF, 0xB299, 0x3963, 0x4035,
+	0x1593, 0x6CC5, 0xE73F, 0x9E69, 0xA9FE, 0xD0A8, 0x5B52, 0x2204,
+	0x347C, 0x4D2A, 0xC6D0, 0xBF86, 0x8811, 0xF147, 0x7ABD, 0x03EB,
+	0xD1F1, 0xA8A7, 0x235D, 0x5A0B, 0x6D9C, 0x14CA, 0x9F30, 0xE666,
+	0xF01E, 0x8948, 0x02B2, 0x7BE4, 0x4C73, 0x3525, 0xBEDF, 0xC789,
+	0x922F, 0xEB79, 0x6083, 0x19D5, 0x2E42, 0x5714, 0xDCEE, 0xA5B8,
+	0xB3C0, 0xCA96, 0x416C, 0x383A, 0x0FAD, 0x76FB, 0xFD01, 0x8457,
+	0xAC9A, 0xD5CC, 0x5E36, 0x2760, 0x10F7, 0x69A1, 0xE25B, 0x9B0D,
+	0x8D75, 0xF423, 0x7FD9, 0x068F, 0x3118, 0x484E, 0xC3B4, 0xBAE2,
+	0xEF44, 0x9612, 0x1DE8, 0x64BE, 0x5329, 0x2A7F, 0xA185, 0xD8D3,
+	0xCEAB, 0xB7FD, 0x3C07, 0x4551, 0x72C6, 0x0B90, 0x806A, 0xF93C,
+	0x2B26, 0x5270, 0xD98A, 0xA0DC, 0x974B, 0xEE1D, 0x65E7, 0x1CB1,
+	0x0AC9, 0x739F, 0xF865, 0x8133, 0xB6A4, 0xCFF2, 0x4408, 0x3D5E,
+	0x68F8, 0x11AE, 0x9A54, 0xE302, 0xD495, 0xADC3, 0x2639, 0x5F6F,
+	0x4917, 0x3041, 0xBBBB, 0xC2ED, 0xF57A, 0x8C2C, 0x07D6, 0x7E80,
+	0xFAD7, 0x8381, 0x087B, 0x712D, 0x46BA, 0x3FEC, 0xB416, 0xCD40,
+	0xDB38, 0xA26E, 0x2994, 0x50C2, 0x6755, 0x1E03, 0x95F9, 0xECAF,
+	0xB909, 0xC05F, 0x4BA5, 0x32F3, 0x0564, 0x7C32, 0xF7C8, 0x8E9E,
+	0x98E6, 0xE1B0, 0x6A4A, 0x131C, 0x248B, 0x5DDD, 0xD627, 0xAF71,
+	0x7D6B, 0x043D, 0x8FC7, 0xF691, 0xC106, 0xB850, 0x33AA, 0x4AFC,
+	0x5C84, 0x25D2, 0xAE28, 0xD77E, 0xE0E9, 0x99BF, 0x1245, 0x6B13,
+	0x3EB5, 0x47E3, 0xCC19, 0xB54F, 0x82D8, 0xFB8E, 0x7074, 0x0922,
+	0x1F5A, 0x660C, 0xEDF6, 0x94A0, 0xA337, 0xDA61, 0x519B, 0x28CD };
+
+/**
+ *  fm10k_crc_16b - Generate a 16 bit CRC for a region of 16 bit data
+ *  @data: pointer to data to process
+ *  @seed: seed value for CRC
+ *  @len: length measured in 16 bits words
+ *
+ *  This function will generate a CRC based on the polynomial 0xAC9A and
+ *  whatever value is stored in the seed variable.  Note that this
+ *  value inverts the local seed and the result in order to capture all
+ *  leading and trailing zeros.
+ */
+static u16 fm10k_crc_16b(const u32 *data, u16 seed, u16 len)
+{
+	u32 result = seed;
+
+	while (len--) {
+		result ^= *(data++);
+		result = (result >> 8) ^ fm10k_crc_16b_table[result & 0xFF];
+		result = (result >> 8) ^ fm10k_crc_16b_table[result & 0xFF];
+
+		if (!(len--))
+			break;
+
+		result = (result >> 8) ^ fm10k_crc_16b_table[result & 0xFF];
+		result = (result >> 8) ^ fm10k_crc_16b_table[result & 0xFF];
+	}
+
+	return (u16)result;
+}
+
+/**
+ *  fm10k_fifo_crc - generate a CRC based off of FIFO data
+ *  @fifo: pointer to FIFO
+ *  @offset: offset point for start of FIFO
+ *  @len: number of DWORDS words to process
+ *  @seed: seed value for CRC
+ *
+ *  This function generates a CRC for some region of the FIFO
+ **/
+static u16 fm10k_fifo_crc(struct fm10k_mbx_fifo *fifo, u16 offset,
+			  u16 len, u16 seed)
+{
+	u32 *data = fifo->buffer + offset;
+
+	/* track when we should cross the end of the FIFO */
+	offset = fifo->size - offset;
+
+	/* if we are in 2 blocks process the end of the FIFO first */
+	if (offset < len) {
+		seed = fm10k_crc_16b(data, seed, offset * 2);
+		data = fifo->buffer;
+		len -= offset;
+	}
+
+	/* process any remaining bits */
+	return fm10k_crc_16b(data, seed, len * 2);
+}
+
+/**
+ *  fm10k_mbx_update_local_crc - Update the local CRC for outgoing data
+ *  @mbx: pointer to mailbox
+ *  @head: head index provided by remote mailbox
+ *
+ *  This function will generate the CRC for all data from the end of the
+ *  last head update to the current one.  It uses the result of the
+ *  previous CRC as the seed for this update.  The result is stored in
+ *  mbx->local.
+ **/
+static void fm10k_mbx_update_local_crc(struct fm10k_mbx_info *mbx, u16 head)
+{
+	u16 len = mbx->tail_len - fm10k_mbx_index_len(mbx, head, mbx->tail);
+
+	/* determine the offset for the start of the region to be pulled */
+	head = fm10k_fifo_head_offset(&mbx->tx, mbx->pulled);
+
+	/* update local CRC to include all of the pulled data */
+	mbx->local = fm10k_fifo_crc(&mbx->tx, head, len, mbx->local);
+}
+
+/**
+ *  fm10k_mbx_verify_remote_crc - Verify the CRC is correct for current data
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will take all data that has been provided from the remote
+ *  end and generate a CRC for it.  This is stored in mbx->remote.  The
+ *  CRC for the header is then computed and if the result is non-zero this
+ *  is an error and we signal an error dropping all data and resetting the
+ *  connection.
+ */
+static s32 fm10k_mbx_verify_remote_crc(struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_mbx_fifo *fifo = &mbx->rx;
+	u16 len = mbx->head_len;
+	u16 offset = fm10k_fifo_tail_offset(fifo, mbx->pushed) - len;
+	u16 crc;
+
+	/* update the remote CRC if new data has been received */
+	if (len)
+		mbx->remote = fm10k_fifo_crc(fifo, offset, len, mbx->remote);
+
+	/* process the full header as we have to validate the CRC */
+	crc = fm10k_crc_16b(&mbx->mbx_hdr, mbx->remote, 1);
+
+	/* notify other end if we have a problem */
+	return crc ? FM10K_MBX_ERR_CRC : 0;
+}
+
 /**
  *  fm10k_mbx_rx_ready - Indicates that a message is ready in the Rx FIFO
  *  @mbx: pointer to mailbox
@@ -686,6 +828,204 @@ static void fm10k_mbx_write(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx)
 }
 
 /**
+ *  fm10k_mbx_create_connect_hdr - Generate a connect mailbox header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns a connection mailbox header
+ **/
+static void fm10k_mbx_create_connect_hdr(struct fm10k_mbx_info *mbx)
+{
+	mbx->mbx_lock |= FM10K_MBX_REQ;
+
+	mbx->mbx_hdr = FM10K_MSG_HDR_FIELD_SET(FM10K_MSG_CONNECT, TYPE) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->head, HEAD) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->rx.size - 1, CONNECT_SIZE);
+}
+
+/**
+ *  fm10k_mbx_create_data_hdr - Generate a data mailbox header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns a data mailbox header
+ **/
+static void fm10k_mbx_create_data_hdr(struct fm10k_mbx_info *mbx)
+{
+	u32 hdr = FM10K_MSG_HDR_FIELD_SET(FM10K_MSG_DATA, TYPE) |
+		  FM10K_MSG_HDR_FIELD_SET(mbx->tail, TAIL) |
+		  FM10K_MSG_HDR_FIELD_SET(mbx->head, HEAD);
+	struct fm10k_mbx_fifo *fifo = &mbx->tx;
+	u16 crc;
+
+	if (mbx->tail_len)
+		mbx->mbx_lock |= FM10K_MBX_REQ;
+
+	/* generate CRC for data in flight and header */
+	crc = fm10k_fifo_crc(fifo, fm10k_fifo_head_offset(fifo, mbx->pulled),
+			     mbx->tail_len, mbx->local);
+	crc = fm10k_crc_16b(&hdr, crc, 1);
+
+	/* load header to memory to be written */
+	mbx->mbx_hdr = hdr | FM10K_MSG_HDR_FIELD_SET(crc, CRC);
+}
+
+/**
+ *  fm10k_mbx_create_disconnect_hdr - Generate a disconnect mailbox header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function returns a disconnect mailbox header
+ **/
+static void fm10k_mbx_create_disconnect_hdr(struct fm10k_mbx_info *mbx)
+{
+	u32 hdr = FM10K_MSG_HDR_FIELD_SET(FM10K_MSG_DISCONNECT, TYPE) |
+		  FM10K_MSG_HDR_FIELD_SET(mbx->tail, TAIL) |
+		  FM10K_MSG_HDR_FIELD_SET(mbx->head, HEAD);
+	u16 crc = fm10k_crc_16b(&hdr, mbx->local, 1);
+
+	mbx->mbx_lock |= FM10K_MBX_ACK;
+
+	/* load header to memory to be written */
+	mbx->mbx_hdr = hdr | FM10K_MSG_HDR_FIELD_SET(crc, CRC);
+}
+
+/**
+ *  fm10k_mbx_create_error_msg - Generate a error message
+ *  @mbx: pointer to mailbox
+ *  @err: local error encountered
+ *
+ *  This function will interpret the error provided by err, and based on
+ *  that it may shift the message by 1 DWORD and then place an error header
+ *  at the start of the message.
+ **/
+static void fm10k_mbx_create_error_msg(struct fm10k_mbx_info *mbx, s32 err)
+{
+	/* only generate an error message for these types */
+	switch (err) {
+	case FM10K_MBX_ERR_TAIL:
+	case FM10K_MBX_ERR_HEAD:
+	case FM10K_MBX_ERR_TYPE:
+	case FM10K_MBX_ERR_SIZE:
+	case FM10K_MBX_ERR_RSVD0:
+	case FM10K_MBX_ERR_CRC:
+		break;
+	default:
+		return;
+	}
+
+	mbx->mbx_lock |= FM10K_MBX_REQ;
+
+	mbx->mbx_hdr = FM10K_MSG_HDR_FIELD_SET(FM10K_MSG_ERROR, TYPE) |
+		       FM10K_MSG_HDR_FIELD_SET(err, ERR_NO) |
+		       FM10K_MSG_HDR_FIELD_SET(mbx->head, HEAD);
+}
+
+/**
+ *  fm10k_mbx_validate_msg_hdr - Validate common fields in the message header
+ *  @mbx: pointer to mailbox
+ *  @msg: message array to read
+ *
+ *  This function will parse up the fields in the mailbox header and return
+ *  an error if the header contains any of a number of invalid configurations
+ *  including unrecognized type, invalid route, or a malformed message.
+ **/
+static s32 fm10k_mbx_validate_msg_hdr(struct fm10k_mbx_info *mbx)
+{
+	u16 type, rsvd0, head, tail, size;
+	const u32 *hdr = &mbx->mbx_hdr;
+
+	type = FM10K_MSG_HDR_FIELD_GET(*hdr, TYPE);
+	rsvd0 = FM10K_MSG_HDR_FIELD_GET(*hdr, RSVD0);
+	tail = FM10K_MSG_HDR_FIELD_GET(*hdr, TAIL);
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, HEAD);
+	size = FM10K_MSG_HDR_FIELD_GET(*hdr, CONNECT_SIZE);
+
+	if (rsvd0)
+		return FM10K_MBX_ERR_RSVD0;
+
+	switch (type) {
+	case FM10K_MSG_DISCONNECT:
+		/* validate that all data has been received */
+		if (tail != mbx->head)
+			return FM10K_MBX_ERR_TAIL;
+
+		/* fall through */
+	case FM10K_MSG_DATA:
+		/* validate that head is moving correctly */
+		if (!head || (head == FM10K_MSG_HDR_MASK(HEAD)))
+			return FM10K_MBX_ERR_HEAD;
+		if (fm10k_mbx_index_len(mbx, head, mbx->tail) > mbx->tail_len)
+			return FM10K_MBX_ERR_HEAD;
+
+		/* validate that tail is moving correctly */
+		if (!tail || (tail == FM10K_MSG_HDR_MASK(TAIL)))
+			return FM10K_MBX_ERR_TAIL;
+		if (fm10k_mbx_index_len(mbx, mbx->head, tail) < mbx->mbmem_len)
+			break;
+
+		return FM10K_MBX_ERR_TAIL;
+	case FM10K_MSG_CONNECT:
+		/* validate size is in range and is power of 2 mask */
+		if ((size < FM10K_VFMBX_MSG_MTU) || (size & (size + 1)))
+			return FM10K_MBX_ERR_SIZE;
+
+		/* fall through */
+	case FM10K_MSG_ERROR:
+		if (!head || (head == FM10K_MSG_HDR_MASK(HEAD)))
+			return FM10K_MBX_ERR_HEAD;
+		/* neither create nor error include a tail offset */
+		if (tail)
+			return FM10K_MBX_ERR_TAIL;
+
+		break;
+	default:
+		return FM10K_MBX_ERR_TYPE;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_mbx_create_reply - Generate reply based on state and remote head
+ *  @mbx: pointer to mailbox
+ *  @head: acknowledgement number
+ *
+ *  This function will generate an outgoing message based on the current
+ *  mailbox state and the remote fifo head.  It will return the length
+ *  of the outgoing message excluding header on success, and a negative value
+ *  on error.
+ **/
+static s32 fm10k_mbx_create_reply(struct fm10k_hw *hw,
+				  struct fm10k_mbx_info *mbx, u16 head)
+{
+	switch (mbx->state) {
+	case FM10K_STATE_OPEN:
+	case FM10K_STATE_DISCONNECT:
+		/* update our checksum for the outgoing data */
+		fm10k_mbx_update_local_crc(mbx, head);
+
+		/* as long as other end recognizes us keep sending data */
+		fm10k_mbx_pull_head(hw, mbx, head);
+
+		/* generate new header based on data */
+		if (mbx->tail_len || (mbx->state == FM10K_STATE_OPEN))
+			fm10k_mbx_create_data_hdr(mbx);
+		else
+			fm10k_mbx_create_disconnect_hdr(mbx);
+		break;
+	case FM10K_STATE_CONNECT:
+		/* send disconnect even if we aren't connected */
+		fm10k_mbx_create_connect_hdr(mbx);
+		break;
+	case FM10K_STATE_CLOSED:
+		/* generate new header based on data */
+		fm10k_mbx_create_disconnect_hdr(mbx);
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+/**
  *  fm10k_mbx_reset_work- Reset internal pointers for any pending work
  *  @mbx: pointer to mailbox
  *
@@ -731,6 +1071,359 @@ static void fm10k_mbx_update_max_size(struct fm10k_mbx_info *mbx, u16 size)
 }
 
 /**
+ *  fm10k_mbx_connect_reset - Reset following request for reset
+ *  @mbx: pointer to mailbox
+ *
+ *  This function resets the mailbox to either a disconnected state
+ *  or a connect state depending on the current mailbox state
+ **/
+static void fm10k_mbx_connect_reset(struct fm10k_mbx_info *mbx)
+{
+	/* just do a quick resysnc to start of frame */
+	fm10k_mbx_reset_work(mbx);
+
+	/* reset CRC seeds */
+	mbx->local = FM10K_MBX_CRC_SEED;
+	mbx->remote = FM10K_MBX_CRC_SEED;
+
+	/* we cannot exit connect until the size is good */
+	if (mbx->state == FM10K_STATE_OPEN)
+		mbx->state = FM10K_STATE_CONNECT;
+	else
+		mbx->state = FM10K_STATE_CLOSED;
+}
+
+/**
+ *  fm10k_mbx_process_connect - Process connect header
+ *  @mbx: pointer to mailbox
+ *  @msg: message array to process
+ *
+ *  This function will read an incoming connect header and reply with the
+ *  appropriate message.  It will return a value indicating the number of
+ *  data DWORDs on success, or will return a negative value on failure.
+ **/
+static s32 fm10k_mbx_process_connect(struct fm10k_hw *hw,
+				     struct fm10k_mbx_info *mbx)
+{
+	const enum fm10k_mbx_state state = mbx->state;
+	const u32 *hdr = &mbx->mbx_hdr;
+	u16 size, head;
+
+	/* we will need to pull all of the fields for verification */
+	size = FM10K_MSG_HDR_FIELD_GET(*hdr, CONNECT_SIZE);
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, HEAD);
+
+	switch (state) {
+	case FM10K_STATE_DISCONNECT:
+	case FM10K_STATE_OPEN:
+		/* reset any in-progress work */
+		fm10k_mbx_connect_reset(mbx);
+		break;
+	case FM10K_STATE_CONNECT:
+		/* we cannot exit connect until the size is good */
+		if (size > mbx->rx.size) {
+			mbx->max_size = mbx->rx.size - 1;
+		} else {
+			/* record the remote system requesting connection */
+			mbx->state = FM10K_STATE_OPEN;
+
+			fm10k_mbx_update_max_size(mbx, size);
+		}
+		break;
+	default:
+		break;
+	}
+
+	/* align our tail index to remote head index */
+	mbx->tail = head;
+
+	return fm10k_mbx_create_reply(hw, mbx, head);
+}
+
+/**
+ *  fm10k_mbx_process_data - Process data header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will read an incoming data header and reply with the
+ *  appropriate message.  It will return a value indicating the number of
+ *  data DWORDs on success, or will return a negative value on failure.
+ **/
+static s32 fm10k_mbx_process_data(struct fm10k_hw *hw,
+				  struct fm10k_mbx_info *mbx)
+{
+	const u32 *hdr = &mbx->mbx_hdr;
+	u16 head, tail;
+	s32 err;
+
+	/* we will need to pull all of the fields for verification */
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, HEAD);
+	tail = FM10K_MSG_HDR_FIELD_GET(*hdr, TAIL);
+
+	/* if we are in connect just update our data and go */
+	if (mbx->state == FM10K_STATE_CONNECT) {
+		mbx->tail = head;
+		mbx->state = FM10K_STATE_OPEN;
+	}
+
+	/* abort on message size errors */
+	err = fm10k_mbx_push_tail(hw, mbx, tail);
+	if (err < 0)
+		return err;
+
+	/* verify the checksum on the incoming data */
+	err = fm10k_mbx_verify_remote_crc(mbx);
+	if (err)
+		return err;
+
+	/* process messages if we have received any */
+	fm10k_mbx_dequeue_rx(hw, mbx);
+
+	return fm10k_mbx_create_reply(hw, mbx, head);
+}
+
+/**
+ *  fm10k_mbx_process_disconnect - Process disconnect header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will read an incoming disconnect header and reply with the
+ *  appropriate message.  It will return a value indicating the number of
+ *  data DWORDs on success, or will return a negative value on failure.
+ **/
+static s32 fm10k_mbx_process_disconnect(struct fm10k_hw *hw,
+					struct fm10k_mbx_info *mbx)
+{
+	const enum fm10k_mbx_state state = mbx->state;
+	const u32 *hdr = &mbx->mbx_hdr;
+	u16 head, tail;
+	s32 err;
+
+	/* we will need to pull all of the fields for verification */
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, HEAD);
+	tail = FM10K_MSG_HDR_FIELD_GET(*hdr, TAIL);
+
+	/* We should not be receiving disconnect if Rx is incomplete */
+	if (mbx->pushed)
+		return FM10K_MBX_ERR_TAIL;
+
+	/* we have already verified mbx->head == tail so we know this is 0 */
+	mbx->head_len = 0;
+
+	/* verify the checksum on the incoming header is correct */
+	err = fm10k_mbx_verify_remote_crc(mbx);
+	if (err)
+		return err;
+
+	switch (state) {
+	case FM10K_STATE_DISCONNECT:
+	case FM10K_STATE_OPEN:
+		/* state doesn't change if we still have work to do */
+		if (!fm10k_mbx_tx_complete(mbx))
+			break;
+
+		/* verify the head indicates we completed all transmits */
+		if (head != mbx->tail)
+			return FM10K_MBX_ERR_HEAD;
+
+		/* reset any in-progress work */
+		fm10k_mbx_connect_reset(mbx);
+		break;
+	default:
+		break;
+	}
+
+	return fm10k_mbx_create_reply(hw, mbx, head);
+}
+
+/**
+ *  fm10k_mbx_process_error - Process error header
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will read an incoming error header and reply with the
+ *  appropriate message.  It will return a value indicating the number of
+ *  data DWORDs on success, or will return a negative value on failure.
+ **/
+static s32 fm10k_mbx_process_error(struct fm10k_hw *hw,
+				   struct fm10k_mbx_info *mbx)
+{
+	const u32 *hdr = &mbx->mbx_hdr;
+	s32 err_no;
+	u16 head;
+
+	/* we will need to pull all of the fields for verification */
+	head = FM10K_MSG_HDR_FIELD_GET(*hdr, HEAD);
+
+	/* we only have lower 10 bits of error number os add upper bits */
+	err_no = FM10K_MSG_HDR_FIELD_GET(*hdr, ERR_NO);
+	err_no |= ~FM10K_MSG_HDR_MASK(ERR_NO);
+
+	switch (mbx->state) {
+	case FM10K_STATE_OPEN:
+	case FM10K_STATE_DISCONNECT:
+		/* flush any uncompleted work */
+		fm10k_mbx_reset_work(mbx);
+
+		/* reset CRC seeds */
+		mbx->local = FM10K_MBX_CRC_SEED;
+		mbx->remote = FM10K_MBX_CRC_SEED;
+
+		/* reset tail index and size to prepare for reconnect */
+		mbx->tail = head;
+
+		/* if open then reset max_size and go back to connect */
+		if (mbx->state == FM10K_STATE_OPEN) {
+			mbx->state = FM10K_STATE_CONNECT;
+			break;
+		}
+
+		/* send a connect message to get data flowing again */
+		fm10k_mbx_create_connect_hdr(mbx);
+		return 0;
+	default:
+		break;
+	}
+
+	return fm10k_mbx_create_reply(hw, mbx, mbx->tail);
+}
+
+/**
+ *  fm10k_mbx_process - Process mailbox interrupt
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will process incoming mailbox events and generate mailbox
+ *  replies.  It will return a value indicating the number of DWORDs
+ *  transmitted excluding header on success or a negative value on error.
+ **/
+static s32 fm10k_mbx_process(struct fm10k_hw *hw,
+			     struct fm10k_mbx_info *mbx)
+{
+	s32 err;
+
+	/* we do not read mailbox if closed */
+	if (mbx->state == FM10K_STATE_CLOSED)
+		return 0;
+
+	/* copy data from mailbox */
+	err = fm10k_mbx_read(hw, mbx);
+	if (err)
+		return err;
+
+	/* validate type, source, and destination */
+	err = fm10k_mbx_validate_msg_hdr(mbx);
+	if (err < 0)
+		goto msg_err;
+
+	switch (FM10K_MSG_HDR_FIELD_GET(mbx->mbx_hdr, TYPE)) {
+	case FM10K_MSG_CONNECT:
+		err = fm10k_mbx_process_connect(hw, mbx);
+		break;
+	case FM10K_MSG_DATA:
+		err = fm10k_mbx_process_data(hw, mbx);
+		break;
+	case FM10K_MSG_DISCONNECT:
+		err = fm10k_mbx_process_disconnect(hw, mbx);
+		break;
+	case FM10K_MSG_ERROR:
+		err = fm10k_mbx_process_error(hw, mbx);
+		break;
+	default:
+		err = FM10K_MBX_ERR_TYPE;
+		break;
+	}
+
+msg_err:
+	/* notify partner of errors on our end */
+	if (err < 0)
+		fm10k_mbx_create_error_msg(mbx, err);
+
+	/* copy data from mailbox */
+	fm10k_mbx_write(hw, mbx);
+
+	return err;
+}
+
+/**
+ *  fm10k_mbx_disconnect - Shutdown mailbox connection
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will shut down the mailbox.  It places the mailbox first
+ *  in the disconnect state, it then allows up to a predefined timeout for
+ *  the mailbox to transition to close on its own.  If this does not occur
+ *  then the mailbox will be forced into the closed state.
+ *
+ *  Any mailbox transactions not completed before calling this function
+ *  are not guaranteed to complete and may be dropped.
+ **/
+static void fm10k_mbx_disconnect(struct fm10k_hw *hw,
+				 struct fm10k_mbx_info *mbx)
+{
+	int timeout = mbx->timeout ? FM10K_MBX_DISCONNECT_TIMEOUT : 0;
+
+	/* Place mbx in ready to disconnect state */
+	mbx->state = FM10K_STATE_DISCONNECT;
+
+	/* trigger interrupt to start shutdown process */
+	fm10k_write_reg(hw, mbx->mbx_reg, FM10K_MBX_REQ |
+					  FM10K_MBX_INTERRUPT_DISABLE);
+	do {
+		udelay(FM10K_MBX_POLL_DELAY);
+		mbx->ops.process(hw, mbx);
+		timeout -= FM10K_MBX_POLL_DELAY;
+	} while ((timeout > 0) && (mbx->state != FM10K_STATE_CLOSED));
+
+	/* in case we didn't close just force the mailbox into shutdown */
+	fm10k_mbx_connect_reset(mbx);
+	fm10k_mbx_update_max_size(mbx, 0);
+
+	fm10k_write_reg(hw, mbx->mbmem_reg, 0);
+}
+
+/**
+ *  fm10k_mbx_connect - Start mailbox connection
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *
+ *  This function will initiate a mailbox connection.  It will populate the
+ *  mailbox with a broadcast connect message and then initialize the lock.
+ *  This is safe since the connect message is a single DWORD so the mailbox
+ *  transaction is guaranteed to be atomic.
+ *
+ *  This function will return an error if the mailbox has not been initiated
+ *  or is currently in use.
+ **/
+static s32 fm10k_mbx_connect(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx)
+{
+	/* we cannot connect an uninitialized mailbox */
+	if (!mbx->rx.buffer)
+		return FM10K_MBX_ERR_NO_SPACE;
+
+	/* we cannot connect an already connected mailbox */
+	if (mbx->state != FM10K_STATE_CLOSED)
+		return FM10K_MBX_ERR_BUSY;
+
+	/* mailbox timeout can now become active */
+	mbx->timeout = FM10K_MBX_INIT_TIMEOUT;
+
+	/* Place mbx in ready to connect state */
+	mbx->state = FM10K_STATE_CONNECT;
+
+	/* initialize header of remote mailbox */
+	fm10k_mbx_create_disconnect_hdr(mbx);
+	fm10k_write_reg(hw, mbx->mbmem_reg ^ mbx->mbmem_len, mbx->mbx_hdr);
+
+	/* enable interrupt and notify other party of new message */
+	mbx->mbx_lock = FM10K_MBX_REQ_INTERRUPT | FM10K_MBX_ACK_INTERRUPT |
+			FM10K_MBX_INTERRUPT_ENABLE;
+
+	/* generate and load connect header into mailbox */
+	fm10k_mbx_create_connect_hdr(mbx);
+	fm10k_mbx_write(hw, mbx);
+
+	return 0;
+}
+
+/**
  *  fm10k_mbx_validate_handlers - Validate layout of message parsing data
  *  @msg_data: handlers for mailbox events
  *
@@ -806,6 +1499,83 @@ static s32 fm10k_mbx_register_handlers(struct fm10k_mbx_info *mbx,
 }
 
 /**
+ *  fm10k_pfvf_mbx_init - Initialize mailbox memory for PF/VF mailbox
+ *  @hw: pointer to hardware structure
+ *  @mbx: pointer to mailbox
+ *  @msg_data: handlers for mailbox events
+ *  @id: ID reference for PF as it supports up to 64 PF/VF mailboxes
+ *
+ *  This function initializes the mailbox for use.  It will split the
+ *  buffer provided an use that th populate both the Tx and Rx FIFO by
+ *  evenly splitting it.  In order to allow for easy masking of head/tail
+ *  the value reported in size must be a power of 2 and is reported in
+ *  DWORDs, not bytes.  Any invalid values will cause the mailbox to return
+ *  error.
+ **/
+s32 fm10k_pfvf_mbx_init(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx,
+			const struct fm10k_msg_data *msg_data, u8 id)
+{
+	/* initialize registers */
+	switch (hw->mac.type) {
+	case fm10k_mac_pf:
+		/* there are only 64 VF <-> PF mailboxes */
+		if (id < 64) {
+			mbx->mbx_reg = FM10K_MBX(id);
+			mbx->mbmem_reg = FM10K_MBMEM_VF(id, 0);
+			break;
+		}
+		/* fallthough */
+	default:
+		return FM10K_MBX_ERR_NO_MBX;
+	}
+
+	/* start out in closed state */
+	mbx->state = FM10K_STATE_CLOSED;
+
+	/* validate layout of handlers before assigning them */
+	if (fm10k_mbx_validate_handlers(msg_data))
+		return FM10K_ERR_PARAM;
+
+	/* initialize the message handlers */
+	mbx->msg_data = msg_data;
+
+	/* start mailbox as timed out and let the reset_hw call
+	 * set the timeout value to begin communications
+	 */
+	mbx->timeout = 0;
+	mbx->udelay = FM10K_MBX_INIT_DELAY;
+
+	/* initalize tail and head */
+	mbx->tail = 1;
+	mbx->head = 1;
+
+	/* initialize CRC seeds */
+	mbx->local = FM10K_MBX_CRC_SEED;
+	mbx->remote = FM10K_MBX_CRC_SEED;
+
+	/* Split buffer for use by Tx/Rx FIFOs */
+	mbx->max_size = FM10K_MBX_MSG_MAX_SIZE;
+	mbx->mbmem_len = FM10K_VFMBMEM_VF_XOR;
+
+	/* initialize the FIFOs, sizes are in 4 byte increments */
+	fm10k_fifo_init(&mbx->tx, mbx->buffer, FM10K_MBX_TX_BUFFER_SIZE);
+	fm10k_fifo_init(&mbx->rx, &mbx->buffer[FM10K_MBX_TX_BUFFER_SIZE],
+			FM10K_MBX_RX_BUFFER_SIZE);
+
+	/* initialize function pointers */
+	mbx->ops.connect = fm10k_mbx_connect;
+	mbx->ops.disconnect = fm10k_mbx_disconnect;
+	mbx->ops.rx_ready = fm10k_mbx_rx_ready;
+	mbx->ops.tx_ready = fm10k_mbx_tx_ready;
+	mbx->ops.tx_complete = fm10k_mbx_tx_complete;
+	mbx->ops.enqueue_tx = fm10k_mbx_enqueue_tx;
+	mbx->ops.process = fm10k_mbx_process;
+	mbx->ops.register_handlers = fm10k_mbx_register_handlers;
+
+	return 0;
+}
+
+/**
  *  fm10k_sm_mbx_create_data_hdr - Generate a mailbox header for local FIFO
  *  @mbx: pointer to mailbox
  *
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
index f5ba86f..0419a7f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
@@ -99,6 +99,39 @@ enum fm10k_mbx_state {
 	FM10K_STATE_DISCONNECT,
 };
 
+/* PF/VF Mailbox header format
+ *    3			  2		      1			  0
+ *  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |        Size/Err_no/CRC        | Rsvd0 | Head  | Tail  | Type  |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ * The layout above describes the format for the header used in the PF/VF
+ * mailbox.  The header is broken out into the following fields:
+ * Type: There are 4 supported message types
+ *		0x8: Data header - used to transport message data
+ *		0xC: Connect header - used to establish connection
+ *		0xD: Disconnect header - used to tear down a connection
+ *		0xE: Error header - used to address message exceptions
+ * Tail: Tail index for local FIFO
+ *		Tail index actually consists of two parts.  The MSB of
+ *		the head is a loop tracker, it is 0 on an even numbered
+ *		loop through the FIFO, and 1 on the odd numbered loops.
+ *		To get the actual mailbox offset based on the tail it
+ *		is necessary to add bit 3 to bit 0 and clear bit 3.  This
+ *		gives us a valid range of 0x1 - 0xE.
+ * Head: Head index for remote FIFO
+ *		Head index follows the same format as the tail index.
+ * Rsvd0: Reserved 0 portion of the mailbox header
+ * CRC: Running CRC for all data since connect plus current message header
+ * Size: Maximum message size - Applies only to connect headers
+ *		The maximum message size is provided during connect to avoid
+ *		jamming the mailbox with messages that do not fit.
+ * Err_no: Error number - Applies only to error headers
+ *		The error number provides a indication of the type of error
+ *		experienced.
+ */
+
 /* macros for retriving and setting header values */
 #define FM10K_MSG_HDR_MASK(name) \
 	((0x1u << FM10K_MSG_##name##_SIZE) - 1)
@@ -107,6 +140,35 @@ enum fm10k_mbx_state {
 #define FM10K_MSG_HDR_FIELD_GET(value, name) \
 	((u16)((value) >> FM10K_MSG_##name##_SHIFT) & FM10K_MSG_HDR_MASK(name))
 
+/* offsets shared between all headers */
+#define FM10K_MSG_TYPE_SHIFT			0
+#define FM10K_MSG_TYPE_SIZE			4
+#define FM10K_MSG_TAIL_SHIFT			4
+#define FM10K_MSG_TAIL_SIZE			4
+#define FM10K_MSG_HEAD_SHIFT			8
+#define FM10K_MSG_HEAD_SIZE			4
+#define FM10K_MSG_RSVD0_SHIFT			12
+#define FM10K_MSG_RSVD0_SIZE			4
+
+/* offsets for data/disconnect headers */
+#define FM10K_MSG_CRC_SHIFT			16
+#define FM10K_MSG_CRC_SIZE			16
+
+/* offsets for connect headers */
+#define FM10K_MSG_CONNECT_SIZE_SHIFT		16
+#define FM10K_MSG_CONNECT_SIZE_SIZE		16
+
+/* offsets for error headers */
+#define FM10K_MSG_ERR_NO_SHIFT			16
+#define FM10K_MSG_ERR_NO_SIZE			16
+
+enum fm10k_msg_type {
+	FM10K_MSG_DATA			= 0x8,
+	FM10K_MSG_CONNECT		= 0xC,
+	FM10K_MSG_DISCONNECT		= 0xD,
+	FM10K_MSG_ERROR			= 0xE,
+};
+
 /* HNI/SM Mailbox FIFO format
  *    3                   2                   1                   0
  *  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
@@ -237,6 +299,8 @@ struct fm10k_mbx_info {
 	u32 buffer[FM10K_MBX_BUFFER_SIZE];
 };
 
+s32 fm10k_pfvf_mbx_init(struct fm10k_hw *, struct fm10k_mbx_info *,
+			const struct fm10k_msg_data *, u8);
 s32 fm10k_sm_mbx_init(struct fm10k_hw *, struct fm10k_mbx_info *,
 		      const struct fm10k_msg_data *);
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 23/29] fm10k: Add support for VF
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (21 preceding siblings ...)
  2014-09-18 22:39 ` [net-next PATCH 22/29] fm10k: Add support for PF <-> VF mailbox Alexander Duyck
@ 2014-09-18 22:39 ` Alexander Duyck
  2014-09-18 22:39 ` [net-next PATCH 24/29] fm10k: Add support for SR-IOV to PF core files Alexander Duyck
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:39 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch provides the functions necessary to configure the VF making use
of the same API pointers as the PF.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile        |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h         |    4 
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |  114 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c     |    4 
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  |   15 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     |  102 ++++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h    |    9 
 drivers/net/ethernet/intel/fm10k/fm10k_vf.c      |  523 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_vf.h      |   68 +++
 9 files changed, 837 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.c
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.h

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index b72815e..f70893a 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -28,5 +28,5 @@
 obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
-	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o \
+	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o fm10k_vf.o \
 	      fm10k_mbx.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 82efaed..0c4a873 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -28,6 +28,7 @@
 #include <linux/pci.h>
 
 #include "fm10k_pf.h"
+#include "fm10k_vf.h"
 
 #define FM10K_MAX_JUMBO_FRAME_SIZE	15358	/* Maximum supported size 15K */
 
@@ -180,12 +181,13 @@ static inline struct netdev_queue *txring_txq(const struct fm10k_ring *ring)
 #define MIN_Q_VECTORS	1
 enum fm10k_non_q_vectors {
 	FM10K_MBX_VECTOR,
+#define NON_Q_VECTORS_VF NON_Q_VECTORS_PF
 	NON_Q_VECTORS_PF
 };
 
 #define NON_Q_VECTORS(hw)	(((hw)->mac.type == fm10k_mac_pf) ? \
 						NON_Q_VECTORS_PF : \
-						0)
+						NON_Q_VECTORS_VF)
 #define MIN_MSIX_COUNT(hw)	(MIN_Q_VECTORS + NON_Q_VECTORS(hw))
 
 struct fm10k_q_vector {
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index 54e8ebd..42beb89 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -102,6 +102,17 @@ static const struct fm10k_stats fm10k_gstrings_stats[] = {
 			 FM10K_NETDEV_STATS_LEN + \
 			 FM10K_QUEUE_STATS_LEN)
 
+static const char fm10k_gstrings_test[][ETH_GSTRING_LEN] = {
+	"Mailbox test (on/offline)"
+};
+
+#define FM10K_TEST_LEN (sizeof(fm10k_gstrings_test) / ETH_GSTRING_LEN)
+
+enum fm10k_self_test_types {
+	FM10K_TEST_MBX,
+	FM10K_TEST_MAX = FM10K_TEST_LEN
+};
+
 static void fm10k_get_strings(struct net_device *dev, u32 stringset,
 			      u8 *data)
 {
@@ -109,6 +120,10 @@ static void fm10k_get_strings(struct net_device *dev, u32 stringset,
 	int i;
 
 	switch (stringset) {
+	case ETH_SS_TEST:
+		memcpy(data, *fm10k_gstrings_test,
+		       FM10K_TEST_LEN * ETH_GSTRING_LEN);
+		break;
 	case ETH_SS_STATS:
 		for (i = 0; i < FM10K_NETDEV_STATS_LEN; i++) {
 			memcpy(p, fm10k_gstrings_net_stats[i].stat_string,
@@ -138,6 +153,8 @@ static void fm10k_get_strings(struct net_device *dev, u32 stringset,
 static int fm10k_get_sset_count(struct net_device *dev, int sset)
 {
 	switch (sset) {
+	case ETH_SS_TEST:
+		return FM10K_TEST_LEN;
 	case ETH_SS_STATS:
 		return FM10K_STATS_LEN;
 	default:
@@ -288,6 +305,28 @@ static void fm10k_get_regs(struct net_device *netdev,
 			*(buff++) = fm10k_read_reg(hw, FM10K_ITR(i));
 
 		break;
+	case fm10k_mac_vf:
+		/* General VF registers */
+		*(buff++) = fm10k_read_reg(hw, FM10K_VFCTRL);
+		*(buff++) = fm10k_read_reg(hw, FM10K_VFINT_MAP);
+		*(buff++) = fm10k_read_reg(hw, FM10K_VFSYSTIME);
+
+		/* Interrupt Throttling Registers */
+		for (i = 0; i < 8; i++)
+			*(buff++) = fm10k_read_reg(hw, FM10K_VFITR(i));
+
+		fm10k_get_reg_vsi(hw, buff, 0);
+		buff += FM10K_REGS_LEN_VSI;
+
+		for (i = 0; i < FM10K_MAX_QUEUES_POOL; i++) {
+			if (i < hw->mac.max_queues)
+				fm10k_get_reg_q(hw, buff, i);
+			else
+				memset(buff, 0, sizeof(u32) * FM10K_REGS_LEN_Q);
+			buff += FM10K_REGS_LEN_Q;
+		}
+
+		break;
 	default:
 		return;
 	}
@@ -296,6 +335,8 @@ static void fm10k_get_regs(struct net_device *netdev,
 /* If function above adds more registers these define need to be updated */
 #define FM10K_REGS_LEN_PF \
 (162 + (65 * FM10K_REGS_LEN_VSI) + (FM10K_MAX_QUEUES_PF * FM10K_REGS_LEN_Q))
+#define FM10K_REGS_LEN_VF \
+(11 + FM10K_REGS_LEN_VSI + (FM10K_MAX_QUEUES_POOL * FM10K_REGS_LEN_Q))
 
 static int fm10k_get_regs_len(struct net_device *netdev)
 {
@@ -305,6 +346,8 @@ static int fm10k_get_regs_len(struct net_device *netdev)
 	switch (hw->mac.type) {
 	case fm10k_mac_pf:
 		return FM10K_REGS_LEN_PF * sizeof(u32);
+	case fm10k_mac_vf:
+		return FM10K_REGS_LEN_VF * sizeof(u32);
 	default:
 		return 0;
 	}
@@ -734,6 +777,76 @@ static int fm10k_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 	return ret;
 }
 
+static int fm10k_mbx_test(struct fm10k_intfc *interface, u64 *data)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 attr_flag, test_msg[6];
+	unsigned long timeout;
+	int err;
+
+	/* For now this is a VF only feature */
+	if (hw->mac.type != fm10k_mac_vf)
+		return 0;
+
+	/* loop through both nested and unnested attribute types */
+	for (attr_flag = (1 << FM10K_TEST_MSG_UNSET);
+	     attr_flag < (1 << (2 * FM10K_TEST_MSG_NESTED));
+	     attr_flag += attr_flag) {
+		/* generate message to be tested */
+		fm10k_tlv_msg_test_create(test_msg, attr_flag);
+
+		fm10k_mbx_lock(interface);
+		mbx->test_result = FM10K_NOT_IMPLEMENTED;
+		err = mbx->ops.enqueue_tx(hw, mbx, test_msg);
+		fm10k_mbx_unlock(interface);
+
+		/* wait up to 1 second for response */
+		timeout = jiffies + HZ;
+		do {
+			if (err < 0)
+				goto err_out;
+
+			usleep_range(500, 1000);
+
+			fm10k_mbx_lock(interface);
+			mbx->ops.process(hw, mbx);
+			fm10k_mbx_unlock(interface);
+
+			err = mbx->test_result;
+			if (!err)
+				break;
+		} while (time_is_after_jiffies(timeout));
+
+		/* reporting errors */
+		if (err)
+			goto err_out;
+	}
+
+err_out:
+	*data = err < 0 ? (attr_flag) : (err > 0);
+	return err;
+}
+
+static void fm10k_self_test(struct net_device *dev,
+			    struct ethtool_test *eth_test, u64 *data)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+
+	memset(data, 0, sizeof(*data) * FM10K_TEST_LEN);
+
+	if (FM10K_REMOVED(hw)) {
+		netif_err(interface, drv, dev,
+			  "Interface removed - test blocked\n");
+		eth_test->flags |= ETH_TEST_FL_FAILED;
+		return;
+	}
+
+	if (fm10k_mbx_test(interface, &data[FM10K_TEST_MBX]))
+		eth_test->flags |= ETH_TEST_FL_FAILED;
+}
+
 static u32 fm10k_get_reta_size(struct net_device *netdev)
 {
 	return FM10K_RETA_SIZE * FM10K_RETA_ENTRIES_PER_REG;
@@ -911,6 +1024,7 @@ static const struct ethtool_ops fm10k_ethtool_ops = {
 	.set_rxnfc		= fm10k_set_rxnfc,
 	.get_regs               = fm10k_get_regs,
 	.get_regs_len           = fm10k_get_regs_len,
+	.self_test		= fm10k_self_test,
 	.get_rxfh_indir_size	= fm10k_get_reta_size,
 	.get_rxfh_key_size	= fm10k_get_rssrk_size,
 	.get_rxfh		= fm10k_get_rssh,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
index a6a66fd..14a4ea7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
@@ -1517,6 +1517,10 @@ s32 fm10k_pfvf_mbx_init(struct fm10k_hw *hw, struct fm10k_mbx_info *mbx,
 {
 	/* initialize registers */
 	switch (hw->mac.type) {
+	case fm10k_mac_vf:
+		mbx->mbx_reg = FM10K_VFMBX;
+		mbx->mbmem_reg = FM10K_VFMBMEM(FM10K_VFMBMEM_VF_XOR);
+		break;
 	case fm10k_mac_pf:
 		/* there are only 64 VF <-> PF mailboxes */
 		if (id < 64) {
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index af30e25..057fcb0 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -972,6 +972,21 @@ void fm10k_restore_rx_state(struct fm10k_intfc *interface)
 	int xcast_mode;
 	u16 vid, glort;
 
+	/* restore our address if perm_addr is set */
+	if (hw->mac.type == fm10k_mac_vf) {
+		if (is_valid_ether_addr(hw->mac.perm_addr)) {
+			ether_addr_copy(hw->mac.addr, hw->mac.perm_addr);
+			ether_addr_copy(netdev->perm_addr, hw->mac.perm_addr);
+			ether_addr_copy(netdev->dev_addr, hw->mac.perm_addr);
+			netdev->addr_assign_type &= ~NET_ADDR_RANDOM;
+		}
+
+		if (hw->mac.vlan_override)
+			netdev->features &= ~NETIF_F_HW_VLAN_CTAG_RX;
+		else
+			netdev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+	}
+
 	/* record glort for this interface */
 	glort = interface->glort;
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index c61241a..ec34438 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -25,6 +25,7 @@
 
 static const struct fm10k_info *fm10k_info_tbl[] = {
 	[fm10k_device_pf] = &fm10k_pf_info,
+	[fm10k_device_vf] = &fm10k_vf_info,
 };
 
 /**
@@ -38,6 +39,7 @@ static const struct fm10k_info *fm10k_info_tbl[] = {
  */
 static const struct pci_device_id fm10k_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF), fm10k_device_pf },
+	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_VF), fm10k_device_vf },
 	/* required last entry */
 	{ 0, }
 };
@@ -805,6 +807,28 @@ static irqreturn_t fm10k_msix_clean_rings(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t fm10k_msix_mbx_vf(int irq, void *data)
+{
+	struct fm10k_intfc *interface = data;
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+
+	/* re-enable mailbox interrupt and indicate 20us delay */
+	fm10k_write_reg(hw, FM10K_VFITR(FM10K_MBX_VECTOR),
+			FM10K_ITR_ENABLE | FM10K_MBX_INT_DELAY);
+
+	/* service upstream mailbox */
+	if (fm10k_mbx_trylock(interface)) {
+		mbx->ops.process(hw, mbx);
+		fm10k_mbx_unlock(interface);
+	}
+
+	hw->mac.get_host_state = 1;
+	fm10k_service_event_schedule(interface);
+
+	return IRQ_HANDLED;
+}
+
 #define FM10K_ERR_MSG(type) case (type): error = #type; break
 static void fm10k_print_fault(struct fm10k_intfc *interface, int type,
 			      struct fm10k_fault *fault)
@@ -996,6 +1020,8 @@ void fm10k_mbx_free_irq(struct fm10k_intfc *interface)
 				FM10K_EIMR_DISABLE(VFLR) |
 				FM10K_EIMR_DISABLE(MAXHOLDTIME));
 		itr_reg = FM10K_ITR(FM10K_MBX_VECTOR);
+	} else {
+		itr_reg = FM10K_VFITR(FM10K_MBX_VECTOR);
 	}
 
 	fm10k_write_reg(hw, itr_reg, FM10K_ITR_MASK_SET);
@@ -1003,6 +1029,33 @@ void fm10k_mbx_free_irq(struct fm10k_intfc *interface)
 	free_irq(entry->vector, interface);
 }
 
+static s32 fm10k_mbx_mac_addr(struct fm10k_hw *hw, u32 **results,
+			      struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	bool vlan_override = hw->mac.vlan_override;
+	u16 default_vid = hw->mac.default_vid;
+	s32 err;
+
+	err = fm10k_msg_mac_vlan_vf(hw, results, mbx);
+	if (err)
+		return err;
+
+	/* MAC was changed so we need reset */
+	if (is_valid_ether_addr(hw->mac.perm_addr) &&
+	    memcmp(hw->mac.perm_addr, hw->mac.addr, ETH_ALEN))
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+
+	/* VLAN override was changed, or default VLAN changed */
+	if ((vlan_override != hw->mac.vlan_override) ||
+	    (default_vid != hw->mac.default_vid))
+		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
+
+	return 0;
+}
+
 /* generic error handler for mailbox issues */
 static s32 fm10k_mbx_error(struct fm10k_hw *hw, u32 **results,
 			   struct fm10k_mbx_info *mbx)
@@ -1018,6 +1071,46 @@ static s32 fm10k_mbx_error(struct fm10k_hw *hw, u32 **results,
 	return 0;
 }
 
+static const struct fm10k_msg_data vf_mbx_data[] = {
+	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
+	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_mbx_mac_addr),
+	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_msg_lport_state_vf),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_mbx_error),
+};
+
+static int fm10k_mbx_request_irq_vf(struct fm10k_intfc *interface)
+{
+	struct msix_entry *entry = &interface->msix_entries[FM10K_MBX_VECTOR];
+	struct net_device *dev = interface->netdev;
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	/* Use timer0 for interrupt moderation on the mailbox */
+	u32 itr = FM10K_INT_MAP_TIMER0 | entry->entry;
+
+	/* register mailbox handlers */
+	err = hw->mbx.ops.register_handlers(&hw->mbx, vf_mbx_data);
+	if (err)
+		return err;
+
+	/* request the IRQ */
+	err = request_irq(entry->vector, fm10k_msix_mbx_vf, 0,
+			  dev->name, interface);
+	if (err) {
+		netif_err(interface, probe, dev,
+			  "request_irq for msix_mbx failed: %d\n", err);
+		return err;
+	}
+
+	/* map all of the interrupt sources */
+	fm10k_write_reg(hw, FM10K_VFINT_MAP, itr);
+
+	/* enable interrupt */
+	fm10k_write_reg(hw, FM10K_VFITR(entry->entry), FM10K_ITR_ENABLE);
+
+	return 0;
+}
+
 static s32 fm10k_lport_map(struct fm10k_hw *hw, u32 **results,
 			   struct fm10k_mbx_info *mbx)
 {
@@ -1141,7 +1234,10 @@ int fm10k_mbx_request_irq(struct fm10k_intfc *interface)
 	int err;
 
 	/* enable Mailbox cause */
-	err = fm10k_mbx_request_irq_pf(interface);
+	if (hw->mac.type == fm10k_mac_pf)
+		err = fm10k_mbx_request_irq_pf(interface);
+	else
+		err = fm10k_mbx_request_irq_vf(interface);
 
 	/* connect mailbox */
 	if (!err)
@@ -1219,7 +1315,9 @@ int fm10k_qv_request_irq(struct fm10k_intfc *interface)
 		}
 
 		/* Assign ITR register to q_vector */
-		q_vector->itr = &interface->uc_addr[FM10K_ITR(entry->entry)];
+		q_vector->itr = (hw->mac.type == fm10k_mac_pf) ?
+				&interface->uc_addr[FM10K_ITR(entry->entry)] :
+				&interface->uc_addr[FM10K_VFITR(entry->entry)];
 
 		/* request the IRQ */
 		err = request_irq(entry->vector, &fm10k_msix_clean_rings, 0,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index eda0c7c..cc1df60 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -351,6 +351,13 @@ struct fm10k_hw;
 #define FM10K_QUEUE_DISABLE_TIMEOUT		100
 #define FM10K_RESET_TIMEOUT			100
 
+/* VF registers */
+#define FM10K_VFCTRL		0x00000
+#define FM10K_VFCTRL_RST			0x00000008
+#define FM10K_VFINT_MAP		0x00030
+#define FM10K_VFSYSTIME		0x00040
+#define FM10K_VFITR(_n)		((_n) + 0x00060)
+
 enum fm10k_int_source {
 	fm10k_int_Mailbox	= 0,
 	fm10k_int_PCIeFault	= 1,
@@ -522,6 +529,7 @@ struct fm10k_mac_ops {
 enum fm10k_mac_type {
 	fm10k_mac_unknown = 0,
 	fm10k_mac_pf,
+	fm10k_mac_vf,
 	fm10k_num_macs
 };
 
@@ -561,6 +569,7 @@ enum fm10k_xcast_modes {
 
 enum fm10k_devices {
 	fm10k_device_pf,
+	fm10k_device_vf,
 };
 
 struct fm10k_info {
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_vf.c b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
new file mode 100644
index 0000000..25c23fc
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
@@ -0,0 +1,523 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k_vf.h"
+
+/**
+ *  fm10k_stop_hw_vf - Stop Tx/Rx units
+ *  @hw: pointer to hardware structure
+ *
+ **/
+static s32 fm10k_stop_hw_vf(struct fm10k_hw *hw)
+{
+	u8 *perm_addr = hw->mac.perm_addr;
+	u32 bal = 0, bah = 0;
+	s32 err;
+	u16 i;
+
+	/* we need to disable the queues before taking further steps */
+	err = fm10k_stop_hw_generic(hw);
+	if (err)
+		return err;
+
+	/* If permenant address is set then we need to restore it */
+	if (is_valid_ether_addr(perm_addr)) {
+		bal = (((u32)perm_addr[3]) << 24) |
+		      (((u32)perm_addr[4]) << 16) |
+		      (((u32)perm_addr[5]) << 8);
+		bah = (((u32)0xFF)	   << 24) |
+		      (((u32)perm_addr[0]) << 16) |
+		      (((u32)perm_addr[1]) << 8) |
+		       ((u32)perm_addr[2]);
+	}
+
+	/* The queues have already been disabled so we just need to
+	 * update their base address registers
+	 */
+	for (i = 0; i < hw->mac.max_queues; i++) {
+		fm10k_write_reg(hw, FM10K_TDBAL(i), bal);
+		fm10k_write_reg(hw, FM10K_TDBAH(i), bah);
+		fm10k_write_reg(hw, FM10K_RDBAL(i), bal);
+		fm10k_write_reg(hw, FM10K_RDBAH(i), bah);
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_reset_hw_vf - VF hardware reset
+ *  @hw: pointer to hardware structure
+ *
+ *  This function should return the hardare to a state similar to the
+ *  one it is in after just being initialized.
+ **/
+static s32 fm10k_reset_hw_vf(struct fm10k_hw *hw)
+{
+	s32 err;
+
+	/* shut down queues we own and reset DMA configuration */
+	err = fm10k_stop_hw_vf(hw);
+	if (err)
+		return err;
+
+	/* Inititate VF reset */
+	fm10k_write_reg(hw, FM10K_VFCTRL, FM10K_VFCTRL_RST);
+
+	/* Flush write and allow 100us for reset to complete */
+	fm10k_write_flush(hw);
+	udelay(FM10K_RESET_TIMEOUT);
+
+	/* Clear reset bit and verify it was cleared */
+	fm10k_write_reg(hw, FM10K_VFCTRL, 0);
+	if (fm10k_read_reg(hw, FM10K_VFCTRL) & FM10K_VFCTRL_RST)
+		err = FM10K_ERR_RESET_FAILED;
+
+	return err;
+}
+
+/**
+ *  fm10k_init_hw_vf - VF hardware initialization
+ *  @hw: pointer to hardware structure
+ *
+ **/
+static s32 fm10k_init_hw_vf(struct fm10k_hw *hw)
+{
+	u32 tqdloc, tqdloc0 = ~fm10k_read_reg(hw, FM10K_TQDLOC(0));
+	s32 err;
+	u16 i;
+
+	/* assume we always have at least 1 queue */
+	for (i = 1; tqdloc0 && (i < FM10K_MAX_QUEUES_POOL); i++) {
+		/* verify the Descriptor cache offsets are increasing */
+		tqdloc = ~fm10k_read_reg(hw, FM10K_TQDLOC(i));
+		if (!tqdloc || (tqdloc == tqdloc0))
+			break;
+
+		/* check to verify the PF doesn't own any of our queues */
+		if (!~fm10k_read_reg(hw, FM10K_TXQCTL(i)) ||
+		    !~fm10k_read_reg(hw, FM10K_RXQCTL(i)))
+			break;
+	}
+
+	/* shut down queues we own and reset DMA configuration */
+	err = fm10k_disable_queues_generic(hw, i);
+	if (err)
+		return err;
+
+	/* record maximum queue count */
+	hw->mac.max_queues = i;
+
+	return 0;
+}
+
+/**
+ *  fm10k_is_slot_appropriate_vf - Indicate appropriate slot for this SKU
+ *  @hw: pointer to hardware structure
+ *
+ *  Looks at the PCIe bus info to confirm whether or not this slot can support
+ *  the necessary bandwidth for this device. Since the VF has no control over
+ *  the "slot" it is in, always indicate that the slot is appropriate.
+ **/
+static bool fm10k_is_slot_appropriate_vf(struct fm10k_hw *hw)
+{
+	return true;
+}
+
+/* This structure defines the attibutes to be parsed below */
+const struct fm10k_tlv_attr fm10k_mac_vlan_msg_attr[] = {
+	FM10K_TLV_ATTR_U32(FM10K_MAC_VLAN_MSG_VLAN),
+	FM10K_TLV_ATTR_BOOL(FM10K_MAC_VLAN_MSG_SET),
+	FM10K_TLV_ATTR_MAC_ADDR(FM10K_MAC_VLAN_MSG_MAC),
+	FM10K_TLV_ATTR_MAC_ADDR(FM10K_MAC_VLAN_MSG_DEFAULT_MAC),
+	FM10K_TLV_ATTR_MAC_ADDR(FM10K_MAC_VLAN_MSG_MULTICAST),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_update_vlan_vf - Update status of VLAN ID in VLAN filter table
+ *  @hw: pointer to hardware structure
+ *  @vid: VLAN ID to add to table
+ *  @vsi: Reserved, should always be 0
+ *  @set: Indicates if this is a set or clear operation
+ *
+ *  This function adds or removes the corresponding VLAN ID from the VLAN
+ *  filter table for this VF.
+ **/
+static s32 fm10k_update_vlan_vf(struct fm10k_hw *hw, u32 vid, u8 vsi, bool set)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[4];
+
+	/* verify the index is not set */
+	if (vsi)
+		return FM10K_ERR_PARAM;
+
+	/* verify upper 4 bits of vid and length are 0 */
+	if ((vid << 16 | vid) >> 28)
+		return FM10K_ERR_PARAM;
+
+	/* encode set bit into the VLAN ID */
+	if (!set)
+		vid |= FM10K_VLAN_CLEAR;
+
+	/* generate VLAN request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_MAC_VLAN);
+	fm10k_tlv_attr_put_u32(msg, FM10K_MAC_VLAN_MSG_VLAN, vid);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_msg_mac_vlan_vf - Read device MAC address from mailbox message
+ *  @hw: pointer to the HW structure
+ *  @results: Attributes for message
+ *  @mbx: unused mailbox data
+ *
+ *  This function should determine the MAC address for the VF
+ **/
+s32 fm10k_msg_mac_vlan_vf(struct fm10k_hw *hw, u32 **results,
+			  struct fm10k_mbx_info *mbx)
+{
+	u8 perm_addr[ETH_ALEN];
+	u16 vid;
+	s32 err;
+
+	/* record MAC address requested */
+	err = fm10k_tlv_attr_get_mac_vlan(
+					results[FM10K_MAC_VLAN_MSG_DEFAULT_MAC],
+					perm_addr, &vid);
+	if (err)
+		return err;
+
+	ether_addr_copy(hw->mac.perm_addr, perm_addr);
+	hw->mac.default_vid = vid & (FM10K_VLAN_TABLE_VID_MAX - 1);
+	hw->mac.vlan_override = !!(vid & FM10K_VLAN_CLEAR);
+
+	return 0;
+}
+
+/**
+ *  fm10k_read_mac_addr_vf - Read device MAC address
+ *  @hw: pointer to the HW structure
+ *
+ *  This function should determine the MAC address for the VF
+ **/
+static s32 fm10k_read_mac_addr_vf(struct fm10k_hw *hw)
+{
+	u8 perm_addr[ETH_ALEN];
+	u32 base_addr;
+
+	base_addr = fm10k_read_reg(hw, FM10K_TDBAL(0));
+
+	/* last byte should be 0 */
+	if (base_addr << 24)
+		return  FM10K_ERR_INVALID_MAC_ADDR;
+
+	perm_addr[3] = (u8)(base_addr >> 24);
+	perm_addr[4] = (u8)(base_addr >> 16);
+	perm_addr[5] = (u8)(base_addr >> 8);
+
+	base_addr = fm10k_read_reg(hw, FM10K_TDBAH(0));
+
+	/* first byte should be all 1's */
+	if ((~base_addr) >> 24)
+		return  FM10K_ERR_INVALID_MAC_ADDR;
+
+	perm_addr[0] = (u8)(base_addr >> 16);
+	perm_addr[1] = (u8)(base_addr >> 8);
+	perm_addr[2] = (u8)(base_addr);
+
+	ether_addr_copy(hw->mac.perm_addr, perm_addr);
+	ether_addr_copy(hw->mac.addr, perm_addr);
+
+	return 0;
+}
+
+/**
+ *  fm10k_update_uc_addr_vf - Update device unicast address
+ *  @hw: pointer to the HW structure
+ *  @glort: unused
+ *  @mac: MAC address to add/remove from table
+ *  @vid: VLAN ID to add/remove from table
+ *  @add: Indicates if this is an add or remove operation
+ *  @flags: flags field to indicate add and secure - unused
+ *
+ *  This function is used to add or remove unicast MAC addresses for
+ *  the VF.
+ **/
+static s32 fm10k_update_uc_addr_vf(struct fm10k_hw *hw, u16 glort,
+				   const u8 *mac, u16 vid, bool add, u8 flags)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[7];
+
+	/* verify VLAN ID is valid */
+	if (vid >= FM10K_VLAN_TABLE_VID_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* verify MAC address is valid */
+	if (!is_valid_ether_addr(mac))
+		return FM10K_ERR_PARAM;
+
+	/* verify we are not locked down on the MAC address */
+	if (is_valid_ether_addr(hw->mac.perm_addr) &&
+	    memcmp(hw->mac.perm_addr, mac, ETH_ALEN))
+		return FM10K_ERR_PARAM;
+
+	/* add bit to notify us if this is a set of clear operation */
+	if (!add)
+		vid |= FM10K_VLAN_CLEAR;
+
+	/* generate VLAN request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_MAC_VLAN);
+	fm10k_tlv_attr_put_mac_vlan(msg, FM10K_MAC_VLAN_MSG_MAC, mac, vid);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_mc_addr_vf - Update device multicast address
+ *  @hw: pointer to the HW structure
+ *  @glort: unused
+ *  @mac: MAC address to add/remove from table
+ *  @vid: VLAN ID to add/remove from table
+ *  @add: Indicates if this is an add or remove operation
+ *
+ *  This function is used to add or remove multicast MAC addresses for
+ *  the VF.
+ **/
+static s32 fm10k_update_mc_addr_vf(struct fm10k_hw *hw, u16 glort,
+				   const u8 *mac, u16 vid, bool add)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[7];
+
+	/* verify VLAN ID is valid */
+	if (vid >= FM10K_VLAN_TABLE_VID_MAX)
+		return FM10K_ERR_PARAM;
+
+	/* verify multicast address is valid */
+	if (!is_multicast_ether_addr(mac))
+		return FM10K_ERR_PARAM;
+
+	/* add bit to notify us if this is a set of clear operation */
+	if (!add)
+		vid |= FM10K_VLAN_CLEAR;
+
+	/* generate VLAN request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_MAC_VLAN);
+	fm10k_tlv_attr_put_mac_vlan(msg, FM10K_MAC_VLAN_MSG_MULTICAST,
+				    mac, vid);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_int_moderator_vf - Request update of interrupt moderator list
+ *  @hw: pointer to hardware structure
+ *
+ *  This function will issue a request to the PF to rescan our MSI-X table
+ *  and to update the interrupt moderator linked list.
+ **/
+static void fm10k_update_int_moderator_vf(struct fm10k_hw *hw)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[1];
+
+	/* generate MSI-X request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_MSIX);
+
+	/* load onto outgoing mailbox */
+	mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/* This structure defines the attibutes to be parsed below */
+const struct fm10k_tlv_attr fm10k_lport_state_msg_attr[] = {
+	FM10K_TLV_ATTR_BOOL(FM10K_LPORT_STATE_MSG_DISABLE),
+	FM10K_TLV_ATTR_U8(FM10K_LPORT_STATE_MSG_XCAST_MODE),
+	FM10K_TLV_ATTR_BOOL(FM10K_LPORT_STATE_MSG_READY),
+	FM10K_TLV_ATTR_LAST
+};
+
+/**
+ *  fm10k_msg_lport_state_vf - Message handler for lport_state message from PF
+ *  @hw: Pointer to hardware structure
+ *  @results: pointer array containing parsed data
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This handler is meant to capture the indication from the PF that we
+ *  are ready to bring up the interface.
+ **/
+s32 fm10k_msg_lport_state_vf(struct fm10k_hw *hw, u32 **results,
+			     struct fm10k_mbx_info *mbx)
+{
+	hw->mac.dglort_map = !results[FM10K_LPORT_STATE_MSG_READY] ?
+			     FM10K_DGLORTMAP_NONE : FM10K_DGLORTMAP_ZERO;
+
+	return 0;
+}
+
+/**
+ *  fm10k_update_lport_state_vf - Update device state in lower device
+ *  @hw: pointer to the HW structure
+ *  @glort: unused
+ *  @count: number of logical ports to enable - unused (always 1)
+ *  @enable: boolean value indicating if this is an enable or disable request
+ *
+ *  Notify the lower device of a state change.  If the lower device is
+ *  enabled we can add filters, if it is disabled all filters for this
+ *  logical port are flushed.
+ **/
+static s32 fm10k_update_lport_state_vf(struct fm10k_hw *hw, u16 glort,
+				       u16 count, bool enable)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[2];
+
+	/* reset glort mask 0 as we have to wait to be enabled */
+	hw->mac.dglort_map = FM10K_DGLORTMAP_NONE;
+
+	/* generate port state request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_LPORT_STATE);
+	if (!enable)
+		fm10k_tlv_attr_put_bool(msg, FM10K_LPORT_STATE_MSG_DISABLE);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_xcast_mode_vf - Request update of multicast mode
+ *  @hw: pointer to hardware structure
+ *  @glort: unused
+ *  @mode: integer value indicating mode being requested
+ *
+ *  This function will attempt to request a higher mode for the port
+ *  so that it can enable either multicast, multicast promiscuous, or
+ *  promiscuous mode of operation.
+ **/
+static s32 fm10k_update_xcast_mode_vf(struct fm10k_hw *hw, u16 glort, u8 mode)
+{
+	struct fm10k_mbx_info *mbx = &hw->mbx;
+	u32 msg[3];
+
+	if (mode > FM10K_XCAST_MODE_NONE)
+		return FM10K_ERR_PARAM;
+	/* generate message requesting to change xcast mode */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_LPORT_STATE);
+	fm10k_tlv_attr_put_u8(msg, FM10K_LPORT_STATE_MSG_XCAST_MODE, mode);
+
+	/* load onto outgoing mailbox */
+	return mbx->ops.enqueue_tx(hw, mbx, msg);
+}
+
+/**
+ *  fm10k_update_hw_stats_vf - Updates hardware related statistics of VF
+ *  @hw: pointer to hardware structure
+ *  @stats: pointer to statistics structure
+ *
+ *  This function collects and aggregates per queue hardware statistics.
+ **/
+static void fm10k_update_hw_stats_vf(struct fm10k_hw *hw,
+				     struct fm10k_hw_stats *stats)
+{
+	fm10k_update_hw_stats_q(hw, stats->q, 0, hw->mac.max_queues);
+}
+
+/**
+ *  fm10k_rebind_hw_stats_vf - Resets base for hardware statistics of VF
+ *  @hw: pointer to hardware structure
+ *  @stats: pointer to the stats structure to update
+ *
+ *  This function resets the base for queue hardware statistics.
+ **/
+static void fm10k_rebind_hw_stats_vf(struct fm10k_hw *hw,
+				     struct fm10k_hw_stats *stats)
+{
+	/* Unbind Queue Statistics */
+	fm10k_unbind_hw_stats_q(stats->q, 0, hw->mac.max_queues);
+
+	/* Reinitialize bases for all stats */
+	fm10k_update_hw_stats_vf(hw, stats);
+}
+
+/**
+ *  fm10k_configure_dglort_map_vf - Configures GLORT entry and queues
+ *  @hw: pointer to hardware structure
+ *  @dglort: pointer to dglort configuration structure
+ *
+ *  Reads the configuration structure contained in dglort_cfg and uses
+ *  that information to then populate a DGLORTMAP/DEC entry and the queues
+ *  to which it has been assigned.
+ **/
+static s32 fm10k_configure_dglort_map_vf(struct fm10k_hw *hw,
+					 struct fm10k_dglort_cfg *dglort)
+{
+	/* verify the dglort pointer */
+	if (!dglort)
+		return FM10K_ERR_PARAM;
+
+	/* stub for now until we determine correct message for this */
+
+	return 0;
+}
+
+static const struct fm10k_msg_data fm10k_msg_data_vf[] = {
+	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
+	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_msg_mac_vlan_vf),
+	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_msg_lport_state_vf),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_tlv_msg_error),
+};
+
+static struct fm10k_mac_ops mac_ops_vf = {
+	.get_bus_info		= &fm10k_get_bus_info_generic,
+	.reset_hw		= &fm10k_reset_hw_vf,
+	.init_hw		= &fm10k_init_hw_vf,
+	.start_hw		= &fm10k_start_hw_generic,
+	.stop_hw		= &fm10k_stop_hw_vf,
+	.is_slot_appropriate	= &fm10k_is_slot_appropriate_vf,
+	.update_vlan		= &fm10k_update_vlan_vf,
+	.read_mac_addr		= &fm10k_read_mac_addr_vf,
+	.update_uc_addr		= &fm10k_update_uc_addr_vf,
+	.update_mc_addr		= &fm10k_update_mc_addr_vf,
+	.update_xcast_mode	= &fm10k_update_xcast_mode_vf,
+	.update_int_moderator	= &fm10k_update_int_moderator_vf,
+	.update_lport_state	= &fm10k_update_lport_state_vf,
+	.update_hw_stats	= &fm10k_update_hw_stats_vf,
+	.rebind_hw_stats	= &fm10k_rebind_hw_stats_vf,
+	.configure_dglort_map	= &fm10k_configure_dglort_map_vf,
+	.get_host_state		= &fm10k_get_host_state_generic,
+};
+
+static s32 fm10k_get_invariants_vf(struct fm10k_hw *hw)
+{
+	fm10k_get_invariants_generic(hw);
+
+	return fm10k_pfvf_mbx_init(hw, &hw->mbx, fm10k_msg_data_vf, 0);
+}
+
+struct fm10k_info fm10k_vf_info = {
+	.mac		= fm10k_mac_vf,
+	.get_invariants	= &fm10k_get_invariants_vf,
+	.mac_ops	= &mac_ops_vf,
+};
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_vf.h b/drivers/net/ethernet/intel/fm10k/fm10k_vf.h
new file mode 100644
index 0000000..8e96ee5
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_vf.h
@@ -0,0 +1,68 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifndef _FM10K_VF_H_
+#define _FM10K_VF_H_
+
+#include "fm10k_type.h"
+#include "fm10k_common.h"
+
+enum fm10k_vf_tlv_msg_id {
+	FM10K_VF_MSG_ID_TEST = 0,	/* msg ID reserved for testing */
+	FM10K_VF_MSG_ID_MSIX,
+	FM10K_VF_MSG_ID_MAC_VLAN,
+	FM10K_VF_MSG_ID_LPORT_STATE,
+	FM10K_VF_MSG_ID_MAX,
+};
+
+enum fm10k_tlv_mac_vlan_attr_id {
+	FM10K_MAC_VLAN_MSG_VLAN,
+	FM10K_MAC_VLAN_MSG_SET,
+	FM10K_MAC_VLAN_MSG_MAC,
+	FM10K_MAC_VLAN_MSG_DEFAULT_MAC,
+	FM10K_MAC_VLAN_MSG_MULTICAST,
+	FM10K_MAC_VLAN_MSG_ID_MAX
+};
+
+enum fm10k_tlv_lport_state_attr_id {
+	FM10K_LPORT_STATE_MSG_DISABLE,
+	FM10K_LPORT_STATE_MSG_XCAST_MODE,
+	FM10K_LPORT_STATE_MSG_READY,
+	FM10K_LPORT_STATE_MSG_MAX
+};
+
+#define FM10K_VF_MSG_MSIX_HANDLER(func) \
+	 FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_MSIX, NULL, func)
+
+s32 fm10k_msg_mac_vlan_vf(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
+extern const struct fm10k_tlv_attr fm10k_mac_vlan_msg_attr[];
+#define FM10K_VF_MSG_MAC_VLAN_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_MAC_VLAN, \
+			  fm10k_mac_vlan_msg_attr, func)
+
+s32 fm10k_msg_lport_state_vf(struct fm10k_hw *, u32 **,
+			     struct fm10k_mbx_info *);
+extern const struct fm10k_tlv_attr fm10k_lport_state_msg_attr[];
+#define FM10K_VF_MSG_LPORT_STATE_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_LPORT_STATE, \
+			  fm10k_lport_state_msg_attr, func)
+
+extern struct fm10k_info fm10k_vf_info;
+#endif /* _FM10K_VF_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 24/29] fm10k: Add support for SR-IOV to PF core files
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (22 preceding siblings ...)
  2014-09-18 22:39 ` [net-next PATCH 23/29] fm10k: Add support for VF Alexander Duyck
@ 2014-09-18 22:39 ` Alexander Duyck
  2014-09-18 22:39 ` [net-next PATCH 25/29] fm10k: Add support for SR-IOV to driver Alexander Duyck
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:39 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This change adds a set of functions to fm10k_pf.c which allows for
configuring the VF via a set of standardized TLV messages.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c   |  808 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.h   |    9 
 drivers/net/ethernet/intel/fm10k/fm10k_type.h |   58 ++
 3 files changed, 874 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 8da382c..0b6ce10 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -19,6 +19,7 @@
  */
 
 #include "fm10k_pf.h"
+#include "fm10k_vf.h"
 
 /**
  *  fm10k_reset_hw_pf - PF hardware reset
@@ -75,6 +76,19 @@ static s32 fm10k_reset_hw_pf(struct fm10k_hw *hw)
 }
 
 /**
+ *  fm10k_is_ari_hierarchy_pf - Indicate ARI hierarchy support
+ *  @hw: pointer to hardware structure
+ *
+ *  Looks at the ARI hierarchy bit to determine whether ARI is supported or not.
+ **/
+static bool fm10k_is_ari_hierarchy_pf(struct fm10k_hw *hw)
+{
+	u16 sriov_ctrl = fm10k_read_pci_cfg_word(hw, FM10K_PCIE_SRIOV_CTRL);
+
+	return !!(sriov_ctrl & FM10K_PCIE_SRIOV_CTRL_VFARI);
+}
+
+/**
  *  fm10k_init_hw_pf - PF hardware initialization
  *  @hw: pointer to hardware structure
  *
@@ -164,6 +178,9 @@ static s32 fm10k_init_hw_pf(struct fm10k_hw *hw)
 	/* record maximum queue count, we limit ourselves to 128 */
 	hw->mac.max_queues = FM10K_MAX_QUEUES_PF;
 
+	/* We support either 64 VFs or 7 VFs depending on if we have ARI */
+	hw->iov.total_vfs = fm10k_is_ari_hierarchy_pf(hw) ? 64 : 7;
+
 	return 0;
 }
 
@@ -444,7 +461,8 @@ static void fm10k_update_int_moderator_pf(struct fm10k_hw *hw)
 	fm10k_write_reg(hw, FM10K_ITR2(FM10K_ITR_REG_COUNT_PF), i);
 
 	/* reset ITR2[0] to point to last enabled PF vector */
-	fm10k_write_reg(hw, FM10K_ITR2(0), i);
+	if (!hw->iov.num_vfs)
+		fm10k_write_reg(hw, FM10K_ITR2(0), i);
 
 	/* Enable interrupt moderator */
 	fm10k_write_reg(hw, FM10K_INT_CTRL, FM10K_INT_CTRL_ENABLEMODERATOR);
@@ -571,6 +589,782 @@ static s32 fm10k_configure_dglort_map_pf(struct fm10k_hw *hw,
 	return 0;
 }
 
+u16 fm10k_queues_per_pool(struct fm10k_hw *hw)
+{
+	u16 num_pools = hw->iov.num_pools;
+
+	return (num_pools > 32) ? 2 : (num_pools > 16) ? 4 : (num_pools > 8) ?
+	       8 : FM10K_MAX_QUEUES_POOL;
+}
+
+u16 fm10k_vf_queue_index(struct fm10k_hw *hw, u16 vf_idx)
+{
+	u16 num_vfs = hw->iov.num_vfs;
+	u16 vf_q_idx = FM10K_MAX_QUEUES;
+
+	vf_q_idx -= fm10k_queues_per_pool(hw) * (num_vfs - vf_idx);
+
+	return vf_q_idx;
+}
+
+static u16 fm10k_vectors_per_pool(struct fm10k_hw *hw)
+{
+	u16 num_pools = hw->iov.num_pools;
+
+	return (num_pools > 32) ? 8 : (num_pools > 16) ? 16 :
+	       FM10K_MAX_VECTORS_POOL;
+}
+
+static u16 fm10k_vf_vector_index(struct fm10k_hw *hw, u16 vf_idx)
+{
+	u16 vf_v_idx = FM10K_MAX_VECTORS_PF;
+
+	vf_v_idx += fm10k_vectors_per_pool(hw) * vf_idx;
+
+	return vf_v_idx;
+}
+
+/**
+ *  fm10k_iov_assign_resources_pf - Assign pool resources for virtualization
+ *  @hw: pointer to the HW structure
+ *  @num_vfs: number of VFs to be allocated
+ *  @num_pools: number of virtualization pools to be allocated
+ *
+ *  Allocates queues and traffic classes to virtualization entities to prepare
+ *  the PF for SR-IOV and VMDq
+ **/
+static s32 fm10k_iov_assign_resources_pf(struct fm10k_hw *hw, u16 num_vfs,
+					 u16 num_pools)
+{
+	u16 qmap_stride, qpp, vpp, vf_q_idx, vf_q_idx0, qmap_idx;
+	u32 vid = hw->mac.default_vid << FM10K_TXQCTL_VID_SHIFT;
+	int i, j;
+
+	/* hardware only supports up to 64 pools */
+	if (num_pools > 64)
+		return FM10K_ERR_PARAM;
+
+	/* the number of VFs cannot exceed the number of pools */
+	if ((num_vfs > num_pools) || (num_vfs > hw->iov.total_vfs))
+		return FM10K_ERR_PARAM;
+
+	/* record number of virtualization entities */
+	hw->iov.num_vfs = num_vfs;
+	hw->iov.num_pools = num_pools;
+
+	/* determine qmap offsets and counts */
+	qmap_stride = (num_vfs > 8) ? 32 : 256;
+	qpp = fm10k_queues_per_pool(hw);
+	vpp = fm10k_vectors_per_pool(hw);
+
+	/* calculate starting index for queues */
+	vf_q_idx = fm10k_vf_queue_index(hw, 0);
+	qmap_idx = 0;
+
+	/* establish TCs with -1 credits and no quanta to prevent transmit */
+	for (i = 0; i < num_vfs; i++) {
+		fm10k_write_reg(hw, FM10K_TC_MAXCREDIT(i), 0);
+		fm10k_write_reg(hw, FM10K_TC_RATE(i), 0);
+		fm10k_write_reg(hw, FM10K_TC_CREDIT(i),
+				FM10K_TC_CREDIT_CREDIT_MASK);
+	}
+
+	/* zero out all mbmem registers */
+	for (i = FM10K_VFMBMEM_LEN * num_vfs; i--;)
+		fm10k_write_reg(hw, FM10K_MBMEM(i), 0);
+
+	/* clear event notification of VF FLR */
+	fm10k_write_reg(hw, FM10K_PFVFLREC(0), ~0);
+	fm10k_write_reg(hw, FM10K_PFVFLREC(1), ~0);
+
+	/* loop through unallocated rings assigning them back to PF */
+	for (i = FM10K_MAX_QUEUES_PF; i < vf_q_idx; i++) {
+		fm10k_write_reg(hw, FM10K_TXDCTL(i), 0);
+		fm10k_write_reg(hw, FM10K_TXQCTL(i), FM10K_TXQCTL_PF | vid);
+		fm10k_write_reg(hw, FM10K_RXQCTL(i), FM10K_RXQCTL_PF);
+	}
+
+	/* PF should have already updated VFITR2[0] */
+
+	/* update all ITR registers to flow to VFITR2[0] */
+	for (i = FM10K_ITR_REG_COUNT_PF + 1; i < FM10K_ITR_REG_COUNT; i++) {
+		if (!(i & (vpp - 1)))
+			fm10k_write_reg(hw, FM10K_ITR2(i), i - vpp);
+		else
+			fm10k_write_reg(hw, FM10K_ITR2(i), i - 1);
+	}
+
+	/* update PF ITR2[0] to reference the last vector */
+	fm10k_write_reg(hw, FM10K_ITR2(0),
+			fm10k_vf_vector_index(hw, num_vfs - 1));
+
+	/* loop through rings populating rings and TCs */
+	for (i = 0; i < num_vfs; i++) {
+		/* record index for VF queue 0 for use in end of loop */
+		vf_q_idx0 = vf_q_idx;
+
+		for (j = 0; j < qpp; j++, qmap_idx++, vf_q_idx++) {
+			/* assign VF and locked TC to queues */
+			fm10k_write_reg(hw, FM10K_TXDCTL(vf_q_idx), 0);
+			fm10k_write_reg(hw, FM10K_TXQCTL(vf_q_idx),
+					(i << FM10K_TXQCTL_TC_SHIFT) | i |
+					FM10K_TXQCTL_VF | vid);
+			fm10k_write_reg(hw, FM10K_RXDCTL(vf_q_idx),
+					FM10K_RXDCTL_WRITE_BACK_MIN_DELAY |
+					FM10K_RXDCTL_DROP_ON_EMPTY);
+			fm10k_write_reg(hw, FM10K_RXQCTL(vf_q_idx),
+					FM10K_RXQCTL_VF |
+					(i << FM10K_RXQCTL_VF_SHIFT));
+
+			/* map queue pair to VF */
+			fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx), vf_q_idx);
+			fm10k_write_reg(hw, FM10K_RQMAP(qmap_idx), vf_q_idx);
+		}
+
+		/* repeat the first ring for all of the remaining VF rings */
+		for (; j < qmap_stride; j++, qmap_idx++) {
+			fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx), vf_q_idx0);
+			fm10k_write_reg(hw, FM10K_RQMAP(qmap_idx), vf_q_idx0);
+		}
+	}
+
+	/* loop through remaining indexes assigning all to queue 0 */
+	while (qmap_idx < FM10K_TQMAP_TABLE_SIZE) {
+		fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx), 0);
+		fm10k_write_reg(hw, FM10K_RQMAP(qmap_idx), 0);
+		qmap_idx++;
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_iov_configure_tc_pf - Configure the shaping group for VF
+ *  @hw: pointer to the HW structure
+ *  @vf_idx: index of VF receiving GLORT
+ *  @rate: Rate indicated in Mb/s
+ *
+ *  Configured the TC for a given VF to allow only up to a given number
+ *  of Mb/s of outgoing Tx throughput.
+ **/
+static s32 fm10k_iov_configure_tc_pf(struct fm10k_hw *hw, u16 vf_idx, int rate)
+{
+	/* configure defaults */
+	u32 interval = FM10K_TC_RATE_INTERVAL_4US_GEN3;
+	u32 tc_rate = FM10K_TC_RATE_QUANTA_MASK;
+
+	/* verify vf is in range */
+	if (vf_idx >= hw->iov.num_vfs)
+		return FM10K_ERR_PARAM;
+
+	/* set interval to align with 4.096 usec in all modes */
+	switch (hw->bus.speed) {
+	case fm10k_bus_speed_2500:
+		interval = FM10K_TC_RATE_INTERVAL_4US_GEN1;
+		break;
+	case fm10k_bus_speed_5000:
+		interval = FM10K_TC_RATE_INTERVAL_4US_GEN2;
+		break;
+	default:
+		break;
+	}
+
+	if (rate) {
+		if (rate > FM10K_VF_TC_MAX || rate < FM10K_VF_TC_MIN)
+			return FM10K_ERR_PARAM;
+
+		/* The quanta is measured in Bytes per 4.096 or 8.192 usec
+		 * The rate is provided in Mbits per second
+		 * To tralslate from rate to quanta we need to multiply the
+		 * rate by 8.192 usec and divide by 8 bits/byte.  To avoid
+		 * dealing with floating point we can round the values up
+		 * to the nearest whole number ratio which gives us 128 / 125.
+		 */
+		tc_rate = (rate * 128) / 125;
+
+		/* try to keep the rate limiting accurate by increasing
+		 * the number of credits and interval for rates less than 4Gb/s
+		 */
+		if (rate < 4000)
+			interval <<= 1;
+		else
+			tc_rate >>= 1;
+	}
+
+	/* update rate limiter with new values */
+	fm10k_write_reg(hw, FM10K_TC_RATE(vf_idx), tc_rate | interval);
+	fm10k_write_reg(hw, FM10K_TC_MAXCREDIT(vf_idx), FM10K_TC_MAXCREDIT_64K);
+	fm10k_write_reg(hw, FM10K_TC_CREDIT(vf_idx), FM10K_TC_MAXCREDIT_64K);
+
+	return 0;
+}
+
+/**
+ *  fm10k_iov_assign_int_moderator_pf - Add VF interrupts to moderator list
+ *  @hw: pointer to the HW structure
+ *  @vf_idx: index of VF receiving GLORT
+ *
+ *  Update the interrupt moderator linked list to include any MSI-X
+ *  interrupts which the VF has enabled in the MSI-X vector table.
+ **/
+static s32 fm10k_iov_assign_int_moderator_pf(struct fm10k_hw *hw, u16 vf_idx)
+{
+	u16 vf_v_idx, vf_v_limit, i;
+
+	/* verify vf is in range */
+	if (vf_idx >= hw->iov.num_vfs)
+		return FM10K_ERR_PARAM;
+
+	/* determine vector offset and count*/
+	vf_v_idx = fm10k_vf_vector_index(hw, vf_idx);
+	vf_v_limit = vf_v_idx + fm10k_vectors_per_pool(hw);
+
+	/* search for first vector that is not masked */
+	for (i = vf_v_limit - 1; i > vf_v_idx; i--) {
+		if (!fm10k_read_reg(hw, FM10K_MSIX_VECTOR_MASK(i)))
+			break;
+	}
+
+	/* reset linked list so it now includes our active vectors */
+	if (vf_idx == (hw->iov.num_vfs - 1))
+		fm10k_write_reg(hw, FM10K_ITR2(0), i);
+	else
+		fm10k_write_reg(hw, FM10K_ITR2(vf_v_limit), i);
+
+	return 0;
+}
+
+/**
+ *  fm10k_iov_assign_default_mac_vlan_pf - Assign a MAC and VLAN to VF
+ *  @hw: pointer to the HW structure
+ *  @vf_info: pointer to VF information structure
+ *
+ *  Assign a MAC address and default VLAN to a VF and notify it of the update
+ **/
+static s32 fm10k_iov_assign_default_mac_vlan_pf(struct fm10k_hw *hw,
+						struct fm10k_vf_info *vf_info)
+{
+	u16 qmap_stride, queues_per_pool, vf_q_idx, timeout, qmap_idx, i;
+	u32 msg[4], txdctl, txqctl, tdbal = 0, tdbah = 0;
+	s32 err = 0;
+	u16 vf_idx, vf_vid;
+
+	/* verify vf is in range */
+	if (!vf_info || vf_info->vf_idx >= hw->iov.num_vfs)
+		return FM10K_ERR_PARAM;
+
+	/* determine qmap offsets and counts */
+	qmap_stride = (hw->iov.num_vfs > 8) ? 32 : 256;
+	queues_per_pool = fm10k_queues_per_pool(hw);
+
+	/* calculate starting index for queues */
+	vf_idx = vf_info->vf_idx;
+	vf_q_idx = fm10k_vf_queue_index(hw, vf_idx);
+	qmap_idx = qmap_stride * vf_idx;
+
+	/* MAP Tx queue back to 0 temporarily, and disable it */
+	fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx), 0);
+	fm10k_write_reg(hw, FM10K_TXDCTL(vf_q_idx), 0);
+
+	/* determine correct default VLAN ID */
+	if (vf_info->pf_vid)
+		vf_vid = vf_info->pf_vid | FM10K_VLAN_CLEAR;
+	else
+		vf_vid = vf_info->sw_vid;
+
+	/* generate MAC_ADDR request */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_MAC_VLAN);
+	fm10k_tlv_attr_put_mac_vlan(msg, FM10K_MAC_VLAN_MSG_DEFAULT_MAC,
+				    vf_info->mac, vf_vid);
+
+	/* load onto outgoing mailbox, ignore any errors on enqueue */
+	if (vf_info->mbx.ops.enqueue_tx)
+		vf_info->mbx.ops.enqueue_tx(hw, &vf_info->mbx, msg);
+
+	/* verify ring has disabled before modifying base address registers */
+	txdctl = fm10k_read_reg(hw, FM10K_TXDCTL(vf_q_idx));
+	for (timeout = 0; txdctl & FM10K_TXDCTL_ENABLE; timeout++) {
+		/* limit ourselves to a 1ms timeout */
+		if (timeout == 10) {
+			err = FM10K_ERR_DMA_PENDING;
+			goto err_out;
+		}
+
+		usleep_range(100, 200);
+		txdctl = fm10k_read_reg(hw, FM10K_TXDCTL(vf_q_idx));
+	}
+
+	/* Update base address registers to contain MAC address */
+	if (is_valid_ether_addr(vf_info->mac)) {
+		tdbal = (((u32)vf_info->mac[3]) << 24) |
+			(((u32)vf_info->mac[4]) << 16) |
+			(((u32)vf_info->mac[5]) << 8);
+
+		tdbah = (((u32)0xFF)	        << 24) |
+			(((u32)vf_info->mac[0]) << 16) |
+			(((u32)vf_info->mac[1]) << 8) |
+			((u32)vf_info->mac[2]);
+	}
+
+	/* Record the base address into queue 0 */
+	fm10k_write_reg(hw, FM10K_TDBAL(vf_q_idx), tdbal);
+	fm10k_write_reg(hw, FM10K_TDBAH(vf_q_idx), tdbah);
+
+err_out:
+	/* configure Queue control register */
+	txqctl = ((u32)vf_vid << FM10K_TXQCTL_VID_SHIFT) &
+		 FM10K_TXQCTL_VID_MASK;
+	txqctl |= (vf_idx << FM10K_TXQCTL_TC_SHIFT) |
+		  FM10K_TXQCTL_VF | vf_idx;
+
+	/* assign VID */
+	for (i = 0; i < queues_per_pool; i++)
+		fm10k_write_reg(hw, FM10K_TXQCTL(vf_q_idx + i), txqctl);
+
+	/* restore the queue back to VF ownership */
+	fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx), vf_q_idx);
+	return err;
+}
+
+/**
+ *  fm10k_iov_reset_resources_pf - Reassign queues and interrupts to a VF
+ *  @hw: pointer to the HW structure
+ *  @vf_info: pointer to VF information structure
+ *
+ *  Reassign the interrupts and queues to a VF following an FLR
+ **/
+static s32 fm10k_iov_reset_resources_pf(struct fm10k_hw *hw,
+					struct fm10k_vf_info *vf_info)
+{
+	u16 qmap_stride, queues_per_pool, vf_q_idx, qmap_idx;
+	u32 tdbal = 0, tdbah = 0, txqctl, rxqctl;
+	u16 vf_v_idx, vf_v_limit, vf_vid;
+	u8 vf_idx = vf_info->vf_idx;
+	int i;
+
+	/* verify vf is in range */
+	if (vf_idx >= hw->iov.num_vfs)
+		return FM10K_ERR_PARAM;
+
+	/* clear event notification of VF FLR */
+	fm10k_write_reg(hw, FM10K_PFVFLREC(vf_idx / 32), 1 << (vf_idx % 32));
+
+	/* force timeout and then disconnect the mailbox */
+	vf_info->mbx.timeout = 0;
+	if (vf_info->mbx.ops.disconnect)
+		vf_info->mbx.ops.disconnect(hw, &vf_info->mbx);
+
+	/* determine vector offset and count*/
+	vf_v_idx = fm10k_vf_vector_index(hw, vf_idx);
+	vf_v_limit = vf_v_idx + fm10k_vectors_per_pool(hw);
+
+	/* determine qmap offsets and counts */
+	qmap_stride = (hw->iov.num_vfs > 8) ? 32 : 256;
+	queues_per_pool = fm10k_queues_per_pool(hw);
+	qmap_idx = qmap_stride * vf_idx;
+
+	/* make all the queues inaccessible to the VF */
+	for (i = qmap_idx; i < (qmap_idx + qmap_stride); i++) {
+		fm10k_write_reg(hw, FM10K_TQMAP(i), 0);
+		fm10k_write_reg(hw, FM10K_RQMAP(i), 0);
+	}
+
+	/* calculate starting index for queues */
+	vf_q_idx = fm10k_vf_queue_index(hw, vf_idx);
+
+	/* determine correct default VLAN ID */
+	if (vf_info->pf_vid)
+		vf_vid = vf_info->pf_vid;
+	else
+		vf_vid = vf_info->sw_vid;
+
+	/* configure Queue control register */
+	txqctl = ((u32)vf_vid << FM10K_TXQCTL_VID_SHIFT) |
+		 (vf_idx << FM10K_TXQCTL_TC_SHIFT) |
+		 FM10K_TXQCTL_VF | vf_idx;
+	rxqctl = FM10K_RXQCTL_VF | (vf_idx << FM10K_RXQCTL_VF_SHIFT);
+
+	/* stop further DMA and reset queue ownership back to VF */
+	for (i = vf_q_idx; i < (queues_per_pool + vf_q_idx); i++) {
+		fm10k_write_reg(hw, FM10K_TXDCTL(i), 0);
+		fm10k_write_reg(hw, FM10K_TXQCTL(i), txqctl);
+		fm10k_write_reg(hw, FM10K_RXDCTL(i),
+				FM10K_RXDCTL_WRITE_BACK_MIN_DELAY |
+				FM10K_RXDCTL_DROP_ON_EMPTY);
+		fm10k_write_reg(hw, FM10K_RXQCTL(i), rxqctl);
+	}
+
+	/* reset TC with -1 credits and no quanta to prevent transmit */
+	fm10k_write_reg(hw, FM10K_TC_MAXCREDIT(vf_idx), 0);
+	fm10k_write_reg(hw, FM10K_TC_RATE(vf_idx), 0);
+	fm10k_write_reg(hw, FM10K_TC_CREDIT(vf_idx),
+			FM10K_TC_CREDIT_CREDIT_MASK);
+
+	/* update our first entry in the table based on previous VF */
+	if (!vf_idx)
+		hw->mac.ops.update_int_moderator(hw);
+	else
+		hw->iov.ops.assign_int_moderator(hw, vf_idx - 1);
+
+	/* reset linked list so it now includes our active vectors */
+	if (vf_idx == (hw->iov.num_vfs - 1))
+		fm10k_write_reg(hw, FM10K_ITR2(0), vf_v_idx);
+	else
+		fm10k_write_reg(hw, FM10K_ITR2(vf_v_limit), vf_v_idx);
+
+	/* link remaining vectors so that next points to previous */
+	for (vf_v_idx++; vf_v_idx < vf_v_limit; vf_v_idx++)
+		fm10k_write_reg(hw, FM10K_ITR2(vf_v_idx), vf_v_idx - 1);
+
+	/* zero out MBMEM, VLAN_TABLE, RETA, RSSRK, and MRQC registers */
+	for (i = FM10K_VFMBMEM_LEN; i--;)
+		fm10k_write_reg(hw, FM10K_MBMEM_VF(vf_idx, i), 0);
+	for (i = FM10K_VLAN_TABLE_SIZE; i--;)
+		fm10k_write_reg(hw, FM10K_VLAN_TABLE(vf_info->vsi, i), 0);
+	for (i = FM10K_RETA_SIZE; i--;)
+		fm10k_write_reg(hw, FM10K_RETA(vf_info->vsi, i), 0);
+	for (i = FM10K_RSSRK_SIZE; i--;)
+		fm10k_write_reg(hw, FM10K_RSSRK(vf_info->vsi, i), 0);
+	fm10k_write_reg(hw, FM10K_MRQC(vf_info->vsi), 0);
+
+	/* Update base address registers to contain MAC address */
+	if (is_valid_ether_addr(vf_info->mac)) {
+		tdbal = (((u32)vf_info->mac[3]) << 24) |
+			(((u32)vf_info->mac[4]) << 16) |
+			(((u32)vf_info->mac[5]) << 8);
+		tdbah = (((u32)0xFF)	   << 24) |
+			(((u32)vf_info->mac[0]) << 16) |
+			(((u32)vf_info->mac[1]) << 8) |
+			((u32)vf_info->mac[2]);
+	}
+
+	/* map queue pairs back to VF from last to first*/
+	for (i = queues_per_pool; i--;) {
+		fm10k_write_reg(hw, FM10K_TDBAL(vf_q_idx + i), tdbal);
+		fm10k_write_reg(hw, FM10K_TDBAH(vf_q_idx + i), tdbah);
+		fm10k_write_reg(hw, FM10K_TQMAP(qmap_idx + i), vf_q_idx + i);
+		fm10k_write_reg(hw, FM10K_RQMAP(qmap_idx + i), vf_q_idx + i);
+	}
+
+	return 0;
+}
+
+/**
+ *  fm10k_iov_set_lport_pf - Assign and enable a logical port for a given VF
+ *  @hw: pointer to hardware structure
+ *  @vf_info: pointer to VF information structure
+ *  @lport_idx: Logical port offset from the hardware glort
+ *  @flags: Set of capability flags to extend port beyond basic functionality
+ *
+ *  This function allows enabling a VF port by assigning it a GLORT and
+ *  setting the flags so that it can enable an Rx mode.
+ **/
+static s32 fm10k_iov_set_lport_pf(struct fm10k_hw *hw,
+				  struct fm10k_vf_info *vf_info,
+				  u16 lport_idx, u8 flags)
+{
+	u16 glort = (hw->mac.dglort_map + lport_idx) & FM10K_DGLORTMAP_NONE;
+
+	/* if glort is not valid return error */
+	if (!fm10k_glort_valid_pf(hw, glort))
+		return FM10K_ERR_PARAM;
+
+	vf_info->vf_flags = flags | FM10K_VF_FLAG_NONE_CAPABLE;
+	vf_info->glort = glort;
+
+	return 0;
+}
+
+/**
+ *  fm10k_iov_reset_lport_pf - Disable a logical port for a given VF
+ *  @hw: pointer to hardware structure
+ *  @vf_info: pointer to VF information structure
+ *
+ *  This function disables a VF port by stripping it of a GLORT and
+ *  setting the flags so that it cannot enable any Rx mode.
+ **/
+static void fm10k_iov_reset_lport_pf(struct fm10k_hw *hw,
+				     struct fm10k_vf_info *vf_info)
+{
+	u32 msg[1];
+
+	/* need to disable the port if it is already enabled */
+	if (FM10K_VF_FLAG_ENABLED(vf_info)) {
+		/* notify switch that this port has been disabled */
+		fm10k_update_lport_state_pf(hw, vf_info->glort, 1, false);
+
+		/* generate port state response to notify VF it is not ready */
+		fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_LPORT_STATE);
+		vf_info->mbx.ops.enqueue_tx(hw, &vf_info->mbx, msg);
+	}
+
+	/* clear flags and glort if it exists */
+	vf_info->vf_flags = 0;
+	vf_info->glort = 0;
+}
+
+/**
+ *  fm10k_iov_update_stats_pf - Updates hardware related statistics for VFs
+ *  @hw: pointer to hardware structure
+ *  @q: stats for all queues of a VF
+ *  @vf_idx: index of VF
+ *
+ *  This function collects queue stats for VFs.
+ **/
+static void fm10k_iov_update_stats_pf(struct fm10k_hw *hw,
+				      struct fm10k_hw_stats_q *q,
+				      u16 vf_idx)
+{
+	u32 idx, qpp;
+
+	/* get stats for all of the queues */
+	qpp = fm10k_queues_per_pool(hw);
+	idx = fm10k_vf_queue_index(hw, vf_idx);
+	fm10k_update_hw_stats_q(hw, q, idx, qpp);
+}
+
+/**
+ *  fm10k_iov_msg_msix_pf - Message handler for MSI-X request from VF
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to message, results[0] is pointer to message
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This function is a default handler for MSI-X requests from the VF.  The
+ *  assumption is that in this case it is acceptable to just directly
+ *  hand off the message form the VF to the underlying shared code.
+ **/
+s32 fm10k_iov_msg_msix_pf(struct fm10k_hw *hw, u32 **results,
+			  struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
+	u8 vf_idx = vf_info->vf_idx;
+
+	return hw->iov.ops.assign_int_moderator(hw, vf_idx);
+}
+
+/**
+ *  fm10k_iov_msg_mac_vlan_pf - Message handler for MAC/VLAN request from VF
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to message, results[0] is pointer to message
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This function is a default handler for MAC/VLAN requests from the VF.
+ *  The assumption is that in this case it is acceptable to just directly
+ *  hand off the message form the VF to the underlying shared code.
+ **/
+s32 fm10k_iov_msg_mac_vlan_pf(struct fm10k_hw *hw, u32 **results,
+			      struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
+	int err = 0;
+	u8 mac[ETH_ALEN];
+	u32 *result;
+	u16 vlan;
+	u32 vid;
+
+	/* we shouldn't be updating rules on a disabled interface */
+	if (!FM10K_VF_FLAG_ENABLED(vf_info))
+		err = FM10K_ERR_PARAM;
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_VLAN]) {
+		result = results[FM10K_MAC_VLAN_MSG_VLAN];
+
+		/* record VLAN id requested */
+		err = fm10k_tlv_attr_get_u32(result, &vid);
+		if (err)
+			return err;
+
+		/* if VLAN ID is 0, set the default VLAN ID instead of 0 */
+		if (!vid || (vid == FM10K_VLAN_CLEAR)) {
+			if (vf_info->pf_vid)
+				vid |= vf_info->pf_vid;
+			else
+				vid |= vf_info->sw_vid;
+		} else if (vid != vf_info->pf_vid) {
+			return FM10K_ERR_PARAM;
+		}
+
+		/* update VSI info for VF in regards to VLAN table */
+		err = hw->mac.ops.update_vlan(hw, vid, vf_info->vsi,
+					      !(vid & FM10K_VLAN_CLEAR));
+	}
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_MAC]) {
+		result = results[FM10K_MAC_VLAN_MSG_MAC];
+
+		/* record unicast MAC address requested */
+		err = fm10k_tlv_attr_get_mac_vlan(result, mac, &vlan);
+		if (err)
+			return err;
+
+		/* block attempts to set MAC for a locked device */
+		if (is_valid_ether_addr(vf_info->mac) &&
+		    memcmp(mac, vf_info->mac, ETH_ALEN))
+			return FM10K_ERR_PARAM;
+
+		/* if VLAN ID is 0, set the default VLAN ID instead of 0 */
+		if (!vlan || (vlan == FM10K_VLAN_CLEAR)) {
+			if (vf_info->pf_vid)
+				vlan |= vf_info->pf_vid;
+			else
+				vlan |= vf_info->sw_vid;
+		} else if (vf_info->pf_vid) {
+			return FM10K_ERR_PARAM;
+		}
+
+		/* notify switch of request for new unicast address */
+		err = hw->mac.ops.update_uc_addr(hw, vf_info->glort, mac, vlan,
+						 !(vlan & FM10K_VLAN_CLEAR), 0);
+	}
+
+	if (!err && !!results[FM10K_MAC_VLAN_MSG_MULTICAST]) {
+		result = results[FM10K_MAC_VLAN_MSG_MULTICAST];
+
+		/* record multicast MAC address requested */
+		err = fm10k_tlv_attr_get_mac_vlan(result, mac, &vlan);
+		if (err)
+			return err;
+
+		/* verify that the VF is allowed to request multicast */
+		if (!(vf_info->vf_flags & FM10K_VF_FLAG_MULTI_ENABLED))
+			return FM10K_ERR_PARAM;
+
+		/* if VLAN ID is 0, set the default VLAN ID instead of 0 */
+		if (!vlan || (vlan == FM10K_VLAN_CLEAR)) {
+			if (vf_info->pf_vid)
+				vlan |= vf_info->pf_vid;
+			else
+				vlan |= vf_info->sw_vid;
+		} else if (vf_info->pf_vid) {
+			return FM10K_ERR_PARAM;
+		}
+
+		/* notify switch of request for new multicast address */
+		err = hw->mac.ops.update_mc_addr(hw, vf_info->glort, mac,
+						 !(vlan & FM10K_VLAN_CLEAR), 0);
+	}
+
+	return err;
+}
+
+/**
+ *  fm10k_iov_supported_xcast_mode_pf - Determine best match for xcast mode
+ *  @vf_info: VF info structure containing capability flags
+ *  @mode: Requested xcast mode
+ *
+ *  This function outputs the mode that most closely matches the requested
+ *  mode.  If not modes match it will request we disable the port
+ **/
+static u8 fm10k_iov_supported_xcast_mode_pf(struct fm10k_vf_info *vf_info,
+					    u8 mode)
+{
+	u8 vf_flags = vf_info->vf_flags;
+
+	/* match up mode to capabilities as best as possible */
+	switch (mode) {
+	case FM10K_XCAST_MODE_PROMISC:
+		if (vf_flags & FM10K_VF_FLAG_PROMISC_CAPABLE)
+			return FM10K_XCAST_MODE_PROMISC;
+		/* fallthough */
+	case FM10K_XCAST_MODE_ALLMULTI:
+		if (vf_flags & FM10K_VF_FLAG_ALLMULTI_CAPABLE)
+			return FM10K_XCAST_MODE_ALLMULTI;
+		/* fallthough */
+	case FM10K_XCAST_MODE_MULTI:
+		if (vf_flags & FM10K_VF_FLAG_MULTI_CAPABLE)
+			return FM10K_XCAST_MODE_MULTI;
+		/* fallthough */
+	case FM10K_XCAST_MODE_NONE:
+		if (vf_flags & FM10K_VF_FLAG_NONE_CAPABLE)
+			return FM10K_XCAST_MODE_NONE;
+		/* fallthough */
+	default:
+		break;
+	}
+
+	/* disable interface as it should not be able to request any */
+	return FM10K_XCAST_MODE_DISABLE;
+}
+
+/**
+ *  fm10k_iov_msg_lport_state_pf - Message handler for port state requests
+ *  @hw: Pointer to hardware structure
+ *  @results: Pointer array to message, results[0] is pointer to message
+ *  @mbx: Pointer to mailbox information structure
+ *
+ *  This function is a default handler for port state requests.  The port
+ *  state requests for now are basic and consist of enabling or disabling
+ *  the port.
+ **/
+s32 fm10k_iov_msg_lport_state_pf(struct fm10k_hw *hw, u32 **results,
+				 struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
+	u32 *result;
+	s32 err = 0;
+	u32 msg[2];
+	u8 mode = 0;
+
+	/* verify VF is allowed to enable even minimal mode */
+	if (!(vf_info->vf_flags & FM10K_VF_FLAG_NONE_CAPABLE))
+		return FM10K_ERR_PARAM;
+
+	if (!!results[FM10K_LPORT_STATE_MSG_XCAST_MODE]) {
+		result = results[FM10K_LPORT_STATE_MSG_XCAST_MODE];
+
+		/* XCAST mode update requested */
+		err = fm10k_tlv_attr_get_u8(result, &mode);
+		if (err)
+			return FM10K_ERR_PARAM;
+
+		/* prep for possible demotion depending on capabilities */
+		mode = fm10k_iov_supported_xcast_mode_pf(vf_info, mode);
+
+		/* if mode is not currently enabled, enable it */
+		if (!(FM10K_VF_FLAG_ENABLED(vf_info) & (1 << mode)))
+			fm10k_update_xcast_mode_pf(hw, vf_info->glort, mode);
+
+		/* swap mode back to a bit flag */
+		mode = FM10K_VF_FLAG_SET_MODE(mode);
+	} else if (!results[FM10K_LPORT_STATE_MSG_DISABLE]) {
+		/* need to disable the port if it is already enabled */
+		if (FM10K_VF_FLAG_ENABLED(vf_info))
+			err = fm10k_update_lport_state_pf(hw, vf_info->glort,
+							  1, false);
+
+		/* when enabling the port we should reset the rate limiters */
+		hw->iov.ops.configure_tc(hw, vf_info->vf_idx, vf_info->rate);
+
+		/* set mode for minimal functionality */
+		mode = FM10K_VF_FLAG_SET_MODE_NONE;
+
+		/* generate port state response to notify VF it is ready */
+		fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_LPORT_STATE);
+		fm10k_tlv_attr_put_bool(msg, FM10K_LPORT_STATE_MSG_READY);
+		mbx->ops.enqueue_tx(hw, mbx, msg);
+	}
+
+	/* if enable state toggled note the update */
+	if (!err && (!FM10K_VF_FLAG_ENABLED(vf_info) != !mode))
+		err = fm10k_update_lport_state_pf(hw, vf_info->glort, 1,
+						  !!mode);
+
+	/* if state change succeeded, then update our stored state */
+	mode |= FM10K_VF_FLAG_CAPABLE(vf_info);
+	if (!err)
+		vf_info->vf_flags = mode;
+
+	return err;
+}
+
+const struct fm10k_msg_data fm10k_iov_msg_data_pf[] = {
+	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
+	FM10K_VF_MSG_MSIX_HANDLER(fm10k_iov_msg_msix_pf),
+	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_iov_msg_mac_vlan_pf),
+	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_iov_msg_lport_state_pf),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_tlv_msg_error),
+};
+
 /**
  *  fm10k_update_stats_hw_pf - Updates hardware related statistics of PF
  *  @hw: pointer to hardware structure
@@ -961,6 +1755,17 @@ static struct fm10k_mac_ops mac_ops_pf = {
 	.get_host_state		= &fm10k_get_host_state_pf,
 };
 
+static struct fm10k_iov_ops iov_ops_pf = {
+	.assign_resources		= &fm10k_iov_assign_resources_pf,
+	.configure_tc			= &fm10k_iov_configure_tc_pf,
+	.assign_int_moderator		= &fm10k_iov_assign_int_moderator_pf,
+	.assign_default_mac_vlan	= fm10k_iov_assign_default_mac_vlan_pf,
+	.reset_resources		= &fm10k_iov_reset_resources_pf,
+	.set_lport			= &fm10k_iov_set_lport_pf,
+	.reset_lport			= &fm10k_iov_reset_lport_pf,
+	.update_stats			= &fm10k_iov_update_stats_pf,
+};
+
 static s32 fm10k_get_invariants_pf(struct fm10k_hw *hw)
 {
 	fm10k_get_invariants_generic(hw);
@@ -972,4 +1777,5 @@ struct fm10k_info fm10k_pf_info = {
 	.mac		= fm10k_mac_pf,
 	.get_invariants	= &fm10k_get_invariants_pf,
 	.mac_ops	= &mac_ops_pf,
+	.iov_ops	= &iov_ops_pf,
 };
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
index 108a7a7..7ab1db4 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.h
@@ -25,6 +25,8 @@
 #include "fm10k_common.h"
 
 bool fm10k_glort_valid_pf(struct fm10k_hw *hw, u16 glort);
+u16 fm10k_queues_per_pool(struct fm10k_hw *hw);
+u16 fm10k_vf_queue_index(struct fm10k_hw *hw, u16 vf_idx);
 
 enum fm10k_pf_tlv_msg_id_v1 {
 	FM10K_PF_MSG_ID_TEST			= 0x000, /* msg ID reserved */
@@ -122,5 +124,12 @@ extern const struct fm10k_tlv_attr fm10k_1588_timestamp_msg_attr[];
 	FM10K_MSG_HANDLER(FM10K_PF_MSG_ID_1588_TIMESTAMP, \
 			  fm10k_1588_timestamp_msg_attr, func)
 
+s32 fm10k_iov_msg_msix_pf(struct fm10k_hw *, u32 **, struct fm10k_mbx_info *);
+s32 fm10k_iov_msg_mac_vlan_pf(struct fm10k_hw *, u32 **,
+			      struct fm10k_mbx_info *);
+s32 fm10k_iov_msg_lport_state_pf(struct fm10k_hw *, u32 **,
+				 struct fm10k_mbx_info *);
+extern const struct fm10k_msg_data fm10k_iov_msg_data_pf[];
+
 extern struct fm10k_info fm10k_pf_info;
 #endif /* _FM10K_PF_H */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index cc1df60..ecaf93a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -567,6 +567,62 @@ enum fm10k_xcast_modes {
 	FM10K_XCAST_MODE_DISABLE	= 4
 };
 
+#define FM10K_VF_TC_MAX		100000	/* 100,000 Mb/s aka 100Gb/s */
+#define FM10K_VF_TC_MIN		1	/* 1 Mb/s is the slowest rate */
+
+struct fm10k_vf_info {
+	/* mbx must be first field in struct unless all default IOV message
+	 * handlers are redone as the assumption is that vf_info starts
+	 * at the same offset as the mailbox
+	 */
+	struct fm10k_mbx_info	mbx;		/* PF side of VF mailbox */
+	int			rate;		/* Tx BW cap as defined by OS */
+	u16			glort;		/* resource tag for this VF */
+	u16			sw_vid;		/* Switch API assigned VLAN */
+	u16			pf_vid;		/* PF assigned Default VLAN */
+	u8			mac[ETH_ALEN];	/* PF Default MAC address */
+	u8			vsi;		/* VSI idenfifier */
+	u8			vf_idx;		/* which VF this is */
+	u8			vf_flags;	/* flags indicating what modes
+						 * are supported for the port
+						 */
+};
+
+#define FM10K_VF_FLAG_ALLMULTI_CAPABLE	((u8)1 << FM10K_XCAST_MODE_ALLMULTI)
+#define FM10K_VF_FLAG_MULTI_CAPABLE	((u8)1 << FM10K_XCAST_MODE_MULTI)
+#define FM10K_VF_FLAG_PROMISC_CAPABLE	((u8)1 << FM10K_XCAST_MODE_PROMISC)
+#define FM10K_VF_FLAG_NONE_CAPABLE	((u8)1 << FM10K_XCAST_MODE_NONE)
+#define FM10K_VF_FLAG_CAPABLE(vf_info)	((vf_info)->vf_flags & (u8)0xF)
+#define FM10K_VF_FLAG_ENABLED(vf_info)	((vf_info)->vf_flags >> 4)
+#define FM10K_VF_FLAG_SET_MODE(mode)	((u8)0x10 << (mode))
+#define FM10K_VF_FLAG_SET_MODE_NONE \
+	FM10K_VF_FLAG_SET_MODE(FM10K_XCAST_MODE_NONE)
+#define FM10K_VF_FLAG_MULTI_ENABLED \
+	(FM10K_VF_FLAG_SET_MODE(FM10K_XCAST_MODE_ALLMULTI) | \
+	 FM10K_VF_FLAG_SET_MODE(FM10K_XCAST_MODE_MULTI) | \
+	 FM10K_VF_FLAG_SET_MODE(FM10K_XCAST_MODE_PROMISC))
+
+struct fm10k_iov_ops {
+	/* IOV related bring-up and tear-down */
+	s32 (*assign_resources)(struct fm10k_hw *, u16, u16);
+	s32 (*configure_tc)(struct fm10k_hw *, u16, int);
+	s32 (*assign_int_moderator)(struct fm10k_hw *, u16);
+	s32 (*assign_default_mac_vlan)(struct fm10k_hw *,
+				       struct fm10k_vf_info *);
+	s32 (*reset_resources)(struct fm10k_hw *,
+			       struct fm10k_vf_info *);
+	s32 (*set_lport)(struct fm10k_hw *, struct fm10k_vf_info *, u16, u8);
+	void (*reset_lport)(struct fm10k_hw *, struct fm10k_vf_info *);
+	void (*update_stats)(struct fm10k_hw *, struct fm10k_hw_stats_q *, u16);
+};
+
+struct fm10k_iov_info {
+	struct fm10k_iov_ops ops;
+	u16 total_vfs;
+	u16 num_vfs;
+	u16 num_pools;
+};
+
 enum fm10k_devices {
 	fm10k_device_pf,
 	fm10k_device_vf,
@@ -576,6 +632,7 @@ struct fm10k_info {
 	enum fm10k_mac_type	mac;
 	s32			(*get_invariants)(struct fm10k_hw *);
 	struct fm10k_mac_ops	*mac_ops;
+	struct fm10k_iov_ops	*iov_ops;
 };
 
 struct fm10k_hw {
@@ -584,6 +641,7 @@ struct fm10k_hw {
 	struct fm10k_mac_info mac;
 	struct fm10k_bus_info bus;
 	struct fm10k_bus_info bus_caps;
+	struct fm10k_iov_info iov;
 	struct fm10k_mbx_info mbx;
 	struct fm10k_swapi_info swapi;
 	u16 device_id;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 25/29] fm10k: Add support for SR-IOV to driver
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (23 preceding siblings ...)
  2014-09-18 22:39 ` [net-next PATCH 24/29] fm10k: Add support for SR-IOV to PF core files Alexander Duyck
@ 2014-09-18 22:39 ` Alexander Duyck
  2014-09-18 22:40 ` [net-next PATCH 26/29] fm10k: Add support for IEEE DCBx Alexander Duyck
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:39 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch combines the recently added VF messaging and configuration
functionality with the interfaces provided by the kernel to allow for
configuration and management of SR-IOV.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile       |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h        |   26 +
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c    |  536 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   20 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c    |   32 +
 5 files changed, 614 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_iov.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index f70893a..70565ab 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -29,4 +29,4 @@ obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
 	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o fm10k_vf.o \
-	      fm10k_mbx.o fm10k_tlv.o
+	      fm10k_mbx.o fm10k_iov.o fm10k_tlv.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 0c4a873..e529bfe 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -218,6 +218,13 @@ struct fm10k_ring_feature {
 	u16 offset;	/* offset to start of feature */
 };
 
+struct fm10k_iov_data {
+	unsigned int		num_vfs;
+	unsigned int		next_vf_mbx;
+	struct rcu_head		rcu;
+	struct fm10k_vf_info	vf_info[0];
+};
+
 #define fm10k_vxlan_port_for_each(vp, intfc) \
 	list_for_each_entry(vp, &(intfc)->vxlan_port, list)
 struct fm10k_vxlan_port {
@@ -277,6 +284,9 @@ struct fm10k_intfc {
 	int num_q_vectors;	/* current number of q_vectors for device */
 	struct fm10k_ring_feature ring_feature[RING_F_ARRAY_SIZE];
 
+	/* SR-IOV information management structure */
+	struct fm10k_iov_data *iov_data;
+
 	struct fm10k_hw_stats stats;
 	struct fm10k_hw hw;
 	u32 __iomem *uc_addr;
@@ -441,4 +451,20 @@ int fm10k_close(struct net_device *netdev);
 
 /* Ethtool */
 void fm10k_set_ethtool_ops(struct net_device *dev);
+
+/* IOV */
+s32 fm10k_iov_event(struct fm10k_intfc *interface);
+s32 fm10k_iov_mbx(struct fm10k_intfc *interface);
+void fm10k_iov_suspend(struct pci_dev *pdev);
+int fm10k_iov_resume(struct pci_dev *pdev);
+void fm10k_iov_disable(struct pci_dev *pdev);
+int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs);
+s32 fm10k_iov_update_pvid(struct fm10k_intfc *interface, u16 glort, u16 pvid);
+int fm10k_ndo_set_vf_mac(struct net_device *netdev, int vf_idx, u8 *mac);
+int fm10k_ndo_set_vf_vlan(struct net_device *netdev,
+			  int vf_idx, u16 vid, u8 qos);
+int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate,
+			int unused);
+int fm10k_ndo_get_vf_config(struct net_device *netdev,
+			    int vf_idx, struct ifla_vf_info *ivi);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
new file mode 100644
index 0000000..0601908
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -0,0 +1,536 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k.h"
+#include "fm10k_vf.h"
+#include "fm10k_pf.h"
+
+static s32 fm10k_iov_msg_error(struct fm10k_hw *hw, u32 **results,
+			       struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
+	struct fm10k_intfc *interface = hw->back;
+	struct pci_dev *pdev = interface->pdev;
+
+	dev_err(&pdev->dev, "Unknown message ID %u on VF %d\n",
+		**results & FM10K_TLV_ID_MASK, vf_info->vf_idx);
+
+	return fm10k_tlv_msg_error(hw, results, mbx);
+}
+
+static const struct fm10k_msg_data iov_mbx_data[] = {
+	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
+	FM10K_VF_MSG_MSIX_HANDLER(fm10k_iov_msg_msix_pf),
+	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_iov_msg_mac_vlan_pf),
+	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_iov_msg_lport_state_pf),
+	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_iov_msg_error),
+};
+
+s32 fm10k_iov_event(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_iov_data *iov_data;
+	s64 mbicr, vflre;
+	int i;
+
+	/* if there is no iov_data then there is no mailboxes to process */
+	if (!ACCESS_ONCE(interface->iov_data))
+		return 0;
+
+	rcu_read_lock();
+
+	iov_data = interface->iov_data;
+
+	/* check again now that we are in the RCU block */
+	if (!iov_data)
+		goto read_unlock;
+
+	if (!(fm10k_read_reg(hw, FM10K_EICR) & FM10K_EICR_VFLR))
+		goto process_mbx;
+
+	/* read VFLRE to determine if any VFs have been reset */
+	do {
+		vflre = fm10k_read_reg(hw, FM10K_PFVFLRE(0));
+		vflre <<= 32;
+		vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(1));
+		vflre = (vflre << 32) | (vflre >> 32);
+		vflre |= fm10k_read_reg(hw, FM10K_PFVFLRE(0));
+
+		i = iov_data->num_vfs;
+
+		for (vflre <<= 64 - i; vflre && i--; vflre += vflre) {
+			struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+
+			if (vflre >= 0)
+				continue;
+
+			hw->iov.ops.reset_resources(hw, vf_info);
+			vf_info->mbx.ops.connect(hw, &vf_info->mbx);
+		}
+	} while (i != iov_data->num_vfs);
+
+process_mbx:
+	/* read MBICR to determine which VFs require attention */
+	mbicr = fm10k_read_reg(hw, FM10K_MBICR(1));
+	mbicr <<= 32;
+	mbicr |= fm10k_read_reg(hw, FM10K_MBICR(0));
+
+	i = iov_data->next_vf_mbx ? : iov_data->num_vfs;
+
+	for (mbicr <<= 64 - i; i--; mbicr += mbicr) {
+		struct fm10k_mbx_info *mbx = &iov_data->vf_info[i].mbx;
+
+		if (mbicr >= 0)
+			continue;
+
+		if (!hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU))
+			break;
+
+		mbx->ops.process(hw, mbx);
+	}
+
+	if (i >= 0) {
+		iov_data->next_vf_mbx = i + 1;
+	} else if (iov_data->next_vf_mbx) {
+		iov_data->next_vf_mbx = 0;
+		goto process_mbx;
+	}
+read_unlock:
+	rcu_read_unlock();
+
+	return 0;
+}
+
+s32 fm10k_iov_mbx(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_iov_data *iov_data;
+	int i;
+
+	/* if there is no iov_data then there is no mailboxes to process */
+	if (!ACCESS_ONCE(interface->iov_data))
+		return 0;
+
+	rcu_read_lock();
+
+	iov_data = interface->iov_data;
+
+	/* check again now that we are in the RCU block */
+	if (!iov_data)
+		goto read_unlock;
+
+	/* lock the mailbox for transmit and receive */
+	fm10k_mbx_lock(interface);
+
+process_mbx:
+	for (i = iov_data->next_vf_mbx ? : iov_data->num_vfs; i--;) {
+		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+		struct fm10k_mbx_info *mbx = &vf_info->mbx;
+		u16 glort = vf_info->glort;
+
+		/* verify port mapping is valid, if not reset port */
+		if (vf_info->vf_flags && !fm10k_glort_valid_pf(hw, glort))
+			hw->iov.ops.reset_lport(hw, vf_info);
+
+		/* reset VFs that have mailbox timed out */
+		if (!mbx->timeout) {
+			hw->iov.ops.reset_resources(hw, vf_info);
+			mbx->ops.connect(hw, mbx);
+		}
+
+		/* no work pending, then just continue */
+		if (mbx->ops.tx_complete(mbx) && !mbx->ops.rx_ready(mbx))
+			continue;
+
+		/* guarantee we have free space in the SM mailbox */
+		if (!hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU))
+			break;
+
+		/* cleanup mailbox and process received messages */
+		mbx->ops.process(hw, mbx);
+	}
+
+	if (i >= 0) {
+		iov_data->next_vf_mbx = i + 1;
+	} else if (iov_data->next_vf_mbx) {
+		iov_data->next_vf_mbx = 0;
+		goto process_mbx;
+	}
+
+	/* free the lock */
+	fm10k_mbx_unlock(interface);
+
+read_unlock:
+	rcu_read_unlock();
+
+	return 0;
+}
+
+void fm10k_iov_suspend(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+	int num_vfs, i;
+
+	/* pull out num_vfs from iov_data */
+	num_vfs = iov_data ? iov_data->num_vfs : 0;
+
+	/* shut down queue mapping for VFs */
+	fm10k_write_reg(hw, FM10K_DGLORTMAP(fm10k_dglort_vf_rss),
+			FM10K_DGLORTMAP_NONE);
+
+	/* Stop any active VFs and reset their resources */
+	for (i = 0; i < num_vfs; i++) {
+		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+
+		hw->iov.ops.reset_resources(hw, vf_info);
+		hw->iov.ops.reset_lport(hw, vf_info);
+	}
+}
+
+int fm10k_iov_resume(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_dglort_cfg dglort = { 0 };
+	struct fm10k_hw *hw = &interface->hw;
+	int num_vfs, i;
+
+	/* pull out num_vfs from iov_data */
+	num_vfs = iov_data ? iov_data->num_vfs : 0;
+
+	/* return error if iov_data is not already populated */
+	if (!iov_data)
+		return -ENOMEM;
+
+	/* allocate hardware resources for the VFs */
+	hw->iov.ops.assign_resources(hw, num_vfs, num_vfs);
+
+	/* configure DGLORT mapping for RSS */
+	dglort.glort = hw->mac.dglort_map & FM10K_DGLORTMAP_NONE;
+	dglort.idx = fm10k_dglort_vf_rss;
+	dglort.inner_rss = 1;
+	dglort.rss_l = fls(fm10k_queues_per_pool(hw) - 1);
+	dglort.queue_b = fm10k_vf_queue_index(hw, 0);
+	dglort.vsi_l = fls(hw->iov.total_vfs - 1);
+	dglort.vsi_b = 1;
+
+	hw->mac.ops.configure_dglort_map(hw, &dglort);
+
+	/* assign resources to the device */
+	for (i = 0; i < num_vfs; i++) {
+		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+
+		/* allocate all but the last GLORT to the VFs */
+		if (i == ((~hw->mac.dglort_map) >> FM10K_DGLORTMAP_MASK_SHIFT))
+			break;
+
+		/* assign GLORT to VF, and restrict it to multicast */
+		hw->iov.ops.set_lport(hw, vf_info, i,
+				      FM10K_VF_FLAG_MULTI_CAPABLE);
+
+		/* assign our default vid to the VF following reset */
+		vf_info->sw_vid = hw->mac.default_vid;
+
+		/* mailbox is disconnected so we don't send a message */
+		hw->iov.ops.assign_default_mac_vlan(hw, vf_info);
+
+		/* now we are ready so we can connect */
+		vf_info->mbx.ops.connect(hw, &vf_info->mbx);
+	}
+
+	return 0;
+}
+
+s32 fm10k_iov_update_pvid(struct fm10k_intfc *interface, u16 glort, u16 pvid)
+{
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_vf_info *vf_info;
+	u16 vf_idx = (glort - hw->mac.dglort_map) & FM10K_DGLORTMAP_NONE;
+
+	/* no IOV support, not our message to process */
+	if (!iov_data)
+		return FM10K_ERR_PARAM;
+
+	/* glort outside our range, not our message to process */
+	if (vf_idx >= iov_data->num_vfs)
+		return FM10K_ERR_PARAM;
+
+	/* determine if an update has occured and if so notify the VF */
+	vf_info = &iov_data->vf_info[vf_idx];
+	if (vf_info->sw_vid != pvid) {
+		vf_info->sw_vid = pvid;
+		hw->iov.ops.assign_default_mac_vlan(hw, vf_info);
+	}
+
+	return 0;
+}
+
+static void fm10k_iov_free_data(struct pci_dev *pdev)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+
+	if (!interface->iov_data)
+		return;
+
+	/* reclaim hardware resources */
+	fm10k_iov_suspend(pdev);
+
+	/* drop iov_data from interface */
+	kfree_rcu(interface->iov_data, rcu);
+	interface->iov_data = NULL;
+}
+
+static s32 fm10k_iov_alloc_data(struct pci_dev *pdev, int num_vfs)
+{
+	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+	size_t size;
+	int i, err;
+
+	/* return error if iov_data is already populated */
+	if (iov_data)
+		return -EBUSY;
+
+	/* The PF should always be able to assign resources */
+	if (!hw->iov.ops.assign_resources)
+		return -ENODEV;
+
+	/* nothing to do if no VFs are requested */
+	if (!num_vfs)
+		return 0;
+
+	/* allocate memory for VF storage */
+	size = offsetof(struct fm10k_iov_data, vf_info[num_vfs]);
+	iov_data = kzalloc(size, GFP_KERNEL);
+	if (!iov_data)
+		return -ENOMEM;
+
+	/* record number of VFs */
+	iov_data->num_vfs = num_vfs;
+
+	/* loop through vf_info structures initializing each entry */
+	for (i = 0; i < num_vfs; i++) {
+		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+
+		/* Record VF VSI value */
+		vf_info->vsi = i + 1;
+		vf_info->vf_idx = i;
+
+		/* initialize mailbox memory */
+		err = fm10k_pfvf_mbx_init(hw, &vf_info->mbx, iov_mbx_data, i);
+		if (err) {
+			dev_err(&pdev->dev,
+				"Unable to initialize SR-IOV mailbox\n");
+			kfree(iov_data);
+			return err;
+		}
+	}
+
+	/* assign iov_data to interface */
+	interface->iov_data = iov_data;
+
+	/* allocate hardware resources for the VFs */
+	fm10k_iov_resume(pdev);
+
+	return 0;
+}
+
+void fm10k_iov_disable(struct pci_dev *pdev)
+{
+	if (pci_num_vf(pdev) && pci_vfs_assigned(pdev))
+		dev_err(&pdev->dev,
+			"Cannot disable SR-IOV while VFs are assigned\n");
+	else
+		pci_disable_sriov(pdev);
+
+	fm10k_iov_free_data(pdev);
+}
+
+static void fm10k_disable_aer_comp_abort(struct pci_dev *pdev)
+{
+	u32 err_sev;
+	int pos;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
+	if (!pos)
+		return;
+
+	pci_read_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, &err_sev);
+	err_sev &= ~PCI_ERR_UNC_COMP_ABORT;
+	pci_write_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, err_sev);
+}
+
+int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs)
+{
+	int current_vfs = pci_num_vf(pdev);
+	int err = 0;
+
+	if (current_vfs && pci_vfs_assigned(pdev)) {
+		dev_err(&pdev->dev,
+			"Cannot modify SR-IOV while VFs are assigned\n");
+		num_vfs = current_vfs;
+	} else {
+		pci_disable_sriov(pdev);
+		fm10k_iov_free_data(pdev);
+	}
+
+	/* allocate resources for the VFs */
+	err = fm10k_iov_alloc_data(pdev, num_vfs);
+	if (err)
+		return err;
+
+	/* allocate VFs if not already allocated */
+	if (num_vfs && (num_vfs != current_vfs)) {
+		/* Disable completer abort error reporting as
+		 * the VFs can trigger this any time they read a queue
+		 * that they don't own.
+		 */
+		fm10k_disable_aer_comp_abort(pdev);
+
+		err = pci_enable_sriov(pdev, num_vfs);
+		if (err) {
+			dev_err(&pdev->dev,
+				"Enable PCI SR-IOV failed: %d\n", err);
+			return err;
+		}
+	}
+
+	return num_vfs;
+}
+
+int fm10k_ndo_set_vf_mac(struct net_device *netdev, int vf_idx, u8 *mac)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_vf_info *vf_info;
+
+	/* verify SR-IOV is active and that vf idx is valid */
+	if (!iov_data || vf_idx >= iov_data->num_vfs)
+		return -EINVAL;
+
+	/* verify MAC addr is valid */
+	if (!is_zero_ether_addr(mac) && !is_valid_ether_addr(mac))
+		return -EINVAL;
+
+	/* record new MAC address */
+	vf_info = &iov_data->vf_info[vf_idx];
+	ether_addr_copy(vf_info->mac, mac);
+
+	/* assigning the MAC will send a mailbox message so lock is needed */
+	fm10k_mbx_lock(interface);
+
+	/* assign MAC address to VF */
+	hw->iov.ops.assign_default_mac_vlan(hw, vf_info);
+
+	fm10k_mbx_unlock(interface);
+
+	return 0;
+}
+
+int fm10k_ndo_set_vf_vlan(struct net_device *netdev, int vf_idx, u16 vid,
+			  u8 qos)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+	struct fm10k_vf_info *vf_info;
+
+	/* verify SR-IOV is active and that vf idx is valid */
+	if (!iov_data || vf_idx >= iov_data->num_vfs)
+		return -EINVAL;
+
+	/* QOS is unsupported and VLAN IDs accepted range 0-4094 */
+	if (qos || (vid > (VLAN_VID_MASK - 1)))
+		return -EINVAL;
+
+	vf_info = &iov_data->vf_info[vf_idx];
+
+	/* exit if there is nothing to do */
+	if (vf_info->pf_vid == vid)
+		return 0;
+
+	/* record default VLAN ID for VF */
+	vf_info->pf_vid = vid;
+
+	/* assigning the VLAN will send a mailbox message so lock is needed */
+	fm10k_mbx_lock(interface);
+
+	/* Clear the VLAN table for the VF */
+	hw->mac.ops.update_vlan(hw, FM10K_VLAN_ALL, vf_info->vsi, false);
+
+	/* Update VF assignment and trigger reset */
+	hw->iov.ops.assign_default_mac_vlan(hw, vf_info);
+
+	fm10k_mbx_unlock(interface);
+
+	return 0;
+}
+
+int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int unused,
+			int rate)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* verify SR-IOV is active and that vf idx is valid */
+	if (!iov_data || vf_idx >= iov_data->num_vfs)
+		return -EINVAL;
+
+	/* rate limit cannot be less than 10Mbs or greater than link speed */
+	if (rate && ((rate < FM10K_VF_TC_MIN) || rate > FM10K_VF_TC_MAX))
+		return -EINVAL;
+
+	/* store values */
+	iov_data->vf_info[vf_idx].rate = rate;
+
+	/* update hardware configuration */
+	hw->iov.ops.configure_tc(hw, vf_idx, rate);
+
+	return 0;
+}
+
+int fm10k_ndo_get_vf_config(struct net_device *netdev,
+			    int vf_idx, struct ifla_vf_info *ivi)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct fm10k_iov_data *iov_data = interface->iov_data;
+	struct fm10k_vf_info *vf_info;
+
+	/* verify SR-IOV is active and that vf idx is valid */
+	if (!iov_data || vf_idx >= iov_data->num_vfs)
+		return -EINVAL;
+
+	vf_info = &iov_data->vf_info[vf_idx];
+
+	ivi->vf = vf_idx;
+	ivi->max_tx_rate = vf_info->rate;
+	ivi->min_tx_rate = 0;
+	ether_addr_copy(ivi->mac, vf_info->mac);
+	ivi->vlan = vf_info->pf_vid;
+	ivi->qos = 0;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 057fcb0..3e56506 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -364,7 +364,21 @@ static void fm10k_request_glort_range(struct fm10k_intfc *interface)
 	if (hw->mac.dglort_map == FM10K_DGLORTMAP_NONE)
 		return;
 
-	interface->glort_count = mask + 1;
+	/* we support 3 possible GLORT configurations.
+	 * 1: VFs consume all but the last 1
+	 * 2: VFs and PF split glorts with possible gap between
+	 * 3: VFs allocated first 64, all others belong to PF
+	 */
+	if (mask <= hw->iov.total_vfs) {
+		interface->glort_count = 1;
+		interface->glort += mask;
+	} else if (mask < 64) {
+		interface->glort_count = (mask + 1) / 2;
+		interface->glort += interface->glort_count;
+	} else {
+		interface->glort_count = mask - 63;
+		interface->glort += 64;
+	}
 }
 
 /**
@@ -1321,6 +1335,10 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_set_rx_mode	= fm10k_set_rx_mode,
 	.ndo_get_stats64	= fm10k_get_stats64,
 	.ndo_setup_tc		= fm10k_setup_tc,
+	.ndo_set_vf_mac		= fm10k_ndo_set_vf_mac,
+	.ndo_set_vf_vlan	= fm10k_ndo_set_vf_vlan,
+	.ndo_set_vf_rate	= fm10k_ndo_set_vf_bw,
+	.ndo_get_vf_config	= fm10k_ndo_get_vf_config,
 	.ndo_add_vxlan_port	= fm10k_add_vxlan_port,
 	.ndo_del_vxlan_port	= fm10k_del_vxlan_port,
 	.ndo_dfwd_add_station	= fm10k_dfwd_add_station,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index ec34438..a2eb889 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -152,6 +152,8 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 
 	rtnl_lock();
 
+	fm10k_iov_suspend(interface->pdev);
+
 	if (netif_running(netdev))
 		fm10k_close(netdev);
 
@@ -171,6 +173,8 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 	if (netif_running(netdev))
 		fm10k_open(netdev);
 
+	fm10k_iov_resume(interface->pdev);
+
 	rtnl_unlock();
 
 	clear_bit(__FM10K_RESETTING, &interface->state);
@@ -260,6 +264,9 @@ static void fm10k_mbx_subtask(struct fm10k_intfc *interface)
 {
 	/* process upstream mailbox and update device state */
 	fm10k_watchdog_update_host_state(interface);
+
+	/* process downstream mailboxes */
+	fm10k_iov_mbx(interface);
 }
 
 /**
@@ -975,6 +982,7 @@ static irqreturn_t fm10k_msix_mbx_pf(int irq, void *data)
 	/* service mailboxes */
 	if (fm10k_mbx_trylock(interface)) {
 		mbx->ops.process(hw, mbx);
+		fm10k_iov_event(interface);
 		fm10k_mbx_unlock(interface);
 	}
 
@@ -1158,6 +1166,11 @@ static s32 fm10k_update_pvid(struct fm10k_hw *hw, u32 **results,
 	if (pvid >= FM10K_VLAN_TABLE_VID_MAX)
 		return FM10K_ERR_PARAM;
 
+	/* check to see if this belongs to one of the VFs */
+	err = fm10k_iov_update_pvid(interface, glort, pvid);
+	if (!err)
+		return 0;
+
 	/* we need to reset if default VLAN was just updated */
 	if (pvid != hw->mac.default_vid)
 		interface->flags |= FM10K_FLAG_RESET_REQUESTED;
@@ -1476,6 +1489,10 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 	memcpy(&hw->mac.ops, fi->mac_ops, sizeof(hw->mac.ops));
 	hw->mac.type = fi->mac;
 
+	/* Setup IOV handlers */
+	if (fi->iov_ops)
+		memcpy(&hw->iov.ops, fi->iov_ops, sizeof(hw->iov.ops));
+
 	/* Set common capability flags and settings */
 	rss = min_t(int, FM10K_MAX_RSS_INDICES, num_online_cpus());
 	interface->ring_feature[RING_F_RSS].limit = rss;
@@ -1508,6 +1525,9 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 	/* initialize hardware statistics */
 	hw->mac.ops.update_hw_stats(hw, &interface->stats);
 
+	/* Set upper limit on IOV VFs that can be allocated */
+	pci_sriov_set_totalvfs(pdev, hw->iov.total_vfs);
+
 	/* Start with random Ethernet address */
 	eth_random_addr(hw->mac.addr);
 
@@ -1707,6 +1727,9 @@ static int fm10k_probe(struct pci_dev *pdev,
 	/* print warning for non-optimal configurations */
 	fm10k_slot_warn(interface);
 
+	/* enable SR-IOV after registering netdev to enforce PF/VF ordering */
+	fm10k_iov_configure(pdev, 0);
+
 	/* clear the service task disable bit to allow service task to start */
 	clear_bit(__FM10K_SERVICE_DISABLE, &interface->state);
 
@@ -1750,6 +1773,9 @@ static void fm10k_remove(struct pci_dev *pdev)
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);
 
+	/* release VFs */
+	fm10k_iov_disable(pdev);
+
 	/* disable mailbox interrupt */
 	fm10k_mbx_free_irq(interface);
 
@@ -1826,6 +1852,9 @@ static int fm10k_resume(struct pci_dev *pdev)
 	if (err)
 		return err;
 
+	/* restore SR-IOV interface */
+	fm10k_iov_resume(pdev);
+
 	netif_device_attach(netdev);
 
 	return 0;
@@ -1847,6 +1876,8 @@ static int fm10k_suspend(struct pci_dev *pdev, pm_message_t state)
 
 	netif_device_detach(netdev);
 
+	fm10k_iov_suspend(pdev);
+
 	rtnl_lock();
 
 	if (netif_running(netdev))
@@ -1988,6 +2019,7 @@ static struct pci_driver fm10k_driver = {
 	.suspend		= fm10k_suspend,
 	.resume			= fm10k_resume,
 #endif
+	.sriov_configure	= fm10k_iov_configure,
 	.err_handler		= &fm10k_err_handler
 };
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 26/29] fm10k: Add support for IEEE DCBx
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (24 preceding siblings ...)
  2014-09-18 22:39 ` [net-next PATCH 25/29] fm10k: Add support for SR-IOV to driver Alexander Duyck
@ 2014-09-18 22:40 ` Alexander Duyck
  2014-09-18 22:40 ` [net-next PATCH 27/29] fm10k: Add support for debugfs Alexander Duyck
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:40 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds support for management of the limited QOS features of the
FM10000 interface.  Specifically we can support up to 8 traffic classes,
however the part only provides 1 Rx and 1 Tx FIFO in the host interface and
as a result this can lead to head-of-line blocking on Rx.  This can be
avoided by setting PFC only for priorities that cannot afford to drop
frames.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile      |    3 
 drivers/net/ethernet/intel/fm10k/fm10k.h       |    3 
 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c |  174 ++++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c   |    3 
 4 files changed, 182 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index 70565ab..5023625 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -29,4 +29,5 @@ obj-$(CONFIG_FM10K) += fm10k.o
 
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
 	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o fm10k_vf.o \
-	      fm10k_mbx.o fm10k_iov.o fm10k_tlv.o
+	      fm10k_mbx.o fm10k_iov.o fm10k_tlv.o \
+	      fm10k_dcbnl.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index e529bfe..fe1c033 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -467,4 +467,7 @@ int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate,
 			int unused);
 int fm10k_ndo_get_vf_config(struct net_device *netdev,
 			    int vf_idx, struct ifla_vf_info *ivi);
+
+/* DCB */
+void fm10k_dcbnl_set_ops(struct net_device *dev);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c b/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
new file mode 100644
index 0000000..8d74edb
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
@@ -0,0 +1,174 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include "fm10k.h"
+
+#if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
+/**
+ * fm10k_dcbnl_ieee_getets - get the ETS configuration for the device
+ * @dev: netdev interface for the device
+ * @ets: ETS structure to push configuration to
+ **/
+static int fm10k_dcbnl_ieee_getets(struct net_device *dev, struct ieee_ets *ets)
+{
+	int i;
+
+	/* we support 8 TCs in all modes */
+	ets->ets_cap = IEEE_8021QAZ_MAX_TCS;
+	ets->cbs = 0;
+
+	/* we only support strict priority and cannot do traffic shaping */
+	memset(ets->tc_tx_bw, 0, sizeof(ets->tc_tx_bw));
+	memset(ets->tc_rx_bw, 0, sizeof(ets->tc_rx_bw));
+	memset(ets->tc_tsa, IEEE_8021QAZ_TSA_STRICT, sizeof(ets->tc_tsa));
+
+	/* populate the prio map based on the netdev */
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+		ets->prio_tc[i] = netdev_get_prio_tc_map(dev, i);
+
+	return 0;
+}
+
+/**
+ * fm10k_dcbnl_ieee_setets - set the ETS configuration for the device
+ * @dev: netdev interface for the device
+ * @ets: ETS structure to pull configuration from
+ **/
+static int fm10k_dcbnl_ieee_setets(struct net_device *dev, struct ieee_ets *ets)
+{
+	u8 num_tc = 0;
+	int i, err;
+
+	/* verify type and determine num_tcs needed */
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+		if (ets->tc_tx_bw[i] || ets->tc_rx_bw[i])
+			return -EINVAL;
+		if (ets->tc_tsa[i] != IEEE_8021QAZ_TSA_STRICT)
+			return -EINVAL;
+		if (ets->prio_tc[i] > num_tc)
+			num_tc = ets->prio_tc[i];
+	}
+
+	/* if requested TC is greater than 0 then num_tcs is max + 1 */
+	if (num_tc)
+		num_tc++;
+
+	if (num_tc > IEEE_8021QAZ_MAX_TCS)
+		return -EINVAL;
+
+	/* update TC hardware mapping if necessary */
+	if (num_tc != netdev_get_num_tc(dev)) {
+		err = fm10k_setup_tc(dev, num_tc);
+		if (err)
+			return err;
+	}
+
+	/* update priority mapping */
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+		netdev_set_prio_tc_map(dev, i, ets->prio_tc[i]);
+
+	return 0;
+}
+
+/**
+ * fm10k_dcbnl_ieee_getpfc - get the PFC configuration for the device
+ * @dev: netdev interface for the device
+ * @pfc: PFC structure to push configuration to
+ **/
+static int fm10k_dcbnl_ieee_getpfc(struct net_device *dev, struct ieee_pfc *pfc)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	/* record flow control max count and state of TCs */
+	pfc->pfc_cap = IEEE_8021QAZ_MAX_TCS;
+	pfc->pfc_en = interface->pfc_en;
+
+	return 0;
+}
+
+/**
+ * fm10k_dcbnl_ieee_setpfc - set the PFC configuration for the device
+ * @dev: netdev interface for the device
+ * @pfc: PFC structure to pull configuration from
+ **/
+static int fm10k_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	/* record PFC configuration to interface */
+	interface->pfc_en = pfc->pfc_en;
+
+	/* if we are running update the drop_en state for all queues */
+	if (netif_running(dev))
+		fm10k_update_rx_drop_en(interface);
+
+	return 0;
+}
+
+/**
+ * fm10k_dcbnl_ieee_getdcbx - get the DCBX configuration for the device
+ * @dev: netdev interface for the device
+ *
+ * Returns that we support only IEEE DCB for this interface
+ **/
+static u8 fm10k_dcbnl_getdcbx(struct net_device *dev)
+{
+	return DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE;
+}
+
+/**
+ * fm10k_dcbnl_ieee_setdcbx - get the DCBX configuration for the device
+ * @dev: netdev interface for the device
+ * @mode: new mode for this device
+ *
+ * Returns error on attempt to enable anything but IEEE DCB for this interface
+ **/
+static u8 fm10k_dcbnl_setdcbx(struct net_device *dev, u8 mode)
+{
+	return (mode != (DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE)) ? 1 : 0;
+}
+
+static const struct dcbnl_rtnl_ops fm10k_dcbnl_ops = {
+	.ieee_getets	= fm10k_dcbnl_ieee_getets,
+	.ieee_setets	= fm10k_dcbnl_ieee_setets,
+	.ieee_getpfc	= fm10k_dcbnl_ieee_getpfc,
+	.ieee_setpfc	= fm10k_dcbnl_ieee_setpfc,
+
+	.getdcbx	= fm10k_dcbnl_getdcbx,
+	.setdcbx	= fm10k_dcbnl_setdcbx,
+};
+
+#endif /* HAVE_DCBNL_IEEE && CONFIG_DCB */
+/**
+ * fm10k_dcbnl_set_ops - Configures dcbnl ops pointer for netdev
+ * @dev: netdev interface for the device
+ *
+ * Enables PF for DCB by assigning DCBNL ops pointer.
+ **/
+void fm10k_dcbnl_set_ops(struct net_device *dev)
+{
+#if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
+	struct fm10k_intfc *interface = netdev_priv(dev);
+	struct fm10k_hw *hw = &interface->hw;
+
+	if (hw->mac.type == fm10k_mac_pf)
+		dev->dcbnl_ops = &fm10k_dcbnl_ops;
+#endif /* HAVE_DCBNL_IEEE && CONFIG_DCB */
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index a2eb889..b3ebedb 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1555,6 +1555,9 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		netdev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
 	}
 
+	/* initialize DCBNL interface */
+	fm10k_dcbnl_set_ops(netdev);
+
 	/* Initialize service timer and service task */
 	set_bit(__FM10K_SERVICE_DISABLE, &interface->state);
 	setup_timer(&interface->service_timer, &fm10k_service_timer,

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 27/29] fm10k: Add support for debugfs
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (25 preceding siblings ...)
  2014-09-18 22:40 ` [net-next PATCH 26/29] fm10k: Add support for IEEE DCBx Alexander Duyck
@ 2014-09-18 22:40 ` Alexander Duyck
  2014-09-18 22:40 ` [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files Alexander Duyck
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:40 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This patch adds limited debugfs support for the driver.  Most of the
functionality needed for dumping registers is already provided via ethtool.
The only thing we saw that we really neeed was the ability to dump the
descriptor rings so as such this patch will add a fm10k directory containing a
listing of directories each one with a unique PCI Bus, Device, and Function
number.  Each of those BDF directories will have a list of q_vectors, and
the q_vectors will contain a file for each of the Rx/Tx rings that are a part
of the vector.  For example:

# ls -RD /sys/kernel/debug/fm10k/
/sys/kernel/debug/fm10k/:
0000:01:00.0

/sys/kernel/debug/fm10k/0000:01:00.0:
q_vector.000  q_vector.001  q_vector.002  q_vector.003

/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000:
rx_ring.000  tx_ring.000

/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.001:
rx_ring.001  tx_ring.001

/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.002:
rx_ring.002  tx_ring.002

/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.003:
rx_ring.003  tx_ring.003

# cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/rx_ring.000
DES DATA       RSS        STATERR    LENGTH VLAN   DGLORT SGLORT TIMESTAMP
---------------------------------------------------------------------------
000 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x13951807dc4fedf0
001 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x1395180906c9f2c8
002 0x3731c000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
003 0x3731d000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
004 0xaab3a000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
...

# cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/tx_ring.000
DES BUFFER_ADDRESS     LENGTH VLAN   MSS    HDRLEN FLAGS
---------------------------------------------------------
000 0x00000000aa8a1002 0x005a 0x0000 0x0000 0x0000 0xc0
001 0x00000000aa8a2002 0x005a 0x0000 0x0000 0x0000 0xc0
002 0x000000006bc13202 0x004e 0x0000 0x0000 0x0000 0xc0
003 0x000000006bc13c02 0x002a 0x0000 0x0000 0x0000 0xe1
004 0x000000006bc13602 0x0062 0x0000 0x0000 0x0000 0xc0

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile        |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h         |   24 ++
 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c |  259 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    |    8 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     |    6 +
 5 files changed, 298 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index 5023625..fbc0e09 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -30,4 +30,4 @@ obj-$(CONFIG_FM10K) += fm10k.o
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
 	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o fm10k_vf.o \
 	      fm10k_mbx.o fm10k_iov.o fm10k_tlv.o \
-	      fm10k_dcbnl.o
+	      fm10k_debugfs.o fm10k_dcbnl.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index fe1c033..c355456 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -199,6 +199,9 @@ struct fm10k_q_vector {
 	struct napi_struct napi;
 	char name[IFNAMSIZ + 9];
 
+#ifdef CONFIG_DEBUG_FS
+	struct dentry *dbg_q_vector;
+#endif /* CONFIG_DEBUG_FS */
 	struct rcu_head rcu;	/* to avoid race with update stats on free */
 
 	/* for dynamic allocation of rings associated with this q_vector */
@@ -307,6 +310,10 @@ struct fm10k_intfc {
 	/* VXLAN port tracking information */
 	struct list_head vxlan_port;
 
+#ifdef CONFIG_DEBUG_FS
+	struct dentry *dbg_intfc;
+
+#endif /* CONFIG_DEBUG_FS */
 #if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
 	u8 pfc_en;
 #endif
@@ -468,6 +475,23 @@ int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate,
 int fm10k_ndo_get_vf_config(struct net_device *netdev,
 			    int vf_idx, struct ifla_vf_info *ivi);
 
+/* DebugFS */
+#ifdef CONFIG_DEBUG_FS
+void fm10k_dbg_q_vector_init(struct fm10k_q_vector *q_vector);
+void fm10k_dbg_q_vector_exit(struct fm10k_q_vector *q_vector);
+void fm10k_dbg_intfc_init(struct fm10k_intfc *interface);
+void fm10k_dbg_intfc_exit(struct fm10k_intfc *interface);
+void fm10k_dbg_init(void);
+void fm10k_dbg_exit(void);
+#else
+static inline void fm10k_dbg_q_vector_init(struct fm10k_q_vector *q_vector) {}
+static inline void fm10k_dbg_q_vector_exit(struct fm10k_q_vector *q_vector) {}
+static inline void fm10k_dbg_intfc_init(struct fm10k_intfc *interface) {}
+static inline void fm10k_dbg_intfc_exit(struct fm10k_intfc *interface) {}
+static inline void fm10k_dbg_init(void) {}
+static inline void fm10k_dbg_exit(void) {}
+#endif /* CONFIG_DEBUG_FS */
+
 /* DCB */
 void fm10k_dcbnl_set_ops(struct net_device *dev);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c b/drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c
new file mode 100644
index 0000000..4327f86
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c
@@ -0,0 +1,259 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#ifdef CONFIG_DEBUG_FS
+
+#include "fm10k.h"
+
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+static struct dentry *dbg_root;
+
+/* Descriptor Seq Functions */
+
+static void *fm10k_dbg_desc_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct fm10k_ring *ring = s->private;
+
+	return (*pos < ring->count) ? pos : NULL;
+}
+
+static void *fm10k_dbg_desc_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct fm10k_ring *ring = s->private;
+
+	return (++(*pos) < ring->count) ? pos : NULL;
+}
+
+static void fm10k_dbg_desc_seq_stop(struct seq_file *s, void *v)
+{
+	/* Do nothing. */
+}
+
+static void fm10k_dbg_desc_break(struct seq_file *s, int i)
+{
+	while (i--)
+		seq_puts(s, "-");
+
+	seq_puts(s, "\n");
+}
+
+static int fm10k_dbg_tx_desc_seq_show(struct seq_file *s, void *v)
+{
+	struct fm10k_ring *ring = s->private;
+	int i = *(loff_t *)v;
+	static const char tx_desc_hdr[] =
+		"DES BUFFER_ADDRESS     LENGTH VLAN   MSS    HDRLEN FLAGS\n";
+
+	/* Generate header */
+	if (!i) {
+		seq_printf(s, tx_desc_hdr);
+		fm10k_dbg_desc_break(s, sizeof(tx_desc_hdr) - 1);
+	}
+
+	/* Validate descriptor allocation */
+	if (!ring->desc) {
+		seq_printf(s, "%03X Descriptor ring not allocated.\n", i);
+	} else {
+		struct fm10k_tx_desc *txd = FM10K_TX_DESC(ring, i);
+
+		seq_printf(s, "%03X %#018llx %#06x %#06x %#06x %#06x %#04x\n",
+			   i, txd->buffer_addr, txd->buflen, txd->vlan,
+			   txd->mss, txd->hdrlen, txd->flags);
+	}
+
+	return 0;
+}
+
+static int fm10k_dbg_rx_desc_seq_show(struct seq_file *s, void *v)
+{
+	struct fm10k_ring *ring = s->private;
+	int i = *(loff_t *)v;
+	static const char rx_desc_hdr[] =
+	"DES DATA       RSS        STATERR    LENGTH VLAN   DGLORT SGLORT TIMESTAMP\n";
+
+	/* Generate header */
+	if (!i) {
+		seq_printf(s, rx_desc_hdr);
+		fm10k_dbg_desc_break(s, sizeof(rx_desc_hdr) - 1);
+	}
+
+	/* Validate descriptor allocation */
+	if (!ring->desc) {
+		seq_printf(s, "%03X Descriptor ring not allocated.\n", i);
+	} else {
+		union fm10k_rx_desc *rxd = FM10K_RX_DESC(ring, i);
+
+		seq_printf(s,
+			   "%03X %#010x %#010x %#010x %#06x %#06x %#06x %#06x %#018llx\n",
+			   i, rxd->d.data, rxd->d.rss, rxd->d.staterr,
+			   rxd->w.length, rxd->w.vlan, rxd->w.dglort,
+			   rxd->w.sglort, rxd->q.timestamp);
+	}
+
+	return 0;
+}
+
+static const struct seq_operations fm10k_dbg_tx_desc_seq_ops = {
+	.start = fm10k_dbg_desc_seq_start,
+	.next  = fm10k_dbg_desc_seq_next,
+	.stop  = fm10k_dbg_desc_seq_stop,
+	.show  = fm10k_dbg_tx_desc_seq_show,
+};
+
+static const struct seq_operations fm10k_dbg_rx_desc_seq_ops = {
+	.start = fm10k_dbg_desc_seq_start,
+	.next  = fm10k_dbg_desc_seq_next,
+	.stop  = fm10k_dbg_desc_seq_stop,
+	.show  = fm10k_dbg_rx_desc_seq_show,
+};
+
+static int fm10k_dbg_desc_open(struct inode *inode, struct file *filep)
+{
+	struct fm10k_ring *ring = inode->i_private;
+	struct fm10k_q_vector *q_vector = ring->q_vector;
+	const struct seq_operations *desc_seq_ops;
+	int err;
+
+	if (ring < q_vector->rx.ring)
+		desc_seq_ops = &fm10k_dbg_tx_desc_seq_ops;
+	else
+		desc_seq_ops = &fm10k_dbg_rx_desc_seq_ops;
+
+	err = seq_open(filep, desc_seq_ops);
+	if (err)
+		return err;
+
+	((struct seq_file *)filep->private_data)->private = ring;
+
+	return 0;
+}
+
+static const struct file_operations fm10k_dbg_desc_fops = {
+	.owner   = THIS_MODULE,
+	.open    = fm10k_dbg_desc_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release,
+};
+
+/**
+ * fm10k_dbg_q_vector_init - setup debugfs for the q_vectors
+ * @q_vector: q_vector to allocate directories for
+ *
+ * A folder is created for each q_vector found. In each q_vector
+ * folder, a debugfs file is created for each tx and rx ring
+ * allocated to the q_vector.
+ **/
+void fm10k_dbg_q_vector_init(struct fm10k_q_vector *q_vector)
+{
+	struct fm10k_intfc *interface = q_vector->interface;
+	char name[16];
+	int i;
+
+	if (!interface->dbg_intfc)
+		return;
+
+	/* Generate a folder for each q_vector */
+	sprintf(name, "q_vector.%03d", q_vector->v_idx);
+
+	q_vector->dbg_q_vector = debugfs_create_dir(name, interface->dbg_intfc);
+	if (!q_vector->dbg_q_vector)
+		return;
+
+	/* Generate a file for each rx ring in the q_vector */
+	for (i = 0; i < q_vector->tx.count; i++) {
+		struct fm10k_ring *ring = &q_vector->tx.ring[i];
+
+		sprintf(name, "tx_ring.%03d", ring->queue_index);
+
+		debugfs_create_file(name, 0600,
+				    q_vector->dbg_q_vector, ring,
+				    &fm10k_dbg_desc_fops);
+	}
+
+	/* Generate a file for each rx ring in the q_vector */
+	for (i = 0; i < q_vector->rx.count; i++) {
+		struct fm10k_ring *ring = &q_vector->rx.ring[i];
+
+		sprintf(name, "rx_ring.%03d", ring->queue_index);
+
+		debugfs_create_file(name, 0600,
+				    q_vector->dbg_q_vector, ring,
+				    &fm10k_dbg_desc_fops);
+	}
+}
+
+/**
+ * fm10k_dbg_free_q_vector_dir - setup debugfs for the q_vectors
+ * @q_vector: q_vector to allocate directories for
+ **/
+void fm10k_dbg_q_vector_exit(struct fm10k_q_vector *q_vector)
+{
+	struct fm10k_intfc *interface = q_vector->interface;
+
+	if (interface->dbg_intfc)
+		debugfs_remove_recursive(q_vector->dbg_q_vector);
+	q_vector->dbg_q_vector = NULL;
+}
+
+/**
+ * fm10k_dbg_intfc_init - setup the debugfs directory for the intferface
+ * @interface: the interface that is starting up
+ **/
+
+void fm10k_dbg_intfc_init(struct fm10k_intfc *interface)
+{
+	const char *name = pci_name(interface->pdev);
+
+	if (dbg_root)
+		interface->dbg_intfc = debugfs_create_dir(name, dbg_root);
+}
+
+/**
+ * fm10k_dbg_intfc_exit - clean out the interface's debugfs entries
+ * @interface: the interface that is stopping
+ **/
+void fm10k_dbg_intfc_exit(struct fm10k_intfc *interface)
+{
+	if (dbg_root)
+		debugfs_remove_recursive(interface->dbg_intfc);
+	interface->dbg_intfc = NULL;
+}
+
+/**
+ * fm10k_dbg_init - start up debugfs for the driver
+ **/
+void fm10k_dbg_init(void)
+{
+	dbg_root = debugfs_create_dir(fm10k_driver_name, NULL);
+}
+
+/**
+ * fm10k_dbg_exit - clean out the driver's debugfs entries
+ **/
+void fm10k_dbg_exit(void)
+{
+	debugfs_remove_recursive(dbg_root);
+	dbg_root = NULL;
+}
+
+#endif /* CONFIG_DEBUG_FS */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 1e4fcba..d4b2fad 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -51,6 +51,8 @@ static int __init fm10k_init_module(void)
 	pr_info("%s - version %s\n", fm10k_driver_string, fm10k_driver_version);
 	pr_info("%s\n", fm10k_copyright);
 
+	fm10k_dbg_init();
+
 	return fm10k_register_pci_driver();
 }
 module_init(fm10k_init_module);
@@ -64,6 +66,8 @@ module_init(fm10k_init_module);
 static void __exit fm10k_exit_module(void)
 {
 	fm10k_unregister_pci_driver();
+
+	fm10k_dbg_exit();
 }
 module_exit(fm10k_exit_module);
 
@@ -1616,6 +1620,8 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
 		ring++;
 	}
 
+	fm10k_dbg_q_vector_init(q_vector);
+
 	return 0;
 }
 
@@ -1633,6 +1639,8 @@ static void fm10k_free_q_vector(struct fm10k_intfc *interface, int v_idx)
 	struct fm10k_q_vector *q_vector = interface->q_vector[v_idx];
 	struct fm10k_ring *ring;
 
+	fm10k_dbg_q_vector_exit(q_vector);
+
 	fm10k_for_each_ring(ring, q_vector->tx)
 		interface->tx_ring[ring->queue_index] = NULL;
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index b3ebedb..31741e5 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1689,6 +1689,9 @@ static int fm10k_probe(struct pci_dev *pdev,
 	if (err)
 		goto err_sw_init;
 
+	/* enable debugfs support */
+	fm10k_dbg_intfc_init(interface);
+
 	err = fm10k_init_queueing_scheme(interface);
 	if (err)
 		goto err_sw_init;
@@ -1785,6 +1788,9 @@ static void fm10k_remove(struct pci_dev *pdev)
 	/* free interrupts */
 	fm10k_clear_queueing_scheme(interface);
 
+	/* remove any debugfs interfaces */
+	fm10k_dbg_intfc_exit(interface);
+
 	iounmap(interface->uc_addr);
 
 	free_netdev(netdev);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (26 preceding siblings ...)
  2014-09-18 22:40 ` [net-next PATCH 27/29] fm10k: Add support for debugfs Alexander Duyck
@ 2014-09-18 22:40 ` Alexander Duyck
  2014-09-19  7:38   ` Richard Cochran
  2014-09-18 22:40 ` [net-next PATCH 29/29] fm10k: Add support for PTP Alexander Duyck
                   ` (3 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:40 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This change adds the messaging support needed to support PTP.  In the case
of Tx timestamps it is necessary for the Switch Management entity to return
the frames via the mailbox as the host interface cannot know which port the
timestamp will be delivered to.  In addition there is only one clock on the
entire switch, as such the entity that has BAR 4 access is the only one who
can actually update the frequency as it is the only one with access.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_common.h |    8 +++
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c     |   68 +++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h   |   20 +++++++
 drivers/net/ethernet/intel/fm10k/fm10k_vf.c     |   29 ++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_vf.h     |   10 +++
 5 files changed, 135 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_common.h b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
index 8250e14..45e4e5b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_common.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_common.h
@@ -39,6 +39,14 @@ do { \
 		writel((val), &hw_addr[(reg)]); \
 } while (0)
 
+/* Switch register write operations, index using DWORDS */
+#define fm10k_write_sw_reg(hw, reg, val) \
+do { \
+	u32 __iomem *sw_addr = ACCESS_ONCE((hw)->sw_addr); \
+	if (!FM10K_REMOVED(sw_addr)) \
+		writel((val), &sw_addr[(reg)]); \
+} while (0)
+
 /* read ctrl register which has no clear on read fields as PCIe flush */
 #define fm10k_write_flush(hw) fm10k_read_reg((hw), FM10K_CTRL)
 s32 fm10k_get_bus_info_generic(struct fm10k_hw *hw);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index 0b6ce10..d7f9b89 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -1123,6 +1123,19 @@ static void fm10k_iov_update_stats_pf(struct fm10k_hw *hw,
 	fm10k_update_hw_stats_q(hw, q, idx, qpp);
 }
 
+static s32 fm10k_iov_report_timestamp_pf(struct fm10k_hw *hw,
+					 struct fm10k_vf_info *vf_info,
+					 u64 timestamp)
+{
+	u32 msg[4];
+
+	/* generate port state response to notify VF it is not ready */
+	fm10k_tlv_msg_init(msg, FM10K_VF_MSG_ID_1588);
+	fm10k_tlv_attr_put_u64(msg, FM10K_1588_MSG_TIMESTAMP, timestamp);
+
+	return vf_info->mbx.ops.enqueue_tx(hw, &vf_info->mbx, msg);
+}
+
 /**
  *  fm10k_iov_msg_msix_pf - Message handler for MSI-X request from VF
  *  @hw: Pointer to hardware structure
@@ -1723,6 +1736,59 @@ s32 fm10k_msg_err_pf(struct fm10k_hw *hw, u32 **results,
 	return 0;
 }
 
+const struct fm10k_tlv_attr fm10k_1588_timestamp_msg_attr[] = {
+	FM10K_TLV_ATTR_LE_STRUCT(FM10K_PF_ATTR_ID_1588_TIMESTAMP,
+				 sizeof(struct fm10k_swapi_1588_timestamp)),
+	FM10K_TLV_ATTR_LAST
+};
+
+/* currently there is no shared 1588 timestamp handler */
+
+/**
+ *  fm10k_adjust_systime_pf - Adjust systime frequency
+ *  @hw: pointer to hardware structure
+ *  @ppb: adjustment rate in parts per billion
+ *
+ *  This function will adjust the SYSTIME_CFG register contained in BAR 4
+ *  if this function is supported for BAR 4 access.  The adjustment amount
+ *  is based on the parts per billion value provided and adjusted to a
+ *  value based on parts per 2^48 clock cycles.
+ *
+ *  If adjustment is not supported or the requested value is too large
+ *  we will return an error.
+ **/
+static s32 fm10k_adjust_systime_pf(struct fm10k_hw *hw, s32 ppb)
+{
+	u64 systime_adjust;
+
+	/* if sw_addr is not set we don't have switch register access */
+	if (!hw->sw_addr)
+		return ppb ? FM10K_ERR_PARAM : 0;
+
+	/* we must convert the value from parts per billion to parts per
+	 * 2^48 cycles.  In addition we can only use the upper 30 bits of
+	 * the value when making the change so that restricts us futher.
+	 * The math for this is equivilent to ABS(pps) * 2^40 / 10 ^ 9,
+	 * the function below is roughly ABS(pps) * 2 ^ 31 / 5 ^ 9 which
+	 * is as close as we can get to the exact value with numbers as
+	 * small as possible.
+	 */
+	systime_adjust = (ppb < 0) ? -ppb : ppb;
+	systime_adjust <<= 31;
+	do_div(systime_adjust, 1953125);
+
+	/* verify the requested adjustment value is in range */
+	if (systime_adjust > FM10K_SW_SYSTIME_ADJUST_MASK)
+		return FM10K_ERR_PARAM;
+
+	if (ppb < 0)
+		systime_adjust |= FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE;
+
+	fm10k_write_sw_reg(hw, FM10K_SW_SYSTIME_ADJUST, (u32)systime_adjust);
+
+	return 0;
+}
+
 static const struct fm10k_msg_data fm10k_msg_data_pf[] = {
 	FM10K_PF_MSG_ERR_HANDLER(XCAST_MODES, fm10k_msg_err_pf),
 	FM10K_PF_MSG_ERR_HANDLER(UPDATE_MAC_FWD_RULE, fm10k_msg_err_pf),
@@ -1753,6 +1819,7 @@ static struct fm10k_mac_ops mac_ops_pf = {
 	.set_dma_mask		= &fm10k_set_dma_mask_pf,
 	.get_fault		= &fm10k_get_fault_pf,
 	.get_host_state		= &fm10k_get_host_state_pf,
+	.adjust_systime		= &fm10k_adjust_systime_pf,
 };
 
 static struct fm10k_iov_ops iov_ops_pf = {
@@ -1764,6 +1831,7 @@ static struct fm10k_iov_ops iov_ops_pf = {
 	.set_lport			= &fm10k_iov_set_lport_pf,
 	.reset_lport			= &fm10k_iov_reset_lport_pf,
 	.update_stats			= &fm10k_iov_update_stats_pf,
+	.report_timestamp		= &fm10k_iov_report_timestamp_pf,
 };
 
 static s32 fm10k_get_invariants_pf(struct fm10k_hw *hw)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index ecaf93a..5055bef 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -224,6 +224,15 @@ struct fm10k_hw;
 #define FM10K_STATS_LOOPBACK_DROP	0x3806
 #define FM10K_STATS_NODESC_DROP		0x3807
 
+/* Timesync registers */
+#define FM10K_RRTIME_CFG	0x3808
+#define FM10K_RRTIME_LIMIT(_n)	((_n) + 0x380C)
+#define FM10K_RRTIME_COUNT(_n)	((_n) + 0x3810)
+#define FM10K_SYSTIME		0x3814
+#define FM10K_SYSTIME0		0x3816
+#define FM10K_SYSTIME_CFG	0x3818
+#define FM10K_SYSTIME_CFG_STEP_MASK		0x0000000F
+
 /* PCIe state registers */
 #define FM10K_PHYADDR		0x381C
 
@@ -358,6 +367,15 @@ struct fm10k_hw;
 #define FM10K_VFSYSTIME		0x00040
 #define FM10K_VFITR(_n)		((_n) + 0x00060)
 
+/* Registers contained in BAR 4 for Switch management */
+#define FM10K_SW_SYSTIME_CFG	0x0224C
+#define FM10K_SW_SYSTIME_CFG_STEP_SHIFT		4
+#define FM10K_SW_SYSTIME_CFG_ADJUST_MASK	0xFF000000
+#define FM10K_SW_SYSTIME_ADJUST	0x0224D
+#define FM10K_SW_SYSTIME_ADJUST_MASK		0x3FFFFFFF
+#define FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE	0x80000000
+#define FM10K_SW_SYSTIME_PULSE(_n)	((_n) + 0x02252)
+
 enum fm10k_int_source {
 	fm10k_int_Mailbox	= 0,
 	fm10k_int_PCIeFault	= 1,
@@ -614,6 +632,7 @@ struct fm10k_iov_ops {
 	s32 (*set_lport)(struct fm10k_hw *, struct fm10k_vf_info *, u16, u8);
 	void (*reset_lport)(struct fm10k_hw *, struct fm10k_vf_info *);
 	void (*update_stats)(struct fm10k_hw *, struct fm10k_hw_stats_q *, u16);
+	s32 (*report_timestamp)(struct fm10k_hw *, struct fm10k_vf_info *, u64);
 };
 
 struct fm10k_iov_info {
@@ -637,6 +656,7 @@ struct fm10k_info {
 
 struct fm10k_hw {
 	u32 __iomem *hw_addr;
+	u32 __iomem *sw_addr;
 	void *back;
 	struct fm10k_mac_info mac;
 	struct fm10k_bus_info bus;
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_vf.c b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
index 25c23fc..a0c47fc 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_vf.c
@@ -431,6 +431,13 @@ static s32 fm10k_update_xcast_mode_vf(struct fm10k_hw *hw, u16 glort, u8 mode)
 	return mbx->ops.enqueue_tx(hw, mbx, msg);
 }
 
+const struct fm10k_tlv_attr fm10k_1588_msg_attr[] = {
+	FM10K_TLV_ATTR_U64(FM10K_1588_MSG_TIMESTAMP),
+	FM10K_TLV_ATTR_LAST
+};
+
+/* currently there is no shared 1588 timestamp handler */
+
 /**
  *  fm10k_update_hw_stats_vf - Updates hardware related statistics of VF
  *  @hw: pointer to hardware structure
@@ -482,6 +489,27 @@ static s32 fm10k_configure_dglort_map_vf(struct fm10k_hw *hw,
 	return 0;
 }
 
+/**
+ *  fm10k_adjust_systime_vf - Adjust systime frequency
+ *  @hw: pointer to hardware structure
+ *  @ppb: adjustment rate in parts per billion
+ *
+ *  This function takes an adjustment rate in parts per billion and will
+ *  verify that this value is 0 as the VF cannot support adjusting the
+ *  systime clock.
+ *
+ *  If the ppb value is non-zero the return is ERR_PARAM else success
+ **/
+static s32 fm10k_adjust_systime_vf(struct fm10k_hw *hw, s32 ppb)
+{
+	/* The VF cannot adjust the clock frequency, however it should
+	 * already have a syntonic clock with whichever host interface is
+	 * running as the master for the host interface clock domain so
+	 * there should be not frequency adjustment necessary.
+	 */
+	return ppb ? FM10K_ERR_PARAM : 0;
+}
+
 static const struct fm10k_msg_data fm10k_msg_data_vf[] = {
 	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
 	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_msg_mac_vlan_vf),
@@ -507,6 +535,7 @@ static struct fm10k_mac_ops mac_ops_vf = {
 	.rebind_hw_stats	= &fm10k_rebind_hw_stats_vf,
 	.configure_dglort_map	= &fm10k_configure_dglort_map_vf,
 	.get_host_state		= &fm10k_get_host_state_generic,
+	.adjust_systime		= &fm10k_adjust_systime_vf,
 };
 
 static s32 fm10k_get_invariants_vf(struct fm10k_hw *hw)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_vf.h b/drivers/net/ethernet/intel/fm10k/fm10k_vf.h
index 8e96ee5..06a99d7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_vf.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_vf.h
@@ -29,6 +29,7 @@ enum fm10k_vf_tlv_msg_id {
 	FM10K_VF_MSG_ID_MSIX,
 	FM10K_VF_MSG_ID_MAC_VLAN,
 	FM10K_VF_MSG_ID_LPORT_STATE,
+	FM10K_VF_MSG_ID_1588,
 	FM10K_VF_MSG_ID_MAX,
 };
 
@@ -48,6 +49,11 @@ enum fm10k_tlv_lport_state_attr_id {
 	FM10K_LPORT_STATE_MSG_MAX
 };
 
+enum fm10k_tlv_1588_attr_id {
+	FM10K_1588_MSG_TIMESTAMP,
+	FM10K_1588_MSG_MAX
+};
+
 #define FM10K_VF_MSG_MSIX_HANDLER(func) \
 	 FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_MSIX, NULL, func)
 
@@ -64,5 +70,9 @@ extern const struct fm10k_tlv_attr fm10k_lport_state_msg_attr[];
 	FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_LPORT_STATE, \
 			  fm10k_lport_state_msg_attr, func)
 
+extern const struct fm10k_tlv_attr fm10k_1588_msg_attr[];
+#define FM10K_VF_MSG_1588_HANDLER(func) \
+	FM10K_MSG_HANDLER(FM10K_VF_MSG_ID_1588, fm10k_1588_msg_attr, func)
+
 extern struct fm10k_info fm10k_vf_info;
 #endif /* _FM10K_VF_H */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (27 preceding siblings ...)
  2014-09-18 22:40 ` [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files Alexander Duyck
@ 2014-09-18 22:40 ` Alexander Duyck
  2014-09-19 17:35   ` Richard Cochran
  2014-09-19  7:55 ` [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Jiri Pirko
                   ` (2 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Alexander Duyck @ 2014-09-18 22:40 UTC (permalink / raw)
  To: davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

This change adds support for the Linux PTP Hardware clock and timestamping
functionality provided by the hardware.  There are actually two cases that
this timestamping is meant to support.

The first case would be an ordinary clock scenario.  In this configuration
the host interface does not have access to BAR 4.  However all of the host
interfaces should be locked into the same boundary clock region and as such
they are all on the same clock anyway.  With this being the case they can
synchronize among themselves and only need to adjust the offset since they
are all on the same clock with the same frequency.

The second case is a boundary clock scenario.  This is a special case and
would require both BAR 4 access, and a means of presenting a netdev per
boundary region.  The current plan is to use DSA at some point in the
future to provide these interfaces, but the DSA portion is still under
development.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/fm10k/Makefile        |    2 
 drivers/net/ethernet/intel/fm10k/fm10k.h         |   35 +
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c |   30 +
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    |   20 +
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  |   17 +
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     |  113 +++++
 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c     |  535 ++++++++++++++++++++++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h    |    7 
 8 files changed, 751 insertions(+), 8 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c

diff --git a/drivers/net/ethernet/intel/fm10k/Makefile b/drivers/net/ethernet/intel/fm10k/Makefile
index fbc0e09..08859dd 100644
--- a/drivers/net/ethernet/intel/fm10k/Makefile
+++ b/drivers/net/ethernet/intel/fm10k/Makefile
@@ -30,4 +30,4 @@ obj-$(CONFIG_FM10K) += fm10k.o
 fm10k-objs := fm10k_main.o fm10k_common.o fm10k_pci.o \
 	      fm10k_netdev.o fm10k_ethtool.o fm10k_pf.o fm10k_vf.o \
 	      fm10k_mbx.o fm10k_iov.o fm10k_tlv.o \
-	      fm10k_debugfs.o fm10k_dcbnl.o
+	      fm10k_debugfs.o fm10k_ptp.o fm10k_dcbnl.o
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index c355456..bdb1dbf 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -26,6 +26,9 @@
 #include <linux/rtnetlink.h>
 #include <linux/if_vlan.h>
 #include <linux/pci.h>
+#include <linux/net_tstamp.h>
+#include <linux/clocksource.h>
+#include <linux/ptp_clock_kernel.h>
 
 #include "fm10k_pf.h"
 #include "fm10k_vf.h"
@@ -293,6 +296,7 @@ struct fm10k_intfc {
 	struct fm10k_hw_stats stats;
 	struct fm10k_hw hw;
 	u32 __iomem *uc_addr;
+	u32 __iomem *sw_addr;
 	u16 msg_enable;
 	u16 tx_ring_count;
 	u16 rx_ring_count;
@@ -314,6 +318,17 @@ struct fm10k_intfc {
 	struct dentry *dbg_intfc;
 
 #endif /* CONFIG_DEBUG_FS */
+	struct ptp_clock_info ptp_caps;
+	struct ptp_clock *ptp_clock;
+
+	struct sk_buff_head ts_tx_skb_queue;
+	u32 tx_hwtstamp_timeouts;
+
+	struct delayed_work ts_overflow_work;
+	struct hwtstamp_config ts_config;
+	struct cyclecounter cc;
+	struct timecounter tc;
+	rwlock_t tsreg_lock;
 #if defined(HAVE_DCBNL_IEEE) && defined(CONFIG_DCB)
 	u8 pfc_en;
 #endif
@@ -411,6 +426,10 @@ union fm10k_ftag_info {
 };
 
 struct fm10k_cb {
+	union {
+		__le64 tstamp;
+		unsigned long ts_tx_timeout;
+	};
 	union fm10k_ftag_info fi;
 };
 
@@ -492,6 +511,22 @@ static inline void fm10k_dbg_init(void) {}
 static inline void fm10k_dbg_exit(void) {}
 #endif /* CONFIG_DEBUG_FS */
 
+/* Time Stamping */
+void fm10k_ts_cc_to_hwtstamp(struct fm10k_intfc *interface,
+			     struct skb_shared_hwtstamps *hwtstamp,
+			     cycle_t systime);
+void fm10k_ts_tx_enqueue(struct fm10k_intfc *interface, struct sk_buff *skb);
+void fm10k_ts_tx_hwtstamp(struct fm10k_intfc *interface, __le16 dglort,
+			  cycle_t systime);
+void fm10k_ts_reset_cc(struct fm10k_intfc *interface);
+void fm10k_ts_start_cc(struct fm10k_intfc *interface);
+void fm10k_ts_stop_cc(struct fm10k_intfc *interface);
+void fm10k_ts_tx_subtask(struct fm10k_intfc *interface);
+void fm10k_ptp_register(struct fm10k_intfc *interface);
+void fm10k_ptp_unregister(struct fm10k_intfc *interface);
+int fm10k_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
+int fm10k_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
+
 /* DCB */
 void fm10k_dcbnl_set_ops(struct net_device *dev);
 #endif /* _FM10K_H_ */
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index 42beb89..a9bbe60 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -91,6 +91,8 @@ static const struct fm10k_stats fm10k_gstrings_stats[] = {
 	FM10K_STAT("mbx_rx_messages", hw.mbx.rx_messages),
 	FM10K_STAT("mbx_rx_dwords", hw.mbx.rx_dwords),
 	FM10K_STAT("mbx_rx_parse_err", hw.mbx.rx_parse_err),
+
+	FM10K_STAT("tx_hwtstamp_timeouts", tx_hwtstamp_timeouts),
 };
 
 #define FM10K_GLOBAL_STATS_LEN ARRAY_SIZE(fm10k_gstrings_stats)
@@ -1006,6 +1008,33 @@ static int fm10k_set_channels(struct net_device *dev,
 	return fm10k_setup_tc(dev, netdev_get_num_tc(dev));
 }
 
+static int fm10k_get_ts_info(struct net_device *dev,
+			   struct ethtool_ts_info *info)
+{
+	struct fm10k_intfc *interface = netdev_priv(dev);
+
+	info->so_timestamping =
+		SOF_TIMESTAMPING_TX_SOFTWARE |
+		SOF_TIMESTAMPING_RX_SOFTWARE |
+		SOF_TIMESTAMPING_SOFTWARE |
+		SOF_TIMESTAMPING_TX_HARDWARE |
+		SOF_TIMESTAMPING_RX_HARDWARE |
+		SOF_TIMESTAMPING_RAW_HARDWARE;
+
+	if (interface->ptp_clock)
+		info->phc_index = ptp_clock_index(interface->ptp_clock);
+	else
+		info->phc_index = -1;
+
+	info->tx_types = (1 << HWTSTAMP_TX_OFF) |
+			 (1 << HWTSTAMP_TX_ON);
+
+	info->rx_filters = (1 << HWTSTAMP_FILTER_NONE) |
+			   (1 << HWTSTAMP_FILTER_ALL);
+
+	return 0;
+}
+
 static const struct ethtool_ops fm10k_ethtool_ops = {
 	.get_strings		= fm10k_get_strings,
 	.get_sset_count		= fm10k_get_sset_count,
@@ -1031,6 +1060,7 @@ static const struct ethtool_ops fm10k_ethtool_ops = {
 	.set_rxfh		= fm10k_set_rssh,
 	.get_channels		= fm10k_get_channels,
 	.set_channels		= fm10k_set_channels,
+	.get_ts_info            = fm10k_get_ts_info,
 };
 
 void fm10k_set_ethtool_ops(struct net_device *dev)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index d4b2fad..df478710 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -398,6 +398,19 @@ static inline void fm10k_rx_hash(struct fm10k_ring *ring,
 		     PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 }
 
+static void fm10k_rx_hwtstamp(struct fm10k_ring *rx_ring,
+			      union fm10k_rx_desc *rx_desc,
+			      struct sk_buff *skb)
+{
+	struct fm10k_intfc *interface = rx_ring->q_vector->interface;
+
+	FM10K_CB(skb)->tstamp = rx_desc->q.timestamp;
+
+	if (unlikely(interface->flags & FM10K_FLAG_RX_TS_ENABLED))
+		fm10k_ts_cc_to_hwtstamp(interface, skb_hwtstamps(skb),
+					le64_to_cpu(rx_desc->q.timestamp));
+}
+
 static void fm10k_type_trans(struct fm10k_ring *rx_ring,
 			     union fm10k_rx_desc *rx_desc,
 			     struct sk_buff *skb)
@@ -447,6 +460,8 @@ static unsigned int fm10k_process_skb_fields(struct fm10k_ring *rx_ring,
 
 	fm10k_rx_checksum(rx_ring, rx_desc, skb);
 
+	fm10k_rx_hwtstamp(rx_ring, rx_desc, skb);
+
 	FM10K_CB(skb)->fi.w.vlan = rx_desc->w.vlan;
 
 	skb_record_rx_queue(skb, rx_ring->queue_index);
@@ -885,6 +900,11 @@ static u8 fm10k_tx_desc_flags(struct sk_buff *skb, u32 tx_flags)
 	/* set type for advanced descriptor with frame checksum insertion */
 	u32 desc_flags = 0;
 
+	/* set timestamping bits */
+	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+	    likely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
+			desc_flags |= FM10K_TXD_FLAG_TIME;
+
 	/* set checksum offload bits */
 	desc_flags |= FM10K_SET_FLAG(tx_flags, FM10K_TX_FLAGS_CSUM,
 				     FM10K_TXD_FLAG_CSUM);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 3e56506..8239b5b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -647,6 +647,10 @@ static netdev_tx_t fm10k_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 		__skb_put(skb, pad_len);
 	}
 
+	/* prepare packet for hardware time stamping */
+	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
+		fm10k_ts_tx_enqueue(interface, skb);
+
 	if (r_idx >= interface->num_tx_queues)
 		r_idx %= interface->num_tx_queues;
 
@@ -1173,6 +1177,18 @@ int fm10k_setup_tc(struct net_device *dev, u8 tc)
 	return 0;
 }
 
+static int fm10k_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
+{
+	switch (cmd) {
+	case SIOCGHWTSTAMP:
+		return fm10k_get_ts_config(netdev, ifr);
+	case SIOCSHWTSTAMP:
+		return fm10k_set_ts_config(netdev, ifr);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static void fm10k_assign_l2_accel(struct fm10k_intfc *interface,
 				  struct fm10k_l2_accel *l2_accel)
 {
@@ -1341,6 +1357,7 @@ static const struct net_device_ops fm10k_netdev_ops = {
 	.ndo_get_vf_config	= fm10k_ndo_get_vf_config,
 	.ndo_add_vxlan_port	= fm10k_add_vxlan_port,
 	.ndo_del_vxlan_port	= fm10k_del_vxlan_port,
+	.ndo_do_ioctl		= fm10k_ioctl,
 	.ndo_dfwd_add_station	= fm10k_dfwd_add_station,
 	.ndo_dfwd_del_station	= fm10k_dfwd_del_station,
 };
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 31741e5..9e25421 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -170,6 +170,9 @@ static void fm10k_reinit(struct fm10k_intfc *interface)
 	/* reassociate interrupts */
 	fm10k_mbx_request_irq(interface);
 
+	/* reset clock */
+	fm10k_ts_reset_cc(interface);
+
 	if (netif_running(netdev))
 		fm10k_open(netdev);
 
@@ -490,6 +493,7 @@ static void fm10k_service_task(struct work_struct *work)
 	/* tasks only run when interface is up */
 	fm10k_watchdog_subtask(interface);
 	fm10k_check_hang_subtask(interface);
+	fm10k_ts_tx_subtask(interface);
 
 	/* release lock on service events to allow scheduling next event */
 	fm10k_service_event_complete(interface);
@@ -1064,6 +1068,25 @@ static s32 fm10k_mbx_mac_addr(struct fm10k_hw *hw, u32 **results,
 	return 0;
 }
 
+static s32 fm10k_1588_msg_vf(struct fm10k_hw *hw, u32 **results,
+			     struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	u64 timestamp;
+	s32 err;
+
+	err = fm10k_tlv_attr_get_u64(results[FM10K_1588_MSG_TIMESTAMP],
+				     &timestamp);
+	if (err)
+		return err;
+
+	fm10k_ts_tx_hwtstamp(interface, 0, timestamp);
+
+	return 0;
+}
+
 /* generic error handler for mailbox issues */
 static s32 fm10k_mbx_error(struct fm10k_hw *hw, u32 **results,
 			   struct fm10k_mbx_info *mbx)
@@ -1083,6 +1106,7 @@ static const struct fm10k_msg_data vf_mbx_data[] = {
 	FM10K_TLV_MSG_TEST_HANDLER(fm10k_tlv_msg_test),
 	FM10K_VF_MSG_MAC_VLAN_HANDLER(fm10k_mbx_mac_addr),
 	FM10K_VF_MSG_LPORT_STATE_HANDLER(fm10k_msg_lport_state_vf),
+	FM10K_VF_MSG_1588_HANDLER(fm10k_1588_msg_vf),
 	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_mbx_error),
 };
 
@@ -1180,6 +1204,68 @@ static s32 fm10k_update_pvid(struct fm10k_hw *hw, u32 **results,
 	return 0;
 }
 
+static s32 fm10k_1588_msg_pf(struct fm10k_hw *hw, u32 **results,
+			     struct fm10k_mbx_info *mbx)
+{
+	struct fm10k_intfc *interface = container_of(hw,
+						     struct fm10k_intfc,
+						     hw);
+	struct fm10k_iov_data *iov_data;
+	struct fm10k_swapi_1588_timestamp timestamp;
+	u16 sglort, vf_idx;
+	s32 err;
+
+	err = fm10k_tlv_attr_get_le_struct(
+				results[FM10K_PF_ATTR_ID_1588_TIMESTAMP],
+				&timestamp, sizeof(timestamp));
+	if (err)
+		return err;
+
+	if (timestamp.dglort) {
+		fm10k_ts_tx_hwtstamp(interface, timestamp.dglort,
+				     le64_to_cpu(timestamp.egress));
+		return 0;
+	}
+
+	/* either dglort or sglort must be set */
+	if (!timestamp.sglort)
+		return FM10K_ERR_PARAM;
+
+	/* verify GLORT is at least one of the ones we own */
+	sglort = le16_to_cpu(timestamp.sglort);
+	if (!fm10k_glort_valid_pf(hw, sglort))
+		return FM10K_ERR_PARAM;
+
+	if (sglort == interface->glort) {
+		fm10k_ts_tx_hwtstamp(interface, 0,
+				     le64_to_cpu(timestamp.ingress));
+		return 0;
+	}
+
+	/* if there is no iov_data then there is no mailboxes to process */
+	if (!ACCESS_ONCE(interface->iov_data))
+		return FM10K_ERR_PARAM;
+
+	rcu_read_lock();
+
+	/* notify VF if this timestamp belongs to it */
+	iov_data = interface->iov_data;
+	vf_idx = (hw->mac.dglort_map & FM10K_DGLORTMAP_NONE) - sglort;
+
+	if (!iov_data || vf_idx >= iov_data->num_vfs) {
+		err = FM10K_ERR_PARAM;
+		goto err_unlock;
+	}
+
+	err = hw->iov.ops.report_timestamp(hw, &iov_data->vf_info[vf_idx],
+					   le64_to_cpu(timestamp.ingress));
+
+err_unlock:
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct fm10k_msg_data pf_mbx_data[] = {
 	FM10K_PF_MSG_ERR_HANDLER(XCAST_MODES, fm10k_msg_err_pf),
 	FM10K_PF_MSG_ERR_HANDLER(UPDATE_MAC_FWD_RULE, fm10k_msg_err_pf),
@@ -1187,6 +1273,7 @@ static const struct fm10k_msg_data pf_mbx_data[] = {
 	FM10K_PF_MSG_ERR_HANDLER(LPORT_CREATE, fm10k_msg_err_pf),
 	FM10K_PF_MSG_ERR_HANDLER(LPORT_DELETE, fm10k_msg_err_pf),
 	FM10K_PF_MSG_UPDATE_PVID_HANDLER(fm10k_update_pvid),
+	FM10K_PF_MSG_1588_TIMESTAMP_HANDLER(fm10k_1588_msg_pf),
 	FM10K_TLV_MSG_ERROR_HANDLER(fm10k_mbx_error),
 };
 
@@ -1548,6 +1635,12 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		return -EIO;
 	}
 
+	/* assign BAR 4 resources for use with PTP */
+	if (fm10k_read_reg(hw, FM10K_CTRL) & FM10K_CTRL_BAR4_ALLOWED)
+		interface->sw_addr = ioremap(pci_resource_start(pdev, 4),
+					     pci_resource_len(pdev, 4));
+	hw->sw_addr = interface->sw_addr;
+
 	/* Only the PF can support VXLAN and NVGRE offloads */
 	if (hw->mac.type != fm10k_mac_pf) {
 		netdev->hw_enc_features = 0;
@@ -1564,6 +1657,9 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
 		    (unsigned long)interface);
 	INIT_WORK(&interface->service_task, fm10k_service_task);
 
+	/* Intitialize timestamp counters */
+	fm10k_ts_start_cc(interface);
+
 	/* set default ring sizes */
 	interface->tx_ring_count = FM10K_DEFAULT_TXD;
 	interface->rx_ring_count = FM10K_DEFAULT_RXD;
@@ -1715,6 +1811,9 @@ static int fm10k_probe(struct pci_dev *pdev,
 	/* stop all the transmit queues from transmitting until link is up */
 	netif_tx_stop_all_queues(netdev);
 
+	/* Register PTP interface */
+	fm10k_ptp_register(interface);
+
 	/* print bus type/speed/width info */
 	dev_info(&pdev->dev, "(PCI Express:%s Width: %s Payload: %s)\n",
 		 (hw->bus.speed == fm10k_bus_speed_8000 ? "8.0GT/s" :
@@ -1746,6 +1845,8 @@ err_register:
 err_mbx_interrupt:
 	fm10k_clear_queueing_scheme(interface);
 err_sw_init:
+	if (interface->sw_addr)
+		iounmap(interface->sw_addr);
 	iounmap(interface->uc_addr);
 err_ioremap:
 	free_netdev(netdev);
@@ -1779,6 +1880,10 @@ static void fm10k_remove(struct pci_dev *pdev)
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);
 
+	/* cleanup timestamp handling */
+	fm10k_ptp_unregister(interface);
+	fm10k_ts_stop_cc(interface);
+
 	/* release VFs */
 	fm10k_iov_disable(pdev);
 
@@ -1791,6 +1896,8 @@ static void fm10k_remove(struct pci_dev *pdev)
 	/* remove any debugfs interfaces */
 	fm10k_dbg_intfc_exit(interface);
 
+	if (interface->sw_addr)
+		iounmap(interface->sw_addr);
 	iounmap(interface->uc_addr);
 
 	free_netdev(netdev);
@@ -1847,6 +1954,9 @@ static int fm10k_resume(struct pci_dev *pdev)
 	/* reset statistics starting values */
 	hw->mac.ops.rebind_hw_stats(hw, &interface->stats);
 
+	/* reset clock */
+	fm10k_ts_reset_cc(interface);
+
 	rtnl_lock();
 
 	err = fm10k_init_queueing_scheme(interface);
@@ -2003,6 +2113,9 @@ static void fm10k_io_resume(struct pci_dev *pdev)
 	/* reassociate interrupts */
 	fm10k_mbx_request_irq(interface);
 
+	/* reset clock */
+	fm10k_ts_reset_cc(interface);
+
 	if (netif_running(netdev))
 		err = fm10k_open(netdev);
 
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
new file mode 100644
index 0000000..41da724
--- /dev/null
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
@@ -0,0 +1,535 @@
+/* Intel Ethernet Switch Host Interface Driver
+ * Copyright(c) 2013 - 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+
+#include <linux/ptp_classify.h>
+#include <linux/ptp_clock_kernel.h>
+
+#include "fm10k.h"
+
+/* We use a 64b counter so overflow is extremely seldom.  Just
+ * to keep things sane we should check for overflow once per day
+ */
+#define FM10K_SYSTIME_OVERFLOW_PERIOD	(HZ * 3600 * 24)
+#define FM10K_TS_TX_TIMEOUT		(HZ * 15)
+
+static cycle_t fm10k_cc_read_pf(const struct cyclecounter *cc)
+{
+	struct fm10k_intfc *intfc = container_of(cc, struct fm10k_intfc, cc);
+	struct fm10k_hw *hw = &intfc->hw;
+	u32 systime_l, systime_h, systime_tmp;
+
+	systime_h = fm10k_read_reg(hw, FM10K_SYSTIME + 1);
+
+	do {
+		systime_tmp = systime_h;
+		systime_l = fm10k_read_reg(hw, FM10K_SYSTIME);
+		systime_h = fm10k_read_reg(hw, FM10K_SYSTIME + 1);
+	} while (systime_tmp != systime_h);
+
+	return ((u64)systime_h << 32) | systime_l;
+}
+
+static cycle_t fm10k_cc_read_vf(const struct cyclecounter *cc)
+{
+	struct fm10k_intfc *intfc = container_of(cc, struct fm10k_intfc, cc);
+	struct fm10k_hw *hw = &intfc->hw;
+	u32 systime_l, systime_h, systime_tmp;
+
+	systime_h = fm10k_read_reg(hw, FM10K_VFSYSTIME + 1);
+
+	do {
+		systime_tmp = systime_h;
+		systime_l = fm10k_read_reg(hw, FM10K_VFSYSTIME);
+		systime_h = fm10k_read_reg(hw, FM10K_VFSYSTIME + 1);
+	} while (systime_tmp != systime_h);
+
+	return ((u64)systime_h << 32) | systime_l;
+}
+
+void fm10k_ts_cc_to_hwtstamp(struct fm10k_intfc *interface,
+			     struct skb_shared_hwtstamps *hwtstamp,
+			     cycle_t systime)
+{
+	unsigned long flags;
+	u64 ns;
+
+	read_lock_irqsave(&interface->tsreg_lock, flags);
+	ns = timecounter_cyc2time(&interface->tc, systime);
+	read_unlock_irqrestore(&interface->tsreg_lock, flags);
+
+	hwtstamp->hwtstamp = ns_to_ktime(ns);
+}
+
+static struct sk_buff *fm10k_ts_tx_skb(struct fm10k_intfc *interface,
+				       __le16 dglort)
+{
+	struct sk_buff_head *list = &interface->ts_tx_skb_queue;
+	struct sk_buff *skb;
+
+	skb_queue_walk(list, skb) {
+		if (FM10K_CB(skb)->fi.w.dglort == dglort)
+			return skb;
+	}
+
+	return NULL;
+}
+
+void fm10k_ts_tx_enqueue(struct fm10k_intfc *interface, struct sk_buff *skb)
+{
+	struct sk_buff_head *list = &interface->ts_tx_skb_queue;
+	struct sk_buff *clone;
+	unsigned long flags;
+	__le16 dglort;
+
+	/* create clone for us to return on the Tx path */
+	clone = skb_clone_sk(skb);
+	if (!clone)
+		return;
+
+	FM10K_CB(clone)->ts_tx_timeout = jiffies + FM10K_TS_TX_TIMEOUT;
+	dglort = FM10K_CB(clone)->fi.w.dglort;
+
+	spin_lock_irqsave(&list->lock, flags);
+
+	/* attempt to locate any buffers with the same dglort,
+	 * if none are present then insert skb in tail of list
+	 */
+	skb = fm10k_ts_tx_skb(interface, FM10K_CB(clone)->fi.w.dglort);
+	if (!skb)
+		__skb_queue_tail(list, clone);
+
+	spin_unlock_irqrestore(&list->lock, flags);
+
+	/* if list is already has one then we just free the clone */
+	if (skb)
+		kfree_skb(skb);
+	else
+		skb_shinfo(clone)->tx_flags |= SKBTX_IN_PROGRESS;
+}
+
+void fm10k_ts_tx_hwtstamp(struct fm10k_intfc *interface, __le16 dglort,
+			  cycle_t systime)
+{
+	struct skb_shared_hwtstamps shhwtstamps;
+	struct sk_buff_head *list = &interface->ts_tx_skb_queue;
+	struct sk_buff *skb;
+	unsigned long flags;
+
+	spin_lock_irqsave(&list->lock, flags);
+
+	/* attempt to locate and pull the sk_buff out of the list */
+	skb = fm10k_ts_tx_skb(interface, dglort);
+	if (skb)
+		__skb_unlink(skb, list);
+
+	spin_unlock_irqrestore(&list->lock, flags);
+
+	/* if not found do nothing */
+	if (!skb)
+		return;
+
+	/* timestamp the sk_buff and return it to the socket */
+	fm10k_ts_cc_to_hwtstamp(interface, &shhwtstamps, systime);
+	skb_complete_tx_timestamp(skb, &shhwtstamps);
+}
+
+static void fm10k_ts_overflow_check(struct work_struct *work)
+{
+	struct fm10k_intfc *interface = container_of(work,
+						     struct fm10k_intfc,
+						     ts_overflow_work.work);
+	unsigned long flags;
+	struct timespec ts;
+	u64 now;
+
+	read_lock_irqsave(&interface->tsreg_lock, flags);
+	now = timecounter_read(&interface->tc);
+	read_unlock_irqrestore(&interface->tsreg_lock, flags);
+
+	ts = ns_to_timespec(now);
+	pr_debug("fm10k overflow check at %lu.%09lu\n", ts.tv_sec, ts.tv_nsec);
+
+	schedule_delayed_work(&interface->ts_overflow_work,
+			      FM10K_SYSTIME_OVERFLOW_PERIOD);
+}
+
+void fm10k_ts_reset_cc(struct fm10k_intfc *interface)
+{
+	unsigned long flags;
+	struct timespec now;
+
+	getnstimeofday(&now);
+
+	/* reinitialize the clock */
+	write_lock_irqsave(&interface->tsreg_lock, flags);
+	timecounter_init(&interface->tc, &interface->cc, timespec_to_ns(&now));
+	write_unlock_irqrestore(&interface->tsreg_lock, flags);
+}
+
+void fm10k_ts_start_cc(struct fm10k_intfc *interface)
+{
+	struct fm10k_hw *hw = &interface->hw;
+
+	/* Initialize cycle counter */
+	interface->cc.read = (hw->mac.type == fm10k_mac_pf) ? fm10k_cc_read_pf :
+							      fm10k_cc_read_vf;
+	interface->cc.mask = CLOCKSOURCE_MASK(64);
+	interface->cc.mult = 1;
+
+	/* Initialize lock protecting register access */
+	rwlock_init(&interface->tsreg_lock);
+
+	/* Initialize skb queue for pending timestamp requests */
+	skb_queue_head_init(&interface->ts_tx_skb_queue);
+
+	/* Initialize the clock */
+	fm10k_ts_reset_cc(interface);
+
+	/* Initialize the overflow work */
+	INIT_DELAYED_WORK(&interface->ts_overflow_work,
+			  fm10k_ts_overflow_check);
+	schedule_delayed_work(&interface->ts_overflow_work,
+			      FM10K_SYSTIME_OVERFLOW_PERIOD);
+}
+
+void fm10k_ts_stop_cc(struct fm10k_intfc *interface)
+{
+	/* remove any stale SKBs and free them */
+	skb_queue_purge(&interface->ts_tx_skb_queue);
+
+	/* shut down the overflow check */
+	cancel_delayed_work_sync(&interface->ts_overflow_work);
+}
+
+void fm10k_ts_tx_subtask(struct fm10k_intfc *interface)
+{
+	struct sk_buff_head *list = &interface->ts_tx_skb_queue;
+	struct sk_buff *skb, *tmp;
+	unsigned long flags;
+
+	/* If we're down or resetting, just bail */
+	if (test_bit(__FM10K_DOWN, &interface->state) ||
+	    test_bit(__FM10K_RESETTING, &interface->state))
+		return;
+
+	spin_lock_irqsave(&list->lock, flags);
+
+	/* walk though the list and flush any expired timestamp packets */
+	skb_queue_walk_safe(list, skb, tmp) {
+		if (!time_is_after_jiffies(FM10K_CB(skb)->ts_tx_timeout))
+			continue;
+		__skb_unlink(skb, list);
+		kfree_skb(skb);
+		interface->tx_hwtstamp_timeouts++;
+	}
+
+	spin_unlock_irqrestore(&list->lock, flags);
+}
+
+/**
+ * fm10k_get_ts_config - get current hardware timestamping configuration
+ * @netdev: network interface device structure
+ * @ifreq: ioctl data
+ *
+ * This function returns the current timestamping settings. Rather than
+ * attempt to deconstruct registers to fill in the values, simply keep a copy
+ * of the old settings around, and return a copy when requested.
+ */
+int fm10k_get_ts_config(struct net_device *netdev, struct ifreq *ifr)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct hwtstamp_config *config = &interface->ts_config;
+
+	return copy_to_user(ifr->ifr_data, config, sizeof(*config)) ?
+		-EFAULT : 0;
+}
+
+/**
+ * fm10k_set_ts_config - control hardware time stamping
+ * @netdev: network interface device structure
+ * @ifreq: ioctl data
+ *
+ * Outgoing time stamping can be enabled and disabled. Play nice and
+ * disable it when requested, although it shouldn't cause any overhead
+ * when no packet needs it. At most one packet in the queue may be
+ * marked for time stamping, otherwise it would be impossible to tell
+ * for sure to which packet the hardware time stamp belongs.
+ *
+ * Incoming time stamping has to be configured via the hardware
+ * filters. Not all combinations are supported, in particular event
+ * type has to be specified. Matching the kind of event packet is
+ * not supported, with the exception of "all V2 events regardless of
+ * level 2 or 4".
+ *
+ * Since hardware always timestamps Path delay packets when timestamping V2
+ * packets, regardless of the type specified in the register, only use V2
+ * Event mode. This more accurately tells the user what the hardware is going
+ * to do anyways.
+ */
+int fm10k_set_ts_config(struct net_device *netdev, struct ifreq *ifr)
+{
+	struct fm10k_intfc *interface = netdev_priv(netdev);
+	struct hwtstamp_config ts_config;
+
+	if (copy_from_user(&ts_config, ifr->ifr_data, sizeof(ts_config)))
+		return -EFAULT;
+
+	/* reserved for future extensions */
+	if (ts_config.flags)
+		return -EINVAL;
+
+	switch (ts_config.tx_type) {
+	case HWTSTAMP_TX_OFF:
+		break;
+	case HWTSTAMP_TX_ON:
+		/* we likely need some check here to see if this is supported */
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	switch (ts_config.rx_filter) {
+	case HWTSTAMP_FILTER_NONE:
+		interface->flags &= ~FM10K_FLAG_RX_TS_ENABLED;
+		break;
+	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+	case HWTSTAMP_FILTER_ALL:
+		interface->flags |= FM10K_FLAG_RX_TS_ENABLED;
+		ts_config.rx_filter = HWTSTAMP_FILTER_ALL;
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	/* save these settings for future reference */
+	interface->ts_config = ts_config;
+
+	return copy_to_user(ifr->ifr_data, &ts_config, sizeof(ts_config)) ?
+		-EFAULT : 0;
+}
+
+static int fm10k_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
+{
+	struct fm10k_intfc *interface =
+		container_of(ptp, struct fm10k_intfc, ptp_caps);
+	struct fm10k_hw *hw = &interface->hw;
+	int err;
+
+	err = hw->mac.ops.adjust_systime(hw, ppb);
+
+	/* the only error we should see is if the value is out of range */
+	return (err == FM10K_ERR_PARAM) ? -ERANGE : err;
+}
+
+static int fm10k_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+	struct fm10k_intfc *interface =
+		container_of(ptp, struct fm10k_intfc, ptp_caps);
+	unsigned long flags;
+
+	write_lock_irqsave(&interface->tsreg_lock, flags);
+
+	delta += timecounter_read(&interface->tc);
+	timecounter_init(&interface->tc, &interface->cc, delta);
+
+	write_unlock_irqrestore(&interface->tsreg_lock, flags);
+
+	return 0;
+}
+
+static int fm10k_ptp_gettime(struct ptp_clock_info *ptp, struct timespec *ts)
+{
+	struct fm10k_intfc *interface =
+		container_of(ptp, struct fm10k_intfc, ptp_caps);
+	unsigned long flags;
+	u64 now;
+
+	read_lock_irqsave(&interface->tsreg_lock, flags);
+
+	now = timecounter_read(&interface->tc);
+
+	read_unlock_irqrestore(&interface->tsreg_lock, flags);
+
+	*ts = ns_to_timespec(now);
+
+	return 0;
+}
+
+static int fm10k_ptp_settime(struct ptp_clock_info *ptp,
+			     const struct timespec *ts)
+{
+	struct fm10k_intfc *interface =
+		container_of(ptp, struct fm10k_intfc, ptp_caps);
+	unsigned long flags;
+	u64 ns = timespec_to_ns(ts);
+
+	write_lock_irqsave(&interface->tsreg_lock, flags);
+
+	timecounter_init(&interface->tc, &interface->cc, ns);
+
+	write_unlock_irqrestore(&interface->tsreg_lock, flags);
+
+	return 0;
+}
+
+static int fm10k_ptp_enable(struct ptp_clock_info *ptp,
+			    struct ptp_clock_request *rq, int on)
+{
+	struct fm10k_intfc *interface =
+		container_of(ptp, struct fm10k_intfc, ptp_caps);
+	struct ptp_clock_time *t = &rq->perout.period;
+	struct fm10k_hw *hw = &interface->hw;
+	u64 period;
+	u32 step;
+
+	/* we can only support periodic output */
+	if (rq->type != PTP_CLK_REQ_PEROUT)
+		return -EINVAL;
+
+	/* verify the requested channel is there */
+	if (rq->perout.index >= ptp->n_per_out)
+		return -EINVAL;
+
+	/* we simply cannot support the operation if we don't have BAR4 */
+	if (!hw->sw_addr)
+		return -ENOTSUPP;
+
+	/* we cannot enforce start time as there is no
+	 * mechanism for that in the hardware, we can only control
+	 * the period.
+	 */
+
+	/* we cannot support periods greater than 4 seconds due to reg limit */
+	if (t->sec > 4 || t->sec < 0)
+		return -ERANGE;
+
+	/* convert to unsigned 64b ns, verify we can put it in a 32b register */
+	period = t->sec * 1000000000LL + t->nsec;
+
+	/* determine the minimum size for period */
+	step = 2 * (fm10k_read_reg(hw, FM10K_SYSTIME_CFG) &
+		    FM10K_SYSTIME_CFG_STEP_MASK);
+
+	/* verify the value is in range supported by hardware */
+	if ((period && (period < step)) || (period > U32_MAX))
+		return -ERANGE;
+
+	/* notify hardware of request to being sending pulses */
+	fm10k_write_sw_reg(hw, FM10K_SW_SYSTIME_PULSE(rq->perout.index),
+			   (u32)period);
+
+	return 0;
+}
+
+static struct ptp_pin_desc fm10k_ptp_pd[2] = {
+	{
+		.name = "IEEE1588_PULSE0",
+		.index = 0,
+		.func = PTP_PF_PEROUT,
+		.chan = 0
+	},
+	{
+		.name = "IEEE1588_PULSE1",
+		.index = 1,
+		.func = PTP_PF_PEROUT,
+		.chan = 1
+	}
+};
+
+static int fm10k_ptp_verify(struct ptp_clock_info *ptp, unsigned int pin,
+			    enum ptp_pin_function func, unsigned int chan)
+{
+	/* verify the requested pin is there */
+	if (pin >= ptp->n_pins || !ptp->pin_config)
+		return -EINVAL;
+
+	/* enforce locked channels, no changing them */
+	if (chan != ptp->pin_config[pin].chan)
+		return -EINVAL;
+
+	/* we want to keep the functions locked as well */
+	if (func != ptp->pin_config[pin].func)
+		return -EINVAL;
+
+	return 0;
+}
+
+void fm10k_ptp_register(struct fm10k_intfc *interface)
+{
+	struct ptp_clock_info *ptp_caps = &interface->ptp_caps;
+	struct device *dev = &interface->pdev->dev;
+	struct ptp_clock *ptp_clock;
+
+	snprintf(ptp_caps->name, sizeof(ptp_caps->name),
+		 "%s", interface->netdev->name);
+	ptp_caps->owner		= THIS_MODULE;
+	ptp_caps->max_adj	= 976562; /* (2^30 / 2^40) * 10^9 */
+	ptp_caps->adjfreq	= fm10k_ptp_adjfreq;
+	ptp_caps->adjtime	= fm10k_ptp_adjtime;
+	ptp_caps->gettime	= fm10k_ptp_gettime;
+	ptp_caps->settime	= fm10k_ptp_settime;
+
+	/* provide pins if BAR4 is accessible */
+	if (interface->sw_addr) {
+		/* enable periodic outputs */
+		ptp_caps->n_per_out = 2;
+		ptp_caps->enable	= fm10k_ptp_enable;
+
+		/* enable clock pins */
+		ptp_caps->verify	= fm10k_ptp_verify;
+		ptp_caps->n_pins = 2;
+		ptp_caps->pin_config = fm10k_ptp_pd;
+	}
+
+	ptp_clock = ptp_clock_register(ptp_caps, dev);
+	if (IS_ERR(ptp_clock)) {
+		ptp_clock = NULL;
+		dev_err(dev, "ptp_clock_register failed\n");
+	} else {
+		dev_info(dev, "registered PHC device %s\n", ptp_caps->name);
+	}
+
+	interface->ptp_clock = ptp_clock;
+}
+
+void fm10k_ptp_unregister(struct fm10k_intfc *interface)
+{
+	struct ptp_clock *ptp_clock = interface->ptp_clock;
+	struct device *dev = &interface->pdev->dev;
+
+	if (!ptp_clock)
+		return;
+
+	interface->ptp_clock = NULL;
+
+	ptp_clock_unregister(ptp_clock);
+	dev_info(dev, "removed PHC %s\n", interface->ptp_caps.name);
+}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index 5055bef..dac5b79 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -225,11 +225,7 @@ struct fm10k_hw;
 #define FM10K_STATS_NODESC_DROP		0x3807
 
 /* Timesync registers */
-#define FM10K_RRTIME_CFG	0x3808
-#define FM10K_RRTIME_LIMIT(_n)	((_n) + 0x380C)
-#define FM10K_RRTIME_COUNT(_n)	((_n) + 0x3810)
 #define FM10K_SYSTIME		0x3814
-#define FM10K_SYSTIME0		0x3816
 #define FM10K_SYSTIME_CFG	0x3818
 #define FM10K_SYSTIME_CFG_STEP_MASK		0x0000000F
 
@@ -368,9 +364,6 @@ struct fm10k_hw;
 #define FM10K_VFITR(_n)		((_n) + 0x00060)
 
 /* Registers contained in BAR 4 for Switch management */
-#define FM10K_SW_SYSTIME_CFG	0x0224C
-#define FM10K_SW_SYSTIME_CFG_STEP_SHIFT		4
-#define FM10K_SW_SYSTIME_CFG_ADJUST_MASK	0xFF000000
 #define FM10K_SW_SYSTIME_ADJUST	0x0224D
 #define FM10K_SW_SYSTIME_ADJUST_MASK		0x3FFFFFFF
 #define FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE	0x80000000

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files
  2014-09-18 22:40 ` [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files Alexander Duyck
@ 2014-09-19  7:38   ` Richard Cochran
  2014-09-19 14:36     ` Alexander Duyck
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Cochran @ 2014-09-19  7:38 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On Thu, Sep 18, 2014 at 06:40:30PM -0400, Alexander Duyck wrote:

> +static s32 fm10k_adjust_systime_pf(struct fm10k_hw *hw, s32 ppb)
> +{
> +	u64 systime_adjust;
> +
> +	/* if sw_addr is not set we don't have switch register access */
> +	if (!hw->sw_addr)
> +		return ppb ? FM10K_ERR_PARAM : 0;
> +
> +	/* we must convert the value from parts per billion to parts per
> +	 * 2^48 cycles.  In addition we can only use the upper 30 bits of
> +	 * the value when making the change so that restricts us futher.
> +	 * The math for this is equivilent to ABS(pps) * 2^40 / 10 ^ 9,

Huh? Converting ppb to parts per 2^48 should be (ppb * 2^48 / 10^9),
shouldn't it?

Did you mean, "we must convert ... to parts per 2^40 cycles" ?

Also, the comment about using the upper 30 bits is not clear. Do you
mean that the hardware ignores the two least significant bits?

> +	 * the function below is roughly ABS(pps) * 2 ^ 31 / 5 ^ 9 which
> +	 * is as close as we can get to the exact value with numbers as
> +	 * small as possible.
> +	 */
> +	systime_adjust = (ppb < 0) ? -ppb : ppb;
> +	systime_adjust <<= 31;
> +	do_div(systime_adjust, 1953125);
> +
> +	/* verify the requested adjustment value is in range */
> +	if (systime_adjust > FM10K_SW_SYSTIME_ADJUST_MASK)
> +		return FM10K_ERR_PARAM;
> +
> +	if (ppb < 0)
> +		systime_adjust |= FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE;
> +
> +	fm10k_write_sw_reg(hw, FM10K_SW_SYSTIME_ADJUST, (u32)systime_adjust);
> +
> +	return 0;
> +}

Thanks,
Richard

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (28 preceding siblings ...)
  2014-09-18 22:40 ` [net-next PATCH 29/29] fm10k: Add support for PTP Alexander Duyck
@ 2014-09-19  7:55 ` Jiri Pirko
  2014-09-19 14:43   ` Alexander Duyck
  2014-09-19 10:57 ` Jamal Hadi Salim
  2014-09-19 23:52 ` Alexander Duyck
  31 siblings, 1 reply; 50+ messages in thread
From: Jiri Pirko @ 2014-09-19  7:55 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

Fri, Sep 19, 2014 at 12:35:37AM CEST, alexander.h.duyck@intel.com wrote:
>This patch series adds support for the FM10000 Ethernet switch host
>interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
>supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
>driver provides support for the host interface portion of the switch, both
>PF and VF.
>
>As the host interfaces are directly connected to the switch this results in
>some significant differences versus a standard network driver.  For example
>there is no PHY or MII on the device.  Since packets are delivered directly
>from the switch to the host interface these are unnecessary.  Otherwise most
>of the functionality is very similar to our other network drivers such as
>ixgbe or igb.  For example we support all the standard network offloads,
>jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.

I'm very happy to see this patchset in the wind. Great news! I have
couple of questions:

Do you also plan to introduce support for FM6000?

>From what I understand, there is one netdev instance for the whole switch (PF).
How can user get stats and info for particular ports? This topic was
discussed many times and I believe that general consensus is to have 1
netdev instance to represent one switch port (that is for example how we
do it in rocker driver).

Thanks.


>
>---
>
>Alexander Duyck (29):
>      fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver
>      fm10k: Add register defines and basic structures
>      fm10k: Add support for TLV message parsing and generation
>      fm10k: Add support for basic interaction with hardware
>      fm10k: Add support for mailbox
>      fm10k: Implement PF <-> SM mailbox operations
>      fm10k: Add support for PF
>      fm10k: Add support for configuring PF interface
>      fm10k: Add netdev
>      fm10k: Add support for L2 filtering
>      fm10k: Add support for ndo_open/stop
>      fm10k: Add interrupt support
>      fm10k: add support for Tx/Rx rings
>      fm10k: Add service task to handle delayed events
>      fm10k: Add Tx/Rx hardware ring bring-up/tear-down
>      fm10k: Add transmit and receive fastpath and interrupt handlers
>      fm10k: Add ethtool support
>      fm10k: Add support for PCI power management and error handling
>      fm10k: Add support for multiple queues
>      fm10k: Add support for netdev offloads
>      fm10k: Add support for MACVLAN acceleration
>      fm10k: Add support for PF <-> VF mailbox
>      fm10k: Add support for VF
>      fm10k: Add support for SR-IOV to PF core files
>      fm10k: Add support for SR-IOV to driver
>      fm10k: Add support for IEEE DCBx
>      fm10k: Add support for debugfs
>      fm10k: Add support for ptp to hw specific files
>      fm10k: Add support for PTP
>
>
> drivers/net/ethernet/intel/Kconfig               |   19 
> drivers/net/ethernet/intel/Makefile              |    1 
> drivers/net/ethernet/intel/fm10k/Makefile        |   33 
> drivers/net/ethernet/intel/fm10k/fm10k.h         |  532 +++++
> drivers/net/ethernet/intel/fm10k/fm10k_common.c  |  534 +++++
> drivers/net/ethernet/intel/fm10k/fm10k_common.h  |   65 +
> drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c   |  174 ++
> drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c |  259 +++
> drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 1069 +++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_iov.c     |  536 +++++
> drivers/net/ethernet/intel/fm10k/fm10k_main.c    | 1978 ++++++++++++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_mbx.c     | 2125 ++++++++++++++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_mbx.h     |  307 +++
> drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 1424 ++++++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_pci.c     | 2166 ++++++++++++++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_pf.c      | 1849 +++++++++++++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_pf.h      |  135 +
> drivers/net/ethernet/intel/fm10k/fm10k_ptp.c     |  535 +++++
> drivers/net/ethernet/intel/fm10k/fm10k_tlv.c     |  863 +++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_tlv.h     |  186 ++
> drivers/net/ethernet/intel/fm10k/fm10k_type.h    |  769 ++++++++
> drivers/net/ethernet/intel/fm10k/fm10k_vf.c      |  552 ++++++
> drivers/net/ethernet/intel/fm10k/fm10k_vf.h      |   78 +
> 23 files changed, 16189 insertions(+)
> create mode 100644 drivers/net/ethernet/intel/fm10k/Makefile
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_iov.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_main.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_type.h
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.c
> create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.h
>
>-- 
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (29 preceding siblings ...)
  2014-09-19  7:55 ` [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Jiri Pirko
@ 2014-09-19 10:57 ` Jamal Hadi Salim
  2014-09-19 14:54   ` Alexander Duyck
  2014-09-19 23:52 ` Alexander Duyck
  31 siblings, 1 reply; 50+ messages in thread
From: Jamal Hadi Salim @ 2014-09-19 10:57 UTC (permalink / raw)
  To: Alexander Duyck, davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

On 09/18/14 18:35, Alexander Duyck wrote:
> This patch series adds support for the FM10000 Ethernet switch host
> interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
> supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
> driver provides support for the host interface portion of the switch, both
> PF and VF.
>
> As the host interfaces are directly connected to the switch this results in
> some significant differences versus a standard network driver.  For example
> there is no PHY or MII on the device.  Since packets are delivered directly
> from the switch to the host interface these are unnecessary.  Otherwise most
> of the functionality is very similar to our other network drivers such as
> ixgbe or igb.  For example we support all the standard network offloads,
> jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.
>

First off Kudos to the intel folks. Hopefully this is one of many such
drivers coming - and lets hope this is as landscape-changing as the
e1000 was ;-> Broadcom smell that Tim Horton's coffee.

You described this as a switch but then expose it as a glorified nic.
Assuming more goodstuff(tm) coming?

cheers,
jamal

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files
  2014-09-19  7:38   ` Richard Cochran
@ 2014-09-19 14:36     ` Alexander Duyck
  2014-09-19 15:19       ` Richard Cochran
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 14:36 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/19/2014 12:38 AM, Richard Cochran wrote:
> On Thu, Sep 18, 2014 at 06:40:30PM -0400, Alexander Duyck wrote:
> 
>> +static s32 fm10k_adjust_systime_pf(struct fm10k_hw *hw, s32 ppb)
>> +{
>> +	u64 systime_adjust;
>> +
>> +	/* if sw_addr is not set we don't have switch register access */
>> +	if (!hw->sw_addr)
>> +		return ppb ? FM10K_ERR_PARAM : 0;
>> +
>> +	/* we must convert the value from parts per billion to parts per
>> +	 * 2^48 cycles.  In addition we can only use the upper 30 bits of
>> +	 * the value when making the change so that restricts us futher.
>> +	 * The math for this is equivilent to ABS(pps) * 2^40 / 10 ^ 9,
> 
> Huh? Converting ppb to parts per 2^48 should be (ppb * 2^48 / 10^9),
> shouldn't it?
> 
> Did you mean, "we must convert ... to parts per 2^40 cycles" ?
> 
> Also, the comment about using the upper 30 bits is not clear. Do you
> mean that the hardware ignores the two least significant bits?

So to break it all down.

1.  The hardware provides a value that can be specified in parts per
2^48, however that value spans across 2 registers starting at bit 24 in
the first register.  You can actually make out a bit more if you look at
the end of the patch.  The layout for the registers look something like
this:

 /* Registers contained in BAR 4 for Switch management */
 #define FM10K_SW_SYSTIME_CFG	0x0224C
 #define FM10K_SW_SYSTIME_CFG_ADJUST_MASK	0xFF000000
 #define FM10K_SW_SYSTIME_ADJUST	0x0224D
 #define FM10K_SW_SYSTIME_ADJUST_MASK		0x3FFFFFFF
 #define FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE	0x80000000

2.  The value is in 2 ^ 48 units, and the OS wants to give us a value in
parts per billion, not parts per 256 * 2 ^ 40.  So in order to simplify
things I dropped the lower 8 bits and only update
FM10K_SW_SYSTIME_ADJUST register giving me an adjust value multiplied by
256.   So the adjustment value I am using now is:
	(256 / (256 * (2 ^ 40)))
		this simplifies down to
	1 / (2 ^ 40)

The value 1 / (2 ^ 40) implies that we are still looking at something
close to parts per trillion, however since I am now only adjusting one
register instead of 2 it meets my needs since those lower 8 bits are an
even smaller part of the whole.

3.  The last bit in the adjustment is to deal with the the fact that we
have a power of 2, not a power of 10 and the adjustments are in parts
per billion.  So we would need to convert by 1024 ^ 3 / 1000 ^ 3.  So
the whole thing broken down would be:
	value / (10 ^ 9) == adj_val / (2 ^ 40)
		so if I solve for adj_val
	value * (2 ^ 40) / (10 ^ 9) == adj_val
		then reduce it by dropping the shared values of 2
	value * (2 ^ 31) / (5 ^ 9) == adj_val

Hopefully that explains how I came to that value.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-19  7:55 ` [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Jiri Pirko
@ 2014-09-19 14:43   ` Alexander Duyck
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 14:43 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/19/2014 12:55 AM, Jiri Pirko wrote:
> Fri, Sep 19, 2014 at 12:35:37AM CEST, alexander.h.duyck@intel.com wrote:
>> This patch series adds support for the FM10000 Ethernet switch host
>> interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
>> supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
>> driver provides support for the host interface portion of the switch, both
>> PF and VF.
>>
>> As the host interfaces are directly connected to the switch this results in
>> some significant differences versus a standard network driver.  For example
>> there is no PHY or MII on the device.  Since packets are delivered directly
>>from the switch to the host interface these are unnecessary.  Otherwise most
>> of the functionality is very similar to our other network drivers such as
>> ixgbe or igb.  For example we support all the standard network offloads,
>> jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.
> 
> I'm very happy to see this patchset in the wind. Great news! I have
> couple of questions:
> 
> Do you also plan to introduce support for FM6000?
> 
> From what I understand, there is one netdev instance for the whole switch (PF).
> How can user get stats and info for particular ports? This topic was
> discussed many times and I believe that general consensus is to have 1
> netdev instance to represent one switch port (that is for example how we
> do it in rocker driver).
> 
> Thanks.
> 

Just to be clear this isn't the entire switch.  This is only the host
interface that this driver supports.  The host interface is essentially
just an embedded NIC on the switch.

In regards to your question about the netdev instances for the ports
that would be up to the switch driver.  We are still debating some of
the implementation details of that.  For example we are still trying to
decide if the switch should be a part of the fm10k driver or if it
should be a separate driver onto itself similar to how the DSA driver
model works.  My preference is the latter, but that isn't my decision as
I am only responsible for the host interface.

As far as the FM6000 goes I can't say.  I don't have visibility into
what the team responsible for that switch plans to do about their host
interface.  It would be nice to have both switches end up following the
same model, but I can't speak for that team.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-19 10:57 ` Jamal Hadi Salim
@ 2014-09-19 14:54   ` Alexander Duyck
  2014-09-19 16:58     ` Alexei Starovoitov
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 14:54 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: nhorman, netdev, john.fastabend, matthew.vick, jeffrey.t.kirsher,
	sassmann

On 09/19/2014 03:57 AM, Jamal Hadi Salim wrote:
> On 09/18/14 18:35, Alexander Duyck wrote:
>> This patch series adds support for the FM10000 Ethernet switch host
>> interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet
>> switch
>> supporting both Ethernet ports and PCI Express host interfaces.  The
>> fm10k
>> driver provides support for the host interface portion of the switch,
>> both
>> PF and VF.
>>
>> As the host interfaces are directly connected to the switch this
>> results in
>> some significant differences versus a standard network driver.  For
>> example
>> there is no PHY or MII on the device.  Since packets are delivered
>> directly
>> from the switch to the host interface these are unnecessary. 
>> Otherwise most
>> of the functionality is very similar to our other network drivers such as
>> ixgbe or igb.  For example we support all the standard network offloads,
>> jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.
>>
> 
> First off Kudos to the intel folks. Hopefully this is one of many such
> drivers coming - and lets hope this is as landscape-changing as the
> e1000 was ;-> Broadcom smell that Tim Horton's coffee.

Thanks.  Though just to be clear it is just the host interface, not the
entire switch that this driver supports.

> You described this as a switch but then expose it as a glorified nic.
> Assuming more goodstuff(tm) coming?

Since the switch can support up to 9 of these we are treating the host
interface driver as though it is a NIC driver.  We need to usually ship
NIC drivers well ahead of the silicon so that it can make it into the
distributions before the silicon is actually released, thus avoiding the
need for a driver disk to do a network install on a system with one of
these on board.

I don't have any hard dates on when anything else related to the switch
might be released.  That is being managed by another team and I don't
have visibility to that information.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files
  2014-09-19 14:36     ` Alexander Duyck
@ 2014-09-19 15:19       ` Richard Cochran
  2014-09-19 15:34         ` Alexander Duyck
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Cochran @ 2014-09-19 15:19 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On Fri, Sep 19, 2014 at 07:36:56AM -0700, Alexander Duyck wrote:
> On 09/19/2014 12:38 AM, Richard Cochran wrote:
> > On Thu, Sep 18, 2014 at 06:40:30PM -0400, Alexander Duyck wrote:
> > 
> >> +static s32 fm10k_adjust_systime_pf(struct fm10k_hw *hw, s32 ppb)
> >> +{
> >> +	u64 systime_adjust;
> >> +
> >> +	/* if sw_addr is not set we don't have switch register access */
> >> +	if (!hw->sw_addr)
> >> +		return ppb ? FM10K_ERR_PARAM : 0;
> >> +
> >> +	/* we must convert the value from parts per billion to parts per
> >> +	 * 2^48 cycles.  In addition we can only use the upper 30 bits of
> >> +	 * the value when making the change so that restricts us futher.
> >> +	 * The math for this is equivilent to ABS(pps) * 2^40 / 10 ^ 9,

...

> 2.  The value is in 2 ^ 48 units, and the OS wants to give us a value in
> parts per billion, not parts per 256 * 2 ^ 40.  So in order to simplify
> things I dropped the lower 8 bits and only update

So what does the comment in the code about "upper 30 bits" mean?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files
  2014-09-19 15:19       ` Richard Cochran
@ 2014-09-19 15:34         ` Alexander Duyck
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 15:34 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/19/2014 08:19 AM, Richard Cochran wrote:
> On Fri, Sep 19, 2014 at 07:36:56AM -0700, Alexander Duyck wrote:
>> On 09/19/2014 12:38 AM, Richard Cochran wrote:
>>> On Thu, Sep 18, 2014 at 06:40:30PM -0400, Alexander Duyck wrote:
>>>
>>>> +static s32 fm10k_adjust_systime_pf(struct fm10k_hw *hw, s32 ppb)
>>>> +{
>>>> +	u64 systime_adjust;
>>>> +
>>>> +	/* if sw_addr is not set we don't have switch register access */
>>>> +	if (!hw->sw_addr)
>>>> +		return ppb ? FM10K_ERR_PARAM : 0;
>>>> +
>>>> +	/* we must convert the value from parts per billion to parts per
>>>> +	 * 2^48 cycles.  In addition we can only use the upper 30 bits of
>>>> +	 * the value when making the change so that restricts us futher.
>>>> +	 * The math for this is equivilent to ABS(pps) * 2^40 / 10 ^ 9,
> 
> ...
> 
>> 2.  The value is in 2 ^ 48 units, and the OS wants to give us a value in
>> parts per billion, not parts per 256 * 2 ^ 40.  So in order to simplify
>> things I dropped the lower 8 bits and only update
> 
> So what does the comment in the code about "upper 30 bits" mean?
> 

The value is measured in parts per 2^48 units, but the field we are
given to provide adjustment units is 38 bits wide, with the 30 most
significant bits in the main adjustment register, and the 8 least
significant bits in another register.  The direction is a separate bit
at bit 31 in the upper register with a 1 bit gap between it and the value.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-19 14:54   ` Alexander Duyck
@ 2014-09-19 16:58     ` Alexei Starovoitov
  2014-09-19 18:22       ` Alexander Duyck
  0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2014-09-19 16:58 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jamal Hadi Salim, David S. Miller, nhorman, netdev,
	john fastabend, matthew.vick, jeffrey.t.kirsher, sassmann

On Fri, Sep 19, 2014 at 7:54 AM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
>
> Since the switch can support up to 9 of these we are treating the host
> interface driver as though it is a NIC driver.  We need to usually ship

Am I reading it right that 9 hosts can be attached to this one physical device
and they see it as different 'physical' nics ?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-18 22:40 ` [net-next PATCH 29/29] fm10k: Add support for PTP Alexander Duyck
@ 2014-09-19 17:35   ` Richard Cochran
  2014-09-19 18:32     ` Alexander Duyck
  2014-09-22 11:03     ` David Laight
  0 siblings, 2 replies; 50+ messages in thread
From: Richard Cochran @ 2014-09-19 17:35 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On Thu, Sep 18, 2014 at 06:40:46PM -0400, Alexander Duyck wrote:

> +static s32 fm10k_1588_msg_vf(struct fm10k_hw *hw, u32 **results,
> +			     struct fm10k_mbx_info *mbx)
> +{
> +	struct fm10k_intfc *interface = container_of(hw,
> +						     struct fm10k_intfc,
> +						     hw);

This looks really funny to me here and in the other spot. Why not this?

	struct fm10k_intfc *interface = container_of(hw, struct fm10k_intfc, hw);

Its only one over the 80 km/h speed limit.

> +	u64 timestamp;
> +	s32 err;
> +
> +	err = fm10k_tlv_attr_get_u64(results[FM10K_1588_MSG_TIMESTAMP],
> +				     &timestamp);
> +	if (err)
> +		return err;
> +
> +	fm10k_ts_tx_hwtstamp(interface, 0, timestamp);
> +
> +	return 0;
> +}

> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> new file mode 100644
> index 0000000..41da724
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c

> +/* We use a 64b counter so overflow is extremely seldom.  Just
> + * to keep things sane we should check for overflow once per day
> + */

Hm...

> +void fm10k_ts_start_cc(struct fm10k_intfc *interface)
> +{
> +	struct fm10k_hw *hw = &interface->hw;
> +
> +	/* Initialize cycle counter */
> +	interface->cc.read = (hw->mac.type == fm10k_mac_pf) ? fm10k_cc_read_pf :
> +							      fm10k_cc_read_vf;
> +	interface->cc.mask = CLOCKSOURCE_MASK(64);
> +	interface->cc.mult = 1;

So shift = 0 and multi = 1. Your clock counts nanoseconds. Why not use
it directly? Then you won't need the timecounter stuff or the overflow
watchdog either.

> +
> +	/* Initialize lock protecting register access */
> +	rwlock_init(&interface->tsreg_lock);
> +
> +	/* Initialize skb queue for pending timestamp requests */
> +	skb_queue_head_init(&interface->ts_tx_skb_queue);
> +
> +	/* Initialize the clock */
> +	fm10k_ts_reset_cc(interface);
> +
> +	/* Initialize the overflow work */
> +	INIT_DELAYED_WORK(&interface->ts_overflow_work,
> +			  fm10k_ts_overflow_check);
> +	schedule_delayed_work(&interface->ts_overflow_work,
> +			      FM10K_SYSTIME_OVERFLOW_PERIOD);
> +}

> +static int fm10k_ptp_enable(struct ptp_clock_info *ptp,
> +			    struct ptp_clock_request *rq, int on)
> +{
> +	struct fm10k_intfc *interface =
> +		container_of(ptp, struct fm10k_intfc, ptp_caps);
> +	struct ptp_clock_time *t = &rq->perout.period;
> +	struct fm10k_hw *hw = &interface->hw;
> +	u64 period;
> +	u32 step;
> +
> +	/* we can only support periodic output */
> +	if (rq->type != PTP_CLK_REQ_PEROUT)
> +		return -EINVAL;
> +
> +	/* verify the requested channel is there */
> +	if (rq->perout.index >= ptp->n_per_out)
> +		return -EINVAL;
> +
> +	/* we simply cannot support the operation if we don't have BAR4 */
> +	if (!hw->sw_addr)
> +		return -ENOTSUPP;
> +
> +	/* we cannot enforce start time as there is no
> +	 * mechanism for that in the hardware, we can only control
> +	 * the period.
> +	 */

Is this because of the timecounter in the way? Another reason to use
the 64 bit nanosecond counter directly.

> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
> index 5055bef..dac5b79 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
> @@ -225,11 +225,7 @@ struct fm10k_hw;
>  #define FM10K_STATS_NODESC_DROP		0x3807
>  
>  /* Timesync registers */
> -#define FM10K_RRTIME_CFG	0x3808
> -#define FM10K_RRTIME_LIMIT(_n)	((_n) + 0x380C)
> -#define FM10K_RRTIME_COUNT(_n)	((_n) + 0x3810)
>  #define FM10K_SYSTIME		0x3814
> -#define FM10K_SYSTIME0		0x3816
>  #define FM10K_SYSTIME_CFG	0x3818
>  #define FM10K_SYSTIME_CFG_STEP_MASK		0x0000000F
>  
> @@ -368,9 +364,6 @@ struct fm10k_hw;
>  #define FM10K_VFITR(_n)		((_n) + 0x00060)
>  
>  /* Registers contained in BAR 4 for Switch management */
> -#define FM10K_SW_SYSTIME_CFG	0x0224C
> -#define FM10K_SW_SYSTIME_CFG_STEP_SHIFT		4
> -#define FM10K_SW_SYSTIME_CFG_ADJUST_MASK	0xFF000000

You added these three lines in the previous patch.

>  #define FM10K_SW_SYSTIME_ADJUST	0x0224D
>  #define FM10K_SW_SYSTIME_ADJUST_MASK		0x3FFFFFFF
>  #define FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE	0x80000000
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks,
Richard

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-19 16:58     ` Alexei Starovoitov
@ 2014-09-19 18:22       ` Alexander Duyck
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 18:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jamal Hadi Salim, David S. Miller, nhorman, netdev,
	john fastabend, matthew.vick, jeffrey.t.kirsher, sassmann

On 09/19/2014 09:58 AM, Alexei Starovoitov wrote:
> On Fri, Sep 19, 2014 at 7:54 AM, Alexander Duyck
> <alexander.h.duyck@intel.com> wrote:
>>
>> Since the switch can support up to 9 of these we are treating the host
>> interface driver as though it is a NIC driver.  We need to usually ship
> 
> Am I reading it right that 9 hosts can be attached to this one physical device
> and they see it as different 'physical' nics ?
> 

No, there are up to 9 independent host interfaces that are attached to
the switch.  Each host device shows up as a separate PCI device capable
of SR-IOV.  If SR-IOV is enabled with 64 VFs then one Host interface
will appear as 1 PF PCI device, and 64 VF PCI devices.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-19 17:35   ` Richard Cochran
@ 2014-09-19 18:32     ` Alexander Duyck
  2014-09-20 21:07       ` Richard Cochran
  2014-09-20 21:16       ` Richard Cochran
  2014-09-22 11:03     ` David Laight
  1 sibling, 2 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 18:32 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/19/2014 10:35 AM, Richard Cochran wrote:
> On Thu, Sep 18, 2014 at 06:40:46PM -0400, Alexander Duyck wrote:
> 
>> +static s32 fm10k_1588_msg_vf(struct fm10k_hw *hw, u32 **results,
>> +			     struct fm10k_mbx_info *mbx)
>> +{
>> +	struct fm10k_intfc *interface = container_of(hw,
>> +						     struct fm10k_intfc,
>> +						     hw);
> 
> This looks really funny to me here and in the other spot. Why not this?
> 
> 	struct fm10k_intfc *interface = container_of(hw, struct fm10k_intfc, hw);
> 

Because doing it that way it extends over 80 characters.

> Its only one over the 80 km/h speed limit.
> 
>> +	u64 timestamp;
>> +	s32 err;
>> +
>> +	err = fm10k_tlv_attr_get_u64(results[FM10K_1588_MSG_TIMESTAMP],
>> +				     &timestamp);
>> +	if (err)
>> +		return err;
>> +
>> +	fm10k_ts_tx_hwtstamp(interface, 0, timestamp);
>> +
>> +	return 0;
>> +}
> 
>> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
>> new file mode 100644
>> index 0000000..41da724
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> 
>> +/* We use a 64b counter so overflow is extremely seldom.  Just
>> + * to keep things sane we should check for overflow once per day
>> + */
> 
> Hm...
> 
>> +void fm10k_ts_start_cc(struct fm10k_intfc *interface)
>> +{
>> +	struct fm10k_hw *hw = &interface->hw;
>> +
>> +	/* Initialize cycle counter */
>> +	interface->cc.read = (hw->mac.type == fm10k_mac_pf) ? fm10k_cc_read_pf :
>> +							      fm10k_cc_read_vf;
>> +	interface->cc.mask = CLOCKSOURCE_MASK(64);
>> +	interface->cc.mult = 1;
> 
> So shift = 0 and multi = 1. Your clock counts nanoseconds. Why not use
> it directly? Then you won't need the timecounter stuff or the overflow
> watchdog either.

Because the value cannot be adjusted directly.  The timesource for the
switch is shared by all ports and host interfaces.  As such we have to
maintain a software offset per host to avoid corrupting the other clocks.

>> +
>> +	/* Initialize lock protecting register access */
>> +	rwlock_init(&interface->tsreg_lock);
>> +
>> +	/* Initialize skb queue for pending timestamp requests */
>> +	skb_queue_head_init(&interface->ts_tx_skb_queue);
>> +
>> +	/* Initialize the clock */
>> +	fm10k_ts_reset_cc(interface);
>> +
>> +	/* Initialize the overflow work */
>> +	INIT_DELAYED_WORK(&interface->ts_overflow_work,
>> +			  fm10k_ts_overflow_check);
>> +	schedule_delayed_work(&interface->ts_overflow_work,
>> +			      FM10K_SYSTIME_OVERFLOW_PERIOD);
>> +}
> 
>> +static int fm10k_ptp_enable(struct ptp_clock_info *ptp,
>> +			    struct ptp_clock_request *rq, int on)
>> +{
>> +	struct fm10k_intfc *interface =
>> +		container_of(ptp, struct fm10k_intfc, ptp_caps);
>> +	struct ptp_clock_time *t = &rq->perout.period;
>> +	struct fm10k_hw *hw = &interface->hw;
>> +	u64 period;
>> +	u32 step;
>> +
>> +	/* we can only support periodic output */
>> +	if (rq->type != PTP_CLK_REQ_PEROUT)
>> +		return -EINVAL;
>> +
>> +	/* verify the requested channel is there */
>> +	if (rq->perout.index >= ptp->n_per_out)
>> +		return -EINVAL;
>> +
>> +	/* we simply cannot support the operation if we don't have BAR4 */
>> +	if (!hw->sw_addr)
>> +		return -ENOTSUPP;
>> +
>> +	/* we cannot enforce start time as there is no
>> +	 * mechanism for that in the hardware, we can only control
>> +	 * the period.
>> +	 */
> 
> Is this because of the timecounter in the way? Another reason to use
> the 64 bit nanosecond counter directly.

The problem is we cannot modify the counter as it is in use by multiple
clocks.  So if one adjusted the value it would mess up all of the others.

Also in our case the hardware design doesn't give us a register we can
plug the start time into.  We can only control the frequency of the pulse.

>> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
>> index 5055bef..dac5b79 100644
>> --- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
>> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
>> @@ -225,11 +225,7 @@ struct fm10k_hw;
>>  #define FM10K_STATS_NODESC_DROP		0x3807
>>  
>>  /* Timesync registers */
>> -#define FM10K_RRTIME_CFG	0x3808
>> -#define FM10K_RRTIME_LIMIT(_n)	((_n) + 0x380C)
>> -#define FM10K_RRTIME_COUNT(_n)	((_n) + 0x3810)
>>  #define FM10K_SYSTIME		0x3814
>> -#define FM10K_SYSTIME0		0x3816
>>  #define FM10K_SYSTIME_CFG	0x3818
>>  #define FM10K_SYSTIME_CFG_STEP_MASK		0x0000000F
>>  
>> @@ -368,9 +364,6 @@ struct fm10k_hw;
>>  #define FM10K_VFITR(_n)		((_n) + 0x00060)
>>  
>>  /* Registers contained in BAR 4 for Switch management */
>> -#define FM10K_SW_SYSTIME_CFG	0x0224C
>> -#define FM10K_SW_SYSTIME_CFG_STEP_SHIFT		4
>> -#define FM10K_SW_SYSTIME_CFG_ADJUST_MASK	0xFF000000
> 
> You added these three lines in the previous patch.
> 
>>  #define FM10K_SW_SYSTIME_ADJUST	0x0224D
>>  #define FM10K_SW_SYSTIME_ADJUST_MASK		0x3FFFFFFF
>>  #define FM10K_SW_SYSTIME_ADJUST_DIR_NEGATIVE	0x80000000
>>

Yeah, I noticed that after you asked your original question.  When I had
gone through to clean up the defines that weren't used I cleaned these
up in the wrong patch.

If/when I submit a v2 I will pull them out of the previous patch in the
sequence since that is where they are added.  I already have this
resolved in my local copy as of an hour or so ago.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
                   ` (30 preceding siblings ...)
  2014-09-19 10:57 ` Jamal Hadi Salim
@ 2014-09-19 23:52 ` Alexander Duyck
  2014-09-20  2:03   ` David Miller
  31 siblings, 1 reply; 50+ messages in thread
From: Alexander Duyck @ 2014-09-19 23:52 UTC (permalink / raw)
  To: davem
  Cc: Alexander Duyck, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

Dave,

I was wondering what changes you wanted for these patches?  The only
feedback that I have gotten so far was on the last 2 patches, and a one
liner about dev_consume_skb_any vs dev_kfree_skb_any.  Is there
something I missed, or do I need  to resubmit the entire set with some
other sort of fix?

I was hoping you could just pull the first 27, and then I could resubmit
the last 2 with an attempt at addressing formatting/comment fixes based
on Richard's comments as well as a small patch for the
dev_consume_skb_any change needed in the transmit cleanup.

Thanks,

Alex

On 09/18/2014 03:35 PM, Alexander Duyck wrote:
> This patch series adds support for the FM10000 Ethernet switch host
> interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
> supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
> driver provides support for the host interface portion of the switch, both
> PF and VF.
>
> As the host interfaces are directly connected to the switch this results in
> some significant differences versus a standard network driver.  For example
> there is no PHY or MII on the device.  Since packets are delivered directly
> from the switch to the host interface these are unnecessary.  Otherwise most
> of the functionality is very similar to our other network drivers such as
> ixgbe or igb.  For example we support all the standard network offloads,
> jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.
>
> ---
>
> Alexander Duyck (29):
>       fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver
>       fm10k: Add register defines and basic structures
>       fm10k: Add support for TLV message parsing and generation
>       fm10k: Add support for basic interaction with hardware
>       fm10k: Add support for mailbox
>       fm10k: Implement PF <-> SM mailbox operations
>       fm10k: Add support for PF
>       fm10k: Add support for configuring PF interface
>       fm10k: Add netdev
>       fm10k: Add support for L2 filtering
>       fm10k: Add support for ndo_open/stop
>       fm10k: Add interrupt support
>       fm10k: add support for Tx/Rx rings
>       fm10k: Add service task to handle delayed events
>       fm10k: Add Tx/Rx hardware ring bring-up/tear-down
>       fm10k: Add transmit and receive fastpath and interrupt handlers
>       fm10k: Add ethtool support
>       fm10k: Add support for PCI power management and error handling
>       fm10k: Add support for multiple queues
>       fm10k: Add support for netdev offloads
>       fm10k: Add support for MACVLAN acceleration
>       fm10k: Add support for PF <-> VF mailbox
>       fm10k: Add support for VF
>       fm10k: Add support for SR-IOV to PF core files
>       fm10k: Add support for SR-IOV to driver
>       fm10k: Add support for IEEE DCBx
>       fm10k: Add support for debugfs
>       fm10k: Add support for ptp to hw specific files
>       fm10k: Add support for PTP
>
>
>  drivers/net/ethernet/intel/Kconfig               |   19 
>  drivers/net/ethernet/intel/Makefile              |    1 
>  drivers/net/ethernet/intel/fm10k/Makefile        |   33 
>  drivers/net/ethernet/intel/fm10k/fm10k.h         |  532 +++++
>  drivers/net/ethernet/intel/fm10k/fm10k_common.c  |  534 +++++
>  drivers/net/ethernet/intel/fm10k/fm10k_common.h  |   65 +
>  drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c   |  174 ++
>  drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c |  259 +++
>  drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 1069 +++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_iov.c     |  536 +++++
>  drivers/net/ethernet/intel/fm10k/fm10k_main.c    | 1978 ++++++++++++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_mbx.c     | 2125 ++++++++++++++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_mbx.h     |  307 +++
>  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 1424 ++++++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_pci.c     | 2166 ++++++++++++++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_pf.c      | 1849 +++++++++++++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_pf.h      |  135 +
>  drivers/net/ethernet/intel/fm10k/fm10k_ptp.c     |  535 +++++
>  drivers/net/ethernet/intel/fm10k/fm10k_tlv.c     |  863 +++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_tlv.h     |  186 ++
>  drivers/net/ethernet/intel/fm10k/fm10k_type.h    |  769 ++++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_vf.c      |  552 ++++++
>  drivers/net/ethernet/intel/fm10k/fm10k_vf.h      |   78 +
>  23 files changed, 16189 insertions(+)
>  create mode 100644 drivers/net/ethernet/intel/fm10k/Makefile
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_common.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_iov.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_main.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_mbx.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pci.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_pf.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_tlv.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_type.h
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.c
>  create mode 100644 drivers/net/ethernet/intel/fm10k/fm10k_vf.h
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface
  2014-09-19 23:52 ` Alexander Duyck
@ 2014-09-20  2:03   ` David Miller
  0 siblings, 0 replies; 50+ messages in thread
From: David Miller @ 2014-09-20  2:03 UTC (permalink / raw)
  To: alexander.duyck
  Cc: alexander.h.duyck, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Fri, 19 Sep 2014 16:52:32 -0700

> I was wondering what changes you wanted for these patches?  The only
> feedback that I have gotten so far was on the last 2 patches, and a one
> liner about dev_consume_skb_any vs dev_kfree_skb_any.  Is there
> something I missed, or do I need  to resubmit the entire set with some
> other sort of fix?

Any time there is any change requested or intended adjustment, I mark
the entire series as changes requested, and you are expected to make
the adjustments and resubmit the entire series.

I do this silently to work my queue as efficiently as possible, and yet
I still sometimes fall very far behind and get backlogged.

Please do not ask me to repeat what's already been stated in the patch
followup postings as I'll fall even further behind.

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-19 18:32     ` Alexander Duyck
@ 2014-09-20 21:07       ` Richard Cochran
  2014-09-20 21:37         ` Joe Perches
  2014-09-20 21:16       ` Richard Cochran
  1 sibling, 1 reply; 50+ messages in thread
From: Richard Cochran @ 2014-09-20 21:07 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On Fri, Sep 19, 2014 at 11:32:24AM -0700, Alexander Duyck wrote:
> Because doing it that way it extends over 80 characters.

Obviously, but still the result is really ugly.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-19 18:32     ` Alexander Duyck
  2014-09-20 21:07       ` Richard Cochran
@ 2014-09-20 21:16       ` Richard Cochran
  2014-09-20 23:36         ` Alexander Duyck
  1 sibling, 1 reply; 50+ messages in thread
From: Richard Cochran @ 2014-09-20 21:16 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On Fri, Sep 19, 2014 at 11:32:24AM -0700, Alexander Duyck wrote:
> 
> Because the value cannot be adjusted directly.  The timesource for the
> switch is shared by all ports and host interfaces.  As such we have to
> maintain a software offset per host to avoid corrupting the other clocks.

So any host can set the time, but only one may adjust the frequency?

Regarding the overflow, probably you can remove the daily check, since
you only need a single clock read every 300 years, and that is
reasonable to expect in any case.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-20 21:07       ` Richard Cochran
@ 2014-09-20 21:37         ` Joe Perches
  0 siblings, 0 replies; 50+ messages in thread
From: Joe Perches @ 2014-09-20 21:37 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Alexander Duyck, davem, nhorman, netdev, john.fastabend,
	matthew.vick, jeffrey.t.kirsher, sassmann

On Sat, 2014-09-20 at 23:07 +0200, Richard Cochran wrote:
> On Fri, Sep 19, 2014 at 11:32:24AM -0700, Alexander Duyck wrote:
> > Because doing it that way it extends over 80 characters.
> 
> Obviously, but still the result is really ugly.

True.

It might be reasonable to add a new container_of_ptr
to kernel.h

Something like:

/**
 * container_of_ptr - cast a member of a pointer to a structure
 *		      out to the containing structure
 * @ptr:	the pointer to the member.
 * @type_ptr:	a pointer of type of the container struct this is embedded in.
 * @member:	the name of the member within the type_ptr.
 *
 */
#define container_of_ptr(ptr, type_ptr, member)				\
({									\
	const typeof((type_ptr)->member) *__mptr = (ptr);		\
	(type_ptr)((char *)__mptr -					\
		   ((char *)&((ptr_type)->member) - (char *)(ptr_type))); \
})

So this could be written as:

	struct fm10k_intfc *interface = container_of_ptr(hw, interface, hw);

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-20 21:16       ` Richard Cochran
@ 2014-09-20 23:36         ` Alexander Duyck
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-20 23:36 UTC (permalink / raw)
  To: Richard Cochran, Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/20/2014 02:16 PM, Richard Cochran wrote:
> On Fri, Sep 19, 2014 at 11:32:24AM -0700, Alexander Duyck wrote:
>> Because the value cannot be adjusted directly.  The timesource for the
>> switch is shared by all ports and host interfaces.  As such we have to
>> maintain a software offset per host to avoid corrupting the other clocks.
> So any host can set the time, but only one may adjust the frequency?

Yes.  Essentially the one who should be adjusting the frequency should
be the host interface that is providing traffic for the control plane
processor.  That way the control plane processor can act as a boundary
clock and can adjust the frequency to match the port it has decided to
be a slave on.

In the case of the ordinary clock host interfaces they will be slaves to
the boundary clock and since they share the same clock source anyway
they shouldn't need any frequency adjustments, or at least that is how
it is supposed to work.  The other reason for using the software offset
is that it can be very time intensive as you have to stop the clock,
update the staring value for each host interface and the Ethernet ports,
and then you can resume the clock.  It can be a rather time consuming
process doing it that way versus just maintaining an offset in software.

Tx timestamps from the host interface are actually identical to the Rx
timestamps on the receiving host interface in the case of traffic going
between two fm10k host interfaces on the same switch.  So what you end
up with is that any two hosts should be able to perfectly synchronize
their clocks with just the follow-up to the first sync message.  I've
done some testing for this with an FPGA and actually it has proven out
as ptp4l did one offset adjustment on the follow-up and then indicated 0
adjustments from there forward.  It is one of the reasons why I believe
we should be able to disable the frequency adjustment for interfaces
configured as ordinary clocks.

> Regarding the overflow, probably you can remove the daily check, since
> you only need a single clock read every 300 years, and that is
> reasonable to expect in any case.

Yeah, I figured that out after I recalled that the overflow check is
only really necessary if the mask is less than 64 bits.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-19 17:35   ` Richard Cochran
  2014-09-19 18:32     ` Alexander Duyck
@ 2014-09-22 11:03     ` David Laight
  2014-09-22 14:21       ` Alexander Duyck
  1 sibling, 1 reply; 50+ messages in thread
From: David Laight @ 2014-09-22 11:03 UTC (permalink / raw)
  To: 'Richard Cochran', Alexander Duyck
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

From: Of Richard Cochran
> On Thu, Sep 18, 2014 at 06:40:46PM -0400, Alexander Duyck wrote:
> 
> > +static s32 fm10k_1588_msg_vf(struct fm10k_hw *hw, u32 **results,
> > +			     struct fm10k_mbx_info *mbx)
> > +{
> > +	struct fm10k_intfc *interface = container_of(hw,
> > +						     struct fm10k_intfc,
> > +						     hw);
> 
> This looks really funny to me here and in the other spot. Why not this?
> 
> 	struct fm10k_intfc *interface = container_of(hw, struct fm10k_intfc, hw);
> 
> Its only one over the 80 km/h speed limit.

Or split the assignment from the declaration.

	David

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [net-next PATCH 29/29] fm10k: Add support for PTP
  2014-09-22 11:03     ` David Laight
@ 2014-09-22 14:21       ` Alexander Duyck
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Duyck @ 2014-09-22 14:21 UTC (permalink / raw)
  To: David Laight, 'Richard Cochran'
  Cc: davem, nhorman, netdev, john.fastabend, matthew.vick,
	jeffrey.t.kirsher, sassmann

On 09/22/2014 04:03 AM, David Laight wrote:
> From: Of Richard Cochran
>> On Thu, Sep 18, 2014 at 06:40:46PM -0400, Alexander Duyck wrote:
>>
>>> +static s32 fm10k_1588_msg_vf(struct fm10k_hw *hw, u32 **results,
>>> +			     struct fm10k_mbx_info *mbx)
>>> +{
>>> +	struct fm10k_intfc *interface = container_of(hw,
>>> +						     struct fm10k_intfc,
>>> +						     hw);
>>
>> This looks really funny to me here and in the other spot. Why not this?
>>
>> 	struct fm10k_intfc *interface = container_of(hw, struct fm10k_intfc, hw);
>>
>> Its only one over the 80 km/h speed limit.
> 
> Or split the assignment from the declaration.
> 
> 	David

That was the solution I went with.  It solved both the line over 80
characters and breaking up of the container_of parameter list.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2014-09-22 14:21 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-18 22:35 [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Alexander Duyck
2014-09-18 22:35 ` [net-next PATCH 01/29] fm10k: Add skeletal frame for Intel(R) FM10000 Ethernet Switch Host Interface Driver Alexander Duyck
2014-09-18 22:35 ` [net-next PATCH 02/29] fm10k: Add register defines and basic structures Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 03/29] fm10k: Add support for TLV message parsing and generation Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 04/29] fm10k: Add support for basic interaction with hardware Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 05/29] fm10k: Add support for mailbox Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 06/29] fm10k: Implement PF <-> SM mailbox operations Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 07/29] fm10k: Add support for PF Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 08/29] fm10k: Add support for configuring PF interface Alexander Duyck
2014-09-18 22:36 ` [net-next PATCH 09/29] fm10k: Add netdev Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 10/29] fm10k: Add support for L2 filtering Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 11/29] fm10k: Add support for ndo_open/stop Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 12/29] fm10k: Add interrupt support Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 13/29] fm10k: add support for Tx/Rx rings Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 14/29] fm10k: Add service task to handle delayed events Alexander Duyck
2014-09-18 22:37 ` [net-next PATCH 15/29] fm10k: Add Tx/Rx hardware ring bring-up/tear-down Alexander Duyck
2014-09-18 22:38 ` [net-next PATCH 16/29] fm10k: Add transmit and receive fastpath and interrupt handlers Alexander Duyck
2014-09-18 22:38 ` [net-next PATCH 17/29] fm10k: Add ethtool support Alexander Duyck
2014-09-18 22:38 ` [net-next PATCH 18/29] fm10k: Add support for PCI power management and error handling Alexander Duyck
2014-09-18 22:38 ` [net-next PATCH 19/29] fm10k: Add support for multiple queues Alexander Duyck
2014-09-18 22:38 ` [net-next PATCH 20/29] fm10k: Add support for netdev offloads Alexander Duyck
2014-09-18 22:39 ` [net-next PATCH 21/29] fm10k: Add support for MACVLAN acceleration Alexander Duyck
2014-09-18 22:39 ` [net-next PATCH 22/29] fm10k: Add support for PF <-> VF mailbox Alexander Duyck
2014-09-18 22:39 ` [net-next PATCH 23/29] fm10k: Add support for VF Alexander Duyck
2014-09-18 22:39 ` [net-next PATCH 24/29] fm10k: Add support for SR-IOV to PF core files Alexander Duyck
2014-09-18 22:39 ` [net-next PATCH 25/29] fm10k: Add support for SR-IOV to driver Alexander Duyck
2014-09-18 22:40 ` [net-next PATCH 26/29] fm10k: Add support for IEEE DCBx Alexander Duyck
2014-09-18 22:40 ` [net-next PATCH 27/29] fm10k: Add support for debugfs Alexander Duyck
2014-09-18 22:40 ` [net-next PATCH 28/29] fm10k: Add support for ptp to hw specific files Alexander Duyck
2014-09-19  7:38   ` Richard Cochran
2014-09-19 14:36     ` Alexander Duyck
2014-09-19 15:19       ` Richard Cochran
2014-09-19 15:34         ` Alexander Duyck
2014-09-18 22:40 ` [net-next PATCH 29/29] fm10k: Add support for PTP Alexander Duyck
2014-09-19 17:35   ` Richard Cochran
2014-09-19 18:32     ` Alexander Duyck
2014-09-20 21:07       ` Richard Cochran
2014-09-20 21:37         ` Joe Perches
2014-09-20 21:16       ` Richard Cochran
2014-09-20 23:36         ` Alexander Duyck
2014-09-22 11:03     ` David Laight
2014-09-22 14:21       ` Alexander Duyck
2014-09-19  7:55 ` [net-next PATCH 00/29] Add support for the Intel FM10000 Ethernet Switch Host Interface Jiri Pirko
2014-09-19 14:43   ` Alexander Duyck
2014-09-19 10:57 ` Jamal Hadi Salim
2014-09-19 14:54   ` Alexander Duyck
2014-09-19 16:58     ` Alexei Starovoitov
2014-09-19 18:22       ` Alexander Duyck
2014-09-19 23:52 ` Alexander Duyck
2014-09-20  2:03   ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.