linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
@ 2017-02-02 18:05 Logan Gunthorpe
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
                   ` (7 more replies)
  0 siblings, 8 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-02 18:05 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

Changes since v1:

* Rebased onto 4.10-rc6 (cleanly)
* Split the patch into a few more easily digestible patches (as
  suggested by Greg Kroah-Hartman)
* Folded switchtec.c into switchtec.h (per Greg)
* Fixed a bunch of 32bit build warnings caught by the kbuild test robot
* Fixed some issues in the documentation so it has a proper
  reStructredText format (as noted by Jonathan Corbet)
* Fixed padding and sizes in the IOCTL structures as noticed by Emil
  Velikov and used pahole to verify their consistency across 32 and 64
  bit builds
* Reworked one of the IOCTL interfaces to be more future proof (per
  Emil).

Changes since RFC:

* Fixed incorrect use of the drive model as pointed out by Greg
  Kroah-Hartman
* Used devm functions as suggested by Keith Busch
* Added a handful of sysfs attributes to the switchtec class
* Added a handful of IOCTLs to the switchtec device
* A number of miscellaneous bug fixes

--

Hi,

This is a continuation of the RFC we posted lasted month [1] which
proposes a management driver for Microsemi's Switchtec line of PCI
switches. This hardware is still looking to be used in the Open
Compute Platform

To make this entirely clear: the Switchtec products are compliant
with the PCI specifications and are supported today with the standard
in-kernel driver. However, these devices also expose a management endpoint
on a separate PCI function address which can be used to perform some
advanced operations. This is a driver for that function. See the patch
for more information.

Since the RFC, we've made the changes requested by Greg Kroah-Hartman
and Keith Busch, and we've also fleshed out a number of features. We've
added a couple of IOCTLs and sysfs attributes which are documented in
the patch. Significant work has also been done on the userspace tool
which is available under a GPL license at [2]. We've also had testing
done by some of the interested parties.

We hope to see this work included in either 4.11 or 4.12 assuming a
smooth review process.

The patch is based off of the v4.10-rc6 release.

Thanks for your review,

Logan

[1] https://www.spinics.net/lists/linux-pci/msg56897.html
[2] https://github.com/sbates130272/switchtec-user

--

Logan Gunthorpe (4):
  MicroSemi Switchtec management interface driver
  switchtec: Add user interface documentation
  switchtec: Add sysfs attributes to the Switchtec driver
  switchtec: Add IOCTLs to the Switchtec driver

 Documentation/ABI/testing/sysfs-class-switchtec |   96 ++
 Documentation/ioctl/ioctl-number.txt            |    1 +
 Documentation/switchtec.txt                     |   80 ++
 MAINTAINERS                                     |   11 +
 drivers/pci/Kconfig                             |    1 +
 drivers/pci/Makefile                            |    1 +
 drivers/pci/switch/Kconfig                      |   13 +
 drivers/pci/switch/Makefile                     |    1 +
 drivers/pci/switch/switchtec.c                  | 1608 +++++++++++++++++++++++
 include/uapi/linux/switchtec_ioctl.h            |  132 ++
 10 files changed, 1944 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
 create mode 100644 Documentation/switchtec.txt
 create mode 100644 drivers/pci/switch/Kconfig
 create mode 100644 drivers/pci/switch/Makefile
 create mode 100644 drivers/pci/switch/switchtec.c
 create mode 100644 include/uapi/linux/switchtec_ioctl.h

--
2.1.4

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
@ 2017-02-02 18:06 ` Logan Gunthorpe
  2017-02-10 14:51   ` Greg Kroah-Hartman
                     ` (2 more replies)
  2017-02-02 18:06 ` [PATCH v2 2/4] switchtec: Add user interface documentation Logan Gunthorpe
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-02 18:06 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

Microsemi's "Switchtec" line of PCI switch devices is already well
supported by the kernel with standard PCI switch drivers. However, the
Switchtec device advertises a special management endpoint with a separate
PCI function address and class code. This endpoint enables some additional
functionality which includes:

 * Packet and Byte Counters
 * Switch Firmware Upgrades
 * Event and Error logs
 * Querying port link status
 * Custom user firmware commands

This patch introduces the switchtec kernel module which provides
PCI driver that exposes a char device. The char device provides
userspace access to this interface through read, write and (optionally)
poll calls.

A userspace tool and library which utilizes this interface is available
at [1]. This tool takes inspiration (and borrows some code) from
nvme-cli [2]. The tool is largely complete at this time but additional
features may be added in the future.

[1] https://github.com/sbates130272/switchtec-user
[2] https://github.com/linux-nvme/nvme-cli

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
---
 MAINTAINERS                    |    8 +
 drivers/pci/Kconfig            |    1 +
 drivers/pci/Makefile           |    1 +
 drivers/pci/switch/Kconfig     |   13 +
 drivers/pci/switch/Makefile    |    1 +
 drivers/pci/switch/switchtec.c | 1028 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1052 insertions(+)
 create mode 100644 drivers/pci/switch/Kconfig
 create mode 100644 drivers/pci/switch/Makefile
 create mode 100644 drivers/pci/switch/switchtec.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f10c28..9c35198 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9507,6 +9507,14 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/pci/aardvark-pci.txt
 F:	drivers/pci/host/pci-aardvark.c
 
+PCI DRIVER FOR MICROSEMI SWITCHTEC
+M:	Kurt Schwemmer <kurt.schwemmer@microsemi.com>
+M:	Stephen Bates <stephen.bates@microsemi.com>
+M:	Logan Gunthorpe <logang@deltatee.com>
+L:	linux-pci@vger.kernel.org
+S:	Maintained
+F:	drivers/pci/switch/switchtec*
+
 PCI DRIVER FOR NVIDIA TEGRA
 M:	Thierry Reding <thierry.reding@gmail.com>
 L:	linux-tegra@vger.kernel.org
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 6555eb7..f72e8c5 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -133,3 +133,4 @@ config PCI_HYPERV
 
 source "drivers/pci/hotplug/Kconfig"
 source "drivers/pci/host/Kconfig"
+source "drivers/pci/switch/Kconfig"
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 8db5079..15b46dd 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -68,3 +68,4 @@ ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
 
 # PCI host controller drivers
 obj-y += host/
+obj-y += switch/
diff --git a/drivers/pci/switch/Kconfig b/drivers/pci/switch/Kconfig
new file mode 100644
index 0000000..4c49648
--- /dev/null
+++ b/drivers/pci/switch/Kconfig
@@ -0,0 +1,13 @@
+menu "PCI switch controller drivers"
+	depends on PCI
+
+config PCI_SW_SWITCHTEC
+	tristate "MicroSemi Switchtec PCIe Switch Management Driver"
+	help
+	 Enables support for the management interface for the MicroSemi
+	 Switchtec series of PCIe switches. Supports userspace access
+	 to submit MRPC commands to the switch via /dev/switchtecX
+	 devices. See <file:Documentation/switchtec.txt> for more
+	 information.
+
+endmenu
diff --git a/drivers/pci/switch/Makefile b/drivers/pci/switch/Makefile
new file mode 100644
index 0000000..37d8cfb
--- /dev/null
+++ b/drivers/pci/switch/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_PCI_SW_SWITCHTEC) += switchtec.o
diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
new file mode 100644
index 0000000..e4a4294
--- /dev/null
+++ b/drivers/pci/switch/switchtec.c
@@ -0,0 +1,1028 @@
+/*
+ * Microsemi Switchtec(tm) PCIe Management Driver
+ * Copyright (c) 2017, Microsemi Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/uaccess.h>
+#include <linux/poll.h>
+#include <linux/pci.h>
+#include <linux/cdev.h>
+#include <linux/wait.h>
+
+MODULE_DESCRIPTION("Microsemi Switchtec(tm) PCI-E Management Driver");
+MODULE_VERSION("0.1");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Microsemi Corporation");
+
+static int max_devices = 16;
+module_param(max_devices, int, 0644);
+MODULE_PARM_DESC(max_devices, "max number of switchtec device instances");
+
+static dev_t switchtec_devt;
+static struct class *switchtec_class;
+static DEFINE_IDA(switchtec_minor_ida);
+
+#define MICROSEMI_VENDOR_ID         0x11f8
+#define MICROSEMI_NTB_CLASSCODE     0x068000
+#define MICROSEMI_MGMT_CLASSCODE    0x058000
+
+#define SWITCHTEC_MRPC_PAYLOAD_SIZE 1024
+#define SWITCHTEC_MAX_PFF_CSR 48
+
+#define SWITCHTEC_EVENT_OCCURRED BIT(0)
+#define SWITCHTEC_EVENT_CLEAR    BIT(0)
+#define SWITCHTEC_EVENT_EN_LOG   BIT(1)
+#define SWITCHTEC_EVENT_EN_CLI   BIT(2)
+#define SWITCHTEC_EVENT_EN_IRQ   BIT(3)
+#define SWITCHTEC_EVENT_FATAL    BIT(4)
+
+enum {
+	SWITCHTEC_GAS_MRPC_OFFSET       = 0x0000,
+	SWITCHTEC_GAS_TOP_CFG_OFFSET    = 0x1000,
+	SWITCHTEC_GAS_SW_EVENT_OFFSET   = 0x1800,
+	SWITCHTEC_GAS_SYS_INFO_OFFSET   = 0x2000,
+	SWITCHTEC_GAS_FLASH_INFO_OFFSET = 0x2200,
+	SWITCHTEC_GAS_PART_CFG_OFFSET   = 0x4000,
+	SWITCHTEC_GAS_NTB_OFFSET        = 0x10000,
+	SWITCHTEC_GAS_PFF_CSR_OFFSET    = 0x134000,
+};
+
+struct mrpc_regs {
+	u8 input_data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
+	u8 output_data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
+	u32 cmd;
+	u32 status;
+	u32 ret_value;
+} __packed;
+
+enum mrpc_status {
+	SWITCHTEC_MRPC_STATUS_INPROGRESS = 1,
+	SWITCHTEC_MRPC_STATUS_DONE = 2,
+	SWITCHTEC_MRPC_STATUS_ERROR = 0xFF,
+	SWITCHTEC_MRPC_STATUS_INTERRUPTED = 0x100,
+};
+
+struct sw_event_regs {
+	u64 event_report_ctrl;
+	u64 reserved1;
+	u64 part_event_bitmap;
+	u64 reserved2;
+	u32 global_summary;
+	u32 reserved3[3];
+	u32 stack_error_event_hdr;
+	u32 stack_error_event_data;
+	u32 reserved4[4];
+	u32 ppu_error_event_hdr;
+	u32 ppu_error_event_data;
+	u32 reserved5[4];
+	u32 isp_error_event_hdr;
+	u32 isp_error_event_data;
+	u32 reserved6[4];
+	u32 sys_reset_event_hdr;
+	u32 reserved7[5];
+	u32 fw_exception_hdr;
+	u32 reserved8[5];
+	u32 fw_nmi_hdr;
+	u32 reserved9[5];
+	u32 fw_non_fatal_hdr;
+	u32 reserved10[5];
+	u32 fw_fatal_hdr;
+	u32 reserved11[5];
+	u32 twi_mrpc_comp_hdr;
+	u32 twi_mrpc_comp_data;
+	u32 reserved12[4];
+	u32 twi_mrpc_comp_async_hdr;
+	u32 twi_mrpc_comp_async_data;
+	u32 reserved13[4];
+	u32 cli_mrpc_comp_hdr;
+	u32 cli_mrpc_comp_data;
+	u32 reserved14[4];
+	u32 cli_mrpc_comp_async_hdr;
+	u32 cli_mrpc_comp_async_data;
+	u32 reserved15[4];
+	u32 gpio_interrupt_hdr;
+	u32 gpio_interrupt_data;
+	u32 reserved16[4];
+} __packed;
+
+struct sys_info_regs {
+	u32 device_id;
+	u32 device_version;
+	u32 firmware_version;
+	u32 reserved1;
+	u32 vendor_table_revision;
+	u32 table_format_version;
+	u32 partition_id;
+	u32 cfg_file_fmt_version;
+	u32 reserved2[58];
+	char vendor_id[8];
+	char product_id[16];
+	char product_revision[4];
+	char component_vendor[8];
+	u16 component_id;
+	u8 component_revision;
+} __packed;
+
+struct flash_info_regs {
+	u32 flash_part_map_upd_idx;
+
+	struct active_partition_info {
+		u32 address;
+		u32 build_version;
+		u32 build_string;
+	} active_img;
+
+	struct active_partition_info active_cfg;
+	struct active_partition_info inactive_img;
+	struct active_partition_info inactive_cfg;
+
+	u32 flash_length;
+
+	struct partition_info {
+		u32 address;
+		u32 length;
+	} cfg0;
+
+	struct partition_info cfg1;
+	struct partition_info img0;
+	struct partition_info img1;
+	struct partition_info nvlog;
+	struct partition_info vendor[8];
+};
+
+struct ntb_info_regs {
+	u8  partition_count;
+	u8  partition_id;
+	u16 reserved1;
+	u64 ep_map;
+	u16 requester_id;
+} __packed;
+
+struct part_cfg_regs {
+	u32 status;
+	u32 state;
+	u32 port_cnt;
+	u32 usp_port_mode;
+	u32 usp_pff_inst_id;
+	u32 vep_pff_inst_id;
+	u32 dsp_pff_inst_id[47];
+	u32 reserved1[11];
+	u16 vep_vector_number;
+	u16 usp_vector_number;
+	u32 port_event_bitmap;
+	u32 reserved2[3];
+	u32 part_event_summary;
+	u32 reserved3[3];
+	u32 part_reset_hdr;
+	u32 part_reset_data[5];
+	u32 mrpc_comp_hdr;
+	u32 mrpc_comp_data[5];
+	u32 mrpc_comp_async_hdr;
+	u32 mrpc_comp_async_data[5];
+	u32 dyn_binding_hdr;
+	u32 dyn_binding_data[5];
+	u32 reserved4[159];
+} __packed;
+
+enum {
+	SWITCHTEC_PART_CFG_EVENT_RESET = 1 << 0,
+	SWITCHTEC_PART_CFG_EVENT_MRPC_CMP = 1 << 1,
+	SWITCHTEC_PART_CFG_EVENT_MRPC_ASYNC_CMP = 1 << 2,
+	SWITCHTEC_PART_CFG_EVENT_DYN_PART_CMP = 1 << 3,
+};
+
+struct pff_csr_regs {
+	u16 vendor_id;
+	u16 device_id;
+	u32 pci_cfg_header[15];
+	u32 pci_cap_region[48];
+	u32 pcie_cap_region[448];
+	u32 indirect_gas_window[128];
+	u32 indirect_gas_window_off;
+	u32 reserved[127];
+	u32 pff_event_summary;
+	u32 reserved2[3];
+	u32 aer_in_p2p_hdr;
+	u32 aer_in_p2p_data[5];
+	u32 aer_in_vep_hdr;
+	u32 aer_in_vep_data[5];
+	u32 dpc_hdr;
+	u32 dpc_data[5];
+	u32 cts_hdr;
+	u32 cts_data[5];
+	u32 reserved3[6];
+	u32 hotplug_hdr;
+	u32 hotplug_data[5];
+	u32 ier_hdr;
+	u32 ier_data[5];
+	u32 threshold_hdr;
+	u32 threshold_data[5];
+	u32 power_mgmt_hdr;
+	u32 power_mgmt_data[5];
+	u32 tlp_throttling_hdr;
+	u32 tlp_throttling_data[5];
+	u32 force_speed_hdr;
+	u32 force_speed_data[5];
+	u32 credit_timeout_hdr;
+	u32 credit_timeout_data[5];
+	u32 link_state_hdr;
+	u32 link_state_data[5];
+	u32 reserved4[174];
+} __packed;
+
+struct switchtec_dev {
+	struct pci_dev *pdev;
+	struct msix_entry msix[4];
+	struct device dev;
+	struct cdev cdev;
+	unsigned int event_irq;
+
+	int partition;
+	int partition_count;
+	int pff_csr_count;
+	char pff_local[SWITCHTEC_MAX_PFF_CSR];
+
+	void __iomem *mmio;
+	struct mrpc_regs __iomem *mmio_mrpc;
+	struct sw_event_regs __iomem *mmio_sw_event;
+	struct sys_info_regs __iomem *mmio_sys_info;
+	struct flash_info_regs __iomem *mmio_flash_info;
+	struct ntb_info_regs __iomem *mmio_ntb;
+	struct part_cfg_regs __iomem *mmio_part_cfg;
+	struct part_cfg_regs __iomem *mmio_part_cfg_all;
+	struct pff_csr_regs __iomem *mmio_pff_csr;
+
+	/*
+	 * The mrpc mutex must be held when accessing the other
+	 * mrpc fields
+	 */
+	struct mutex mrpc_mutex;
+	struct list_head mrpc_queue;
+	int mrpc_busy;
+	struct work_struct mrpc_work;
+	struct delayed_work mrpc_timeout;
+	wait_queue_head_t event_wq;
+	atomic_t event_cnt;
+};
+
+static struct switchtec_dev *to_stdev(struct device *dev)
+{
+	return container_of(dev, struct switchtec_dev, dev);
+}
+
+struct switchtec_user {
+	struct switchtec_dev *stdev;
+
+	enum mrpc_state {
+		MRPC_IDLE = 0,
+		MRPC_QUEUED,
+		MRPC_RUNNING,
+		MRPC_DONE,
+	} state;
+
+	struct completion comp;
+	struct kref kref;
+	struct list_head list;
+
+	u32 cmd;
+	u32 status;
+	u32 return_code;
+	size_t data_len;
+	unsigned char data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
+	int event_cnt;
+};
+
+static struct switchtec_user *stuser_create(struct switchtec_dev *stdev)
+{
+	struct switchtec_user *stuser;
+
+	stuser = kzalloc(sizeof(*stuser), GFP_KERNEL);
+	if (!stuser)
+		return ERR_PTR(-ENOMEM);
+
+	get_device(&stdev->dev);
+	stuser->stdev = stdev;
+	kref_init(&stuser->kref);
+	INIT_LIST_HEAD(&stuser->list);
+	init_completion(&stuser->comp);
+	stuser->event_cnt = atomic_read(&stdev->event_cnt);
+
+	dev_dbg(&stdev->dev, "%s: %p\n", __func__, stuser);
+
+	return stuser;
+}
+
+static void stuser_free(struct kref *kref)
+{
+	struct switchtec_user *stuser;
+
+	stuser = container_of(kref, struct switchtec_user, kref);
+
+	dev_dbg(&stuser->stdev->dev, "%s: %p\n", __func__, stuser);
+
+	put_device(&stuser->stdev->dev);
+	kfree(stuser);
+}
+
+static void stuser_put(struct switchtec_user *stuser)
+{
+	kref_put(&stuser->kref, stuser_free);
+}
+
+static void stuser_set_state(struct switchtec_user *stuser,
+			     enum mrpc_state state)
+{
+	const char * const state_names[] = {
+		[MRPC_IDLE] = "IDLE",
+		[MRPC_QUEUED] = "QUEUED",
+		[MRPC_RUNNING] = "RUNNING",
+		[MRPC_DONE] = "DONE",
+	};
+
+	stuser->state = state;
+
+	dev_dbg(&stuser->stdev->dev, "stuser state %p -> %s",
+		stuser, state_names[state]);
+}
+
+static int stdev_is_alive(struct switchtec_dev *stdev)
+{
+	return stdev->mmio != NULL;
+}
+
+static void mrpc_complete_cmd(struct switchtec_dev *stdev);
+
+static void mrpc_cmd_submit(struct switchtec_dev *stdev)
+{
+	/* requires the mrpc_mutex to already be held when called */
+
+	struct switchtec_user *stuser;
+
+	if (stdev->mrpc_busy)
+		return;
+
+	if (list_empty(&stdev->mrpc_queue))
+		return;
+
+	stuser = list_entry(stdev->mrpc_queue.next, struct switchtec_user,
+			    list);
+
+	stuser_set_state(stuser, MRPC_RUNNING);
+	stdev->mrpc_busy = 1;
+	memcpy_toio(&stdev->mmio_mrpc->input_data,
+		    stuser->data, stuser->data_len);
+	iowrite32(stuser->cmd, &stdev->mmio_mrpc->cmd);
+
+	stuser->status = ioread32(&stdev->mmio_mrpc->status);
+	if (stuser->status != SWITCHTEC_MRPC_STATUS_INPROGRESS)
+		mrpc_complete_cmd(stdev);
+
+	schedule_delayed_work(&stdev->mrpc_timeout,
+			      msecs_to_jiffies(500));
+}
+
+static void mrpc_queue_cmd(struct switchtec_user *stuser)
+{
+	/* requires the mrpc_mutex to already be held when called */
+
+	struct switchtec_dev *stdev = stuser->stdev;
+
+	kref_get(&stuser->kref);
+	stuser_set_state(stuser, MRPC_QUEUED);
+	init_completion(&stuser->comp);
+	list_add_tail(&stuser->list, &stdev->mrpc_queue);
+
+	mrpc_cmd_submit(stdev);
+}
+
+static void mrpc_complete_cmd(struct switchtec_dev *stdev)
+{
+	/* requires the mrpc_mutex to already be held when called */
+	struct switchtec_user *stuser;
+
+	if (list_empty(&stdev->mrpc_queue))
+		return;
+
+	stuser = list_entry(stdev->mrpc_queue.next, struct switchtec_user,
+			    list);
+
+	stuser->status = ioread32(&stdev->mmio_mrpc->status);
+	if (stuser->status == SWITCHTEC_MRPC_STATUS_INPROGRESS)
+		return;
+
+	stuser_set_state(stuser, MRPC_DONE);
+	stuser->return_code = 0;
+
+	if (stuser->status != SWITCHTEC_MRPC_STATUS_DONE)
+		goto out;
+
+	stuser->return_code = ioread32(&stdev->mmio_mrpc->ret_value);
+	if (stuser->return_code != 0)
+		goto out;
+
+	memcpy_fromio(stuser->data, &stdev->mmio_mrpc->output_data,
+		      sizeof(stuser->data));
+
+out:
+	complete_all(&stuser->comp);
+	list_del_init(&stuser->list);
+	stuser_put(stuser);
+	stdev->mrpc_busy = 0;
+
+	mrpc_cmd_submit(stdev);
+}
+
+static void mrpc_event_work(struct work_struct *work)
+{
+	struct switchtec_dev *stdev;
+
+	stdev = container_of(work, struct switchtec_dev, mrpc_work);
+
+	dev_dbg(&stdev->dev, "%s\n", __func__);
+
+	mutex_lock(&stdev->mrpc_mutex);
+	cancel_delayed_work(&stdev->mrpc_timeout);
+	mrpc_complete_cmd(stdev);
+	mutex_unlock(&stdev->mrpc_mutex);
+}
+
+static void mrpc_timeout_work(struct work_struct *work)
+{
+	struct switchtec_dev *stdev;
+	u32 status;
+
+	stdev = container_of(work, struct switchtec_dev, mrpc_timeout.work);
+
+	dev_dbg(&stdev->dev, "%s\n", __func__);
+
+	mutex_lock(&stdev->mrpc_mutex);
+
+	if (stdev_is_alive(stdev)) {
+		status = ioread32(&stdev->mmio_mrpc->status);
+		if (status == SWITCHTEC_MRPC_STATUS_INPROGRESS) {
+			schedule_delayed_work(&stdev->mrpc_timeout,
+					      msecs_to_jiffies(500));
+			goto out;
+		}
+	}
+
+	mrpc_complete_cmd(stdev);
+
+out:
+	mutex_unlock(&stdev->mrpc_mutex);
+}
+
+static int switchtec_dev_open(struct inode *inode, struct file *filp)
+{
+	struct switchtec_dev *stdev;
+	struct switchtec_user *stuser;
+
+	stdev = container_of(inode->i_cdev, struct switchtec_dev, cdev);
+
+	stuser = stuser_create(stdev);
+	if (!stuser)
+		return PTR_ERR(stuser);
+
+	filp->private_data = stuser;
+	nonseekable_open(inode, filp);
+
+	dev_dbg(&stdev->dev, "%s: %p\n", __func__, stuser);
+
+	return 0;
+}
+
+static int switchtec_dev_release(struct inode *inode, struct file *filp)
+{
+	struct switchtec_user *stuser = filp->private_data;
+
+	stuser_put(stuser);
+
+	return 0;
+}
+
+static ssize_t switchtec_dev_write(struct file *filp, const char __user *data,
+				   size_t size, loff_t *off)
+{
+	struct switchtec_user *stuser = filp->private_data;
+	struct switchtec_dev *stdev = stuser->stdev;
+	int rc;
+
+	if (!stdev_is_alive(stdev))
+		return -ENXIO;
+
+	if (size < sizeof(stuser->cmd) ||
+	    size > sizeof(stuser->cmd) + SWITCHTEC_MRPC_PAYLOAD_SIZE)
+		return -EINVAL;
+
+	if (mutex_lock_interruptible(&stdev->mrpc_mutex))
+		return -EINTR;
+
+	if (stuser->state != MRPC_IDLE) {
+		rc = -EBADE;
+		goto out;
+	}
+
+	rc = copy_from_user(&stuser->cmd, data, sizeof(stuser->cmd));
+	if (rc) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	data += sizeof(stuser->cmd);
+	rc = copy_from_user(&stuser->data, data, size - sizeof(stuser->cmd));
+	if (rc) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	stuser->data_len = size - sizeof(stuser->cmd);
+
+	mrpc_queue_cmd(stuser);
+
+	rc = size;
+
+out:
+	mutex_unlock(&stdev->mrpc_mutex);
+
+	return rc;
+}
+
+static ssize_t switchtec_dev_read(struct file *filp, char __user *data,
+				  size_t size, loff_t *off)
+{
+	struct switchtec_user *stuser = filp->private_data;
+	struct switchtec_dev *stdev = stuser->stdev;
+	int rc;
+
+	if (!stdev_is_alive(stdev))
+		return -ENXIO;
+
+	if (size < sizeof(stuser->cmd) ||
+	    size > sizeof(stuser->cmd) + SWITCHTEC_MRPC_PAYLOAD_SIZE)
+		return -EINVAL;
+
+	if (stuser->state == MRPC_IDLE)
+		return -EBADE;
+
+	if (filp->f_flags & O_NONBLOCK) {
+		if (!try_wait_for_completion(&stuser->comp))
+			return -EAGAIN;
+	} else {
+		rc = wait_for_completion_interruptible(&stuser->comp);
+		if (rc < 0)
+			return rc;
+	}
+
+	if (mutex_lock_interruptible(&stdev->mrpc_mutex))
+		return -EINTR;
+
+	if (stuser->state != MRPC_DONE)
+		return -EBADE;
+
+	rc = copy_to_user(data, &stuser->return_code,
+			  sizeof(stuser->return_code));
+	if (rc) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	data += sizeof(stuser->return_code);
+	rc = copy_to_user(data, &stuser->data,
+			  size - sizeof(stuser->return_code));
+	if (rc) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	stuser_set_state(stuser, MRPC_IDLE);
+
+	if (stuser->status == SWITCHTEC_MRPC_STATUS_DONE)
+		rc = size;
+	else if (stuser->status == SWITCHTEC_MRPC_STATUS_INTERRUPTED)
+		rc = -ENXIO;
+	else
+		rc = -EBADMSG;
+
+out:
+	mutex_unlock(&stdev->mrpc_mutex);
+
+	return rc;
+}
+
+static unsigned int switchtec_dev_poll(struct file *filp, poll_table *wait)
+{
+	struct switchtec_user *stuser = filp->private_data;
+	struct switchtec_dev *stdev = stuser->stdev;
+	int ret = 0;
+
+	poll_wait(filp, &stuser->comp.wait, wait);
+	poll_wait(filp, &stdev->event_wq, wait);
+
+	if (!stdev_is_alive(stdev))
+		return POLLERR;
+
+	if (try_wait_for_completion(&stuser->comp))
+		ret |= POLLIN | POLLRDNORM;
+
+	if (stuser->event_cnt != atomic_read(&stdev->event_cnt))
+		ret |= POLLPRI | POLLRDBAND;
+
+	return ret;
+}
+
+static const struct file_operations switchtec_fops = {
+	.owner = THIS_MODULE,
+	.open = switchtec_dev_open,
+	.release = switchtec_dev_release,
+	.write = switchtec_dev_write,
+	.read = switchtec_dev_read,
+	.poll = switchtec_dev_poll,
+};
+
+static void stdev_release(struct device *dev)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+
+	ida_simple_remove(&switchtec_minor_ida,
+			  MINOR(dev->devt));
+	kfree(stdev);
+}
+
+static void stdev_unregister(struct switchtec_dev *stdev)
+{
+	cdev_del(&stdev->cdev);
+	device_unregister(&stdev->dev);
+}
+
+static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
+{
+	struct switchtec_dev *stdev;
+	int minor;
+	struct device *dev;
+	struct cdev *cdev;
+	int rc;
+
+	stdev = kzalloc_node(sizeof(*stdev), GFP_KERNEL,
+			     dev_to_node(&pdev->dev));
+	if (!stdev)
+		return ERR_PTR(-ENOMEM);
+
+	stdev->pdev = pdev;
+	INIT_LIST_HEAD(&stdev->mrpc_queue);
+	mutex_init(&stdev->mrpc_mutex);
+	stdev->mrpc_busy = 0;
+	INIT_WORK(&stdev->mrpc_work, mrpc_event_work);
+	INIT_DELAYED_WORK(&stdev->mrpc_timeout, mrpc_timeout_work);
+	init_waitqueue_head(&stdev->event_wq);
+	atomic_set(&stdev->event_cnt, 0);
+
+	minor = ida_simple_get(&switchtec_minor_ida, 0, 0,
+			       GFP_KERNEL);
+	if (minor < 0)
+		return ERR_PTR(minor);
+
+	dev = &stdev->dev;
+	device_initialize(dev);
+	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
+	dev->class = switchtec_class;
+	dev->parent = &pdev->dev;
+	dev->release = stdev_release;
+	dev_set_name(dev, "switchtec%d", minor);
+
+	cdev = &stdev->cdev;
+	cdev_init(cdev, &switchtec_fops);
+	cdev->owner = THIS_MODULE;
+	cdev->kobj.parent = &dev->kobj;
+
+	rc = cdev_add(&stdev->cdev, dev->devt, 1);
+	if (rc)
+		goto err_cdev;
+
+	rc = device_add(dev);
+	if (rc) {
+		cdev_del(&stdev->cdev);
+		put_device(dev);
+		return ERR_PTR(rc);
+	}
+
+	return stdev;
+
+err_cdev:
+	ida_simple_remove(&switchtec_minor_ida, minor);
+
+	return ERR_PTR(rc);
+}
+
+static irqreturn_t switchtec_event_isr(int irq, void *dev)
+{
+	struct switchtec_dev *stdev = dev;
+	u32 reg;
+	irqreturn_t ret = IRQ_NONE;
+
+	reg = ioread32(&stdev->mmio_part_cfg->mrpc_comp_hdr);
+	if (reg & SWITCHTEC_EVENT_OCCURRED) {
+		dev_dbg(&stdev->dev, "%s: mrpc comp\n", __func__);
+		ret = IRQ_HANDLED;
+		schedule_work(&stdev->mrpc_work);
+		iowrite32(reg, &stdev->mmio_part_cfg->mrpc_comp_hdr);
+	}
+
+	return ret;
+}
+
+static int switchtec_init_msix_isr(struct switchtec_dev *stdev)
+{
+	struct pci_dev *pdev = stdev->pdev;
+	int rc, i, msix_count, node;
+
+	node = dev_to_node(&pdev->dev);
+
+	for (i = 0; i < ARRAY_SIZE(stdev->msix); ++i)
+		stdev->msix[i].entry = i;
+
+	msix_count = pci_enable_msix_range(pdev, stdev->msix, 1,
+					   ARRAY_SIZE(stdev->msix));
+	if (msix_count < 0)
+		return msix_count;
+
+	stdev->event_irq = ioread32(&stdev->mmio_part_cfg->vep_vector_number);
+	if (stdev->event_irq < 0 || stdev->event_irq >= msix_count) {
+		rc = -EFAULT;
+		goto err_msix_request;
+	}
+
+	stdev->event_irq = stdev->msix[stdev->event_irq].vector;
+	dev_dbg(&stdev->dev, "Using msix interrupts: event_irq=%d\n",
+		stdev->event_irq);
+
+	return 0;
+
+err_msix_request:
+	pci_disable_msix(pdev);
+	return rc;
+}
+
+static int switchtec_init_msi_isr(struct switchtec_dev *stdev)
+{
+	int rc;
+	struct pci_dev *pdev = stdev->pdev;
+
+	/* Try to set up msi irq */
+	rc = pci_enable_msi_range(pdev, 1, 4);
+	if (rc < 0)
+		return rc;
+
+	stdev->event_irq = ioread32(&stdev->mmio_part_cfg->vep_vector_number);
+	if (stdev->event_irq < 0 || stdev->event_irq >= 4) {
+		rc = -EFAULT;
+		goto err_msi_request;
+	}
+
+	stdev->event_irq = pdev->irq + stdev->event_irq;
+	dev_dbg(&stdev->dev, "Using msi interrupts: event_irq=%d\n",
+		stdev->event_irq);
+
+	return 0;
+
+err_msi_request:
+	pci_disable_msi(pdev);
+	return rc;
+}
+
+static int switchtec_init_isr(struct switchtec_dev *stdev)
+{
+	int rc;
+
+	rc = switchtec_init_msix_isr(stdev);
+	if (rc)
+		rc = switchtec_init_msi_isr(stdev);
+
+	if (rc)
+		return rc;
+
+	rc = devm_request_irq(&stdev->pdev->dev, stdev->event_irq,
+			      switchtec_event_isr, 0, KBUILD_MODNAME, stdev);
+
+	return rc;
+}
+
+static void switchtec_deinit_isr(struct switchtec_dev *stdev)
+{
+	devm_free_irq(&stdev->pdev->dev, stdev->event_irq, stdev);
+	pci_disable_msix(stdev->pdev);
+	pci_disable_msi(stdev->pdev);
+}
+
+static void init_pff(struct switchtec_dev *stdev)
+{
+	int i;
+	u32 reg;
+	struct part_cfg_regs *pcfg = stdev->mmio_part_cfg;
+
+	for (i = 0; i < SWITCHTEC_MAX_PFF_CSR; i++) {
+		reg = ioread16(&stdev->mmio_pff_csr[i].vendor_id);
+		if (reg != MICROSEMI_VENDOR_ID)
+			break;
+	}
+
+	stdev->pff_csr_count = i;
+
+	reg = ioread32(&pcfg->usp_pff_inst_id);
+	if (reg < SWITCHTEC_MAX_PFF_CSR)
+		stdev->pff_local[reg] = 1;
+
+	reg = ioread32(&pcfg->vep_pff_inst_id);
+	if (reg < SWITCHTEC_MAX_PFF_CSR)
+		stdev->pff_local[reg] = 1;
+
+	for (i = 0; i < ARRAY_SIZE(pcfg->dsp_pff_inst_id); i++) {
+		reg = ioread32(&pcfg->dsp_pff_inst_id[i]);
+		if (reg < SWITCHTEC_MAX_PFF_CSR)
+			stdev->pff_local[reg] = 1;
+	}
+}
+
+static int switchtec_init_pci(struct switchtec_dev *stdev,
+			      struct pci_dev *pdev)
+{
+	int rc;
+
+	rc = pcim_enable_device(pdev);
+	if (rc)
+		return rc;
+
+	rc = pcim_iomap_regions(pdev, 0x1, KBUILD_MODNAME);
+	if (rc)
+		return rc;
+
+	pci_set_master(pdev);
+
+	stdev->mmio = pcim_iomap_table(pdev)[0];
+	stdev->mmio_mrpc = stdev->mmio + SWITCHTEC_GAS_MRPC_OFFSET;
+	stdev->mmio_sw_event = stdev->mmio + SWITCHTEC_GAS_SW_EVENT_OFFSET;
+	stdev->mmio_sys_info = stdev->mmio + SWITCHTEC_GAS_SYS_INFO_OFFSET;
+	stdev->mmio_flash_info = stdev->mmio + SWITCHTEC_GAS_FLASH_INFO_OFFSET;
+	stdev->mmio_ntb = stdev->mmio + SWITCHTEC_GAS_NTB_OFFSET;
+	stdev->partition = ioread8(&stdev->mmio_ntb->partition_id);
+	stdev->partition_count = ioread8(&stdev->mmio_ntb->partition_count);
+	stdev->mmio_part_cfg_all = stdev->mmio + SWITCHTEC_GAS_PART_CFG_OFFSET;
+	stdev->mmio_part_cfg = &stdev->mmio_part_cfg_all[stdev->partition];
+	stdev->mmio_pff_csr = stdev->mmio + SWITCHTEC_GAS_PFF_CSR_OFFSET;
+
+	init_pff(stdev);
+
+	iowrite32(SWITCHTEC_EVENT_CLEAR |
+		  SWITCHTEC_EVENT_EN_IRQ,
+		  &stdev->mmio_part_cfg->mrpc_comp_hdr);
+
+	pci_set_drvdata(pdev, stdev);
+
+	return 0;
+}
+
+static void switchtec_deinit_pci(struct switchtec_dev *stdev)
+{
+	struct pci_dev *pdev = stdev->pdev;
+
+	stdev->mmio = NULL;
+	pci_clear_master(pdev);
+	pci_set_drvdata(pdev, NULL);
+}
+
+static int switchtec_pci_probe(struct pci_dev *pdev,
+			       const struct pci_device_id *id)
+{
+	struct switchtec_dev *stdev;
+	int rc;
+
+	stdev = stdev_create(pdev);
+	if (!stdev)
+		return PTR_ERR(stdev);
+
+	rc = switchtec_init_pci(stdev, pdev);
+	if (rc)
+		goto err_init_pci;
+
+	rc = switchtec_init_isr(stdev);
+	if (rc) {
+		dev_err(&stdev->dev, "failed to init isr.\n");
+		goto err_init_isr;
+	}
+
+	dev_info(&stdev->dev, "Management device registered.\n");
+
+	return 0;
+
+err_init_isr:
+	switchtec_deinit_pci(stdev);
+err_init_pci:
+	stdev_unregister(stdev);
+	return rc;
+}
+
+static void switchtec_pci_remove(struct pci_dev *pdev)
+{
+	struct switchtec_dev *stdev = pci_get_drvdata(pdev);
+
+	switchtec_deinit_isr(stdev);
+	switchtec_deinit_pci(stdev);
+	stdev_unregister(stdev);
+}
+
+#define SWITCHTEC_PCI_DEVICE(device_id) \
+	{ \
+		.vendor     = MICROSEMI_VENDOR_ID, \
+		.device     = device_id, \
+		.subvendor  = PCI_ANY_ID, \
+		.subdevice  = PCI_ANY_ID, \
+		.class      = MICROSEMI_MGMT_CLASSCODE, \
+		.class_mask = 0xFFFFFFFF, \
+	}, \
+	{ \
+		.vendor     = MICROSEMI_VENDOR_ID, \
+		.device     = device_id, \
+		.subvendor  = PCI_ANY_ID, \
+		.subdevice  = PCI_ANY_ID, \
+		.class      = MICROSEMI_NTB_CLASSCODE, \
+		.class_mask = 0xFFFFFFFF, \
+	}
+
+static const struct pci_device_id switchtec_pci_tbl[] = {
+	SWITCHTEC_PCI_DEVICE(0x8531),  //PFX 24xG3
+	SWITCHTEC_PCI_DEVICE(0x8532),  //PFX 32xG3
+	SWITCHTEC_PCI_DEVICE(0x8533),  //PFX 48xG3
+	SWITCHTEC_PCI_DEVICE(0x8534),  //PFX 64xG3
+	SWITCHTEC_PCI_DEVICE(0x8535),  //PFX 80xG3
+	SWITCHTEC_PCI_DEVICE(0x8536),  //PFX 96xG3
+	SWITCHTEC_PCI_DEVICE(0x8543),  //PSX 48xG3
+	SWITCHTEC_PCI_DEVICE(0x8544),  //PSX 64xG3
+	SWITCHTEC_PCI_DEVICE(0x8545),  //PSX 80xG3
+	SWITCHTEC_PCI_DEVICE(0x8546),  //PSX 96xG3
+	{0}
+};
+MODULE_DEVICE_TABLE(pci, switchtec_pci_tbl);
+
+static struct pci_driver switchtec_pci_driver = {
+	.name		= KBUILD_MODNAME,
+	.id_table	= switchtec_pci_tbl,
+	.probe		= switchtec_pci_probe,
+	.remove		= switchtec_pci_remove,
+};
+
+static int __init switchtec_init(void)
+{
+	int rc;
+
+	max_devices = max(max_devices, 256);
+	rc = alloc_chrdev_region(&switchtec_devt, 0, max_devices,
+				 "switchtec");
+	if (rc)
+		return rc;
+
+	switchtec_class = class_create(THIS_MODULE, "switchtec");
+	if (IS_ERR(switchtec_class)) {
+		rc = PTR_ERR(switchtec_class);
+		goto err_create_class;
+	}
+
+	rc = pci_register_driver(&switchtec_pci_driver);
+	if (rc)
+		goto err_pci_register;
+
+	pr_info(KBUILD_MODNAME ": loaded.\n");
+
+	return 0;
+
+err_pci_register:
+	class_destroy(switchtec_class);
+
+err_create_class:
+	unregister_chrdev_region(switchtec_devt, max_devices);
+
+	return rc;
+}
+module_init(switchtec_init);
+
+static void __exit switchtec_exit(void)
+{
+	pci_unregister_driver(&switchtec_pci_driver);
+	class_destroy(switchtec_class);
+	unregister_chrdev_region(switchtec_devt, max_devices);
+	ida_destroy(&switchtec_minor_ida);
+
+	pr_info(KBUILD_MODNAME ": unloaded.\n");
+}
+module_exit(switchtec_exit);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 2/4] switchtec: Add user interface documentation
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
@ 2017-02-02 18:06 ` Logan Gunthorpe
  2017-02-02 18:06 ` [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver Logan Gunthorpe
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-02 18:06 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

This adds standard documentation for the sysfs switchtec attributes and
a RST formatted text file which documents the char device interface.
Jonathan Corbet has indicated he will move this to a new user-space
developer documentation book once it's created.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
---
 Documentation/switchtec.txt | 53 +++++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS                 |  1 +
 2 files changed, 54 insertions(+)
 create mode 100644 Documentation/switchtec.txt

diff --git a/Documentation/switchtec.txt b/Documentation/switchtec.txt
new file mode 100644
index 0000000..4bced4c
--- /dev/null
+++ b/Documentation/switchtec.txt
@@ -0,0 +1,53 @@
+========================
+Linux Switchtec Support
+========================
+
+Microsemi's "Switchtec" line of PCI switch devices is already
+supported by the kernel with standard PCI switch drivers. However, the
+Switchtec device advertises a special management endpoint which
+enables some additional functionality. This includes:
+
+* Packet and Byte Counters
+* Firmware Upgrades
+* Event and Error logs
+* Querying port link status
+* Custom user firmware commands
+
+The switchtec kernel module implements this functionality.
+
+
+Interface
+=========
+
+The primary means of communicating with the Switchtec management firmware is
+through the Memory-mapped Remote Procedure Call (MRPC) interface.
+Commands are submitted to the interface with a 4-byte command
+identifier and up to 1KB of command specific data. The firmware will
+respond with a 4 bytes return code and up to 1KB of command specific
+data. The interface only processes a single command at a time.
+
+
+Userspace Interface
+===================
+
+The MRPC interface will be exposed to userspace through a simple char
+device: /dev/switchtec#, one for each management endpoint in the system.
+
+The char device has the following semantics:
+
+* A write must consist of at least 4 bytes and no more than 1028 bytes.
+  The first four bytes will be interpreted as the command to run and
+  the remainder will be used as the input data. A write will send the
+  command to the firmware to begin processing.
+
+* Each write must be followed by exactly one read. Any double write will
+  produce an error and any read that doesn't follow a write will
+  produce an error.
+
+* A read will block until the firmware completes the command and return
+  the four bytes of status plus up to 1024 bytes of output data. (The
+  length will be specified by the size parameter of the read call --
+  reading less than 4 bytes will produce an error.
+
+* The poll call will also be supported for userspace applications that
+  need to do other things while waiting for the command to complete.
diff --git a/MAINTAINERS b/MAINTAINERS
index 9c35198..0ab858d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9513,6 +9513,7 @@ M:	Stephen Bates <stephen.bates@microsemi.com>
 M:	Logan Gunthorpe <logang@deltatee.com>
 L:	linux-pci@vger.kernel.org
 S:	Maintained
+F:	Documentation/switchtec.txt
 F:	drivers/pci/switch/switchtec*
 
 PCI DRIVER FOR NVIDIA TEGRA
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
  2017-02-02 18:06 ` [PATCH v2 2/4] switchtec: Add user interface documentation Logan Gunthorpe
@ 2017-02-02 18:06 ` Logan Gunthorpe
  2017-02-10 14:54   ` Greg Kroah-Hartman
  2017-02-23 22:43   ` Bjorn Helgaas
  2017-02-02 18:06 ` [PATCH v2 4/4] switchtec: Add IOCTLs " Logan Gunthorpe
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-02 18:06 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

This patch adds a few read-only sysfs attributes which provide
some device information that is exposed from the devices. Primarily
component and device names and versions. These are documented in
Documentation/ABI/testing/sysfs-class-switchtec.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
---
 Documentation/ABI/testing/sysfs-class-switchtec |  96 ++++++++++++++++++++
 MAINTAINERS                                     |   1 +
 drivers/pci/switch/switchtec.c                  | 113 ++++++++++++++++++++++++
 3 files changed, 210 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec

diff --git a/Documentation/ABI/testing/sysfs-class-switchtec b/Documentation/ABI/testing/sysfs-class-switchtec
new file mode 100644
index 0000000..48cb4c1
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-switchtec
@@ -0,0 +1,96 @@
+switchtec - Microsemi Switchtec PCI Switch Management Endpoint
+
+For details on this subsystem look at Documentation/switchtec.txt.
+
+What: 		/sys/class/switchtec
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	The switchtec class subsystem folder.
+		Each registered switchtec driver is represented by a switchtecX
+		subfolder (X being an integer >= 0).
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/component_id
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Component identifier as stored in the hardware (eg. PM8543)
+		(read only)
+Values: 	arbitrary string.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/component_revision
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Component revision stored in the hardware (read only)
+Values: 	integer.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/component_vendor
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Component vendor as stored in the hardware (eg. MICROSEM)
+		(read only)
+Values: 	arbitrary string.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/device_version
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Device version as stored in the hardware (read only)
+Values: 	integer.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/fw_version
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Currently running firmware version (read only)
+Values: 	integer (in hexadecimal).
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/partition
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Partition number for this device in the switch (read only)
+Values: 	integer.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/partition_count
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Total number of partitions in the switch (read only)
+Values: 	integer.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/product_id
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Product identifier as stored in the hardware (eg. PSX 48XG3)
+		(read only)
+Values: 	arbitrary string.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/product_revision
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Product revision stored in the hardware (eg. RevB)
+		(read only)
+Values: 	arbitrary string.
+
+
+What:		/sys/class/switchtec/switchtec[0-9]+/product_vendor
+Date:		05-Jan-2017
+KernelVersion:	v4.11
+Contact:	Logan Gunthorpe <logang@deltatee.com>
+Description:	Product vendor as stored in the hardware (eg. MICROSEM)
+		(read only)
+Values: 	arbitrary string.
diff --git a/MAINTAINERS b/MAINTAINERS
index 0ab858d..d39b7fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9514,6 +9514,7 @@ M:	Logan Gunthorpe <logang@deltatee.com>
 L:	linux-pci@vger.kernel.org
 S:	Maintained
 F:	Documentation/switchtec.txt
+F:	Documentation/ABI/testing/sysfs-class-switchtec
 F:	drivers/pci/switch/switchtec*
 
 PCI DRIVER FOR NVIDIA TEGRA
diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index e4a4294..354ddfd 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -485,6 +485,118 @@ static void mrpc_timeout_work(struct work_struct *work)
 	mutex_unlock(&stdev->mrpc_mutex);
 }
 
+static ssize_t device_version_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+	u32 ver;
+
+	ver = ioread32(&stdev->mmio_sys_info->device_version);
+
+	return sprintf(buf, "%x\n", ver);
+}
+static DEVICE_ATTR_RO(device_version);
+
+static ssize_t fw_version_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+	u32 ver;
+
+	ver = ioread32(&stdev->mmio_sys_info->firmware_version);
+
+	return sprintf(buf, "%08x\n", ver);
+}
+static DEVICE_ATTR_RO(fw_version);
+
+static ssize_t io_string_show(char *buf, void __iomem *attr, size_t len)
+{
+	int i;
+
+	memcpy_fromio(buf, attr, len);
+	buf[len] = '\n';
+	buf[len + 1] = 0;
+
+	for (i = len - 1; i > 0; i--) {
+		if (buf[i] != ' ')
+			break;
+		buf[i] = '\n';
+		buf[i + 1] = 0;
+	}
+
+	return strlen(buf);
+}
+
+#define DEVICE_ATTR_SYS_INFO_STR(field) \
+static ssize_t field ## _show(struct device *dev, \
+	struct device_attribute *attr, char *buf) \
+{ \
+	struct switchtec_dev *stdev = to_stdev(dev); \
+	return io_string_show(buf, &stdev->mmio_sys_info->field, \
+			    sizeof(stdev->mmio_sys_info->field)); \
+} \
+\
+static DEVICE_ATTR_RO(field)
+
+DEVICE_ATTR_SYS_INFO_STR(vendor_id);
+DEVICE_ATTR_SYS_INFO_STR(product_id);
+DEVICE_ATTR_SYS_INFO_STR(product_revision);
+DEVICE_ATTR_SYS_INFO_STR(component_vendor);
+
+static ssize_t component_id_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+	int id = ioread16(&stdev->mmio_sys_info->component_id);
+
+	return sprintf(buf, "PM%04X\n", id);
+}
+static DEVICE_ATTR_RO(component_id);
+
+static ssize_t component_revision_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+	int rev = ioread8(&stdev->mmio_sys_info->component_revision);
+
+	return sprintf(buf, "%d\n", rev);
+}
+static DEVICE_ATTR_RO(component_revision);
+
+static ssize_t partition_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+
+	return sprintf(buf, "%d\n", stdev->partition);
+}
+static DEVICE_ATTR_RO(partition);
+
+static ssize_t partition_count_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct switchtec_dev *stdev = to_stdev(dev);
+
+	return sprintf(buf, "%d\n", stdev->partition_count);
+}
+static DEVICE_ATTR_RO(partition_count);
+
+static struct attribute *switchtec_device_attrs[] = {
+	&dev_attr_device_version.attr,
+	&dev_attr_fw_version.attr,
+	&dev_attr_vendor_id.attr,
+	&dev_attr_product_id.attr,
+	&dev_attr_product_revision.attr,
+	&dev_attr_component_vendor.attr,
+	&dev_attr_component_id.attr,
+	&dev_attr_component_revision.attr,
+	&dev_attr_partition.attr,
+	&dev_attr_partition_count.attr,
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(switchtec_device);
+
 static int switchtec_dev_open(struct inode *inode, struct file *filp)
 {
 	struct switchtec_dev *stdev;
@@ -699,6 +811,7 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
 	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
 	dev->class = switchtec_class;
 	dev->parent = &pdev->dev;
+	dev->groups = switchtec_device_groups;
 	dev->release = stdev_release;
 	dev_set_name(dev, "switchtec%d", minor);
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 4/4] switchtec: Add IOCTLs to the Switchtec driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
                   ` (2 preceding siblings ...)
  2017-02-02 18:06 ` [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver Logan Gunthorpe
@ 2017-02-02 18:06 ` Logan Gunthorpe
  2017-02-09 23:16 ` [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Wei Zhang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-02 18:06 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

This patch introduces a couple of special IOCTLs which are provided to:

* Inform userspace of firmware partition locations
* Pass event counts and allow userspace to wait on events
* Translate between PFF numbers used by the switch to
  port numbers.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
---
 Documentation/ioctl/ioctl-number.txt |   1 +
 Documentation/switchtec.txt          |  27 ++
 MAINTAINERS                          |   1 +
 drivers/pci/switch/switchtec.c       | 467 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/switchtec_ioctl.h | 132 ++++++++++
 5 files changed, 628 insertions(+)
 create mode 100644 include/uapi/linux/switchtec_ioctl.h

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 81c7f2b..032b33c 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -191,6 +191,7 @@ Code  Seq#(hex)	Include File		Comments
 'W'	00-1F	linux/watchdog.h	conflict!
 'W'	00-1F	linux/wanrouter.h	conflict!		(pre 3.9)
 'W'	00-3F	sound/asound.h		conflict!
+'W'	40-5F   drivers/pci/switch/switchtec.c
 'X'	all	fs/xfs/xfs_fs.h		conflict!
 		and fs/xfs/linux-2.6/xfs_ioctl32.h
 		and include/linux/falloc.h
diff --git a/Documentation/switchtec.txt b/Documentation/switchtec.txt
index 4bced4c..a0a9c7b 100644
--- a/Documentation/switchtec.txt
+++ b/Documentation/switchtec.txt
@@ -51,3 +51,30 @@ The char device has the following semantics:
 
 * The poll call will also be supported for userspace applications that
   need to do other things while waiting for the command to complete.
+
+The following IOCTLs are also supported by the device:
+
+* SWITCHTEC_IOCTL_FLASH_INFO - Retrieve firmware length and number
+  of partitions in the device.
+
+* SWITCHTEC_IOCTL_FLASH_PART_INFO - Retrieve address and lengeth for
+  any specified partition in flash.
+
+* SWITCHTEC_IOCTL_EVENT_SUMMARY - Read a structure of bitmaps
+  indicating all uncleared events.
+
+* SWITCHTEC_IOCTL_EVENT_CTL - Get the current count, clear and set flags
+  for any event. This ioctl takes in a switchtec_ioctl_event_ctl struct
+  with the event_id, index and flags set (index being the partition or PFF
+  number for non-global events). It returns whether the event has
+  occurred, the number of times and any event specific data. The flags
+  can be used to clear the count or enable and disable actions to
+  happen when the event occurs.
+  By using the SWITCHTEC_IOCTL_EVENT_FLAG_EN_POLL flag,
+  you can set an event to trigger a poll command to return with
+  POLLPRI. In this way, userspace can wait for events to occur.
+
+* SWITCHTEC_IOCTL_PFF_TO_PORT and SWITCHTEC_IOCTL_PORT_TO_PFF convert
+  between PCI Function Framework number (used by the event system)
+  and Switchtec Logic Port ID and Partition number (which is more
+  user friendly).
diff --git a/MAINTAINERS b/MAINTAINERS
index d39b7fd..ea461d1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9516,6 +9516,7 @@ S:	Maintained
 F:	Documentation/switchtec.txt
 F:	Documentation/ABI/testing/sysfs-class-switchtec
 F:	drivers/pci/switch/switchtec*
+F:	include/uapi/linux/switchtec_ioctl.h
 
 PCI DRIVER FOR NVIDIA TEGRA
 M:	Thierry Reding <thierry.reding@gmail.com>
diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index 354ddfd..82bfd18 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -13,6 +13,8 @@
  *
  */
 
+#include <linux/switchtec_ioctl.h>
+
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/fs.h>
@@ -755,6 +757,417 @@ static unsigned int switchtec_dev_poll(struct file *filp, poll_table *wait)
 	return ret;
 }
 
+static int ioctl_flash_info(struct switchtec_dev *stdev,
+			    struct switchtec_ioctl_flash_info __user *uinfo)
+{
+	struct switchtec_ioctl_flash_info info = {0};
+	struct flash_info_regs __iomem *fi = stdev->mmio_flash_info;
+
+	info.flash_length = ioread32(&fi->flash_length);
+	info.num_partitions = SWITCHTEC_IOCTL_NUM_PARTITIONS;
+
+	if (copy_to_user(uinfo, &info, sizeof(info)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static void set_fw_info_part(struct switchtec_ioctl_flash_part_info *info,
+			     struct partition_info __iomem *pi)
+{
+	info->address = ioread32(&pi->address);
+	info->length = ioread32(&pi->length);
+}
+
+static int ioctl_flash_part_info(struct switchtec_dev *stdev,
+	struct switchtec_ioctl_flash_part_info __user *uinfo)
+{
+	struct switchtec_ioctl_flash_part_info info = {0};
+	struct flash_info_regs __iomem *fi = stdev->mmio_flash_info;
+	u32 active_addr = -1;
+
+	if (copy_from_user(&info, uinfo, sizeof(info)))
+		return -EFAULT;
+
+	switch (info.flash_partition) {
+	case SWITCHTEC_IOCTL_PART_CFG0:
+		active_addr = ioread32(&fi->active_cfg);
+		set_fw_info_part(&info, &fi->cfg0);
+		break;
+	case SWITCHTEC_IOCTL_PART_CFG1:
+		active_addr = ioread32(&fi->active_cfg);
+		set_fw_info_part(&info, &fi->cfg1);
+		break;
+	case SWITCHTEC_IOCTL_PART_IMG0:
+		active_addr = ioread32(&fi->active_img);
+		set_fw_info_part(&info, &fi->img0);
+		break;
+	case SWITCHTEC_IOCTL_PART_IMG1:
+		active_addr = ioread32(&fi->active_img);
+		set_fw_info_part(&info, &fi->img1);
+		break;
+	case SWITCHTEC_IOCTL_PART_NVLOG:
+		set_fw_info_part(&info, &fi->nvlog);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR0:
+		set_fw_info_part(&info, &fi->vendor[0]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR1:
+		set_fw_info_part(&info, &fi->vendor[1]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR2:
+		set_fw_info_part(&info, &fi->vendor[2]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR3:
+		set_fw_info_part(&info, &fi->vendor[3]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR4:
+		set_fw_info_part(&info, &fi->vendor[4]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR5:
+		set_fw_info_part(&info, &fi->vendor[5]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR6:
+		set_fw_info_part(&info, &fi->vendor[6]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR7:
+		set_fw_info_part(&info, &fi->vendor[7]);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (info.address == active_addr)
+		info.active = 1;
+
+	if (copy_to_user(uinfo, &info, sizeof(info)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int ioctl_event_summary(struct switchtec_dev *stdev,
+	struct switchtec_user *stuser,
+	struct switchtec_ioctl_event_summary __user *usum)
+{
+	struct switchtec_ioctl_event_summary s = {0};
+	int i;
+	u32 reg;
+
+	s.global = ioread32(&stdev->mmio_sw_event->global_summary);
+	s.part_bitmap = ioread32(&stdev->mmio_sw_event->part_event_bitmap);
+	s.local_part = ioread32(&stdev->mmio_part_cfg->part_event_summary);
+
+	for (i = 0; i < stdev->partition_count; i++) {
+		reg = ioread32(&stdev->mmio_part_cfg_all[i].part_event_summary);
+		s.part[i] = reg;
+	}
+
+	for (i = 0; i < SWITCHTEC_MAX_PFF_CSR; i++) {
+		reg = ioread16(&stdev->mmio_pff_csr[i].vendor_id);
+		if (reg != MICROSEMI_VENDOR_ID)
+			break;
+
+		reg = ioread32(&stdev->mmio_pff_csr[i].pff_event_summary);
+		s.pff[i] = reg;
+	}
+
+	if (copy_to_user(usum, &s, sizeof(s)))
+		return -EFAULT;
+
+	stuser->event_cnt = atomic_read(&stdev->event_cnt);
+
+	return 0;
+}
+
+static u32 __iomem *global_ev_reg(struct switchtec_dev *stdev,
+				  size_t offset, int index)
+{
+	return (void __iomem *)stdev->mmio_sw_event + offset;
+}
+
+static u32 __iomem *part_ev_reg(struct switchtec_dev *stdev,
+				size_t offset, int index)
+{
+	return (void __iomem *)&stdev->mmio_part_cfg_all[index] + offset;
+}
+
+static u32 __iomem *pff_ev_reg(struct switchtec_dev *stdev,
+			       size_t offset, int index)
+{
+	return (void __iomem *)&stdev->mmio_pff_csr[index] + offset;
+}
+
+#define EV_GLB(i, r)[i] = {offsetof(struct sw_event_regs, r), global_ev_reg}
+#define EV_PAR(i, r)[i] = {offsetof(struct part_cfg_regs, r), part_ev_reg}
+#define EV_PFF(i, r)[i] = {offsetof(struct pff_csr_regs, r), pff_ev_reg}
+
+const struct event_reg {
+	size_t offset;
+	u32 __iomem *(*map_reg)(struct switchtec_dev *stdev,
+				size_t offset, int index);
+} event_regs[] = {
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_STACK_ERROR, stack_error_event_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_PPU_ERROR, ppu_error_event_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_ISP_ERROR, isp_error_event_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_SYS_RESET, sys_reset_event_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_FW_EXC, fw_exception_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_FW_NMI, fw_nmi_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_FW_NON_FATAL, fw_non_fatal_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_FW_FATAL, fw_fatal_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_TWI_MRPC_COMP, twi_mrpc_comp_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_TWI_MRPC_COMP_ASYNC,
+	       twi_mrpc_comp_async_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_CLI_MRPC_COMP, cli_mrpc_comp_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_CLI_MRPC_COMP_ASYNC,
+	       cli_mrpc_comp_async_hdr),
+	EV_GLB(SWITCHTEC_IOCTL_EVENT_GPIO_INT, gpio_interrupt_hdr),
+	EV_PAR(SWITCHTEC_IOCTL_EVENT_PART_RESET, part_reset_hdr),
+	EV_PAR(SWITCHTEC_IOCTL_EVENT_MRPC_COMP, mrpc_comp_hdr),
+	EV_PAR(SWITCHTEC_IOCTL_EVENT_MRPC_COMP_ASYNC, mrpc_comp_async_hdr),
+	EV_PAR(SWITCHTEC_IOCTL_EVENT_DYN_PART_BIND_COMP, dyn_binding_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_AER_IN_P2P, aer_in_p2p_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_AER_IN_VEP, aer_in_vep_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_DPC, dpc_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_CTS, cts_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_HOTPLUG, hotplug_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_IER, ier_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_THRESH, threshold_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_POWER_MGMT, power_mgmt_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_TLP_THROTTLING, tlp_throttling_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_FORCE_SPEED, force_speed_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_CREDIT_TIMEOUT, credit_timeout_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_LINK_STATE, link_state_hdr),
+};
+
+static u32 __iomem *event_hdr_addr(struct switchtec_dev *stdev,
+				   int event_id, int index)
+{
+	size_t off;
+
+	if (event_id < 0 || event_id >= SWITCHTEC_IOCTL_MAX_EVENTS)
+		return ERR_PTR(-EINVAL);
+
+	off = event_regs[event_id].offset;
+
+	if (event_regs[event_id].map_reg == part_ev_reg) {
+		if (index == SWITCHTEC_IOCTL_EVENT_LOCAL_PART_IDX)
+			index = stdev->partition;
+		else if (index < 0 || index >= stdev->partition_count)
+			return ERR_PTR(-EINVAL);
+	} else if (event_regs[event_id].map_reg == pff_ev_reg) {
+		if (index < 0 || index >= stdev->pff_csr_count)
+			return ERR_PTR(-EINVAL);
+	}
+
+	return event_regs[event_id].map_reg(stdev, off, index);
+}
+
+static int event_ctl(struct switchtec_dev *stdev,
+		     struct switchtec_ioctl_event_ctl *ctl)
+{
+	int i;
+	u32 __iomem *reg;
+	u32 hdr;
+
+	reg = event_hdr_addr(stdev, ctl->event_id, ctl->index);
+	if (IS_ERR(reg))
+		return PTR_ERR(reg);
+
+	hdr = ioread32(reg);
+	for (i = 0; i < ARRAY_SIZE(ctl->data); i++)
+		ctl->data[i] = ioread32(&reg[i + 1]);
+
+	ctl->occurred = hdr & SWITCHTEC_EVENT_OCCURRED;
+	ctl->count = (hdr >> 5) & 0xFF;
+
+	if (!(ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_CLEAR))
+		hdr &= ~SWITCHTEC_EVENT_CLEAR;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_EN_POLL)
+		hdr |= SWITCHTEC_EVENT_EN_IRQ;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_DIS_POLL)
+		hdr &= ~SWITCHTEC_EVENT_EN_IRQ;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_EN_LOG)
+		hdr |= SWITCHTEC_EVENT_EN_LOG;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_DIS_LOG)
+		hdr &= ~SWITCHTEC_EVENT_EN_LOG;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_EN_CLI)
+		hdr |= SWITCHTEC_EVENT_EN_CLI;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_DIS_CLI)
+		hdr &= ~SWITCHTEC_EVENT_EN_CLI;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_EN_FATAL)
+		hdr |= SWITCHTEC_EVENT_FATAL;
+	if (ctl->flags & SWITCHTEC_IOCTL_EVENT_FLAG_DIS_FATAL)
+		hdr &= ~SWITCHTEC_EVENT_FATAL;
+
+	if (ctl->flags)
+		iowrite32(hdr, reg);
+
+	ctl->flags = 0;
+	if (hdr & SWITCHTEC_EVENT_EN_IRQ)
+		ctl->flags |= SWITCHTEC_IOCTL_EVENT_FLAG_EN_POLL;
+	if (hdr & SWITCHTEC_EVENT_EN_LOG)
+		ctl->flags |= SWITCHTEC_IOCTL_EVENT_FLAG_EN_LOG;
+	if (hdr & SWITCHTEC_EVENT_EN_CLI)
+		ctl->flags |= SWITCHTEC_IOCTL_EVENT_FLAG_EN_CLI;
+	if (hdr & SWITCHTEC_EVENT_FATAL)
+		ctl->flags |= SWITCHTEC_IOCTL_EVENT_FLAG_EN_FATAL;
+
+	return 0;
+}
+
+static int ioctl_event_ctl(struct switchtec_dev *stdev,
+	struct switchtec_ioctl_event_ctl __user *uctl)
+{
+	int ret;
+	int nr_idxs;
+	struct switchtec_ioctl_event_ctl ctl;
+
+	if (copy_from_user(&ctl, uctl, sizeof(ctl)))
+		return -EFAULT;
+
+	if (ctl.event_id > SWITCHTEC_IOCTL_MAX_EVENTS)
+		return -EINVAL;
+
+	if (ctl.flags & SWITCHTEC_IOCTL_EVENT_FLAG_UNUSED)
+		return -EINVAL;
+
+	if (ctl.index == SWITCHTEC_IOCTL_EVENT_IDX_ALL) {
+		if (event_regs[ctl.event_id].map_reg == global_ev_reg)
+			nr_idxs = 1;
+		else if (event_regs[ctl.event_id].map_reg == part_ev_reg)
+			nr_idxs = stdev->partition_count;
+		else if (event_regs[ctl.event_id].map_reg == pff_ev_reg)
+			nr_idxs = stdev->pff_csr_count;
+		else
+			return -EINVAL;
+
+		for (ctl.index = 0; ctl.index < nr_idxs; ctl.index++) {
+			ret = event_ctl(stdev, &ctl);
+			if (ret < 0)
+				return ret;
+		}
+	} else {
+		ret = event_ctl(stdev, &ctl);
+		if (ret < 0)
+			return ret;
+	}
+
+	if (copy_to_user(uctl, &ctl, sizeof(ctl)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int ioctl_pff_to_port(struct switchtec_dev *stdev,
+			     struct switchtec_ioctl_pff_port *up)
+{
+	int i, part;
+	u32 reg;
+	struct part_cfg_regs *pcfg;
+	struct switchtec_ioctl_pff_port p;
+
+	if (copy_from_user(&p, up, sizeof(p)))
+		return -EFAULT;
+
+	p.port = -1;
+	for (part = 0; part < stdev->partition_count; part++) {
+		pcfg = &stdev->mmio_part_cfg_all[part];
+		p.partition = part;
+
+		reg = ioread32(&pcfg->usp_pff_inst_id);
+		if (reg == p.pff) {
+			p.port = 0;
+			break;
+		}
+
+		reg = ioread32(&pcfg->vep_pff_inst_id);
+		if (reg == p.pff) {
+			p.port = SWITCHTEC_IOCTL_PFF_VEP;
+			break;
+		}
+
+		for (i = 0; i < ARRAY_SIZE(pcfg->dsp_pff_inst_id); i++) {
+			reg = ioread32(&pcfg->dsp_pff_inst_id[i]);
+			if (reg != p.pff)
+				continue;
+
+			p.port = i + 1;
+			break;
+		}
+
+		if (p.port != -1)
+			break;
+	}
+
+	if (copy_to_user(up, &p, sizeof(p)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int ioctl_port_to_pff(struct switchtec_dev *stdev,
+			     struct switchtec_ioctl_pff_port *up)
+{
+	struct switchtec_ioctl_pff_port p;
+	struct part_cfg_regs *pcfg;
+
+	if (copy_from_user(&p, up, sizeof(p)))
+		return -EFAULT;
+
+	if (p.partition == SWITCHTEC_IOCTL_EVENT_LOCAL_PART_IDX)
+		pcfg = stdev->mmio_part_cfg;
+	else if (p.partition < stdev->partition_count)
+		pcfg = &stdev->mmio_part_cfg_all[p.partition];
+	else
+		return -EINVAL;
+
+	switch (p.port) {
+	case 0:
+		p.pff = ioread32(&pcfg->usp_pff_inst_id);
+		break;
+	case SWITCHTEC_IOCTL_PFF_VEP:
+		p.pff = ioread32(&pcfg->vep_pff_inst_id);
+		break;
+	default:
+		if (p.port > ARRAY_SIZE(pcfg->dsp_pff_inst_id))
+			return -EINVAL;
+		p.pff = ioread32(&pcfg->dsp_pff_inst_id[p.port - 1]);
+		break;
+	}
+
+	if (copy_to_user(up, &p, sizeof(p)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static long switchtec_dev_ioctl(struct file *filp, unsigned int cmd,
+				unsigned long arg)
+{
+	struct switchtec_user *stuser = filp->private_data;
+	struct switchtec_dev *stdev = stuser->stdev;
+
+	void __user *argp = (void __user *)arg;
+
+	switch (cmd) {
+	case SWITCHTEC_IOCTL_FLASH_INFO:
+		return ioctl_flash_info(stdev, argp);
+	case SWITCHTEC_IOCTL_FLASH_PART_INFO:
+		return ioctl_flash_part_info(stdev, argp);
+	case SWITCHTEC_IOCTL_EVENT_SUMMARY:
+		return ioctl_event_summary(stdev, stuser, argp);
+	case SWITCHTEC_IOCTL_EVENT_CTL:
+		return ioctl_event_ctl(stdev, argp);
+	case SWITCHTEC_IOCTL_PFF_TO_PORT:
+		return ioctl_pff_to_port(stdev, argp);
+	case SWITCHTEC_IOCTL_PORT_TO_PFF:
+		return ioctl_port_to_pff(stdev, argp);
+	default:
+		return -ENOTTY;
+	}
+}
+
 static const struct file_operations switchtec_fops = {
 	.owner = THIS_MODULE,
 	.open = switchtec_dev_open,
@@ -762,6 +1175,8 @@ static const struct file_operations switchtec_fops = {
 	.write = switchtec_dev_write,
 	.read = switchtec_dev_read,
 	.poll = switchtec_dev_poll,
+	.unlocked_ioctl = switchtec_dev_ioctl,
+	.compat_ioctl = switchtec_dev_ioctl,
 };
 
 static void stdev_release(struct device *dev)
@@ -839,11 +1254,52 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
 	return ERR_PTR(rc);
 }
 
+static int mask_event(struct switchtec_dev *stdev, int eid, int idx)
+{
+	size_t off = event_regs[eid].offset;
+	u32 __iomem *hdr_reg;
+	u32 hdr;
+
+	hdr_reg = event_regs[eid].map_reg(stdev, off, idx);
+	hdr = ioread32(hdr_reg);
+
+	if (!(hdr & SWITCHTEC_EVENT_OCCURRED && hdr & SWITCHTEC_EVENT_EN_IRQ))
+		return 0;
+
+	dev_dbg(&stdev->dev, "%s: %d %d %x\n", __func__, eid, idx, hdr);
+	hdr &= ~(SWITCHTEC_EVENT_EN_IRQ | SWITCHTEC_EVENT_OCCURRED);
+	iowrite32(hdr, hdr_reg);
+
+	return 1;
+}
+
+static int mask_all_events(struct switchtec_dev *stdev, int eid)
+{
+	int idx;
+	int count = 0;
+
+	if (event_regs[eid].map_reg == part_ev_reg) {
+		for (idx = 0; idx < stdev->partition_count; idx++)
+			count += mask_event(stdev, eid, idx);
+	} else if (event_regs[eid].map_reg == pff_ev_reg) {
+		for (idx = 0; idx < stdev->pff_csr_count; idx++) {
+			if (!stdev->pff_local[idx])
+				continue;
+			count += mask_event(stdev, eid, idx);
+		}
+	} else {
+		count += mask_event(stdev, eid, 0);
+	}
+
+	return count;
+}
+
 static irqreturn_t switchtec_event_isr(int irq, void *dev)
 {
 	struct switchtec_dev *stdev = dev;
 	u32 reg;
 	irqreturn_t ret = IRQ_NONE;
+	int eid, event_count = 0;
 
 	reg = ioread32(&stdev->mmio_part_cfg->mrpc_comp_hdr);
 	if (reg & SWITCHTEC_EVENT_OCCURRED) {
@@ -853,6 +1309,17 @@ static irqreturn_t switchtec_event_isr(int irq, void *dev)
 		iowrite32(reg, &stdev->mmio_part_cfg->mrpc_comp_hdr);
 	}
 
+	for (eid = 0; eid < SWITCHTEC_IOCTL_MAX_EVENTS; eid++)
+		event_count += mask_all_events(stdev, eid);
+
+	if (event_count) {
+		atomic_inc(&stdev->event_cnt);
+		wake_up_interruptible(&stdev->event_wq);
+		dev_dbg(&stdev->dev, "%s: %d events\n", __func__,
+			event_count);
+		return IRQ_HANDLED;
+	}
+
 	return ret;
 }
 
diff --git a/include/uapi/linux/switchtec_ioctl.h b/include/uapi/linux/switchtec_ioctl.h
new file mode 100644
index 0000000..3e824e1
--- /dev/null
+++ b/include/uapi/linux/switchtec_ioctl.h
@@ -0,0 +1,132 @@
+/*
+ * Microsemi Switchtec PCIe Driver
+ * Copyright (c) 2017, Microsemi Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#ifndef _UAPI_LINUX_SWITCHTEC_IOCTL_H
+#define _UAPI_LINUX_SWITCHTEC_IOCTL_H
+
+#include <linux/types.h>
+
+#define SWITCHTEC_IOCTL_PART_CFG0	0
+#define SWITCHTEC_IOCTL_PART_CFG1	1
+#define SWITCHTEC_IOCTL_PART_IMG0	2
+#define SWITCHTEC_IOCTL_PART_IMG1	3
+#define SWITCHTEC_IOCTL_PART_NVLOG	4
+#define SWITCHTEC_IOCTL_PART_VENDOR0	5
+#define SWITCHTEC_IOCTL_PART_VENDOR1	6
+#define SWITCHTEC_IOCTL_PART_VENDOR2	7
+#define SWITCHTEC_IOCTL_PART_VENDOR3	8
+#define SWITCHTEC_IOCTL_PART_VENDOR4	9
+#define SWITCHTEC_IOCTL_PART_VENDOR5	10
+#define SWITCHTEC_IOCTL_PART_VENDOR6	11
+#define SWITCHTEC_IOCTL_PART_VENDOR7	12
+#define SWITCHTEC_IOCTL_NUM_PARTITIONS	13
+
+struct switchtec_ioctl_flash_info {
+	__u64 flash_length;
+	__u32 num_partitions;
+	__u32 padding;
+};
+
+struct switchtec_ioctl_flash_part_info {
+	__u32 flash_partition;
+	__u32 address;
+	__u32 length;
+	__u32 active;
+};
+
+struct switchtec_ioctl_event_summary {
+	__u64 global;
+	__u64 part_bitmap;
+	__u32 local_part;
+	__u32 padding;
+	__u32 part[48];
+	__u32 pff[48];
+};
+
+#define SWITCHTEC_IOCTL_EVENT_STACK_ERROR		0
+#define SWITCHTEC_IOCTL_EVENT_PPU_ERROR			1
+#define SWITCHTEC_IOCTL_EVENT_ISP_ERROR			2
+#define SWITCHTEC_IOCTL_EVENT_SYS_RESET			3
+#define SWITCHTEC_IOCTL_EVENT_FW_EXC			4
+#define SWITCHTEC_IOCTL_EVENT_FW_NMI			5
+#define SWITCHTEC_IOCTL_EVENT_FW_NON_FATAL		6
+#define SWITCHTEC_IOCTL_EVENT_FW_FATAL			7
+#define SWITCHTEC_IOCTL_EVENT_TWI_MRPC_COMP		8
+#define SWITCHTEC_IOCTL_EVENT_TWI_MRPC_COMP_ASYNC	9
+#define SWITCHTEC_IOCTL_EVENT_CLI_MRPC_COMP		10
+#define SWITCHTEC_IOCTL_EVENT_CLI_MRPC_COMP_ASYNC	11
+#define SWITCHTEC_IOCTL_EVENT_GPIO_INT			12
+#define SWITCHTEC_IOCTL_EVENT_PART_RESET		13
+#define SWITCHTEC_IOCTL_EVENT_MRPC_COMP			14
+#define SWITCHTEC_IOCTL_EVENT_MRPC_COMP_ASYNC		15
+#define SWITCHTEC_IOCTL_EVENT_DYN_PART_BIND_COMP	16
+#define SWITCHTEC_IOCTL_EVENT_AER_IN_P2P		17
+#define SWITCHTEC_IOCTL_EVENT_AER_IN_VEP		18
+#define SWITCHTEC_IOCTL_EVENT_DPC			19
+#define SWITCHTEC_IOCTL_EVENT_CTS			20
+#define SWITCHTEC_IOCTL_EVENT_HOTPLUG			21
+#define SWITCHTEC_IOCTL_EVENT_IER			22
+#define SWITCHTEC_IOCTL_EVENT_THRESH			23
+#define SWITCHTEC_IOCTL_EVENT_POWER_MGMT		24
+#define SWITCHTEC_IOCTL_EVENT_TLP_THROTTLING		25
+#define SWITCHTEC_IOCTL_EVENT_FORCE_SPEED		26
+#define SWITCHTEC_IOCTL_EVENT_CREDIT_TIMEOUT		27
+#define SWITCHTEC_IOCTL_EVENT_LINK_STATE		28
+#define SWITCHTEC_IOCTL_MAX_EVENTS			29
+
+#define SWITCHTEC_IOCTL_EVENT_LOCAL_PART_IDX -1
+#define SWITCHTEC_IOCTL_EVENT_IDX_ALL -2
+
+#define SWITCHTEC_IOCTL_EVENT_FLAG_CLEAR     (1 << 0)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_EN_POLL   (1 << 1)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_EN_LOG    (1 << 2)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_EN_CLI    (1 << 3)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_EN_FATAL  (1 << 4)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_DIS_POLL  (1 << 5)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_DIS_LOG   (1 << 6)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_DIS_CLI   (1 << 7)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_DIS_FATAL (1 << 8)
+#define SWITCHTEC_IOCTL_EVENT_FLAG_UNUSED    (~0x1ff)
+
+struct switchtec_ioctl_event_ctl {
+	__u32 event_id;
+	__s32 index;
+	__u32 flags;
+	__u32 occurred;
+	__u32 count;
+	__u32 data[5];
+};
+
+#define SWITCHTEC_IOCTL_PFF_VEP 100
+struct switchtec_ioctl_pff_port {
+	__u32 pff;
+	__u32 partition;
+	__u32 port;
+};
+
+#define SWITCHTEC_IOCTL_FLASH_INFO \
+	_IOR('W', 0x40, struct switchtec_ioctl_flash_info)
+#define SWITCHTEC_IOCTL_FLASH_PART_INFO \
+	_IOWR('W', 0x41, struct switchtec_ioctl_flash_part_info)
+#define SWITCHTEC_IOCTL_EVENT_SUMMARY \
+	_IOR('W', 0x42, struct switchtec_ioctl_event_summary)
+#define SWITCHTEC_IOCTL_EVENT_CTL \
+	_IOWR('W', 0x43, struct switchtec_ioctl_event_ctl)
+#define SWITCHTEC_IOCTL_PFF_TO_PORT \
+	_IOWR('W', 0x44, struct switchtec_ioctl_pff_port)
+#define SWITCHTEC_IOCTL_PORT_TO_PFF \
+	_IOWR('W', 0x45, struct switchtec_ioctl_pff_port)
+
+#endif
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
                   ` (3 preceding siblings ...)
  2017-02-02 18:06 ` [PATCH v2 4/4] switchtec: Add IOCTLs " Logan Gunthorpe
@ 2017-02-09 23:16 ` Wei Zhang
  2017-02-09 23:46   ` Wei Zhang
  2017-02-10 14:54 ` Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 36+ messages in thread
From: Wei Zhang @ 2017-02-09 23:16 UTC (permalink / raw)
  To: Logan Gunthorpe, Keith Busch, Myron Stowe, Greg Kroah-Hartman,
	Bjorn Helgaas, Geert Uytterhoeven, Jonathan Corbet,
	David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Krishna Dhulipala
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

Hi,

The switchtec driver patches [PATCH v2 (0-4)/4] in conjunction with the switchtec userland tool is used to communicate with the Microsemi 8536 PCIe Switch used on Facebook’s Lightning platform. The following essential driver and tool functions were successfully tested with it:
·         Retrieval of firmware and configuration information along with CRCs
·         Switch firmware and configuration upgrades
·         Switch PHY/Link error counter collection and the ability to reset them
·         Switch upstream and downstream ports’ link status reporting
·         Switch interface functioning
·         Retrieval of switch ASIC temperature
·         Exporting switch firmware log dump
·         Read the information of firmware and configuration binaries
·         Extract the firmware and configuration images stored in the switch EEPROM

Tested-by: Krishna Dhulipala krishnad@fb.com
Tested-by: Wei Zhang wzhang@fb.com  

Thanks,
-Wei

--
wei zhang | software engineer | facebook
wzhang@fb.com | (408) 460-4803

On 2/2/17, 10:05 AM, "Logan Gunthorpe" <logang@deltatee.com> wrote:

    Changes since v1:
    
    * Rebased onto 4.10-rc6 (cleanly)
    * Split the patch into a few more easily digestible patches (as
      suggested by Greg Kroah-Hartman)
    * Folded switchtec.c into switchtec.h (per Greg)
    * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
    * Fixed some issues in the documentation so it has a proper
      reStructredText format (as noted by Jonathan Corbet)
    * Fixed padding and sizes in the IOCTL structures as noticed by Emil
      Velikov and used pahole to verify their consistency across 32 and 64
      bit builds
    * Reworked one of the IOCTL interfaces to be more future proof (per
      Emil).
    
    Changes since RFC:
    
    * Fixed incorrect use of the drive model as pointed out by Greg
      Kroah-Hartman
    * Used devm functions as suggested by Keith Busch
    * Added a handful of sysfs attributes to the switchtec class
    * Added a handful of IOCTLs to the switchtec device
    * A number of miscellaneous bug fixes
    
    --
    
    Hi,
    
    This is a continuation of the RFC we posted lasted month [1] which
    proposes a management driver for Microsemi's Switchtec line of PCI
    switches. This hardware is still looking to be used in the Open
    Compute Platform
    
    To make this entirely clear: the Switchtec products are compliant
    with the PCI specifications and are supported today with the standard
    in-kernel driver. However, these devices also expose a management endpoint
    on a separate PCI function address which can be used to perform some
    advanced operations. This is a driver for that function. See the patch
    for more information.
    
    Since the RFC, we've made the changes requested by Greg Kroah-Hartman
    and Keith Busch, and we've also fleshed out a number of features. We've
    added a couple of IOCTLs and sysfs attributes which are documented in
    the patch. Significant work has also been done on the userspace tool
    which is available under a GPL license at [2]. We've also had testing
    done by some of the interested parties.
    
    We hope to see this work included in either 4.11 or 4.12 assuming a
    smooth review process.
    
    The patch is based off of the v4.10-rc6 release.
    
    Thanks for your review,
    
    Logan
    
    [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_linux-2Dpci_msg56897.html&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=LRFoLl92zWj5mkgkc_hRcg&m=VLDBJqzotzGkTj8-xjlfT-J0k2uFq6FcWg2nA_oKYJo&s=OkigHoSqH1Z3dnmLqN76lIQ_WxRJDj1uqIDl35SI58A&e= 
    [2] https://github.com/sbates130272/switchtec-user
    
    --
    
    Logan Gunthorpe (4):
      MicroSemi Switchtec management interface driver
      switchtec: Add user interface documentation
      switchtec: Add sysfs attributes to the Switchtec driver
      switchtec: Add IOCTLs to the Switchtec driver
    
     Documentation/ABI/testing/sysfs-class-switchtec |   96 ++
     Documentation/ioctl/ioctl-number.txt            |    1 +
     Documentation/switchtec.txt                     |   80 ++
     MAINTAINERS                                     |   11 +
     drivers/pci/Kconfig                             |    1 +
     drivers/pci/Makefile                            |    1 +
     drivers/pci/switch/Kconfig                      |   13 +
     drivers/pci/switch/Makefile                     |    1 +
     drivers/pci/switch/switchtec.c                  | 1608 +++++++++++++++++++++++
     include/uapi/linux/switchtec_ioctl.h            |  132 ++
     10 files changed, 1944 insertions(+)
     create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
     create mode 100644 Documentation/switchtec.txt
     create mode 100644 drivers/pci/switch/Kconfig
     create mode 100644 drivers/pci/switch/Makefile
     create mode 100644 drivers/pci/switch/switchtec.c
     create mode 100644 include/uapi/linux/switchtec_ioctl.h
    
    --
    2.1.4
    

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-09 23:16 ` [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Wei Zhang
@ 2017-02-09 23:46   ` Wei Zhang
  0 siblings, 0 replies; 36+ messages in thread
From: Wei Zhang @ 2017-02-09 23:46 UTC (permalink / raw)
  To: Logan Gunthorpe, Keith Busch, Myron Stowe, Greg Kroah-Hartman,
	Bjorn Helgaas, Geert Uytterhoeven, Jonathan Corbet,
	David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Krishna Dhulipala
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

Corrections to the email below.  It should be

    Tested-by: Krishna Dhulipala krishnad@fb.com
    Reviewed-by: Wei Zhang wzhang@fb.com  

Thanks,
-Wei

--
wei zhang | software engineer | facebook
wzhang@fb.com | (408) 460-4803

On 2/9/17, 3:16 PM, "Wei Zhang" <wzhang@fb.com> wrote:

    Hi,
    
    The switchtec driver patches [PATCH v2 (0-4)/4] in conjunction with the switchtec userland tool is used to communicate with the Microsemi 8536 PCIe Switch used on Facebook’s Lightning platform. The following essential driver and tool functions were successfully tested with it:
    ·         Retrieval of firmware and configuration information along with CRCs
    ·         Switch firmware and configuration upgrades
    ·         Switch PHY/Link error counter collection and the ability to reset them
    ·         Switch upstream and downstream ports’ link status reporting
    ·         Switch interface functioning
    ·         Retrieval of switch ASIC temperature
    ·         Exporting switch firmware log dump
    ·         Read the information of firmware and configuration binaries
    ·         Extract the firmware and configuration images stored in the switch EEPROM
    
    Tested-by: Krishna Dhulipala krishnad@fb.com
    Reviewed-by: Wei Zhang wzhang@fb.com  
    
    Thanks,
    -Wei
    
    --
    wei zhang | software engineer | facebook
    wzhang@fb.com | (408) 460-4803
    
    On 2/2/17, 10:05 AM, "Logan Gunthorpe" <logang@deltatee.com> wrote:
    
        Changes since v1:
        
        * Rebased onto 4.10-rc6 (cleanly)
        * Split the patch into a few more easily digestible patches (as
          suggested by Greg Kroah-Hartman)
        * Folded switchtec.c into switchtec.h (per Greg)
        * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
        * Fixed some issues in the documentation so it has a proper
          reStructredText format (as noted by Jonathan Corbet)
        * Fixed padding and sizes in the IOCTL structures as noticed by Emil
          Velikov and used pahole to verify their consistency across 32 and 64
          bit builds
        * Reworked one of the IOCTL interfaces to be more future proof (per
          Emil).
        
        Changes since RFC:
        
        * Fixed incorrect use of the drive model as pointed out by Greg
          Kroah-Hartman
        * Used devm functions as suggested by Keith Busch
        * Added a handful of sysfs attributes to the switchtec class
        * Added a handful of IOCTLs to the switchtec device
        * A number of miscellaneous bug fixes
        
        --
        
        Hi,
        
        This is a continuation of the RFC we posted lasted month [1] which
        proposes a management driver for Microsemi's Switchtec line of PCI
        switches. This hardware is still looking to be used in the Open
        Compute Platform
        
        To make this entirely clear: the Switchtec products are compliant
        with the PCI specifications and are supported today with the standard
        in-kernel driver. However, these devices also expose a management endpoint
        on a separate PCI function address which can be used to perform some
        advanced operations. This is a driver for that function. See the patch
        for more information.
        
        Since the RFC, we've made the changes requested by Greg Kroah-Hartman
        and Keith Busch, and we've also fleshed out a number of features. We've
        added a couple of IOCTLs and sysfs attributes which are documented in
        the patch. Significant work has also been done on the userspace tool
        which is available under a GPL license at [2]. We've also had testing
        done by some of the interested parties.
        
        We hope to see this work included in either 4.11 or 4.12 assuming a
        smooth review process.
        
        The patch is based off of the v4.10-rc6 release.
        
        Thanks for your review,
        
        Logan
        
        [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_linux-2Dpci_msg56897.html&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=LRFoLl92zWj5mkgkc_hRcg&m=VLDBJqzotzGkTj8-xjlfT-J0k2uFq6FcWg2nA_oKYJo&s=OkigHoSqH1Z3dnmLqN76lIQ_WxRJDj1uqIDl35SI58A&e= 
        [2] https://github.com/sbates130272/switchtec-user
        
        --
        
        Logan Gunthorpe (4):
          MicroSemi Switchtec management interface driver
          switchtec: Add user interface documentation
          switchtec: Add sysfs attributes to the Switchtec driver
          switchtec: Add IOCTLs to the Switchtec driver
        
         Documentation/ABI/testing/sysfs-class-switchtec |   96 ++
         Documentation/ioctl/ioctl-number.txt            |    1 +
         Documentation/switchtec.txt                     |   80 ++
         MAINTAINERS                                     |   11 +
         drivers/pci/Kconfig                             |    1 +
         drivers/pci/Makefile                            |    1 +
         drivers/pci/switch/Kconfig                      |   13 +
         drivers/pci/switch/Makefile                     |    1 +
         drivers/pci/switch/switchtec.c                  | 1608 +++++++++++++++++++++++
         include/uapi/linux/switchtec_ioctl.h            |  132 ++
         10 files changed, 1944 insertions(+)
         create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
         create mode 100644 Documentation/switchtec.txt
         create mode 100644 drivers/pci/switch/Kconfig
         create mode 100644 drivers/pci/switch/Makefile
         create mode 100644 drivers/pci/switch/switchtec.c
         create mode 100644 include/uapi/linux/switchtec_ioctl.h
        
        --
        2.1.4
        
    
    

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
@ 2017-02-10 14:51   ` Greg Kroah-Hartman
  2017-02-10 16:48     ` Logan Gunthorpe
  2017-02-10 17:57     ` [PATCH] switchtec: cleanup cdev init Logan Gunthorpe
  2017-02-10 14:54   ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Greg Kroah-Hartman
  2017-02-24  0:35   ` Bjorn Helgaas
  2 siblings, 2 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 14:51 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Thu, Feb 02, 2017 at 11:06:00AM -0700, Logan Gunthorpe wrote:
> +	cdev = &stdev->cdev;
> +	cdev_init(cdev, &switchtec_fops);
> +	cdev->owner = THIS_MODULE;
> +	cdev->kobj.parent = &dev->kobj;

Minor nit, the kobject in a cdev is unlike any other kobject you have
ever seen, don't mess with it, it's not doing anything like you think it
is doing.  So no need to set the parent field.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
                   ` (4 preceding siblings ...)
  2017-02-09 23:16 ` [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Wei Zhang
@ 2017-02-10 14:54 ` Greg Kroah-Hartman
  2017-02-10 16:14 ` Jens Axboe
  2017-02-17 20:36 ` Logan Gunthorpe
  7 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 14:54 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Thu, Feb 02, 2017 at 11:05:59AM -0700, Logan Gunthorpe wrote:
> Changes since v1:
> 
> * Rebased onto 4.10-rc6 (cleanly)
> * Split the patch into a few more easily digestible patches (as
>   suggested by Greg Kroah-Hartman)
> * Folded switchtec.c into switchtec.h (per Greg)
> * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
> * Fixed some issues in the documentation so it has a proper
>   reStructredText format (as noted by Jonathan Corbet)
> * Fixed padding and sizes in the IOCTL structures as noticed by Emil
>   Velikov and used pahole to verify their consistency across 32 and 64
>   bit builds
> * Reworked one of the IOCTL interfaces to be more future proof (per
>   Emil).
> 
> Changes since RFC:
> 
> * Fixed incorrect use of the drive model as pointed out by Greg
>   Kroah-Hartman
> * Used devm functions as suggested by Keith Busch
> * Added a handful of sysfs attributes to the switchtec class
> * Added a handful of IOCTLs to the switchtec device
> * A number of miscellaneous bug fixes

I didn't audit the ioctl code, but the driver model and sysfs stuff
looks sane to me, nice job cleaning it up.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
  2017-02-10 14:51   ` Greg Kroah-Hartman
@ 2017-02-10 14:54   ` Greg Kroah-Hartman
  2017-02-24  0:35   ` Bjorn Helgaas
  2 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 14:54 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Thu, Feb 02, 2017 at 11:06:00AM -0700, Logan Gunthorpe wrote:
> Microsemi's "Switchtec" line of PCI switch devices is already well
> supported by the kernel with standard PCI switch drivers. However, the
> Switchtec device advertises a special management endpoint with a separate
> PCI function address and class code. This endpoint enables some additional
> functionality which includes:
> 
>  * Packet and Byte Counters
>  * Switch Firmware Upgrades
>  * Event and Error logs
>  * Querying port link status
>  * Custom user firmware commands
> 
> This patch introduces the switchtec kernel module which provides
> PCI driver that exposes a char device. The char device provides
> userspace access to this interface through read, write and (optionally)
> poll calls.
> 
> A userspace tool and library which utilizes this interface is available
> at [1]. This tool takes inspiration (and borrows some code) from
> nvme-cli [2]. The tool is largely complete at this time but additional
> features may be added in the future.
> 
> [1] https://github.com/sbates130272/switchtec-user
> [2] https://github.com/linux-nvme/nvme-cli
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver
  2017-02-02 18:06 ` [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver Logan Gunthorpe
@ 2017-02-10 14:54   ` Greg Kroah-Hartman
  2017-02-23 22:43   ` Bjorn Helgaas
  1 sibling, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 14:54 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Thu, Feb 02, 2017 at 11:06:02AM -0700, Logan Gunthorpe wrote:
> This patch adds a few read-only sysfs attributes which provide
> some device information that is exposed from the devices. Primarily
> component and device names and versions. These are documented in
> Documentation/ABI/testing/sysfs-class-switchtec.
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
                   ` (5 preceding siblings ...)
  2017-02-10 14:54 ` Greg Kroah-Hartman
@ 2017-02-10 16:14 ` Jens Axboe
  2017-02-17 20:36 ` Logan Gunthorpe
  7 siblings, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2017-02-10 16:14 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, linux-doc, linux-pci, linux-kernel,
	linux-nvme, Stephen Bates, Kurt Schwemmer

On Thu, Feb 2, 2017 at 11:05 AM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Changes since v1:
>
> * Rebased onto 4.10-rc6 (cleanly)
> * Split the patch into a few more easily digestible patches (as
>   suggested by Greg Kroah-Hartman)
> * Folded switchtec.c into switchtec.h (per Greg)
> * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
> * Fixed some issues in the documentation so it has a proper
>   reStructredText format (as noted by Jonathan Corbet)
> * Fixed padding and sizes in the IOCTL structures as noticed by Emil
>   Velikov and used pahole to verify their consistency across 32 and 64
>   bit builds
> * Reworked one of the IOCTL interfaces to be more future proof (per
>   Emil).
>
> Changes since RFC:
>
> * Fixed incorrect use of the drive model as pointed out by Greg
>   Kroah-Hartman
> * Used devm functions as suggested by Keith Busch
> * Added a handful of sysfs attributes to the switchtec class
> * Added a handful of IOCTLs to the switchtec device
> * A number of miscellaneous bug fixes
>
> --
>
> Hi,
>
> This is a continuation of the RFC we posted lasted month [1] which
> proposes a management driver for Microsemi's Switchtec line of PCI
> switches. This hardware is still looking to be used in the Open
> Compute Platform
>
> To make this entirely clear: the Switchtec products are compliant
> with the PCI specifications and are supported today with the standard
> in-kernel driver. However, these devices also expose a management endpoint
> on a separate PCI function address which can be used to perform some
> advanced operations. This is a driver for that function. See the patch
> for more information.
>
> Since the RFC, we've made the changes requested by Greg Kroah-Hartman
> and Keith Busch, and we've also fleshed out a number of features. We've
> added a couple of IOCTLs and sysfs attributes which are documented in
> the patch. Significant work has also been done on the userspace tool
> which is available under a GPL license at [2]. We've also had testing
> done by some of the interested parties.
>
> We hope to see this work included in either 4.11 or 4.12 assuming a
> smooth review process.
>
> The patch is based off of the v4.10-rc6 release.
>
> Thanks for your review,

Looks good to me, you can add my:

Reviewed-by: Jens Axboe <axboe@fb.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-10 14:51   ` Greg Kroah-Hartman
@ 2017-02-10 16:48     ` Logan Gunthorpe
  2017-02-10 16:55       ` Greg Kroah-Hartman
  2017-02-10 17:57     ` [PATCH] switchtec: cleanup cdev init Logan Gunthorpe
  1 sibling, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-10 16:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

Hey Greg,

Thanks so much for the review.

On 10/02/17 07:51 AM, Greg Kroah-Hartman wrote:
> On Thu, Feb 02, 2017 at 11:06:00AM -0700, Logan Gunthorpe wrote:
>> +	cdev = &stdev->cdev;
>> +	cdev_init(cdev, &switchtec_fops);
>> +	cdev->owner = THIS_MODULE;
>> +	cdev->kobj.parent = &dev->kobj;
> 
> Minor nit, the kobject in a cdev is unlike any other kobject you have
> ever seen, don't mess with it, it's not doing anything like you think it
> is doing.  So no need to set the parent field.

Ok, that makes sense. I'll do a v3 shortly.

I copied this from drivers/dax/dax.c so when I have a spare moment I'll
submit a patch to remove it from there as well.

Just to make sure I get this right without extra churn: does this look
correct?


        cdev = &stdev->cdev;
        cdev_init(cdev, &switchtec_fops);
        cdev->owner = THIS_MODULE;

        rc = cdev_add(&stdev->cdev, dev->devt, 1);
        if (rc)
                goto err_cdev;

        dev = &stdev->dev;
        dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
        dev->class = switchtec_class;
        dev->parent = &pdev->dev;
        dev->groups = switchtec_device_groups;
        dev->release = stdev_release;
        dev_set_name(dev, "switchtec%d", minor);

        rc = device_register(dev);
        if (rc) {
                cdev_del(&stdev->cdev);
                put_device(dev);
                return ERR_PTR(rc);
        }


Thanks,

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-10 16:48     ` Logan Gunthorpe
@ 2017-02-10 16:55       ` Greg Kroah-Hartman
  2017-02-10 17:03         ` Logan Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 16:55 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Fri, Feb 10, 2017 at 09:48:37AM -0700, Logan Gunthorpe wrote:
> Hey Greg,
> 
> Thanks so much for the review.
> 
> On 10/02/17 07:51 AM, Greg Kroah-Hartman wrote:
> > On Thu, Feb 02, 2017 at 11:06:00AM -0700, Logan Gunthorpe wrote:
> >> +	cdev = &stdev->cdev;
> >> +	cdev_init(cdev, &switchtec_fops);
> >> +	cdev->owner = THIS_MODULE;
> >> +	cdev->kobj.parent = &dev->kobj;
> > 
> > Minor nit, the kobject in a cdev is unlike any other kobject you have
> > ever seen, don't mess with it, it's not doing anything like you think it
> > is doing.  So no need to set the parent field.
> 
> Ok, that makes sense. I'll do a v3 shortly.
> 
> I copied this from drivers/dax/dax.c so when I have a spare moment I'll
> submit a patch to remove it from there as well.
> 
> Just to make sure I get this right without extra churn: does this look
> correct?
> 
> 
>         cdev = &stdev->cdev;
>         cdev_init(cdev, &switchtec_fops);
>         cdev->owner = THIS_MODULE;
> 
>         rc = cdev_add(&stdev->cdev, dev->devt, 1);
>         if (rc)
>                 goto err_cdev;
> 
>         dev = &stdev->dev;
>         dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
>         dev->class = switchtec_class;
>         dev->parent = &pdev->dev;
>         dev->groups = switchtec_device_groups;
>         dev->release = stdev_release;
>         dev_set_name(dev, "switchtec%d", minor);
> 
>         rc = device_register(dev);
>         if (rc) {
>                 cdev_del(&stdev->cdev);
>                 put_device(dev);
>                 return ERR_PTR(rc);
>         }
> 

Yes, but try it yourself to verify it really is correct :)

And it can just be an add-on patch, no need to respin a whole new
version for just that simple change, it doesn't hurt anything as-is,
it's just "not needed".

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-10 16:55       ` Greg Kroah-Hartman
@ 2017-02-10 17:03         ` Logan Gunthorpe
  2017-02-10 17:09           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-10 17:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel



On 10/02/17 09:55 AM, Greg Kroah-Hartman wrote:
> Yes, but try it yourself to verify it really is correct :)

Yes, of course... turns out it wasn't. I accidentally refereed to dev
before I assigned it. I had mainly just wanted your feedback to ensure
that switching to device_register made sense.

> And it can just be an add-on patch, no need to respin a whole new
> version for just that simple change, it doesn't hurt anything as-is,
> it's just "not needed".

Ok... How should I do that exactly? Attempt to reply to the thread with
another patch?

Thanks,

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-10 17:03         ` Logan Gunthorpe
@ 2017-02-10 17:09           ` Greg Kroah-Hartman
  2017-02-10 18:00             ` Logan Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-10 17:09 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On Fri, Feb 10, 2017 at 10:03:10AM -0700, Logan Gunthorpe wrote:
> 
> 
> On 10/02/17 09:55 AM, Greg Kroah-Hartman wrote:
> > Yes, but try it yourself to verify it really is correct :)
> 
> Yes, of course... turns out it wasn't. I accidentally refereed to dev
> before I assigned it. I had mainly just wanted your feedback to ensure
> that switching to device_register made sense.
> 
> > And it can just be an add-on patch, no need to respin a whole new
> > version for just that simple change, it doesn't hurt anything as-is,
> > it's just "not needed".
> 
> Ok... How should I do that exactly? Attempt to reply to the thread with
> another patch?

Sure, or just wait for these to be applied to the PCI tree and then send
a follow-on patch.  It's up to Bjorn really, as to what he wants.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] switchtec: cleanup cdev init
  2017-02-10 14:51   ` Greg Kroah-Hartman
  2017-02-10 16:48     ` Logan Gunthorpe
@ 2017-02-10 17:57     ` Logan Gunthorpe
  2017-02-18 20:22       ` Logan Gunthorpe
  1 sibling, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-10 17:57 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Logan Gunthorpe

Per a suggestion from Greg Kroah-Hartman [1], don't set the cdev's
kobject parent. This allows us to use device_register instead of init
and add.

[1] https://lkml.org/lkml/2017/2/10/370
---
 drivers/pci/switch/switchtec.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index 82bfd18..014eaec 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -1222,24 +1222,23 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
 		return ERR_PTR(minor);
 
 	dev = &stdev->dev;
-	device_initialize(dev);
 	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
-	dev->class = switchtec_class;
-	dev->parent = &pdev->dev;
-	dev->groups = switchtec_device_groups;
-	dev->release = stdev_release;
-	dev_set_name(dev, "switchtec%d", minor);
 
 	cdev = &stdev->cdev;
 	cdev_init(cdev, &switchtec_fops);
 	cdev->owner = THIS_MODULE;
-	cdev->kobj.parent = &dev->kobj;
 
 	rc = cdev_add(&stdev->cdev, dev->devt, 1);
 	if (rc)
 		goto err_cdev;
 
-	rc = device_add(dev);
+	dev->class = switchtec_class;
+	dev->parent = &pdev->dev;
+	dev->groups = switchtec_device_groups;
+	dev->release = stdev_release;
+	dev_set_name(dev, "switchtec%d", minor);
+
+	rc = device_register(dev);
 	if (rc) {
 		cdev_del(&stdev->cdev);
 		put_device(dev);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-10 17:09           ` Greg Kroah-Hartman
@ 2017-02-10 18:00             ` Logan Gunthorpe
  0 siblings, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-10 18:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Keith Busch, Myron Stowe, Bjorn Helgaas, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel



On 10/02/17 10:09 AM, Greg Kroah-Hartman wrote:
> Sure, or just wait for these to be applied to the PCI tree and then send
> a follow-on patch.  It's up to Bjorn really, as to what he wants.

Ok, I sent a working follow-on patch to this thread.

@Bjorn: I'm happy to send the patches however you like. Just let me
know. If at all possible, we'd really like to see this in for the 4.11
merge window.

Thanks,

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
                   ` (6 preceding siblings ...)
  2017-02-10 16:14 ` Jens Axboe
@ 2017-02-17 20:36 ` Logan Gunthorpe
  2017-02-23 20:36   ` Logan Gunthorpe
  7 siblings, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-17 20:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

Hi Bjorn,

Can you give us an idea of when you might be able to comment on our
patchset? We've addressed all the outstanding issues and have a couple
of reviewed and tested tags. So we'd like to see this move forward as
soon as possible.

I can do a respin with the tags collected or address any concerns you
may have, just please let us know.

Thanks,

Logan

On 02/02/17 11:05 AM, Logan Gunthorpe wrote:
> Changes since v1:
> 
> * Rebased onto 4.10-rc6 (cleanly)
> * Split the patch into a few more easily digestible patches (as
>   suggested by Greg Kroah-Hartman)
> * Folded switchtec.c into switchtec.h (per Greg)
> * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
> * Fixed some issues in the documentation so it has a proper
>   reStructredText format (as noted by Jonathan Corbet)
> * Fixed padding and sizes in the IOCTL structures as noticed by Emil
>   Velikov and used pahole to verify their consistency across 32 and 64
>   bit builds
> * Reworked one of the IOCTL interfaces to be more future proof (per
>   Emil).
> 
> Changes since RFC:
> 
> * Fixed incorrect use of the drive model as pointed out by Greg
>   Kroah-Hartman
> * Used devm functions as suggested by Keith Busch
> * Added a handful of sysfs attributes to the switchtec class
> * Added a handful of IOCTLs to the switchtec device
> * A number of miscellaneous bug fixes
> 
> --
> 
> Hi,
> 
> This is a continuation of the RFC we posted lasted month [1] which
> proposes a management driver for Microsemi's Switchtec line of PCI
> switches. This hardware is still looking to be used in the Open
> Compute Platform
> 
> To make this entirely clear: the Switchtec products are compliant
> with the PCI specifications and are supported today with the standard
> in-kernel driver. However, these devices also expose a management endpoint
> on a separate PCI function address which can be used to perform some
> advanced operations. This is a driver for that function. See the patch
> for more information.
> 
> Since the RFC, we've made the changes requested by Greg Kroah-Hartman
> and Keith Busch, and we've also fleshed out a number of features. We've
> added a couple of IOCTLs and sysfs attributes which are documented in
> the patch. Significant work has also been done on the userspace tool
> which is available under a GPL license at [2]. We've also had testing
> done by some of the interested parties.
> 
> We hope to see this work included in either 4.11 or 4.12 assuming a
> smooth review process.
> 
> The patch is based off of the v4.10-rc6 release.
> 
> Thanks for your review,
> 
> Logan
> 
> [1] https://www.spinics.net/lists/linux-pci/msg56897.html
> [2] https://github.com/sbates130272/switchtec-user
> 
> --
> 
> Logan Gunthorpe (4):
>   MicroSemi Switchtec management interface driver
>   switchtec: Add user interface documentation
>   switchtec: Add sysfs attributes to the Switchtec driver
>   switchtec: Add IOCTLs to the Switchtec driver
> 
>  Documentation/ABI/testing/sysfs-class-switchtec |   96 ++
>  Documentation/ioctl/ioctl-number.txt            |    1 +
>  Documentation/switchtec.txt                     |   80 ++
>  MAINTAINERS                                     |   11 +
>  drivers/pci/Kconfig                             |    1 +
>  drivers/pci/Makefile                            |    1 +
>  drivers/pci/switch/Kconfig                      |   13 +
>  drivers/pci/switch/Makefile                     |    1 +
>  drivers/pci/switch/switchtec.c                  | 1608 +++++++++++++++++++++++
>  include/uapi/linux/switchtec_ioctl.h            |  132 ++
>  10 files changed, 1944 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
>  create mode 100644 Documentation/switchtec.txt
>  create mode 100644 drivers/pci/switch/Kconfig
>  create mode 100644 drivers/pci/switch/Makefile
>  create mode 100644 drivers/pci/switch/switchtec.c
>  create mode 100644 include/uapi/linux/switchtec_ioctl.h
> 
> --
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] switchtec: cleanup cdev init
  2017-02-10 17:57     ` [PATCH] switchtec: cleanup cdev init Logan Gunthorpe
@ 2017-02-18 20:22       ` Logan Gunthorpe
  2017-02-19 21:43         ` Dan Williams
  0 siblings, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-18 20:22 UTC (permalink / raw)
  To: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Dan Williams
  Cc: Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

Hi,

Please don't apply this patch and instead apply the switchtec driver as
we submitted in v2. As per the discussion in [1], not using the cdev's
kobj parent results in incorrect reference counting and a possible use
of the cdev after its containing structure is freed.

I've also done a quick review around the kernel and found the pattern in
question to be quite prevalent. It's used in, at least, these drivers:

drivers/dax/dax.c:708
drivers/input/evdev.c:1419
drivers/input/joydev.c:908
drivers/input/mousedev.c:904
drivers/gpio/gpiolib.c:1039
drivers/char/tpm/tpm-chip.c:190
drivers/platform/chrome/cros_ec_dev.c:411
drivers/infiniband/hw/hfi1/device.c:72
drivers/infiniband/core/uverbs_main.c:1168
drivers/infiniband/core/user_mad.c:1186
drivers/infiniband/core/user_mad.c:1205
drivers/iio/industrialio-core.c:1721
drivers/media/cec/cec-core.c:140
drivers/media/media-devnode.c:258

Dan Williams has proposed a helper API in [2] that could make this code
appear significantly less suspect which I think would be a good idea.
However, for now, I feel the switchtec code should also follow this
pattern (ie. the way it was submitted in v2) and can be changed or
cleaned up when/if the above numerous examples are fixed up.

Thanks,

Logan

[1] https://lkml.org/lkml/2017/2/10/589
[2] https://lkml.org/lkml/2017/2/13/700




On 10/02/17 10:57 AM, Logan Gunthorpe wrote:
> Per a suggestion from Greg Kroah-Hartman [1], don't set the cdev's
> kobject parent. This allows us to use device_register instead of init
> and add.
> 
> [1] https://lkml.org/lkml/2017/2/10/370
> ---
>  drivers/pci/switch/switchtec.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
> index 82bfd18..014eaec 100644
> --- a/drivers/pci/switch/switchtec.c
> +++ b/drivers/pci/switch/switchtec.c
> @@ -1222,24 +1222,23 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
>  		return ERR_PTR(minor);
>  
>  	dev = &stdev->dev;
> -	device_initialize(dev);
>  	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
> -	dev->class = switchtec_class;
> -	dev->parent = &pdev->dev;
> -	dev->groups = switchtec_device_groups;
> -	dev->release = stdev_release;
> -	dev_set_name(dev, "switchtec%d", minor);
>  
>  	cdev = &stdev->cdev;
>  	cdev_init(cdev, &switchtec_fops);
>  	cdev->owner = THIS_MODULE;
> -	cdev->kobj.parent = &dev->kobj;
>  
>  	rc = cdev_add(&stdev->cdev, dev->devt, 1);
>  	if (rc)
>  		goto err_cdev;
>  
> -	rc = device_add(dev);
> +	dev->class = switchtec_class;
> +	dev->parent = &pdev->dev;
> +	dev->groups = switchtec_device_groups;
> +	dev->release = stdev_release;
> +	dev_set_name(dev, "switchtec%d", minor);
> +
> +	rc = device_register(dev);
>  	if (rc) {
>  		cdev_del(&stdev->cdev);
>  		put_device(dev);
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] switchtec: cleanup cdev init
  2017-02-18 20:22       ` Logan Gunthorpe
@ 2017-02-19 21:43         ` Dan Williams
  2017-02-20  4:22           ` Logan Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Dan Williams @ 2017-02-19 21:43 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, linux-doc, linux-pci,
	Linux Kernel Mailing List, linux-nvme, Stephen Bates,
	Kurt Schwemmer

On Sat, Feb 18, 2017 at 12:22 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi,
>
> Please don't apply this patch and instead apply the switchtec driver as
> we submitted in v2. As per the discussion in [1], not using the cdev's
> kobj parent results in incorrect reference counting and a possible use
> of the cdev after its containing structure is freed.

Is this race present for all file operations? I've only seen it with
mmap() and late faults. So if these other drivers do not support mmap
it's not clear they can trigger the failure.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] switchtec: cleanup cdev init
  2017-02-19 21:43         ` Dan Williams
@ 2017-02-20  4:22           ` Logan Gunthorpe
  2017-02-21 18:37             ` Jason Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-20  4:22 UTC (permalink / raw)
  To: Dan Williams
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, linux-doc, linux-pci,
	Linux Kernel Mailing List, linux-nvme, Stephen Bates,
	Kurt Schwemmer, Jason Gunthorpe



On 19/02/17 02:43 PM, Dan Williams wrote:
> Is this race present for all file operations? I've only seen it with
> mmap() and late faults. So if these other drivers do not support mmap
> it's not clear they can trigger the failure.

I don't see how it's related to mmap only. I think there's a number of
variations on this but the race I see happens if you try to take a
reference to the device in the open/close handlers of a char device file.

If a driver puts the last remaining reference in the release handler,
then the device structure will be freed, and on returning from the
release handler, the char_dev code will try to put the last reference to
the cdev structure which may have already been free'd. There needs to be
a way for the cdev structure to take a reference on the device structure
so it doesn't get freed and the kobj.parent trick seems to accomplish
this fairly well.

Really, in any situation where there's a cdev and a device in the same
structure, the life cycles of the two become linked but their reference
counts are not and that is the problem here.

I feel like a number of developers have made the same realization
independently so this would probably be a good thing to clean up.

See these commits for examples spanning around 5 years:

4a215aa Input: fix use-after-free introduced with dynamic minor changes
0d5b7da iio: Prevent race between IIO chardev opening and IIO device
ba0ef85 tpm: Fix initialization of the cdev

Also, seems both my brother and I have independently looked into this
exact issue:

https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html

So, I think it would be a good idea to clean it up with Dan's proposed
API cleanup. If I have some time tomorrow, I may throw together a patch
set with that patch and the changes to the instances around the kernel.

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] switchtec: cleanup cdev init
  2017-02-20  4:22           ` Logan Gunthorpe
@ 2017-02-21 18:37             ` Jason Gunthorpe
  0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2017-02-21 18:37 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Dan Williams, Keith Busch, Myron Stowe, Greg Kroah-Hartman,
	Bjorn Helgaas, Geert Uytterhoeven, Jonathan Corbet,
	David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	linux-doc, linux-pci, Linux Kernel Mailing List, linux-nvme,
	Stephen Bates, Kurt Schwemmer

On Sun, Feb 19, 2017 at 09:22:35PM -0700, Logan Gunthorpe wrote:

> Really, in any situation where there's a cdev and a device in the same
> structure, the life cycles of the two become linked but their reference
> counts are not and that is the problem here.

Yes, the cdev must hold a kref on the containing struct otherwise
userspace can trigger a use after free. This cannot be fixed with an
approach inside the open/release function either as the cdev core code
itself relies on the memory to exist.

I've suggested something like this before:

https://lkml.org/lkml/2015/7/8/1066

So I hope this will make it in, it is a step in the right direction.

If it does, would you make another patch to go further? I think
cdev_init should take enough arguments to hold the enclosing kref, API
wise there should be no API to init a cdev without the caller
specifying the enclosing struct's kref. That is the only way we will
stamp this bug-class out.

Eg look at kernel/time/posix-clock.c, it is wrong in the same way as
well - the kref_put in posix_clock_release is not enough to make it
work, clk->cdev is referenced after posix_clock_release returns by the
cdev core so this has a use-after-free.

Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-17 20:36 ` Logan Gunthorpe
@ 2017-02-23 20:36   ` Logan Gunthorpe
  2017-02-23 20:51     ` Sinan Kaya
                       ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-23 20:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Jens Axboe

Hello,

We're still waiting on any kind of response from Bjorn. (If you're
listening please say something!)

Does anyone have any suggestions for dealing with an unresponsive
maintainer? Or a way for us to move forward with this quickly and get it
merged?

ie. Can anyone else pick this up through another route? In the end, it's
just a fairly basic driver and doesn't touch any core PCI functionality
and we've had a fair amount of review from other kernel contributors,
all of which we've addressed.

Thanks,

Logan




On 17/02/17 01:36 PM, Logan Gunthorpe wrote:
> Hi Bjorn,
> 
> Can you give us an idea of when you might be able to comment on our
> patchset? We've addressed all the outstanding issues and have a couple
> of reviewed and tested tags. So we'd like to see this move forward as
> soon as possible.
> 
> I can do a respin with the tags collected or address any concerns you
> may have, just please let us know.
> 
> Thanks,
> 
> Logan
> 
> On 02/02/17 11:05 AM, Logan Gunthorpe wrote:
>> Changes since v1:
>>
>> * Rebased onto 4.10-rc6 (cleanly)
>> * Split the patch into a few more easily digestible patches (as
>>   suggested by Greg Kroah-Hartman)
>> * Folded switchtec.c into switchtec.h (per Greg)
>> * Fixed a bunch of 32bit build warnings caught by the kbuild test robot
>> * Fixed some issues in the documentation so it has a proper
>>   reStructredText format (as noted by Jonathan Corbet)
>> * Fixed padding and sizes in the IOCTL structures as noticed by Emil
>>   Velikov and used pahole to verify their consistency across 32 and 64
>>   bit builds
>> * Reworked one of the IOCTL interfaces to be more future proof (per
>>   Emil).
>>
>> Changes since RFC:
>>
>> * Fixed incorrect use of the drive model as pointed out by Greg
>>   Kroah-Hartman
>> * Used devm functions as suggested by Keith Busch
>> * Added a handful of sysfs attributes to the switchtec class
>> * Added a handful of IOCTLs to the switchtec device
>> * A number of miscellaneous bug fixes
>>
>> --
>>
>> Hi,
>>
>> This is a continuation of the RFC we posted lasted month [1] which
>> proposes a management driver for Microsemi's Switchtec line of PCI
>> switches. This hardware is still looking to be used in the Open
>> Compute Platform
>>
>> To make this entirely clear: the Switchtec products are compliant
>> with the PCI specifications and are supported today with the standard
>> in-kernel driver. However, these devices also expose a management endpoint
>> on a separate PCI function address which can be used to perform some
>> advanced operations. This is a driver for that function. See the patch
>> for more information.
>>
>> Since the RFC, we've made the changes requested by Greg Kroah-Hartman
>> and Keith Busch, and we've also fleshed out a number of features. We've
>> added a couple of IOCTLs and sysfs attributes which are documented in
>> the patch. Significant work has also been done on the userspace tool
>> which is available under a GPL license at [2]. We've also had testing
>> done by some of the interested parties.
>>
>> We hope to see this work included in either 4.11 or 4.12 assuming a
>> smooth review process.
>>
>> The patch is based off of the v4.10-rc6 release.
>>
>> Thanks for your review,
>>
>> Logan
>>
>> [1] https://www.spinics.net/lists/linux-pci/msg56897.html
>> [2] https://github.com/sbates130272/switchtec-user
>>
>> --
>>
>> Logan Gunthorpe (4):
>>   MicroSemi Switchtec management interface driver
>>   switchtec: Add user interface documentation
>>   switchtec: Add sysfs attributes to the Switchtec driver
>>   switchtec: Add IOCTLs to the Switchtec driver
>>
>>  Documentation/ABI/testing/sysfs-class-switchtec |   96 ++
>>  Documentation/ioctl/ioctl-number.txt            |    1 +
>>  Documentation/switchtec.txt                     |   80 ++
>>  MAINTAINERS                                     |   11 +
>>  drivers/pci/Kconfig                             |    1 +
>>  drivers/pci/Makefile                            |    1 +
>>  drivers/pci/switch/Kconfig                      |   13 +
>>  drivers/pci/switch/Makefile                     |    1 +
>>  drivers/pci/switch/switchtec.c                  | 1608 +++++++++++++++++++++++
>>  include/uapi/linux/switchtec_ioctl.h            |  132 ++
>>  10 files changed, 1944 insertions(+)
>>  create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
>>  create mode 100644 Documentation/switchtec.txt
>>  create mode 100644 drivers/pci/switch/Kconfig
>>  create mode 100644 drivers/pci/switch/Makefile
>>  create mode 100644 drivers/pci/switch/switchtec.c
>>  create mode 100644 include/uapi/linux/switchtec_ioctl.h
>>
>> --
>> 2.1.4
>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:36   ` Logan Gunthorpe
@ 2017-02-23 20:51     ` Sinan Kaya
  2017-02-23 20:54       ` Jens Axboe
  2017-02-23 20:56       ` Logan Gunthorpe
  2017-02-23 20:52     ` Jens Axboe
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 36+ messages in thread
From: Sinan Kaya @ 2017-02-23 20:51 UTC (permalink / raw)
  To: Logan Gunthorpe, Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Jens Axboe

On 2/23/2017 3:36 PM, Logan Gunthorpe wrote:
> We're still waiting on any kind of response from Bjorn. (If you're
> listening please say something!)
> 
> Does anyone have any suggestions for dealing with an unresponsive
> maintainer? Or a way for us to move forward with this quickly and get it
> merged?

You'll want to be careful during the merge window (these days) as the
maintainer is usually busy with code delivery. You can't rush your code in at
the last minute.

[GIT PULL] PCI changes for v4.11]

Here is a list of all the patches waiting to be reviewed. 

https://patchwork.ozlabs.org/project/linux-pci/list/

You'll just have to wait your turn.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:36   ` Logan Gunthorpe
  2017-02-23 20:51     ` Sinan Kaya
@ 2017-02-23 20:52     ` Jens Axboe
  2017-02-23 20:55     ` Greg Kroah-Hartman
  2017-02-23 22:14     ` Bjorn Helgaas
  3 siblings, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2017-02-23 20:52 UTC (permalink / raw)
  To: Logan Gunthorpe, Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On 02/23/2017 01:36 PM, Logan Gunthorpe wrote:
> Hello,
> 
> We're still waiting on any kind of response from Bjorn. (If you're
> listening please say something!)
> 
> Does anyone have any suggestions for dealing with an unresponsive
> maintainer? Or a way for us to move forward with this quickly and get it
> merged?
> 
> ie. Can anyone else pick this up through another route? In the end, it's
> just a fairly basic driver and doesn't touch any core PCI functionality
> and we've had a fair amount of review from other kernel contributors,
> all of which we've addressed.

I'll add that we're already using this driver internally, would be
great to get it into mainline.

What is holding this up?

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:51     ` Sinan Kaya
@ 2017-02-23 20:54       ` Jens Axboe
  2017-02-23 20:56       ` Logan Gunthorpe
  1 sibling, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2017-02-23 20:54 UTC (permalink / raw)
  To: Sinan Kaya, Logan Gunthorpe, Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel

On 02/23/2017 01:51 PM, Sinan Kaya wrote:
> On 2/23/2017 3:36 PM, Logan Gunthorpe wrote:
>> We're still waiting on any kind of response from Bjorn. (If you're
>> listening please say something!)
>>
>> Does anyone have any suggestions for dealing with an unresponsive
>> maintainer? Or a way for us to move forward with this quickly and get it
>> merged?
> 
> You'll want to be careful during the merge window (these days) as the
> maintainer is usually busy with code delivery. You can't rush your code in at
> the last minute.
> 
> [GIT PULL] PCI changes for v4.11]
> 
> Here is a list of all the patches waiting to be reviewed. 
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_project_linux-2Dpci_list_&d=DwIC-g&c=5VD0RTtNlTh3ycd41b3MUw&r=cK1a7KivzZRh1fKQMjSm2A&m=1BfGDtZjAoX-qj9haoqNB7onGlV0IrYOSvLRdecBk88&s=LXm-NTK2rl_GWfVm9dlVB66fhl1ckx-Ar-2z80daybc&e= 
> 
> You'll just have to wait your turn.

The code was posted 2/2, and a ping was sent 2/17 and now today. I don't
think it's unreasonable to expect a response within a month. It's not
a major new feature in the core of PCI, it's a basic driver.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:36   ` Logan Gunthorpe
  2017-02-23 20:51     ` Sinan Kaya
  2017-02-23 20:52     ` Jens Axboe
@ 2017-02-23 20:55     ` Greg Kroah-Hartman
  2017-02-23 22:14     ` Bjorn Helgaas
  3 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-23 20:55 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Bjorn Helgaas, Keith Busch, Myron Stowe, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Jens Axboe

On Thu, Feb 23, 2017 at 01:36:51PM -0700, Logan Gunthorpe wrote:
> Hello,
> 
> We're still waiting on any kind of response from Bjorn. (If you're
> listening please say something!)
> 
> Does anyone have any suggestions for dealing with an unresponsive
> maintainer? Or a way for us to move forward with this quickly and get it
> merged?

It's the middle of the merge window, nothing new can go in right now.
Wait until after 4.11-rc1 comes out before expecting for maintainers to
be able to start reviewing and discussing new things like this.

patience please,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:51     ` Sinan Kaya
  2017-02-23 20:54       ` Jens Axboe
@ 2017-02-23 20:56       ` Logan Gunthorpe
  1 sibling, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-23 20:56 UTC (permalink / raw)
  To: Sinan Kaya, Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Geert Uytterhoeven,
	Jonathan Corbet, David S. Miller, Andrew Morton, Emil Velikov,
	Mauro Carvalho Chehab, Guenter Roeck, Jarkko Sakkinen,
	Linus Walleij, Ryusuke Konishi, Stefan Berger, Wei Zhang,
	Kurt Schwemmer, Stephen Bates, linux-pci, linux-doc, linux-nvme,
	linux-kernel, Jens Axboe



On 23/02/17 01:51 PM, Sinan Kaya wrote:
> You'll want to be careful during the merge window (these days) as the
> maintainer is usually busy with code delivery. You can't rush your code in at
> the last minute.

Thanks for the advice, we will continue to wait.

However, I would say we've been very patient. It's been three weeks
since we posted the latest revision, a month since the first version and
almost 3 months since our RFC. I don't think it's too much to expect at
least a response saying that it's in the works or something. That long
with dead silence from the maintainer is a bit much.

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 20:36   ` Logan Gunthorpe
                       ` (2 preceding siblings ...)
  2017-02-23 20:55     ` Greg Kroah-Hartman
@ 2017-02-23 22:14     ` Bjorn Helgaas
  2017-02-23 22:19       ` Logan Gunthorpe
  3 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2017-02-23 22:14 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Bjorn Helgaas, Keith Busch, Myron Stowe, Greg Kroah-Hartman,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel, Jens Axboe

On Thu, Feb 23, 2017 at 01:36:51PM -0700, Logan Gunthorpe wrote:
> Hello,
> 
> We're still waiting on any kind of response from Bjorn. (If you're
> listening please say something!)
> 
> Does anyone have any suggestions for dealing with an unresponsive
> maintainer? Or a way for us to move forward with this quickly and get it
> merged?

I try to deal with regressions first and other bug fixes second.
After that, I look at things that add new functionality.  I try to
look at the new stuff in roughly chronological order, as you would see
here:

https://patchwork.ozlabs.org/project/linux-pci/list/?order=date&page=1

If other folks have feedback, as they did on your 12/17, 1/31, and
even the 2/2 posting, I generally let that get sorted out before I
look at it.  I apologize that I haven't responded to your queries
about posting a v3 vs updating v2.

To answer that question, it's much simpler for me to deal with a
fresh, clean new series than it is to tweak things in an
already-posted series, partly because a series with discussion other
than simple acks and reviewed-bys looks more like work-in-progress.

Bjorn

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/4] New Microsemi PCI Switch Management Driver
  2017-02-23 22:14     ` Bjorn Helgaas
@ 2017-02-23 22:19       ` Logan Gunthorpe
  0 siblings, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-23 22:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Keith Busch, Myron Stowe, Greg Kroah-Hartman,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel, Jens Axboe

Thanks Bjorn! I understand your busy and we appreciate your time in this
matter.

I'll prepare a v3 with a collected set of tags shortly. We're more than
happy to clean this up to make your job as easy as possible. We were
just looking for direction in how to move this forward.

Logan

On 23/02/17 03:14 PM, Bjorn Helgaas wrote:
> On Thu, Feb 23, 2017 at 01:36:51PM -0700, Logan Gunthorpe wrote:
>> Hello,
>>
>> We're still waiting on any kind of response from Bjorn. (If you're
>> listening please say something!)
>>
>> Does anyone have any suggestions for dealing with an unresponsive
>> maintainer? Or a way for us to move forward with this quickly and get it
>> merged?
> 
> I try to deal with regressions first and other bug fixes second.
> After that, I look at things that add new functionality.  I try to
> look at the new stuff in roughly chronological order, as you would see
> here:
> 
> https://patchwork.ozlabs.org/project/linux-pci/list/?order=date&page=1
> 
> If other folks have feedback, as they did on your 12/17, 1/31, and
> even the 2/2 posting, I generally let that get sorted out before I
> look at it.  I apologize that I haven't responded to your queries
> about posting a v3 vs updating v2.
> 
> To answer that question, it's much simpler for me to deal with a
> fresh, clean new series than it is to tweak things in an
> already-posted series, partly because a series with discussion other
> than simple acks and reviewed-bys looks more like work-in-progress.
> 
> Bjorn
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver
  2017-02-02 18:06 ` [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver Logan Gunthorpe
  2017-02-10 14:54   ` Greg Kroah-Hartman
@ 2017-02-23 22:43   ` Bjorn Helgaas
  2017-02-23 22:56     ` Logan Gunthorpe
  1 sibling, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2017-02-23 22:43 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel

On Thu, Feb 02, 2017 at 11:06:02AM -0700, Logan Gunthorpe wrote:
> This patch adds a few read-only sysfs attributes which provide
> some device information that is exposed from the devices. Primarily
> component and device names and versions. These are documented in
> Documentation/ABI/testing/sysfs-class-switchtec.
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
> ---
>  Documentation/ABI/testing/sysfs-class-switchtec |  96 ++++++++++++++++++++
>  MAINTAINERS                                     |   1 +
>  drivers/pci/switch/switchtec.c                  | 113 ++++++++++++++++++++++++
>  3 files changed, 210 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-switchtec
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-switchtec b/Documentation/ABI/testing/sysfs-class-switchtec
> new file mode 100644
> index 0000000..48cb4c1
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-switchtec
> @@ -0,0 +1,96 @@
> +switchtec - Microsemi Switchtec PCI Switch Management Endpoint
> +
> +For details on this subsystem look at Documentation/switchtec.txt.
> +
> +What: 		/sys/class/switchtec

This path seems a little generic.  I don't see other cases where a
product brand name ("Switchtec") appears at the top level of
/sys/class/...

My question is based on "ls Documentation/ABI/testing/sysfs-class*",
not on any great knowledge of sysfs, and I see Greg has already given
a Reviewed-by for this, so maybe this is the right approach.

It does seem like the path could include a clue that this is related
to PCI.

Is there a link to the switch PCI device itself, e.g., to
/sys/devices/pci*?  Should these attributes simply be put in a
subdirectory there, e.g., in

  /sys/devices/pci0000:00/0000:00:00.0/stats/...

instead of in /sys/class/?  It seems like it'd be useful for userspace
to be able to connect this info with the switch somehow.

> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	The switchtec class subsystem folder.
> +		Each registered switchtec driver is represented by a switchtecX
> +		subfolder (X being an integer >= 0).
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/component_id
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Component identifier as stored in the hardware (eg. PM8543)
> +		(read only)
> +Values: 	arbitrary string.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/component_revision
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Component revision stored in the hardware (read only)
> +Values: 	integer.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/component_vendor
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Component vendor as stored in the hardware (eg. MICROSEM)
> +		(read only)
> +Values: 	arbitrary string.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/device_version
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Device version as stored in the hardware (read only)
> +Values: 	integer.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/fw_version
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Currently running firmware version (read only)
> +Values: 	integer (in hexadecimal).
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/partition
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Partition number for this device in the switch (read only)
> +Values: 	integer.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/partition_count
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Total number of partitions in the switch (read only)
> +Values: 	integer.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/product_id
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Product identifier as stored in the hardware (eg. PSX 48XG3)
> +		(read only)
> +Values: 	arbitrary string.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/product_revision
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Product revision stored in the hardware (eg. RevB)
> +		(read only)
> +Values: 	arbitrary string.
> +
> +
> +What:		/sys/class/switchtec/switchtec[0-9]+/product_vendor
> +Date:		05-Jan-2017
> +KernelVersion:	v4.11
> +Contact:	Logan Gunthorpe <logang@deltatee.com>
> +Description:	Product vendor as stored in the hardware (eg. MICROSEM)
> +		(read only)
> +Values: 	arbitrary string.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0ab858d..d39b7fd 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9514,6 +9514,7 @@ M:	Logan Gunthorpe <logang@deltatee.com>
>  L:	linux-pci@vger.kernel.org
>  S:	Maintained
>  F:	Documentation/switchtec.txt
> +F:	Documentation/ABI/testing/sysfs-class-switchtec
>  F:	drivers/pci/switch/switchtec*
>  
>  PCI DRIVER FOR NVIDIA TEGRA
> diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
> index e4a4294..354ddfd 100644
> --- a/drivers/pci/switch/switchtec.c
> +++ b/drivers/pci/switch/switchtec.c
> @@ -485,6 +485,118 @@ static void mrpc_timeout_work(struct work_struct *work)
>  	mutex_unlock(&stdev->mrpc_mutex);
>  }
>  
> +static ssize_t device_version_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +	u32 ver;
> +
> +	ver = ioread32(&stdev->mmio_sys_info->device_version);
> +
> +	return sprintf(buf, "%x\n", ver);
> +}
> +static DEVICE_ATTR_RO(device_version);
> +
> +static ssize_t fw_version_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +	u32 ver;
> +
> +	ver = ioread32(&stdev->mmio_sys_info->firmware_version);
> +
> +	return sprintf(buf, "%08x\n", ver);
> +}
> +static DEVICE_ATTR_RO(fw_version);
> +
> +static ssize_t io_string_show(char *buf, void __iomem *attr, size_t len)
> +{
> +	int i;
> +
> +	memcpy_fromio(buf, attr, len);
> +	buf[len] = '\n';
> +	buf[len + 1] = 0;
> +
> +	for (i = len - 1; i > 0; i--) {
> +		if (buf[i] != ' ')
> +			break;
> +		buf[i] = '\n';
> +		buf[i + 1] = 0;
> +	}
> +
> +	return strlen(buf);
> +}
> +
> +#define DEVICE_ATTR_SYS_INFO_STR(field) \
> +static ssize_t field ## _show(struct device *dev, \
> +	struct device_attribute *attr, char *buf) \
> +{ \
> +	struct switchtec_dev *stdev = to_stdev(dev); \
> +	return io_string_show(buf, &stdev->mmio_sys_info->field, \
> +			    sizeof(stdev->mmio_sys_info->field)); \
> +} \
> +\
> +static DEVICE_ATTR_RO(field)
> +
> +DEVICE_ATTR_SYS_INFO_STR(vendor_id);
> +DEVICE_ATTR_SYS_INFO_STR(product_id);
> +DEVICE_ATTR_SYS_INFO_STR(product_revision);
> +DEVICE_ATTR_SYS_INFO_STR(component_vendor);
> +
> +static ssize_t component_id_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +	int id = ioread16(&stdev->mmio_sys_info->component_id);
> +
> +	return sprintf(buf, "PM%04X\n", id);
> +}
> +static DEVICE_ATTR_RO(component_id);
> +
> +static ssize_t component_revision_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +	int rev = ioread8(&stdev->mmio_sys_info->component_revision);
> +
> +	return sprintf(buf, "%d\n", rev);
> +}
> +static DEVICE_ATTR_RO(component_revision);
> +
> +static ssize_t partition_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +
> +	return sprintf(buf, "%d\n", stdev->partition);
> +}
> +static DEVICE_ATTR_RO(partition);
> +
> +static ssize_t partition_count_show(struct device *dev,
> +	struct device_attribute *attr, char *buf)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +
> +	return sprintf(buf, "%d\n", stdev->partition_count);
> +}
> +static DEVICE_ATTR_RO(partition_count);
> +
> +static struct attribute *switchtec_device_attrs[] = {
> +	&dev_attr_device_version.attr,
> +	&dev_attr_fw_version.attr,
> +	&dev_attr_vendor_id.attr,
> +	&dev_attr_product_id.attr,
> +	&dev_attr_product_revision.attr,
> +	&dev_attr_component_vendor.attr,
> +	&dev_attr_component_id.attr,
> +	&dev_attr_component_revision.attr,
> +	&dev_attr_partition.attr,
> +	&dev_attr_partition_count.attr,
> +	NULL,
> +};
> +
> +ATTRIBUTE_GROUPS(switchtec_device);
> +
>  static int switchtec_dev_open(struct inode *inode, struct file *filp)
>  {
>  	struct switchtec_dev *stdev;
> @@ -699,6 +811,7 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
>  	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
>  	dev->class = switchtec_class;
>  	dev->parent = &pdev->dev;
> +	dev->groups = switchtec_device_groups;
>  	dev->release = stdev_release;
>  	dev_set_name(dev, "switchtec%d", minor);
>  
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver
  2017-02-23 22:43   ` Bjorn Helgaas
@ 2017-02-23 22:56     ` Logan Gunthorpe
  2017-02-24 22:45       ` Bjorn Helgaas
  0 siblings, 1 reply; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-23 22:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel



On 23/02/17 03:43 PM, Bjorn Helgaas wrote:
> This path seems a little generic.  I don't see other cases where a
> product brand name ("Switchtec") appears at the top level of
> /sys/class/...

Ok, well we are certainly open to suggestions, but there isn't really a
generic version of this device available so I'm not sure how we would
change that. Per device-type classes aren't that uncommon though, a
quick grep shows things like:

platform/chrome/cros_ec_dev.c:40:static struct class cros_class
s390/char/raw3270.h:94:extern struct class *class3270;
net/ethernet/hisilicon/hns/hnae.c:19:static struct class *hnae_class;
mfd/ucb1x00-core.c:490:static struct class ucb1x00_class

> My question is based on "ls Documentation/ABI/testing/sysfs-class*",
> not on any great knowledge of sysfs, and I see Greg has already given
> a Reviewed-by for this, so maybe this is the right approach.
> 
> It does seem like the path could include a clue that this is related
> to PCI.

I mean, we could change it to pci-switchtec or something if you think
that would be better..?? But I'm not sure how else to accommodate this.

> Is there a link to the switch PCI device itself, e.g., to
> /sys/devices/pci*?  Should these attributes simply be put in a
> subdirectory there, e.g., in
> 
>   /sys/devices/pci0000:00/0000:00:00.0/stats/...

Well our device shows up here in the tree:

/sys/devices/pci0000:00/0000:00:03.0/0000:03:00.1/switchtec/switchtec0

(Which userspace can get to by following the link at
/sys/class/switchtec/switchtec0) The switch is then always:

/sys/devices/pci0000:00/0000:00:03.0

Thanks,

Logan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
  2017-02-10 14:51   ` Greg Kroah-Hartman
  2017-02-10 14:54   ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Greg Kroah-Hartman
@ 2017-02-24  0:35   ` Bjorn Helgaas
  2017-02-24 18:32     ` Logan Gunthorpe
  2 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2017-02-24  0:35 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Christoph Hellwig

[+cc Peter, Ingo, Arnaldo, Alexander, Christoph]

On Thu, Feb 02, 2017 at 11:06:00AM -0700, Logan Gunthorpe wrote:
> Microsemi's "Switchtec" line of PCI switch devices is already well
> supported by the kernel with standard PCI switch drivers. However, the
> Switchtec device advertises a special management endpoint with a separate
> PCI function address and class code. This endpoint enables some additional
> functionality which includes:
> 
>  * Packet and Byte Counters
>  * Switch Firmware Upgrades
>  * Event and Error logs
>  * Querying port link status
>  * Custom user firmware commands

Would it make any sense to integrate this with the perf tool?  It
looks like switchtec-user measures bandwidth and latency, which sounds
sort of perf-ish.

> This patch introduces the switchtec kernel module which provides
> PCI driver that exposes a char device. The char device provides
> userspace access to this interface through read, write and (optionally)
> poll calls.
> 
> A userspace tool and library which utilizes this interface is available
> at [1]. This tool takes inspiration (and borrows some code) from
> nvme-cli [2]. The tool is largely complete at this time but additional
> features may be added in the future.
> 
> [1] https://github.com/sbates130272/switchtec-user
> [2] https://github.com/linux-nvme/nvme-cli
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
> ---
>  MAINTAINERS                    |    8 +
>  drivers/pci/Kconfig            |    1 +
>  drivers/pci/Makefile           |    1 +
>  drivers/pci/switch/Kconfig     |   13 +
>  drivers/pci/switch/Makefile    |    1 +
>  drivers/pci/switch/switchtec.c | 1028 ++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 1052 insertions(+)
>  create mode 100644 drivers/pci/switch/Kconfig
>  create mode 100644 drivers/pci/switch/Makefile
>  create mode 100644 drivers/pci/switch/switchtec.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5f10c28..9c35198 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9507,6 +9507,14 @@ S:	Maintained
>  F:	Documentation/devicetree/bindings/pci/aardvark-pci.txt
>  F:	drivers/pci/host/pci-aardvark.c
>  
> +PCI DRIVER FOR MICROSEMI SWITCHTEC
> +M:	Kurt Schwemmer <kurt.schwemmer@microsemi.com>
> +M:	Stephen Bates <stephen.bates@microsemi.com>
> +M:	Logan Gunthorpe <logang@deltatee.com>
> +L:	linux-pci@vger.kernel.org
> +S:	Maintained
> +F:	drivers/pci/switch/switchtec*
> +
>  PCI DRIVER FOR NVIDIA TEGRA
>  M:	Thierry Reding <thierry.reding@gmail.com>
>  L:	linux-tegra@vger.kernel.org
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 6555eb7..f72e8c5 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -133,3 +133,4 @@ config PCI_HYPERV
>  
>  source "drivers/pci/hotplug/Kconfig"
>  source "drivers/pci/host/Kconfig"
> +source "drivers/pci/switch/Kconfig"
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index 8db5079..15b46dd 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -68,3 +68,4 @@ ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
>  
>  # PCI host controller drivers
>  obj-y += host/
> +obj-y += switch/
> diff --git a/drivers/pci/switch/Kconfig b/drivers/pci/switch/Kconfig
> new file mode 100644
> index 0000000..4c49648
> --- /dev/null
> +++ b/drivers/pci/switch/Kconfig
> @@ -0,0 +1,13 @@
> +menu "PCI switch controller drivers"
> +	depends on PCI
> +
> +config PCI_SW_SWITCHTEC
> +	tristate "MicroSemi Switchtec PCIe Switch Management Driver"
> +	help
> +	 Enables support for the management interface for the MicroSemi
> +	 Switchtec series of PCIe switches. Supports userspace access
> +	 to submit MRPC commands to the switch via /dev/switchtecX
> +	 devices. See <file:Documentation/switchtec.txt> for more
> +	 information.
> +
> +endmenu
> diff --git a/drivers/pci/switch/Makefile b/drivers/pci/switch/Makefile
> new file mode 100644
> index 0000000..37d8cfb
> --- /dev/null
> +++ b/drivers/pci/switch/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_PCI_SW_SWITCHTEC) += switchtec.o
> diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
> new file mode 100644
> index 0000000..e4a4294
> --- /dev/null
> +++ b/drivers/pci/switch/switchtec.c
> @@ -0,0 +1,1028 @@
> +/*
> + * Microsemi Switchtec(tm) PCIe Management Driver
> + * Copyright (c) 2017, Microsemi Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/fs.h>
> +#include <linux/uaccess.h>
> +#include <linux/poll.h>
> +#include <linux/pci.h>
> +#include <linux/cdev.h>
> +#include <linux/wait.h>
> +
> +MODULE_DESCRIPTION("Microsemi Switchtec(tm) PCI-E Management Driver");

Nit: s/PCI-E/PCIe/

> +MODULE_VERSION("0.1");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Microsemi Corporation");
> +
> +static int max_devices = 16;

This static initialization is slightly misleading because
switchtec_init() immediately sets max_devices to at least 256.

> +module_param(max_devices, int, 0644);
> +MODULE_PARM_DESC(max_devices, "max number of switchtec device instances");
> +
> +static dev_t switchtec_devt;
> +static struct class *switchtec_class;
> +static DEFINE_IDA(switchtec_minor_ida);
> +
> +#define MICROSEMI_VENDOR_ID         0x11f8
> +#define MICROSEMI_NTB_CLASSCODE     0x068000
> +#define MICROSEMI_MGMT_CLASSCODE    0x058000
> +
> +#define SWITCHTEC_MRPC_PAYLOAD_SIZE 1024
> +#define SWITCHTEC_MAX_PFF_CSR 48
> +
> +#define SWITCHTEC_EVENT_OCCURRED BIT(0)
> +#define SWITCHTEC_EVENT_CLEAR    BIT(0)
> +#define SWITCHTEC_EVENT_EN_LOG   BIT(1)
> +#define SWITCHTEC_EVENT_EN_CLI   BIT(2)
> +#define SWITCHTEC_EVENT_EN_IRQ   BIT(3)
> +#define SWITCHTEC_EVENT_FATAL    BIT(4)
> +
> +enum {
> +	SWITCHTEC_GAS_MRPC_OFFSET       = 0x0000,
> +	SWITCHTEC_GAS_TOP_CFG_OFFSET    = 0x1000,
> +	SWITCHTEC_GAS_SW_EVENT_OFFSET   = 0x1800,
> +	SWITCHTEC_GAS_SYS_INFO_OFFSET   = 0x2000,
> +	SWITCHTEC_GAS_FLASH_INFO_OFFSET = 0x2200,
> +	SWITCHTEC_GAS_PART_CFG_OFFSET   = 0x4000,
> +	SWITCHTEC_GAS_NTB_OFFSET        = 0x10000,
> +	SWITCHTEC_GAS_PFF_CSR_OFFSET    = 0x134000,
> +};
> +
> +struct mrpc_regs {
> +	u8 input_data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
> +	u8 output_data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
> +	u32 cmd;
> +	u32 status;
> +	u32 ret_value;
> +} __packed;
> +
> +enum mrpc_status {
> +	SWITCHTEC_MRPC_STATUS_INPROGRESS = 1,
> +	SWITCHTEC_MRPC_STATUS_DONE = 2,
> +	SWITCHTEC_MRPC_STATUS_ERROR = 0xFF,
> +	SWITCHTEC_MRPC_STATUS_INTERRUPTED = 0x100,
> +};
> +
> +struct sw_event_regs {
> +	u64 event_report_ctrl;
> +	u64 reserved1;
> +	u64 part_event_bitmap;
> +	u64 reserved2;
> +	u32 global_summary;
> +	u32 reserved3[3];
> +	u32 stack_error_event_hdr;
> +	u32 stack_error_event_data;
> +	u32 reserved4[4];
> +	u32 ppu_error_event_hdr;
> +	u32 ppu_error_event_data;
> +	u32 reserved5[4];
> +	u32 isp_error_event_hdr;
> +	u32 isp_error_event_data;
> +	u32 reserved6[4];
> +	u32 sys_reset_event_hdr;
> +	u32 reserved7[5];
> +	u32 fw_exception_hdr;
> +	u32 reserved8[5];
> +	u32 fw_nmi_hdr;
> +	u32 reserved9[5];
> +	u32 fw_non_fatal_hdr;
> +	u32 reserved10[5];
> +	u32 fw_fatal_hdr;
> +	u32 reserved11[5];
> +	u32 twi_mrpc_comp_hdr;
> +	u32 twi_mrpc_comp_data;
> +	u32 reserved12[4];
> +	u32 twi_mrpc_comp_async_hdr;
> +	u32 twi_mrpc_comp_async_data;
> +	u32 reserved13[4];
> +	u32 cli_mrpc_comp_hdr;
> +	u32 cli_mrpc_comp_data;
> +	u32 reserved14[4];
> +	u32 cli_mrpc_comp_async_hdr;
> +	u32 cli_mrpc_comp_async_data;
> +	u32 reserved15[4];
> +	u32 gpio_interrupt_hdr;
> +	u32 gpio_interrupt_data;
> +	u32 reserved16[4];
> +} __packed;
> +
> +struct sys_info_regs {
> +	u32 device_id;
> +	u32 device_version;
> +	u32 firmware_version;
> +	u32 reserved1;
> +	u32 vendor_table_revision;
> +	u32 table_format_version;
> +	u32 partition_id;
> +	u32 cfg_file_fmt_version;
> +	u32 reserved2[58];
> +	char vendor_id[8];
> +	char product_id[16];
> +	char product_revision[4];
> +	char component_vendor[8];
> +	u16 component_id;
> +	u8 component_revision;
> +} __packed;
> +
> +struct flash_info_regs {
> +	u32 flash_part_map_upd_idx;
> +
> +	struct active_partition_info {
> +		u32 address;
> +		u32 build_version;
> +		u32 build_string;
> +	} active_img;
> +
> +	struct active_partition_info active_cfg;
> +	struct active_partition_info inactive_img;
> +	struct active_partition_info inactive_cfg;
> +
> +	u32 flash_length;
> +
> +	struct partition_info {
> +		u32 address;
> +		u32 length;
> +	} cfg0;
> +
> +	struct partition_info cfg1;
> +	struct partition_info img0;
> +	struct partition_info img1;
> +	struct partition_info nvlog;
> +	struct partition_info vendor[8];
> +};
> +
> +struct ntb_info_regs {
> +	u8  partition_count;
> +	u8  partition_id;
> +	u16 reserved1;
> +	u64 ep_map;
> +	u16 requester_id;
> +} __packed;
> +
> +struct part_cfg_regs {
> +	u32 status;
> +	u32 state;
> +	u32 port_cnt;
> +	u32 usp_port_mode;
> +	u32 usp_pff_inst_id;
> +	u32 vep_pff_inst_id;
> +	u32 dsp_pff_inst_id[47];
> +	u32 reserved1[11];
> +	u16 vep_vector_number;
> +	u16 usp_vector_number;
> +	u32 port_event_bitmap;
> +	u32 reserved2[3];
> +	u32 part_event_summary;
> +	u32 reserved3[3];
> +	u32 part_reset_hdr;
> +	u32 part_reset_data[5];
> +	u32 mrpc_comp_hdr;
> +	u32 mrpc_comp_data[5];
> +	u32 mrpc_comp_async_hdr;
> +	u32 mrpc_comp_async_data[5];
> +	u32 dyn_binding_hdr;
> +	u32 dyn_binding_data[5];
> +	u32 reserved4[159];
> +} __packed;
> +
> +enum {
> +	SWITCHTEC_PART_CFG_EVENT_RESET = 1 << 0,
> +	SWITCHTEC_PART_CFG_EVENT_MRPC_CMP = 1 << 1,
> +	SWITCHTEC_PART_CFG_EVENT_MRPC_ASYNC_CMP = 1 << 2,
> +	SWITCHTEC_PART_CFG_EVENT_DYN_PART_CMP = 1 << 3,
> +};
> +
> +struct pff_csr_regs {
> +	u16 vendor_id;
> +	u16 device_id;
> +	u32 pci_cfg_header[15];
> +	u32 pci_cap_region[48];
> +	u32 pcie_cap_region[448];
> +	u32 indirect_gas_window[128];
> +	u32 indirect_gas_window_off;
> +	u32 reserved[127];
> +	u32 pff_event_summary;
> +	u32 reserved2[3];
> +	u32 aer_in_p2p_hdr;
> +	u32 aer_in_p2p_data[5];
> +	u32 aer_in_vep_hdr;
> +	u32 aer_in_vep_data[5];
> +	u32 dpc_hdr;
> +	u32 dpc_data[5];
> +	u32 cts_hdr;
> +	u32 cts_data[5];
> +	u32 reserved3[6];
> +	u32 hotplug_hdr;
> +	u32 hotplug_data[5];
> +	u32 ier_hdr;
> +	u32 ier_data[5];
> +	u32 threshold_hdr;
> +	u32 threshold_data[5];
> +	u32 power_mgmt_hdr;
> +	u32 power_mgmt_data[5];
> +	u32 tlp_throttling_hdr;
> +	u32 tlp_throttling_data[5];
> +	u32 force_speed_hdr;
> +	u32 force_speed_data[5];
> +	u32 credit_timeout_hdr;
> +	u32 credit_timeout_data[5];
> +	u32 link_state_hdr;
> +	u32 link_state_data[5];
> +	u32 reserved4[174];
> +} __packed;
> +
> +struct switchtec_dev {
> +	struct pci_dev *pdev;
> +	struct msix_entry msix[4];
> +	struct device dev;
> +	struct cdev cdev;
> +	unsigned int event_irq;
> +
> +	int partition;
> +	int partition_count;
> +	int pff_csr_count;
> +	char pff_local[SWITCHTEC_MAX_PFF_CSR];
> +
> +	void __iomem *mmio;
> +	struct mrpc_regs __iomem *mmio_mrpc;
> +	struct sw_event_regs __iomem *mmio_sw_event;
> +	struct sys_info_regs __iomem *mmio_sys_info;
> +	struct flash_info_regs __iomem *mmio_flash_info;
> +	struct ntb_info_regs __iomem *mmio_ntb;
> +	struct part_cfg_regs __iomem *mmio_part_cfg;
> +	struct part_cfg_regs __iomem *mmio_part_cfg_all;
> +	struct pff_csr_regs __iomem *mmio_pff_csr;
> +
> +	/*
> +	 * The mrpc mutex must be held when accessing the other
> +	 * mrpc fields
> +	 */
> +	struct mutex mrpc_mutex;
> +	struct list_head mrpc_queue;
> +	int mrpc_busy;
> +	struct work_struct mrpc_work;
> +	struct delayed_work mrpc_timeout;
> +	wait_queue_head_t event_wq;
> +	atomic_t event_cnt;

Why is this atomic_t?  You only do an atomic_set() (in stdev_create())
and atomic_reads() -- no writes other than the initial one.  So I'm
not sure what value atomic_t brings.

> +};
> +
> +static struct switchtec_dev *to_stdev(struct device *dev)
> +{
> +	return container_of(dev, struct switchtec_dev, dev);
> +}
> +
> +struct switchtec_user {
> +	struct switchtec_dev *stdev;
> +
> +	enum mrpc_state {
> +		MRPC_IDLE = 0,
> +		MRPC_QUEUED,
> +		MRPC_RUNNING,
> +		MRPC_DONE,
> +	} state;
> +
> +	struct completion comp;
> +	struct kref kref;
> +	struct list_head list;
> +
> +	u32 cmd;
> +	u32 status;
> +	u32 return_code;
> +	size_t data_len;
> +	unsigned char data[SWITCHTEC_MRPC_PAYLOAD_SIZE];
> +	int event_cnt;
> +};
> +
> +static struct switchtec_user *stuser_create(struct switchtec_dev *stdev)
> +{
> +	struct switchtec_user *stuser;
> +
> +	stuser = kzalloc(sizeof(*stuser), GFP_KERNEL);
> +	if (!stuser)
> +		return ERR_PTR(-ENOMEM);
> +
> +	get_device(&stdev->dev);
> +	stuser->stdev = stdev;
> +	kref_init(&stuser->kref);
> +	INIT_LIST_HEAD(&stuser->list);
> +	init_completion(&stuser->comp);
> +	stuser->event_cnt = atomic_read(&stdev->event_cnt);
> +
> +	dev_dbg(&stdev->dev, "%s: %p\n", __func__, stuser);
> +
> +	return stuser;
> +}
> +
> +static void stuser_free(struct kref *kref)
> +{
> +	struct switchtec_user *stuser;
> +
> +	stuser = container_of(kref, struct switchtec_user, kref);
> +
> +	dev_dbg(&stuser->stdev->dev, "%s: %p\n", __func__, stuser);
> +
> +	put_device(&stuser->stdev->dev);
> +	kfree(stuser);
> +}
> +
> +static void stuser_put(struct switchtec_user *stuser)
> +{
> +	kref_put(&stuser->kref, stuser_free);
> +}
> +
> +static void stuser_set_state(struct switchtec_user *stuser,
> +			     enum mrpc_state state)
> +{
> +	const char * const state_names[] = {
> +		[MRPC_IDLE] = "IDLE",
> +		[MRPC_QUEUED] = "QUEUED",
> +		[MRPC_RUNNING] = "RUNNING",
> +		[MRPC_DONE] = "DONE",
> +	};
> +
> +	stuser->state = state;
> +
> +	dev_dbg(&stuser->stdev->dev, "stuser state %p -> %s",
> +		stuser, state_names[state]);
> +}
> +
> +static int stdev_is_alive(struct switchtec_dev *stdev)
> +{
> +	return stdev->mmio != NULL;
> +}
> +
> +static void mrpc_complete_cmd(struct switchtec_dev *stdev);
> +
> +static void mrpc_cmd_submit(struct switchtec_dev *stdev)
> +{
> +	/* requires the mrpc_mutex to already be held when called */
> +
> +	struct switchtec_user *stuser;
> +
> +	if (stdev->mrpc_busy)
> +		return;
> +
> +	if (list_empty(&stdev->mrpc_queue))
> +		return;
> +
> +	stuser = list_entry(stdev->mrpc_queue.next, struct switchtec_user,
> +			    list);
> +
> +	stuser_set_state(stuser, MRPC_RUNNING);
> +	stdev->mrpc_busy = 1;
> +	memcpy_toio(&stdev->mmio_mrpc->input_data,
> +		    stuser->data, stuser->data_len);
> +	iowrite32(stuser->cmd, &stdev->mmio_mrpc->cmd);
> +
> +	stuser->status = ioread32(&stdev->mmio_mrpc->status);
> +	if (stuser->status != SWITCHTEC_MRPC_STATUS_INPROGRESS)
> +		mrpc_complete_cmd(stdev);
> +
> +	schedule_delayed_work(&stdev->mrpc_timeout,
> +			      msecs_to_jiffies(500));
> +}
> +
> +static void mrpc_queue_cmd(struct switchtec_user *stuser)
> +{
> +	/* requires the mrpc_mutex to already be held when called */
> +
> +	struct switchtec_dev *stdev = stuser->stdev;
> +
> +	kref_get(&stuser->kref);
> +	stuser_set_state(stuser, MRPC_QUEUED);
> +	init_completion(&stuser->comp);
> +	list_add_tail(&stuser->list, &stdev->mrpc_queue);
> +
> +	mrpc_cmd_submit(stdev);
> +}
> +
> +static void mrpc_complete_cmd(struct switchtec_dev *stdev)
> +{
> +	/* requires the mrpc_mutex to already be held when called */
> +	struct switchtec_user *stuser;
> +
> +	if (list_empty(&stdev->mrpc_queue))
> +		return;
> +
> +	stuser = list_entry(stdev->mrpc_queue.next, struct switchtec_user,
> +			    list);
> +
> +	stuser->status = ioread32(&stdev->mmio_mrpc->status);
> +	if (stuser->status == SWITCHTEC_MRPC_STATUS_INPROGRESS)
> +		return;
> +
> +	stuser_set_state(stuser, MRPC_DONE);
> +	stuser->return_code = 0;
> +
> +	if (stuser->status != SWITCHTEC_MRPC_STATUS_DONE)
> +		goto out;
> +
> +	stuser->return_code = ioread32(&stdev->mmio_mrpc->ret_value);
> +	if (stuser->return_code != 0)
> +		goto out;
> +
> +	memcpy_fromio(stuser->data, &stdev->mmio_mrpc->output_data,
> +		      sizeof(stuser->data));

Apparently you always get 1K of data back, no matter what?

> +
> +out:
> +	complete_all(&stuser->comp);
> +	list_del_init(&stuser->list);
> +	stuser_put(stuser);
> +	stdev->mrpc_busy = 0;
> +
> +	mrpc_cmd_submit(stdev);
> +}
> +
> +static void mrpc_event_work(struct work_struct *work)
> +{
> +	struct switchtec_dev *stdev;
> +
> +	stdev = container_of(work, struct switchtec_dev, mrpc_work);
> +
> +	dev_dbg(&stdev->dev, "%s\n", __func__);
> +
> +	mutex_lock(&stdev->mrpc_mutex);
> +	cancel_delayed_work(&stdev->mrpc_timeout);
> +	mrpc_complete_cmd(stdev);
> +	mutex_unlock(&stdev->mrpc_mutex);
> +}
> +
> +static void mrpc_timeout_work(struct work_struct *work)
> +{
> +	struct switchtec_dev *stdev;
> +	u32 status;
> +
> +	stdev = container_of(work, struct switchtec_dev, mrpc_timeout.work);
> +
> +	dev_dbg(&stdev->dev, "%s\n", __func__);
> +
> +	mutex_lock(&stdev->mrpc_mutex);
> +
> +	if (stdev_is_alive(stdev)) {
> +		status = ioread32(&stdev->mmio_mrpc->status);
> +		if (status == SWITCHTEC_MRPC_STATUS_INPROGRESS) {
> +			schedule_delayed_work(&stdev->mrpc_timeout,
> +					      msecs_to_jiffies(500));
> +			goto out;
> +		}
> +	}
> +
> +	mrpc_complete_cmd(stdev);
> +
> +out:
> +	mutex_unlock(&stdev->mrpc_mutex);
> +}
> +
> +static int switchtec_dev_open(struct inode *inode, struct file *filp)
> +{
> +	struct switchtec_dev *stdev;
> +	struct switchtec_user *stuser;
> +
> +	stdev = container_of(inode->i_cdev, struct switchtec_dev, cdev);
> +
> +	stuser = stuser_create(stdev);
> +	if (!stuser)
> +		return PTR_ERR(stuser);
> +
> +	filp->private_data = stuser;
> +	nonseekable_open(inode, filp);
> +
> +	dev_dbg(&stdev->dev, "%s: %p\n", __func__, stuser);
> +
> +	return 0;
> +}
> +
> +static int switchtec_dev_release(struct inode *inode, struct file *filp)
> +{
> +	struct switchtec_user *stuser = filp->private_data;
> +
> +	stuser_put(stuser);
> +
> +	return 0;
> +}
> +
> +static ssize_t switchtec_dev_write(struct file *filp, const char __user *data,
> +				   size_t size, loff_t *off)
> +{
> +	struct switchtec_user *stuser = filp->private_data;
> +	struct switchtec_dev *stdev = stuser->stdev;
> +	int rc;
> +
> +	if (!stdev_is_alive(stdev))
> +		return -ENXIO;

What exactly does this protect against?  Is it OK if stdev_is_alive()
becomes false immediately after we check?

> +
> +	if (size < sizeof(stuser->cmd) ||
> +	    size > sizeof(stuser->cmd) + SWITCHTEC_MRPC_PAYLOAD_SIZE)

I think I would use "sizeof(stuser->data)" instead of
SWITCHTEC_MRPC_PAYLOAD_SIZE so it matches what mrpc_complete_cmd()
does.  Same below in switchtec_dev_read().

mrpc_mutex doesn't protect the stuser things, does it?  If not, you
could do this without the gotos.  And I think it's a little easier to
be confident there are no buffer overruns:

  if (stuser->state != MRPC_IDLE)
    return -EBADE;

  if (size < sizeof(stuser->cmd))
    return -EINVAL;

  rc = copy_from_user(&stuser->cmd, data, sizeof(stuser->cmd));
  if (rc)
    return -EFAULT;

  size -= sizeof(stuser->cmd);
  data += sizeof(stuser->cmd);
  if (size > sizeof(stuser->data))
    return -EINVAL;

  rc = copy_from_user(&stuser->data, data, size);
  if (rc)
    return -EFAULT;

  stuser->data_len = size;

  if (mutex_lock_interruptible(&stdev->mrpc_mutex))
    return -EINTR;
  mrpc_queue_cmd(stuser);
  mutex_unlock(&stdev->mrpc_mutex);

  return size;

> +		return -EINVAL;
> +
> +	if (mutex_lock_interruptible(&stdev->mrpc_mutex))
> +		return -EINTR;
> +
> +	if (stuser->state != MRPC_IDLE) {
> +		rc = -EBADE;
> +		goto out;
> +	}
> +
> +	rc = copy_from_user(&stuser->cmd, data, sizeof(stuser->cmd));
> +	if (rc) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	data += sizeof(stuser->cmd);
> +	rc = copy_from_user(&stuser->data, data, size - sizeof(stuser->cmd));
> +	if (rc) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	stuser->data_len = size - sizeof(stuser->cmd);
> +
> +	mrpc_queue_cmd(stuser);
> +
> +	rc = size;
> +
> +out:
> +	mutex_unlock(&stdev->mrpc_mutex);
> +
> +	return rc;
> +}
> +
> +static ssize_t switchtec_dev_read(struct file *filp, char __user *data,
> +				  size_t size, loff_t *off)
> +{
> +	struct switchtec_user *stuser = filp->private_data;
> +	struct switchtec_dev *stdev = stuser->stdev;
> +	int rc;
> +
> +	if (!stdev_is_alive(stdev))
> +		return -ENXIO;
> +
> +	if (size < sizeof(stuser->cmd) ||
> +	    size > sizeof(stuser->cmd) + SWITCHTEC_MRPC_PAYLOAD_SIZE)
> +		return -EINVAL;
> +
> +	if (stuser->state == MRPC_IDLE)
> +		return -EBADE;
> +
> +	if (filp->f_flags & O_NONBLOCK) {
> +		if (!try_wait_for_completion(&stuser->comp))
> +			return -EAGAIN;
> +	} else {
> +		rc = wait_for_completion_interruptible(&stuser->comp);
> +		if (rc < 0)
> +			return rc;
> +	}
> +
> +	if (mutex_lock_interruptible(&stdev->mrpc_mutex))
> +		return -EINTR;
> +
> +	if (stuser->state != MRPC_DONE)
> +		return -EBADE;
> +
> +	rc = copy_to_user(data, &stuser->return_code,
> +			  sizeof(stuser->return_code));
> +	if (rc) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	data += sizeof(stuser->return_code);
> +	rc = copy_to_user(data, &stuser->data,
> +			  size - sizeof(stuser->return_code));
> +	if (rc) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	stuser_set_state(stuser, MRPC_IDLE);
> +
> +	if (stuser->status == SWITCHTEC_MRPC_STATUS_DONE)
> +		rc = size;
> +	else if (stuser->status == SWITCHTEC_MRPC_STATUS_INTERRUPTED)
> +		rc = -ENXIO;
> +	else
> +		rc = -EBADMSG;
> +
> +out:
> +	mutex_unlock(&stdev->mrpc_mutex);
> +
> +	return rc;
> +}
> +
> +static unsigned int switchtec_dev_poll(struct file *filp, poll_table *wait)
> +{
> +	struct switchtec_user *stuser = filp->private_data;
> +	struct switchtec_dev *stdev = stuser->stdev;
> +	int ret = 0;
> +
> +	poll_wait(filp, &stuser->comp.wait, wait);
> +	poll_wait(filp, &stdev->event_wq, wait);
> +
> +	if (!stdev_is_alive(stdev))
> +		return POLLERR;
> +
> +	if (try_wait_for_completion(&stuser->comp))
> +		ret |= POLLIN | POLLRDNORM;
> +
> +	if (stuser->event_cnt != atomic_read(&stdev->event_cnt))
> +		ret |= POLLPRI | POLLRDBAND;
> +
> +	return ret;
> +}
> +
> +static const struct file_operations switchtec_fops = {
> +	.owner = THIS_MODULE,
> +	.open = switchtec_dev_open,
> +	.release = switchtec_dev_release,
> +	.write = switchtec_dev_write,
> +	.read = switchtec_dev_read,
> +	.poll = switchtec_dev_poll,
> +};
> +
> +static void stdev_release(struct device *dev)
> +{
> +	struct switchtec_dev *stdev = to_stdev(dev);
> +
> +	ida_simple_remove(&switchtec_minor_ida,
> +			  MINOR(dev->devt));
> +	kfree(stdev);
> +}
> +
> +static void stdev_unregister(struct switchtec_dev *stdev)
> +{
> +	cdev_del(&stdev->cdev);
> +	device_unregister(&stdev->dev);
> +}
> +
> +static struct switchtec_dev *stdev_create(struct pci_dev *pdev)
> +{
> +	struct switchtec_dev *stdev;
> +	int minor;
> +	struct device *dev;
> +	struct cdev *cdev;
> +	int rc;
> +
> +	stdev = kzalloc_node(sizeof(*stdev), GFP_KERNEL,
> +			     dev_to_node(&pdev->dev));
> +	if (!stdev)
> +		return ERR_PTR(-ENOMEM);
> +
> +	stdev->pdev = pdev;
> +	INIT_LIST_HEAD(&stdev->mrpc_queue);
> +	mutex_init(&stdev->mrpc_mutex);
> +	stdev->mrpc_busy = 0;
> +	INIT_WORK(&stdev->mrpc_work, mrpc_event_work);
> +	INIT_DELAYED_WORK(&stdev->mrpc_timeout, mrpc_timeout_work);
> +	init_waitqueue_head(&stdev->event_wq);
> +	atomic_set(&stdev->event_cnt, 0);
> +
> +	minor = ida_simple_get(&switchtec_minor_ida, 0, 0,
> +			       GFP_KERNEL);
> +	if (minor < 0)
> +		return ERR_PTR(minor);
> +
> +	dev = &stdev->dev;
> +	device_initialize(dev);
> +	dev->devt = MKDEV(MAJOR(switchtec_devt), minor);
> +	dev->class = switchtec_class;
> +	dev->parent = &pdev->dev;
> +	dev->release = stdev_release;
> +	dev_set_name(dev, "switchtec%d", minor);
> +
> +	cdev = &stdev->cdev;
> +	cdev_init(cdev, &switchtec_fops);
> +	cdev->owner = THIS_MODULE;
> +	cdev->kobj.parent = &dev->kobj;
> +
> +	rc = cdev_add(&stdev->cdev, dev->devt, 1);
> +	if (rc)
> +		goto err_cdev;
> +
> +	rc = device_add(dev);
> +	if (rc) {
> +		cdev_del(&stdev->cdev);
> +		put_device(dev);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return stdev;
> +
> +err_cdev:
> +	ida_simple_remove(&switchtec_minor_ida, minor);
> +
> +	return ERR_PTR(rc);
> +}
> +
> +static irqreturn_t switchtec_event_isr(int irq, void *dev)
> +{
> +	struct switchtec_dev *stdev = dev;
> +	u32 reg;
> +	irqreturn_t ret = IRQ_NONE;
> +
> +	reg = ioread32(&stdev->mmio_part_cfg->mrpc_comp_hdr);
> +	if (reg & SWITCHTEC_EVENT_OCCURRED) {
> +		dev_dbg(&stdev->dev, "%s: mrpc comp\n", __func__);
> +		ret = IRQ_HANDLED;
> +		schedule_work(&stdev->mrpc_work);
> +		iowrite32(reg, &stdev->mmio_part_cfg->mrpc_comp_hdr);
> +	}
> +
> +	return ret;
> +}
> +
> +static int switchtec_init_msix_isr(struct switchtec_dev *stdev)
> +{
> +	struct pci_dev *pdev = stdev->pdev;
> +	int rc, i, msix_count, node;
> +
> +	node = dev_to_node(&pdev->dev);

Unused?

> +	for (i = 0; i < ARRAY_SIZE(stdev->msix); ++i)
> +		stdev->msix[i].entry = i;
> +
> +	msix_count = pci_enable_msix_range(pdev, stdev->msix, 1,
> +					   ARRAY_SIZE(stdev->msix));
> +	if (msix_count < 0)
> +		return msix_count;

Maybe this could be simplified by using pci_alloc_irq_vectors()?

> +	stdev->event_irq = ioread32(&stdev->mmio_part_cfg->vep_vector_number);
> +	if (stdev->event_irq < 0 || stdev->event_irq >= msix_count) {
> +		rc = -EFAULT;
> +		goto err_msix_request;
> +	}

Not sure what this is doing, but you immediately overwrite
stdev->event_irq below.  If you're using it as just a temporary above
for doing some validation, I would just use a local variable to avoid
the connection with stdev->event_irq.

> +	stdev->event_irq = stdev->msix[stdev->event_irq].vector;

Oh, I guess you allocate several MSI-X vectors, but you only actually
use one of them?  Why is that?  I was confused about why you allocate
several vectors, but there's only one devm_request_irq() call below.

> +	dev_dbg(&stdev->dev, "Using msix interrupts: event_irq=%d\n",
> +		stdev->event_irq);
> +
> +	return 0;
> +
> +err_msix_request:
> +	pci_disable_msix(pdev);
> +	return rc;
> +}
> +
> +static int switchtec_init_msi_isr(struct switchtec_dev *stdev)
> +{
> +	int rc;
> +	struct pci_dev *pdev = stdev->pdev;
> +
> +	/* Try to set up msi irq */
> +	rc = pci_enable_msi_range(pdev, 1, 4);
> +	if (rc < 0)
> +		return rc;
> +
> +	stdev->event_irq = ioread32(&stdev->mmio_part_cfg->vep_vector_number);
> +	if (stdev->event_irq < 0 || stdev->event_irq >= 4) {
> +		rc = -EFAULT;
> +		goto err_msi_request;
> +	}
> +
> +	stdev->event_irq = pdev->irq + stdev->event_irq;
> +	dev_dbg(&stdev->dev, "Using msi interrupts: event_irq=%d\n",
> +		stdev->event_irq);
> +
> +	return 0;
> +
> +err_msi_request:
> +	pci_disable_msi(pdev);
> +	return rc;
> +}
> +
> +static int switchtec_init_isr(struct switchtec_dev *stdev)
> +{
> +	int rc;
> +
> +	rc = switchtec_init_msix_isr(stdev);
> +	if (rc)
> +		rc = switchtec_init_msi_isr(stdev);
> +
> +	if (rc)
> +		return rc;
> +
> +	rc = devm_request_irq(&stdev->pdev->dev, stdev->event_irq,
> +			      switchtec_event_isr, 0, KBUILD_MODNAME, stdev);
> +
> +	return rc;
> +}
> +
> +static void switchtec_deinit_isr(struct switchtec_dev *stdev)
> +{
> +	devm_free_irq(&stdev->pdev->dev, stdev->event_irq, stdev);
> +	pci_disable_msix(stdev->pdev);
> +	pci_disable_msi(stdev->pdev);
> +}
> +
> +static void init_pff(struct switchtec_dev *stdev)
> +{
> +	int i;
> +	u32 reg;
> +	struct part_cfg_regs *pcfg = stdev->mmio_part_cfg;
> +
> +	for (i = 0; i < SWITCHTEC_MAX_PFF_CSR; i++) {
> +		reg = ioread16(&stdev->mmio_pff_csr[i].vendor_id);
> +		if (reg != MICROSEMI_VENDOR_ID)
> +			break;
> +	}
> +
> +	stdev->pff_csr_count = i;
> +
> +	reg = ioread32(&pcfg->usp_pff_inst_id);
> +	if (reg < SWITCHTEC_MAX_PFF_CSR)
> +		stdev->pff_local[reg] = 1;
> +
> +	reg = ioread32(&pcfg->vep_pff_inst_id);
> +	if (reg < SWITCHTEC_MAX_PFF_CSR)
> +		stdev->pff_local[reg] = 1;
> +
> +	for (i = 0; i < ARRAY_SIZE(pcfg->dsp_pff_inst_id); i++) {
> +		reg = ioread32(&pcfg->dsp_pff_inst_id[i]);
> +		if (reg < SWITCHTEC_MAX_PFF_CSR)
> +			stdev->pff_local[reg] = 1;
> +	}
> +}
> +
> +static int switchtec_init_pci(struct switchtec_dev *stdev,
> +			      struct pci_dev *pdev)
> +{
> +	int rc;
> +
> +	rc = pcim_enable_device(pdev);
> +	if (rc)
> +		return rc;
> +
> +	rc = pcim_iomap_regions(pdev, 0x1, KBUILD_MODNAME);
> +	if (rc)
> +		return rc;
> +
> +	pci_set_master(pdev);
> +
> +	stdev->mmio = pcim_iomap_table(pdev)[0];
> +	stdev->mmio_mrpc = stdev->mmio + SWITCHTEC_GAS_MRPC_OFFSET;
> +	stdev->mmio_sw_event = stdev->mmio + SWITCHTEC_GAS_SW_EVENT_OFFSET;
> +	stdev->mmio_sys_info = stdev->mmio + SWITCHTEC_GAS_SYS_INFO_OFFSET;
> +	stdev->mmio_flash_info = stdev->mmio + SWITCHTEC_GAS_FLASH_INFO_OFFSET;
> +	stdev->mmio_ntb = stdev->mmio + SWITCHTEC_GAS_NTB_OFFSET;
> +	stdev->partition = ioread8(&stdev->mmio_ntb->partition_id);
> +	stdev->partition_count = ioread8(&stdev->mmio_ntb->partition_count);
> +	stdev->mmio_part_cfg_all = stdev->mmio + SWITCHTEC_GAS_PART_CFG_OFFSET;
> +	stdev->mmio_part_cfg = &stdev->mmio_part_cfg_all[stdev->partition];
> +	stdev->mmio_pff_csr = stdev->mmio + SWITCHTEC_GAS_PFF_CSR_OFFSET;
> +
> +	init_pff(stdev);
> +
> +	iowrite32(SWITCHTEC_EVENT_CLEAR |
> +		  SWITCHTEC_EVENT_EN_IRQ,

Does this enable the device to generate IRQs?  You haven't connected
the ISR yet (but I guess you probably also haven't told the device to
do anything that would actually generate an IRQ).

> +		  &stdev->mmio_part_cfg->mrpc_comp_hdr);
> +
> +	pci_set_drvdata(pdev, stdev);
> +
> +	return 0;
> +}
> +
> +static void switchtec_deinit_pci(struct switchtec_dev *stdev)
> +{
> +	struct pci_dev *pdev = stdev->pdev;
> +
> +	stdev->mmio = NULL;
> +	pci_clear_master(pdev);
> +	pci_set_drvdata(pdev, NULL);
> +}
> +
> +static int switchtec_pci_probe(struct pci_dev *pdev,
> +			       const struct pci_device_id *id)
> +{
> +	struct switchtec_dev *stdev;
> +	int rc;
> +
> +	stdev = stdev_create(pdev);
> +	if (!stdev)
> +		return PTR_ERR(stdev);
> +
> +	rc = switchtec_init_pci(stdev, pdev);
> +	if (rc)
> +		goto err_init_pci;
> +
> +	rc = switchtec_init_isr(stdev);
> +	if (rc) {
> +		dev_err(&stdev->dev, "failed to init isr.\n");
> +		goto err_init_isr;
> +	}
> +
> +	dev_info(&stdev->dev, "Management device registered.\n");
> +
> +	return 0;
> +
> +err_init_isr:
> +	switchtec_deinit_pci(stdev);
> +err_init_pci:
> +	stdev_unregister(stdev);
> +	return rc;
> +}
> +
> +static void switchtec_pci_remove(struct pci_dev *pdev)
> +{
> +	struct switchtec_dev *stdev = pci_get_drvdata(pdev);
> +
> +	switchtec_deinit_isr(stdev);
> +	switchtec_deinit_pci(stdev);
> +	stdev_unregister(stdev);
> +}
> +
> +#define SWITCHTEC_PCI_DEVICE(device_id) \
> +	{ \
> +		.vendor     = MICROSEMI_VENDOR_ID, \
> +		.device     = device_id, \
> +		.subvendor  = PCI_ANY_ID, \
> +		.subdevice  = PCI_ANY_ID, \
> +		.class      = MICROSEMI_MGMT_CLASSCODE, \
> +		.class_mask = 0xFFFFFFFF, \
> +	}, \
> +	{ \
> +		.vendor     = MICROSEMI_VENDOR_ID, \
> +		.device     = device_id, \
> +		.subvendor  = PCI_ANY_ID, \
> +		.subdevice  = PCI_ANY_ID, \
> +		.class      = MICROSEMI_NTB_CLASSCODE, \
> +		.class_mask = 0xFFFFFFFF, \
> +	}
> +
> +static const struct pci_device_id switchtec_pci_tbl[] = {
> +	SWITCHTEC_PCI_DEVICE(0x8531),  //PFX 24xG3
> +	SWITCHTEC_PCI_DEVICE(0x8532),  //PFX 32xG3
> +	SWITCHTEC_PCI_DEVICE(0x8533),  //PFX 48xG3
> +	SWITCHTEC_PCI_DEVICE(0x8534),  //PFX 64xG3
> +	SWITCHTEC_PCI_DEVICE(0x8535),  //PFX 80xG3
> +	SWITCHTEC_PCI_DEVICE(0x8536),  //PFX 96xG3
> +	SWITCHTEC_PCI_DEVICE(0x8543),  //PSX 48xG3
> +	SWITCHTEC_PCI_DEVICE(0x8544),  //PSX 64xG3
> +	SWITCHTEC_PCI_DEVICE(0x8545),  //PSX 80xG3
> +	SWITCHTEC_PCI_DEVICE(0x8546),  //PSX 96xG3
> +	{0}
> +};
> +MODULE_DEVICE_TABLE(pci, switchtec_pci_tbl);
> +
> +static struct pci_driver switchtec_pci_driver = {
> +	.name		= KBUILD_MODNAME,
> +	.id_table	= switchtec_pci_tbl,
> +	.probe		= switchtec_pci_probe,
> +	.remove		= switchtec_pci_remove,
> +};
> +
> +static int __init switchtec_init(void)
> +{
> +	int rc;
> +
> +	max_devices = max(max_devices, 256);
> +	rc = alloc_chrdev_region(&switchtec_devt, 0, max_devices,
> +				 "switchtec");
> +	if (rc)
> +		return rc;
> +
> +	switchtec_class = class_create(THIS_MODULE, "switchtec");
> +	if (IS_ERR(switchtec_class)) {
> +		rc = PTR_ERR(switchtec_class);
> +		goto err_create_class;
> +	}
> +
> +	rc = pci_register_driver(&switchtec_pci_driver);
> +	if (rc)
> +		goto err_pci_register;
> +
> +	pr_info(KBUILD_MODNAME ": loaded.\n");
> +
> +	return 0;
> +
> +err_pci_register:
> +	class_destroy(switchtec_class);
> +
> +err_create_class:
> +	unregister_chrdev_region(switchtec_devt, max_devices);
> +
> +	return rc;
> +}
> +module_init(switchtec_init);
> +
> +static void __exit switchtec_exit(void)
> +{
> +	pci_unregister_driver(&switchtec_pci_driver);
> +	class_destroy(switchtec_class);
> +	unregister_chrdev_region(switchtec_devt, max_devices);
> +	ida_destroy(&switchtec_minor_ida);
> +
> +	pr_info(KBUILD_MODNAME ": unloaded.\n");
> +}
> +module_exit(switchtec_exit);
> -- 
> 2.1.4
 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/4] MicroSemi Switchtec management interface driver
  2017-02-24  0:35   ` Bjorn Helgaas
@ 2017-02-24 18:32     ` Logan Gunthorpe
  0 siblings, 0 replies; 36+ messages in thread
From: Logan Gunthorpe @ 2017-02-24 18:32 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Christoph Hellwig

Hey Bjorn,

Thanks for the thorough review. It definitely helped a lot to make the
code as good as it can be.

I've made all of the changes you suggested. I'm just going to do a bit
more testing and then post a v4. See my responses to all of your
feedback bellow.

Logan

On 23/02/17 05:35 PM, Bjorn Helgaas wrote:
> Would it make any sense to integrate this with the perf tool?  It
> looks like switchtec-user measures bandwidth and latency, which sounds
> sort of perf-ish.

That would be cool but lets file that under potential future work. This
driver also enables more than bandwidth and latency so it's still
required for us.

>> +MODULE_DESCRIPTION("Microsemi Switchtec(tm) PCI-E Management Driver");
> 
> Nit: s/PCI-E/PCIe/
> 

Fixed.

>> +MODULE_VERSION("0.1");
>> +MODULE_LICENSE("GPL");
>> +MODULE_AUTHOR("Microsemi Corporation");
>> +
>> +static int max_devices = 16;
> 
> This static initialization is slightly misleading because
> switchtec_init() immediately sets max_devices to at least 256.

Oops copied that from dax without thinking. I've just removed the max()
call in the init function.


>> +	atomic_t event_cnt;
> 
> Why is this atomic_t?  You only do an atomic_set() (in stdev_create())
> and atomic_reads() -- no writes other than the initial one.  So I'm
> not sure what value atomic_t brings.

If you looked at a following patch in the series you'd see that there's
an atomic_inc in the ISR. The splitting of the series wasn't as precise
as it maybe should have been and thus this became a bit confusing.

>> +	memcpy_fromio(stuser->data, &stdev->mmio_mrpc->output_data,
>> +		      sizeof(stuser->data));
> 
> Apparently you always get 1K of data back, no matter what?

Yes, sort of unfortunately. Because these transactions can occur before
the user actually does the read, we don't know how much data the user
wants. This may be a performance improvement in the future (ie. if the
user reads before the MRPC transaction comes back, then only read the
requested amount). But we will leave it this way for now.


>> +	if (!stdev_is_alive(stdev))
>> +		return -ENXIO;
> 
> What exactly does this protect against?  Is it OK if stdev_is_alive()
> becomes false immediately after we check?

Yup, you're right: that's racy. Turns out cleanup is hard and I've
learned a lot even in the short time since I wrote this code. I've
gotten rid of stdev_is_alive and moved the pci cleanup code into the
device's release function so they won't occur until the last user closes
the cdev. I think that should work better but please let me know if you
see an issue with this.

>> +
>> +	if (size < sizeof(stuser->cmd) ||
>> +	    size > sizeof(stuser->cmd) + SWITCHTEC_MRPC_PAYLOAD_SIZE)
> 
> I think I would use "sizeof(stuser->data)" instead of
> SWITCHTEC_MRPC_PAYLOAD_SIZE so it matches what mrpc_complete_cmd()
> does.  Same below in switchtec_dev_read().

Ok.

> mrpc_mutex doesn't protect the stuser things, does it?  If not, you
> could do this without the gotos.  And I think it's a little easier to
> be confident there are no buffer overruns:

I was using the mutex to protect stuser->state as well. But I've made
your changes and just adjusted stuser->state to be atomic and I think
this should be just as good.

>> +static int switchtec_init_msix_isr(struct switchtec_dev *stdev)
>> +{
>> +	struct pci_dev *pdev = stdev->pdev;
>> +	int rc, i, msix_count, node;
>> +
>> +	node = dev_to_node(&pdev->dev);
> 
> Unused?

Yup, I've removed it.

>> +	for (i = 0; i < ARRAY_SIZE(stdev->msix); ++i)
>> +		stdev->msix[i].entry = i;
>> +
>> +	msix_count = pci_enable_msix_range(pdev, stdev->msix, 1,
>> +					   ARRAY_SIZE(stdev->msix));
>> +	if (msix_count < 0)
>> +		return msix_count;
> 
> Maybe this could be simplified by using pci_alloc_irq_vectors()?

Yup! I wasn't aware of that function. It's a _huge_ improvement. Thanks.
Still I'd really appreciate a quick re-review after I post v4 just to
make sure I got it correct.

>> +	stdev->event_irq = ioread32(&stdev->mmio_part_cfg->vep_vector_number);
>> +	if (stdev->event_irq < 0 || stdev->event_irq >= msix_count) {
>> +		rc = -EFAULT;
>> +		goto err_msix_request;
>> +	}
> 
> Not sure what this is doing, but you immediately overwrite
> stdev->event_irq below.  If you're using it as just a temporary above
> for doing some validation, I would just use a local variable to avoid
> the connection with stdev->event_irq.

Yes, I made this temporary.

>> +	stdev->event_irq = stdev->msix[stdev->event_irq].vector;
> 
> Oh, I guess you allocate several MSI-X vectors, but you only actually
> use one of them?  Why is that?  I was confused about why you allocate
> several vectors, but there's only one devm_request_irq() call below.

The event_irq is configurable in hardware and won't necessarily be the
first irq available. (It should always be in the first four.) As I
understand it, we need to allocate all of them in order to use the one
we want. The hardware has a couple of other IRQs that do other things
that we are not currently using.

>> +	iowrite32(SWITCHTEC_EVENT_CLEAR |
>> +		  SWITCHTEC_EVENT_EN_IRQ,
> 
> Does this enable the device to generate IRQs?  You haven't connected
> the ISR yet (but I guess you probably also haven't told the device to
> do anything that would actually generate an IRQ).

Yes on both counts. I've moved it past the ISR initialization just so
it's more correct.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver
  2017-02-23 22:56     ` Logan Gunthorpe
@ 2017-02-24 22:45       ` Bjorn Helgaas
  0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2017-02-24 22:45 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Keith Busch, Myron Stowe, Greg Kroah-Hartman, Bjorn Helgaas,
	Geert Uytterhoeven, Jonathan Corbet, David S. Miller,
	Andrew Morton, Emil Velikov, Mauro Carvalho Chehab,
	Guenter Roeck, Jarkko Sakkinen, Linus Walleij, Ryusuke Konishi,
	Stefan Berger, Wei Zhang, Kurt Schwemmer, Stephen Bates,
	linux-pci, linux-doc, linux-nvme, linux-kernel

On Thu, Feb 23, 2017 at 03:56:21PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 23/02/17 03:43 PM, Bjorn Helgaas wrote:
> > This path seems a little generic.  I don't see other cases where a
> > product brand name ("Switchtec") appears at the top level of
> > /sys/class/...
> 
> Ok, well we are certainly open to suggestions, but there isn't really a
> generic version of this device available so I'm not sure how we would
> change that. Per device-type classes aren't that uncommon though, a
> quick grep shows things like:
> 
> platform/chrome/cros_ec_dev.c:40:static struct class cros_class
> s390/char/raw3270.h:94:extern struct class *class3270;
> net/ethernet/hisilicon/hns/hnae.c:19:static struct class *hnae_class;
> mfd/ucb1x00-core.c:490:static struct class ucb1x00_class
> 
> > My question is based on "ls Documentation/ABI/testing/sysfs-class*",
> > not on any great knowledge of sysfs, and I see Greg has already given
> > a Reviewed-by for this, so maybe this is the right approach.
> > 
> > It does seem like the path could include a clue that this is related
> > to PCI.
> 
> I mean, we could change it to pci-switchtec or something if you think
> that would be better..?? But I'm not sure how else to accommodate this.

I'm OK with it as-is.

> > Is there a link to the switch PCI device itself, e.g., to
> > /sys/devices/pci*?  Should these attributes simply be put in a
> > subdirectory there, e.g., in
> > 
> >   /sys/devices/pci0000:00/0000:00:00.0/stats/...
> 
> Well our device shows up here in the tree:
> 
> /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.1/switchtec/switchtec0
> 
> (Which userspace can get to by following the link at
> /sys/class/switchtec/switchtec0) The switch is then always:
> 
> /sys/devices/pci0000:00/0000:00:03.0

That's exactly what I was looking for; I just didn't realize it was
connected like this.

Bjorn

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2017-02-24 22:45 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-02 18:05 [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Logan Gunthorpe
2017-02-02 18:06 ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Logan Gunthorpe
2017-02-10 14:51   ` Greg Kroah-Hartman
2017-02-10 16:48     ` Logan Gunthorpe
2017-02-10 16:55       ` Greg Kroah-Hartman
2017-02-10 17:03         ` Logan Gunthorpe
2017-02-10 17:09           ` Greg Kroah-Hartman
2017-02-10 18:00             ` Logan Gunthorpe
2017-02-10 17:57     ` [PATCH] switchtec: cleanup cdev init Logan Gunthorpe
2017-02-18 20:22       ` Logan Gunthorpe
2017-02-19 21:43         ` Dan Williams
2017-02-20  4:22           ` Logan Gunthorpe
2017-02-21 18:37             ` Jason Gunthorpe
2017-02-10 14:54   ` [PATCH v2 1/4] MicroSemi Switchtec management interface driver Greg Kroah-Hartman
2017-02-24  0:35   ` Bjorn Helgaas
2017-02-24 18:32     ` Logan Gunthorpe
2017-02-02 18:06 ` [PATCH v2 2/4] switchtec: Add user interface documentation Logan Gunthorpe
2017-02-02 18:06 ` [PATCH v2 3/4] switchtec: Add sysfs attributes to the Switchtec driver Logan Gunthorpe
2017-02-10 14:54   ` Greg Kroah-Hartman
2017-02-23 22:43   ` Bjorn Helgaas
2017-02-23 22:56     ` Logan Gunthorpe
2017-02-24 22:45       ` Bjorn Helgaas
2017-02-02 18:06 ` [PATCH v2 4/4] switchtec: Add IOCTLs " Logan Gunthorpe
2017-02-09 23:16 ` [PATCH v2 0/4] New Microsemi PCI Switch Management Driver Wei Zhang
2017-02-09 23:46   ` Wei Zhang
2017-02-10 14:54 ` Greg Kroah-Hartman
2017-02-10 16:14 ` Jens Axboe
2017-02-17 20:36 ` Logan Gunthorpe
2017-02-23 20:36   ` Logan Gunthorpe
2017-02-23 20:51     ` Sinan Kaya
2017-02-23 20:54       ` Jens Axboe
2017-02-23 20:56       ` Logan Gunthorpe
2017-02-23 20:52     ` Jens Axboe
2017-02-23 20:55     ` Greg Kroah-Hartman
2017-02-23 22:14     ` Bjorn Helgaas
2017-02-23 22:19       ` Logan Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).