linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/3] Add PCIe enclosure management support
@ 2022-02-02 17:59 Stuart Hayes
  2022-02-02 17:59 ` [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver Stuart Hayes
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Stuart Hayes @ 2022-02-02 17:59 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci, Dan Williams
  Cc: Keith Busch, kw, mariusz.tkaczyk, helgaas, lukas, pavel,
	linux-cxl, martin.petersen, James.Bottomley, Arnd Bergmann,
	Greg Kroah-Hartman, Stuart Hayes

This patch set adds support for two PCIe enclosure management methods,
"Native PCIe Enclosure Management" (defined in the PCI Express Base Spec)
and "_DSM Definitions for PCIe SSD Status LED" (defined in the PCI
Firmware Spec). The latter is very similar to the former, but uses a _DSM
interface (which can be provided by firmware) rather than a PCIe
extended capability.  Each provides a way to control up to ten indicator
states (such as locate, fault, etc.)

There are three patches in the set:

(1) Expands the existing enclosure driver to support more
indicators.

(2) Adds an auxiliary driver "pcie_em" that can attach to auxiliary
devices created by the drivers of any devices that can support PCIe
enclosure management (nvme, pcieport, and cxl).  It will register an
enclosure device with one component for each device with an enclosure
management interface.

(3) Modifies the nvme driver to create an auxiliary device to which the
pcie_em driver can attach.

These patches do not modify the cxl or pcieport drivers to add support,
though the driver was designed to make it easy to do so.
---
v2:  fixed compile error related to switch default in enclosure.c
     added documentation of added sysfs attributes
---

Stuart Hayes (3):
  Add support for seven more indicators in enclosure driver
  Add PCIe enclosure management auxiliary driver
  Register auxiliary device for PCIe enclosure management

 .../ABI/testing/sysfs-class-enclosure         |  14 +
 drivers/misc/enclosure.c                      | 191 ++++---
 drivers/nvme/host/pci.c                       |  11 +
 drivers/pci/pcie/Kconfig                      |   8 +
 drivers/pci/pcie/Makefile                     |   1 +
 drivers/pci/pcie/pcie_em.c                    | 473 ++++++++++++++++++
 include/linux/enclosure.h                     |  22 +
 include/linux/pcie_em.h                       |  97 ++++
 include/uapi/linux/pci_regs.h                 |  11 +-
 9 files changed, 749 insertions(+), 79 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-class-enclosure
 create mode 100644 drivers/pci/pcie/pcie_em.c
 create mode 100644 include/linux/pcie_em.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver
  2022-02-02 17:59 [RFC PATCH v2 0/3] Add PCIe enclosure management support Stuart Hayes
@ 2022-02-02 17:59 ` Stuart Hayes
  2022-02-02 23:55   ` Bjorn Helgaas
  2022-02-02 17:59 ` [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver Stuart Hayes
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Stuart Hayes @ 2022-02-02 17:59 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci, Dan Williams
  Cc: Keith Busch, kw, mariusz.tkaczyk, helgaas, lukas, pavel,
	linux-cxl, martin.petersen, James.Bottomley, Arnd Bergmann,
	Greg Kroah-Hartman, Stuart Hayes

This patch adds support for seven additional indicators (ok, rebuild,
prdfail, hotspare, ica, ifa, disabled) to the enclosure driver, which
currently only supports three (fault, active, locate). It also reduces
duplicated code for the set and show functions for the sysfs attributes
for all of the indicators, and allows users of the driver to provide
common get_led and set_led callbacks to control all indicators (though
it continues to support the existing callbacks for the three currently
supported indicators, so it does not require any changes to code that
already uses the enclosure driver).

This will be used by the pcie_em driver.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
 .../ABI/testing/sysfs-class-enclosure         |  14 ++
 drivers/misc/enclosure.c                      | 191 +++++++++++-------
 include/linux/enclosure.h                     |  22 ++
 3 files changed, 149 insertions(+), 78 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-class-enclosure

diff --git a/Documentation/ABI/testing/sysfs-class-enclosure b/Documentation/ABI/testing/sysfs-class-enclosure
new file mode 100644
index 000000000000..25d91e42d768
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-enclosure
@@ -0,0 +1,14 @@
+What:		/sys/class/enclosure/<enclosure>/<component>/rebuild
+What:		/sys/class/enclosure/<enclosure>/<component>/disabled
+What:		/sys/class/enclosure/<enclosure>/<component>/hotspare
+What:		/sys/class/enclosure/<enclosure>/<component>/ica
+What:		/sys/class/enclosure/<enclosure>/<component>/ifa
+What:		/sys/class/enclosure/<enclosure>/<component>/ok
+What:		/sys/class/enclosure/<enclosure>/<component>/prdfail
+Date:		February 2022
+Contact:	Stuart Hayes <stuart.w.hayes@gmail.com>
+Description:
+		(RW) Get or set the indicator (1 is on, 0 is off) that
+		corresponds to the attribute name, for a component in an
+		enclosure. The "ica" and "ifa" states are "in critical
+		array" and "in failed array".
diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 1b010d9267c9..485d6a3c655b 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -473,30 +473,6 @@ static const char *const enclosure_type[] = {
 	[ENCLOSURE_COMPONENT_ARRAY_DEVICE] = "array device",
 };
 
-static ssize_t get_component_fault(struct device *cdev,
-				   struct device_attribute *attr, char *buf)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-
-	if (edev->cb->get_fault)
-		edev->cb->get_fault(edev, ecomp);
-	return sysfs_emit(buf, "%d\n", ecomp->fault);
-}
-
-static ssize_t set_component_fault(struct device *cdev,
-				   struct device_attribute *attr,
-				   const char *buf, size_t count)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-	int val = simple_strtoul(buf, NULL, 0);
-
-	if (edev->cb->set_fault)
-		edev->cb->set_fault(edev, ecomp, val);
-	return count;
-}
-
 static ssize_t get_component_status(struct device *cdev,
 				    struct device_attribute *attr,char *buf)
 {
@@ -531,54 +507,6 @@ static ssize_t set_component_status(struct device *cdev,
 		return -EINVAL;
 }
 
-static ssize_t get_component_active(struct device *cdev,
-				    struct device_attribute *attr, char *buf)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-
-	if (edev->cb->get_active)
-		edev->cb->get_active(edev, ecomp);
-	return sysfs_emit(buf, "%d\n", ecomp->active);
-}
-
-static ssize_t set_component_active(struct device *cdev,
-				    struct device_attribute *attr,
-				    const char *buf, size_t count)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-	int val = simple_strtoul(buf, NULL, 0);
-
-	if (edev->cb->set_active)
-		edev->cb->set_active(edev, ecomp, val);
-	return count;
-}
-
-static ssize_t get_component_locate(struct device *cdev,
-				    struct device_attribute *attr, char *buf)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-
-	if (edev->cb->get_locate)
-		edev->cb->get_locate(edev, ecomp);
-	return sysfs_emit(buf, "%d\n", ecomp->locate);
-}
-
-static ssize_t set_component_locate(struct device *cdev,
-				    struct device_attribute *attr,
-				    const char *buf, size_t count)
-{
-	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
-	struct enclosure_component *ecomp = to_enclosure_component(cdev);
-	int val = simple_strtoul(buf, NULL, 0);
-
-	if (edev->cb->set_locate)
-		edev->cb->set_locate(edev, ecomp, val);
-	return count;
-}
-
 static ssize_t get_component_power_status(struct device *cdev,
 					  struct device_attribute *attr,
 					  char *buf)
@@ -641,29 +569,136 @@ static ssize_t get_component_slot(struct device *cdev,
 	return sysfs_emit(buf, "%d\n", slot);
 }
 
-static DEVICE_ATTR(fault, S_IRUGO | S_IWUSR, get_component_fault,
-		    set_component_fault);
+/*
+ * callbacks for attrs using enum enclosure_component_setting (LEDs)
+ */
+static ssize_t led_show(struct device *cdev,
+			enum enclosure_component_led led,
+			char *buf)
+{
+	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
+	struct enclosure_component *ecomp = to_enclosure_component(cdev);
+
+	if (edev->cb->get_led)
+		edev->cb->get_led(edev, ecomp, led);
+	else
+		/*
+		 * support old callbacks for fault/active/locate
+		 */
+		switch (led) {
+		case ENCLOSURE_LED_FAULT:
+			if (edev->cb->get_fault) {
+				edev->cb->get_fault(edev, ecomp);
+				ecomp->led[led] = ecomp->fault;
+			}
+			break;
+		case ENCLOSURE_LED_ACTIVE:
+			if (edev->cb->get_active) {
+				edev->cb->get_active(edev, ecomp);
+				ecomp->led[led] = ecomp->active;
+			}
+			break;
+		case ENCLOSURE_LED_LOCATE:
+			if (edev->cb->get_locate) {
+				edev->cb->get_locate(edev, ecomp);
+				ecomp->led[led] = ecomp->locate;
+			}
+			break;
+		default:
+			break;
+		}
+
+	return sysfs_emit(buf, "%d\n", ecomp->led[led]);
+}
+
+static ssize_t led_set(struct device *cdev,
+		       enum enclosure_component_led led,
+		       const char *buf, size_t count)
+{
+	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
+	struct enclosure_component *ecomp = to_enclosure_component(cdev);
+	int err, val;
+
+	err = kstrtoint(buf, 0, &val);
+	if (err)
+		return err;
+
+	if (edev->cb->set_led)
+		edev->cb->set_led(edev, ecomp, led, val);
+	else
+		/*
+		 * support old callbacks for fault/active/locate
+		 */
+		switch (led) {
+		case ENCLOSURE_LED_FAULT:
+			if (edev->cb->set_fault)
+				edev->cb->set_fault(edev, ecomp, val);
+			break;
+		case ENCLOSURE_LED_ACTIVE:
+			if (edev->cb->set_active)
+				edev->cb->set_active(edev, ecomp, val);
+			break;
+		case ENCLOSURE_LED_LOCATE:
+			if (edev->cb->set_locate)
+				edev->cb->set_locate(edev, ecomp, val);
+			break;
+		default:
+			break;
+		}
+
+	return count;
+}
+
+#define LED_ATTR_RW(led_attr, led)					\
+static ssize_t led_attr##_show(struct device *cdev,			\
+			       struct device_attribute *attr,		\
+			       char *buf)				\
+{									\
+	return led_show(cdev, led, buf);				\
+}									\
+static ssize_t led_attr##_store(struct device *cdev,			\
+				struct device_attribute *attr,		\
+				const char *buf, size_t count)		\
+{									\
+	return led_set(cdev, led, buf, count);				\
+}									\
+static DEVICE_ATTR_RW(led_attr)
+
 static DEVICE_ATTR(status, S_IRUGO | S_IWUSR, get_component_status,
 		   set_component_status);
-static DEVICE_ATTR(active, S_IRUGO | S_IWUSR, get_component_active,
-		   set_component_active);
-static DEVICE_ATTR(locate, S_IRUGO | S_IWUSR, get_component_locate,
-		   set_component_locate);
 static DEVICE_ATTR(power_status, S_IRUGO | S_IWUSR, get_component_power_status,
 		   set_component_power_status);
 static DEVICE_ATTR(type, S_IRUGO, get_component_type, NULL);
 static DEVICE_ATTR(slot, S_IRUGO, get_component_slot, NULL);
+LED_ATTR_RW(fault, ENCLOSURE_LED_FAULT);
+LED_ATTR_RW(active, ENCLOSURE_LED_ACTIVE);
+LED_ATTR_RW(locate, ENCLOSURE_LED_LOCATE);
+LED_ATTR_RW(ok, ENCLOSURE_LED_OK);
+LED_ATTR_RW(rebuild, ENCLOSURE_LED_REBUILD);
+LED_ATTR_RW(prdfail, ENCLOSURE_LED_PRDFAIL);
+LED_ATTR_RW(hotspare, ENCLOSURE_LED_HOTSPARE);
+LED_ATTR_RW(ica, ENCLOSURE_LED_ICA);
+LED_ATTR_RW(ifa, ENCLOSURE_LED_IFA);
+LED_ATTR_RW(disabled, ENCLOSURE_LED_DISABLED);
 
 static struct attribute *enclosure_component_attrs[] = {
 	&dev_attr_fault.attr,
 	&dev_attr_status.attr,
 	&dev_attr_active.attr,
 	&dev_attr_locate.attr,
+	&dev_attr_ok.attr,
+	&dev_attr_rebuild.attr,
+	&dev_attr_prdfail.attr,
+	&dev_attr_hotspare.attr,
+	&dev_attr_ica.attr,
+	&dev_attr_ifa.attr,
+	&dev_attr_disabled.attr,
 	&dev_attr_power_status.attr,
 	&dev_attr_type.attr,
 	&dev_attr_slot.attr,
 	NULL
 };
+
 ATTRIBUTE_GROUPS(enclosure_component);
 
 static int __init enclosure_init(void)
diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h
index 1c630e2c2756..98dd0f05efb7 100644
--- a/include/linux/enclosure.h
+++ b/include/linux/enclosure.h
@@ -49,6 +49,20 @@ enum enclosure_component_setting {
 	ENCLOSURE_SETTING_BLINK_B_OFF_ON = 7,
 };
 
+enum enclosure_component_led {
+	ENCLOSURE_LED_FAULT,
+	ENCLOSURE_LED_ACTIVE,
+	ENCLOSURE_LED_LOCATE,
+	ENCLOSURE_LED_OK,
+	ENCLOSURE_LED_REBUILD,
+	ENCLOSURE_LED_PRDFAIL,
+	ENCLOSURE_LED_HOTSPARE,
+	ENCLOSURE_LED_ICA,
+	ENCLOSURE_LED_IFA,
+	ENCLOSURE_LED_DISABLED,
+	ENCLOSURE_LED_MAX,
+};
+
 struct enclosure_device;
 struct enclosure_component;
 struct enclosure_component_callbacks {
@@ -72,6 +86,13 @@ struct enclosure_component_callbacks {
 	int (*set_locate)(struct enclosure_device *,
 			  struct enclosure_component *,
 			  enum enclosure_component_setting);
+	void (*get_led)(struct enclosure_device *edev,
+			struct enclosure_component *ecomp,
+			enum enclosure_component_led);
+	int (*set_led)(struct enclosure_device *edev,
+		       struct enclosure_component *ecomp,
+		       enum enclosure_component_led,
+		       enum enclosure_component_setting);
 	void (*get_power_status)(struct enclosure_device *,
 				 struct enclosure_component *);
 	int (*set_power_status)(struct enclosure_device *,
@@ -90,6 +111,7 @@ struct enclosure_component {
 	int fault;
 	int active;
 	int locate;
+	int led[ENCLOSURE_LED_MAX];
 	int slot;
 	enum enclosure_status status;
 	int power_status;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver
  2022-02-02 17:59 [RFC PATCH v2 0/3] Add PCIe enclosure management support Stuart Hayes
  2022-02-02 17:59 ` [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver Stuart Hayes
@ 2022-02-02 17:59 ` Stuart Hayes
  2022-02-03  0:58   ` Bjorn Helgaas
  2022-02-02 17:59 ` [RFC PATCH v2 3/3] Register auxiliary device for PCIe enclosure management Stuart Hayes
  2022-02-10 11:52 ` [RFC PATCH v2 0/3] Add PCIe enclosure management support Mariusz Tkaczyk
  3 siblings, 1 reply; 7+ messages in thread
From: Stuart Hayes @ 2022-02-02 17:59 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci, Dan Williams
  Cc: Keith Busch, kw, mariusz.tkaczyk, helgaas, lukas, pavel,
	linux-cxl, martin.petersen, James.Bottomley, Arnd Bergmann,
	Greg Kroah-Hartman, Stuart Hayes

This patch adds a new auxiliary driver to support PCIe enclosure
management (NPEM and _DSM).  Drivers whose PCIe devices could have
NPEM and/or _DSM enclosure management support could register an
auxiliary device, and this driver will attach to that and register an
enclosure to allow user space control of the associated state
indicators.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
 drivers/pci/pcie/Kconfig      |   8 +
 drivers/pci/pcie/Makefile     |   1 +
 drivers/pci/pcie/pcie_em.c    | 473 ++++++++++++++++++++++++++++++++++
 include/linux/pcie_em.h       |  97 +++++++
 include/uapi/linux/pci_regs.h |  11 +-
 5 files changed, 589 insertions(+), 1 deletion(-)
 create mode 100644 drivers/pci/pcie/pcie_em.c
 create mode 100644 include/linux/pcie_em.h

diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 45a2ef702b45..deaaf00c0d38 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -142,3 +142,11 @@ config PCIE_EDR
 	  the PCI Firmware Specification r3.2.  Enable this if you want to
 	  support hybrid DPC model which uses both firmware and OS to
 	  implement DPC.
+
+config PCIE_EM
+	tristate "PCIe Enclosure Management support"
+	depends on ENCLOSURE_SERVICES
+	help
+	  Auxiliary driver for PCI Express enclosure management (support
+	  for Native PCIe Enclosure Management (NPEM) and the related _DSM
+	  interface for status LED management).
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 5783a2f79e6a..3b2b160b6f13 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME)		+= pme.o
 obj-$(CONFIG_PCIE_DPC)		+= dpc.o
 obj-$(CONFIG_PCIE_PTM)		+= ptm.o
 obj-$(CONFIG_PCIE_EDR)		+= edr.o
+obj-$(CONFIG_PCIE_EM)		+= pcie_em.o
diff --git a/drivers/pci/pcie/pcie_em.c b/drivers/pci/pcie/pcie_em.c
new file mode 100644
index 000000000000..fe6ec3aaa561
--- /dev/null
+++ b/drivers/pci/pcie/pcie_em.c
@@ -0,0 +1,473 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Auxiliary driver providing support for PCIe Native PCIe Enclosure
+ * Management, supporting both the PCIe NPEM extended capability and the
+ * _DSM interface (as defined in the PCI Firmware Specification Rev
+ * 3.3, chapter 4.7, "_DSM Definitions for PCIe SSD Status LED".
+ *
+ * Copyright (c) 2021 Dell Inc.
+ *
+ * TODO: Actually add NPEM extended capability support
+ *       Add pcieport & cxl support
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/enclosure.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/delay.h>
+#include "../pci.h"
+#include <linux/pcie_em.h>
+
+#define DRIVER_NAME	"pcie-em"
+#define DRIVER_VERSION	"v1.0"
+
+#define NPEM_EXT_CAP_H
+/*
+ * NPEM & _DSM use the same state bit definitions
+ */
+#define	NPEM_STATE_OK		BIT(2)
+#define	NPEM_STATE_LOCATE	BIT(3)
+#define	NPEM_STATE_FAILED	BIT(4)
+#define	NPEM_STATE_REBUILD	BIT(5)
+#define	NPEM_STATE_PFA		BIT(6)  /* predicted failure analysis */
+#define	NPEM_STATE_HOTSPARE	BIT(7)
+#define	NPEM_STATE_ICA		BIT(8)  /* in a critical array */
+#define	NPEM_STATE_IFA		BIT(9)  /* in a failed array */
+#define	NPEM_STATE_INVALID	BIT(10)
+#define	NPEM_STATE_DISABLED	BIT(11)
+#define NPEM_ALL_STATES		GENMASK(11, 2)
+
+static const u32 to_npem_state[ENCLOSURE_LED_MAX] = {
+	[ENCLOSURE_LED_FAULT]		= NPEM_STATE_FAILED,
+	[ENCLOSURE_LED_LOCATE]		= NPEM_STATE_LOCATE,
+	[ENCLOSURE_LED_OK]		= NPEM_STATE_OK,
+	[ENCLOSURE_LED_REBUILD]		= NPEM_STATE_REBUILD,
+	[ENCLOSURE_LED_PRDFAIL]		= NPEM_STATE_PFA,
+	[ENCLOSURE_LED_HOTSPARE]	= NPEM_STATE_HOTSPARE,
+	[ENCLOSURE_LED_ICA]		= NPEM_STATE_ICA,
+	[ENCLOSURE_LED_IFA]		= NPEM_STATE_IFA,
+	[ENCLOSURE_LED_DISABLED]	= NPEM_STATE_DISABLED,
+};
+
+/*
+ * pcie_em_dev->pdev could be the drive itself or the downstream port
+ * leading to it
+ */
+struct pcie_em_dev {
+	struct pci_dev *pdev;			/* dev with controls */
+	struct pcie_em_led_state_ops *ops;	/* _DSM or NPEM ops */
+	struct enclosure_device *edev;
+	u32 states;				/* last written states */
+	u32 supported_states;
+	u16 npem_pos;				/* position of NPEM ext cap */
+	struct work_struct npem_work;		/* NPEM control write work func */
+	unsigned long last_ctrl_write;		/* time of last NPEM ctrl write */
+};
+
+struct pcie_em_led_state_ops {
+	void (*init)(struct pcie_em_dev *emdev);
+	int (*get_supported_states)(struct pcie_em_dev *emdev);
+	int (*get_current_states)(struct pcie_em_dev *emdev, u32 *states);
+	int (*set_current_states)(struct pcie_em_dev *emdev);
+};
+
+static struct mutex pcie_em_lock;
+static int pcie_em_exiting;
+
+#ifdef CONFIG_ACPI
+/*
+ * _DSM LED control
+ */
+static const guid_t pcie_pcie_em_dsm_guid = PCIE_SSD_LEDS_DSM_GUID;
+
+struct pcie_em_dsm_output {
+	u16 status;
+	u8 function_specific_err;
+	u8 vendor_specific_err;
+	u32 state;
+};
+
+static void dsm_status_err_print(struct pci_dev *pdev,
+				 struct pcie_em_dsm_output *output)
+{
+	switch (output->status) {
+	case 0:
+		break;
+	case 1:
+		pci_dbg(pdev, "_DSM not supported\n");
+		break;
+	case 2:
+		pci_dbg(pdev, "_DSM invalid input parameters\n");
+		break;
+	case 3:
+		pci_dbg(pdev, "_DSM communication error\n");
+		break;
+	case 4:
+		pci_dbg(pdev, "_DSM function-specific error 0x%x\n",
+			output->function_specific_err);
+		break;
+	case 5:
+		pci_dbg(pdev, "_DSM vendor-specific error 0x%x\n",
+			output->vendor_specific_err);
+		break;
+	default:
+		pci_dbg(pdev, "_DSM returned unknown status 0x%x\n",
+			output->status);
+	}
+}
+
+static int dsm_set(struct pci_dev *pdev, u32 value)
+{
+	acpi_handle handle;
+	union acpi_object *out_obj, arg3[2];
+	struct pcie_em_dsm_output *dsm_output;
+
+	handle = ACPI_HANDLE(&pdev->dev);
+	if (!handle)
+		return -ENODEV;
+
+	arg3[0].type = ACPI_TYPE_PACKAGE;
+	arg3[0].package.count = 1;
+	arg3[0].package.elements = &arg3[1];
+
+	arg3[1].type = ACPI_TYPE_BUFFER;
+	arg3[1].buffer.length = 4;
+	arg3[1].buffer.pointer = (u8 *)&value;
+
+	out_obj = acpi_evaluate_dsm_typed(handle, &pcie_pcie_em_dsm_guid,
+				1, SET_STATE_DSM, &arg3[0], ACPI_TYPE_BUFFER);
+	if (!out_obj)
+		return -EIO;
+
+	if (out_obj->buffer.length < 8) {
+		ACPI_FREE(out_obj);
+		return -EIO;
+	}
+
+	dsm_output = (struct pcie_em_dsm_output *)out_obj->buffer.pointer;
+
+	if (dsm_output->status != 0) {
+		dsm_status_err_print(pdev, dsm_output);
+		ACPI_FREE(out_obj);
+		return -EIO;
+	}
+	ACPI_FREE(out_obj);
+	return 0;
+}
+
+static int dsm_get(struct pci_dev *pdev, u64 dsm_func, u32 *output)
+{
+	acpi_handle handle;
+	union acpi_object *out_obj;
+	struct pcie_em_dsm_output *dsm_output;
+
+	handle = ACPI_HANDLE(&pdev->dev);
+	if (!handle)
+		return -ENODEV;
+
+	out_obj = acpi_evaluate_dsm_typed(handle, &pcie_pcie_em_dsm_guid, 0x1,
+					  dsm_func, NULL, ACPI_TYPE_BUFFER);
+	if (!out_obj)
+		return -EIO;
+
+	if (out_obj->buffer.length < 8) {
+		ACPI_FREE(out_obj);
+		return -EIO;
+	}
+
+	dsm_output = (struct pcie_em_dsm_output *)out_obj->buffer.pointer;
+	if (dsm_output->status != 0) {
+		dsm_status_err_print(pdev, dsm_output);
+		ACPI_FREE(out_obj);
+		return -EIO;
+	}
+
+	*output = dsm_output->state;
+	ACPI_FREE(out_obj);
+	return 0;
+}
+
+static int get_supported_states_dsm(struct pcie_em_dev *emdev)
+{
+	return dsm_get(emdev->pdev, GET_SUPPORTED_STATES_DSM, &emdev->supported_states);
+}
+
+static int get_current_states_dsm(struct pcie_em_dev *emdev, u32 *states)
+{
+	return dsm_get(emdev->pdev, GET_STATE_DSM, states);
+}
+
+static int set_current_states_dsm(struct pcie_em_dev *emdev)
+{
+	return dsm_set(emdev->pdev, emdev->states);
+}
+
+static struct pcie_em_led_state_ops dsm_pcie_em_led_state_ops = {
+	.get_supported_states	= get_supported_states_dsm,
+	.get_current_states	= get_current_states_dsm,
+	.set_current_states	= set_current_states_dsm,
+};
+
+/*
+ * end of _DSM code
+ */
+
+#endif /* CONFIG_ACPI */
+
+/*
+ * NPEM LED control
+ */
+
+static inline int npem_write_ctrl(struct pcie_em_dev *emdev, u32 reg)
+{
+	struct pci_dev *pdev = emdev->pdev;
+
+	emdev->last_ctrl_write = jiffies;
+	return pci_write_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CTRL, reg);
+}
+
+static int get_supported_states_npem(struct pcie_em_dev *emdev)
+{
+	struct pci_dev *pdev = emdev->pdev;
+	u32 reg;
+	int ret;
+
+	ret = pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CAP, &reg);
+	if (!ret)
+		emdev->supported_states = reg & NPEM_ALL_STATES;
+	return ret;
+}
+
+static int get_current_states_npem(struct pcie_em_dev *emdev, u32 *states)
+{
+	struct pci_dev *pdev = emdev->pdev;
+	u32 reg;
+	int ret;
+
+	ret = pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CTRL, &reg);
+	if (!ret)
+		*states = reg & NPEM_ALL_STATES;
+	return ret;
+}
+
+static void npem_set_states_work(struct work_struct *w)
+{
+	struct pcie_em_dev *emdev = container_of(w, struct pcie_em_dev, npem_work);
+	struct pci_dev *pdev = emdev->pdev;
+	u32 status;
+
+	/*
+	 * per spec, wait up to 1 second for command completed status bit to be high
+	 */
+	while (true) {
+		if (pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_STATUS,
+					  &status))
+			return;
+		if ((status & PCI_NPEM_STATUS_CC) ||
+		    time_after(jiffies, emdev->last_ctrl_write + HZ)) {
+			break;
+		}
+		msleep(20);
+	}
+	npem_write_ctrl(emdev, emdev->states | PCI_NPEM_CTRL_EN);
+}
+
+static int set_current_states_npem(struct pcie_em_dev *emdev)
+{
+	schedule_work(&emdev->npem_work);
+	return 0;
+}
+
+static void npem_init(struct pcie_em_dev *emdev)
+{
+	struct pci_dev *pdev = emdev->pdev;
+
+	emdev->npem_pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_NPEM);
+	npem_write_ctrl(emdev, PCI_NPEM_CTRL_EN);
+	INIT_WORK(&emdev->npem_work, npem_set_states_work);
+}
+
+static struct pcie_em_led_state_ops npem_pcie_em_led_state_ops = {
+	.init			= npem_init,
+	.get_supported_states	= get_supported_states_npem,
+	.get_current_states	= get_current_states_npem,
+	.set_current_states	= set_current_states_npem,
+};
+
+/*
+ * end of NPEM code
+ */
+
+static void pcie_em_get_led(struct enclosure_device *edev,
+			     struct enclosure_component *ecomp,
+			     enum enclosure_component_led led)
+{
+	struct pcie_em_dev *emdev = ecomp->scratch;
+	u32 states = 0;
+
+	emdev->ops->get_current_states(emdev, &states);
+	ecomp->led[led] = !!(states & to_npem_state[led]) ?
+		ENCLOSURE_SETTING_ENABLED : ENCLOSURE_SETTING_DISABLED;
+}
+
+static int pcie_em_set_led(struct enclosure_device *edev,
+			 struct enclosure_component *ecomp,
+			 enum enclosure_component_led led,
+			 enum enclosure_component_setting val)
+{
+	struct pcie_em_dev *emdev = ecomp->scratch;
+	u32 npem_state = to_npem_state[led], states;
+	int rc;
+
+	if (val != ENCLOSURE_SETTING_ENABLED
+	    && val != ENCLOSURE_SETTING_DISABLED)
+		return -EINVAL;
+
+	states = emdev->states & ~npem_state;
+	states |= val == ENCLOSURE_SETTING_ENABLED ? npem_state : 0;
+
+	if ((states & emdev->supported_states) != states)
+		return -EINVAL;
+
+	emdev->states = states;
+	rc = emdev->ops->set_current_states(emdev);
+	/*
+	 * save last written state so it doesn't have to be re-read via NPEM/
+	 * _DSM on the next write
+	 */
+	return rc;
+}
+
+static struct enclosure_component_callbacks pcie_em_cb = {
+	.get_led	= pcie_em_get_led,
+	.set_led	= pcie_em_set_led,
+};
+
+static void pcie_em_remove(struct auxiliary_device *adev)
+{
+	struct pcie_em_dev *emdev = dev_get_drvdata(&adev->dev);
+
+	emdev->edev->scratch = NULL;
+	enclosure_unregister(emdev->edev);
+	dev_set_drvdata(&adev->dev, NULL);
+	kfree(emdev);
+}
+
+static int add_pcie_em_dev(struct auxiliary_device *adev,
+				struct pcie_em_led_state_ops *ops)
+{
+	struct pcie_em_dev *emdev;
+	struct pci_dev *pdev = to_pci_dev(adev->dev.parent);
+	struct enclosure_device *edev;
+	struct enclosure_component *ecomp;
+	int rc = 0;
+
+	mutex_lock(&pcie_em_lock);
+	if (pcie_em_exiting)
+		goto out_unlock;
+
+	emdev = kzalloc(sizeof(*emdev), GFP_KERNEL);
+	if (!emdev) {
+		rc = -ENOMEM;
+		goto out_unlock;
+	}
+	emdev->pdev = pdev;
+	emdev->ops = ops;
+	emdev->states = 0;
+	if (emdev->ops->init)
+		emdev->ops->init(emdev);
+
+	if (emdev->ops->get_supported_states(emdev) != 0)
+		goto out_free;
+
+	edev = enclosure_register(&pdev->dev, dev_name(&adev->dev), 1, &pcie_em_cb);
+	if (!edev)
+		goto out_free;
+
+	ecomp = enclosure_component_alloc(edev, 0, ENCLOSURE_COMPONENT_DEVICE, dev_name(&pdev->dev));
+	if (IS_ERR(ecomp))
+		goto out_unreg;
+
+	ecomp->type = ENCLOSURE_COMPONENT_ARRAY_DEVICE;
+	rc = enclosure_component_register(ecomp);
+	if (rc < 0)
+		goto out_unreg;
+
+	ecomp->scratch = emdev;
+	emdev->edev = edev;
+	dev_set_drvdata(&adev->dev, emdev);
+	goto out_unlock;
+
+out_unreg:
+	enclosure_unregister(edev);
+out_free:
+	kfree(emdev);
+out_unlock:
+	mutex_unlock(&pcie_em_lock);
+	return rc;
+}
+
+static int pcie_em_probe(struct auxiliary_device *adev,
+			  const struct auxiliary_device_id *id)
+{
+	struct pci_dev *pdev = to_pci_dev(adev->dev.parent);
+
+	/*
+	 * The PCI Firmware Spec Rev 3.3 says that the _DSM should be used
+	 * in preference to NPEM if available.
+	 */
+#ifdef CONFIG_ACPI
+	if (pci_has_pcie_em_dsm(pdev))
+		return add_pcie_em_dev(adev, &dsm_pcie_em_led_state_ops);
+#endif
+	if (pci_has_npem(pdev))
+		return add_pcie_em_dev(adev, &npem_pcie_em_led_state_ops);
+
+	/*
+	 * should not get here
+	 */
+	return -ENODEV;
+}
+
+static const struct auxiliary_device_id pcie_em_id_table[] = {
+	{ .name = "nvme.pcie_em", },
+//	{ .name = "cxl.pcie_em", },
+//	{ .name = "pcieport.pcie_em", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, pcie_em_id_table);
+
+static struct auxiliary_driver pcie_em_driver = {
+	.name = "pcie_em",
+	.probe = pcie_em_probe,
+	.remove = pcie_em_remove,
+	.id_table = pcie_em_id_table,
+};
+
+static int __init pcie_em_init(void)
+{
+	mutex_init(&pcie_em_lock);
+	auxiliary_driver_register(&pcie_em_driver);
+	return 0;
+}
+
+static void __exit pcie_em_exit(void)
+{
+	mutex_lock(&pcie_em_lock);
+	pcie_em_exiting = 1;
+	mutex_unlock(&pcie_em_lock);
+	auxiliary_driver_unregister(&pcie_em_driver);
+}
+
+module_init(pcie_em_init);
+module_exit(pcie_em_exit);
+
+MODULE_AUTHOR("Stuart Hayes <stuart.w.hayes@gmail.com>");
+MODULE_DESCRIPTION("Support for PCIe SSD Status LEDs");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/pcie_em.h b/include/linux/pcie_em.h
new file mode 100644
index 000000000000..26b6d68cccd3
--- /dev/null
+++ b/include/linux/pcie_em.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef DRIVERS_PCI_PCIE_EM_H
+#define DRIVERS_PCI_PCIE_EM_H
+
+#include <linux/pci.h>
+#include <linux/acpi.h>
+
+#define PCIE_SSD_LEDS_DSM_GUID						\
+	GUID_INIT(0x5d524d9d, 0xfff9, 0x4d4b,				\
+		  0x8c, 0xb7, 0x74, 0x7e, 0xd5, 0x1e, 0x19, 0x4d)
+
+#define GET_SUPPORTED_STATES_DSM	0x01
+#define GET_STATE_DSM			0x02
+#define SET_STATE_DSM			0x03
+
+static inline bool pci_has_pcie_em_dsm(struct pci_dev *pdev)
+{
+#ifdef CONFIG_ACPI
+	acpi_handle handle;
+	const guid_t pcie_ssd_leds_dsm_guid = PCIE_SSD_LEDS_DSM_GUID;
+
+	handle = ACPI_HANDLE(&pdev->dev);
+	if (handle)
+		if (acpi_check_dsm(handle, &pcie_ssd_leds_dsm_guid, 0x1,
+				   1 << GET_SUPPORTED_STATES_DSM ||
+				   1 << GET_STATE_DSM ||
+				   1 << SET_STATE_DSM))
+			return true;
+#endif
+	return false;
+}
+
+static inline bool pci_has_npem(struct pci_dev *pdev)
+{
+	int pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_NPEM);
+	u32 cap;
+
+	if (pos)
+		if (!pci_read_config_dword(pdev, pos + PCI_NPEM_CAP, &cap))
+			return cap & PCI_NPEM_CAP_NPEM_CAP;
+	return false;
+}
+
+static inline bool pci_has_enclosure_management(struct pci_dev *pdev)
+{
+	return pci_has_pcie_em_dsm(pdev) || pci_has_npem(pdev);
+}
+
+static inline void release_pcie_em_aux_device(struct device *dev)
+{
+	kfree(to_auxiliary_dev(dev));
+}
+
+static inline struct auxiliary_device *register_pcie_em_auxdev(struct device *dev, int id)
+{
+	struct auxiliary_device *adev;
+	int ret;
+
+	if (!pci_has_enclosure_management(to_pci_dev(dev)))
+		return NULL;
+
+	adev = kzalloc(sizeof(*adev), GFP_KERNEL);
+	if (!adev)
+		goto em_reg_out_err;
+
+	adev->name = "pcie_em";
+	adev->dev.parent = dev;
+	adev->dev.release = release_pcie_em_aux_device;
+	adev->id = id;
+
+	ret = auxiliary_device_init(adev);
+	if (ret < 0)
+		goto em_reg_out_free;
+
+	ret = auxiliary_device_add(adev);
+	if (ret) {
+		auxiliary_device_uninit(adev);
+		goto em_reg_out_free;
+	}
+
+	return adev;
+em_reg_out_free:
+	kfree(adev);
+em_reg_out_err:
+	dev_warn(dev, "failed to register pcie_em device\n");
+	return NULL;
+}
+
+static inline void unregister_pcie_em_auxdev(struct auxiliary_device *auxdev)
+{
+	if (auxdev) {
+		auxiliary_device_delete(auxdev);
+		auxiliary_device_uninit(auxdev);
+	}
+}
+
+#endif /* DRIVERS_PCI_PCIE_EM_H */
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index ff6ccbc6efe9..375540065c29 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -736,7 +736,8 @@
 #define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
-#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
+#define PCI_EXT_CAP_ID_NPEM	0x29	/* Native PCIe Enclosure Management */
+#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_NPEM
 
 #define PCI_EXT_CAP_DSN_SIZEOF	12
 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
@@ -1098,4 +1099,12 @@
 #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK		0x000000F0
 #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT	4
 
+/* NPEM */
+#define PCI_NPEM_CAP		0x04
+#define  PCI_NPEM_CAP_NPEM_CAP		0x00000001  /* dev is NPEM capable */
+#define PCI_NPEM_CTRL		0x08
+#define  PCI_NPEM_CTRL_EN		0x00000001  /* Enable NPEM */
+#define PCI_NPEM_STATUS		0x0c
+#define  PCI_NPEM_STATUS_CC		0x00000001  /* NPEM command completed */
+
 #endif /* LINUX_PCI_REGS_H */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 3/3] Register auxiliary device for PCIe enclosure management
  2022-02-02 17:59 [RFC PATCH v2 0/3] Add PCIe enclosure management support Stuart Hayes
  2022-02-02 17:59 ` [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver Stuart Hayes
  2022-02-02 17:59 ` [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver Stuart Hayes
@ 2022-02-02 17:59 ` Stuart Hayes
  2022-02-10 11:52 ` [RFC PATCH v2 0/3] Add PCIe enclosure management support Mariusz Tkaczyk
  3 siblings, 0 replies; 7+ messages in thread
From: Stuart Hayes @ 2022-02-02 17:59 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci, Dan Williams
  Cc: Keith Busch, kw, mariusz.tkaczyk, helgaas, lukas, pavel,
	linux-cxl, martin.petersen, James.Bottomley, Arnd Bergmann,
	Greg Kroah-Hartman, Stuart Hayes

This patch will register an auxiliary devivce for the pcie_em driver,
allowing it to attach and provide access to PCIe enclosure management
(LEDs), if present.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
 drivers/nvme/host/pci.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ca2ee806d74b..22ff10798333 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -27,6 +27,8 @@
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/sed-opal.h>
 #include <linux/pci-p2pdma.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/pcie_em.h>
 
 #include "trace.h"
 #include "nvme.h"
@@ -159,6 +161,9 @@ struct nvme_dev {
 	unsigned int nr_poll_queues;
 
 	bool attrs_added;
+
+	/* auxiliary_device for pcie_em driver (LEDs) */
+	struct auxiliary_device *pcie_em_auxdev;
 };
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
@@ -2838,6 +2843,11 @@ static void nvme_reset_work(struct work_struct *work)
 		nvme_unfreeze(&dev->ctrl);
 	}
 
+	/*
+	 * register auxiliary device if this supports PCIe enclosure management
+	 */
+	dev->pcie_em_auxdev = register_pcie_em_auxdev(dev->dev, dev->ctrl.instance);
+
 	/*
 	 * If only admin queue live, keep it to do further investigation or
 	 * recovery.
@@ -3135,6 +3145,7 @@ static void nvme_remove(struct pci_dev *pdev)
 	nvme_free_queues(dev, 0);
 	nvme_release_prp_pools(dev);
 	nvme_dev_unmap(dev);
+	unregister_pcie_em_auxdev(dev->pcie_em_auxdev);
 	nvme_uninit_ctrl(&dev->ctrl);
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver
  2022-02-02 17:59 ` [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver Stuart Hayes
@ 2022-02-02 23:55   ` Bjorn Helgaas
  0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2022-02-02 23:55 UTC (permalink / raw)
  To: Stuart Hayes
  Cc: Bjorn Helgaas, linux-pci, Dan Williams, Keith Busch, kw,
	mariusz.tkaczyk, lukas, pavel, linux-cxl, martin.petersen,
	James.Bottomley, Arnd Bergmann, Greg Kroah-Hartman

Follow subject line convention (also applies to other patches):

  $ git log --oneline drivers/misc/enclosure.c
  714f1af14bb0 misc: enclosure: replace snprintf in show functions with sysfs_emit
  6a57251c70a4 misc: enclosure: Update enclosure_remove_device() documentation to match reality
  82f5b473d91a misc: enclosure: Fix some kerneldoc anomalies
  529244bd1afc scsi: enclosure: Fix stale device oops with hot replug
  e3575c1201f0 misc: enclosure: Use struct_size() in kzalloc()
  750b54deb569 misc: enclosure: Remove unnecessary error check

I don't think "seven" is really relevant for the subject or even the
commit log since you list them all anyway.

On Wed, Feb 02, 2022 at 11:59:11AM -0600, Stuart Hayes wrote:
> This patch adds support for seven additional indicators (ok, rebuild,
> prdfail, hotspare, ica, ifa, disabled) to the enclosure driver, which
> currently only supports three (fault, active, locate). It also reduces
> duplicated code for the set and show functions for the sysfs attributes
> for all of the indicators, and allows users of the driver to provide
> common get_led and set_led callbacks to control all indicators (though
> it continues to support the existing callbacks for the three currently
> supported indicators, so it does not require any changes to code that
> already uses the enclosure driver).

This restructures things (replacing get_component_fault(),
set_component_fault(), etc with attributes) and adds things (the new
indicators and also maybe the .get_led() and .set_led() hooks).  It
would be nice to split all those into separate patches.

Also consider using imperative mood:
  https://chris.beams.io/posts/git-commit/
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-tip.rst?id=v5.16#n134

A few more minor comments below.

> @@ -0,0 +1,14 @@
> +What:		/sys/class/enclosure/<enclosure>/<component>/rebuild
> +What:		/sys/class/enclosure/<enclosure>/<component>/disabled
> +What:		/sys/class/enclosure/<enclosure>/<component>/hotspare
> +What:		/sys/class/enclosure/<enclosure>/<component>/ica
> +What:		/sys/class/enclosure/<enclosure>/<component>/ifa
> +What:		/sys/class/enclosure/<enclosure>/<component>/ok
> +What:		/sys/class/enclosure/<enclosure>/<component>/prdfail
> +Date:		February 2022
> +Contact:	Stuart Hayes <stuart.w.hayes@gmail.com>
> +Description:
> +		(RW) Get or set the indicator (1 is on, 0 is off) that
> +		corresponds to the attribute name, for a component in an
> +		enclosure. The "ica" and "ifa" states are "in critical
> +		array" and "in failed array".

What's "prdfail"?

> diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
> index 1b010d9267c9..485d6a3c655b 100644
> --- a/drivers/misc/enclosure.c
> +++ b/drivers/misc/enclosure.c
> @@ -473,30 +473,6 @@ static const char *const enclosure_type[] = {
>  	[ENCLOSURE_COMPONENT_ARRAY_DEVICE] = "array device",
>  };
>  
> -static ssize_t get_component_fault(struct device *cdev,
> -				   struct device_attribute *attr, char *buf)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -
> -	if (edev->cb->get_fault)
> -		edev->cb->get_fault(edev, ecomp);
> -	return sysfs_emit(buf, "%d\n", ecomp->fault);
> -}
> -
> -static ssize_t set_component_fault(struct device *cdev,
> -				   struct device_attribute *attr,
> -				   const char *buf, size_t count)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -	int val = simple_strtoul(buf, NULL, 0);
> -
> -	if (edev->cb->set_fault)
> -		edev->cb->set_fault(edev, ecomp, val);
> -	return count;
> -}
> -
>  static ssize_t get_component_status(struct device *cdev,
>  				    struct device_attribute *attr,char *buf)
>  {
> @@ -531,54 +507,6 @@ static ssize_t set_component_status(struct device *cdev,
>  		return -EINVAL;
>  }
>  
> -static ssize_t get_component_active(struct device *cdev,
> -				    struct device_attribute *attr, char *buf)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -
> -	if (edev->cb->get_active)
> -		edev->cb->get_active(edev, ecomp);
> -	return sysfs_emit(buf, "%d\n", ecomp->active);
> -}
> -
> -static ssize_t set_component_active(struct device *cdev,
> -				    struct device_attribute *attr,
> -				    const char *buf, size_t count)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -	int val = simple_strtoul(buf, NULL, 0);
> -
> -	if (edev->cb->set_active)
> -		edev->cb->set_active(edev, ecomp, val);
> -	return count;
> -}
> -
> -static ssize_t get_component_locate(struct device *cdev,
> -				    struct device_attribute *attr, char *buf)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -
> -	if (edev->cb->get_locate)
> -		edev->cb->get_locate(edev, ecomp);
> -	return sysfs_emit(buf, "%d\n", ecomp->locate);
> -}
> -
> -static ssize_t set_component_locate(struct device *cdev,
> -				    struct device_attribute *attr,
> -				    const char *buf, size_t count)
> -{
> -	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> -	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> -	int val = simple_strtoul(buf, NULL, 0);
> -
> -	if (edev->cb->set_locate)
> -		edev->cb->set_locate(edev, ecomp, val);
> -	return count;
> -}
> -
>  static ssize_t get_component_power_status(struct device *cdev,
>  					  struct device_attribute *attr,
>  					  char *buf)
> @@ -641,29 +569,136 @@ static ssize_t get_component_slot(struct device *cdev,
>  	return sysfs_emit(buf, "%d\n", slot);
>  }
>  
> -static DEVICE_ATTR(fault, S_IRUGO | S_IWUSR, get_component_fault,
> -		    set_component_fault);
> +/*
> + * callbacks for attrs using enum enclosure_component_setting (LEDs)
> + */
> +static ssize_t led_show(struct device *cdev,
> +			enum enclosure_component_led led,
> +			char *buf)
> +{
> +	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> +	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> +
> +	if (edev->cb->get_led)
> +		edev->cb->get_led(edev, ecomp, led);
> +	else

The switch is technically one statement, but it's so long that I think
it really needs braces around it.

> +		/*
> +		 * support old callbacks for fault/active/locate
> +		 */
> +		switch (led) {
> +		case ENCLOSURE_LED_FAULT:
> +			if (edev->cb->get_fault) {
> +				edev->cb->get_fault(edev, ecomp);
> +				ecomp->led[led] = ecomp->fault;
> +			}
> +			break;
> +		case ENCLOSURE_LED_ACTIVE:
> +			if (edev->cb->get_active) {
> +				edev->cb->get_active(edev, ecomp);
> +				ecomp->led[led] = ecomp->active;
> +			}
> +			break;
> +		case ENCLOSURE_LED_LOCATE:
> +			if (edev->cb->get_locate) {
> +				edev->cb->get_locate(edev, ecomp);
> +				ecomp->led[led] = ecomp->locate;
> +			}
> +			break;
> +		default:
> +			break;
> +		}
> +
> +	return sysfs_emit(buf, "%d\n", ecomp->led[led]);
> +}
> +
> +static ssize_t led_set(struct device *cdev,
> +		       enum enclosure_component_led led,
> +		       const char *buf, size_t count)
> +{
> +	struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> +	struct enclosure_component *ecomp = to_enclosure_component(cdev);
> +	int err, val;
> +
> +	err = kstrtoint(buf, 0, &val);
> +	if (err)
> +		return err;
> +
> +	if (edev->cb->set_led)
> +		edev->cb->set_led(edev, ecomp, led, val);
> +	else

Same here.  Or you could just return early for the cb->set_led case
and unindent the below.

> +		/*
> +		 * support old callbacks for fault/active/locate
> +		 */
> +		switch (led) {
> +		case ENCLOSURE_LED_FAULT:
> +			if (edev->cb->set_fault)
> +				edev->cb->set_fault(edev, ecomp, val);
> +			break;
> +		case ENCLOSURE_LED_ACTIVE:
> +			if (edev->cb->set_active)
> +				edev->cb->set_active(edev, ecomp, val);
> +			break;
> +		case ENCLOSURE_LED_LOCATE:
> +			if (edev->cb->set_locate)
> +				edev->cb->set_locate(edev, ecomp, val);
> +			break;
> +		default:
> +			break;
> +		}

I guess you rely on the callbacks to set ecomp->led[led] (or
ecomp->fault, etc)?

Maybe do the "ecomp->led[led] = ecomp->fault" bits here instead of in
led_show() so that crashdumps will show the same info regardless of
whether led_show() has been run?  ...

Oh, never mind, I see that you only ever update led[led] in the
.get_led() callbacks.  In that case, why do you even *store* the
result in led[led]?  Couldn't you just return it from .get_led()?

> +	return count;
> +}
> +
> +#define LED_ATTR_RW(led_attr, led)					\
> +static ssize_t led_attr##_show(struct device *cdev,			\
> +			       struct device_attribute *attr,		\
> +			       char *buf)				\
> +{									\
> +	return led_show(cdev, led, buf);				\
> +}									\
> +static ssize_t led_attr##_store(struct device *cdev,			\
> +				struct device_attribute *attr,		\
> +				const char *buf, size_t count)		\
> +{									\
> +	return led_set(cdev, led, buf, count);				\
> +}									\
> +static DEVICE_ATTR_RW(led_attr)
> +
>  static DEVICE_ATTR(status, S_IRUGO | S_IWUSR, get_component_status,
>  		   set_component_status);
> -static DEVICE_ATTR(active, S_IRUGO | S_IWUSR, get_component_active,
> -		   set_component_active);
> -static DEVICE_ATTR(locate, S_IRUGO | S_IWUSR, get_component_locate,
> -		   set_component_locate);
>  static DEVICE_ATTR(power_status, S_IRUGO | S_IWUSR, get_component_power_status,
>  		   set_component_power_status);
>  static DEVICE_ATTR(type, S_IRUGO, get_component_type, NULL);
>  static DEVICE_ATTR(slot, S_IRUGO, get_component_slot, NULL);
> +LED_ATTR_RW(fault, ENCLOSURE_LED_FAULT);
> +LED_ATTR_RW(active, ENCLOSURE_LED_ACTIVE);
> +LED_ATTR_RW(locate, ENCLOSURE_LED_LOCATE);
> +LED_ATTR_RW(ok, ENCLOSURE_LED_OK);
> +LED_ATTR_RW(rebuild, ENCLOSURE_LED_REBUILD);
> +LED_ATTR_RW(prdfail, ENCLOSURE_LED_PRDFAIL);
> +LED_ATTR_RW(hotspare, ENCLOSURE_LED_HOTSPARE);
> +LED_ATTR_RW(ica, ENCLOSURE_LED_ICA);
> +LED_ATTR_RW(ifa, ENCLOSURE_LED_IFA);
> +LED_ATTR_RW(disabled, ENCLOSURE_LED_DISABLED);
>  
>  static struct attribute *enclosure_component_attrs[] = {
>  	&dev_attr_fault.attr,
>  	&dev_attr_status.attr,
>  	&dev_attr_active.attr,
>  	&dev_attr_locate.attr,
> +	&dev_attr_ok.attr,
> +	&dev_attr_rebuild.attr,
> +	&dev_attr_prdfail.attr,
> +	&dev_attr_hotspare.attr,
> +	&dev_attr_ica.attr,
> +	&dev_attr_ifa.attr,
> +	&dev_attr_disabled.attr,
>  	&dev_attr_power_status.attr,
>  	&dev_attr_type.attr,
>  	&dev_attr_slot.attr,
>  	NULL
>  };
> +

Spurious diff?  I see both with and without this blank line, but seems
more common without.

>  ATTRIBUTE_GROUPS(enclosure_component);

>  
>  static int __init enclosure_init(void)
> diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h
> index 1c630e2c2756..98dd0f05efb7 100644
> --- a/include/linux/enclosure.h
> +++ b/include/linux/enclosure.h
> @@ -49,6 +49,20 @@ enum enclosure_component_setting {
>  	ENCLOSURE_SETTING_BLINK_B_OFF_ON = 7,
>  };
>  
> +enum enclosure_component_led {
> +	ENCLOSURE_LED_FAULT,
> +	ENCLOSURE_LED_ACTIVE,
> +	ENCLOSURE_LED_LOCATE,
> +	ENCLOSURE_LED_OK,
> +	ENCLOSURE_LED_REBUILD,
> +	ENCLOSURE_LED_PRDFAIL,
> +	ENCLOSURE_LED_HOTSPARE,
> +	ENCLOSURE_LED_ICA,
> +	ENCLOSURE_LED_IFA,
> +	ENCLOSURE_LED_DISABLED,
> +	ENCLOSURE_LED_MAX,
> +};
> +
>  struct enclosure_device;
>  struct enclosure_component;
>  struct enclosure_component_callbacks {
> @@ -72,6 +86,13 @@ struct enclosure_component_callbacks {
>  	int (*set_locate)(struct enclosure_device *,
>  			  struct enclosure_component *,
>  			  enum enclosure_component_setting);
> +	void (*get_led)(struct enclosure_device *edev,
> +			struct enclosure_component *ecomp,
> +			enum enclosure_component_led);
> +	int (*set_led)(struct enclosure_device *edev,
> +		       struct enclosure_component *ecomp,
> +		       enum enclosure_component_led,
> +		       enum enclosure_component_setting);
>  	void (*get_power_status)(struct enclosure_device *,
>  				 struct enclosure_component *);
>  	int (*set_power_status)(struct enclosure_device *,
> @@ -90,6 +111,7 @@ struct enclosure_component {
>  	int fault;
>  	int active;
>  	int locate;
> +	int led[ENCLOSURE_LED_MAX];
>  	int slot;
>  	enum enclosure_status status;
>  	int power_status;
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver
  2022-02-02 17:59 ` [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver Stuart Hayes
@ 2022-02-03  0:58   ` Bjorn Helgaas
  0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2022-02-03  0:58 UTC (permalink / raw)
  To: Stuart Hayes
  Cc: Bjorn Helgaas, linux-pci, Dan Williams, Keith Busch, kw,
	mariusz.tkaczyk, lukas, pavel, linux-cxl, martin.petersen,
	James.Bottomley, Arnd Bergmann, Greg Kroah-Hartman

On Wed, Feb 02, 2022 at 11:59:12AM -0600, Stuart Hayes wrote:
> This patch adds a new auxiliary driver to support PCIe enclosure
> management (NPEM and _DSM).  Drivers whose PCIe devices could have
> NPEM and/or _DSM enclosure management support could register an
> auxiliary device, and this driver will attach to that and register an
> enclosure to allow user space control of the associated state
> indicators.
> 
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> ---
>  drivers/pci/pcie/Kconfig      |   8 +
>  drivers/pci/pcie/Makefile     |   1 +
>  drivers/pci/pcie/pcie_em.c    | 473 ++++++++++++++++++++++++++++++++++

Despite having "PCIe" in the name, NPEM has nothing in particular to
do with PCIe, and does not require portdrv or any PCIe-specific
services.  If we do this, I think this should go in drivers/pci, not
in drivers/pci/pcie/.

And "pcie_em" is both unnecessarily specific ("pcie", when it's not
PCIe-specific and could be completely via ACPI) and not descriptive
enough ("em" doesn't give a clue about what this is).

>  include/linux/pcie_em.h       |  97 +++++++
>  include/uapi/linux/pci_regs.h |  11 +-
>  5 files changed, 589 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/pci/pcie/pcie_em.c
>  create mode 100644 include/linux/pcie_em.h
> 
> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> index 45a2ef702b45..deaaf00c0d38 100644
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -142,3 +142,11 @@ config PCIE_EDR
>  	  the PCI Firmware Specification r3.2.  Enable this if you want to
>  	  support hybrid DPC model which uses both firmware and OS to
>  	  implement DPC.
> +
> +config PCIE_EM
> +	tristate "PCIe Enclosure Management support"
> +	depends on ENCLOSURE_SERVICES
> +	help
> +	  Auxiliary driver for PCI Express enclosure management (support
> +	  for Native PCIe Enclosure Management (NPEM) and the related _DSM
> +	  interface for status LED management).
> diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> index 5783a2f79e6a..3b2b160b6f13 100644
> --- a/drivers/pci/pcie/Makefile
> +++ b/drivers/pci/pcie/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME)		+= pme.o
>  obj-$(CONFIG_PCIE_DPC)		+= dpc.o
>  obj-$(CONFIG_PCIE_PTM)		+= ptm.o
>  obj-$(CONFIG_PCIE_EDR)		+= edr.o
> +obj-$(CONFIG_PCIE_EM)		+= pcie_em.o
> diff --git a/drivers/pci/pcie/pcie_em.c b/drivers/pci/pcie/pcie_em.c
> new file mode 100644
> index 000000000000..fe6ec3aaa561
> --- /dev/null
> +++ b/drivers/pci/pcie/pcie_em.c
> @@ -0,0 +1,473 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Auxiliary driver providing support for PCIe Native PCIe Enclosure
> + * Management, supporting both the PCIe NPEM extended capability and the
> + * _DSM interface (as defined in the PCI Firmware Specification Rev
> + * 3.3, chapter 4.7, "_DSM Definitions for PCIe SSD Status LED".
> + *
> + * Copyright (c) 2021 Dell Inc.
> + *
> + * TODO: Actually add NPEM extended capability support

Looks like you have NPEM support now?

> + *       Add pcieport & cxl support

Maybe I spoke too soon above.  What do you envision needing for
pcieport support?

> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

Does this actually work?  The only messages I see below are pci_dbg(),
and they would use dev_fmt, not pr_fmt.

> +#include <linux/acpi.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/pci.h>
> +#include <linux/enclosure.h>
> +#include <linux/auxiliary_bus.h>
> +#include <linux/delay.h>
> +#include "../pci.h"
> +#include <linux/pcie_em.h>
> +
> +#define DRIVER_NAME	"pcie-em"
> +#define DRIVER_VERSION	"v1.0"
> +
> +#define NPEM_EXT_CAP_H

Unused?

> +/*
> + * NPEM & _DSM use the same state bit definitions
> + */
> +#define	NPEM_STATE_OK		BIT(2)
> +#define	NPEM_STATE_LOCATE	BIT(3)
> +#define	NPEM_STATE_FAILED	BIT(4)
> +#define	NPEM_STATE_REBUILD	BIT(5)
> +#define	NPEM_STATE_PFA		BIT(6)  /* predicted failure analysis */
> +#define	NPEM_STATE_HOTSPARE	BIT(7)
> +#define	NPEM_STATE_ICA		BIT(8)  /* in a critical array */
> +#define	NPEM_STATE_IFA		BIT(9)  /* in a failed array */
> +#define	NPEM_STATE_INVALID	BIT(10)
> +#define	NPEM_STATE_DISABLED	BIT(11)
> +#define NPEM_ALL_STATES		GENMASK(11, 2)
> +
> +static const u32 to_npem_state[ENCLOSURE_LED_MAX] = {
> +	[ENCLOSURE_LED_FAULT]		= NPEM_STATE_FAILED,
> +	[ENCLOSURE_LED_LOCATE]		= NPEM_STATE_LOCATE,
> +	[ENCLOSURE_LED_OK]		= NPEM_STATE_OK,
> +	[ENCLOSURE_LED_REBUILD]		= NPEM_STATE_REBUILD,
> +	[ENCLOSURE_LED_PRDFAIL]		= NPEM_STATE_PFA,
> +	[ENCLOSURE_LED_HOTSPARE]	= NPEM_STATE_HOTSPARE,
> +	[ENCLOSURE_LED_ICA]		= NPEM_STATE_ICA,
> +	[ENCLOSURE_LED_IFA]		= NPEM_STATE_IFA,
> +	[ENCLOSURE_LED_DISABLED]	= NPEM_STATE_DISABLED,
> +};
> +
> +/*
> + * pcie_em_dev->pdev could be the drive itself or the downstream port
> + * leading to it

Which downstream port?  The closest one?  Or any downstream port in
the path?

> + */
> +struct pcie_em_dev {
> +	struct pci_dev *pdev;			/* dev with controls */
> +	struct pcie_em_led_state_ops *ops;	/* _DSM or NPEM ops */
> +	struct enclosure_device *edev;
> +	u32 states;				/* last written states */
> +	u32 supported_states;
> +	u16 npem_pos;				/* position of NPEM ext cap */
> +	struct work_struct npem_work;		/* NPEM control write work func */
> +	unsigned long last_ctrl_write;		/* time of last NPEM ctrl write */

Wrap or better yet, shorten, to fit in 80 columns.  Also other
instances below.

> +};
> +
> +struct pcie_em_led_state_ops {
> +	void (*init)(struct pcie_em_dev *emdev);
> +	int (*get_supported_states)(struct pcie_em_dev *emdev);
> +	int (*get_current_states)(struct pcie_em_dev *emdev, u32 *states);
> +	int (*set_current_states)(struct pcie_em_dev *emdev);
> +};
> +
> +static struct mutex pcie_em_lock;
> +static int pcie_em_exiting;

Hmmm.  I'm not a module expert, but I'm looking sort of sideways at
this "exiting" thing.  Is this a common design pattern?

> +#ifdef CONFIG_ACPI
> +/*
> + * _DSM LED control
> + */
> +static const guid_t pcie_pcie_em_dsm_guid = PCIE_SSD_LEDS_DSM_GUID;
> +
> +struct pcie_em_dsm_output {
> +	u16 status;
> +	u8 function_specific_err;
> +	u8 vendor_specific_err;
> +	u32 state;
> +};
> +
> +static void dsm_status_err_print(struct pci_dev *pdev,
> +				 struct pcie_em_dsm_output *output)
> +{
> +	switch (output->status) {
> +	case 0:
> +		break;
> +	case 1:
> +		pci_dbg(pdev, "_DSM not supported\n");
> +		break;
> +	case 2:
> +		pci_dbg(pdev, "_DSM invalid input parameters\n");
> +		break;
> +	case 3:
> +		pci_dbg(pdev, "_DSM communication error\n");
> +		break;
> +	case 4:
> +		pci_dbg(pdev, "_DSM function-specific error 0x%x\n",
> +			output->function_specific_err);
> +		break;
> +	case 5:
> +		pci_dbg(pdev, "_DSM vendor-specific error 0x%x\n",
> +			output->vendor_specific_err);
> +		break;
> +	default:
> +		pci_dbg(pdev, "_DSM returned unknown status 0x%x\n",
> +			output->status);
> +	}
> +}
> +
> +static int dsm_set(struct pci_dev *pdev, u32 value)
> +{
> +	acpi_handle handle;
> +	union acpi_object *out_obj, arg3[2];
> +	struct pcie_em_dsm_output *dsm_output;
> +
> +	handle = ACPI_HANDLE(&pdev->dev);
> +	if (!handle)
> +		return -ENODEV;
> +
> +	arg3[0].type = ACPI_TYPE_PACKAGE;
> +	arg3[0].package.count = 1;
> +	arg3[0].package.elements = &arg3[1];
> +
> +	arg3[1].type = ACPI_TYPE_BUFFER;
> +	arg3[1].buffer.length = 4;
> +	arg3[1].buffer.pointer = (u8 *)&value;
> +
> +	out_obj = acpi_evaluate_dsm_typed(handle, &pcie_pcie_em_dsm_guid,
> +				1, SET_STATE_DSM, &arg3[0], ACPI_TYPE_BUFFER);
> +	if (!out_obj)
> +		return -EIO;
> +
> +	if (out_obj->buffer.length < 8) {
> +		ACPI_FREE(out_obj);
> +		return -EIO;
> +	}
> +
> +	dsm_output = (struct pcie_em_dsm_output *)out_obj->buffer.pointer;
> +
> +	if (dsm_output->status != 0) {
> +		dsm_status_err_print(pdev, dsm_output);
> +		ACPI_FREE(out_obj);
> +		return -EIO;
> +	}
> +	ACPI_FREE(out_obj);
> +	return 0;
> +}
> +
> +static int dsm_get(struct pci_dev *pdev, u64 dsm_func, u32 *output)
> +{
> +	acpi_handle handle;
> +	union acpi_object *out_obj;
> +	struct pcie_em_dsm_output *dsm_output;
> +
> +	handle = ACPI_HANDLE(&pdev->dev);
> +	if (!handle)
> +		return -ENODEV;
> +
> +	out_obj = acpi_evaluate_dsm_typed(handle, &pcie_pcie_em_dsm_guid, 0x1,
> +					  dsm_func, NULL, ACPI_TYPE_BUFFER);
> +	if (!out_obj)
> +		return -EIO;
> +
> +	if (out_obj->buffer.length < 8) {
> +		ACPI_FREE(out_obj);
> +		return -EIO;
> +	}
> +
> +	dsm_output = (struct pcie_em_dsm_output *)out_obj->buffer.pointer;
> +	if (dsm_output->status != 0) {
> +		dsm_status_err_print(pdev, dsm_output);
> +		ACPI_FREE(out_obj);
> +		return -EIO;
> +	}
> +
> +	*output = dsm_output->state;
> +	ACPI_FREE(out_obj);
> +	return 0;
> +}
> +
> +static int get_supported_states_dsm(struct pcie_em_dev *emdev)
> +{
> +	return dsm_get(emdev->pdev, GET_SUPPORTED_STATES_DSM, &emdev->supported_states);
> +}
> +
> +static int get_current_states_dsm(struct pcie_em_dev *emdev, u32 *states)
> +{
> +	return dsm_get(emdev->pdev, GET_STATE_DSM, states);
> +}
> +
> +static int set_current_states_dsm(struct pcie_em_dev *emdev)
> +{
> +	return dsm_set(emdev->pdev, emdev->states);
> +}
> +
> +static struct pcie_em_led_state_ops dsm_pcie_em_led_state_ops = {
> +	.get_supported_states	= get_supported_states_dsm,
> +	.get_current_states	= get_current_states_dsm,
> +	.set_current_states	= set_current_states_dsm,
> +};
> +
> +/*
> + * end of _DSM code
> + */
> +
> +#endif /* CONFIG_ACPI */
> +
> +/*
> + * NPEM LED control
> + */
> +
> +static inline int npem_write_ctrl(struct pcie_em_dev *emdev, u32 reg)
> +{
> +	struct pci_dev *pdev = emdev->pdev;
> +
> +	emdev->last_ctrl_write = jiffies;
> +	return pci_write_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CTRL, reg);
> +}
> +
> +static int get_supported_states_npem(struct pcie_em_dev *emdev)
> +{
> +	struct pci_dev *pdev = emdev->pdev;
> +	u32 reg;
> +	int ret;
> +
> +	ret = pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CAP, &reg);
> +	if (!ret)
> +		emdev->supported_states = reg & NPEM_ALL_STATES;
> +	return ret;
> +}
> +
> +static int get_current_states_npem(struct pcie_em_dev *emdev, u32 *states)
> +{
> +	struct pci_dev *pdev = emdev->pdev;
> +	u32 reg;
> +	int ret;
> +
> +	ret = pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_CTRL, &reg);
> +	if (!ret)
> +		*states = reg & NPEM_ALL_STATES;
> +	return ret;
> +}
> +
> +static void npem_set_states_work(struct work_struct *w)
> +{
> +	struct pcie_em_dev *emdev = container_of(w, struct pcie_em_dev, npem_work);
> +	struct pci_dev *pdev = emdev->pdev;
> +	u32 status;
> +
> +	/*
> +	 * per spec, wait up to 1 second for command completed status bit to be high

Include spec citation, please.  "PCIe r6.0, sec 7.9.19.4," I think.

Capitalize these like normal English sentences.

> +	 */
> +	while (true) {
> +		if (pci_read_config_dword(pdev, emdev->npem_pos + PCI_NPEM_STATUS,
> +					  &status))
> +			return;
> +		if ((status & PCI_NPEM_STATUS_CC) ||
> +		    time_after(jiffies, emdev->last_ctrl_write + HZ)) {
> +			break;
> +		}
> +		msleep(20);
> +	}
> +	npem_write_ctrl(emdev, emdev->states | PCI_NPEM_CTRL_EN);
> +}
> +
> +static int set_current_states_npem(struct pcie_em_dev *emdev)
> +{
> +	schedule_work(&emdev->npem_work);
> +	return 0;
> +}
> +
> +static void npem_init(struct pcie_em_dev *emdev)
> +{
> +	struct pci_dev *pdev = emdev->pdev;
> +
> +	emdev->npem_pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_NPEM);

I don't really like the fact that we don't check for 0 here.  I know
there's something up the chain that checks this, but it's a fair
amount of work to verify that.  I'd like it to be completely obvious
that this isn't going to poke at the wrong things in config space.

Maybe .init() should be able to return failure?

> +	npem_write_ctrl(emdev, PCI_NPEM_CTRL_EN);
> +	INIT_WORK(&emdev->npem_work, npem_set_states_work);
> +}
> +
> +static struct pcie_em_led_state_ops npem_pcie_em_led_state_ops = {
> +	.init			= npem_init,
> +	.get_supported_states	= get_supported_states_npem,
> +	.get_current_states	= get_current_states_npem,
> +	.set_current_states	= set_current_states_npem,
> +};
> +
> +/*
> + * end of NPEM code
> + */
> +
> +static void pcie_em_get_led(struct enclosure_device *edev,
> +			     struct enclosure_component *ecomp,
> +			     enum enclosure_component_led led)
> +{
> +	struct pcie_em_dev *emdev = ecomp->scratch;
> +	u32 states = 0;
> +
> +	emdev->ops->get_current_states(emdev, &states);
> +	ecomp->led[led] = !!(states & to_npem_state[led]) ?
> +		ENCLOSURE_SETTING_ENABLED : ENCLOSURE_SETTING_DISABLED;

Personal bias: I don't really care for "!!".  Does it add something
here?  It looks like this would work the same without it.

> +}
> +
> +static int pcie_em_set_led(struct enclosure_device *edev,
> +			 struct enclosure_component *ecomp,
> +			 enum enclosure_component_led led,
> +			 enum enclosure_component_setting val)
> +{
> +	struct pcie_em_dev *emdev = ecomp->scratch;
> +	u32 npem_state = to_npem_state[led], states;
> +	int rc;
> +
> +	if (val != ENCLOSURE_SETTING_ENABLED
> +	    && val != ENCLOSURE_SETTING_DISABLED)
> +		return -EINVAL;
> +
> +	states = emdev->states & ~npem_state;
> +	states |= val == ENCLOSURE_SETTING_ENABLED ? npem_state : 0;
> +
> +	if ((states & emdev->supported_states) != states)
> +		return -EINVAL;
> +
> +	emdev->states = states;
> +	rc = emdev->ops->set_current_states(emdev);

No need for "rc"; you can just

  return emdev->ops->set_current_states(emdev);

> +	/*
> +	 * save last written state so it doesn't have to be re-read via NPEM/
> +	 * _DSM on the next write
> +	 */

I'm guessing this comment applies to "emdev->states = states" above,
and should be moved before that?

It doesn't look like _DSM has any need to reread anything; you could
just call "emdev->ops->set_current_states(emdev, states)" for it.

But for NPEM it looks like you might need to save the states for the
scheduled work.  Maybe the comment needs to omit _DSM?  But even NPEM
doesn't *reread* anything.  I guess I just don't understand the
comment at all.

> +	return rc;
> +}
> +
> +static struct enclosure_component_callbacks pcie_em_cb = {
> +	.get_led	= pcie_em_get_led,
> +	.set_led	= pcie_em_set_led,
> +};
> +
> +static void pcie_em_remove(struct auxiliary_device *adev)
> +{
> +	struct pcie_em_dev *emdev = dev_get_drvdata(&adev->dev);
> +
> +	emdev->edev->scratch = NULL;
> +	enclosure_unregister(emdev->edev);
> +	dev_set_drvdata(&adev->dev, NULL);
> +	kfree(emdev);
> +}
> +
> +static int add_pcie_em_dev(struct auxiliary_device *adev,
> +				struct pcie_em_led_state_ops *ops)
> +{
> +	struct pcie_em_dev *emdev;
> +	struct pci_dev *pdev = to_pci_dev(adev->dev.parent);
> +	struct enclosure_device *edev;
> +	struct enclosure_component *ecomp;
> +	int rc = 0;
> +
> +	mutex_lock(&pcie_em_lock);
> +	if (pcie_em_exiting)
> +		goto out_unlock;
> +
> +	emdev = kzalloc(sizeof(*emdev), GFP_KERNEL);
> +	if (!emdev) {
> +		rc = -ENOMEM;
> +		goto out_unlock;
> +	}
> +	emdev->pdev = pdev;
> +	emdev->ops = ops;
> +	emdev->states = 0;
> +	if (emdev->ops->init)
> +		emdev->ops->init(emdev);
> +
> +	if (emdev->ops->get_supported_states(emdev) != 0)
> +		goto out_free;
> +
> +	edev = enclosure_register(&pdev->dev, dev_name(&adev->dev), 1, &pcie_em_cb);
> +	if (!edev)
> +		goto out_free;
> +
> +	ecomp = enclosure_component_alloc(edev, 0, ENCLOSURE_COMPONENT_DEVICE, dev_name(&pdev->dev));
> +	if (IS_ERR(ecomp))
> +		goto out_unreg;
> +
> +	ecomp->type = ENCLOSURE_COMPONENT_ARRAY_DEVICE;
> +	rc = enclosure_component_register(ecomp);
> +	if (rc < 0)
> +		goto out_unreg;
> +
> +	ecomp->scratch = emdev;
> +	emdev->edev = edev;
> +	dev_set_drvdata(&adev->dev, emdev);
> +	goto out_unlock;
> +
> +out_unreg:
> +	enclosure_unregister(edev);
> +out_free:
> +	kfree(emdev);
> +out_unlock:
> +	mutex_unlock(&pcie_em_lock);
> +	return rc;
> +}
> +
> +static int pcie_em_probe(struct auxiliary_device *adev,
> +			  const struct auxiliary_device_id *id)
> +{
> +	struct pci_dev *pdev = to_pci_dev(adev->dev.parent);
> +
> +	/*
> +	 * The PCI Firmware Spec Rev 3.3 says that the _DSM should be used

Can you include the section reference, please?

> +	 * in preference to NPEM if available.
> +	 */
> +#ifdef CONFIG_ACPI
> +	if (pci_has_pcie_em_dsm(pdev))
> +		return add_pcie_em_dev(adev, &dsm_pcie_em_led_state_ops);
> +#endif
> +	if (pci_has_npem(pdev))
> +		return add_pcie_em_dev(adev, &npem_pcie_em_led_state_ops);
> +
> +	/*
> +	 * should not get here
> +	 */

What assures us that we shouldn't get here?  I think it's all the
magic inside register_pcie_em_auxdev(), but honestly, that's pretty
hard to verify.  It must have something do with "nvme.pcie_em" devices
only existing if pci_has_enclosure_management(), but I can't connect
the dots there because "nvme.pcie_em" only appears here.

> +	return -ENODEV;
> +}
> +
> +static const struct auxiliary_device_id pcie_em_id_table[] = {
> +	{ .name = "nvme.pcie_em", },
> +//	{ .name = "cxl.pcie_em", },
> +//	{ .name = "pcieport.pcie_em", },
> +	{},
> +};
> +
> +MODULE_DEVICE_TABLE(auxiliary, pcie_em_id_table);
> +
> +static struct auxiliary_driver pcie_em_driver = {
> +	.name = "pcie_em",
> +	.probe = pcie_em_probe,
> +	.remove = pcie_em_remove,
> +	.id_table = pcie_em_id_table,
> +};
> +
> +static int __init pcie_em_init(void)
> +{
> +	mutex_init(&pcie_em_lock);
> +	auxiliary_driver_register(&pcie_em_driver);
> +	return 0;
> +}
> +
> +static void __exit pcie_em_exit(void)
> +{
> +	mutex_lock(&pcie_em_lock);
> +	pcie_em_exiting = 1;
> +	mutex_unlock(&pcie_em_lock);
> +	auxiliary_driver_unregister(&pcie_em_driver);
> +}
> +
> +module_init(pcie_em_init);
> +module_exit(pcie_em_exit);
> +
> +MODULE_AUTHOR("Stuart Hayes <stuart.w.hayes@gmail.com>");
> +MODULE_DESCRIPTION("Support for PCIe SSD Status LEDs");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/pcie_em.h b/include/linux/pcie_em.h
> new file mode 100644
> index 000000000000..26b6d68cccd3
> --- /dev/null
> +++ b/include/linux/pcie_em.h
> @@ -0,0 +1,97 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef DRIVERS_PCI_PCIE_EM_H
> +#define DRIVERS_PCI_PCIE_EM_H
> +
> +#include <linux/pci.h>
> +#include <linux/acpi.h>
> +

AFAICT, most of the stuff here is only needed internally and only the
register_pcie_em_auxdev() and unregister_pcie_em_auxdev() declarations
are actually needed outside driver/pci.  It'd be nice if all the rest
were not visible outside drivers/pci.

I don't know enough about modules to know how this works if nvme calls
register_pcie_em_auxdev() when the pcie_em.o module is not loaded.  Is
there some module dependency that guarantees that can never happen?

> +#define PCIE_SSD_LEDS_DSM_GUID						\
> +	GUID_INIT(0x5d524d9d, 0xfff9, 0x4d4b,				\
> +		  0x8c, 0xb7, 0x74, 0x7e, 0xd5, 0x1e, 0x19, 0x4d)
> +
> +#define GET_SUPPORTED_STATES_DSM	0x01
> +#define GET_STATE_DSM			0x02
> +#define SET_STATE_DSM			0x03
> +
> +static inline bool pci_has_pcie_em_dsm(struct pci_dev *pdev)
> +{
> +#ifdef CONFIG_ACPI
> +	acpi_handle handle;
> +	const guid_t pcie_ssd_leds_dsm_guid = PCIE_SSD_LEDS_DSM_GUID;
> +
> +	handle = ACPI_HANDLE(&pdev->dev);
> +	if (handle)
> +		if (acpi_check_dsm(handle, &pcie_ssd_leds_dsm_guid, 0x1,
> +				   1 << GET_SUPPORTED_STATES_DSM ||
> +				   1 << GET_STATE_DSM ||
> +				   1 << SET_STATE_DSM))
> +			return true;
> +#endif
> +	return false;
> +}
> +
> +static inline bool pci_has_npem(struct pci_dev *pdev)
> +{
> +	int pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_NPEM);

It's a real shame we need to do this twice (here and in npem_init()).

> +	u32 cap;
> +
> +	if (pos)
> +		if (!pci_read_config_dword(pdev, pos + PCI_NPEM_CAP, &cap))
> +			return cap & PCI_NPEM_CAP_NPEM_CAP;
> +	return false;
> +}
> +
> +static inline bool pci_has_enclosure_management(struct pci_dev *pdev)
> +{
> +	return pci_has_pcie_em_dsm(pdev) || pci_has_npem(pdev);
> +}
> +
> +static inline void release_pcie_em_aux_device(struct device *dev)
> +{
> +	kfree(to_auxiliary_dev(dev));
> +}
> +
> +static inline struct auxiliary_device *register_pcie_em_auxdev(struct device *dev, int id)
> +{
> +	struct auxiliary_device *adev;
> +	int ret;
> +
> +	if (!pci_has_enclosure_management(to_pci_dev(dev)))
> +		return NULL;
> +
> +	adev = kzalloc(sizeof(*adev), GFP_KERNEL);
> +	if (!adev)
> +		goto em_reg_out_err;
> +
> +	adev->name = "pcie_em";
> +	adev->dev.parent = dev;
> +	adev->dev.release = release_pcie_em_aux_device;
> +	adev->id = id;
> +
> +	ret = auxiliary_device_init(adev);
> +	if (ret < 0)
> +		goto em_reg_out_free;
> +
> +	ret = auxiliary_device_add(adev);
> +	if (ret) {
> +		auxiliary_device_uninit(adev);
> +		goto em_reg_out_free;
> +	}
> +
> +	return adev;
> +em_reg_out_free:
> +	kfree(adev);
> +em_reg_out_err:
> +	dev_warn(dev, "failed to register pcie_em device\n");
> +	return NULL;
> +}
> +
> +static inline void unregister_pcie_em_auxdev(struct auxiliary_device *auxdev)
> +{
> +	if (auxdev) {
> +		auxiliary_device_delete(auxdev);
> +		auxiliary_device_uninit(auxdev);
> +	}

This makes me nervous because IIUC this relies on the driver (nvme,
etc) to do the right thing if the PCI device gets removed.  If the
driver forgets to call this, we have this auxiliary device hanging
around and probably accessible via sysfs, and something probably blows
up if the user reads/writes it.

> +}
> +
> +#endif /* DRIVERS_PCI_PCIE_EM_H */
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index ff6ccbc6efe9..375540065c29 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -736,7 +736,8 @@
>  #define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
>  #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
>  #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
> -#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
> +#define PCI_EXT_CAP_ID_NPEM	0x29	/* Native PCIe Enclosure Management */
> +#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_NPEM
>  
>  #define PCI_EXT_CAP_DSN_SIZEOF	12
>  #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
> @@ -1098,4 +1099,12 @@
>  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK		0x000000F0
>  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT	4
>  
> +/* NPEM */
> +#define PCI_NPEM_CAP		0x04
> +#define  PCI_NPEM_CAP_NPEM_CAP		0x00000001  /* dev is NPEM capable */
> +#define PCI_NPEM_CTRL		0x08
> +#define  PCI_NPEM_CTRL_EN		0x00000001  /* Enable NPEM */
> +#define PCI_NPEM_STATUS		0x0c
> +#define  PCI_NPEM_STATUS_CC		0x00000001  /* NPEM command completed */
> +
>  #endif /* LINUX_PCI_REGS_H */
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 0/3] Add PCIe enclosure management support
  2022-02-02 17:59 [RFC PATCH v2 0/3] Add PCIe enclosure management support Stuart Hayes
                   ` (2 preceding siblings ...)
  2022-02-02 17:59 ` [RFC PATCH v2 3/3] Register auxiliary device for PCIe enclosure management Stuart Hayes
@ 2022-02-10 11:52 ` Mariusz Tkaczyk
  3 siblings, 0 replies; 7+ messages in thread
From: Mariusz Tkaczyk @ 2022-02-10 11:52 UTC (permalink / raw)
  To: Stuart Hayes
  Cc: Bjorn Helgaas, linux-pci, Dan Williams, Keith Busch, kw, helgaas,
	lukas, pavel, linux-cxl, martin.petersen, James.Bottomley,
	Arnd Bergmann, Greg Kroah-Hartman

On Wed,  2 Feb 2022 11:59:10 -0600
Stuart Hayes <stuart.w.hayes@gmail.com> wrote:

> (3) Modifies the nvme driver to create an auxiliary device to which
> the pcie_em driver can attach.
> 
> These patches do not modify the cxl or pcieport drivers to add
> support, though the driver was designed to make it easy to do so.

Hi Stuart,

I would like to help you with NPEM testing. Currently I have setup with
NPEM enabled in Downstream port. It will require to add support for
pcieport. Could you add it in next version?

I'm also searching for a drive with NPEM to do full validation (when
both Downstream and the SSD are reporting capability). If you can point
me the model, it will be helpful.

Thanks,
Mariusz

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-02-10 11:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-02 17:59 [RFC PATCH v2 0/3] Add PCIe enclosure management support Stuart Hayes
2022-02-02 17:59 ` [RFC PATCH v2 1/3] Add support for seven more indicators in enclosure driver Stuart Hayes
2022-02-02 23:55   ` Bjorn Helgaas
2022-02-02 17:59 ` [RFC PATCH v2 2/3] Add PCIe enclosure management auxiliary driver Stuart Hayes
2022-02-03  0:58   ` Bjorn Helgaas
2022-02-02 17:59 ` [RFC PATCH v2 3/3] Register auxiliary device for PCIe enclosure management Stuart Hayes
2022-02-10 11:52 ` [RFC PATCH v2 0/3] Add PCIe enclosure management support Mariusz Tkaczyk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).